Using Biomart other than Ensembl
Guest User ★ 13k
@guest-user-4897
Last seen 10.2 years ago
Hi guys,

I'm new to R and Bioconductor, so my question may sound a little basic, but I really could not figure out how to use a database from NCBI in biomaRt. I'm working on RNA-Seq reads to perform a DE analysis, and I'm interested in the Bos taurus database from NCBI, version UMD3.1.

So my question is: how do I choose the bovine UMD3.1 from NCBI in biomaRt? Or would the best way to solve this be to perform the alignment against the Ensembl version?

Just to be clear, I can't find any NCBI databases when I type:

    > library("biomaRt")
    > listMarts()

If I take a look at "ensembl" [ensembl=useMart("ensembl")], I can see the btaurus_gene_ensembl dataset. However, since I aligned my reads against an NCBI version, counting the reads did not work (because they have different identifiers, I guess). The manual shows a short example using a wormDb, but it did not help much.

-- output of sessionInfo():

R version 3.0.2 (2013-09-25)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] DESeq2_1.2.10 RcppArmadillo_0.4.000.2 Rcpp_0.11.0 Rsamtools_1.14.3 Biostrings_2.30.1 GenomicRanges_1.14.4
[7] XVector_0.2.0 IRanges_1.20.6 BiocGenerics_0.8.0

loaded via a namespace (and not attached):
[1] AnnotationDbi_1.24.0 BSgenome_1.30.0 Biobase_2.22.0 DBI_0.2-7 GenomicFeatures_1.14.2 RColorBrewer_1.0-5
[7] RCurl_1.95-4.1 RSQLite_0.11.4 XML_3.98-1.1 annotate_1.40.0 biomaRt_2.18.0 bitops_1.0-6
[13] genefilter_1.44.0 grid_3.0.2 lattice_0.20-24 locfit_1.5-9.1 rtracklayer_1.22.3 splines_3.0.2
[19] stats4_3.0.2 survival_2.37-7 tools_3.0.2 xtable_1.7-1 zlibbioc_1.8.0

-- Sent via the guest posting facility at bioconductor.org.
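For context, here is a minimal sketch of what the commands in the question expose. It assumes network access to the default Ensembl BioMart host, and the dataset listing depends on the Ensembl release being served. biomaRt only talks to BioMart web services, and as far as I know NCBI does not run one, which would explain why listMarts() shows no NCBI database. The variable names are illustrative.

```r
library(biomaRt)

# List the BioMart databases offered by the default host (needs network access).
marts <- listMarts()
print(marts)

# Connect to the Ensembl genes mart, as in the question.
ensembl <- useMart("ensembl")

# listDatasets() returns a data frame with a 'dataset' column;
# keep only the cattle entries (e.g. btaurus_gene_ensembl).
datasets <- listDatasets(ensembl)
cattle <- datasets[grep("btaurus", datasets$dataset), ]
print(cattle)
```

The 'version' column of the listing shows which assembly the Ensembl dataset is built on, which is worth comparing with the assembly the reads were aligned to.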
Bos taurus biomaRt • 1.3k views
@james-w-macdonald-5106
Last seen 16 hours ago
United States
Hi Daniela,

On 2/26/2014 2:29 PM, Daniela Moré [guest] wrote:
> Hi guys,
> I'm new on R and Bioconductor packages so my question can sounds a little basics but I really could not figure out how to use a database from NCBI in BiomaRt.
> I'm working on RNA-Seq reads to perform DE analysis and I'm interested in Bos taurus database from NCBI version UMD3.1.

I think you will need to give more information here. What exactly are you trying to do? Have you already done the DE analysis, and now are simply trying to annotate the results? If so, what type of gene/transcript IDs do you have?

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
@james-w-macdonald-5106
Last seen 16 hours ago
United States
Hi Daniela,

Please don't take things off-list (e.g., use Reply-all).

On 2/26/2014 3:20 PM, Daniela Moré wrote:
> Hi Jim,
>
> Actually, this first step will give me the read counts through summarizeOverlaps before the DE analysis.
>
> More specifically, I'm choosing a gene model to make a transcriptDb using makeTranscriptDbFromBiomart (page 7)
> I'm following the attached documentation available during the last Bioconductor summer course in Brazil (to which the page number refers)

If you prefer NCBI identifiers, you can use makeTranscriptDbFromUCSC() instead.

    library(GenomicFeatures)
    tx <- makeTranscriptDbFromUCSC("bosTau6", "refGene")

Should do the trick.

Best,

Jim
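The suggestion above can be extended into a rough sketch of the step that precedes counting. It assumes network access to UCSC, and that bosTau6 is the UCSC name for the UMD3.1 assembly. In later GenomicFeatures releases the function was renamed makeTxDbFromUCSC() (and, as far as I know, moved to the txdbmaker package in the newest ones), so the sketch uses whichever name is available after loading GenomicFeatures. The names make_txdb, txdb and exons_by_gene are illustrative.

```r
library(GenomicFeatures)

# Use the newer function name if this GenomicFeatures version exports it,
# otherwise fall back to the name used in the reply above.
make_txdb <- if (exists("makeTxDbFromUCSC")) {
  makeTxDbFromUCSC
} else {
  makeTranscriptDbFromUCSC
}

# Build a transcript database from the UCSC RefSeq ("refGene") track.
# This downloads the track, so it needs network access.
txdb <- make_txdb(genome = "bosTau6", tablename = "refGene")

# Group exons by gene; these ranges are the features to count reads over.
exons_by_gene <- exonsBy(txdb, by = "gene")

# UCSC-derived annotation uses UCSC-style sequence names; check that they
# match the sequence names in the BAM files before counting.
head(seqlevels(exons_by_gene))
```

The resulting exons_by_gene object is the kind of features argument that summarizeOverlaps expects.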
Hi Jim,

Actually, this first step will give me the read counts through summarizeOverlaps, before the DE analysis.

More specifically, I'm choosing a gene model to make a transcriptDb using makeTranscriptDbFromBiomart (page 7). I'm following the attached documentation made available during the last Bioconductor summer course in Brazil (to which the page number refers).

Thank you in advance,

Daniela
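Counting with summarizeOverlaps can only match reads to features that share the same sequence names. One plausible reason for the failed counting, which the thread does not confirm, is that the Ensembl gene model and the NCBI-aligned reads name their chromosomes differently. The toy ranges below are made up purely to illustrate that effect with countOverlaps from GenomicRanges; they are not taken from the real data.

```r
library(GenomicRanges)

# Toy gene model with an Ensembl-style chromosome name ("1").
genes <- GRanges(seqnames = "1", ranges = IRanges(start = 100, end = 200))

# Toy aligned reads with a differently styled chromosome name ("chr1").
reads <- GRanges(seqnames = "chr1",
                 ranges = IRanges(start = c(120, 150), width = 50))

# The two objects have no sequence names in common ...
shared <- intersect(seqlevels(genes), seqlevels(reads))
print(shared)

# ... so no read can overlap the gene (R also warns about the mismatch).
before <- suppressWarnings(countOverlaps(genes, reads))
print(before)

# After renaming the annotation to match the reads, both reads are counted.
seqlevels(genes) <- "chr1"
after <- countOverlaps(genes, reads)
print(after)
```

Comparing seqlevels() of the gene model with the sequence names in the BAM header is a quick way to check for this kind of mismatch before counting.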