BSgenome: dm3 and panTro2
joseph ▴ 330
@joseph-1270
Last seen 10.2 years ago
Hi,

Are there any plans to add the most recent Drosophila and Chimpanzee genomes to the BSgenome list? The most recent UCSC versions are the Apr. 2006 assembly of the D. melanogaster genome (dm3) and the Mar. 2006 assembly of the Chimpanzee genome (panTro2). The Mac OS packages would be nice to have.

Thanks,
Joseph
@herve-pages-1542
Last seen 16 hours ago
Seattle, WA, United States
Hi Joseph,

Are you sure that the dm3 assembly provided by UCSC (based on BDGP Release 5) is different from the FlyBase r5.1 assembly? If not, then you could just use the BSgenome.Dmelanogaster.FlyBase.r51 package, which contains the FlyBase r5.1 assembly (I think that the differences between the various 5.y releases from FlyBase are on the annotation side only, but the chromosome sequences should be the same).

Anyway, I've started building a BSgenome package for dm3. Once it's ready, it will be easy to verify that the chromosome sequences are indeed the same as in FlyBase r5.1 by doing something like:

    library(BSgenome.Dmelanogaster.FlyBase.r51)
    r51 <- BSgenome.Dmelanogaster.FlyBase.r51::Dmelanogaster
    library(BSgenome.Dmelanogaster.UCSC.dm3)
    dm3 <- BSgenome.Dmelanogaster.UCSC.dm3::Dmelanogaster
    r51[["2L"]] == unmasked(dm3$chr2L)

I'll take this opportunity to add to this new package the same built-in masks that I've already added to other BSgenome data packages (only Human, Mouse and Dog so far). Those built-in masks are new in Bioconductor 2.2, and some examples of how to use them are shown in the GenomeSearching vignette (this vignette has been moved from the Biostrings pkg to the BSgenome pkg).

I will also make a BSgenome data pkg for Chimpanzee (with masks too) and post here again when it is ready.

Cheers,
H.
Hi Joseph,

The source packages for dm3 (Fly) and panTro2 (Chimp) are now available. I've also put dm2 back (it used to be part of the BSgenome family in previous versions of Bioconductor, but was temporarily broken).

I can now confirm that the chromosome sequences in dm3 are the same as in FlyBase.r51. The exact set of sequences provided and their exact names are a little bit different though:

    library(BSgenome.Dmelanogaster.FlyBase.r51)
    r51 <- BSgenome.Dmelanogaster.FlyBase.r51::Dmelanogaster
    library(BSgenome.Dmelanogaster.UCSC.dm3)
    dm3 <- BSgenome.Dmelanogaster.UCSC.dm3::Dmelanogaster

Then:

    > seqnames(r51)
     [1] "2L"                        "2R"
     [3] "3L"                        "3R"
     [5] "4"                         "X"
     [7] "U"                         "dmel_mitochondrion_genome"
     [9] "2LHet"                     "2RHet"
    [11] "3LHet"                     "3RHet"
    [13] "XHet"                      "YHet"
    > seqnames(dm3)
     [1] "chr2L"     "chr2R"     "chr3L"     "chr3R"     "chr4"      "chrX"
     [7] "chrU"      "chrM"      "chr2LHet"  "chr2RHet"  "chr3LHet"  "chr3RHet"
    [13] "chrXHet"   "chrYHet"   "chrUextra"

To compare chr2L, or chrM:

    > r51[["2L"]] == unmasked(dm3$chr2L)
    [1] TRUE
    > r51[["dmel_mitochondrion_genome"]] == unmasked(dm3$chrM)
    [1] TRUE

The binary versions of the packages for Windows and Mac will follow soon.

Cheers,
H.
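For anyone who wants to line up all the sequences rather than just chr2L and chrM, the naming difference shown above can be handled with a small lookup. This is only a sketch in base R: `flybase_to_ucsc` is a hypothetical helper (not part of BSgenome), and the rule it applies ("chr" prefix, with the mitochondrial genome renamed to chrM) is inferred from the two `seqnames()` listings in this thread, not from any documented mapping.

```r
# Hypothetical helper (not a BSgenome function): translate FlyBase r5.1
# sequence names into the UCSC dm3 names listed in this thread.
flybase_to_ucsc <- function(fb_names) {
  ucsc <- paste0("chr", fb_names)
  # The mitochondrial genome is the one name that does not follow the prefix rule.
  ucsc[fb_names == "dmel_mitochondrion_genome"] <- "chrM"
  ucsc
}

# Names copied from the seqnames() output above.
r51_names <- c("2L", "2R", "3L", "3R", "4", "X", "U",
               "dmel_mitochondrion_genome",
               "2LHet", "2RHet", "3LHet", "3RHet", "XHet", "YHet")
dm3_names <- c("chr2L", "chr2R", "chr3L", "chr3R", "chr4", "chrX",
               "chrU", "chrM", "chr2LHet", "chr2RHet", "chr3LHet",
               "chr3RHet", "chrXHet", "chrYHet", "chrUextra")

mapped <- flybase_to_ucsc(r51_names)

# Sequences present in dm3 with no FlyBase r5.1 counterpart under this mapping.
only_in_dm3 <- setdiff(dm3_names, mapped)
print(only_in_dm3)
```

With both packages loaded as above, each pair could then be compared in the same way as chr2L, e.g. `r51[[r51_names[i]]] == unmasked(dm3[[mapped[i]]])` for each index `i`; only the chr2L and chrM comparisons were actually reported in this thread.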
Herve Pages wrote:
[...]
> The binary versions of the packages for Windows and Mac will follow soon.

The binary versions of all the BSgenome data packages are now online (in the release), ready for download and installation via biocLite() (for R-2.7 + BioC-2.2 users).

H.