Bioconductor provides the BSgenome.Ggallus.UCSC.galGal4 package, which can be installed and loaded with the following commands for use in different analysis approaches:
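# Install via BiocManager (the current replacement for biocLite()), then load
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("BSgenome.Ggallus.UCSC.galGal4")
library(BSgenome.Ggallus.UCSC.galGal4)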
We are looking to launch a new analysis approach for the reduced genome. For that, we need a Bioconductor "reduced-reference genome" in the same format as the genomes Bioconductor provides, so that we can use the tools already available for analysis. Our idea is to create this genome from a merged alignment of 20 animals subjected to this reduced-representation methodology.
Would it be possible?
What exactly does 'reduced-reference genome' mean?
Hi,
You can build a BSgenome data package from any set of DNA sequences, as long as the sequences are uniquely named and available in a FASTA or 2bit file, or in a collection of FASTA files. See the BSgenomeForge vignette in the BSgenome software package for more information. Hopefully you'll end up with a BSgenome data package that can be used with the tools already available. Keep in mind, though, that most analyses also need annotations that match your BSgenome object, i.e. that describe and report genomic features with respect to it. Most annotation providers (e.g. NCBI, UCSC, Ensembl) only provide annotations for reference genomes, so depending on your analysis you might also need to find annotations (or tweak and merge existing ones) that match your "reduced" BSgenome.
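To make the BSgenomeForge route a bit more concrete, here is a minimal sketch in base R that checks a FASTA file for unique sequence names and then writes a seed file. Everything specific in it is an assumption for illustration: the package name, the genome label, the paths, the toy FASTA content, and the exact set of seed fields (some have been renamed across BSgenome versions), so please check them against the vignette shipped with your installed version.

```r
# Toy FASTA standing in for the merged reduced-representation reference
# (hypothetical content; replace with your own file).
fasta_dir  <- tempdir()
fasta_file <- "reduced_reference.fa"
writeLines(c(">locus_1", "ACGTACGTAC", ">locus_2", "GGCATTACGA"),
           file.path(fasta_dir, fasta_file))

# The sequences must be uniquely named, so check the FASTA header lines.
headers   <- grep("^>", readLines(file.path(fasta_dir, fasta_file)), value = TRUE)
seq_names <- sub("^>", "", headers)
if (anyDuplicated(seq_names) > 0)
  stop("sequence names in the FASTA file are not unique")

# Seed file fields (names as I recall them from the vignette; values are
# placeholders and not taken from the original post).
seed <- data.frame(
  Package           = "BSgenome.Ggallus.custom.reduced",
  Title             = "Reduced-representation reference for Gallus gallus",
  Description       = "Sequences derived from a merged alignment of 20 animals.",
  Version           = "0.99.0",
  organism          = "Gallus gallus",
  common_name       = "Chicken",
  genome            = "galGal4reduced",
  provider          = "custom",
  organism_biocview = "Gallus_gallus",
  BSgenomeObjname   = "Ggallus",
  seqs_srcdir       = fasta_dir,
  seqfile_name      = fasta_file,
  stringsAsFactors  = FALSE
)

# Seed files use the DCF format, which base R can write directly.
seed_path <- file.path(tempdir(), "BSgenome.Ggallus.custom.reduced-seed")
write.dcf(seed, file = seed_path)

# Next step, not run here: forge the package source tree from the seed file.
# forgeBSgenomeDataPkg() lives in BSgenome in older releases and in the
# BSgenomeForge package in newer ones.
# forgeBSgenomeDataPkg(seed_path)
```

Once the forge step has produced the package directory, it is built and installed like any other R package (R CMD build, then R CMD INSTALL), after which it can be loaded with library() in the same way as BSgenome.Ggallus.UCSC.galGal4.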
H.