How to remove aberrant chromosomes from a BSgenome object?
tlgolan • 0
@tlgolan-10804

Hi everyone,

How can I remove the mitochondrial chromosome and the aberrant chromosomes (names containing "_") from a BSgenome object? I am trying to create an object that contains only chr1-chr20 plus chrX and chrY (rn6 genome).

Thanks!

bsgenome r • 3.3k views

Hi

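Depending on what you want to do, subsetting might help, e.g.:

library(BSgenome.Rnorvegicus.UCSC.rn6)
# Assumes the first 22 seqlevels are chr1-chr20, chrX and chrY; check seqlevels() first
seqinfo(BSgenome.Rnorvegicus.UCSC.rn6)[seqlevels(BSgenome.Rnorvegicus.UCSC.rn6)[1:22]]
getSeq(BSgenome.Rnorvegicus.UCSC.rn6, seqlevels(BSgenome.Rnorvegicus.UCSC.rn6)[1:22])

Regards, Hans-Rudolf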
@herve-pages-1542

Hi,

Unfortunately, the seqlevels() setter for BSgenome objects currently only allows renaming sequences; it does not allow removing them. It should at some point (it's on the TODO list). Here is a hack in the meantime:

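# Hack: restrict a BSgenome object to the given sequence names.
keepBSgenomeSequences <- function(genome, seqnames)
{
    # Every requested name must exist in the genome
    stopifnot(all(seqnames %in% seqnames(genome)))
    # Overwrite the internal slots so that only these sequences stay visible
    genome@user_seqnames <- setNames(seqnames, seqnames)
    genome@seqinfo <- genome@seqinfo[seqnames]
    genome
}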

Then:

library(BSgenome.Rnorvegicus.UCSC.rn6)
genome <- BSgenome.Rnorvegicus.UCSC.rn6
sequences_to_keep <- paste0("chr", c(1:20, "X", "Y"))
genome <- keepBSgenomeSequences(genome, sequences_to_keep)
genome
# Rat genome:
## organism: Rattus norvegicus (Rat)
## provider: UCSC
## provider version: rn6
## release date: Jul. 2014
## release name: RGSC Rnor_6.0
## 22 sequences:
##   chr1  chr2  chr3  chr4  chr5  chr6  chr7  chr8  chr9  chr10 chr11 chr12
##   chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20 chrX  chrY             
## (use 'seqnames()' to see all the sequence names, use the '$' or '[[' operator
## to access a given sequence)
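
Rather than hard-coding the names, the list could also be built by filtering on the naming pattern from the question. This is only a minimal sketch in base R: the example names below are made up for illustration, and with a real genome you would start from seqnames(genome) instead.

```r
# Hypothetical sequence names, for illustration only.
# With a real BSgenome object use: all_names <- seqnames(genome)
all_names <- c("chr1", "chr2", "chrX", "chrY", "chrM",
               "chr1_XX000001v1_random", "chrUn_XX000002v1")

# Keep names that contain no underscore and are not the mitochondrial chromosome
has_underscore <- grepl("_", all_names, fixed = TRUE)
sequences_to_keep <- all_names[!has_underscore & all_names != "chrM"]
sequences_to_keep
```

The resulting vector can then be passed as the second argument to keepBSgenomeSequences() in place of the paste0() call.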

Hope this helps,

H.


Is there any update on this? It would be nice to have this working so that the keepStandardChromosomes() function could be used on a BSgenome object.

Something along these lines:

library(GenomeInfoDb)
genome <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19
genome <- keepStandardChromosomes(genome)

Best, Christian
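
Until the setter supports this, one possible workaround is to combine standardChromosomes() from GenomeInfoDb with the keepBSgenomeSequences() hack posted earlier in this thread. The sketch below is unverified: the wrapper name keepStandardBSgenomeSequences is hypothetical, and it assumes both that standardChromosomes() accepts a BSgenome object and that keepBSgenomeSequences() has already been defined.

```r
# Hypothetical wrapper; relies on keepBSgenomeSequences() from earlier in the thread.
keepStandardBSgenomeSequences <- function(genome) {
    # Assumption: standardChromosomes() can read the seqlevels of a BSgenome object
    std <- GenomeInfoDb::standardChromosomes(genome)
    keepBSgenomeSequences(genome, std)
}

# Usage (requires the BSgenome data package to be installed):
# genome <- BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19
# genome <- keepStandardBSgenomeSequences(genome)
```

Note that the standard set may still include the mitochondrial chromosome, so it may need to be dropped from the vector separately.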
