Using injectSNPs with a custom set of SNPs
1
0
Entering edit mode
@steve-pederson-23427
Last seen 8 months ago
Australia

Hi,

I'm sorry if this is a dumb question, but I have a set of SNPs obtained from a large cohort study which can't be released as a SNPlocs package. Is there any way (or alternative strategies) for using injectSNPs() to modify a BSGenome object without having to create a SNPlocs package? It's obviously quite trivial to create a GRanges object with this kind of information so I'd hoped that a simple method for a GRanges object might exist somewhere, but I haven't been able to find it yet.

I haven't been able to spot any alternative strategies for forging a local SNPlocs package either. Do these instructions exist somewhere?

Thanks in advance,

Stevie

SNPlocs BSgenome • 988 views
ADD COMMENT
1
Entering edit mode
@herve-pages-1542
Last seen 1 day ago
Seattle, WA, United States

Hi,

Unfortunately injectSNPs() can not handle a GRanges or GPos object at the moment.

However, as long as you have access to the positions and alleles of the SNPs for each chromosome, you should be able to make your own substitutions with replaceLetterAt() from the Biostrings package. For example, let's say your SNPs are stored in a GPos object my_snps similar to the one returned by snpsBySeqname() (that is, with the alleles stored as IUPAC ambiguity codes in the alleles_as_ambig metadata column), then you should be able to do something like this:

library(BSgenome.Hsapiens.UCSC.hg38)
genome <- BSgenome.Hsapiens.UCSC.hg38

## Make sure 'my_snps' and 'genome' use the same chromosome naming convention:
seqinfo(genome)
seqinfo(my_snps)

## Inject SNPs in chr1:
chr1snps <- my_snps[seqnames(my_snps) == "chr1"] 
chr1altered <- replaceLetterAt(genome$chr1, at=pos(chr1snps), mcols(chr1snps)$alleles_as_ambig)

See ?replaceLetterAt for more information about how to use replaceLetterAt().

As for creating your own SNPlocs package, the tools that are currently available for this are in the SNPlocsForge package. Unfortunately the tools in the package are hard to use and inefficient. Plus they only work with the huge JSON files provided by the dbSNP folks, which require an insane amount of computing power to parse. See some discussion about this here. In other words, the package would require a lot of improvements before it can be used by someone else. This is why I've not submitted it to Bioconductor yet.

Hope this helps,

H.

ADD COMMENT
0
Entering edit mode

Thanks Herve. Most helpful & informative

ADD REPLY

Login before adding your answer.

Traffic: 956 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6