Hi All,
I recently had some task which I thought should be easy to accomplish. I wanted to subset a DNAStringSet (a whole genome) using the information in a GRanges object (output of tileGenome()). I searched quite a bit in the help and here but could not find a "bioconductor" way of doing it and ended up for-looping through the ranges.
There is an exampe below.
My question is: Am I missing something or is there really no connection between GRanges and BioStrings ?
Best Regards,
Stefan
Reproducible example:
testRanges <- GRanges(seqnames = c("seq1","seq1","seq2"), ranges = IRanges(start = c(10,20,10), width = c(5,5,5)) ) testSeqs <- DNAStringSet(c("tagtagctagctacgatgcatgctacgatgcatgc", "tatgctagctacgtacgtacgatcgtacgtagcta") ) names(testSeqs) <- c("seq1","seq2") #what I want #newTestSeqs <- subseq(testSeqs, by = testRanges) #what I do testMask <- data.frame(seq = as.character(seqnames(testRanges)), start = start(testRanges), end = end(testRanges)) newTestSeqs <- DNAStringSet() for (i in 1:nrow(testMask)){ temp <- testSeqs[testMask[i,"seq"]] temp <- subseq(temp, start = testMask[i,"start"], end = testMask[i,"end"]) newTestSeqs <- c(newTestSeqs,temp) }
Thanks Michael, that is exactly what is was looking for and could not find on any of the help files, (or I overlooked it).
Since this syntax is not obvious to me, do you mind explaining what is going on here. Suppose I want to isolate one subsequence from a DNAStringSet, I can put in a GRanges, but how can I do this using more low-level objects. For e.g. dataframes I do:
So for DNAStringSets it must be somethings like:
to achieve the same thing as the first range in the example above.
Best Regards,
Stefan
A DNAStringSet is one dimensional, so it should only take a single subscript. As a list, it is also hierarchical. The GRanges subscripting is essentially a shortcut for a "nested" subscript (like looping
x[seqnames]
, extracting the corresponding range).A hierarchical, one-dimensional subscript. That's exactly what I need to know to. I will figure it out from here.
Thanks for your great advice.