I have a couple of PWMs for RNA binding proteins, and I also have the sequences of candidate UTRs in different groups as a DNAStringSet. I'd like to see how many of the UTRs (and which ones) in each group match a given PWM. However, it looks like the matchPWM function in Biostrings only supports a single sequence rather than a DNAStringSet. Is there a way to do this besides sticking all of my sequences together, matching, breaking them apart or looping through each sequence?
Thanks
Hi Jake,
Have you tried the "sticking all of my sequences together, matching, breaking them apart" approach? It should be significantly faster than looping.
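For reference, here is a minimal sketch of what that approach could look like. The UTR sequences, their names, the toy PWM (an exact "TGCA" motif) and the score cutoff are all made up for illustration. The one thing to watch is that hits straddling the junction between two concatenated sequences have to be thrown away, which is done here by keeping only hits that fall entirely within one sequence's range.

```r
library(Biostrings)

# Hypothetical toy data: three named UTRs. utr1 ends in "TG" and utr2
# starts with "CA", so the concatenation has a spurious "TGCA" at the junction.
utrs <- DNAStringSet(c(utr1 = "ACGTTTGCATTG",
                       utr2 = "CAGGGGCC",
                       utr3 = "TTTGCA"))

# Hypothetical toy PWM favouring the motif "TGCA" (rows must be A, C, G, T).
pwm <- matrix(0.1, nrow = 4, ncol = 4,
              dimnames = list(c("A", "C", "G", "T"), NULL))
pwm["T", 1] <- 1; pwm["G", 2] <- 1; pwm["C", 3] <- 1; pwm["A", 4] <- 1

# 1. Stick all sequences together into a single DNAString.
concatenated <- unlist(utrs)

# 2. Remember where each original sequence sits in the concatenation.
bounds <- successiveIRanges(width(utrs))

# 3. Match once against the concatenated sequence (forward strand only).
#    A cutoff of 3.5 only lets exact "TGCA" (score 4) through for this toy PWM.
hits <- matchPWM(pwm, concatenated, min.score = 3.5)

# 4. Break apart: keep only hits lying entirely within one sequence,
#    which drops hits that span a junction.
hit_ranges <- IRanges(start(hits), end(hits))
ov <- findOverlaps(hit_ranges, bounds, type = "within")

# Number of hits per UTR, and the names of the UTRs with at least one hit.
hits_per_utr <- table(factor(names(utrs)[subjectHits(ov)],
                             levels = names(utrs)))
matched_utrs <- names(utrs)[sort(unique(subjectHits(ov)))]
```

To get the result per group, the same steps can be applied to each group's DNAStringSet, or to all UTRs at once followed by splitting matched_utrs by group membership.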
H.