Hello everyone,
I have a data set of analysed RNA-seq experiments stored as a SummarizedExperiment. Every row is one specific triplet in a given gene. What I want to do is calculate the distance from that position to the next GGUC sequence. In other words: how far is a given triplet from the next GGUC sequence? I have no idea how to start. I know how to get the sequence of the gene, but I thought there might be a smart shortcut, such as a function that can calculate these distances.
I would be very happy if someone has an idea.
coorsID gene_id annotated gene_name Seq
1:7821354-782136:+ ENSG00000237491 Y LINC01409 GAG
1:782189-782191:+ ENSG00000237491 Y LINC01409 AAA
1:783361-783363:+ ENSG00000237491 Y LINC01409 GAA
You can find the positions of all GGTC occurrences in a given BSgenome object with something like:
...and then use these positions together with James's suggestions to find the nearest one per coorsID.
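
The two steps above (locate every GGTC, then measure from each triplet to the closest one) can be sketched on a toy sequence as follows. This is only a minimal illustration, not the code from the replies above: the sequence, the seqname "toy", and all variable names are made up, and it assumes that Biostrings and GenomicRanges are installed and that only the forward strand matters.

```r
# Toy example: the sequence and all object names are hypothetical.
library(Biostrings)
library(GenomicRanges)

# Short stand-in for one chromosome; a real analysis would use a BSgenome object.
toy_seq <- DNAString("AAGGTCAAAAGAGAAAGGTCAA")

# Step 1: locate every GGTC (the DNA spelling of GGUC) on the forward strand.
hits <- matchPattern("GGTC", toy_seq)
motifs <- GRanges("toy", IRanges(start(hits), end(hits)), strand = "+")

# Step 2: one triplet, written in the same seqname:start-end:strand style
# as the coorsID column (positions 11-13 of the toy sequence, "GAG").
triplets <- GRanges("toy:11-13:+")

# Nearest motif in either direction; the distance is the number of bases
# in the gap between the two ranges (0 if they touch or overlap).
nearest_hit <- distanceToNearest(triplets, motifs)
mcols(nearest_hit)$distance

# Only the next motif downstream of the triplet on its own strand.
# precede() returns NA when no such motif exists, so real data would
# need that case handled before subsetting.
next_idx <- precede(triplets, motifs)
distance(triplets, motifs[next_idx])
```

On real data, vmatchPattern("GGTC", genome) on a BSgenome object returns the matches directly as a GRanges covering both strands, and GRanges(coorsID) should parse the coordinate strings. It is worth checking that the chromosome names agree between the two objects (for example "1" versus "chr1"), since otherwise no distances are found.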