searching siRNA sequence (19bp) against human genome
stephen66 ▴ 50
@stephen66-7177
Last seen 3.6 years ago
United States

Dear All, I have a problem that I am not sure can be easily addressed with any Bioconductor package. I have a 19-nt siRNA sequence, and I want to search the guide sequence against the human genome to generate the following outputs: 1. Any genes that match the sequence with at most, for example, 5 mismatches; 2. The genome coordinates of the matches.

Many Thanks! Steve

searching siRNA sequence human genome GenomicAlignment Biostrings matchPattern • 975 views
ATpoint ★ 4.5k
@atpoint-13662
Last seen 6 hours ago
Germany

I wrote a wrapper around Biostrings::matchPattern that finds all occurrences of a DNA string (optionally with mismatches) in a given BSgenome. It returns a GRanges object with all match locations separated by strand (top/bottom strand). See on GitHub.

In your case, with your siRNA sequence, the call could look like this:

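# Assumes FindSequenceOccurrence() has already been sourced from the GitHub code mentioned above.
library(BSgenome.Hsapiens.UCSC.hg38)
FindSequenceOccurrence(Sequence = "AGCTAGCTAGTTGTGTACAGT",  # placeholder; substitute your own guide
                       MaxMismatches = 5,
                       BSGenome = BSgenome.Hsapiens.UCSC.hg38)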
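For readers who want to see the underlying idea without the wrapper, here is a minimal sketch on a toy sequence. It is not ATpoint's code: `find_hits()` and its arguments are hypothetical names, and the toy subject stands in for a chromosome. It only assumes `Biostrings::matchPattern()` with its `max.mismatch` argument, searching the reverse complement to cover the bottom strand.

```r
library(Biostrings)
library(GenomicRanges)

# Hypothetical helper (not the wrapper from the answer): search one sequence
# on both strands and return the hits as a GRanges object.
find_hits <- function(pattern, subject, seqname, max_mismatch = 0) {
  pattern <- DNAString(pattern)
  search_one <- function(p) {
    r <- ranges(matchPattern(p, subject, max.mismatch = max_mismatch))
    # With mismatches allowed, hits can overhang the sequence ends; drop those.
    r[start(r) >= 1 & end(r) <= length(subject)]
  }
  plus  <- search_one(pattern)                     # top strand
  minus <- search_one(reverseComplement(pattern))  # bottom strand
  GRanges(seqnames = rep(seqname, length(plus) + length(minus)),
          ranges   = c(plus, minus),
          strand   = c(rep("+", length(plus)), rep("-", length(minus))))
}

# Toy subject: the guide on the top strand and its reverse complement further
# downstream, separated by filler bases.
guide <- "AGCTAGCTAGTTGTGTACAGT"
rc    <- as.character(reverseComplement(DNAString(guide)))
toy   <- DNAString(paste0("TTTTTTTTTT", guide, "CCCCCCCCCC", rc, "GGGGGGGGGG"))

hits <- find_hits(guide, toy, seqname = "toy", max_mismatch = 1)
hits
```

On a real genome the same helper could presumably be applied chromosome by chromosome, for example to `BSgenome.Hsapiens.UCSC.hg38[["chr1"]]`, and the per-chromosome results combined with `c()`.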

Once you have the coordinates you can simply intersect them with any gene annotation.
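A minimal sketch of that intersection step, using toy coordinates so it runs without any annotation package. The `hits` and `genes` objects and the gene IDs are made up for illustration; in practice `genes` could come from something like `genes(TxDb.Hsapiens.UCSC.hg38.knownGene)`, which is an assumption about your setup rather than part of the answer above.

```r
library(GenomicRanges)

# Toy match coordinates (hypothetical), standing in for the search output.
hits <- GRanges("chr1",
                IRanges(start = c(150, 5000), width = 21),
                strand = c("+", "-"))

# Toy gene annotation (hypothetical gene IDs and coordinates).
genes <- GRanges("chr1",
                 IRanges(start = c(100, 4000), end = c(1000, 4500)),
                 strand = c("+", "-"),
                 gene_id = c("geneA", "geneB"))

# ignore.strand = TRUE reports any gene at the hit locus regardless of strand;
# set it to FALSE if only same-strand overlaps are of interest.
ov <- findOverlaps(hits, genes, ignore.strand = TRUE)

# Keep the overlapping hits and attach the ID of the gene each one falls in.
hits_in_genes <- hits[queryHits(ov)]
hits_in_genes$gene_id <- genes$gene_id[subjectHits(ov)]
hits_in_genes
```

Hits that overlap no gene are dropped here; `hits[-queryHits(ov)]` would list them if intergenic matches are also of interest.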

@james-w-macdonald-5106
Last seen 1 hour ago
United States

Sounds like something you could do with Biostrings, particularly the matchPDict function. The first set of examples for that function seems to be pretty close to what you want to do, so I'd start there.
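As a rough illustration of the dictionary approach, here is a small sketch of exact matching with `PDict()` and `matchPDict()` on a toy sequence. The guide sequence is a placeholder and the toy subject is made up; the sketch covers exact matches only. Allowing mismatches with a preprocessed dictionary involves extra setup that is described in the function's documentation and is not shown here.

```r
library(Biostrings)

# Placeholder guide; the dictionary holds it and its reverse complement so that
# both strands are covered in a single pass.
guide <- DNAString("AGCTAGCTAGTTGTGTACAGT")
dict  <- DNAStringSet(c(as.character(guide),
                        as.character(reverseComplement(guide))))
pdict <- PDict(dict)  # preprocess the (constant-width) dictionary

# Toy subject with the guide embedded on the top strand only.
toy <- DNAString(paste0("TTTTTTTTTT", as.character(guide), "CCCCCCCCCC"))

m <- matchPDict(pdict, toy)        # one set of match ranges per pattern
top_hits    <- m[[1]]              # matches of the guide itself
bottom_hits <- m[[2]]              # matches of its reverse complement
n_hits      <- countPDict(pdict, toy)

top_hits
n_hits
```

For a single short pattern the dictionary brings little speed benefit; it becomes more useful when many guides of the same length are searched at once.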
