Is there a simple way to map a list of probe sequences to a genome? The use case is the probes of a custom-designed NanoString assay. The RLF file provided by NanoString has a gene symbol and a probe sequence for each probe, but not the probe's genomic coordinates. I don't want the hassle of running bowtie for a small set of sequences (about 200) that map to the genome with no mismatches, and I'd like to include the mapping procedure in an R Markdown document without calling a complex aligner through system().
In a similar question asked five years ago, the recommendation was to map to the transcriptome with vwhichPDict, but I want genomic coordinates, so I'd like to map against BSgenome.Hsapiens.UCSC.hg19 (both strands) and obtain a GRangesList result.
This solution involves formatting the reads into a FASTA file, generating a genome index, writing BAM files to disk, and importing them back into R. The other proposed answer solves the problem more directly, without writing results to disk and reading them back.
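For reference, here is a minimal sketch of what an in-memory, exact-match mapping can look like. It is not necessarily what the other answer does. The toy genome, the probe sequences, and the helper name map_probes() are all made up for illustration; only matchPattern() and reverseComplement() from Biostrings and the GRanges/GRangesList constructors from GenomicRanges are real.

```r
library(Biostrings)
library(GenomicRanges)

# Toy stand-ins (assumptions, not real data): a two-"chromosome" genome
# and two probes, one on each strand.
genome <- DNAStringSet(c(chrA = "ACGTTGCATGCCGATAGGCTA",
                         chrB = "TTGACCGGTAAACTGCAGGTC"))
probes <- DNAStringSet(c(probe1 = "GCATGCCG",   # forward strand of chrA
                         probe2 = "CCTGCAGT"))  # reverse complement of a chrB segment

# Hypothetical helper: exact matching (no mismatches) of every probe
# against both strands of every sequence in `genome`.
map_probes <- function(probes, genome) {
  per_probe <- lapply(seq_along(probes), function(i) {
    fwd <- probes[[i]]
    rev <- reverseComplement(fwd)
    chr <- character(0); st <- integer(0); en <- integer(0); sd <- character(0)
    for (nm in names(genome)) {
      plus  <- matchPattern(fwd, genome[[nm]])  # hits on the + strand
      minus <- matchPattern(rev, genome[[nm]])  # hits on the - strand
      n_plus  <- length(plus)
      n_minus <- length(minus)
      chr <- c(chr, rep(nm, n_plus + n_minus))
      st  <- c(st, start(plus), start(minus))
      en  <- c(en, end(plus), end(minus))
      sd  <- c(sd, rep("+", n_plus), rep("-", n_minus))
    }
    # Note: a palindromic probe would be reported once per strand.
    GRanges(seqnames = factor(chr, levels = names(genome)),
            ranges = IRanges(start = st, end = en),
            strand = sd)
  })
  res <- GRangesList(per_probe)
  names(res) <- names(probes)
  res
}

res <- map_probes(probes, genome)
res
```

With the real assay, the chromosomes of BSgenome.Hsapiens.UCSC.hg19 would take the place of the toy genome. The BSgenome package also documents vmatchPattern and vmatchPDict methods for BSgenome objects that search both strands and return a GRanges, which may make the explicit loop unnecessary, but it is worth checking their help page for restrictions on the pattern set before relying on them.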