Figuring out what long unaligned reads with decent Phred scores are
@matthew-thornton-5564
Last seen 12 weeks ago
USA, Los Angeles, USC

Hello!

My question is about reads that don't align to the genome yet are long and have very good Phred scores. Currently, my workflow is FastQC > Cutadapt > Trimmomatic > RNA-STAR > HTSeq-count > edgeR (RUVSeq). I use GENCODE genomes with Ensembl IDs, and even with the cleanest isolation of cells and excellent library production I still get only about 80% alignment to the genome.

I use the entire GENCODE genome and GTF files for the alignment, and I collect the unaligned reads. Sometimes there are a large number of long reads with good Phred scores, and I think that they would align to a perfect reference genome. The reference genome is not perfect by any means, and there are certainly some cell type and strain differences between the reference genome and the source of the total RNA.

Is there a way to construct and extract contigs from the unaligned reads and then BLAST them to see what they have homology to? Or to see whether they are genetic rearrangements, or simply un-annotated ORFs? Does anyone have experience with this? What software would you recommend? Any response is greatly appreciated. TIA.
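
To make "long with good Phred scores" concrete, here is a minimal sketch of the kind of filter I mean, applied to the unaligned reads before any assembly or BLAST step. It assumes the unaligned reads are in a plain four-line FASTQ file with Phred+33 qualities. The function names and the thresholds (`min_len=75`, `min_mean_q=30`) are placeholders chosen for illustration, not values from my actual runs.

```python
from typing import Iterator, TextIO, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality string)


def read_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield records from a four-line-per-read FASTQ file (assumed layout)."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file (or a blank line)
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string, assuming Phred+33 encoding."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def filter_reads(handle: TextIO, min_len: int = 75,
                 min_mean_q: float = 30.0) -> Iterator[Record]:
    """Keep reads that are both long and of high mean quality.

    The thresholds are illustrative assumptions, not recommendations.
    """
    for header, seq, qual in read_fastq(handle):
        if len(seq) >= min_len and mean_phred(qual) >= min_mean_q:
            yield header, seq, qual


def write_fastq(records: Iterator[Record], handle: TextIO) -> int:
    """Write records back out as FASTQ and return how many were written."""
    n = 0
    for header, seq, qual in records:
        handle.write(f"{header}\n{seq}\n+\n{qual}\n")
        n += 1
    return n
```

Used on a file, this would look something like `write_fastq(filter_reads(open("unaligned.fastq")), open("unaligned.filtered.fastq", "w"))`, where both file names are hypothetical. The reads that pass are the ones I would like to build contigs from.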

rnaseq alignment • 1.2k views