Sorry if this is a simple question for some; I am flying blind on my first attempt at RNA-seq, with immediate deadlines looming.
I've used a workflow of Salmon (with FastQC) -> tximport -> DESeq2, and I've thrown in a bit of scater for some PCA calculations. However, I would like to plot the absolute number and percentage of mapped reads for each experimental sample (6 controls, 6 exposures). Given the above workflow, is there anything available that yields this information?
Thanks
Isn't there a tool within the Bioconductor suite that can do this, rather than adding further software?
To get the percentages, you need to know the total number of reads, which so far as I know salmon doesn't report. So you need to count the total number of reads in each FASTQ file, either by reading each one in using something like readFastq in the ShortRead package, or by using a system call like zcat fastq.gz | wc -l and capturing the output (then dividing the line count by 4, since each FASTQ record takes four lines), or maybe some other way that isn't occurring to me at this point, or by using MultiQC, which will collate the total number of reads into a nice table that you can read in using read.delim.
I tend to use MultiQC, because it generates a really nice HTML page showing the collated QC data from FastQC, and as a side effect gives me the total counts. So two birds, one stone.
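For the system-call route, here is a minimal sketch of how the counting could be wrapped up. The function names count_reads and count_reads_table are made up for illustration, and it assumes standard four-line FASTQ records (no wrapped sequence lines) in gzip-compressed files. It uses gzip -dc, which does the same job as zcat.

```shell
#!/usr/bin/env bash

# count_reads (hypothetical helper): print the number of reads in one
# gzipped FASTQ file. Assumes each record spans exactly four lines,
# so reads = lines / 4.
count_reads() {
    local lines
    lines=$(gzip -dc "$1" | wc -l)
    echo $(( lines / 4 ))
}

# count_reads_table (hypothetical helper): print a tab-separated table
# with a header row and one row per file given on the command line.
count_reads_table() {
    printf 'file\ttotal_reads\n'
    local f
    for f in "$@"; do
        printf '%s\t%s\n' "$(basename "$f")" "$(count_reads "$f")"
    done
}
```

After sourcing the functions, something like count_reads_table *.fastq.gz > total_reads.tsv would give a small table that can be read into R with read.delim. For paired-end data you would probably want to count only one mate file per sample.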