Hello all,
I am new to RNA-seq analysis and am comparing treatment vs. control. Briefly, I mapped the RNA-seq reads to the reference genome using STAR and used featureCounts to count the reads assigned to the annotated genes. Before I start the differential expression analysis, I would like to identify the expressed genes. I used the command below and found 15463 genes. Is it right to say that 15463 out of 24321 genes are expressed?
dds<-dds[rowSums(counts(dds)) > 0,]
dds<-dds[rowSums(counts(dds)) > 5,]
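To make concrete what these two thresholds count, here is a minimal base-R sketch on a made-up count matrix. The gene names, sample names, and counts are hypothetical; with a real DESeqDataSet the matrix would come from counts(dds).

```r
# Toy raw-count matrix: 5 hypothetical genes x 4 hypothetical samples
cts <- matrix(
  c( 0,  0,  0,  0,
     1,  0,  2,  0,
     3,  2,  0,  1,
    50, 40, 60, 55,
     0,  0,  0,  7),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), c("ctrl1", "ctrl2", "trt1", "trt2"))
)

# Total reads per gene, summed over all samples
total <- rowSums(cts)

n_nonzero <- sum(total > 0)  # genes with at least one read in any sample
n_above5  <- sum(total > 5)  # genes with more than 5 reads in total

# Subsetting the matrix works the same way as subsetting dds by rows
cts_filtered <- cts[total > 5, , drop = FALSE]
```

Note that gene5 passes the "> 5" filter on the strength of a single sample, so a total-count threshold says little about whether a gene is consistently detected.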
Hi Michael,
Thanks for your reply. I just learned that even the command below is not really about selecting the expressed genes:
dds<-dds[rowSums(counts(dds)) > 5,]
According to the DESeq2 vignette, "removing rows in which there are no reads or nearly no reads, we reduce the memory size of the dds data object and we increase the speed of the transformation and testing functions within DESeq2".
I have been using the command 'dds<-dds[rowSums(counts(dds)) > 5,]' to find the expressed genes from the raw counts.
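For comparison, a stricter per-sample filter can be sketched in base R as follows. It keeps a gene only if it reaches a minimum count in at least as many samples as the smallest group. The thresholds (10 reads, 2 samples) and the toy data are assumptions chosen for illustration, not recommended values; on a real object, counts(dds) would take the place of the toy matrix.

```r
# Toy raw-count matrix: 4 hypothetical genes x 4 hypothetical samples
cts <- matrix(
  c( 0,  0,  0, 60,
    12, 15, 11, 14,
     2,  1,  2,  1,
     0,  3, 25, 30),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:4]), c("ctrl1", "ctrl2", "trt1", "trt2"))
)

min_count      <- 10  # assumed minimum reads per sample
smallest_group <- 2   # assumed size of the smallest experimental group

# For each gene, count the samples that reach min_count,
# then keep genes where that number is at least smallest_group
keep <- rowSums(cts >= min_count) >= smallest_group
filtered <- cts[keep, , drop = FALSE]

# The total-count filter, for comparison
keep_total <- rowSums(cts) > 5
```

Here geneA and geneC both pass the total-count filter (totals of 60 and 6) but fail the per-sample filter, because neither reaches 10 reads in two samples.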
Dear Michael,
As I mentioned, I obtained the gene-level counts using featureCounts. Can I use tximport to import the gene-level count matrix that I generated with featureCounts? After importing, I am thinking of using TPM to find the expressed genes.
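As far as I understand, tximport is aimed at the output of transcript-level quantifiers, while the featureCounts table already reports a Length column for each gene, so TPM could in principle be computed directly from the counts. A minimal base-R sketch, assuming a genes x samples count matrix and a vector of gene lengths in base pairs; the function name counts_to_tpm, the toy values, and the TPM >= 1 cutoff are all hypothetical:

```r
# Convert raw counts to TPM.
# counts:     numeric matrix, genes in rows, samples in columns
# lengths_bp: gene lengths in base pairs, one per row of counts
counts_to_tpm <- function(counts, lengths_bp) {
  stopifnot(nrow(counts) == length(lengths_bp))
  # Reads per kilobase: the length vector recycles down each column,
  # so every gene (row) is divided by its own length
  rate <- counts / (lengths_bp / 1000)
  # Scale each sample (column) so that its values sum to one million
  t(t(rate) / colSums(rate)) * 1e6
}

# Toy example: 3 hypothetical genes, 2 hypothetical samples
cts <- matrix(
  c(10,  0,
    20, 40,
     5, 10),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2"))
)
lengths_bp <- c(1000, 2000, 500)

tpm <- counts_to_tpm(cts, lengths_bp)

# Assumed cutoff: call a gene "expressed" if TPM >= 1 in every sample
expressed <- rowSums(tpm >= 1) == ncol(tpm)
```

In this toy example g1 has zero reads in s2, so it is not called expressed under the assumed cutoff, while g2 and g3 are.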