Hello,
I have a raw count matrix with ~60 000 genes (rows) and ~200 samples (columns). I want to run the DESeq() function from DESeq2 for differential expression analysis. However, I am only interested in protein-coding genes (~20 000), and my original count matrix contains both protein-coding and non-coding genes. Which of the following approaches should I use?
1. Subset the original count matrix to protein-coding genes only, and then run DESeq()?
2. Run DESeq() on the original count matrix containing all genes, and then subset the DE results to the protein-coding genes?
Thanks
Option 2 is preferable: run DESeq() on all genes, then subset the results to the protein-coding genes.
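
A minimal sketch of what that could look like, in case it helps. It uses simulated data from makeExampleDESeqDataSet() in place of your real matrix, and `protein_coding_ids` is a hypothetical vector of gene IDs that you would build from your own annotation (for example a GTF file). The re-adjustment step at the end is optional and is my assumption, not something required by DESeq2.

```r
library(DESeq2)

# Simulated stand-in for the real data: 1000 genes x 12 samples,
# with a two-level `condition` factor and design ~ condition.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Hypothetical annotation: pretend the first 300 genes are protein coding.
# With real data, build this vector from your gene annotation instead.
protein_coding_ids <- rownames(dds)[1:300]

# Fit the model on ALL genes.
dds <- DESeq(dds)
res <- results(dds)

# Subset the results table to the protein-coding genes afterwards.
res_pc <- res[rownames(res) %in% protein_coding_ids, ]

# Optional (assumption): re-adjust the p-values over the subset only.
# Stored in a new column so the original padj from results() is kept.
res_pc$padj_pc <- p.adjust(res_pc$pvalue, method = "BH")

head(res_pc[order(res_pc$padj_pc), ])
```

With your own data you would replace the simulated object with DESeqDataSetFromMatrix(countData = ..., colData = ..., design = ...) and keep the rest the same.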