I am trying to figure out the workflow when I want to select only a small part of the genome for DE-analysis.
Specifically, we want to test only about 1,000 of roughly 60,000 genes.
Following earlier posts, I think the way to go is to normalize on all genes first and then subset to these 1,000 genes afterwards.
My question is whether I should keep the library sizes if I subset after normalization.
If I am correct, I should keep the library sizes when subsetting after normalization, so keep.lib.sizes = TRUE.
When subsetting before normalization, keep.lib.sizes should be FALSE. Does that make sense?
As previously discussed, you must never reset the library sizes in an edgeR DGEList object after performing TMM normalization. Just leave keep.lib.sizes at the default value (TRUE).
If you want to focus on 1000 genes of special biological interest, then you can subset the edgeR object that is input to topTags, i.e. the test result object. You should not be subsetting the edgeR objects any earlier than that in the pipeline.
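
A minimal sketch of that order of operations, assuming a simple two-group design and the quasi-likelihood pipeline (glmQLFit / glmQLFTest). The simulated counts, the group labels, and the genes_of_interest vector are hypothetical stand-ins for real data, and a real analysis would normally also filter low-count genes before normalization:

```r
library(edgeR)

set.seed(1)
# Hypothetical data: 2000 genes x 6 samples, two groups of three
counts <- matrix(rnbinom(2000 * 6, mu = 50, size = 5), nrow = 2000,
                 dimnames = list(paste0("gene", 1:2000), paste0("s", 1:6)))
group <- factor(rep(c("A", "B"), each = 3))

# Normalize and fit on ALL genes; library sizes are never reset
y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)                 # TMM normalization
design <- model.matrix(~ group)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)
res <- glmQLFTest(fit, coef = 2)

# Subset only at the very end, on the object passed to topTags
genes_of_interest <- paste0("gene", 1:100)   # hypothetical gene list
idx <- rownames(res$table) %in% genes_of_interest
tab <- topTags(res[idx, ], n = Inf)
head(tab$table)
```

Because the subset is taken from the test results, the normalization factors and dispersion estimates still come from the full set of genes, while the multiple-testing adjustment reported by topTags is computed over the selected genes only.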