Hey,
I'm using Seurat on an scRNA-seq dataset with two groups of mice, A and B. Everything below holds whether I use the normalized counts from the SCT assay or the normalized counts from the RNA assay (obtained with the "LogNormalize" method of the "NormalizeData" function). Here's the issue: pathway enrichment analyses (with "clusterProfiler") comparing group A to group B suggest that almost exclusively energy-related pathways are enriched in group A, while a variety of pathway types are enriched in group B. Looking at the "leading edge" genes, the genes driving the enrichment of energy-related pathways in group A are highly expressed in both groups, while the genes driving the enrichment of the various pathways in group B are expressed at substantially lower levels.
I'm worried about the following possibility: what if, in reality, the number of transcripts per cell for non-energy-related genes is similar between group A and group B, while the number of transcripts per cell for energy-related genes is indeed higher in group A than in group B? In such a scenario, since normalization adjusts each cell's counts for its total, the non-energy-related genes would make up a smaller share of each cell's transcripts in group A, so normalization would make it look as if they are upregulated in group B when in reality they aren't.
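To make the worry concrete, here is a toy calculation with made-up numbers. It is plain Python, not Seurat code, and the two "cells" and their transcript counts are purely hypothetical. It only mimics the library-size scaling step of "LogNormalize" (counts divided by the cell total, times the default scale factor of 10,000) and leaves out the log transform, which would not change the direction of the effect.

```python
# Toy illustration with made-up numbers; not Seurat code and not real data.
SCALE = 10_000  # default scale factor used by LogNormalize


def normalize(counts, scale=SCALE):
    """Divide each count by the cell's total count and multiply by `scale`."""
    total = sum(counts.values())
    return {gene: n / total * scale for gene, n in counts.items()}


# Hypothetical "true" transcripts per cell, pooled by gene class.
# Non-energy ("other") genes are identical in both groups;
# energy genes are twice as abundant in group A.
cell_a = {"energy": 600, "other": 400}
cell_b = {"energy": 300, "other": 400}

norm_a = normalize(cell_a)
norm_b = normalize(cell_b)

# Ratios of group B to group A for the non-energy genes.
true_ratio_other = cell_b["other"] / cell_a["other"]
apparent_ratio_other = norm_b["other"] / norm_a["other"]

# Ratios of group A to group B for the energy genes.
true_ratio_energy = cell_a["energy"] / cell_b["energy"]
apparent_ratio_energy = norm_a["energy"] / norm_b["energy"]

print(f"other genes,  B/A: true {true_ratio_other:.2f}, "
      f"after normalization {apparent_ratio_other:.2f}")
print(f"energy genes, A/B: true {true_ratio_energy:.2f}, "
      f"after normalization {apparent_ratio_energy:.2f}")
```

With these numbers the non-energy genes have a true B/A ratio of 1.00 but come out at about 1.43 after scaling, while the true 2.00-fold difference in energy genes shrinks to 1.40. That is the pattern I'm afraid of seeing in my data.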
If my worry is justified (i.e. if I'm not missing something), what can I do to alleviate the problem?
Cheers,
Omer