Hi Michael,
I have treated vs. untreated (wt) samples, and I know that a subset of genes is very lowly expressed in wt but up-regulated in the treated samples. When I run the DESeq2 analysis, most of them are at the top if I rank by adjusted p-value or by fold change, which makes sense. But in this case it looks like the top-ranked genes (smallest adjusted p-value or largest fold change) are biased toward genes that are lowly expressed in wt. Their baseMeans are intermediate, since baseMean is computed across all samples (treated + wt). So I think the shrinkage method will not help either if it is relative to baseMean. How does DESeq2 deal with such a bias?
Thank you in advance for your answer.
Hi ATpoint,
Thanks for your reply. Yes, I expect an intermediate baseMean. That is why I think shrinkage probably does not help much (correct me if I am wrong). I also expect these genes to be among the most significant ones, but since the fold change X/Y is anti-correlated with Y (spurious correlation), I am worried that their high fold change is only due to the very small Y. If I want to rank all the significant genes, how much should I trust the ones at the top of the list over the other significant genes?
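To make the concern concrete, here is a small Python sketch with made-up counts (not from a real dataset, and not DESeq2 code). It shows why the mean over all samples comes out intermediate for a gene that is near zero in wt, and how sensitive the raw log2 ratio X/Y is to a small Y. The `log2_ratio` helper and its pseudocount are purely illustrative assumptions, not something DESeq2 does.

```python
import math

# Hypothetical normalized counts for a single gene (made-up numbers).
wt = [1, 2, 0]
treated = [190, 210, 200]

# Mean over ALL samples, analogous in spirit to baseMean.
all_counts = wt + treated
base_mean = sum(all_counts) / len(all_counts)

wt_mean = sum(wt) / len(wt)
treated_mean = sum(treated) / len(treated)


def log2_ratio(x, y, pseudocount=0.0):
    """Raw log2 fold change of x over y, with an optional illustrative pseudocount."""
    return math.log2((x + pseudocount) / (y + pseudocount))


# Moving the wt mean from 1 to 2 reads shifts the raw log2 ratio by a full unit.
lfc_wt_1 = log2_ratio(treated_mean, 1.0)
lfc_wt_2 = log2_ratio(treated_mean, 2.0)

# A pseudocount of 1 dampens the ratio when the denominator is tiny.
lfc_damped = log2_ratio(treated_mean, wt_mean, pseudocount=1.0)

print(base_mean, lfc_wt_1, lfc_wt_2, lfc_damped)
```

With these numbers the overall mean sits between the two group means, and a difference of a single read in wt changes the unmoderated log2 ratio by one unit, which is the instability I am worried about.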
You really should show some data. I doubt that this can be answered based on textual descriptions alone.
Just a note:
The LFC shrinkage does not depend on the baseMean. It just uses the counts and the adaptive prior for LFC (looking across all genes). Unlike for dispersion estimation, our prior is experiment-wide for LFC, not specific to the gene's baseMean.
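As a rough illustration of the point, here is a minimal Python sketch using a textbook normal-normal approximation with a single zero-centered prior shared by all genes. It is not DESeq2's implementation: the function `shrink_lfc`, the prior width, and the example numbers are all hypothetical. It only shows that, with one experiment-wide prior, the amount of shrinkage follows from how precisely the counts determine the LFC, and baseMean never enters the calculation.

```python
def shrink_lfc(lfc_hat, se, prior_sd):
    """Posterior mean of an LFC under a zero-centered normal prior.

    lfc_hat  : unshrunken log2 fold change estimate
    se       : its standard error (large when counts carry little information)
    prior_sd : prior standard deviation, shared by all genes (assumed value)
    """
    weight = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return weight * lfc_hat


# Hypothetical experiment-wide prior width, identical for every gene.
prior_sd = 1.5

# Two genes with the same raw LFC but different precision.
well_estimated = shrink_lfc(6.0, se=0.3, prior_sd=prior_sd)
poorly_estimated = shrink_lfc(6.0, se=2.0, prior_sd=prior_sd)

print(well_estimated, poorly_estimated)
```

In this sketch both genes start from the same raw LFC, yet the one with the larger standard error is pulled much closer to zero, while the well-estimated one is left almost unchanged.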
Hi Michael,
You said it "uses the counts". Does "counts" here mean the counts from all samples (treated + wt) or just from wt? How does DESeq2 deal with genes that have very low counts only in wt but not in treated? Thx.
All counts, not specifically from one group.