I am currently using DESeq2 to analyze differential expression in some bulk RNA-seq samples, and am running into an issue that I assume is an outlier issue? I essentially have two groups with two conditions (in vitro and invivo, tbx and hcn). I perform DESeq analyses on the in vitro and in vivo samples separately (tbx vs hcn for both). The invivo group is rather small - 5 hcn samples and 4 tbx samples (the in vitro group is relatively larger, with about 40 samples each). For the in vivo sample, I find that I have significantly fewer differentially expressed genes (we'd expect them to be similar). A large number (about 12,000/25,000) of genes are marked with p-value NA, which I presume is because there was an outlier (detected by Cook's distance) throwing the error - and since the sample size is under 7 the trimmed means estimate isn't used (I should mention that this number of NA genes didn't notably change when I used the replaceOutliersWithTrimmedMeans function and minReplicates = 3).
However, even ignoring these, I am finding some strange results. For example, the CXCR4 gene has p-value 0.898 (padj 0.965), but as you can see from the attached boxplot (which plots log-transformed DESeq-scaled values), it appears by eye to be significantly different. There is, however, one major outlier in the in vivo HCN population (both noticed visually and by Cook's distance).
Is there a way to try and account for this in my small sample number dataset without going manually through genes of interest?
Thanks for the tip on plotCounts()!
This sample is not consistently an outlier. It seems most probable that that value is technical error/noise as opposed to real signal. I understand that n = 4, 5 is small, but for the sake of my understanding, is there a reason why replaceOutliersWithTrimmedMeans(minReplicates = 3) doesn't change the p-value here?
edit: Actually, I'm guessing it doesn't because that value isn't considered an outlier based on default criteria?
It could be because it is not an outlier. The outlier replacement procedure is designed for replacing likely technical outliers, e.g. a count of 100,000 when all the other samples have counts in the range 0-10 is likely a technical artifact. If you are showing natural log scale above, that's a count of 7 with the other samples 0. The outlier replacement, if it went after what I would call "inconvienient samples" would make the method lose control of the false positive rate and false discovery rate.