Question

DESeq2: Gene is DEG in one experiment but not in the other despite similar baseMean, L2FC, ...

fjrsa • 0

Hi everyone,

I'm currently analyzing ~30 treatments vs. mock using DESeq2 (v1.44.0). I don't pre-filter low-count genes, and I apply L2FC shrinkage (ashr).

This is an experiment in which we repeated the library prep from exactly the same RNA that was used for a first prep/experiment some weeks ago.

I will use placeholder names for treatments and genes, but I think it's clear what I mean. Here, I extracted the results for a given gene A in treatment1 vs. mock from two independent experiments.

[Image: DESeq2 results for gene A, treatment1 vs. mock, in both experiments]

Why is gene A not a DEG after multiple-testing correction in experiment 1 but is one in experiment 2, although it has a comparable baseMean and an even stronger L2FC and lower p-value?

What other features can influence the p-value adjustment in such a way that I end up with a DEG in one experiment but not in the other? The number of replicates is identical, and the total number of treatments in the experiment is identical.

Thanks!

DESeq2 • 1.0k views

ADD COMMENT • link updated 4 months ago by txema.heredia • 0 • written 4 months ago by fjrsa • 0

It would probably help if you broke out the raw and normalized read counts for all the samples.

ADD REPLY • link 4 months ago swbarnes2 ★ 1.4k

[Image: read counts for the samples]

ADD REPLY • link 4 months ago fjrsa • 0

As said, literally identical on a continuous scale but different in categorical terms.

ADD REPLY • link 4 months ago ATpoint ★ 4.8k

I think you have run into the common pitfall of treating results as categorical (significant vs. not significant) when in reality the results are almost identical. Look at the baseMean, logFC and pvalue: they're close. The independent filtering removed this gene in one experiment (reason unknown, maybe because the counts / baseMean are relatively low), and that is what makes the difference here. If you feel this is inappropriate, you could run the analysis with some low-count pre-filtering (see the vignette) and then turn off the independent filtering.
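More generally, an adjusted p-value depends not only on the gene's own p-value but on every other p-value that enters the correction. Here is a minimal Python sketch of the Benjamini-Hochberg adjustment to illustrate that point. It is not DESeq2 code, the function name `bh_adjust` is my own, and all p-values are made up for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the
    # adjusted values stay monotone (cumulative minimum from the top)
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data (made up): gene A is the first entry and has p = 0.008 in both sets.
# Experiment 1: no other gene has a small p-value.
pvals_exp1 = [0.008, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
# Experiment 2: several other genes have even smaller p-values.
pvals_exp2 = [0.008, 0.001, 0.002, 0.003, 0.005, 0.006, 0.4, 0.6, 0.8, 0.9]

padj_exp1 = bh_adjust(pvals_exp1)
padj_exp2 = bh_adjust(pvals_exp2)

print("gene A padj, experiment 1:", round(padj_exp1[0], 4))
print("gene A padj, experiment 2:", round(padj_exp2[0], 4))
```

In this toy case the identical p-value ends up above 0.05 in one set and below it in the other, purely because of the other genes in the test set. Anything that changes which genes are tested (such as a filter threshold) changes the adjustment too.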

ADD REPLY • link 4 months ago ATpoint ★ 4.8k

I would say baseMean, read counts, L2FC, ... are very similar between the two experiments - that's why I was wondering about the results.

By "reason unknown", do you mean that it's not easily possible to trace back why exactly gene A is a DEG in one experiment but not in the other? So in the end it's a mix of baseMean, L2FC, replicates per treatment, deviation across replicates, number of samples in the experiment, etc.?

I was also reading on the independent filtering and will give it a try. Thanks.

ADD REPLY • link 4 months ago fjrsa • 0

"Unknown reason" means that, from the results you show, one cannot easily infer why exactly the independent filtering (IF) removed this gene in one experiment but not in the other. If you turn IF off you will probably be fine, but still, hard cutoffs can produce differences even when the data are very similar; that is a known issue.

ADD REPLY • link 4 months ago ATpoint ★ 4.8k

You could check the reply I got to a similar issue here: "Adjusted p-values become NA when sub-setting samples".

In short, DESeq2 uses independentFiltering=TRUE by default. This picks a separate baseMean threshold for each pairwise comparison you run. If you then compare the results/DEGs between multiple pairwise comparisons, some of them might have much higher baseMean thresholds than others, which leads to genes suddenly getting padj=NA in some comparisons.

AFAIK, the advised way is to disable independentFiltering and apply the same (manual) baseMean threshold to all comparisons.
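To show what I mean, here is a small Python sketch of the filtering step only. It is not DESeq2 code: the helper `genes_kept`, the baseMean values and the thresholds are all made up for illustration, and the real independent filtering chooses its threshold from the data.

```python
# Toy baseMean values (made up); geneA is nearly identical in both experiments.
base_mean_exp1 = {"geneA": 12.0, "geneB": 150.0, "geneC": 3.0}
base_mean_exp2 = {"geneA": 12.5, "geneB": 140.0, "geneC": 2.5}


def genes_kept(base_means, threshold):
    """Return the genes whose baseMean reaches the threshold.

    Genes that are not kept would get no adjusted p-value (padj = NA).
    """
    return sorted(g for g, bm in base_means.items() if bm >= threshold)


# Hypothetical per-comparison thresholds, as a data-driven filter might pick them.
kept_auto_exp1 = genes_kept(base_mean_exp1, threshold=15.0)
kept_auto_exp2 = genes_kept(base_mean_exp2, threshold=8.0)

# One manual threshold shared by all comparisons.
shared_threshold = 10.0
kept_shared_exp1 = genes_kept(base_mean_exp1, shared_threshold)
kept_shared_exp2 = genes_kept(base_mean_exp2, shared_threshold)

print("per-comparison thresholds:", kept_auto_exp1, kept_auto_exp2)
print("shared threshold:         ", kept_shared_exp1, kept_shared_exp2)
```

With the per-comparison thresholds, geneA is dropped in one experiment and tested in the other even though its baseMean barely differs. With the shared threshold, the same genes are tested in both.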

ADD REPLY • link 4 months ago txema.heredia • 0