Question

Using p.nominal DE genes from limma

0

chris86 ▴ 420

@chris86-8408

Last seen 5.2 years ago

UCL, United Kingdom

Hi

My supervisor keeps telling me we should just use the genes with nominally significant (unadjusted) p-values from our limma DE analysis, since none remain significant after FDR correction. However, if we do this, how do we know we are not just looking at gene expression noise?

Thanks,

Chris

limma • 902 views

updated 8.8 years ago by b.nota ▴ 370 • written 8.8 years ago by chris86 ▴ 420

0

I presume you meant that you "don't get any [DE genes from the] FDR corrected ones", rather than not getting any corrected p-values at all. If you're having trouble finding them, they're in the adj.P.Val column produced by topTable.

8.8 years ago Aaron Lun ★ 28k

0

Yes, I can find them; we just don't get any significant ones. The data is very heterogeneous between our groups of interest. My supervisor says to analyse the uncorrected but significant ones.

8.8 years ago chris86 ▴ 420

score 3 · Accepted Answer · 2016-04-15

3

b.nota ▴ 370

@bnota-7379

Last seen 4.4 years ago

Netherlands

Don't listen to your supervisor: FDR adjustment is necessary for DE analysis in limma!

8.8 years ago b.nota ▴ 370

2

... and in any scenario where you have a moderate-to-large number of tests. For example, if you have 1000 genes that are not DE, then you can expect about 50 of them to have a p-value below 0.05, just by chance (1000 × 0.05 = 50). The FDR adjustment protects you against this multiplicity effect and, as b.nota says, is mandatory for limma (and really, any genomics analysis with thousands of features). If you don't get any DE genes from the adjusted p-values, then so be it; either your system doesn't have any DE genes, or your study doesn't have enough power to detect them. Forgoing the multiplicity correction isn't going to change that.
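To make the multiplicity effect concrete, here is a minimal Python sketch (not limma itself). It draws 1000 p-values for genes with no true differential expression, counts how many fall below 0.05, and then applies a Benjamini-Hochberg adjustment, which is the method limma's topTable uses by default for adj.P.Val. The `bh_adjust` helper, the seed, and the variable names are illustrative assumptions, not part of any library.

```python
import random

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (illustrative helper).

    Each p-value is scaled by m / rank, then a running minimum is taken
    from the largest p-value downwards so the result stays monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Under the null hypothesis, p-values are uniform on [0, 1].
rng = random.Random(42)  # fixed seed, chosen arbitrarily
null_pvalues = [rng.random() for _ in range(1000)]

n_nominal = sum(p < 0.05 for p in null_pvalues)
adjusted = bh_adjust(null_pvalues)
n_adjusted = sum(q < 0.05 for q in adjusted)

print("non-DE genes with unadjusted p < 0.05:", n_nominal)
print("non-DE genes with BH-adjusted p < 0.05:", n_adjusted)
```

The first count should land somewhere around 50 even though no gene is truly DE, whereas the adjusted count can never exceed it, because a BH-adjusted p-value is never smaller than the raw one.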

If you want, you can relax the FDR threshold to, say, 20%, to identify more putative DE genes. This will give you a larger set of genes, albeit with a greater proportion of false positives. However, at least you're upfront about the fact that your discoveries are more likely to be false. This is preferable to trying to sneak in "DE" genes at the typical 5% threshold using unadjusted p-values, which seems rather disingenuous to me. Say you give your collaborators a list of DE genes and tell them that the error rate is 5%, but it actually ends up being 50-100% after they waste a fortune on follow-up experiments. Imagine how happy they would (not) be.
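The trade-off between these thresholds can be sketched with a small Python simulation (again, not limma). Everything about the setup is an assumption made for illustration: 900 non-DE genes with uniform p-values, 100 DE genes whose p-values are drawn from an arbitrary Beta(0.5, 20) distribution so that they skew towards zero, and the hypothetical helpers `bh_adjust` and `summarize`. Because the simulation knows which genes are truly null, it can report how many calls in each list are false.

```python
import random

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (illustrative helper)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def summarize(values, is_null, cutoff):
    """Return (number of genes called, number of those that are truly null)."""
    called = [i for i, v in enumerate(values) if v <= cutoff]
    false_calls = sum(1 for i in called if is_null[i])
    return len(called), false_calls

rng = random.Random(1)  # fixed seed, chosen arbitrarily
null_p = [rng.random() for _ in range(900)]            # non-DE genes
de_p = [rng.betavariate(0.5, 20) for _ in range(100)]  # assumed DE genes
pvalues = null_p + de_p
is_null = [True] * 900 + [False] * 100
adjusted = bh_adjust(pvalues)

results = {
    "unadjusted p <= 0.05": summarize(pvalues, is_null, 0.05),
    "BH FDR <= 0.05": summarize(adjusted, is_null, 0.05),
    "BH FDR <= 0.20": summarize(adjusted, is_null, 0.20),
}
for label, (n_called, n_false) in results.items():
    share = n_false / n_called if n_called else 0.0
    print(f"{label}: {n_called} called, {n_false} false ({share:.0%})")
```

The exact counts depend on the assumed effect sizes and the seed, so treat the printed numbers as a toy illustration only. The point is that the relaxed FDR list comes with a stated error rate, whereas the unadjusted list has no such guarantee.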