Question

limma voom: summary(decideTests(fit))

scheran02 • 0

Hi, I am running an RNA-seq analysis with limma and voom. I am puzzled by the summary(decideTests(fit)) function and am not sure what exactly it shows. As far as I understand it, my analysis should give 153 (104 + 49) genes with adjusted p-value < 0.05 for coefficient 2 (GroupNonprogr) when looking at the summary(results(fit)):

   (Intercept) GroupNonprogr pairsB
-1           2           104     44
0         9569         18789  18751
1         9371            49    147

However, none of the genes meet this cutoff when I extract coef 2. It is not an error, but I have trouble understanding what the summary function actually shows and how I can extract those values to an xls or txt file. So far I use:

tab <- topTable(fit, coef=2, n=Inf, sort.by='none', genelist=fData.short)
ix <- sort(tab$logFC, index.return=TRUE, decreasing=TRUE)$ix
tab.sorted <- tab[ix,]
write.table(tab.sorted, file="filename.txt", row.names=TRUE, quote=FALSE, sep="\t")

I would be very grateful if someone spent a few minutes to help me out here.

Andreas

written 5.9 years ago by scheran02

Hi Andreas,

Have you figured it out? I have the same problem.

I run

DEresults = decideTests(data.fit.eb,method='global',adjust.method="BH",p.value=0.05,lfc=1)
summary(DEresults)

For one of the coefficients I got:

  -1     0     1 
   15 54649    11

Then I concluded that 26 genes are differentially expressed. But looking in the DEresults table, none of the genes meet that criterion. None of the genes have p.adj < 0.05. I have 127 genes with absolute logFC > 1, then 70 genes with absolute logFC > 1 and pval < 0.05, but 0 genes with absolute logFC > 1 and adj pval < 0.05.

How does summary get to 26 genes?

Hope you can help me. Suz

written 5.0 years ago by Suz

Dear Suz,

If you want to ask a question, please post a question of your own rather than commenting on an unrelated question from a year ago.

I am guessing that you might be confused by "global" FDR adjustment. The limma documentation explains the difference between "global" and "separate" FDR adjustment.
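As a rough illustration of that difference (a base R sketch with made-up p-values, not limma's actual code): separate adjustment applies BH within each coefficient, which is what topTable reports as adj.P.Val for a single coef, while global adjustment pools the p-values for all coefficients into one vector before adjusting. All object names below are hypothetical.

```r
# Toy p-values, invented for illustration: 10 genes, 2 coefficients.
p <- cbind(
  coefA = rep(1e-6, 10),
  coefB = c(0.008, 0.02, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95)
)

# "separate"-style: BH within each column, like topTable on one coefficient
adj_separate <- apply(p, 2, p.adjust, method = "BH")

# "global"-style: BH over the whole matrix treated as a single vector
adj_global <- matrix(
  p.adjust(as.vector(p), method = "BH"),
  nrow = nrow(p), dimnames = dimnames(p)
)

# Number of genes per coefficient with adjusted p-value below 0.05
n_separate <- colSums(adj_separate < 0.05)
n_global <- colSums(adj_global < 0.05)
print(rbind(separate = n_separate, global = n_global))
```

Here coefB has no genes below 0.05 under separate adjustment but two under global adjustment, because the many small p-values for coefA change the BH ranking of the pooled vector. With lfc=1, decideTests additionally requires an absolute log fold change above 1.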

written 5.0 years ago by Gordon Smyth

score 3 · Answer 1 · 2019-04-29

I think there must be an error in your code somewhere. But it's hard to say, since you only give us some snippets and not a reproducible, self-contained example. So I'll give you one.

> library(limma)
> set.seed(0xabeef)
> mat <- matrix(rnorm(10000), 1000)
> mat[1:20,1:5] <- mat[1:20,1:5] + 2
> design <- model.matrix(~gl(2,5))
> fit <- lmFit(mat, design)
> fit2 <- eBayes(fit)
> topTable(fit2,2,Inf,p.value = 0.05)
   logFC AveExpr     t  P.Value adj.P.Val     B
10 -2.91   0.853 -4.61 4.67e-06   0.00467 3.842
1  -2.75   1.176 -4.37 1.44e-05   0.00721 2.838
19 -2.63   1.051 -4.17 3.40e-05   0.01135 2.075
4  -2.57   0.974 -4.09 4.86e-05   0.01216 1.759
6  -2.53   1.188 -4.01 6.66e-05   0.01332 1.482
9  -2.39   1.518 -3.80 1.57e-04   0.02622 0.725
> rslt <- decideTests(fit2)

> head(rslt)
     (Intercept) gl(2, 5)2
[1,]           1        -1
[2,]           1         0
[3,]           1         0
[4,]           1        -1
[5,]           1         0
[6,]           1        -1
> which(rslt[,2] != 0L)
[1]  1  4  6  9 10 19

You can see these are the same rows as from topTable. And anyway, there's this from ?decideTests:

     The setting 'method="separate"' is equivalent to using 'topTable'
     separately for each coefficient in the linear model fit, and will
     identify the same probes as significantly differentially expressed
     if 'adjust.method' is the same. 'method="global"' will treat the
     entire matrix of t-statistics as a single vector of unrelated
     tests. 'method="hierarchical"' adjusts down genes and then across
     contrasts. 'method="nestedF"' adjusts down genes and then uses
     'classifyTestsF' to classify contrasts as significant or not for
     the selected genes. Please see the limma User's Guide for a
     discussion of the statistical properties of these methods.

The limma package has been around since the early 2000s, and if decideTests and topTable didn't give the same results, someone would have pointed that out long ago. So like I said, there's probably an error in your code somewhere. As an example, you say "I am puzzled by the summary(decideTests(fit))", but then in the same paragraph you have "when looking at the summary(results(fit))", which is different code.

score 1 · Answer 2 · 2019-04-30

Andreas, I find it hard to figure out what you are trying to do or what help you are seeking.

You say that you don't know what summary(decideTests(fit)) does, but you seem to understand well enough. The output tells you there are 153 DE genes (49 up and 104 down) in the Nonprogr group.

You say that you want to write results to a file, but you've obviously already written a tab-delimited text file. Why is that not what you want? What is it that you do want?

You claim that actually you have no DE genes for coef 2, but that just isn't true. How have you come to that conclusion anyway? The code you show doesn't even look at DE genes.

You say you want to "extract those values", but what values exactly are you referring to? Do you want to see the DE genes? Why not just use

topTable(fit, coef=2, n=153)

Why are you sorting genes by fold change if you actually want genes with small p-values? If you did want to sort genes by fold-change, why didn't you use sort.by="logFC"? It is all quite puzzling.

You've framed your question as being about summary(decideTests()), but I am guessing the issue is more about how to extract DE genes.
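If extracting the DE genes is the goal, something along these lines might do it. This is only a base R sketch: the data frame `tab` is a made-up stand-in for the output of topTable(fit, coef=2, n=Inf), using the logFC and adj.P.Val column names that topTable returns, and `de`, `counts` and `out_file` are hypothetical names.

```r
# Stand-in for topTable(fit, coef = 2, n = Inf); all values are invented.
tab <- data.frame(
  logFC     = c(-2.9, 1.4, 0.2, -0.1, 2.1, -1.7),
  P.Value   = c(1e-5, 2e-4, 0.40, 0.75, 1e-4, 3e-3),
  adj.P.Val = c(0.001, 0.010, 0.60, 0.80, 0.008, 0.030),
  row.names = paste0("gene", 1:6)
)

# Keep genes passing the cutoff that decideTests applies by default (0.05)
de <- tab[tab$adj.P.Val < 0.05, ]

# Down (-1) and up (1) counts, comparable to one column of summary(decideTests(fit))
counts <- table(sign(de$logFC))
print(counts)

# Sort by fold change, largest first, and write a tab-delimited text file
de <- de[order(de$logFC, decreasing = TRUE), ]
out_file <- tempfile(fileext = ".txt")
write.table(de, file = out_file, sep = "\t", quote = FALSE, col.names = NA)
```

With a real fit, topTable(fit, coef=2, n=Inf, p.value=0.05, sort.by="logFC") should give the filtered and sorted table in one call, and the up and down counts should agree with the GroupNonprogr column of the summary as long as the adjustment method is the same.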

Finally, please consider upgrading to the current version of Bioconductor and limma. The version you are using is from 2017 or earlier. If you used a recent version of limma, then the row names from summary(decideTests()) would be more informative.