DESeq2: Question about the alpha in summary() and results()
1
0
Entering edit mode
Jon Bråte ▴ 260
@jon-brate-6263
Last seen 4 months ago
Norway

I'm getting slight differences when setting the alpha to 0.05 with the results() function and the summary() function. Is this correct, or am I doing something wrong?

Thanks, Jon

 

> res <- results(dds)
> summary(res)

out of 29391 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)     : 2218, 7.5% 
LFC < 0 (down)   : 2607, 8.9% 
outliers [1]     : 0, 0% 
low counts [2]   : 11397, 39% 
(mean count < 5)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

> res.05 <- results(dds, alpha = 0.05)
> summary(res.05)

out of 29391 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up)     : 1814, 6.2% 
LFC < 0 (down)   : 2210, 7.5% 
outliers [1]     : 0, 0% 
low counts [2]   : 12536, 43% 
(mean count < 7)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

> summary(res, alpha = 0.05)

out of 29391 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up)     : 1797, 6.1% 
LFC < 0 (down)   : 2197, 7.5% 
outliers [1]     : 0, 0% 
low counts [2]   : 11397, 39% 
(mean count < 5)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.2

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DESeq2_1.18.1              SummarizedExperiment_1.8.0 DelayedArray_0.4.1        
 [4] matrixStats_0.52.2         Biobase_2.38.0             GenomicRanges_1.30.0      
 [7] GenomeInfoDb_1.14.0        IRanges_2.12.0             S4Vectors_0.16.0          
[10] BiocGenerics_0.24.0       

loaded via a namespace (and not attached):
 [1] locfit_1.5-9.1          Rcpp_0.12.14            lattice_0.20-35        
 [4] tidyr_0.7.2             assertthat_0.2.0        digest_0.6.12          
 [7] R6_2.2.2                plyr_1.8.4              backports_1.1.1        
[10] acepack_1.4.1           RSQLite_2.0             ggplot2_2.2.1          
[13] zlibbioc_1.24.0         rlang_0.1.4             lazyeval_0.2.1         
[16] rstudioapi_0.7          data.table_1.10.4-3     annotate_1.56.1        
[19] blob_1.1.0              rpart_4.1-11            Matrix_1.2-12          
[22] checkmate_1.8.5         splines_3.4.3           BiocParallel_1.12.0    
[25] geneplotter_1.56.0      stringr_1.2.0           foreign_0.8-69         
[28] htmlwidgets_0.9         bit_1.1-12              RCurl_1.95-4.8         
[31] munsell_0.4.3           compiler_3.4.3          pkgconfig_2.0.1        
[34] base64enc_0.1-3         htmltools_0.3.6         nnet_7.3-12            
[37] tibble_1.3.4            gridExtra_2.3           htmlTable_1.11.0       
[40] GenomeInfoDbData_0.99.1 Hmisc_4.0-3             XML_3.98-1.9           
[43] dplyr_0.7.4             bitops_1.0-6            grid_3.4.3             
[46] xtable_1.8-2            gtable_0.2.0            DBI_0.7                
[49] magrittr_1.5            scales_0.5.0            stringi_1.1.6          
[52] XVector_0.18.0          genefilter_1.60.0       bindrcpp_0.2           
[55] latticeExtra_0.6-28     Formula_1.2-2           RColorBrewer_1.1-2     
[58] tools_3.4.3             bit64_0.9-7             glue_1.2.0             
[61] purrr_0.2.4             survival_2.41-3         yaml_2.1.15            
[64] AnnotationDbi_1.40.0    colorspace_1.3-2        cluster_2.0.6          
[67] memoise_1.1.0           knitr_1.17              bindr_0.1
deseq2 • 7.2k views
ADD COMMENT
2
Entering edit mode
@mikelove
Last seen 18 hours ago
United States

Take a look at what 'alpha' does in ?results:

   alpha: the significance cutoff used for optimizing the independent
          filtering (by default 0.1). If the adjusted p-value cutoff
          (FDR) will be a value other than 0.1, ‘alpha’ should be set
          to that value.

 

Whereas in ?summary.DESeqResults:

   alpha: the adjusted p-value cutoff. If not set, this defaults to the
          ‘alpha’ argument which was used in ‘results’ to set the
          target FDR for independent filtering, or if independent
          filtering was not performed, to 0.1.

So, in results(), alpha is used to change how the independent filtering is performed, which changes the adjusted p-values themselves. In summary(), alpha is just changing the threshold that is used to tally up the DE gene counts.

ADD COMMENT
0
Entering edit mode

I see, thanks!

ADD REPLY
0
Entering edit mode

I recently repeated an analysis with alpha set to 0.5 to match my FDR cutoff, as recommended on this thread. I expected this to change my results but it did absolutely nothing (see summary below). The stats appear to be identical for both analyses. Does this make sense? For some datasets, might this change not affect filtering AT ALL? Note that changing alpha to 0.01 did affect the results. I think this is telling me that I'm doing it right and that there is no difference if I run the analysis with alpha = 0.05 vs. 0.1. Would you agree?

Thanks!

sessionInfo()

R version 3.4.2 (2017-09-28) Platform: x8664-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 16299) other attached packages: [1] DESeq21.18.1

summary(res_p.05)

out of 23911 with nonzero total read count adjusted p-value < 0.05 LFC > 0 (up) : 1255, 5.2% LFC < 0 (down) : 1157, 4.8% outliers [1] : 0, 0% low counts [2] : 7925, 33% (mean count < 5)

summary(res)

out of 23911 with nonzero total read count adjusted p-value < 0.1 LFC > 0 (up) : 1590, 6.6% LFC < 0 (down) : 1567, 6.6% outliers [1] : 0, 0% low counts [2] : 7925, 33% (mean count < 5)

ADD REPLY
0
Entering edit mode

I don't follow you. These summary tables are different.

ADD REPLY
0
Entering edit mode

Ah, yes. They are different in that they give different numbers of DE genes. I assume this is simply due to the different p-value cutoff. The filtering is not different (as I thought it might be) and the actual statistics (lfcS, stat, pvalue, padj). As it is, I could simply filter my results from the default analysis (alpha = 0.1) and obtain the same result as I do when the analysis is done with alpha = 0.05. Does that make sense?

I was under the impression, from your comments here, that the alpha value would alter the filtering.

thanks!

ADD REPLY
0
Entering edit mode

*actual statistics are identical

ADD REPLY
0
Entering edit mode

I see - yes the filtering can be different but in this case it is not.

ADD REPLY
0
Entering edit mode

Gotcha. So it's good practice to use alpha at the same value as your FDR cutoff, but in my case it doesn't change the filtering. Thank you!

ADD REPLY
0
Entering edit mode

Here are the results with alpha = 0.01. This does change the filtering and it changes the statistics slightly. I think only the adjusted p-value though. I assume this is because of the number of genes being tested..?

summary(res_p.01)

out of 23911 with nonzero total read count adjusted p-value < 0.01 LFC > 0 (up) : 808, 3.4% LFC < 0 (down) : 598, 2.5% outliers [1] : 0, 0% low counts [2] : 7485, 31% (mean count < 4)

ADD REPLY
0
Entering edit mode

Hi Michael, I have 3 questions if that's ok.

1- I did DE and changed my alpha to 0.05 in results() to match my FDR cut-off. I understood that in summary(), alpha is set automatically to the same value I used when I ran results(). The question I have though, is when using MAplot() do I need to change alpha to match the value I used in results(), which in my case is 0.05?

2- I ran DESeq() with betaPrior=FALSE (which is default) then used results() with addMLE=FALSE (by default) followed by lfcShrink(). If I want to plot the shrunken values, I will need to run MAplot() with MLE=FALSE, right?

3-Finally, in MAplot() how will I know p-adj value that is used to show the significant genes (blue points in the plot)? is it the same as my alpha?

ADD REPLY
0
Entering edit mode

plotMA has its own argument alpha. You should set it to specify which threshold to show. It is 0.1 unless you change the value.

You don't need to run addMLE=FALSE when you run results, because it already has the MLE results. Likewise you don't have to specify MLE=FALSE it will plot the shrunken values by default.

ADD REPLY
0
Entering edit mode

Thank you Michael for the quick reply!

regarding addMLE=FALSE, if I don't add it, you stated that it has already the MLE results, so unshrunken values. But I want to plot the shrunken values.

On another thread you explained how to plot shrunken vs unshrunken values. After further reading DESeq2 documentation, what I understood is that:

1- We run DESeq() with default betaPrior=FALSE then results() with default addMLE=FALSE. Then shrunken can be obtained using lfcShrink()

2- If we run DESeq() with betaPrior=TRUE to get unshrunken values, we need to run results() with addMLE=TRUE

I went with option 1 and I am looking to use plotMA to plot shrunken vs unshrunken. Is this not possible?

ADD REPLY
0
Entering edit mode

Don't run DESeq() with betaPrior=TRUE. It's essentially deprecated. You won't see it mentioned in the vignette or documentation.

res1 <- results(dds) # unshrunken
res2 <- lfcShrink(dds) # shrunken
ADD REPLY
1
Entering edit mode

Thank you so much Michael! I really appreciate it :)

ADD REPLY

Login before adding your answer.

Traffic: 692 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6