Understanding edgeR differential expression results
1
0
Entering edit mode
Teresa • 0
@a420c4c0
Last seen 20 months ago
United States

Hi all,

I'm doing metatranscriptomics differential expression analysis in R studio. I have 1-3 replicates per site and am trying to use edgeR. I want to compare Sites J and F, both when water is flowing (On) and not flowing (Off) with sites A-D, which are constant. However, I'm somewhat confused by my output and how to interpret this biologically or if my method of analysis was just incorrect... (Rounded numbers used below). How can OnvConstant and OffvConstant be similar to each other when BothvConstant, which include both On and Off samples in "Both," be different? I tried a few On/Off Site v Constant Site constrasts, like FvA, FvB, JvA. JvB, and these result in a large number of genes labeled as downregulated... While constant vs constant, like AvB, AvC, result in about a quarter up, half notsig, a quarter down. FvJ is mostly NotSig, with a few up and down.

fit.glm <- glmFit(d, design.mat)
glm.contrasts<-makeContrasts(
  OnvSB=(J.On+F.On)-(A.Constant+B.Constant+C.Constant+D.Constant),
  OffvSB=(J.Off+F.Off)-(A.Constant+B.Constant+C.Constant+D.Constant),
  BothVConstant=(J.On+F.On+J.Off+F.Off)-(A.Constant+B.Constant+C.Constant+D.Constant),
  levels=design.mat)

OnVConstant.glm <- glmLRT(fit.glm, contrast=glm.contrasts[,"OnVConstant.glmt"])
OnVConstant.glmDE <- decideTestsDGE(OnVConstant.glm, adjust.method = "fdr", p.value=0.1)
summary(OnVConstant.glmDE)
#output 
Down 5
NotSig 300
Up 4600

#
OffVConstant.glm <- glmLRT(fit.glm, contrast=glm.contrasts[,"OffVConstant"])
OffVConstant.glmDE <- decideTestsDGE(OffVConstant.glm, adjust.method = "fdr", p.value=0.1)
summary(OffVConstant.glmDE)
#output:
Down 10
NotSig 695
Up 4200

#
BothVConstant.glm <- glmLRT(fit.glm, contrast=glm.contrasts[,"BothVConstant"])
BothVConstant.glmDE <- decideTestsDGE(BothVConstant.glm, adjust.method = "fdr", p.value=0.1)
summary(BothVConstant.glmDE)
#output:
Down 800
NotSig 1700
Up 2405


sessionInfo( )
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] forcats_0.5.1               stringr_1.4.0              
 [3] dplyr_1.0.8                 purrr_0.3.4                
 [5] readr_2.1.2                 tidyr_1.2.0                
 [7] tibble_3.1.6                ggplot2_3.3.5              
 [9] tidyverse_1.3.1             viridis_0.6.2              
[11] viridisLite_0.4.0           pheatmap_1.0.12            
[13] argparse_2.1.6              ape_5.6-2                  
[15] gplots_3.1.3                ctc_1.64.0                 
[17] amap_0.8-18                 DESeq2_1.30.1              
[19] SummarizedExperiment_1.20.0 Biobase_2.50.0             
[21] MatrixGenerics_1.2.1        matrixStats_0.61.0         
[23] GenomicRanges_1.42.0        GenomeInfoDb_1.26.7        
[25] IRanges_2.24.1              S4Vectors_0.28.1           
[27] BiocGenerics_0.36.1         edgeR_3.32.1               
[29] limma_3.46.0                BiocManager_1.30.16        

loaded via a namespace (and not attached):
  [1] colorspace_2.0-3       ellipsis_0.3.2        
  [3] XVector_0.30.0         fs_1.5.0              
  [5] rstudioapi_0.13        bit64_4.0.5           
  [7] lubridate_1.8.0        AnnotationDbi_1.52.0  
  [9] fansi_1.0.2            xml2_1.3.3            
 [11] codetools_0.2-16       splines_4.0.2         
 [13] cachem_1.0.6           geneplotter_1.68.0    
 [15] ade4_1.7-19            jsonlite_1.8.0        
 [17] phyloseq_1.34.0        broom_0.7.12          
 [19] annotate_1.68.0        cluster_2.1.0         
 [21] dbplyr_2.1.1           compiler_4.0.2        
 [23] httr_1.4.2             backports_1.4.1       
 [25] assertthat_0.2.1       Matrix_1.3-2          
 [27] fastmap_1.1.0          cli_3.1.0             
 [29] tools_4.0.2            igraph_1.2.11         
 [31] gtable_0.3.0           glue_1.5.0            
 [33] GenomeInfoDbData_1.2.4 reshape2_1.4.4        
 [35] Rcpp_1.0.8             cellranger_1.1.0      
 [37] vctrs_0.3.8            Biostrings_2.58.0     
 [39] zCompositions_1.4.0-1  rhdf5filters_1.2.1    
 [41] multtest_2.46.0        nlme_3.1-155          
 [43] iterators_1.0.14       rvest_1.0.2           
 [45] lifecycle_1.0.1        gtools_3.9.2          
 [47] XML_3.99-0.9           zlibbioc_1.36.0       
 [49] MASS_7.3-55            scales_1.1.1          
 [51] vroom_1.5.7            hms_1.1.1             
 [53] biomformat_1.18.0      rhdf5_2.34.0          
 [55] RColorBrewer_1.1-2     memoise_2.0.1         
 [57] gridExtra_2.3          NADA_1.6-1.1          
 [59] stringi_1.7.6          RSQLite_2.2.10        
 [61] genefilter_1.72.1      foreach_1.5.2         
 [63] permute_0.9-7          caTools_1.18.2        
 [65] BiocParallel_1.24.1    truncnorm_1.0-8       
 [67] rlang_1.0.2            pkgconfig_2.0.3       
 [69] bitops_1.0-7           lattice_0.20-41       
 [71] Rhdf5lib_1.12.1        bit_4.0.4             
 [73] tidyselect_1.1.2       plyr_1.8.6            
 [75] magrittr_2.0.1         R6_2.5.1              
 [77] generics_0.1.2         DelayedArray_0.16.3   
 [79] DBI_1.1.2              withr_2.5.0           
 [81] pillar_1.7.0           haven_2.4.3           
 [83] mgcv_1.8-39            survival_3.3-1        
 [85] RCurl_1.98-1.6         modelr_0.1.8          
 [87] crayon_1.5.0           KernSmooth_2.23-17    
 [89] utf8_1.2.2             tzdb_0.2.0            
 [91] ALDEx2_1.22.0          readxl_1.3.1          
 [93] locfit_1.5-9.5         grid_4.0.2            
 [95] data.table_1.14.2      blob_1.2.2            
 [97] vegan_2.5-7            reprex_2.0.1          
 [99] xtable_1.8-4           munsell_0.5.0
Transcriptomics edgeR DifferentialExpression RNASeqData RNASeq • 1.0k views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

The contrasts you have created do not look correct. It makes no sense to compare sums of coefficients. You probably intended to compare averages:

  OnvSB = (J.On+F.On)/2 - (A.Constant+B.Constant+C.Constant+D.Constant)/4,
  OffvSB = (J.Off+F.Off)/2 - (A.Constant+B.Constant+C.Constant+D.Constant)/4,
  BothVConstant = (J.On+F.On+J.Off+F.Off)/4 - (A.Constant+B.Constant+C.Constant+D.Constant)/4,
ADD COMMENT
0
Entering edit mode

Thanks! The results m make more sense this way, and averages is what I was hoping to compare. Is it ok that each one has a different number of replicates (J.On has 3, F.On has 1, A.Constant has 2, etc)? I'm unsure how to control for that.

ADD REPLY
0
Entering edit mode

Yes, it is ok that different groups have different numbers of replicates. edgeR adjusts for that automatically. However you may get somewhat more significantly DE genes when comparing groups with more replicates because of the increased statistical power.

ADD REPLY

Login before adding your answer.

Traffic: 478 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6