Question

interpretation complex design limma

0

dfrtyu • 0

@grateshak-10586

Last seen 3.4 years ago

United Kingdom

Hi everyone, and Prof. Gordon Smyth,

Please help me understand how best to interpret the two sets of contrasts below, both used with limma. The objective was to pool samples at a higher (secondary) grouping level, as well as to compare the first-level groups, within one design to get DGE.

So, with a design matrix and the logCPM mean-variance output from the voom() function, four people followed the logic of ordinary designs and added the 'higher/secondary-level' contrasts as sums, as below:

ct <- makeContrasts(
  g2v1 = (group2_dead + group2_alive) - (group1_dead + group1_alive),
  g2v1dead = group2_dead - group1_dead,
  g2v1alive = group2_alive - group1_alive,
  status = (group1_dead + group2_dead) - (group1_dead + group1_alive),
  levels = design)

b <- eBayes(contrasts.fit(lmFit(data, design), contrasts = ct))
summary(decideTests(b))

My question is: do the DE results from this approach have any meaningful interpretation, or should it be discarded completely in favour of dividing each sum by the number of groups pooled, as below?

ct <- makeContrasts(
  g2v1 = (group2_dead + group2_alive)/2 - (group1_dead + group1_alive)/2,
  g2v1dead = group2_dead - group1_dead,
  g2v1alive = group2_alive - group1_alive,
  status = (group1_dead + group2_dead)/2 - (group1_dead + group1_alive)/2,
  levels = design)

b <- eBayes(contrasts.fit(lmFit(data, design), contrasts = ct))
summary(decideTests(b))

sessionInfo()
R version 4.0.1 (2020-06-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
 [1] GEOmetadb_1.52.0    RSQLite_2.2.7       GSA_1.03.1          sva_3.38.0         
 [5] BiocParallel_1.24.1 genefilter_1.72.1   mgcv_1.8-31         nlme_3.1-148       
 [9] oligo_1.54.1        Biostrings_2.58.0   XVector_0.30.0      IRanges_2.24.1     
[13] S4Vectors_0.28.1    oligoClasses_1.52.0 affy_1.68.0         forcats_0.5.1      
[17] stringr_1.4.0       dplyr_1.0.6         purrr_0.3.4         readr_1.4.0        
[21] tidyr_1.1.3         tibble_3.1.1        ggplot2_3.3.5       tidyverse_1.3.1    
[25] limma_3.46.0        GEOquery_2.58.0     Biobase_2.50.0      BiocGenerics_0.36.1

limma • 1.1k views

ADD COMMENT • link updated 3.4 years ago by Gordon Smyth 52k • written 3.4 years ago by dfrtyu • 0

score 1 · Answer 1 · 2021-12-08

1

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 5 days ago

United States

When you fit a linear model and make comparisons, you are always computing the average for each group, and you make comparisons by calculating differences between those averages. In your first set of contrasts you are computing sums, whereas in the second you are computing averages. In other words,

g2v1=(group2_dead+group2_alive) - (group1_dead+group1_alive)

is the sum of the two group 2 means minus the sum of the two group 1 means, which isn't something you would normally care to know, whereas

g2v1=(group2_dead+group2_alive)/2 - (group1_dead+group1_alive)/2

is the average of group 2 minus the average of group 1, which is a readily interpretable quantity.
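To make the difference concrete, here is a small arithmetic sketch in base R. The four group means are made-up log2 expression values for a single hypothetical gene (they are not from the poster's data); the names simply mirror the design columns used in the question.

```r
# Hypothetical fitted group means (log2 scale) for one gene; values are invented
mu <- c(group1_dead = 5, group1_alive = 6, group2_dead = 7, group2_alive = 8)

# Sum-based contrast: sum of the group 2 means minus sum of the group 1 means
sum_contrast <- (mu[["group2_dead"]] + mu[["group2_alive"]]) -
  (mu[["group1_dead"]] + mu[["group1_alive"]])

# Mean-based contrast: average of group 2 minus average of group 1
mean_contrast <- (mu[["group2_dead"]] + mu[["group2_alive"]]) / 2 -
  (mu[["group1_dead"]] + mu[["group1_alive"]]) / 2

sum_contrast   # 4: twice the average log2 difference, hard to read as a logFC
mean_contrast  # 2: the average log2 fold change between group 2 and group 1
```

With these invented numbers, only the mean-based value reads directly as "group 2 is on average 2 log2 units higher than group 1".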

ADD COMMENT • link 3.4 years ago James W. MacDonald 68k

0

Many thanks for the reply, @James MacDonald!

Indeed, it is probably unnecessary to use g2v1=(group2_dead+group2_alive) - (group1_dead+group1_alive), hence the question about its interpretation with respect to the concept of DE. Part of why I asked about interpretability is that a 'non-expert' queried me about the inputs all being sums of log-scale data.

I take it you are indicating that such a sum-based contrast is not readily interpretable.

ADD REPLY • link 3.4 years ago dfrtyu • 0

0

The two different contrast matrices you give will yield identical lists of DE genes, p-values and FDRs. The only difference will be in the log-fold-changes, which will differ by a factor of 2 for any contrast written as a sum in one matrix and as a mean in the other (such as g2v1). As long as you know what the logFCs mean, both choices lead to the same conclusions, but I would always use the mean-mean contrast myself rather than sum-sum.
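The scaling argument can be checked with a minimal base R sketch for a single simulated gene. This is an illustration under stated assumptions, not the limma pipeline itself: it uses lm() and an ordinary t-test on invented data, and contrast_test() is a hypothetical helper written for this example. The same reasoning carries over to limma's moderated t-statistics, because rescaling a contrast rescales its estimate and its standard error by the same factor.

```r
set.seed(1)

# Simulated log-expression for one gene, 3 samples per group (invented data)
group <- factor(rep(c("group1_dead", "group1_alive",
                      "group2_dead", "group2_alive"), each = 3))
y <- rnorm(length(group), mean = rep(c(5, 6, 7, 8), each = 3))

# Cell-means model: one coefficient per group, ordered as levels(group)
fit <- lm(y ~ 0 + group)
beta <- coef(fit)
names(beta) <- levels(group)
V <- vcov(fit)

# Contrast weights, reordered by name to match the coefficient order
w_sum <- c(group1_alive = -1, group1_dead = -1,
           group2_alive = 1, group2_dead = 1)[levels(group)]
w_mean <- w_sum / 2

# Hypothetical helper: estimate, t-statistic and p-value for one contrast
contrast_test <- function(w) {
  est <- sum(w * beta)
  se <- sqrt(drop(t(w) %*% V %*% w))
  t_stat <- est / se
  p <- 2 * pt(-abs(t_stat), df = df.residual(fit))
  c(estimate = est, t = t_stat, p.value = p)
}

res_sum <- contrast_test(w_sum)
res_mean <- contrast_test(w_mean)

# Estimates differ by a factor of 2; t-statistics and p-values agree
rbind(sum = res_sum, mean = res_mean)
```

Because the test statistic is unchanged, any ranking or thresholding based on p-values or FDR is the same for both versions; only the reported effect size needs to be read with the scaling in mind.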

ADD REPLY • link 3.4 years ago Gordon Smyth 52k