Good Day!
I have trouble interpreting if the design model I made are fetching the DE genes that affects only my factors.
I am doing an DE analysis in edgeR with two factors: salt (Na) and pH. Each factor has two levels high (H) and low (L). The goal of the experiment is to determine genes that are affecting tolerances in salt only, in pH only and that affects both salt and pH (pH-Na). Each with six replicates. Per my understanding of userguide, I made a no-intercept design model matrix as I feel there is no baseline.
>group<-factor(paste(targets$pH,targets$Salt,sep="_"))
>group
[1] LpH_LNa LpH_LNa LpH_LNa LpH_LNa LpH_LNa LpH_LNa
[7] LpH_HNa LpH_HNa LpH_HNa LpH_HNa LpH_HNa LpH_HNa
[13] HpH_LNa HpH_LNa HpH_LNa HpH_LNa HpH_LNa HpH_LNa
[19] HpH_HNa HpH_HNa HpH_HNa HpH_HNa HpH_HNa HpH_HNa
Levels: HpH_HNa HpH_LNa LpH_HNa LpH_LNa
>design <- model.matrix(~0+group)
For simplicity, the design model matrix looks like this:
HpH_Hna | HpH_Lna | LpH_Hna | LpH_Lna | |
A1 | 0 | 0 | 0 | 1 |
A2 | 0 | 0 | 0 | 1 |
B1 | 0 | 0 | 1 | 0 |
B2 | 0 | 0 | 1 | 0 |
C1 | 0 | 1 | 0 | 0 |
C2 | 0 | 1 | 0 | 0 |
D1 | 1 | 0 | 0 | 0 |
D2 | 1 | 0 | 0 | 0 |
Additionally I made a contrasts table for ease of comparison.
>basic.contrasts <- makeContrasts(
cBvsA = LpH_HNa - LpH_LNa, #(B-A)
cCvsA = HpH_LNa - LpH_LNa, #(C-A)
cDvsC = HpH_HNa - HpH_LNa, #(D-C)
cDvsB = HpH_HNa - LpH_HNa, #(D-B)
pHeffect = ((HpH_LNa + HpH_HNa)/2) - ((LpH_HNa + LpH_LNa)/2),
Naeffect = ((HpH_HNa + LpH_HNa)/2) - ((HpH_LNa + LpH_LNa)/2),
pHNaeffect = ((HpH_HNa-LpH_HNa)-(HpH_LNa-LpH_LNa)),
>basic.contrasts
Contrasts
Levels cBvsA cCvsA cDvsC cDvsB pHeffect Naeffect pHNaeffect
HpH_HNa 0 0 1 1 0.5 0.5 1
HpH_LNa 0 1 -1 0 0.5 -0.5 -1
LpH_HNa 1 0 0 -1 -0.5 0.5 -1
LpH_LNa -1 -1 0 0 -0.5 -0.5 1
My question is:
1) Is it correct to assume that the genes created from "pHeffect", which can be computed as [(HpH_HNa+HpH_LNa)/2 - (LpH_HNa+LpH_LNa)/2
], will yield the DE genes that affects pH regardless of salt?
2) Or, it is correct assume that the genes common for both DE in cDvsB [HpH_HNa - LpH_HNa]
and cCvsA [HpH_HNa - LpH_HNa]
will yield the DE genes that affects pH regardless of salt?
3) I would like also to confirm if my interpretation that the pHNaeffect [(HpH_HNa - LpH_HNa) - (HpH_LNa - LpH_LNa)]
are the genes that affects both pH and salt tolerance?
4) Or is there a better way to do these?
Adding onto Ryan's answer to #3: the contrast you've supplied will test whether the interaction between pH and salt is zero, i.e., is the combined effect of pH and salt equal to the sum of the individual effects?
If you want to find genes that are DE with both salt and pH, you could use Ryan's additive approach, provided that most genes do not have any significant interactions (otherwise the variance estimates would be inflated by a misspecified model). Alternatively, you could use your existing model with the one-way layout, and do something like:
LpH_LNa vs LpH_HNa
andLpH_LNa vs HpH_LNa
. This will identify genes that are DE upon increasing either salt or pH.pH_HNa vs LpH_HNa
andHpH_HNa vs HpH_LNa
. This will identify genes that are DE upon decreasing either salt or pH.The use of two intersection operations reflects the fact that the effect of the perturbations are dependent on the context, i.e., increasing salt in a low-pH context may not have the same effects as decreasing salt in a high-pH context.
Thank you!
I thought so! I'm wrong. The venn diagram I made when I match the intersection and my old "pH/Na effect" does not coincide well. I'll try to use an additive model. It pH and salt DEs there should correspond similarly with the genes that are common with
[HpH_HNa - LpH_HNa]
and cCvsA[HpH_LNa - LpH_LNa]
in my old model, right?Maybe they'll match up, maybe they won't. It depends on how many genes have significant interaction effects. This will inflate the variance estimates and reduce power in the additive model. Furthermore, if many genes do have significant interaction effects, this means the salt context is important when considering changes in pH (and vice versa). For such genes, the test results from the additive model would be uninterpretable.