I am trying to use the DESeq2 package to analyze RNASeq dataset in a very complex design and am having trouble wrapping my head around. I have a 4-factor experiment generated from phosphoTRAP protocol:
- age (with 4 levels): 1, 2, 3, 4
- sex (with 2 levels): F, M
- stimulus (with 2 levels): Ctrl, Exp
- fraction (with 2 levels): total, IP
We expect each factor and their interactions to be important in explaining gene expression. In order to analyze genes influenced by an individual component and those influences by multiple variables, I'm particularly interested in the following results/questions:
- which genes are DE for IP over total fraction at each
age*sex*stimulus
- which fraction-DE genes are DE for Exp over Ctrl at each
age*sex
(so I guess, a ratio of a ratio?) - are there differences between any
age*sex
in stimulus-DE from question 2
my model is design=~fraction*age*sex*stimulus
I run DESeq2 and output comes out I think correctly, but I am confused how to extract the contrasts of interest to answer my questions. For instance, if I want to know fraction-DE for age4/sexM/stimulusExp, I think I set my contrasts to "fraction_IP_vs_total","fractionIP.age4.sexM.stimulusExp"
, this seems obvious. But then if I want to answer question 2, do I set contrast to ("stimulus_Exp_vs_Ctrl", "fractionIP.age4.sexM.stimulusExp")
? Or ("fractionIP.stimulusExp", "fractionIP.age4.sexM.stimulusExp")
? Or some else entirely?
I guess another way to put is, for "fractionIP.age4.sexM.stimulusExp"
, does this interaction term for contrast mean it already contains the genes that are fraction-DE due to 'fractionIP' as part of the term, or does this need to fold into the other contrast term?
It is easy to wrap my head around the simple two factor designs in DESeq2 manual, but with more complicated designs, I am not so sure. Any guidance is much appreciated.
All the bests. Mike.