Question

DESeq2 interaction term in two-factor design, which contrasts?


ljcohen ▴ 30


I have a 2-factor design with 2 levels each:

population: fish from MDPL or MDPP
condition: freshwater (FW) or transfer experimental treatments

> ExpDesign
                population condition
MDPL_FW_1             MDPL        FW
MDPL_FW_2             MDPL        FW
MDPL_FW_3             MDPL        FW
MDPL_transfer_1       MDPL  transfer
MDPL_transfer_2       MDPL  transfer
MDPL_transfer_3       MDPL  transfer
MDPP_FW_1             MDPP        FW
MDPP_FW_2             MDPP        FW
MDPP_FW_3             MDPP        FW
MDPP_transfer_1       MDPP  transfer
MDPP_transfer_2       MDPP  transfer
MDPP_transfer_3       MDPP  transfer

Our question is:

Which genes change in expression due to condition (is the condition effect different across populations?) in each population (MDPP and MDPL)?

I’m not very familiar with using interaction terms. After consulting the vignette and the ?results section on 'Using interaction terms', I still can’t decide for our particular question: do we want to look at the main effects of population (and/or condition) PLUS the interaction term or ONLY the interaction term? The post "DESeq2 multiple interaction terms 3-factor design" in particular helped to explain how to get at the interaction terms and main effects, but I’m still not sure what the best approach is for our question.

Here is our model:

dds <- DESeqDataSetFromTximport(txi.salmon, ExpDesign, ~population + condition + population:condition)
dds$population<-relevel(dds$population,ref="MDPP")
dds<-DESeq(dds,betaPrior=FALSE)

4 contrasts in results:

> matrix(resultsNames(dds))
     [,1]                              
[1,] "Intercept"                       
[2,] "population_MDPL_vs_MDPP"         
[3,] "condition_transfer_vs_FW"        
[4,] "populationMDPL.conditiontransfer"

I have tried this:

res<-results(dds, contrast=list(c("population_MDPL_vs_MDPP","populationMDPL.conditiontransfer")))

and this

res<-results(dds, name="populationMDPL.conditiontransfer")

Which comparison would be most appropriate to extract from the object?

I think it might be best to use the first one with main effects of population and the interaction term, but I’m not sure why. If we only look at the interaction term, the log fold change numerator population vs. denominator condition does not make sense to me.

Any insight or advice you might have would be greatly appreciated.

Thank you, Lisa

deseq2 experimental design interaction model • 11k views

updated 7.7 years ago by Michael Love 43k • written 7.8 years ago by ljcohen ▴ 30

score 3 · Answer 1 · 2017-07-29

hi Lisa,

Re:

"do we want to look at the main effects of population (and/or condition) PLUS the interaction term or ONLY the interaction term?"

Like in the diagram in the vignette, you have the main effect for condition for the reference level, and the interaction term which is the difference between the condition effect in the two groups. So if the question is: "is the condition effect different across populations?", then the null hypothesis is that the difference is 0, and so you would produce a results table for the interaction term.
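To make the meaning of the coefficients concrete, here is a small base R sketch for a single gene. The four group means are made-up log2 values chosen only for illustration (they are not from the data in this thread), and the coefficient names are the ones base R's `model.matrix` produces, which differ slightly from the DESeq2 names above.

```r
# Hypothetical log2 mean expression of one gene in each of the four groups
# (illustrative numbers only, not taken from the real experiment).
cell_means <- c(MDPP_FW = 5, MDPP_transfer = 6, MDPL_FW = 5.5, MDPL_transfer = 8)

# One row per group, with MDPP and FW as the reference levels,
# matching the relevel() call in the question.
groups <- data.frame(
  population = factor(c("MDPP", "MDPP", "MDPL", "MDPL"), levels = c("MDPP", "MDPL")),
  condition  = factor(c("FW", "transfer", "FW", "transfer"), levels = c("FW", "transfer"))
)

# Same formula as the DESeq2 design.
X <- model.matrix(~ population + condition + population:condition, data = groups)

# With one noise-free value per group, the coefficients solve X %*% beta = means exactly.
beta <- setNames(as.numeric(solve(X, cell_means)), colnames(X))

# Condition effect computed directly within each population.
effect_MDPP <- cell_means[["MDPP_transfer"]] - cell_means[["MDPP_FW"]]  # 6 - 5
effect_MDPL <- cell_means[["MDPL_transfer"]] - cell_means[["MDPL_FW"]]  # 8 - 5.5

# The condition main effect is the condition effect in the reference population,
# and the interaction term is the difference between the two condition effects.
print(beta)
print(c(effect_MDPP = effect_MDPP, effect_MDPL = effect_MDPL,
        difference = effect_MDPL - effect_MDPP))
```

Testing the interaction term against 0 is therefore testing whether the transfer vs FW change is the same in MDPL as in MDPP.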

You can also ask, "is there an effect of condition for population X", which is a different question than the one I just stated above. If X is the reference level, that is just the main effect for condition; if X is not the reference level, it is the main effect for condition plus the interaction term.
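As a rough sketch, the three questions map onto `results()` calls like the ones below. The `dds` here is a toy object built from simulated counts with `makeExampleDESeqDataSet` (an assumption made so that the example runs on its own); with the real data you would use the `dds` from the question, where MDPP is the reference population and FW the reference condition.

```r
library(DESeq2)

# Toy stand-in for the real data: simulated counts, 12 samples,
# labelled with the same 2 x 2 layout as ExpDesign in the question.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
dds$population <- factor(rep(c("MDPL", "MDPP"), each = 6),
                         levels = c("MDPP", "MDPL"))
dds$condition <- factor(rep(rep(c("FW", "transfer"), each = 3), times = 2),
                        levels = c("FW", "transfer"))
design(dds) <- ~ population + condition + population:condition
dds <- DESeq(dds, quiet = TRUE)

# Is the condition effect different across populations?
# (interaction term only)
res_interaction <- results(dds, name = "populationMDPL.conditiontransfer")

# Condition effect in the reference population (MDPP):
# the condition main effect on its own.
res_cond_MDPP <- results(dds, name = "condition_transfer_vs_FW")

# Condition effect in the non-reference population (MDPL):
# condition main effect plus the interaction term.
res_cond_MDPL <- results(dds, contrast = list(c("condition_transfer_vs_FW",
                                                "populationMDPL.conditiontransfer")))
```

Note that summing `population_MDPL_vs_MDPP` with the interaction term, as in the first attempt in the question, gives something else: the MDPL vs MDPP difference within the transfer condition.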