Question

Trying to design a model for a two-factor RNA-seq experimental analysis in edgeR with interaction


jvera8888 • 0


Dear forum,

I'm attempting to use edgeR to analyze a two- or three-factor RNA-seq experiment: Virus (virus | no virus), Treatment (treatment | no treatment), and Family (3 family groups). I'm not interested in Family except to account for any variation it introduces, perhaps in a separate model (block design?). Here is a breakdown of the targets:

Sample	Virus	Treat	TreatGroup	Fam
Fam1_TreatPlusVirus	Virus	Treat	Virus.Treat	1
Fam1_Treat	Control	Treat	Control.Treat	1
Fam1_Virus	Virus	Control	Virus.Control	1
Fam1_Control	Control	Control	Control.Control	1
Fam2_TreatPlusVirus	Virus	Treat	Virus.Treat	2
Fam2_Treat	Control	Treat	Control.Treat	2
Fam2_Virus	Virus	Control	Virus.Control	2
Fam2_Control	Control	Control	Control.Control	2
Fam3_TreatPlusVirus	Virus	Treat	Virus.Treat	3
Fam3_Treat	Control	Treat	Control.Treat	3
Fam3_Virus	Virus	Control	Virus.Control	3
Fam3_Control	Control	Control	Control.Control	3

Although I've gathered some vital clues from the excellent edgeR tutorial, I cannot quite wrap my head around how to go about answering the following three questions:

1) Which genes are differentially expressed between the three TreatGroups (Virus.Treat, Virus.Control, and Control.Treat) while accounting for the control (Control.Control)? In the tutorial it seemed a nested design was the best bet, but I'm having trouble matching this question to the right contrast(s), as there are subtle differences between my experiment and the examples given (e.g. I'm interested mostly in how the Virus.Treat combination group differs from the other two, but I'd also like to see which genes differ from the perspective of the other two groups).

2) Which genes respond in a synergistic fashion in TreatGroup 'Virus.Treat' compared with the other two TreatGroups (Virus.Control and Control.Treat), all relative to Control.Control? I realize that the answer to this question will partly overlap with question 1, but I'm thinking they are not necessarily identical. A synergistic response to the interaction of virus and treatment (imagine it has strong side-effects) includes the "very different" gene responses from question 1, but can also include genes responding much more strongly (but in the same direction) than a simple additive effect would account for. I imagine an interaction coefficient would be needed in the full model to make the proper contrast for this, but I'm not sure how to go about it.

3) If there is a strong family effect (large variation due to family), can the block design described in the tutorial be used to account for this variation while still answering the above two questions? I've already made an MDS plot and run the correlation used in example 4.2, and both seem to indicate some pretty strong variation.

Thanks in advance for any help offered!

Cris

EdgeR RNAseq experimental design glm • 1.8k views

updated 8.1 years ago by Aaron Lun ★ 28k • written 8.1 years ago by jvera8888 • 0

score 1 · Answer 1 · 2017-03-14

I'll start by mentioning the design matrix I would use:

design <- model.matrix(~0 + TreatGroup + Fam)

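... which treats each treatment combination as a separate group, and blocks on the family. (Fam needs to be a factor here, not the numeric 1/2/3 from the targets table; otherwise it would be fitted as a single linear covariate rather than a blocking factor.)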
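To make this concrete, here is a minimal sketch of building that design matrix from the targets table in the question. The `targets` data frame is a hand-typed reconstruction of that table (an assumption, not the poster's actual object), and stripping the `TreatGroup` prefix from the column names is just one way to get the short names used in the contrasts further down.

```r
# Hypothetical reconstruction of the targets table from the question.
targets <- data.frame(
  Virus = rep(c("Virus", "Control", "Virus", "Control"), times = 3),
  Treat = rep(c("Treat", "Treat", "Control", "Control"), times = 3),
  Fam   = rep(1:3, each = 4)
)
targets$TreatGroup <- paste(targets$Virus, targets$Treat, sep = ".")

# Both predictors must be factors; Fam is numeric in the table.
TreatGroup <- factor(targets$TreatGroup,
                     levels = c("Control.Control", "Control.Treat",
                                "Virus.Control", "Virus.Treat"))
Fam <- factor(targets$Fam)

design <- model.matrix(~0 + TreatGroup + Fam)

# Drop the "TreatGroup" prefix so columns read e.g. "Virus.Treat".
colnames(design) <- sub("^TreatGroup", "", colnames(design))
colnames(design)
```

The first four columns are the group means for family 1, and `Fam2`/`Fam3` are the family offsets. With 12 samples and 6 coefficients, this leaves 6 residual degrees of freedom for dispersion estimation.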

Now, to answer your specific questions. For the first one: if you want to identify genes that are DE between any of the non-control groups (i.e., all but control/control), you would run an analysis of deviance (ANODEV) like so:

con <- makeContrasts(Virus.Treat - Control.Treat, 
                     Virus.Treat - Virus.Control, levels=design)

... taking some liberties with the column names for the design matrix, for simplicity. The control/control group doesn't get involved at all if you're only looking for differences between the treatment groups (it would just cancel out anyway if you forced it in). For example:

# Computing differences in DE log-fold changes between treatments:
(Virus.Treat - Control.Control) # log-fold change in VT vs control
- (Virus.Control - Control.Control) # log-fold change in VC vs control
= Virus.Treat - Virus.Control # control group cancels out
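As a rough sketch of how the two-column contrast matrix for the first question could be carried through to a test, the following uses edgeR's quasi-likelihood pipeline. The simulated `counts` matrix is a placeholder with no real DE, the design is rebuilt so the block is self-contained, and the choice of `glmQLFit()`/`glmQLFTest()` is an assumption on my part; `glmFit()` followed by `glmLRT()` accepts the same `contrast` argument.

```r
library(edgeR)  # also attaches limma, which provides makeContrasts()

# Design rebuilt here so the sketch runs on its own.
TreatGroup <- factor(rep(c("Virus.Treat", "Control.Treat",
                           "Virus.Control", "Control.Control"), times = 3))
Fam <- factor(rep(1:3, each = 4))
design <- model.matrix(~0 + TreatGroup + Fam)
colnames(design) <- sub("^TreatGroup", "", colnames(design))

# Placeholder counts (simulated); substitute your own count matrix.
set.seed(1)
counts <- matrix(rnbinom(1000 * 12, mu = 50, size = 10), ncol = 12)
rownames(counts) <- paste0("gene", seq_len(1000))

y <- DGEList(counts = counts)
y <- calcNormFactors(y)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

con <- makeContrasts(Virus.Treat - Control.Treat,
                     Virus.Treat - Virus.Control, levels = design)

# A multi-column contrast matrix gives one joint test per gene:
# the null is that all listed contrasts are zero simultaneously.
res <- glmQLFTest(fit, contrast = con)
topTags(res)
```

Passing a single column, e.g. `contrast = con[, 1]`, would instead test just that one pairwise comparison.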

For the second question: I'm guessing that you're looking for some non-additive effect of virus and treatment. In which case:

con <- makeContrasts((Virus.Treat - Virus.Control) - (Control.Treat - Control.Control),
                     levels=design)

... is what you want. This tests whether the virus/treatment interaction term is non-zero; if the responses were additive, the effect of the treatment with virus would be the same as the effect without the virus, i.e., the two parenthesized terms above would cancel out.
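To illustrate why this contrast is the interaction term, here is a small base-R check. It fits ordinary least squares to made-up log-expression values for a single gene (a stand-in, not the negative binomial GLM that edgeR fits) under both the group-means design and a factorial `~Virus * Treat + Fam` design, and compares the contrast from the former with the interaction coefficient from the latter.

```r
Virus <- factor(rep(c("Virus", "Control", "Virus", "Control"), times = 3),
                levels = c("Control", "Virus"))
Treat <- factor(rep(c("Treat", "Treat", "Control", "Control"), times = 3),
                levels = c("Control", "Treat"))
Fam <- factor(rep(1:3, each = 4))
TreatGroup <- factor(paste(Virus, Treat, sep = "."))

# Made-up log-expression values for one gene.
set.seed(42)
y <- rnorm(12)

# Group-means parametrization, as in the design above.
design.grp <- model.matrix(~0 + TreatGroup + Fam)
colnames(design.grp) <- sub("^TreatGroup", "", colnames(design.grp))
b <- lm.fit(design.grp, y)$coefficients
con.value <- (b["Virus.Treat"] - b["Virus.Control"]) -
             (b["Control.Treat"] - b["Control.Control"])

# Factorial parametrization with an explicit interaction term.
design.fac <- model.matrix(~Virus * Treat + Fam)
int.coef <- lm.fit(design.fac, y)$coefficients["VirusVirus:TreatTreat"]

# Same model, different parametrization: the two numbers agree.
all.equal(unname(con.value), unname(int.coef))
```

The two designs span the same column space, so either one can be used; the group-means version just makes the other pairwise comparisons easier to write down.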

For the third question: well, that's why I have the family blocking factor in the design matrix above.