Removing batch effect on edgeR when interactons are looked at
1
0
Entering edit mode
David R ▴ 90
@david-rengel-6321
Last seen 25 days ago
European Union

Hi,

I am running some RNASeq analysis with two experimental factors, namely time (two-level factor) and genotype (three-level factor). In cases like this, I like using edgeR cause, among other reasons, it makeContrasts feature is very handy. Thus, I would create a design factor combining all levels of time and genotype. This would allow me to easily look, for instance, for interactions between Genotype 2 and 1 over time by doing. Like this:

design.factor <- rep (c("Gtype1.Time1","Gtype1.Time2", "Gtype2.Time3","Gtype2.Time2", "Gtype3.Time1","Gtype3.Time2"), each=4)

design.1 <- model.matrix(~0+design.factor)

colnames(design.1)=sub("design.factor","",colnames(design.1))

my.contrasts=makeContrasts(

Inter =  (Gtype2.Time2  – Gtype2.Time1)  –  (Gtype1.Time2  - Gtype1.Time1),

Levels=design.1)

However, in this particular experiment there is a strong replicate effect. Biological replicates where carried out over time in four independent experiments. I would like to correct this effect, but this is where things get a little bit complicated when I run the code here below:

reps <- rep(c("R1","R2","R3","R4"),6)

design.factor <- rep (c("Gtype1.Time1","Gtype1.Time2", "Gtype2.Time3","Gtype2.Time2", "Gtype3.Time1","Gtype3.Time2"), each=4)

design.2 <- model.matrix(~0+reps+design.factor)

colnames(design.2)=sub("reps|design.factor","",colnames(design.2))

colnames(design.1)

[1] “R1” “R2” “R3” “R4” “

colnames(design.1)

[1] "R1”  "R2"  "R3"  "R4"  "Gtype1.Time2" "Gty

pe2.Time2" "Gtype2.Time3"  "Gtype3.Time1" "Gtype3.Time2"

As you cans see, my “Gtype1.Time1” level has disappeared due, I guess, to the way the degrees of freedom have been handled. Then my questions are:

1-Does all that mean that, for instance, the "Gtype1.Time2" coefficient will be in fact the difference "Gtype1.Time2" - "Gtype1.Time1".

2-Can I run interaction-type contrasts as described above? An dif this is the case, how?

3-Should I try tools such as SVASeq instead?

I thank you in advance for any help on this.

Best,

David

 

 

rnaseq edger statistics model matrix • 1.4k views
ADD COMMENT
2
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 12 minutes ago
The city by the bay

You can do everything you want to do by applying a bit of arithmetic to the terms in design.2. However, the simplest solution is just to set up your design matrix in a friendlier manner:

design.3 <- model.matrix(~0 + design.factor + reps)
colnames(design.3) <- sub("reps|design.factor","",colnames(design.3))

Printing colnames(design.3) gives me:

[1] "Gtype1.Time1" "Gtype1.Time2" "Gtype2.Time2" "Gtype2.Time3" "Gtype3.Time1"
[6] "Gtype3.Time2" "R2"           "R3"           "R4"

... where the first 6 coefficients represent the fitted expression for each group in R1. The last three coefficients represent the log-fold change of R2/3/4 over R1. You can then perform comparisons between groups as you did for design.1.

ADD COMMENT
0
Entering edit mode

Hi Aaron,

Thanks a lot for your helpful reply, I had not thought about this way out. Let me ask: do you think I am "losing" any power from the fact that I would use only R1-coefficients or this is not the case cause the contrasts are going to be made on the fitted values along al 4 replicates.

Thaks again,

David.

ADD REPLY
0
Entering edit mode

No, there is no loss of power, information is used from all four replicates. It's an additive model, so there is an implicit assumption that the differences between groups for R1 are the same for R2, etc.

ADD REPLY
0
Entering edit mode

Thanks a lot Aaron,

Best wishes,

David

ADD REPLY

Login before adding your answer.

Traffic: 424 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6