Question

design for edgeR

0

Entering edit mode

simarsidhu25 • 0

@simarsidhu25-12016

Last seen 8.1 years ago

I am doing edgeR glmLRT() test because I don't have any replicates. I have three groups and I made the design as:

design <- model.matrix(~0+group+group:plant+treatment:group,data=y$samples)

I want to compare both between and within subjects. I am following section 3.5 of this edgeRUsersGuide. But I am getting the same DE genes with opposite expression for groupR:treatmentM and groupS:treatmentM.

Please help me why it happens?

edger design matrix glmlrt • 2.3k views

ADD COMMENT • link updated 8.3 years ago by Aaron Lun ★ 28k • written 8.3 years ago by simarsidhu25 • 0

0

Entering edit mode

Please provide enough information in the post to reconstruct the design matrix; explain what the group, plant and treatment terms represent; and describe what the DE contrasts are intended to actually compare.

ADD REPLY • link 8.3 years ago Aaron Lun ★ 28k

0

Entering edit mode

Group is the two types of plant lines like: resistant and susceptible. Plant is like:

1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10

Same plant one is treated and another is not.

ADD REPLY • link 8.3 years ago simarsidhu25 • 0

0

Entering edit mode

Group is R,R,R,R,R,R,R,R,R,R,S,S,S,S,S,S,S,S,S,S

Treatment is V,M,V,M,V,M,V,M,V,M,V,M,V,M,V,M,V,M,V,M

I want to compare effect of treatment in groupR vs groupS

ADD REPLY • link 8.3 years ago simarsidhu25 • 0

score 1 · Answer 1 · 2016-12-28

1

Entering edit mode

Aaron Lun ★ 28k

@alun

Last seen 9 hours ago

The city by the bay

The model parametrization you describe in your original post is very strange, so I'm just going to ignore it. Rather, it seems that the only possible design for edgeR is to do something like this:

Group <- rep(c("R", "S"), each=10)
Plant <- factor(rep(1:10, each=2))
Treatment <- rep(c("V", "M"), 10)
design <- model.matrix(~0 + Plant + Group:Treatment)
colnames(design) <- sub(":", "_", colnames(design)) # get rid of colon
design <- design[,-grep("TreatmentV", colnames(design))] # get to full rank

The first 10 coefficients are plant-specific blocking factors and aren't of interest for inference. The last two coefficients represent the effect of treatment (M over V) in each group. You can drop them separately to test for the effect of treatment within each group, or you can compare them to each other to identify group-specific responses to treatment:

con <- makeContrasts(GroupR_TreatmentM - GroupS_TreatmentM, levels=design)

... and feed that to glmLRT. This will test for whether the effect of treatment in the resistant group is the same as the effect of treatment in the susceptible group, which seems to be closest to what you're trying to do. On the plus side, the design I've described here has plenty of residual degrees of freedom for dispersion estimation; you actually do have replicates here.

ADD COMMENT • link 8.3 years ago Aaron Lun ★ 28k

0

Entering edit mode

Thank you so much.

I have tried it. But still I am getting the same genes in both GroupR_TreatmentM - GroupS_TreatmentM and GroupR_TreatmentV - GroupS_TreatmentV and expression of these genes are opposite in these contrasts. That means the genes which shows upregulation in one shows downregulation in anoter.

I have got 52 DE genes only. Is this good results or not.

Thanks again.

ADD REPLY • link 8.3 years ago simarsidhu25 • 0

0

Entering edit mode

You must have made some mistake. It should not be possible to fit a design matrix with GroupR_TreatmentV and GroupS_TreatmentV coefficients along with GroupR_TreatmentM and GroupS_TreatmentM. You need to remove either the *_TreatmentV coefficients or the *_TreatmentM coefficients to get to full rank.

ADD REPLY • link 8.3 years ago Aaron Lun ★ 28k

0

Entering edit mode

I am doing separately and then comparing the genes of both the analysis.

ADD REPLY • link 8.3 years ago simarsidhu25 • 0

0

Entering edit mode

I have made the design matrix as well as contrast as suggested by you and find the DE genes in GroupR_TreatmentM - GroupS_TreatmentM:

> y<-estimateDisp(y,design,robust=TRUE)
> fit <- glmFit(y, design,robust=TRUE)
> summary(fit$df.prior)
> lrt <- glmLRT(fit, contrast=con) # To find genes responding to the treatment
> topTags(lrt)
> summary(de.HL<-decideTestsDGE(lrt,p.value=0.05))
   [,1]
-1    32
0 49814
1     20

Then I have did another analysis where I have made changes as:

design <- design[,-grep("TreatmentM", colnames(design))]

and find results:

> summary(de.HL<-decideTestsDGE(lrt,p.value=0.05))
   [,1] 
-1    20
0  49814
1     32

ADD REPLY • link 8.3 years ago simarsidhu25 • 0

0

Entering edit mode

This is an obvious consequence of how the coefficients are interpreted. If you remove *_TreatmentV, the remaining *_TreatmentM coefficients represent the log-fold changes of M over V in each group. Conversely, if you remove *_TreatmentM, the *_TreatmentV coefficients represent the log-fold changes of V over M. In both cases, the last two coefficients represent the treatment effect within each group, only differing in the sign of the log-fold change. It is a mathematical inevitability that you'll get the same DE genes with reversed signs.

ADD REPLY • link 8.3 years ago Aaron Lun ★ 28k

0

Entering edit mode

Thank you so much. It really helps a lot.

ADD REPLY • link 8.3 years ago simarsidhu25 • 0

0

Entering edit mode

If I want to compare effect of treatment on only R group then I have used

con <- makeContrasts(GroupR_TreatmentM, levels=design)

am I right?

Thanks.

ADD REPLY • link 8.3 years ago simarsidhu25 • 0

0

Entering edit mode

Yes. Or you can just set coef="GroupR_TreatmentM" in glmLRT directly.

ADD REPLY • link 8.3 years ago Aaron Lun ★ 28k

0

Entering edit mode

I have another question. As I am using different contrasts and getting following results:

con <- makeContrasts(GroupR_TreatmentM, levels=design)

-1  5809
0  37894
1   6163

con <- makeContrasts(GroupS_TreatmentM, levels=design)

-1  5527
0  38098
1   6241

Identical DE genes in R and S groups after treatment of M: 7262

DE genes only in R group after treatment of M: 4710

DE genes only in S group after treatment of M: 4506

Is this right way to check effect of treatment?

ADD REPLY • link 8.1 years ago simarsidhu25 • 0

0

Entering edit mode

Yes, dropping each coefficient separately will test the treatment effect in each group. However, for your "DE genes upon treatment only in group R/S" statements, it is worth checking that all DE genes in the group-specific sets also exhibit differential treatment effects between groups. For example, consider a gene with the same (weak) treatment effect in both groups. In one group, the effect is just significant, while in another group, the effect is just below the significance threshold. If I identified group-specific genes with a set difference operation, this gene would be detected as being specifically DE in the first group, which is not true. In such cases, checking for significant differences in the treatment effects (as was done in my original post) provides extra protection.

ADD REPLY • link 8.1 years ago Aaron Lun ★ 28k

0

Entering edit mode

Thanks.

When I am doing this contrast

con <- makeContrasts(GroupR_TreatmentV - GroupS_TreatmentV, levels=design)

-1    20
0  49814
1     32

then I am getting only these genes.

ADD REPLY • link 8.1 years ago simarsidhu25 • 0

0

Entering edit mode

That's exactly why you should use this contrast. These results are telling you that you don't have a lot of group-specific treatment effects. It is not safe to rely on the absence of detection to infer specificity via set difference operations on DE lists - just because you don't observe a treatment effect for a gene doesn't mean that an effect is not truly present.