ANOVA-like approach in edgeR
Talip ▴ 10
@talip-zengin-14290
Türkiye

Hi,
I am trying to determine differentially expressed genes between different statuses using edgeR with the ANOVA-like approach. Our experiment is very similar to the mouse mammary gland experiment described in the edgeR user's guide. I can compare the three statuses for either the L cell type or the B cell type separately, as shown below:

> targets
CellType    Status
B   virgin
B   virgin
B   pregnant
B   pregnant
B   lactate
B   lactate
L   virgin
L   virgin
L   pregnant
L   pregnant
L   lactate
L   lactate

> group <- factor(paste0(targets$CellType, ".", targets$Status))
> design <- model.matrix(~ 0 + group)
> colnames(design) <- levels(group)
> fit <- glmQLFit(y, design, robust=TRUE)
> contrast <- makeContrasts(L.PvsL = L.pregnant - L.lactate, 
                          L.VvsL = L.virgin - L.lactate, 
                          L.VvsP = L.virgin - L.pregnant, levels=design)
> anova <- glmQLFTest(fit, contrast=contrast)
> topTags(anova, n=Inf, adjust.method = 'BH', sort.by = 'PValue')

However, I also want to find genes that are differentially expressed between the three statuses regardless of cell type. To do this, I would like to compare the 4 virgin, 4 pregnant, and 4 lactate samples together. How can I achieve this using the design above?

I have tried the following design, but it sets one of the statuses as a reference, and none of the statuses should serve as a reference or control:

> design <- model.matrix(~cell_type + status)
> fit <- glmQLFit(y, design)
> anova <- glmQLFTest(fit, coef=3:4)
> topTags(anova, n=Inf, adjust.method = 'BH', sort.by='PValue')
Yunshun Chen ▴ 880
@yunshun-chen-5451
Australia

You could try the following:

> design <- model.matrix(~ 0 + group)
> colnames(design) <- levels(group)  # so the contrast terms match the column names
> fit <- glmQLFit(y, design, robust=TRUE)
> contrast <- makeContrasts(PvsL = 0.5*(L.pregnant + B.pregnant) - 0.5*(L.lactate + B.lactate), 
                            VvsL = 0.5*(L.virgin + B.virgin) - 0.5*(L.lactate + B.lactate), levels=design)
> anova <- glmQLFTest(fit, contrast=contrast)
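
Two contrasts are enough for a three-level comparison, because the remaining pairwise comparison (virgin vs pregnant) is just the difference of the other two. Here is a minimal base-R sketch of that, assuming the six group names from the question as the design columns. The helper `avg` and the object `contrast3` are made up for illustration and are not part of edgeR or limma:

```r
# Group names assumed to match the design columns in the question
lev <- c("B.lactate", "B.pregnant", "B.virgin",
         "L.lactate", "L.pregnant", "L.virgin")

# Hypothetical helper: weights that average one status over both cell types
avg <- function(status) {
  v <- setNames(numeric(length(lev)), lev)
  v[paste0(c("B", "L"), ".", status)] <- 0.5
  v
}

PvsL <- avg("pregnant") - avg("lactate")
VvsL <- avg("virgin")   - avg("lactate")
VvsP <- avg("virgin")   - avg("pregnant")

# All three pairwise contrasts side by side
contrast3 <- cbind(PvsL, VvsL, VvsP)

# The third column is a linear combination of the first two
isTRUE(all.equal(VvsP, VvsL - PvsL))

# Number of linearly independent contrasts
qr(contrast3)$rank
```

So the two-column contrast matrix above spans the same hypothesis as all three pairwise comparisons taken together.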

Simply remove the cell type from the model.

design <- model.matrix(~targets$Status)

If you do it this way, the dispersion estimates will be much higher than they should be, because the cell type difference is not accounted for in the design.
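
A toy illustration of that point, as a base-R sketch: it uses an ordinary linear model on simulated log-scale values rather than edgeR's dispersion estimation, and the cell-type effect of 5 is an arbitrary assumption. Leaving out a factor that really affects expression pushes its effect into the residual variation:

```r
set.seed(42)

# Same layout as the question: 2 cell types x 3 statuses x 2 replicates
cell_type <- factor(rep(c("B", "L"), each = 6))
status    <- factor(rep(rep(c("virgin", "pregnant", "lactate"), each = 2),
                        times = 2))

# One simulated gene: large cell-type effect, no status effect (assumed values)
expr <- 5 * (cell_type == "L") + rnorm(12)

# Residual standard deviation with and without cell type in the model
sd_with    <- sigma(lm(expr ~ cell_type + status))
sd_without <- sigma(lm(expr ~ status))

c(with_cell_type = sd_with, without_cell_type = sd_without)
```

The residual standard deviation here plays the role that the dispersion plays in the negative binomial model: it is inflated when cell type is dropped, which makes the status test less sensitive.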

@gordon-smyth
WEHI, Melbourne, Australia

You need to clarify whether the two factors in your study interact. If the two factors do not interact, then the code you show is correct:

> design <- model.matrix(~cell_type + status)
> fit <- glmQLFit(y, design)
> anova <- glmQLFTest(fit, coef=3:4)

The fact that one of the statuses is parametrized as the reference is irrelevant, because the anova calculations are invariant to how the factors are parametrized. You will get the same anova tests regardless of which, if any, of the levels is set as the reference.
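
The invariance can be checked directly. The sketch below is a base-R analogue using `lm()` and an ordinary F-test on one simulated gene, not edgeR's quasi-likelihood F-test, but the same column-space argument applies. The data and the names `status_relevel`, `f_default` and `f_relevel` are made up for illustration:

```r
set.seed(1)

# Same layout as the question: 2 cell types x 3 statuses x 2 replicates
cell_type <- factor(rep(c("B", "L"), each = 6))
status    <- factor(rep(rep(c("virgin", "pregnant", "lactate"), each = 2),
                        times = 2))

# Same factor, but with a different reference level
status_relevel <- relevel(status, ref = "virgin")

# Simulated log-expression for a single gene (illustrative values only)
expr <- rnorm(12)

# Null model: cell type only. Full models: the two parametrizations of status.
null_fit     <- lm(expr ~ cell_type)
full_default <- lm(expr ~ cell_type + status)
full_relevel <- lm(expr ~ cell_type + status_relevel)

# F-statistic for dropping both status coefficients at once
f_default <- anova(null_fit, full_default)$F[2]
f_relevel <- anova(null_fit, full_relevel)$F[2]

# The individual coefficients differ, but the 2-df F-test does not
isTRUE(all.equal(f_default, f_relevel))
```

Only the meaning of the individual coefficients changes with the reference level. The joint test of all status coefficients compares the same pair of nested models either way.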

If interactions are present, however, then you need to use code like Yunshun has given. That does not give you "differential expression between the three statuses regardless of cell type" because you cannot ignore cell type in the presence of interactions. It is rather a test for "differential expression between the three statuses for either cell type".
