Experiment design contrasts any time edgeR
2
1
Entering edit mode
Pietro ▴ 40
@pietro-13029
Last seen 6 months ago
Italy

Dear all,

I am working with a RNASeq data from an Arabidopsis experiment. The plants have been subjected to nitric oxide treatment for 3 hours and for 8 hours. I would like to find differentialy modulated genes not only in the classic between 2 comparison but also at any time in one test, following the section "3.3.3 Treatment effects over all times" of edgeR user guide.

The design of the experiment is 3 biological replicates for each of the 3 samples:

Sample Condition Time
untreated_1 ctrl 0
untreated_2 ctrl 0
untreated_3 ctrl 0
NO3_1 NO 3
NO3_2 NO 3
NO3_3 NO 3
NO8_1 NO 8
NO8_2 NO 8
NO8_3 NO 8

I managed to compare samples at 3 hours with control (3h vs ctrl) and samples at 8 the hour treatment against control (8h vs ctrl); now I'd like to make a whole comparison to find genes responding to the treatment any time in a single test.

I am using edgeR.

This is the code I am using:

targets <- is the table above
design <- model.matrix(~Time, data=targets)
fit <- glmFit(y, design)
lrt <- glmLRT(fit, coef=2)
topTags(lrt)

Coefficient:  Time 
              logFC   logCPM        LR        PValue           FDR
AT2G33270 1.3171504 4.287261 1136.0571 4.810643e-249 1.616472e-244
AT3G03270 0.9024100 8.335613 1116.2615 9.650825e-245 1.621435e-240
AT5G58840 1.1536560 4.831535 1108.5933 4.479132e-243 5.016927e-239
AT1G01460 1.8124852 3.853430 1079.3572 1.012809e-236 8.508099e-233
AT1G13480 1.3325911 5.177147 1062.6741 4.281367e-233 2.877250e-229
AT2G35730 1.1750776 5.391916  952.5724 3.660705e-209 2.050117e-205
AT4G02650 1.6227408 4.214342  913.3378 1.236992e-200 5.937916e-197
AT4G26470 0.9443015 4.350554  911.2471 3.522495e-200 1.479536e-196
AT2G02340 1.7263559 3.112108  890.8545 9.548953e-196 3.565155e-192
AT1G70440 1.6371184 5.024619  882.3160 6.857418e-194 2.304230e-190

Can you please tell me if this is correct (and in your opinion if it makes sense)?

Thanks a lot

Pietro

DGE rnaseq RNA edger • 1.5k views
ADD COMMENT
3
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 1 hour ago
The city by the bay

Your approach is okay, but assumes a linear response with respect to time (if Time is a real-valued covariate, which I presume it is by default). You have plenty of replicates, so I would rather use an ANODEV:

design <- model.matrix(~factor(Time), data=targets)
# ... other edgeR stuff...
lrt <- glmLRT(fit, coef=2:3)

The second coefficient of design will represent the log-fold change at 3 hours vs zero hours, while the third coefficient represents the log-fold change at 8 hours vs zero hours. When you drop both of them simultaneously in glmLRT, your null hypothesis is that both log-fold changes are equal to zero. A gene with any response at either time point will be more likely to reject the null hypothesis.

ADD COMMENT
0
Entering edit mode

Thanks a lot, you were very clear!

Can you briefly explain to humans like me what's the difference between

design <- model.matrix(~Time, data=targets)
# ​and
design <- model.matrix(~factor(Time), data=targets)

?

What would be the equivalent commands to do this in DESeq2?

Thanks again

ADD REPLY
1
Entering edit mode

If Time is numeric, ~Time will assume that log-expression changes linearly with time, i.e., you're fitting a line of best fit to log-expression against time. With ~factor(Time), you're treating each time point as a separate group, so the log-expression isn't required to follow a linear relationship with respect to time.

DESeq2 also uses the formula notation so I presume it will work the same, but you'd have to ask the maintainers.

ADD REPLY

Login before adding your answer.

Traffic: 756 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6