Question

Differential Expression Analysis in edgeR using Anova

ilovesuperheroes1993 • 0

Hi, I have 5 samples, namely the following:

(1) Not transfected, untreated
(2) Transfected but untreated
(3) Transfected, treated, analyzed after 5 mins
(4) Transfected, treated, analyzed after 60 mins
(5) Transfected, treated, analyzed after 4 hrs

[By transfected I mean a particular vector is present, and treated means treated with an antibody]

I do not have replicates for any of the conditions. I would like to perform an ANOVA test using edgeR to see how gene expression at the different time points changes with respect to sample 2 (as numbered above).

Could anyone tell me the edgeR code I should run to do the test? Normally, I define the groups, normalize the library sizes, build a design matrix from the groups, estimate the dispersion and then perform the GLM quasi-likelihood tests. I am confused as to how to proceed in this case, as I have no replicates. I don't know how to define the groups or estimate the dispersion.

I would be very grateful if someone could help me with the edgeR code. Thank you.

edger anova gene expression Dispersion • 2.2k views

updated 5.8 years ago by onlychainsaw2019 • written 6.3 years ago by ilovesuperheroes1993

score 2 · Answer 1 · 2019-01-15

I suppose you've realized how much of a pain it is to not have replicates, so I won't harp on that. Suffice to say that you should have some strong words with whoever designed the experiment.

The obvious answer to your question is, as @timedreamer suggested, to read the relevant section of the edgeR user's guide. In this case, the most promising approach may be to manufacture some residual degrees of freedom by assuming that the expression is a smooth function of time (i.e., that can be modelled with a spline with few degrees of freedom). Specifically:

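time <- c(0, 0, 5, 60, 240) # minutes; treat sample 2 (and sample 1) as time '0'
transfected <- c("N", "Y", "Y", "Y", "Y") # only sample 1 lacks the vector
spl <- splines::ns(time, df=2) # natural spline in time with 2 degrees of freedom
design <- model.matrix(~transfected + spl) # intercept + transfection + 2 spline terms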

Lo and behold, this gives us a design matrix with 5 rows and 4 columns, i.e., one residual degree of freedom to estimate the dispersion. You can then proceed with a quasi-likelihood edgeR analysis to either identify the transfection effect (coef=2) or the time effect (coef=3:4). Note that the latter refers to any effect of time; it is not possible to compare specific time points with the above model.
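To make the quasi-likelihood steps concrete, here is a minimal sketch of how the analysis might look end to end. The count matrix is simulated purely as a stand-in (the object names `counts`, `res_transfection` and `res_time` are placeholders, not anything from the original data), and with only one residual degree of freedom the dispersion estimates will be very noisy, so treat any output as illustrative rather than definitive.

```r
library(edgeR)

set.seed(1)
# Simulated stand-in for the real data: 1000 genes x 5 samples (an assumption).
ngenes <- 1000
counts <- matrix(rnbinom(ngenes * 5, mu = 100, size = 10), nrow = ngenes,
                 dimnames = list(paste0("gene", seq_len(ngenes)),
                                 paste0("S", 1:5)))

# Design from the answer: transfection effect plus a 2-df spline in time.
time <- c(0, 0, 5, 60, 240)
transfected <- factor(c("N", "Y", "Y", "Y", "Y"))
spl <- splines::ns(time, df = 2)
design <- model.matrix(~ transfected + spl)

# 5 samples minus 4 coefficients leaves 1 residual degree of freedom.
resid_df <- nrow(design) - qr(design)$rank

# Usual edgeR steps: filter, normalize, estimate dispersion, fit QL model.
y <- DGEList(counts = counts)
keep <- filterByExpr(y, design)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# coef = 2 is the transfection effect; coef = 3:4 tests any effect of time.
res_transfection <- glmQLFTest(fit, coef = 2)
res_time <- glmQLFTest(fit, coef = 3:4)
topTags(res_time)
```

Testing coef=3:4 together is what makes this an ANOVA-like test: it asks whether either spline term is non-zero, not which time point differs.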

If you want to compare specific time points, then you have no choice but to follow Option 3 in Section 2.11 to the letter, i.e., take the dispersion estimates from the above model and plug them into glmFit (followed by glmLRT) with a design matrix where each sample is its own group. This is not as good because glmLRT, unlike the quasi-likelihood F-test, does not account for the uncertainty in the dispersion estimates and so does not control the type I error rate correctly.
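A rough sketch of that fallback, again on simulated counts standing in for the real data: the dispersions are estimated under the spline design, then reused in a one-group-per-sample fit so that each treated time point can be contrasted with sample 2. The sample labels `S1` to `S5` and the contrast names are made up for illustration, and the caveat about glmLRT applies to every p-value this produces.

```r
library(edgeR)

set.seed(1)
# Simulated stand-in for the real data: 1000 genes x 5 samples (an assumption).
ngenes <- 1000
counts <- matrix(rnbinom(ngenes * 5, mu = 100, size = 10), nrow = ngenes,
                 dimnames = list(paste0("gene", seq_len(ngenes)),
                                 paste0("S", 1:5)))

# Step 1: estimate dispersions under the reduced (spline) model,
# which has one residual degree of freedom.
time <- c(0, 0, 5, 60, 240)
transfected <- factor(c("N", "Y", "Y", "Y", "Y"))
design_spline <- model.matrix(~ transfected + splines::ns(time, df = 2))

y <- DGEList(counts = counts)
y <- calcNormFactors(y)
y <- estimateDisp(y, design_spline)

# Step 2: fit a model where each sample is its own group.
# glmFit picks up the dispersions already stored in y.
sample_id <- factor(paste0("S", 1:5))
design_full <- model.matrix(~ 0 + sample_id)
colnames(design_full) <- levels(sample_id)
fit <- glmFit(y, design_full)

# Step 3: compare each treated time point with sample 2 (S2).
con <- makeContrasts(t5 = S3 - S2, t60 = S4 - S2, t240 = S5 - S2,
                     levels = design_full)
lrt_t5 <- glmLRT(fit, contrast = con[, "t5"])
lrt_t60 <- glmLRT(fit, contrast = con[, "t60"])
lrt_t240 <- glmLRT(fit, contrast = con[, "t240"])
topTags(lrt_t5)
```

Passing the whole `con` matrix to glmLRT in a single call would instead test all three comparisons jointly, which is closer in spirit to an ANOVA across time points.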

score 1 · Answer 2 · 2019-01-15

timedreamer ▴ 10

New York University

Hi, I think you can find the answer in Section 2.11 of the edgeR user's guide, "What to do if you have no replicates". Good luck.

written 6.3 years ago by timedreamer