Hi,
I have an RNA-seq experiment similar to the one in Section 3.5 of the edgeR User's Guide, i.e. a nested paired design, and I have used that approach to analyze my own data. Briefly, the experiment involves 60 RNA samples from two groups of bacterial strains, commensal (C) and disease-causing (D). Each group consists of 15 different strains, and each strain was both treated with a chemical (IND) and left untreated (CTR). My questions are about estimating dispersion in this type of scenario (which is skipped in the User's Guide):
- Can I correctly estimate common/trended and tagwise dispersion using estimateGLMCommonDisp/estimateGLMTrendedDisp and estimateGLMTagwiseDisp (relative to a design matrix), even though there are no true biological replicates?
- How do I calculate the prior degrees of freedom in this case?
Any help will be greatly appreciated.
See Section 2.10 of the edgeR User's Guide, "What to do if you have no replicates".
So which option was used in the Section 3.5 example?
The Section 3.5 example has replicates. There are 18 samples and the design matrix has only 12 columns, so 18 - 12 = 6 residual degrees of freedom (df) are left for estimating the dispersion. Hence all the edgeR glm dispersion estimation methods work.
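To make the counting concrete, here is a minimal sketch in plain Python (not edgeR code) that builds a nested paired design matrix and computes the residual df as the number of samples minus the rank of the design. The function names are hypothetical, and the parameterization (one blocking column per unit plus one treatment column per group) is an assumption; it should span the same column space as a Section 3.5 style design, so the rank should be the same. The second call assumes your 60 samples are analyzed with the same kind of design.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot row with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def nested_paired_design(n_groups, units_per_group):
    """One row per sample: a blocking column per unit (patient or strain),
    plus one treatment-effect column per group (assumed parameterization)."""
    n_units = n_groups * units_per_group
    rows = []
    for g in range(n_groups):
        for u in range(units_per_group):
            unit = g * units_per_group + u
            for treated in (0, 1):
                row = [0] * (n_units + n_groups)
                row[unit] = 1                 # blocking term for this unit
                row[n_units + g] = treated    # treatment effect within group g
                rows.append(row)
    return rows


def residual_df(design):
    """Residual df = number of samples minus rank of the design matrix."""
    return len(design) - matrix_rank(design)


guide = nested_paired_design(n_groups=3, units_per_group=3)
asker = nested_paired_design(n_groups=2, units_per_group=15)
print(len(guide), matrix_rank(guide), residual_df(guide))  # 18 12 6
print(len(asker), matrix_rank(asker), residual_df(asker))  # 60 32 28
```

Under these assumptions your experiment would have 60 - 32 = 28 residual df, so it would not fall under the no-replicates case.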
PS. Please be careful to post follow-up questions as comments rather than answers. I have moved our interchange so far to be comments on your original question.