Hello Bioconductor community,
I am trying to find differentially expressed genes between placebo (P) and drug (M), as well as between time points within each of the placebo and drug groups. My experimental design is multi-factorial: some subjects receive placebo P and others receive drug M. I have gene expression data from samples collected for both drug and placebo at three time points (1, 2, 3). Following edgeR manual section 3.3.1, I have generated the following targets data frame.
> targets
   subjects treatment time
1         1         P    1
2         1         P    2
3         1         P    3
4         2         M    1
5         2         M    2
6         2         M    3
7         3         M    1
8         3         M    2
9         7         M    1
10        7         M    2
11        7         M    3
12       10         M    1
13       10         M    2
14       10         M    3
15       11         P    1
16       11         P    2
17       11         P    3
18       12         P    1
19       12         P    2
20       12         P    3
To identify DEGs between the groups of interest, I have made contrasts that I want to pass to glmQLFTest. However, I also want to carry out the paired analysis described in edgeR manual section 3.4.2 in order to adjust for baseline differences between the subjects. Hence, following a suggestion from this post (differential expressed genes, paired samples and multiple factors), I have built a design matrix using
design <- model.matrix(~0+ treatment:time + subjects)
This design matrix contains columns for the subjects and for treatmentM:time1, treatmentP:time1, treatmentM:time2, treatmentP:time2, treatmentM:time3 and treatmentP:time3. However, I get the following error when I estimate the gene dispersions using
y <- estimateDisp(gene_filt, design)
Error in glmFit.default(sely, design, offset = seloffset, dispersion = 0.05, :
  Design matrix not of full rank. The following coefficients not estimable:
 treatmentM:time3 treatmentP:time3
I would really appreciate suggestions on how to compare my treatment:time conditions while adjusting for differences between the subjects.
Thank you
See Section 3.5 of the edgeR User's Guide. In your experiment, treatment is a "between subject" factor while time is a "within subject" factor. Paired analyses are only for within subject factors.
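As a rough illustration of the Section 3.5 approach applied to the targets table in the question, the sketch below re-numbers subjects within each treatment group and nests them inside treatment. The variable name `nested`, the choice of P as the reference level, and the dropping of all-zero columns (needed here because the groups have 3 and 4 subjects) are assumptions of this sketch, not something stated in the thread. The edgeR calls are left as comments because `gene_filt` is not available here.

```r
# Rebuild the targets table from the question (20 samples, 7 subjects).
targets <- data.frame(
  subjects  = c(1,1,1, 2,2,2, 3,3, 7,7,7, 10,10,10, 11,11,11, 12,12,12),
  treatment = c(rep("P", 3), rep("M", 11), rep("P", 6)),
  time      = c(1,2,3, 1,2,3, 1,2, 1,2,3, 1,2,3, 1,2,3, 1,2,3)
)

# Assumption: placebo is taken as the reference level.
treatment <- factor(targets$treatment, levels = c("P", "M"))
time      <- factor(targets$time)

# Re-number subjects within each treatment group (1..3 for P, 1..4 for M).
# `nested` is a hypothetical name introduced for this sketch.
nested <- factor(ave(targets$subjects, targets$treatment,
                     FUN = function(s) as.integer(factor(s))))

# Subject effects nested within treatment, plus treatment-specific time effects.
design <- model.matrix(~ treatment + treatment:nested + treatment:time)

# The groups are unbalanced, so treatmentP:nested4 is all zero; drop such columns.
design <- design[, colSums(design != 0) > 0]

# Check that the design is now of full column rank.
stopifnot(qr(design)$rank == ncol(design))
print(colnames(design))

# Example contrast: is the time 2 vs time 1 change different in M than in P?
con <- setNames(numeric(ncol(design)), colnames(design))
con["treatmentM:time2"] <-  1
con["treatmentP:time2"] <- -1

# With the real data (not run here):
# library(edgeR)
# y   <- estimateDisp(gene_filt, design)
# fit <- glmQLFit(y, design)
# qlf_M_t2   <- glmQLFTest(fit, coef = "treatmentM:time2")  # time 2 vs 1 within M
# qlf_M_vs_P <- glmQLFTest(fit, contrast = con)             # M-specific change vs P
```

In this parameterisation the treatmentX:timeN coefficients are within-subject changes relative to time 1, so they are adjusted for baseline subject differences. The treatmentM coefficient only compares the first subject of each group and is not a general drug versus placebo test, which is consistent with the point that pairing applies only to the within-subject factor.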