Hi,
I am working with a multivariable RNA-seq dataset with 18 samples in total (experimental design below). I am interested in comparing different groups under the Treatment_Category variable and adding Gender and Age as covariates. However, even before adding these covariates, I encounter an error message when estimating dispersions with estimateDisp (see below).
Thank you,
Toufiq
Sample_metadata:
dput(Sample_metadata)
#> Samples Donor Treatment Age_years Gender Category Treatment_Category
#> 1 D1_mock D1 mock 65 F O mock_O
#> 2 D2_mock D2 mock 62 F O mock_O
#> 3 D3_mock D3 mock 20 F Y mock_Y
#> 4 D4_mock D4 mock 20 F Y mock_Y
#> 5 D5_mock D5 mock 24 M Y mock_Y
#> 6 D6_mock D6 mock 21 M Y mock_Y
#> 7 D1_Low D1 Low 65 F O Low_O
#> 8 D2_Low D2 Low 62 F O Low_O
#> 9 D3_Low D3 Low 20 F Y Low_Y
#> 10 D4_Low D4 Low 20 F Y Low_Y
#> 11 D5_Low D5 Low 24 M Y Low_Y
#> 12 D6_Low D6 Low 21 M Y Low_Y
#> 13 D1_Hi D1 Hi 65 F O Hi_O
#> 14 D2_Hi D2 Hi 62 F O Hi_O
#> 15 D3_Hi D3 Hi 20 F Y Hi_Y
#> 16 D4_Hi D4 Hi 20 F Y Hi_Y
#> 17 D5_Hi D5 Hi 24 M Y Hi_Y
#> 18 D6_Hi D6 Hi 21 M Y Hi_Y
y.Treatment <- calcNormFactors(y.Treatment, method = "TMM")
Design
Patient_ID_v1 <- factor(Sample_metadata$Donor)
Treatment.Category <- factor(Sample_metadata$Treatment_Category, levels=c("mock_Y", "mock_O", "Low_Y", "Hi_Y", "Low_O", "Hi_O"))
design.Treatment.Category <- model.matrix(~Patient_ID_v1+Treatment.Category)
colnames(design.Treatment.Category)
rownames(design.Treatment.Category)
design.Treatment.Category
Dispersion estimation
y.Treatment.v1 <- estimateDisp(y.Treatment, design.Treatment.Category, robust = TRUE)
Error in glmFit.default(sely, design, offset = seloffset, dispersion = 0.05, :
Design matrix not of full rank. The following coefficients not estimable:
Treatment.CategoryHi_O
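The error can be reproduced from the sample table alone, without any count data. The sketch below uses base R only and rebuilds the design from the metadata shown above; the variable names (donor, trt_cat, design_reduced) are made up for illustration. It shows that the "old" indicator can be written both as a sum of Treatment.Category columns and as a combination of donor columns, which is why one coefficient cannot be estimated. Dropping the mock_O column is one possible workaround and is an assumption here, not necessarily the design suggested elsewhere in this thread.

```r
# Rebuild the design from the sample table (base R only, no counts needed).
donor     <- factor(rep(paste0("D", 1:6), times = 3))
treatment <- rep(c("mock", "Low", "Hi"), each = 6)
category  <- rep(c("O", "O", "Y", "Y", "Y", "Y"), times = 3)
trt_cat   <- factor(paste(treatment, category, sep = "_"),
                    levels = c("mock_Y", "mock_O", "Low_Y", "Hi_Y", "Low_O", "Hi_O"))

design    <- model.matrix(~ donor + trt_cat)
n_cols    <- ncol(design)        # number of coefficients requested
rank_full <- qr(design)$rank     # number of coefficients that can be estimated

# The "old" indicator expressed two ways: from the group columns and
# from the donor columns (D1 is the reference level, absorbed by the intercept).
old_from_groups <- rowSums(design[, c("trt_catmock_O", "trt_catLow_O", "trt_catHi_O")])
old_from_donors <- design[, "(Intercept)"] -
  rowSums(design[, paste0("donorD", 3:6)])
same_indicator  <- all(old_from_groups == old_from_donors)

# Assumed workaround: drop the mock_O column, whose effect is already
# absorbed by the donor terms. The remaining treatment columns are then
# Low vs mock and Hi vs mock within each age category.
design_reduced <- design[, colnames(design) != "trt_catmock_O"]
rank_reduced   <- qr(design_reduced)$rank

cat("columns:", n_cols, " rank:", rank_full, "\n")
cat("reduced columns:", ncol(design_reduced), " rank:", rank_reduced, "\n")
```

With donor in the model, a baseline comparison such as mock_O vs mock_Y is a between-donor comparison and stays confounded with the donor effects, so the reduced design only supports within-donor treatment comparisons.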
James W. MacDonald, thank you very much for the suggestions. I have used the design you suggested earlier for comparing the Treatment column. In addition to this, we are also interested in pairwise comparisons between groups under the Treatment_Category column. Is there a way to modify the same design formula and run it, or to create another design like the one below:

If you just want to compare e.g., mock_O vs mock_Y, you should just fit
And compute the contrasts of interest. That's not a great model though, as you have only females in the old group and both males and females in the young group, so Gender and Treatment_Category are somewhat correlated.
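As a rough illustration of both points, the sketch below builds one common form of a group-means design (a formula of the form ~ 0 + group, which is an assumption here since the model line is not preserved in this thread), writes a mock_O vs mock_Y contrast by hand, and cross-tabulates Gender against Category. It uses base R only; the names group, design_group and contrast_mock are hypothetical.

```r
# Sample annotation rebuilt from the table in the question.
treatment <- rep(c("mock", "Low", "Hi"), each = 6)
category  <- rep(c("O", "O", "Y", "Y", "Y", "Y"), times = 3)
gender    <- rep(c("F", "F", "F", "F", "M", "M"), times = 3)
group     <- factor(paste(treatment, category, sep = "_"),
                    levels = c("mock_Y", "mock_O", "Low_Y", "Hi_Y", "Low_O", "Hi_O"))

# Assumed group-means design: one column per Treatment_Category level, no donor term.
design_group <- model.matrix(~ 0 + group)
colnames(design_group) <- levels(group)
rank_group <- qr(design_group)$rank

# Contrast vector for mock_O minus mock_Y, written by hand.
contrast_mock <- setNames(numeric(ncol(design_group)), colnames(design_group))
contrast_mock[c("mock_O", "mock_Y")] <- c(1, -1)

# Gender against Category: the old group contains no male samples.
gender_by_category <- table(Gender = gender, Category = category)

print(contrast_mock)
print(gender_by_category)
```

In an edgeR workflow a vector like contrast_mock would typically be passed as the contrast argument of glmQLFTest after fitting with glmQLFit. The table shows why gender cannot be cleanly separated from the old vs young comparison.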
You don't want to include Age, as it's already part of the Treatment_Category. And you don't want/need to include the Donor because you are not making any comparisons that use each donor more than once. FYI, your design.mod.2 is likely failing because you have Age coded as a factor. It won't fail if Age is continuous, but you are including age in that model twice (continuous and as a factor as part of Treatment_Category), which I wouldn't do.
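The factor versus continuous point can be checked on the sample table. The sketch below is only an illustration: design.mod.2 itself is not shown in this thread, so the formula used here (group plus age) is a hypothetical stand-in, and the object names are made up. It compares the rank of the design with age as a factor and with age as a numeric covariate, using base R only.

```r
# Sample annotation rebuilt from the table in the question.
treatment <- rep(c("mock", "Low", "Hi"), each = 6)
category  <- rep(c("O", "O", "Y", "Y", "Y", "Y"), times = 3)
age_years <- rep(c(65, 62, 20, 20, 24, 21), times = 3)
group     <- factor(paste(treatment, category, sep = "_"))

# Hypothetical stand-in for design.mod.2, with age coded as a factor.
# The age levels 62 and 65 together reproduce the "old" indicator that the
# group columns already encode, so the columns are linearly dependent.
design_age_factor <- model.matrix(~ group + factor(age_years))
rank_age_factor   <- qr(design_age_factor)$rank

# Same formula with age as a continuous covariate: full rank, although age
# is then represented twice (numerically and through the O/Y part of group).
design_age_numeric <- model.matrix(~ group + age_years)
rank_age_numeric   <- qr(design_age_numeric)$rank

cat("age as factor:  columns", ncol(design_age_factor),
    " rank", rank_age_factor, "\n")
cat("age as numeric: columns", ncol(design_age_numeric),
    " rank", rank_age_numeric, "\n")
```

Comparing qr(design)$rank with ncol(design) before calling estimateDisp is a quick way to catch this kind of problem for any candidate design.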