Removing Effect of Covariate in Only a Subset of Samples in a Model Containing All Samples Loses Full Rank if Factor but not Numerical
1
1
Entering edit mode
@33bd02b2
Last seen 3 months ago
United Kingdom

I previously made a post asking about how to model my data using limma (limma: Genes Associated with Continuous Predictor in Subsets of Samples). In brief, I have samples from ~130 people, approximately half male and half female. For most people, there is a sample from tissue 1 and a sample for tissue 2. One primary focus of our experiment is to find genes associated with fat distribution. Because males and female differ in regards to this fatness trait, and so too do the two tissues, I thought it appropriate to model gene expression in each of the four tissues separately (male-tissue 1, female-tissue 1, male-tissue 2, female-tissue 2). I have also corrected for some covariates: age, sequencing batch, overall fatness. I also accounted for sample pairings using duplicateCorrelation. The code is shown below.


design.AGR.fmi <- model.matrix(~0 + 
                             sex.depot + # A factor combining sex.tissue
                             sex.depot:Android.Gynoid.Ratio + 
                             machine + 
                             age.at.Biopsy + 
                             Fatmass.Index, 
                           data = targets.multiomics) 

corfit.agr.fmi <- duplicateCorrelation(log2.cpm.filtered.norm, design.AGR.fmi, block = targets.multiomics$Individual.ID)

counts_voom.agr.fmi <- voomWithQualityWeights(myDGEList.filtered.norm, design.AGR.fmi)

corfit.agr.fmi <- duplicateCorrelation(counts_voom.agr.fmi, 
                               design.AGR.fmi, 
                               block=targets.multiomics$OBB.ID,
                               corfit.agr.fmi$consensus)

counts_voom.agr.fmi <- voomWithQualityWeights(myDGEList.filtered.norm, 
                                          design.AGR.fmi, 
                                          block=targets.multiomics$Individual.ID,
                                          correlation=corfit.agr.fmi$consensus)

fit.agr.fmi <- lmFit(counts_voom.agr.fmi, design.AGR.fmi)

ebFit.agr.fmi <- eBayes(fit.agr.fmi, robust = TRUE)

results.agr.fmi <- decideTests(ebFit.agr.fmi, 
                           method="separate", 
                           adjust.method="BH", 
                           p.value=pval_threshold, 
                           lfc=0)

The above has been working well. However, the age distribution of the females covered likely pre-, peri-, and post-menopausal ages. We tested FSH levels for the women and have now confidently labelled each women as either pre-, peri-, or post-menopausal. However, when I include menopausal status (as a factor) as a covariate, I receive a warning letting me know the matrix is no longer full rank. This is the case whether males are all labelled as "male" or by their individual ID code. What especially confused me is that if I include the FSH measurement as a continuous value, the matrix remains full rank. I'm not exactly sure how to deal with this as the correction needs to be done using menopausal status, which is somehow loses full rank.

Greatly appreciate some wisdom here! I am a little out of my depth with this project I feel.

Thanks in advance

limma • 1.5k views
ADD COMMENT
1
Entering edit mode
@james-w-macdonald-5106
Last seen 7 hours ago
United States

Do you end up with a column of all zeros when you include menopausal status as a factor? Also, please note that voomLmFit is the preferred method rather than iterating between voomWithQualityWeights and duplicateCorrelation.

Login before adding your answer.

Traffic: 475 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6