limma paired design with continous covariate
4
0
Entering edit mode
meeta.mistry ▴ 30
@meetamistry-7355
Last seen 2.5 years ago
United States

Hello;

I have a simple design, but can't seem to find anything on Bioconductor support that addresses the error I am getting. Any help would be much appreciated.

I have a paired design as shown below:

> pData(data.norm)
        Treatment Patient Age Gender
HD 3.1    control       3  25      F
HD 3.2     miR92a       3  25      F
HD 4.1    control       4  41      F
HD 4.2     miR92a       4  41      F
HD 19.1   control      19  22      F
HD 19.2    miR92a      19  22      F
HD 24.1   control      24  35      F
HD 24.2    miR92a      24  35      F
HD 25.1   control      25  23      F
HD 25.2    miR92a      25  23      F

I am trying to find gene expression associated with Treatment while also controlling for age. I used the following design but end up getting NAs for the age coefficient.  Is there a better way of modeling this?

# Setup design matrix
age <- pData(data.norm)$Age
treat <- pData(data.norm)$Treatment
sample <- pData(data.norm)$Patient
design <- model.matrix(~ sample + treat)

# Fit model
fit <- lmFit(exprs(data.norm), design)

Coefficients not estimable: age 
Warning message:
Partial NA coefficients for 47323 probe(s) 

Many thanks,

Meeta

> sessionInfo()
R version 3.2.5 (2016-04-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu precise (12.04.5 LTS)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] splines   parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pheatmap_1.0.8             gProfileR_0.5.3            treemap_2.4                gridExtra_2.2.1           
 [5] knitr_1.12.3               CHBUtils_0.1               dplyr_0.4.3                reshape_0.8.5             
 [9] arrayQualityMetrics_3.26.1 RColorBrewer_1.1-2         lattice_0.20-33            limma_3.26.8              
[13] beadarray_2.20.1           ggplot2_2.1.0              Biobase_2.30.0             BiocGenerics_0.16.1   

 

 

limma paired covariate • 1.8k views
ADD COMMENT
3
Entering edit mode
@ryan-c-thompson-5618
Last seen 3 months ago
Icahn School of Medicine at Mount Sinai…

The code you have shown would not include an Age coefficient in the design at all, so I'm not sure where that warning is coming from. However, controlling for inter-patient differences already controls for the age differences between the patients, so there's no need to add an Age coefficient to your model.

ADD COMMENT
1
Entering edit mode
@steve-lianoglou-2771
Last seen 22 months ago
United States

Take a look at the design matrix to ensure you have several columns designated for patient ID. If Patient is an integer/real value, you'll get something like this:

treatment <- c('ctrl', 'trt', 'ctrl', 'trt', 'ctrl', 'trt', 'ctrl', 'trt')
patient <- c(1, 1, 2, 2, 3, 3, 4, 4)
model.matrix(~ patient + treatment)
#   (Intercept) patient treatmenttrt
#             1       1            0
#             1       1            1
#             1       2            0
#             1       2            1
#             1       3            0
#             1       3            1
#             1       4            0
#             1       4            1

But you really want

patient <- factor(patient)
model.matrix(~ patient + treatment)
# (Intercept) patient2 patient3 patient4 treatmenttrt
#           1        0        0        0            0
#           1        0        0        0            1
#           1        1        0        0            0
#           1        1        0        0            1
#           1        0        1        0            0
#           1        0        1        0            1
#           1        0        0        1            0
#           1        0        0        1            1

 

ADD COMMENT
0
Entering edit mode

Ah, damn, you beat me to it. I should mention, though, that treatment is confounded with patient in your example.

ADD REPLY
0
Entering edit mode

Thankfully! You and Mike Love are eating up all the points around here these days ... ;-)

Thanks for pointing out the confounded treatment. Too focussed on showing differences between integer and factor patient id rather than whipping up a fully legit example. I've updated my answer now to take that into account for posterity ... for the children.

ADD REPLY
0
Entering edit mode

Thanks for the help. My design is setup like the latter with the different patients as separate factor levels.  

ADD REPLY
0
Entering edit mode
meeta.mistry ▴ 30
@meetamistry-7355
Last seen 2.5 years ago
United States

Sorry, that code is incorrect, it shoudl read:

 

# Setup design matrix
age <- pData(data.norm)$Age
treat <- pData(data.norm)$Treatment
sample <- pData(data.norm)$Patient
design <- model.matrix(~ sample + age + treat)

# Fit model
fit <- lmFit(exprs(data.norm), design)
ADD COMMENT
0
Entering edit mode

For future reference, please make amendments to your original post rather than making a new answer. Also, use the "add comment" button to respond to answers; this is because the order of the answers change upon voting.

ADD REPLY
0
Entering edit mode

Ok will do, thanks!

ADD REPLY
0
Entering edit mode
meeta.mistry ▴ 30
@meetamistry-7355
Last seen 2.5 years ago
United States

I just realized what you meant. Age is already being controlled for within each pair since the samples for treatment and control come from the same individual. Should have seen that, sorry, Thanks!

ADD COMMENT

Login before adding your answer.

Traffic: 302 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6