Hello;
I have a simple design, but can't seem to find anything on Bioconductor support that addresses the error I am getting. Any help would be much appreciated.
I have a paired design as shown below:
> pData(data.norm)
        Treatment Patient Age Gender
HD 3.1    control       3  25      F
HD 3.2     miR92a       3  25      F
HD 4.1    control       4  41      F
HD 4.2     miR92a       4  41      F
HD 19.1   control      19  22      F
HD 19.2    miR92a      19  22      F
HD 24.1   control      24  35      F
HD 24.2    miR92a      24  35      F
HD 25.1   control      25  23      F
HD 25.2    miR92a      25  23      F
I am trying to find gene expression associated with Treatment while also controlling for age. I used the following design but end up getting NAs for the age coefficient. Is there a better way of modeling this?
# Set up design matrix
age <- pData(data.norm)$Age
treat <- pData(data.norm)$Treatment
sample <- pData(data.norm)$Patient
design <- model.matrix(~ sample + age + treat)

# Fit model
fit <- lmFit(exprs(data.norm), design)
Coefficients not estimable: age
Warning message:
Partial NA coefficients for 47323 probe(s)
Many thanks,
Meeta
> sessionInfo()
R version 3.2.5 (2016-04-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu precise (12.04.5 LTS)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines   parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] pheatmap_1.0.8             gProfileR_0.5.3
 [3] treemap_2.4                gridExtra_2.2.1
 [5] knitr_1.12.3               CHBUtils_0.1
 [7] dplyr_0.4.3                reshape_0.8.5
 [9] arrayQualityMetrics_3.26.1 RColorBrewer_1.1-2
[11] lattice_0.20-33            limma_3.26.8
[13] beadarray_2.20.1           ggplot2_2.1.0
[15] Biobase_2.30.0             BiocGenerics_0.16.1
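For readers who hit the same warning, here is a rough sketch of why the age coefficient comes back as NA. It uses base R only and a hypothetical data frame `pd` rebuilt by hand from the table in the question (the names `pd`, `patient_cols` and `age_fit` are made up for illustration, and Patient is assumed to be a factor). Because each patient has exactly one age, the age column is an exact linear combination of the intercept and the patient columns, so the design matrix loses a rank.

```r
# Hypothetical reconstruction of the sample annotation from the question
pd <- data.frame(
  Treatment = factor(rep(c("control", "miR92a"), times = 5)),
  Patient   = factor(rep(c(3, 4, 19, 24, 25), each = 2)),
  Age       = rep(c(25, 41, 22, 35, 23), each = 2)
)

# Design with patient blocking, age, and treatment
design <- model.matrix(~ Patient + Age + Treatment, data = pd)

# The design has 7 columns but only rank 6, so one coefficient is not estimable
ncol(design)
qr(design)$rank

# Age is reproduced exactly by the intercept plus the patient indicator columns:
# regressing Age on those columns leaves residuals that are numerically zero
patient_cols <- design[, c("(Intercept)",
                           grep("^Patient", colnames(design), value = TRUE))]
age_fit <- lm.fit(patient_cols, pd$Age)
max(abs(age_fit$residuals))
```

If that reading is right, nothing is lost by dropping age from the formula: with one blocking term per patient, every patient-level covariate (age, gender, and so on) is already absorbed by the patient coefficients.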
Ah, damn, you beat me to it. I should mention, though, that treatment is confounded with patient in your example.
Thankfully! You and Mike Love are eating up all the points around here these days ... ;-)
Thanks for pointing out the confounded treatment. Too focussed on showing differences between integer and factor patient id rather than whipping up a fully legit example. I've updated my answer now to take that into account for posterity ... for the children.
Thanks for the help. My design is set up like the latter, with the different patients as separate factor levels.
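As a small illustration of the integer-versus-factor point discussed above, the sketch below builds both versions of the paired design in base R. The expression values (`baseline`, `paired_diff`, `y`) are invented for a single hypothetical probe, and `lm.fit` stands in for the per-probe least-squares fit; none of this is taken from the original answer.

```r
# Patient ids and treatment labels laid out as in the question
patient_id <- rep(c(3, 4, 19, 24, 25), each = 2)
treat <- factor(rep(c("control", "miR92a"), times = 5))

# Integer patient id: a single slope column, which does NOT pair the samples
design_int <- model.matrix(~ patient_id + treat)
# Factor patient id: one blocking column per patient beyond the reference level
design_fac <- model.matrix(~ factor(patient_id) + treat)

ncol(design_int)  # intercept, patient_id slope, treatment
ncol(design_fac)  # intercept, four patient indicators, treatment

# Hypothetical expression values for one probe (assumed, for illustration only)
baseline    <- c(7.0, 9.0, 6.5, 8.0, 7.5)   # per-patient control level
paired_diff <- c(1.0, 2.0, 1.5, 0.5, 2.5)   # per-patient treatment effect
y <- rep(baseline, each = 2) + c(rbind(0, paired_diff))

fit_int <- lm.fit(design_int, y)
fit_fac <- lm.fit(design_fac, y)

# With the factor design, the treatment coefficient is the mean paired difference
fit_fac$coefficients[["treatmiR92a"]]
mean(paired_diff)

# Patient-to-patient variation stays in the residuals of the integer design
sum(fit_int$residuals^2)
sum(fit_fac$residuals^2)
```

In this balanced layout both designs give the same treatment estimate; what differs is how much between-patient variation is left in the residuals, which is what drives the test statistic. With real data, the factor version of the design would be the one passed to `lmFit`, followed by `eBayes` and `topTable` on the treatment coefficient.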