Entering edit mode
Massimo Pinto
▴
390
@massimo-pinto-3396
Last seen 10.2 years ago
Greetings all,
I am writing to enquire about the meaning of some elements that are
provided in an limma::MArray-LM object resulting from a call to
limma::lmFit();
To avoid a verbose output, I tried to minimize the number of genes to
be fitted, by selecting just two genes from my expression set:
>eset.2167 <- eset.more[c(848,849)]
>fit.2167 <- lmFit(eset.2167, designo)? # where 'designo' is my design
matrix
I did try fitting a selection from my expression set containing just
one gene, but I understand that lmFit() does not like this, as it
yields:
"Error in? fit$effects[(fit$rank + 1):narrays, , drop = FALSE] :
incorrect number of dimensions"
My concern, specifically, is that I am not sure how the standard
errors on the lmFit's coefficients are calculated, as they are
returned with a value that differs from what I get from a separate run
in MSExcel, using the built-in function linest(), with the same design
matrix and vector of y kept the same. Convincingly, the output of best
values of the fitted coefficients is identical in lmFit and MSExcel's
linest().
Take for example $stdev.unscaled which is defined (from ?MArrayLM) as
"$stdev.unscaled
matrix containing unscaled standard deviations of the coefficients or
contrasts"
> fit.2167$stdev.unscaled
?????????? (Intercept)?? Dose1Gy Ageing6mo Dose1Gy:Ageing6mo
Ageing6mo:LabLNGS Dose1Gy:Ageing6mo:LabLNGS
A_23_P8812???????? 0.5 0.7071068 0.7071068???????????????? 1
0.7071068???????????????????????? 1
A_24_P7642???????? 0.5 0.7071068 0.7071068???????????????? 1
0.7071068???????????????????????? 1
Now to MSExcel, focusing on the first of the two genes (Gene ID 2167
mapped by probe A_23_P8812). Here are the standard errors on the
fitted coefficients, as results from a call to linest()
Dose1Gy:Ageing6mo:LabLNGS Ageing6mo:LabLNGS
Dose1Gy:Ageing6mo Ageing6mo Dose1Gy (Intercept)
best estimates 0,50034025 -1,75687825 -0,3867455
3,055038 0,21681525 8,00081275
std errors
0,262705064 0,185760532 0,262705064 0,185760532
0,185760532 0,131352532
As one can see, the stderr on the (intercept), for example, is
0,131352532, whereas limma:lmFit() gave my a 0.5.
What is the reason for such apparent discrepancy?
Is there any a-priori mistake made here, since Excel fitted just one
gene while lmFit() focused on two genes in this specific example?
Thanking you in advance, here is my sessionInfo()
Yours,
Massimo
> sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-apple-darwin9.8.0
locale:
[1] C
attached base packages:
?[1] grid????? tcltk???? tools???? stats???? graphics? grDevices utils
??? datasets? methods?? base
other attached packages:
?[1] GOstats_2.12.0????????? graph_1.22.2??????????? Category_2.12.0
????? affy_1.24.2???????????? gplots_2.7.3??????????? caTools_1.10
?[7] bitops_1.0-4.1????????? gdata_2.6.1???????????? gtools_2.6.1
????? hgug4112a.db_2.3.5????? org.Hs.eg.db_2.3.6????? RSQLite_0.7-3
[13] DBI_0.2-4?????????????? Agi4x44PreProcess_1.6.0 genefilter_1.28.0
????? annotate_1.24.0???????? AnnotationDbi_1.7.20??? limma_3.2.1
[19] Biobase_2.6.0?????????? svGUI_0.9-46??????????? svSocket_0.9-48
????? svMisc_0.9-56
loaded via a namespace (and not attached):
[1] GO.db_2.3.5????????? GSEABase_1.8.0?????? RBGL_1.20.0
XML_2.6-0??????????? affyio_1.13.5??????? preprocessCore_1.7.9
splines_2.10.0
[8] survival_2.35-7????? xtable_1.5-6
>
Massimo Pinto
Post Doctoral Research Fellow
Enrico Fermi Centre and Italian Public Health Research Institute
(ISS), Rome
http://claimid.com/massimopinto