Hello.
I'm trying to analyze some Affymetrix Human Gene 1.0 ST arrays with oligo and limma, and I receive an error when I try to generate a top table.
To keep things simple for the moment, I would just like to see differential expression based on the dichotomous variable Diagnosis in my pheno file. Is it correct to generate the model matrix the way I tried?
Any help would be much appreciated.
Here is my code, followed by the error and the output of sessionInfo().
library(oligo)
celFiles <- list.celfiles()
pheno <- read.AnnotatedDataFrame("phenodata.txt", header = TRUE, row.name="Name",sep="\t")
oligoRaw <- read.celfiles(filenames=celFiles, phenoData=pheno)
eset <- oligo::rma(oligoRaw)
library(limma)
dm<-model.matrix(~pheno$Diagnosis)
fit<-lmFit(eset, dm)
fitE<-eBayes(fit)
toptable(fitE, coef=pheno$Diagnosis)
Error in fit$coefficients[, coef, drop = FALSE] : subscript out of bounds
In addition: Warning message:
In toptable(fitE, coef = pheno$Diagnosis) :
  Treat is for single coefficients: only first value of coef being used
> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
 [1] limma_3.24.13              pd.hugene.1.0.st.v1_3.14.1 RSQLite_1.0.0
 [4] DBI_0.3.1                  oligo_1.32.0               Biostrings_2.36.1
 [7] XVector_0.8.0              IRanges_2.2.5              S4Vectors_0.6.1
[10] Biobase_2.28.0             oligoClasses_1.30.0        BiocGenerics_0.14.0

loaded via a namespace (and not attached):
 [1] affxparser_1.40.0     GenomicRanges_1.20.5  splines_3.2.1
 [4] zlibbioc_1.14.0       bit_1.1-12            foreach_1.4.2
 [7] GenomeInfoDb_1.4.1    tools_3.2.1           ff_2.2-13
[10] iterators_1.0.7       preprocessCore_1.30.0 affyio_1.36.0
[13] codetools_0.2-14      BiocInstaller_1.18.3
You have fit a model with an intercept, so the first coefficient estimates the mean of one of your groups (the reference level). When you test this coefficient, the hypothesis you are testing is whether or not the coefficient is equal to zero. Since you are estimating the mean log-expression of a sample type, the estimates are almost always very different from zero, so you get very small p-values.
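To illustrate which coefficient is which, here is a minimal sketch on simulated data. The matrix expr and the factor diagnosis are made-up stand-ins for the RMA output and pheno$Diagnosis, and the level names "control" and "case" are assumptions, not taken from the original phenodata file. The point is that coef should select a column of the design matrix (by position or by name) rather than receive the phenotype vector itself, and that with this design the group difference is the second coefficient. The sketch uses topTable(), the form that current limma documents.

```r
library(limma)

set.seed(1)
# Simulated stand-in for the RMA expression matrix: 100 probesets x 6 arrays,
# centred around 8 to resemble log2 intensities (assumption for illustration).
expr <- matrix(rnorm(600, mean = 8), nrow = 100,
               dimnames = list(paste0("probe", 1:100), paste0("array", 1:6)))

# Hypothetical two-level factor standing in for pheno$Diagnosis.
diagnosis <- factor(rep(c("control", "case"), each = 3),
                    levels = c("control", "case"))

# Design with an intercept: column 1 is the mean of the reference level
# ("control"), column 2 is the case-minus-control difference.
dm <- model.matrix(~ diagnosis)
colnames(dm)

fit  <- lmFit(expr, dm)
fitE <- eBayes(fit)

# Test the group difference: select the design column by position or by name.
tt        <- topTable(fitE, coef = 2)
ttByName  <- topTable(fitE, coef = "diagnosiscase")

# For contrast: coef = 1 tests whether the control mean is zero, which is
# rarely a question of interest for log-expression values.
ttIntercept <- topTable(fitE, coef = 1)
```

With real data, checking colnames(dm) first shows the exact coefficient names available for coef.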