Entering edit mode
Paul Boutros
▴
340
@paul-boutros-371
Last seen 10.2 years ago
Hi again,
For an experiment I'm analyzing, I do not have a series of factors.
Rather, I
have a pair of test-score (numerical) for each replicate animal. This
is
Affymetrix data, and for my initial pass I tried with only a single
test-score.
The commands I used are below:
> library(gcrma);
Welcome to Bioconductor
Vignettes contain introductory material. To view,
simply type: openVignette()
For details on reading vignettes, see
the openVignette help page.
> library(limma);
> cel.files <- c(
+ 'RAE230_2_060104_LH_IM07T.CEL',
+ 'RAE230_2_060104_LH_IM08T.CEL',
+ 'RAE230_2_060104_LH_IM09T.CEL',
+ 'RAE230_2_060104_LH_IM10T.CEL',
+ 'RAE230_2_060204_LH_IM07T.CEL',
+ 'RAE230_2_060204_LH_IM08T.CEL',
+ 'RAE230_2_060204_LH_IM09T.CEL',
+ 'RAE230_2_060204_LH_IM10T.CEL'
+ );
> eset <- ReadAffy(filenames=cel.files, phenoData="phenodata.txt");
> eset <- ReadAffy(filenames=cel.files, phenoData="phenodata.txt");
> eset;
AffyBatch object
size of arrays=834x834 features (43476 kb)
cdf=Rat230_2 (31099 affyids)
number of samples=8
number of genes=31099
annotation=rat2302
> pData(eset);
TestScore1
RAE230_2_060104_LH_IM07T.CEL 0.58
RAE230_2_060104_LH_IM08T.CEL -2.36
RAE230_2_060104_LH_IM09T.CEL -12.24
RAE230_2_060104_LH_IM10T.CEL -14.84
RAE230_2_060204_LH_IM07T.CEL 0.15
RAE230_2_060204_LH_IM08T.CEL -3.23
RAE230_2_060204_LH_IM09T.CEL -11.66
RAE230_2_060204_LH_IM10T.CEL -12.91
> eset <- rma(eset);
Background correcting
Normalizing
Calculating Expression
> design <- model.matrix(~-1 + TestScore1, pData(eset));
> design;
TestScore1
RAE230_2_060104_LH_IM07T.CEL 0.58
RAE230_2_060104_LH_IM08T.CEL -2.36
RAE230_2_060104_LH_IM09T.CEL -12.24
RAE230_2_060104_LH_IM10T.CEL -14.84
RAE230_2_060204_LH_IM07T.CEL 0.15
RAE230_2_060204_LH_IM08T.CEL -3.23
RAE230_2_060204_LH_IM09T.CEL -11.66
RAE230_2_060204_LH_IM10T.CEL -12.91
attr(,"assign")
[1] 1
> fit1 <- lmFit(eset, design);
> fit3 <- eBayes(fit1);
All proceeds well without any error-messages, so I believed I had
successfully
fit my model. When I extract the data, however, I get some unexpected
results:
> topTable(fit3, coef=1, number=20, adjust="fdr");
ID M A t P.Value
B
104 1367555_at -1.212175 14.77173 -6.808742 3.912627e-07
14.75482
105 1367556_s_at -1.203275 14.67175 -6.762709 3.912627e-07
14.48816
3411 1370862_at -1.185363 14.42457 -6.674599 3.912627e-07
13.98156
2777 1370228_at -1.184768 14.46704 -6.666414 3.912627e-07
13.93475
549 1368000_at -1.175866 14.33987 -6.622929 3.912627e-07
13.68683
2697 1370148_at -1.174858 14.32747 -6.617795 3.912627e-07
13.65764
837 1368288_at -1.170412 14.25957 -6.596453 3.912627e-07
13.53649
710 1368161_a_at -1.169621 14.27702 -6.589664 3.912627e-07
13.49801
420 1367871_at -1.166552 14.09481 -6.588461 3.912627e-07
13.49120
2558 1370009_at -1.162241 14.18252 -6.552347 3.912627e-07
13.28707
147 1367598_at -1.161072 14.19550 -6.543620 3.912627e-07
13.23787
2576 1370027_a_at -1.158955 14.13682 -6.536068 3.912627e-07
13.19533
1136 1368587_at -1.154447 14.04036 -6.517046 3.912627e-07
13.08836
196 1367647_at -1.154999 14.07951 -6.516673 3.912627e-07
13.08627
31081 AFFX-r2-P1-cre-5_at -1.153568 14.05140 -6.510380 3.912627e-07
13.05094
19150 1387082_at -1.153501 14.05431 -6.509674 3.912627e-07
13.04697
2898 1370349_a_at -1.153164 14.07047 -6.505935 3.912627e-07
13.02599
1235 1368686_at -1.152660 14.04966 -6.504796 3.912627e-07
13.01960
8200 1375651_at -1.152557 14.04428 -6.504674 3.912627e-07
13.01892
19359 1387291_at -1.151666 14.03699 -6.499744 3.912627e-07
12.99127
The near-identical M, A, and p-values indicate a problem, and none of
the genes
on this list are very plausible biologically for our test-system.
Based on that
I'm pretty sure I've gone astray somewhere.
Is it possible to use numeric scores in fitting a linear model with
limma? If
so, am I asking the question in the right manner? If not, are there
any
BioConductor tools appropriate for this kind of question?
Any help very much appreciated,
Paul