Entering edit mode
Guido Hooiveld
★
4.1k
@guido-hooiveld-2020
Last seen 7 days ago
Wageningen University, Wageningen, the …
Hi,
I am doing an analysis of a colleague's dataset, but I have troubles
reproducing the results on fold changes he obtained by manually
calculating these in Excel.
The design of the experiment is as follows:
- affymetrix arrays, RMA normalized
- subjects got 3 treatments (T1, T2, T3), for now we focus on the
differentially expressed genes between T2 and T1.
- since multiple samples were obtained from the same subjects, a
paired analysis is performed. Therefore I included 'Subject' as factor
in the design matrix, in addition to Treatment.
- the model includes both 'Treatment' and 'Subject'
- in the contrast matrix (only) 'Treatment' is defined as contrast of
interest.
- limma runs without any problems.
So far, so good.
However, when comparing the mean of all individual FC, thus the
coefficient of contrast 'T2vsT1' from fit2, we noticed this differs
from the mean FC as manually calculated in Excel (by subtracting for
each subject the T1 expression level from T2 [i.e. T2-T1]; subtract
because data is on log2 scale). Example: for probeset 230_at limma
coefficient (log FC) for 'T2vsT1' is -1.1007688, whereas by hand we
find the mean log FC is -1.088240375. Not a large difference, but I
expected it to be identical... The A value (Ave Expr) is identical
between limma and manual calculation.
Next i checked the average group expression values calculated by limma
(coefs from object 'fit1' obtained when fitting the model; before
doing the contasts). Then i noticed that the mean expression values
(coefs) are different from the ones calculated by hand, and this
*likely* causes the slightly differences. Moreover, i noticed a
coefficient was also calculated for each Subject.
Q1: why do the average expression levels differ between limma (from
fit1) and excel? Are these 'adjusted' for 'Subject'. If so, why this
wanted?
Q2: how to biologically interpret the coefficients that are returned
for each subject? Is this a measure of how each of them deviate from
the mean?
Any insights are appreciated, :)
Guido
> affy.data <- ReadAffy(cdfname="hugene11stv1hsentrezg")
> x.norm <- rma(affy.data)
>
> targets <- readTargets("targets_FL.txt")
> Treatment <- as.factor(targets$Treatment)
> Subject <- as.factor(targets$Subject)
>
> design <- model.matrix(~0+Treatment + Subject)
>
> fit1 <- lmFit(x.norm, design)
>
> cont.matrix <- makeContrasts(
+ #compareT2 vs T1 (=TreatmentT2-Treatment T1)
+ T2vsT1=(TreatmentT2-TreatmentT1),
+ T3avsT2=(TreatmentT3a-TreatmentT2),
+ T3bvsT2=(TreatmentT3b-TreatmentT2),
+ levels=design)
>
>
> fit2 <- contrasts.fit(fit1, cont.matrix)
> fit2 <- eBayes(fit2)
> topTable(fit2,coef=1)
ID logFC AveExpr t P.Value
adj.P.Val
5076 230_at -1.1007688 10.479305 -11.944841 5.355630e-18
1.050882e-13
2396 129804_at 0.5968911 7.007159 11.576944 2.140830e-17
2.100368e-13
19071 9415_at -1.6726710 9.445871 -11.450699 3.457049e-17
2.261141e-13
7193 285754_at -1.0808572 7.133550 -10.945659 2.395265e-16
1.174997e-12
11167 51_at -0.5467982 9.673153 -10.853833 3.416245e-16
1.340671e-12
9286 3952_at -0.6370817 10.000592 -10.484375 1.438641e-15
4.704836e-12
13983 6319_at -1.0432746 12.180506 -10.236676 3.802185e-15
1.065807e-11
9494 4017_at -0.5091157 9.288021 -10.198771 4.414266e-15
1.082709e-11
8582 368_at -0.5483289 7.093417 -10.059173 7.658057e-15
1.669627e-11
13660 595_at -0.5011958 9.767599 -9.970447 1.087936e-14
2.134748e-11
B
5076 30.42098
2396 29.09107
19071 28.63062
7193 26.76836
11167 26.42637
9286 25.04037
13983 24.10246
9494 23.95835
8582 23.42630
13660 23.08709
###########
## Output fit1
###########
> fit1 <- lmFit(x.norm, design)
> fit1
An object of class "MArrayLM"
$coefficients
TreatmentT1 TreatmentT2 TreatmentT3a TreatmentT3b
Subject7
100009676_at 7.084174 7.101001 7.029655 7.090152
-0.21206182
10000_at 9.295713 9.338025 9.339492 9.359302
-0.06561657
10001_at 7.604867 7.636235 7.655008 7.665815
0.02216949
10002_at 4.375210 4.299118 4.393576 4.374674
-0.01616572
100033413_at 4.492079 4.453446 4.413175 4.522281
-0.39793930
Subject12 Subject20 Subject24 Subject25
Subject26
100009676_at -0.120245077 -0.051913214 0.152116448 -0.30216929
-0.0817054427
10000_at 0.006550643 -0.095030864 -0.006489267 -0.13069982
0.0515787097
<snip>
$design
TreatmentT1 TreatmentT2 TreatmentT3a TreatmentT3b Subject7 Subject12
1 1 0 0 0 0 0
2 0 0 1 0 0 0
3 1 0 0 0 0 0
4 0 1 0 0 0 0
5 0 0 1 0 0 0
Subject20 Subject24 Subject25 Subject26 Subject29 Subject30
Subject32
1 0 0 0 0 0 0
0
2 0 0 0 0 0 0
0
3 0 0 0 0 1 0
0
4 0 0 0 0 1 0
0
5 0 0 0 0 1 0
0
Subject34 Subject40 Subject43 Subject45 Subject47 Subject48
Subject51
1 0 0 0 0 0 0
0
2 0 0 0 0 0 0
0
3 0 0 0 0 0 0
0
4 0 0 0 0 0 0
0
5 0 0 0 0 0 0
0
Subject52 Subject61 Subject64 Subject65 Subject66 Subject69
Subject71
1 0 0 0 0 0 1
0
2 0 0 0 0 0 1
0
3 0 0 0 0 0 0
0
4 0 0 0 0 0 0
0
5 0 0 0 0 0 0
0
<snip>
#######
## contrast matrix
#######
> cont.matrix
Contrasts
Levels T2vsT1 T3avsT2 T3bvsT2
TreatmentT1 -1 0 0
TreatmentT2 1 -1 -1
TreatmentT3a 0 1 0
TreatmentT3b 0 0 1
Subject7 0 0 0
Subject12 0 0 0
Subject20 0 0 0
Subject24 0 0 0
Subject25 0 0 0
Subject26 0 0 0
<snip>
> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] genefilter_1.32.0 annaffy_1.22.0
[3] KEGG.db_2.4.5 GO.db_2.4.5
[5] RSQLite_0.9-4 DBI_0.2-5
[7] AnnotationDbi_1.12.1 qvalue_1.26.0
[9] multtest_2.8.0 limma_3.6.9
[11] gcrma_2.23.0-1 hugene11stv1hsentrezgcdf_13.0.0
[13] affy_1.29.1 Biobase_2.10.0
loaded via a namespace (and not attached):
[1] affyio_1.19.2 annotate_1.28.0 Biostrings_2.18.0
[4] IRanges_1.8.9 MASS_7.3-8 preprocessCore_1.12.0
[7] splines_2.12.0 survival_2.35-8 tcltk_2.12.0
[10] tools_2.12.0 xtable_1.5-6
>
[[alternative HTML version deleted]]