Entering edit mode
Guido Hooiveld
★
4.1k
@guido-hooiveld-2020
Last seen 7 days ago
Wageningen University, Wageningen, the …
Hi,
I am doing an analysis on a dataset, and have a question on whether or
not to include the paired structure (sibship) in the model fitted by
limma if this is not explicitly needed. We discussed this internally,
but didn't get consensus. Hence, I would like to ask for opinions on
this list... :)
Let me explain:
- Affymetrix data, RMA normalized.
- samples from 20 subjects were analyzed on arrays, obtained from 10
'young' and 10 'old' individuals.
- samples were taken at baseline, and after treatment with a drug
(WY).
Part of target file:
>targets
Filename Treatment Sibship Category
1 G068_B07_05_05_CTRL.CEL Ctrl 5 old
2 G068_B09_06_06_CTRL.CEL Ctrl 6 old
3 G068_C05_07_07_CTRL.CEL Ctrl 7 old
4 G068_C09_09_09_CTRL_2.CEL Ctrl 9 young
5 G068_D05_10_10_CTRL.CEL Ctrl 10 young
6 G068_D07_11_11_CTRL.CEL Ctrl 11 young
7 G068_F07_17_05_WY.CEL WY 5 old
8 G068_F09_18_06_WY.CEL WY 6 old
9 G068_G05_19_07_WY.CEL WY 7 old
10 G068_G09_21_09_WY.CEL WY 9 young
11 G068_H05_22_10_WY.CEL WY 10 young
12 G068_H07_23_11_WY.CEL WY 11 young
It is obvious that for the treatment effect a paired analysis should
be performed (using sibship info). This could be done for the whole
group, or for young and old separately.
> TC <- as.factor(paste(targets$Treatment, targets$Category, sep="."))
> design <- model.matrix(~0+TC+Sibship)
>
> fit <- lmFit(x.norm, design)
Coefficients not estimable: Sibship8 Sibship12
Warning message:
Partial NA coefficients for 19682 probe(s)
>
>#test for effect of WY in old and young
> cont.matrix <- makeContrasts(
+ WYold=(TCWY.old-TCCtrl.old),
+ WYyoung=(TCWY.young-TCCtrl.young),
+ levels=design)
>
> fit2 <- contrasts.fit(fit, cont.matrix)
> fit2 <- eBayes(fit2)
If I would like to identify the probesets that are differentially
expressed between old and young under either control or treatment
conditions, I am essentially performing an unpaired t-test. Hence,
info on sibship is thus not required.
> cont.matrix <- makeContrasts(
+ ctrlold_ctrlyoung=(TCCtrl.old-TCCtrl.young),
+ WYold_WYyoung=(TCWY.old-TCWY.young),
+ levels=design)
>
However, I noticed that the results of these 2nd set of comparisons
(the old vs young) are strongly affected by including or not the
sibship in the design. In other words, if I define this design:
> design <- model.matrix(~0+TC+Sibship)
I get a completely different set of top regulated probesets for the
before mentioned contrasts when compared to this design (without
sibship):
> design <- model.matrix(~0+TC)
I noticed that also the p-values are much smaller when including the
sibship.
As an example,
WITH sibship:
> topTable(fit2,coef=1)
ID logFC AveExpr t P.Value adj.P.Val
B
15031 671_at -4.763308 5.591752 -51.58016 4.457435e-18 6.583366e-14
29.06475
9938 4317_at -5.545104 4.893897 -50.17288 6.689732e-18 6.583366e-14
28.82092
7454 2944_at -7.992487 5.861613 -43.89646 4.747228e-17 3.114498e-13
27.56297
14844 654433_at 5.136677 5.807219 40.89921 1.337134e-16 6.579367e-13
26.84567
1115 1088_at -4.762000 5.650997 -39.67501 2.085766e-16 8.210408e-13
26.52704
656 10321_at -6.458380 5.445312 -37.74764 4.319761e-16 1.417026e-12
25.99187
4162 1991_at -4.080680 6.171290 -36.76955 6.339135e-16 1.627014e-12
25.70344
3784 1669_at -6.130596 5.171480 -36.66315 6.613207e-16 1.627014e-12
25.67134
9557 4057_at -6.079963 5.789428 -32.16516 4.460724e-15 9.755107e-12
24.17063
2584 1359_at 3.730962 6.155636 30.49084 9.709508e-15 1.911025e-11
23.53108
>
WITHOUT sibship:
> topTable(fit2,coef=1)
ID logFC AveExpr t P.Value adj.P.Val
B
14844 654433_at 5.320186 5.807219 9.802842 2.297163e-09 2.577632e-05
11.534050
18751 9173_at 3.052422 4.715249 9.439677 4.453505e-09 2.577632e-05
10.917557
6220 2624_at 2.443030 5.771388 9.281647 5.970493e-09 2.577632e-05
10.643512
13773 59340_at 4.154059 4.967316 9.221019 6.686698e-09 2.577632e-05
10.537431
2584 1359_at 3.572039 6.155636 9.095515 8.466416e-09 2.577632e-05
10.316169
16233 79608_at 1.774024 5.678042 9.073712 8.822506e-09 2.577632e-05
10.277501
2040 1232_at 2.911866 4.211083 9.053443 9.167473e-09 2.577632e-05
10.241491
17108 83478_at 1.522142 6.045796 8.750724 1.635842e-08 4.024580e-05
9.696605
1855 1178_at 3.134552 8.481612 8.619180 2.111666e-08 4.304048e-05
9.455663
6733 2766_at -2.157836 5.223548 -8.568223 2.332607e-08 4.304048e-05
9.361647
>
Thus, although it is not required, would it be recommended to include
for the 2nd set of contrasts the paired structure of the data in the
design?
I would argue to do so (since intuitively I feel it would be good to
always include as much info as possible on the correlation structure
of the data), but as said not everyone in the project team agrees with
me.
So any opinions/comments are much appreciated.
Regards,
Guido
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United
States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
LC_TIME=English_United States.1252
attached base packages:
[1] splines stats graphics grDevices utils datasets
methods base
other attached packages:
[1] hugene11stv1hsentrezg.db_15.1.0 org.Hs.eg.db_2.7.1
RSQLite_0.11.1 DBI_0.2-5
hugene11stv1hsentrezgcdf_15.1.0 AnnotationDbi_1.18.3
[7] qvalue_1.30.0 multtest_2.12.0
affy_1.34.0 limma_3.12.3
pamr_1.54 survival_2.36-14
[13] cluster_1.14.2 bladderbatch_1.0.3
Biobase_2.16.0 BiocGenerics_0.2.0
sva_3.2.1 mgcv_1.7-20
[19] corpcor_1.6.4 BiocInstaller_1.4.7
loaded via a namespace (and not attached):
[1] affyio_1.24.0 grid_2.15.1 IRanges_1.14.4
lattice_0.20-10 MASS_7.3-21 Matrix_1.0-9
nlme_3.1-104 preprocessCore_1.18.0
[9] stats4_2.15.1 tcltk_2.15.1 tools_2.15.1
zlibbioc_1.2.0
>
---------------------------------------------------------
Guido Hooiveld, PhD
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
Wageningen University
Biotechnion, Bomenweg 2
NL-6703 HD Wageningen
the Netherlands
tel: (+)31 317 485788
fax: (+)31 317 483342
email: guido.hooiveld@wur.nl
internet: http://nutrigene.4t.com
http://scholar.google.com/citations?user=qFHaMnoAAAAJ
http://www.researcherid.com/rid/F-4912-2010
[[alternative HTML version deleted]]