Twin analysis using limma
2
0
Entering edit mode
@mahesmuniandy-7955
Last seen 7 months ago
Helsinki

Hello,

My name is Mahes Muniandy and I am a doctoral student working on twin data.  I would like to compare the gene expression (Affymetrix HGU133 plus 2) between lean and obese people and see how sex (Male/Female) affects the gene expression. So, I would like to do within twin-pair differences (26 twin pairs) and see which genes are expressed differently in obesity in males and females. Can someone advise me if I am on the right track with my design matrix and limma approach below?

Here is what I have.

LG <- paste(targets$Gender, targets$LeanStatus, sep=".")
LG <- factor(LG, levels=c("F.O","F.L","M.O","M.L"))
Pairs <- factor(c("T1","T1","T2","T2","T3","T3","T4","T4","T5","T5","T6","T6","T7","T8","T8","T7","T9","T9","T10","T10","T11","T11","T12","T12","T13","T13","T14","T14","T15","T15","T16","T16","T17","T17","T18","T18","T19","T19","T20","T20","T21","T21","T22","T22","T23","T23","T24","T24","T25","T26", "T26", "T25"))
design <- model.matrix(~Pairs+LG)
colnames(design) <- levels(LG)
fit <- lmFit(eset, design)

cont.matrix <- makeContrasts(OvsLinF=F.O-F.L, OvsLinM=M.O-M.L, Diff=(M.O-M.L)-(F.O-F.L), levels=design)
fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2)

 

Is this the best design for this kind of comparison?

Many Thanks,

Mahes Muniandy,

Obesity Research Unit,

University of Helsinki

limma limma paired analysis • 1.7k views
ADD COMMENT
1
Entering edit mode
@james-w-macdonald-5106
Last seen 14 hours ago
United States

It's hard to say without looking at the data. You might want to start out with some PCA or MDS plots to see how similar the twins are to each other. Blocking on twins like that is in essence saying that you think that there is a twin-specific mean difference in expression that you want to adjust for. That may well be true for some of the genes, but a more plausible scenario is that the twins are simply more correlated in their gene expression than the unrelated subjects, in which case you would instead use duplicateCorrelation() to estimate the intra-twin correlation and then fit a mixed model.

Of course that has its own set of assumptions, namely that the intra-twin correlation is relatively consistent between the twins, as you will be fitting a block diagonal correlation matrix. I don't think either way is 'wrong', there are just different assumptions being made, and you would do well to try to see which set of assumptions are more plausible.
 

ADD COMMENT
0
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 3 hours ago
The city by the bay

Blocking factors for each twin pair may not be appropriate, depending how the levels of LG are arranged across your 50 samples. For example, if you had a situation where the twins in each pair were always both obese or both lean, then you wouldn't be able to do your first two contrasts. Conversely, if both twins are of the same sex and differ only in their obesity status, then I suspect the design matrix will not be of full rank. Also, this command:

colnames(design) <- levels(LG)

... makes no sense, because you also have additional coefficients from the twin-pair blocking factors.

An alternative approach is to treat the twin pair as a random effect. This is probably reasonable, given that you're testing twins sampled from some larger population of twins. You could then do something like:

design <- model.matrix(~0 + LG)
colnames(design) <- levels(LG)
dupcor <- duplicateCorrelation(eset, design)
fit <- lmFit(eset, design, block=Pairs, correlation=dupcor$consensus)

... and so on, with contrasts.fit and eBayes. This set-up ensures that you can compare between levels of LG, regardless of how they're arranged with respect to Pairs.

ADD COMMENT

Login before adding your answer.

Traffic: 821 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6