contrast matrix to compare 3 groups in limma
1
0
Entering edit mode
@mahesmuniandy-7955
Last seen 7 months ago
Helsinki

Hi,

My name is Mahes Muniandy and I am a doctoral student working on twin data. I have gene expression (Affymetrix HGU133 plus 2) data for monozygotic twins discordant for obesity. I clustered my gene expression data using heavy/lean ratios for each twin pair and obtained 3 groups and would like to compare all three groups of twins. This is to enable me to see if there are different sub-types to obesity. In other words, I would like to do within twin-pair differences (26 twin pairs) and see which genes are expressed differently in obesity across all 3 groups.
Can someone advise me if I am on the right track with my design matrix and limma approach below?

library(limma)
#twins
Pair <- factor(c("T1","T1","T2","T2","T3","T3","T4","T4","T5","T5","T6","T6","T7","T8","T8","T7","T9","T9","T10","T10","T11","T11","T12","T12","T13","T13","T14","T14","T15","T15","T16","T16","T17","T17","T18","T18","T19","T19","T20","T20","T21","T21","T22","T22","T23","T23","T24","T24","T25","T26", "T26", "T25"))
#obese vs lean
HeavyLean<-factor(c("O","L","L","O","L","O","O","L","O","L","L","O","L","L","O","O","O","L","O","L","L","O","L","O","O","L","O","L","L","O","O","L","O","L","L","O","L","O","L","O","O","L","O","L","O","L","O","L","L","L","O","O"))
#my 3 clusters
Cluster<-factor(c("c2","c2","c2","c2","c2","c2","c3","c3","c2","c2","c2","c2","c3","c2","c2","c3","c2","c2","c2","c2","c2","c2","c2","c2","c2","c2","c2","c2","c3","c3","c2","c2","c2","c2","c1","c1","c2","c2","c3","c3","c2","c2","c3","c3","c2","c2","c2","c2","c1","c2","c2","c1"))

LG <- paste(Cluster, HeavyLean, sep=".")
LG <- factor(LG, levels=c("c1.O","c1.L","c2.O","c2.L","c3.O","c3.L"))

design <- model.matrix(~0 + LG)
colnames(design) <- levels(LG)

dupcor <- duplicateCorrelation(limma_eset, design)
fit <- lmFit(limma_eset, design, block=Pair, correlation=dupcor$consensus)

contrast.matrix <- makeContrasts(C2vsC1inO = "c2.O-c1.O", C2vsC1inL = "c2.L-c1.L", C3vsC1inO = "c3.O-c1.O", C3vsC1inL = "c3.L-c1.L", C3vsC2inO = "c3.O-c2.O", C3vsC2inL = "c3.L-c2.L", C2vsC1 = (c2.O-c2.L)-(c1.O-c1.L), C3vsC1 = (c3.O-c3.L)-(c1.O-c1.L), C3vsC2 = (c3.O-c3.L)-(c2.O-c2.L),levels=design)

fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)

I am getting adjusted p values of 1 and wonder if it is because my contrast matrix is not full rank? Can anyone suggest a better approach to me? I tried doing simple anova with Tukey - should I just use that?

Many Thanks,

Mahes Muniandy

Department of Public Health,

University Helsinki

limma • 3.4k views
ADD COMMENT
1
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen just now
The city by the bay

You should specify block=Pair in the duplicateCorrelation call, otherwise the correlation won't be calculated correctly. In particular, the function will assume that you're calculating correlations between duplicate spots on the same array. This is not the case here, as you want the correlations between paired arrays.

Also, the comparisons between clusters will be more conservative than the comparisons within clusters, as the correlation will increase the standard error of the log-fold change for the former (whereas it decreases it for the latter). So, you might get some more DE genes if you focus on the contrasts between lean and obese in the same cluster. Contrasts like C3vsC1 should also be okay, as the intra-cluster log-fold change is being compared.

ADD COMMENT
0
Entering edit mode

Dear Aaron,

This now works well, thank you for your help.

Mahes

ADD REPLY

Login before adding your answer.

Traffic: 567 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6