Question

limma design and contrast matrix for paired samples

0

Entering edit mode

joseph ▴ 50

@joseph-5658

Last seen 7.5 years ago

Hello,

I am analysing Affy microarray data performed on human samples, in which the gene expression were measured before (C) and after (T) treatment. And these samples have been classified into three catalogues. The list as:

Sample   Treatment   Dnonor    Status
Ac   C   A   1
At   T   A   1
Bc   C   B   2
Bt   T   B   2
Cc   C   C   3
Ct   T   C   3

The final goal is to find out the differences among status #1,2 and 3. And, actually, the treatment C can be considered as background, so I want to use the T minus C, then to perform the comparisons btw status 1 2 3.
Two parameters and with paired samples, I wonder how to design the matrix. Thanks a lot.

Best wishes,

Joe

limma • 2.8k views

ADD COMMENT • link updated 8.8 years ago by Aaron Lun ★ 28k • written 8.8 years ago by joseph ▴ 50

score 0 · Answer 1 · 2016-07-21

It would seem that the only way to do this would be to do:

Treatment <- factor(rep(c("C", "T"), 3))
Status <- factor(rep(1:3, each=2))
design <- model.matrix(~0 + Status + Treatment)

If you have a look at the column names of design, you'll get:

[1] "Status1"    "Status2"    "Status3"    "TreatmentT"

The first three coefficients represent the average log-expression of the control sample at each status. The last coefficient represents the log-fold change of treatment over control, averaged across all status levels. You can then test for DE between status levels with:

con <- makeContrasts(Status1 - Status2, levels=design)

... and then feed that to contrasts.fit to test for DE between status 1 and 2. (I'm not sure how much this is worth, though; it seems that you only have one donor individual for each status, so you'd be testing for DE between specific individuals rather than DE between the status levels in general.) You can also test for DE upon treatment by simply dropping the TreatmentT term in topTable.

However, you cannot test for differences in the "T minus C" value between status levels. This is because you simply don't have enough residual d.f. in your experimental design. If you allowed each status to have its own treatment effect (i.e., a status-specific "T minus C"), you'd end up with 6 parameters in your design matrix, which is the same as your number of samples.