Question

Paired factorial design model in limma

0

Entering edit mode

chris86 ▴ 420

@chris86-8408

Last seen 5.3 years ago

UCL, United Kingdom

Hi

I have three pathotypes any single patient may belong to at baseline, then I have two timepoints for each patient, baseline and six-month. I am interested in what changes within a pathotype from baseline to six-months, i.e. a paired comparison. Then, I am also interested in what is different between pathotypes over time, i.e. I have a factorial design. My best guess was to use a random effects model using duplicatecorrelation in limma together with the factorial design code from the limma manual. Can you please tell me if I am doing the right thing, if not what should I do? Thanks. Chris.

My design matrix:

> head(design)
Baseline.Fibroid Baseline.Lymphoid Baseline.Myeloid Six.month.Fibroid Six.month.Lymphoid Six.month.Myeloid
1                0                 0                0                 0                  0                 1
2                0                 0                0                 1                  0                 0
3                0                 0                0                 0                  1                 0
4                0                 0                0                 0                  1                 0
5                0                 0                0                 1                  0                 0
6                0                 0                1                 0                  0                 0

My code:

Patient <- factor(justclinical3$hosp_num)
Treat <- justclinical3$Timepoint
Treat <- gsub('-','.',Treat)
Pathotype <- justclinical3$Pathotype
Treat2 <- factor(paste(Treat,Pathotype,sep=".")) # try pasting together

design <- model.matrix(~0+Treat2)

colnames(design) <- levels(Treat2)

matrix <- data.matrix(transcriptomic2)
dge <- DGEList(counts=matrix)
dge <- calcNormFactors(dge, method='TMM')
v <- voom(dge, design, plot=TRUE, normalize="quantile") # can apply quantile normalise if noisy data
corfit <- duplicateCorrelation(v,design,block=Patient)
corfit$consensus
fit <- lmFit(v,design,block=Patient,correlation=corfit$consensus)
cont.matrix <- makeContrasts(lymphoid=Baseline.Lymphoid-Six.month.Lymphoid,
                             fibroid=Baseline.Fibroid-Six.month.Fibroid,
                             myeloid=Baseline.Myeloid-Six.month.Myeloid,
                             DiffLF=(Baseline.Lymphoid-Six.month.Lymphoid)-(Baseline.Fibroid-Six.month.Fibroid),
                             DiffFM=(Baseline.Fibroid-Six.month.Fibroid)-(Baseline.Myeloid-Six.month.Myeloid),
                             DiffLM=(Baseline.Lymphoid-Six.month.Lymphoid)-(Baseline.Myeloid-Six.month.Myeloid),
                             levels=design)

fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2)

# single pathotype changes
top2 <- topTable(fit2,coef='lymphoid',number=Inf,sort.by="P")
top2 <- topTable(fit2,coef='fibroid',number=Inf,sort.by="P")
top2 <- topTable(fit2,coef='myeloid',number=Inf,sort.by="P")

# between them
top2 <- topTable(fit2,coef='DiffLF',number=Inf,sort.by="P")
top2 <- topTable(fit2,coef='DiffFM',number=Inf,sort.by="P")
top2 <- topTable(fit2,coef='DiffLM',number=Inf,sort.by="P")

limma • 1.1k views

ADD COMMENT • link updated 7.3 years ago by Aaron Lun ★ 28k • written 7.3 years ago by chris86 ▴ 420

score 4 · Accepted Answer · 2017-12-14

4

Entering edit mode

Aaron Lun ★ 28k

@alun

Last seen 3 hours ago

The city by the bay

Looks fine to me, though:

Make sure you have filtered out low-abundance genes.
You'll get better results by iterating voom and duplicateCorrelation, see A: using duplicateCorrelation with limma+voom for RNA-seq data.
Quantile normalization may be the best option when your data are noisy, but there is a price to pay. If you have unbalanced DE (most genes changing in one direction), quantile normalization will not be appropriate as there is no reason that the distributions of different libraries should be the same.