Question

Limma with paired samples and SVA

0

Entering edit mode

wliao • 0

@wliao-10768

Last seen 7.5 years ago

I have an experimental design as follows:

Patient	Disease	Treatment
1	Unaffected	Treated
1	Unaffected	Untreated
2	Unaffected	Treated
2	Unaffected	Untreated
3	Unaffected	Treated
3	Unaffected	Untreated
4	Affected	Treated
4	Affected	Untreated
5	Affected	Treated
5	Affected	Untreated
6	Affected	Treated
6	Affected	Untreated

We'd like to test:

1) Untreated, Affected vs. Unaffected

2) Unaffected-only, Treated vs. Untreated

3) Affected-only, Treated vs. Untreated

We are trying to use SVA to also control for cell composition and the include the SV's in the model.

mod <- model.matrix(~0+Treatment:Disease, data=pd)
mod0 <- model.matrix(~1, data=pd)
svobj <- sva(M, mod, mod0)

design <- model.matrix(data=pd, ~0+Treatment:Disease+svobj$sv)

colnames(design) <- c(paste0('SV',seq(ncol(svobj$sv))),'Untreat_Unaff','Treat_Unaff','Untreat_Aff','Treat_Aff')
contrast.matrix <- makeContrasts('Treat_Aff-Untreat_Aff',
                                 'Treat_Unaff-Untreat_Unaff',
                                 'Untreat_Aff-Untreat_Unaff',
                                 levels=design)

Then we'd like to account for pairing of patient samples without and with treatment using duplicateCorrelation().

dupCor <- duplicateCorrelation(M, design, block=factor(pd$Patient))

But we get this error:

Warning message in atanh(pmax(-1, rho)):
"NaNs produced"

It works when I don't add the SV's to the model so just confused as to what's going on.

design <- model.matrix(data=pd, ~0+Treatment:Disease)

Is there something I'm not understanding about the design matrix, problem there causing this problem? Not at all confident we've set this up correctly.

limma sva duplicatecorrelation estimatecellcounts • 2.0k views

ADD COMMENT • link updated 7.5 years ago by Aaron Lun ★ 28k • written 7.5 years ago by wliao • 0

score 1 · Accepted Answer · 2017-10-24

I daresay that at least some of the surrogate variables represent the patient effect, given that you didn't tell sva about the patient factor. This may be responsible for funny behaviour in duplicateCorrelation, as the component of variance explained by the blocking factor is not well-defined when the blocking factor is effectively in the design matrix already.

Probably the simplest solution is to tell sva about the patient effect so that it doesn't try to detect it as an unknown factor of variation. This can be done by respecifying mod:

pd$Patient <- factor(pd$Patient)
mod <- model.matrix(~0+Patient+Treatment:Disease, data=pd)
mod <- mod[,!grepl("Untreated", colnames(mod))] # get to full rank

... to ensure that the patient effect is present. Hopefully this will encourage sva to look for other surrogate variables that are not specified in the design matrix. Note that the treatment effect is implicitly modeled - there is no need for an explicit Treatment column, as that column is directly made up of the corresponding patient effects.

Of course, there is no guarantee that the SVs will correspond to cell composition. If you have a priori information about differences in cell composition between samples, you should consider using that information directly rather than hoping that sva will discover it for you.