Question

Difficulties with probable batch effect in paired blocked (treatment and placebo) experiment

1

Entering edit mode

Vladimir Krasikov ▴ 90

@vladimir-krasikov-5097

Last seen 5.4 years ago

Dear Bioconductor Experts.

I have RNAseq data with somewhat difficult experimental design and need advice how to deal with it.

Briefly, there are 10 paired samples before and after treatment.Samples are derived from human beings with some disorder which were participating in proof of concept clinical study (initially there were more patients involved, but at the end of the experiment only 10 valid pairs left). Things would be easy for simple paired design experiment, but 4 of the patients were treated at 'baseline' level starting treatment at week 0 till week 12, another 6 patients were treated with placebo for the same 12 weeks and then treated with the drug for another 12 weeks.

One of the pairs in 'baseline' group is a possible outlier (sample pair 3 on the MDS and PCA plots), reducing the number of pairs to three for this subgroup (also both samples in this pair have much lower library sizes of around 13 mils as compared to other samples, which are in a range of 18-22 mills).

MDS shows no separation according to the treatment - all samples are mixed, except sample 3 pair stands far apart. And I also don't see any clear batch effect between 'baseline' and 'placebo' groups. Below are MDS (open circles = wk0, filled circles = wk12) before and after removeBatchEffect():

Q1: Is it justified to remove this sample pair from the analysis based only on the fact of the smaller library sizes?

Some R code:

Week    <- factor(exp.design$time, levels = c('wk0', 'wk12'))
Batch0  <- factor(exp.design$batch0, levels = c('baseline', 'placebo'))
Donor   <- factor(exp.design$ind)
Donor.n <- factor(exp.design$ind.n)   # nested within 'Batch0'
Exp.Design <- data.frame(Donor, Donor.n, Week, Batch0)

y <- DGEList(count = Count.Data, genes = gene.annot, samples =  Exp.Design)
keep <- rowSums(cpm(y) > 0.5) >= 2
table(keep)
y <- y[keep, , keep.lib.sizes = F]
y <- calcNormFactors(y)

design <- model.matrix( ~ Donor + Week, Exp.Design)
y <- estimateDisp(y, design, robust = T)  
fit <- glmQLFit(y, design, robust = T)
res <- glmQLFTest(fit)  # no DEGs

Biologist responsible for the project insists that all 10 pairs should be treated together as one group, however not surprisingly there were no DEGs (not in deseq2 and not in edgeR in both QLF and LRT pipelines), so I included batch effect in the model (actually a placebo effect):

design1 <- model.matrix(~ Batch0 + Batch0 : Donor.n + Batch0 : Week, Exp.Design)
all.zero <- apply(design1, 2, function(x) all(x==0))
idx <- which(all.zero)
design1 <- design1[ , -idx]
unname(design1)
y <- estimateDisp(y, design1, robust = T)
colnames(design1)
fit <- glmQLFit(y, design1, robust = T)
plotQLDisp(fit)
res <- glmQLFTest(fit, coef = "Batch0.baseline:Weekwk12")      # no DEGs
res <- glmQLFTest(fit, coef = "Batch0placebo:Weekwk12")        # no DEGs
res <- glmQLFTest(fit, coef = 11:12)                           # no DEGs
res <- glmQLFTest(fit, contrast = c(0,0,0,0,0,0,0,0,0,0,-1,1))   # no DEGs

However, the same workflow with LRT model produces some DEGs:

fit <- glmFit(y, design1)
res <- glmLRT(fit, coef = "Batch0baseline:Weekwk12")
is.de <- decideTestsDGE(res, p.value = 0.1)
summary(is.de)
       Batch0baseline:Weekwk12
Down                        89
NotSig                   17398
Up                          84

res <- glmLRT(fit, coef = "Batch0placebo:Weekwk12")
is.de <- decideTestsDGE(res, p.value = 0.1)
summary(is.de)
       Batch0placebo:Weekwk12
Down                       18
NotSig                  17526
Up                         27

res <- glmLRT(fit, coef = 11:12)
is.de <- decideTestsDGE(res, p.value = 0.1)
summary(is.de)
       Batch0baseline:Weekwk12+Batch0placebo:Weekwk12
NotSig                                          17196
Sig                                               375

res <- glmLRT(fit, contrast = c(0,0,0,0,0,0,0,0,0,0,-1,1))
is.de <- decideTestsDGE(res, p.value = 0.1)
summary(is.de)
       -1*Batch0baseline:Weekwk12 1*Batch0placebo:Weekwk12
Down                                                   219
NotSig                                               16989
Up                                                     363

Here more questions arise:

Q2: Why QLF produces no DEGs, while LRT gives some DEGs?

Q3: How to interpret results? I wanted to account for the batch effect and to answer the question, which genes are regulated in response to treatment irrespective of the 'placebo' effect. Instead, I have DEGs in the baseline group, DEGs in the placebo group, DEGs which respond to treatment in any of the two groups (by the way Q4: why the number is much higher than a sum of the two?), and a difference between 'placebo' versus 'baseline'.

deseq2 edger limma batch effect paired design • 1.7k views

ADD COMMENT • link 7.1 years ago Vladimir Krasikov ▴ 90

0

Entering edit mode

comment removed

ADD REPLY • link 7.1 years ago Vladimir Krasikov ▴ 90

score 2 · Answer 1 · 2018-03-15

Question 1: Sample 3 looks fine to me, not much different from 1 or 2 in terms of outlier-ness. And 13 million reads instead of 18-22 is not particularly odd to me, I would expect order-of-magnitude differences in the library sizes before throwing something out. I routinely analyze data with 2-5-fold differences in library size.

Question 2: The LRT is anticonservative as it does not properly account for the uncertainty of dispersion estimation. The QL F-test is more accurate and is the one I would trust when the two methods disagree.

Question 3: Well, the obvious interpretation is that there is no evidence for DEGs from the QL F-test (at the specified FDR threshold). But assuming you did have DEGs in each comparison, if you want to find genes that are upregulated regardless of Batch, then you should look at the intersection of DEGs from the placebo and baseline groups (i.e., your first and second contrasts). Alternatively, you can use your third contrast, or you can test whether the average of the placebo/baseline effects is equal to zero (see the edgeR user's guide). Though in both cases, the results only make scientific sense when the log-fold changes in placebo and baseline are in the same direction.

Question 4: There's no reason for the numbers of DEGs to be additive. Why would you expect that? You get more evidence against the null hypothesis when you use information from more samples.

On an unrelated note, your design matrix is very awkwardly formulated, I would instead do:

Week <- rep(c("wk0", "wk12"), 10)
Batch  <- rep(c("baseline", "placebo"), c(8, 12))
Donor <- gl(10, 2)
design <- model.matrix(~Donor + Week:Batch)
design <- design[,!grepl("Weekwk0", colnames(design))] # get to full rank