Merge Replicates vs Mixed Model
0
0
Entering edit mode
abadgerw • 0
@5088ef59
Last seen 18 hours ago
United Kingdom

I am performing an analysis to look at the impact of 3 types of stimulation on 6 stem cell lines (each derived from 1 individual) and I ran the following code:

exprdata<-read.csv("Replicates Merged.csv", row.names=1)
metadata<-read.csv("Metadata.csv",row.names=1)
metadata$ID<-as.factor(metadata$ID)
precursors<-read.csv("Counts.csv",row.names=1)

form <- ~ 0 + Stim + (1 | ID)

compare <- makeContrastsDream(form, metadata, contrasts = c(compare3_1 = "StimFib - StimCon", compare3_2 = "StimFib - StimPDL", compare2_1 = "StimPDL - StimCon"))

fit <- dream(exprdata, form, metadata, compare, ddf = "Kenward-Roger")
fit2 <- variancePartition::eBayes(fit, robust=T, trend=log(precursors$Counts))

FibxCon<-variancePartition::topTable(fit, coef = "compare3_1", number=3469)
FibxPDL<-variancePartition::topTable(fit, coef = "compare3_2", number=3469)
PDLxCon<-variancePartition::topTable(fit, coef = "compare2_1", number=3469)

Of note, each cell line was plated in triplicate before stimulation. I see the following when running a PCA plot: enter image description here

The most variation is attributed to the different cell lines. However, some of the technical replicates do not cluster well.

I thought it was best to merge the triplicates via median before running the model above. However, I started to wonder whether it be better to keep the triplicates separate? If so, would I have to change the structure of my random effect (ie nested)?

Any insights would be much appreciated!

limma DifferentialExpression variancePartition dream • 305 views
ADD COMMENT
0
Entering edit mode

gabriel.hoffman any insight?

ADD REPLY
0
Entering edit mode

It is almost always better to retain the full observed data, if the statistical software is able to model the repeated measures. This is a good application for a mixed model.

With count data, it is not clear how to collapse technical replicates, while allowing the statistical model to model count-error, varying library size and overdispersion.

1) In cases where there are multiple sources of expression variation, the variation partitioning plot can give insight not found in the PCA plot.

2) its not clear if you are using a voom-step here. dream() is designed to take the result of voom() or voomWithDreamWeights()

3) Fitting eBayes() with a trend is not recommended. Instead, incorporating precision weights from a voom-step into dream() is the best way to handle counts.

4) It looks like you have biological expression variation due to Donor and Stimulus

ADD REPLY

Login before adding your answer.

Traffic: 820 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6