Dear All,
I have a question about taking into account technical replication in my model. I will greatly appreciate any help from your side!
About my data set:
-
These are proteomics data (originating from TMT-labeling experiment), so each row corresponds to a particular protein and columns contain log-transformed protein intensities for each replicate.
-
There are in total 8 biological replicates, 4 treated and 4 control.
-
4 samples (2 treated and 2 control) were processed and measured in one experiment and 4 other samples in another experiment, thus creating a batch effect that has to be taken into account.
-
Furthermore, each sample was measured twice on two different mass spectrometers using two different acquisition methods. First method (MS2) is more sensitive resulting in a greater number of proteins quantified as compared to another method (MS3) which can quantify less proteins but with higher accuracy. It means, in each biological replicate > 70% proteins will have intensities originating from MS2 and MS3 acquisition methods, and < 30% proteins being quantified in either of those two methods.
I would like to incorporate the technical replication in my model taking into account the batch effect but I am not sure how to properly set up the design matrix for this.
Say, I have following levels in the data:
Acquisition | MS2 | MS3 |
Experiment | 1 | 2 | 1 | 2 |
Treatment | Tr | Con | Tr | Con | Tr | Con | Tr | Con |
Replicate | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
tr <- as.factor(rep(c(2,2,1,1), 4)) # treatment, 2 = Treated, 1 = Control
ex <- as.factor(c(rep(1,4), rep(2,4), rep(1,4), rep(2,4))) # experiment
ms <- as.factor(c(rep(1,8), rep(2,8))) # acquisition method, 1 = MS2, 2 = MS3
Is it correct to define block variable and design matrix as follows?
block <- c(1:8, 1:8)
design <- model.matrix(~ ex + ms + tr)
dupcor = duplicateCorrelation(dat, design = design, block = block)
fit <- lmFit(dat, design, block = block, correlation = dupcor$consensus)
I am not sure if I have to put both ex and ms into the model and if I have to use the design matrix in duplicateCorrelation function. Could you please correct me if I am wrong?
Thank you very much for your help!
Best wishes,
Ivan
Thank you very much for your reply!
I think one can expect a decent amount of correlation between the two acquisition methods, so I think it should be possible to incorporate this into a model. However, thank you for the suggestions, I will see if I can use them for other data sets :)
Best wishes,
Ivan