Hi there,
I searched online but could not find an answer that addresses my question, so I am posting it here.
Thank you so much in advance!!
I recently received a dataset that has already been sequenced. The design was as follows: each batch of cells was transfected with one control vector and a set of gene overexpression vectors. In batch one, I have one control vector and six overexpression vectors, each with three replicates. In batch two, I have the same control vector but six different overexpression vectors. The purpose is to find which genes are differentially expressed (DE) when comparing each overexpression sample with the control. I know it is not a good design, but unfortunately the experiment has already been done. By the way, the batch effect is very obvious on a PCA plot. I am currently using edgeR for the analysis. The previous analysis was done in DESeq2 by a colleague.
Question 1: Suppose I use the model design <- model.matrix(~batch+plasmid). Since only the control vector is repeated across batches, does this model make sense? In other words, should I combine all batches and use ~batch+plasmid, or analyze each batch separately and call DE genes using ~plasmid? I am not sure which is statistically preferable.
Question 2: If I repeat the experiment with six randomly picked vectors, in two batches, will it help? If so, does the benefit come simply from having more replicates?
Question 3: If I redo the experiment, would you recommend putting each replicate in a separate batch, trying to fit a Balanced Incomplete Block (BIB) design or something similar? I cannot include one replicate of every TF in a single batch (limited material).
A simple case would be like this:
plasmid <- factor(c(rep("control",3),rep("tf1",3),rep("control",3),rep("tf2",3)))
batch <- factor(c(rep("1",6),rep("2",6)))
design <- model.matrix(~batch+plasmid)
design
   (Intercept) batch2 plasmidtf1 plasmidtf2
1            1      0          0          0
2            1      0          0          0
3            1      0          0          0
4            1      0          1          0
5            1      0          1          0
6            1      0          1          0
7            1      1          0          0
8            1      1          0          0
9            1      1          0          0
10           1      1          0          1
11           1      1          0          1
12           1      1          0          1
attr(,"assign")
[1] 0 1 2 2
attr(,"contrasts")
attr(,"contrasts")$batch
[1] "contr.treatment"
attr(,"contrasts")$plasmid
[1] "contr.treatment"
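To make Question 1 concrete, here is a minimal base R sketch on the toy design above. It checks that the design has full column rank, then compares the tf1 coefficient from the combined fit (~batch+plasmid) with the one from a batch-1-only fit (~plasmid). The response y is simulated (made-up numbers for a single hypothetical gene), and lm() is used only as a Gaussian stand-in for the negative binomial GLM that edgeR fits, so treat this as an analogy rather than the edgeR result.

```r
# Toy design from the post: control + tf1 in batch 1, control + tf2 in batch 2
plasmid <- factor(c(rep("control", 3), rep("tf1", 3), rep("control", 3), rep("tf2", 3)))
batch <- factor(c(rep("1", 6), rep("2", 6)))
design <- model.matrix(~batch + plasmid)

# Full column rank means every coefficient in the design is estimable
design_rank <- qr(design)$rank

# Number of distinct batch/plasmid groups actually present in the data
n_groups <- nlevels(interaction(batch, plasmid, drop = TRUE))

# ASSUMPTION: simulated log-expression values for one hypothetical gene
set.seed(1)
d <- data.frame(
  y = rnorm(12, mean = 5) + 2 * (batch == "2") + 1 * (plasmid == "tf1"),
  batch = batch,
  plasmid = plasmid
)

# Combined analysis: all samples, batch as a blocking factor
fit_combined <- lm(y ~ batch + plasmid, data = d)

# Separate analysis: batch 1 only (unused factor levels dropped)
d1 <- droplevels(d[d$batch == "1", ])
fit_batch1 <- lm(y ~ plasmid, data = d1)

# Same point estimate for tf1 vs control, different residual degrees of freedom
est_combined <- unname(coef(fit_combined)["plasmidtf1"])
est_batch1 <- unname(coef(fit_batch1)["plasmidtf1"])
print(c(combined = est_combined, batch1_only = est_batch1))
print(c(df_combined = df.residual(fit_combined), df_batch1 = df.residual(fit_batch1)))
```

In this toy case the design has four coefficients for four groups, so the combined model is saturated: the tf1 estimate is identical in both fits, and the combined fit only adds residual degrees of freedom (8 versus 4) for estimating the variance. In edgeR, the same design matrix would be passed to estimateDisp() and glmQLFit(), and a single coefficient tested with glmQLFTest(fit, coef = "plasmidtf1"); whether pooling batches for dispersion estimation is appropriate for the real data is part of what the question is asking.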
I updated the tag. Thank you Aaron!
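For Question 3, one possible layout can be sketched in base R to check whether ~batch+plasmid would still be estimable when each replicate sits in a different batch. Everything here is hypothetical: four TFs (tf1 to tf4), six batches, and each batch holding the control plus one pair of TFs, so that every pair of TFs shares exactly one batch (a balanced incomplete block layout for the TFs, augmented with the control). The real experiment has more vectors and its own capacity limits, so this is only an illustration of how to check a candidate layout.

```r
# ASSUMPTION: hypothetical layout with 4 TFs and 6 batches (not the real experiment)
tfs <- paste0("tf", 1:4)
tf_pairs <- combn(tfs, 2)  # 2 x 6 matrix: every pair of TFs appears once

# Each batch contains the control plus one pair of TFs, one sample each
layout <- do.call(rbind, lapply(seq_len(ncol(tf_pairs)), function(b) {
  data.frame(batch = b, plasmid = c("control", tf_pairs[, b]))
}))
layout$batch <- factor(layout$batch)
layout$plasmid <- factor(layout$plasmid, levels = c("control", tfs))

# Replicates per plasmid: control in every batch, each TF in three batches
reps <- table(layout$plasmid)
print(reps)

# Additive design: batch as a blocking factor, plasmid as the effect of interest
design_bib <- model.matrix(~batch + plasmid, data = layout)

# Full column rank means all TF-vs-control effects are estimable
rank_bib <- qr(design_bib)$rank
resid_df <- nrow(design_bib) - ncol(design_bib)
print(c(rank = rank_bib, columns = ncol(design_bib), residual_df = resid_df))
```

With this layout the design has 18 samples and 10 coefficients, so the additive model is no longer saturated and the batch effect is estimated from several plasmids rather than from the control alone. Running the same rank check on any candidate layout before going to the bench is cheap.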