Hi everyone,
I have a question about how to correctly set up the design for DESeq2 when there is high heterogeneity between the subjects within a group.
We are looking at data from two groups of mice, "WT" and "KO" for a certain gene, with 3 individual mice per group, and we are interested in expression differences between the two groups. We have not only RNA-seq data but also ribosome profiling data, so the type of sequencing is an additional factor in the design next to the genotype. We use the deltaTE framework (doi.org/10.1002/cpmb.108), which is designed for exactly this type of data and builds on DESeq2 functionality.
A typical design formula would look like the following:
design = ~ Genotype + SeqType + Genotype:SeqType
However, in our case the individual mice within the two groups show high heterogeneity and don't cluster well together, for example in a PCA plot. So I wanted to ask whether it is possible to somehow correct for this individual mouse factor in the design. I know that batch effects are common and can be accounted for with an additional Batch factor. However, since this is not really a batch effect in the sense of, for example, sampling timepoint, I'm not sure how to properly account for the individual heterogeneity, or whether it is even possible to distinguish the effect of genotype from the contribution of the individual mouse.
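To make the question concrete, here is a minimal base R sketch of the sample layout and of what happens when a per-mouse term is simply added to the design. The column name `Mouse` and the labels `m1` to `m6` are hypothetical, and the sketch assumes that every mouse was profiled with both RNA-seq and ribosome profiling. It only builds the model matrices and does not run DESeq2 itself.

```r
# Hypothetical sample layout: 3 WT and 3 KO mice, each assumed to be
# profiled by both RNA-seq and ribosome profiling (12 libraries in total).
coldata <- data.frame(
  Mouse    = factor(rep(paste0("m", 1:6), times = 2)),
  Genotype = factor(rep(rep(c("WT", "KO"), each = 3), times = 2),
                    levels = c("WT", "KO")),
  SeqType  = factor(rep(c("RNA", "Ribo"), each = 6),
                    levels = c("RNA", "Ribo"))
)

# Design from above: the interaction term compares the Ribo-vs-RNA
# difference between KO and WT.
mm_base <- model.matrix(~ Genotype + SeqType + Genotype:SeqType, coldata)

# Same design with a per-mouse term added.
mm_mouse <- model.matrix(~ Genotype + Mouse + SeqType + Genotype:SeqType,
                         coldata)

# Compare the number of coefficients with the rank of each matrix.
rank_base  <- qr(mm_base)$rank
rank_mouse <- qr(mm_mouse)$rank
cat("base design:", ncol(mm_base), "columns, rank", rank_base, "\n")
cat("with Mouse: ", ncol(mm_mouse), "columns, rank", rank_mouse, "\n")
```

If I read this correctly, the matrix with `Mouse` has more columns than its rank, because each mouse belongs to exactly one genotype, so the genotype column is a sum of mouse columns. As far as I understand, DESeq2 refuses such a design with a "not full rank" error, which is part of why I am unsure how to proceed.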
I know it would be better to have more than 3 mice per group, but this was unfortunately not possible, and now I'm wondering whether it's still possible to get something useful out of the data.
Thank you very much!
Thank you, I'll have a look!