Hi,
I have data with a wt control, single knockouts of two genes (geneA, geneB), and a double knockout (KO of both genes). From what I understand of design matrices, the following should give me the interaction between geneA and geneB:
model.matrix(~ geneA + geneB + geneA:geneB)
However, I have the same data under two different conditions, and I want to find genes which interact (i.e. change more than expected if they are assumed to be additive) upon changes in condition. More specifically, I want to see how geneA and geneB interact when the cells are in a given condition (if any). I feel knowing which genes only change/change more than expected when both genes are KO whilst in conditionB vs conditionA would give me the best chance at answering this.
model.matrix(~ genotype + condition + genotype:condition)
gives me genes which are specific to that genotype interacting with the condition. But in this model, the double knockout is counted as a genotype of its own, as opposed to geneA knockout + geneB knockout.
In simple 1's and 0's, I would like to represent the genotypes as follows, where the columns are my samples, and the rows indicate presence (1) and absence (0) of the gene:
        geneA_KO  geneB_KO  double_KO
geneA   0         1         0
geneB   1         0         0
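For illustration, one way to get this encoding in R is to store each gene as its own two-level factor instead of a single genotype factor. This is only a minimal sketch: the sample table `coldata`, the genotype labels, and the level names "present" and "KO" are assumptions made for the example, not something taken from the actual data.

```r
# Hypothetical sample table; genotype and condition labels are assumptions.
coldata <- data.frame(
  genotype  = rep(c("wt", "geneA_KO", "geneB_KO", "double_KO"), times = 2),
  condition = factor(rep(c("conditionA", "conditionB"), each = 4))
)

# One two-level factor per gene, with the intact gene ("present") as reference.
coldata$geneA <- factor(
  ifelse(coldata$genotype %in% c("geneA_KO", "double_KO"), "KO", "present"),
  levels = c("present", "KO")
)
coldata$geneB <- factor(
  ifelse(coldata$genotype %in% c("geneB_KO", "double_KO"), "KO", "present"),
  levels = c("present", "KO")
)

# The double knockout is now geneA = KO together with geneB = KO,
# rather than a separate genotype level.
print(unique(coldata[, c("genotype", "geneA", "geneB")]))

# Design matrix for the two-gene interaction model from the first formula.
mm <- model.matrix(~ geneA + geneB + geneA:geneB, data = coldata)
print(colnames(mm))
```

With this coding, the interaction column is non-zero only for the double-knockout samples, so it captures the departure from the sum of the two single-knockout effects.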
The closest I got was model.matrix(~0 + (geneA*geneB)^2 + condition + (geneA*geneB)^2:condition), but I'm not sure if this would correctly model the data: there are a lot of terms and not all of them are relevant. The idea was that (geneA*geneB)^2 would model the interactions between the two genes, condition would take care of the effect of going from conditionA to conditionB, and the last term, (geneA*geneB)^2:condition, would model how each genotype interacts with the change in condition (and I'd be mostly focussed on the one with the double knockout).
Can anyone help me model this, please?
I'm not in favor of support site posts that trigger emails to multiple package authors (unnecessarily). Here you've tagged 'limma' so that those package authors are automatically emailed, although the title of your post is DESeq2 design matrix. The approaches for the different packages are sometimes different, so then you're just asking many people to reply to your request at the same time.
Thanks for letting me know. I'll be more careful in the future not to tag more than one package!
I've removed the limma tag because this appears to be a DESeq2 question.
I will say that squaring the formula term, (geneA*geneB)^2, has no meaning for factors. It is identical to geneA*geneB. The formula you've written is just a very complicated way of writing geneA*geneB*condition.
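As a quick check of this point, base R's `terms()` can expand both formulas so the resulting term lists can be compared directly. This is a small sketch using only the formulas quoted in this thread; the variable names `f_long` and `f_short` are made up for the example.

```r
# The long formula from the question and the compact three-way version.
f_long  <- ~ 0 + (geneA*geneB)^2 + condition + (geneA*geneB)^2:condition
f_short <- ~ geneA*geneB*condition

# terms() expands a formula symbolically; no data are needed.
labels_long  <- attr(terms(f_long),  "term.labels")
labels_short <- attr(terms(f_short), "term.labels")

print(labels_long)
print(labels_short)

# TRUE if both formulas expand to the same set of terms.
print(setequal(labels_long, labels_short))

# The only remaining difference is the intercept: 0 (dropped) vs 1 (kept).
print(c(long = attr(terms(f_long), "intercept"),
        short = attr(terms(f_short), "intercept")))
```

Both expand to the three main effects, the three two-way interactions, and the geneA:geneB:condition term; the `~0` in the long version only removes the intercept.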