Hi,
I have expression data from patients that underwent treatment with a drug, before and after treatment. I also have a measurement of something in their blood, which I call "score". The samples metadata looks something like this:
patient . | treatment. | score |
---|---|---|
A | before | 0.31 |
A | after | 0.44 |
B | before | 0.39 |
B | after | 0.28 |
I expect that gene expression will be related to the "score", i.e. for some genes higher scores mean higher expression and vice versa. Nonetheless, I am interested in those cases where this relationship changes before in after treatment, i.e. before treatment higher scores implies higher expression but after treatment there is no such relationship. To my understanding I need the interaction term between treatment and score.
If score would be a categorical variable, then I guess (but not sure) that I would try the steps below, based on the limma manual:
design <- model.matrix(~0+treatment+score+treatment:score, data=sample_metadata)
corfit <- duplicateCorrelation(eset, design, block=sample_metadata$patient)
limma_fit <- lmFit(eset, design, block=sample_metadata$patient, correlation=corfit$consensus)
fit <- eBayes(limma_fit)
Does this sound reasonable? Is it the same for the continuous case as well? Finally, if I would like to look at the results from a single gene, how should I visualize them?
Thank you,
Gil
Thank you Aaron for the fast and helpful reply! One question: I thought that the "patients" is more of "random effects" than "fixed effect" (although, to be honest, I never really understood the difference between the two). This is why I though of using
duplicateCorrelation
andblock
. What do you suppose is better?I would treat
patient
as a fixed effect here - see the first point in this thread.