Hi,
I have an experimental design of treated versus control samples across 4 time points (A, B, C, and D) and I would like to use the LRT test to find compound-specific DE genes across these time points. However, I do not have a time point 0, meaning all 4 time points, consist of treated and control samples.
How can I use the LRT test to find DEGs for all time points including timepoint A?
Thank you, Denise
Hi Michael,
Thank you for getting back to me!
I am unsure if my question is very clear as I am new to this. I would like to use an LRT of compound + time + compound:time versus compound + time to capture compound-specific genes across time. I get the following coefficients
Intercept, compound_vs_CONTROL, timepoint_B_vs_A, timepoint_C_vs_A, timepoint_D_vs_A, compound.timepointB, compound.timepointC, and compound.timepointD
So there is no result for compound:timepointA! How can I get a coefficient/genes for this combination as well?
Thank you.
I understood your question above (at least I think so) and my response above remains relevant.
I recommend to walk through the question and my response above with a local statistician or someone familiar with linear models in R.
Hi,
Unfortunately, I don't know anyone who could help me with this particular question.
Would calling the results() function with a contrast of contrast=list(c(compound_vs_CONTROL, compound.timepointB, compound.timepointC, compound.timepointD)) give me the compound-specific genes across all time points?
Thank you.
An LRT has nothing to do with any contrast, but instead is comparing the model fit with and without certain coefficients. So if you do
Then you are comparing the likelihood for the full model (~ compound + time + compound:time) versus the reduced model (~ compound + time), and if the full model fits the data better, you will get a significant p-value for the likelihood ratio test. In effect what that means is that the time-resolved gene expression is different at one or more time points for at least one of the compounds. This is an omnibus test (similar to an F-test) that tells you things are different without saying what the differences are.
You could come back with Wald tests for each timepoint to nail that down, but that is pretty boring and complex all at the same time. You would probably be better off taking the set of significant genes, using K-means clustering to resolve into multiple patterns and then plot the data for each cluster. You will have to play with K to get reasonable clusters, but that allows you to say things about each cluster, and how the treatments affect the time-dependent resolution of gene expression within each cluster.
Anyway, to show further what I mean about LRT not being contrast-dependent, consider this:
The results with and without a contrast are identical!
Specifying the contrast shouldn't affect the results for an LRT test because it's not a Wald test, so a contrast doesn't make much sense in that context. Which is why I suspect Mike asked you to find somebody local to help you. An alternative, if you need to do the analysis yourself is to educate yourself about linear regression. In the end, you will have to explain what you have done and would presumably want your explanation to make sense, particularly if you are explaining to someone with a statistical background. There's only two ways that can happen - either you know what you are talking about, or you have somebody who does know what she's talking about to provide the explanation. If you can't do the latter, you should probably do the former.