I want to find genes that are differentially expressed between knockout and wild-type mice as they age (3 months, 6 months, 12 months and 24 months). I am wondering what the best way to analyze such data is. I combined Age and condition (genotype) and ran DESeq2 as follows:
> coldata$Age_condition <- factor(paste0(coldata$Age, "-", coldata$condition))
> dds <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~ Age_condition)
> dds
> dds <- DESeq(dds)
I am stuck at this point: I am having trouble extracting differential expression results for genes that differ significantly between 3-6, 6-12, 12-24, 3-24, 3-12 and 6-24 months upon knockout of the gene. What might be the best way to go about this? Thank you.
Thanks Mike for the quick response. I am interested in both the pairwise comparisons between time points and the genes that differ significantly between knockout and wild type at each time point.
Take a look at this example in the workflow, where we show how to build results tables for both kinds of tests. I'm not a fan of doing all pairwise comparisons, but if you are going to perform so many tests (any time point, and additionally these time pairs), you need to perform multiple testing correction across all the comparisons in addition to all the genes. There are some support site posts where I provide code for doing this additional multiple testing correction. Note that you may lose power from performing these additional tests. (And you should not snoop to find which time points show differences, and then present only those pairwise comparisons.)
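For what it's worth, one common way to correct across comparisons as well as genes is to pool the raw p-values from every results table and adjust them in a single call. Below is a minimal base R sketch of that idea, not the code from the support site posts mentioned above. The p-values are simulated and the comparison names are made up for illustration; with real data each column would presumably come from the `pvalue` column of one `results()` table.

```r
set.seed(1)

# Hypothetical comparison names and simulated p-values (assumptions for illustration)
n_genes <- 1000
comparisons <- c("KO_vs_WT_3M", "KO_vs_WT_6M", "KO_vs_WT_1YR", "KO_vs_WT_2YR")
pvals <- matrix(runif(n_genes * length(comparisons)),
                nrow = n_genes,
                dimnames = list(paste0("gene", seq_len(n_genes)), comparisons))
pvals[1:20, "KO_vs_WT_2YR"] <- 1e-8   # spike in a few strong signals
pvals[5, "KO_vs_WT_3M"] <- NA         # NA p-values are allowed; they stay NA

# Pooled correction: one Benjamini-Hochberg adjustment over genes x comparisons.
# p.adjust() returns a plain vector, so the matrix shape is restored afterwards.
padj_pooled <- matrix(p.adjust(as.vector(pvals), method = "BH"),
                      nrow = nrow(pvals), dimnames = dimnames(pvals))

# For contrast: adjusting each comparison on its own ignores the extra tests
padj_separate <- apply(pvals, 2, p.adjust, method = "BH")

# Number of genes called at FDR 0.1 per comparison under each scheme
n_pooled <- colSums(padj_pooled < 0.1, na.rm = TRUE)
n_separate <- colSums(padj_separate < 0.1, na.rm = TRUE)
print(rbind(pooled = n_pooled, separate = n_separate))
```

The pooled adjustment treats every gene-by-comparison pair as one test, which is where the possible loss of power comes from.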
Hey Mike,
After running the first chunk of code in the Time course experiments chapter, here are results I received:
[1] "Intercept" "Age_2YR_vs_1YR" "Age_3M_vs_1YR" "Age_6M_vs_1YR"
[5] "condition_WT_vs_KO" "Age2YR.conditionWT" "Age3M.conditionWT" "Age6M.conditionWT"
>
How do I extract meaningful results from this, given that I want to find the genes that are differentially expressed upon KO of my gene at each time point? What do these coefficient names mean? I'm off to the second chunk of code...
Read over this section first, which explains interaction models:
https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#interactions
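To make the linked section concrete for this design, here is a hedged sketch on simulated counts. The sample layout (4 ages x 2 genotypes x 2 replicates) is an assumption, and `makeExampleDESeqDataSet()` stands in for the real count matrix. The factor levels are left in R's default alphabetical order, which would explain why 1YR and KO show up as the reference levels in the coefficient names above.

```r
library(DESeq2)

set.seed(1)
# Simulated data; the 16-sample layout below is assumed for illustration
dds <- makeExampleDESeqDataSet(n = 1000, m = 16)
dds$Age <- factor(rep(c("3M", "6M", "1YR", "2YR"), each = 4))  # levels: 1YR, 2YR, 3M, 6M
dds$condition <- factor(rep(c("KO", "WT"), times = 8))          # levels: KO, WT

# Interaction design: the genotype effect is allowed to differ by age
design(dds) <- ~ Age + condition + Age:condition
dds <- DESeq(dds, quiet = TRUE)
print(resultsNames(dds))

# WT vs KO at the reference age (1YR): the condition main effect on its own
res_1YR <- results(dds, name = "condition_WT_vs_KO")

# WT vs KO at 3M: main effect plus the 3M interaction term
res_3M <- results(dds, contrast = list(c("condition_WT_vs_KO", "Age3M.conditionWT")))

# Is the WT vs KO effect at 3M different from the effect at 1YR?
res_int_3M <- results(dds, name = "Age3M.conditionWT")

# LRT: genes whose genotype effect changes at any age (interaction terms dropped)
dds_lrt <- DESeq(dds, test = "LRT", reduced = ~ Age + condition, quiet = TRUE)
res_lrt <- results(dds_lrt)

# Alternative: a single combined factor, with contrasts given as
# c(factor name, numerator level, denominator level)
dds_grp <- dds
dds_grp$group <- factor(paste(dds_grp$Age, dds_grp$condition, sep = "_"))
design(dds_grp) <- ~ group
dds_grp <- DESeq(dds_grp, quiet = TRUE)
res_grp_3M <- results(dds_grp, contrast = c("group", "3M_WT", "3M_KO"))
```

With the combined factor, the level names use `_` rather than `-` as the separator, since DESeq2 recommends sticking to letters, numbers, `_` and `.` in factor levels. If KO vs WT (rather than WT vs KO) or a 3M baseline is preferred, `relevel()` on the factors before running `DESeq()` should change the reference levels and hence the coefficient names.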
Hey Mike,
I have run the LRT with the following three reduced models, leaving out time and even genotype:
> dds <- DESeq(dds, test = "LRT", reduced = ~ Age + condition)
> dds <- DESeq(dds, test = "LRT", reduced = ~ Age)
> dds <- DESeq(dds, test = "LRT", reduced = ~ condition)
I believe these identify differentially expressed genes with and without accounting for time and genotype, respectively. How does this help answer my question about the effects of the gene knockout over time?
I have also run tests using the following code (in brief), where I combine the Age and genotype (condition) variables:
> coldata$Age_condition <- factor(paste0(coldata$Age, "-", coldata$condition))
> dds <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~ Age_condition)
> dds <- DESeq(dds)
I intend to extract results using the contrast argument of results(), as follows:
> res <- results(dds, contrast = list("Age_condition3M.KO", "Age_condition3M.WT"))
> res <- results(dds, contrast = list("Age_condition2YR.KO", "Age_condition3M.WT"))
Am I on the right track to compare genotype effects at each time point? What additional steps do you think I should take for this analysis? Thanks a lot, Mike.
Also, even after reading that section, my initial question about these results still stands:
[1] "Intercept" "Age_2YR_vs_1YR" "Age_3M_vs_1YR" "Age_6M_vs_1YR"
[5] "condition_WT_vs_KO" "Age2YR.conditionWT" "Age3M.conditionWT" "Age6M.conditionWT"
If, after reading over the documentation for time series analysis and the interaction material, these coefficients still do not make sense, I'd recommend that you collaborate with a local statistician who can help you interpret the results of the analysis.
The type and names of coefficients here are not specific to DESeq2, but would be the same coefficients you would generate for any kind of regression or linear model of data with an interaction design.
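To illustrate that last point outside DESeq2, here is a small base R sketch with an ordinary linear model on made-up data (the data frame `toy` and response `y` are hypothetical). The same interaction design gives the same kind of coefficients, and in this saturated model the WT vs KO difference at 3M is the sum of the genotype main effect and the 3M interaction term.

```r
set.seed(1)

# Toy data: 3 replicates for each Age x condition cell (values are simulated)
toy <- expand.grid(rep = 1:3,
                   condition = c("KO", "WT"),
                   Age = c("3M", "6M", "1YR", "2YR"),
                   stringsAsFactors = FALSE)
toy$Age <- factor(toy$Age)              # default levels sort to 1YR, 2YR, 3M, 6M
toy$condition <- factor(toy$condition)  # default levels: KO, WT
toy$y <- rnorm(nrow(toy), mean = 5)

fit <- lm(y ~ Age + condition + Age:condition, data = toy)
b <- coef(fit)
print(names(b))
# "(Intercept)" "Age2YR" "Age3M" "Age6M" "conditionWT"
# "Age2YR:conditionWT" "Age3M:conditionWT" "Age6M:conditionWT"

# Observed mean of y in each Age x condition cell
cell_means <- tapply(toy$y, list(toy$Age, toy$condition), mean)

# conditionWT alone is the WT - KO difference at the reference age (1YR)
diff_1YR <- cell_means["1YR", "WT"] - cell_means["1YR", "KO"]
ok_1YR <- isTRUE(all.equal(unname(b["conditionWT"]), diff_1YR))

# At 3M the difference is the main effect plus the 3M interaction term
diff_3M <- cell_means["3M", "WT"] - cell_means["3M", "KO"]
ok_3M <- isTRUE(all.equal(unname(b["conditionWT"] + b["Age3M:conditionWT"]), diff_3M))

print(c(ok_1YR = ok_1YR, ok_3M = ok_3M))
```

The interaction terms on their own therefore measure how much the genotype effect at a given age departs from the genotype effect at the reference age, not the genotype effect itself.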