Hello guys,
I am doing some RNA-seq analysis using DESeq2 and following the vignette from here, where they use time points 0,15,30,60 ... 180 min. and two groups (strains)
http://www.bioconductor.org/help/workflows/rnaseqGene/
In the time series chapter it says:
"Genes with small p values from this test are those which, at one or more time points after time 0 showed a strain-specific effect."
As an example the following is shown:
## log2 fold change (MLE): strainmut.minute180
## LRT p-value: '~ strain + minute + strain:minute' vs '~ strain + minute'
## DataFrame with 4 rows and 7 columns
## baseMean log2FoldChange lfcSE stat pvalue padj symbol
## <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <character>
## SPBC2F12.09c 174.6712 -2.65763737 0.7498270 99.23199 7.671942e-20 5.184698e-16 atf21
Question 1)
As I am understanding it- this gene has a strain specific expression profile over time. This effect occures over the other time points (from 0 to 180).
If I had picked the strainmut.minute60 then we would have seen a list of DE genes from 0 to 60 minutes, right?
Just for clarification- here are the options of which one can choose:
resultsNames(ddsTC)
## [1] "Intercept" "strain_mut_vs_wt" "minute_15_vs_0" "minute_30_vs_0"
## [5] "minute_60_vs_0" "minute_120_vs_0" "minute_180_vs_0" "strainmut.minute15"
## [9] "strainmut.minute30" "strainmut.minute60" "strainmut.minute120" "strainmut.minute180"
Question 2)
Imagine a gene that behaves in the same way in all timepoints (between the two groups) but time point 0 to 15.
Will this be detected by deseq?
Cheers,
john
Dear Michael, many thanks for your help and time.
Do I understand correctly, that by default the results() function would use test LRT in this case. Hence, for each time point comparison the results fold change would change but the p-value is the same e.g. it is the p-value of the comparison across all time points.
However, in case I am not only interested in the log fold change between two time points but also the p-values of that comparison would it be correct to use the Wald test?
Kind regards, Florian
Correct: https://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#i-ran-a-likelihood-ratio-test-but-results-only-gives-me-one-comparison.
I will say it is possible but a bit cumbersome to perform an LRT for 60 vs 0. You would have to re-run
DESeq()
or minimallynbinomLRT
withfull
andreduced
as design matrices, wherereduced
removes the column associated with 60 vs 0 coefficient.Dear Michael, thank you very much!
I think I do have a intuition of the problem. It is still correct to use the following to compare WT vs MUT at minute 15 controlling for baseline:
To compare time points within a strain I would then just subset the data by strain and compute a LRT with a time variable only. I could then ask for responder genes of a given strain over all time points and also do between time point comparisons using the Wald test as shown above. I think that way I do less harm.
Yes, this gives mut vs wt at minute 15 comparing to baseline.