Hello,
I am trying to perform DGE analysis on time series data. I am using DESeq2 for the analysis.
After searching, the closest match I could find to what I am trying to do is here:
http://seqanswers.com/forums/showthread.php?t=58065
I went through the latest vignette but was not able to reconcile what was explained earlier with the latest version.
http://master.bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#removing-hidden-batch-effects
I have samples from 4 time points (age). Each time point has at least 3 replicates. The experiment is as follows:
SampleID    Age
Sample_41   6M
Sample_42   6M
Sample_44   6M
Sample_45   12M
Sample_46   12M
Sample_47   12M
Sample_48   18M
Sample_49   18M
Sample_50   18M
Sample_51   18M
Sample_52   21M
Sample_53   21M
Sample_54   21M
What I want is the LFC and q-values for geneX across the 4 time points. This way we can identify genes that are not constant, i.e. whose LFC rose or dropped with age.
If I want to compare with 6 months as the reference, that is easy:
ddsMat_MouseAge <- DESeqDataSetFromMatrix(MouseAge, expSummary_MouseAge, ~Age)
ddsMat_MouseAge$Age <- relevel(ddsMat_MouseAge$Age, ref = "6M")
But this is NOT what I want.
Do I need to perform a "contrast" for each combination and then take the union?
res <- results(dds, contrast=c("Age","M6","M12"))
Save the output, then run the next comparison:
res <- results(dds, contrast=c("Age","M12","M18"))
and so on?
At the SEQanswers link (see top), the suggested code was:
results(dds, contrast=list("6M", c("12M","18M","21M")), listValues=c(1, -1/3))
Repeat the above for each age: 6M, 12M, 18M, 21M.
Does this also hold true with the latest version of DESeq2? Could you also explain the listValues part, as I don't get it?
Thanks
Thanks Michael for a prompt response.
I had tried something similar after reading your suggestion given here (with modifications):
resultsNames(ddsMat_MouseAge)
And when I look at res, it shows M6 vs M12 only.
I am missing something here.
https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#i-ran-a-likelihood-ratio-test-but-results-only-gives-me-one-comparison.
Hi Michael,
As suggested:
[1] "Intercept" "Age_M18_vs_M12" "Age_M21_vs_M12" "Age_M6_vs_M12"
Using:
results(dds,name="Age_M18_vs_M12")
I can see the LFC and the corresponding p and q values.
However, I cannot see the change for 6M vs 18M or 18M vs 21M. Does this mean none of the genes were significant as per the LRT?
Try
results( dds, contrast = c( "Age", "M18", "M12" ) )
Thanks Simon,
But shouldn't "18M", "12M", etc. be levels? And from
resultsNames(dds)
we know they are not. Only the following are:
"Age_M18_vs_M12" "Age_M21_vs_M12" "Age_M6_vs_M12"
I'm not sure there's a question left here that's not answered in the vignette. The base level that you see in resultsNames is set alphabetically unless you do otherwise (see the vignette section "Note on factor levels"). The fact that a single coefficient appears at the top of the results table is explained at the link I posted. You can set test="Wald" in results() if you want to compare two levels alone.
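To illustrate the base-level point, here is a minimal base-R sketch. The age vector is made up to mirror the level names seen in this thread; it is not the poster's actual sample table.

```r
# Hypothetical age labels mirroring the thread (assumption, not the real colData).
age <- factor(c("M6", "M6", "M12", "M12", "M18", "M18", "M21", "M21"))

# factor() sorts levels alphabetically, so "M12" comes first and becomes the base level.
levels(age)
# "M12" "M18" "M21" "M6"

# Option 1: state the chronological order explicitly.
age_ordered <- factor(age, levels = c("M6", "M12", "M18", "M21"))
levels(age_ordered)
# "M6" "M12" "M18" "M21"

# Option 2: only move the desired reference level to the front.
age_ref <- relevel(age, ref = "M6")
levels(age_ref)
# "M6" "M12" "M18" "M21"
```

Whichever level comes first is the one the other levels are compared against in names such as Age_M18_vs_M12.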
Thanks Michael for the response.
I do get the part about seeing a single coefficient and using name in results() to get the LFC, p and q values, among other stats. The value given must be an element of resultsNames(object). Looking at the output from resultsNames, the comparisons are against M12. This is after using the LRT method.
I am looking for changes in genes across the different ages, not a comparison against a base of 6M or any age in particular. For those changes I am looking for LFC and p/q-values.
So is everything clear or is there a remaining question?
Thanks Michael and apologies for any confusion.
Perhaps restating will be helpful.
I have samples with replicates for different time points (ages) and no condition. I want to perform DGE and capture genes that have changed across the ages. Use of the LRT was suggested.
This will give genes that change at any point in time. However, when I look at the result names I get:
[1] "Intercept" "Age_M18_vs_M12" "Age_M21_vs_M12" "Age_M6_vs_M12"
This means the comparison was performed with 12 months as the reference. This is not what I am looking for.
I am looking for LFC and p/q-values across the groups: 6M with 12M, 12M with 18M, etc.
Do I need to perform different comparisons for the different combinations to get LFC and p-values? If so, how should I combine the results to show that the LFC of geneX changed, remained constant, or was up- or down-regulated after age "y", with the q-value after multiple testing?
Thanks
You can perform the analysis across pairs of levels using the code that Simon posted. You just specify the two groups you want to compare with 'contrast' in the results() function. If DESeq() was run with test="LRT", then when you run results() you should also add test="Wald" to generate new p-values for the contrast.
The LRT gives you a single p-value if there was a change at any time point(s).
In DESeq2, we don't have functionality to test differences between all consecutive time points, but you can take a look at the stageR package for combining screening p-values with confirmation p-values. Perhaps this will give you the kind of post-hoc test correction you are looking for.
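The pairwise approach described above might look roughly like the following. This is only a sketch under stated assumptions: the counts are simulated with makeExampleDESeqDataSet(), the Age column and its levels are invented to mirror this thread, and the list of consecutive pairs is just one possible choice.

```r
library(DESeq2)

set.seed(1)
# Simulated data for illustration only: 200 genes, 12 samples.
dds <- makeExampleDESeqDataSet(n = 200, m = 12)

# Hypothetical age factor in chronological order, 3 replicates per age.
dds$Age <- factor(rep(c("M6", "M12", "M18", "M21"), each = 3),
                  levels = c("M6", "M12", "M18", "M21"))
design(dds) <- ~ Age

# LRT against an intercept-only model: one p-value per gene for any change over age.
dds <- DESeq(dds, test = "LRT", reduced = ~ 1)
res_lrt <- results(dds)

# Wald tests for each pair of consecutive ages, reusing the same fitted object.
pairs <- list(c("M12", "M6"), c("M18", "M12"), c("M21", "M18"))
res_pairs <- lapply(pairs, function(p) {
  results(dds, contrast = c("Age", p[1], p[2]), test = "Wald")
})
names(res_pairs) <- sapply(pairs, paste, collapse = "_vs_")
```

Each element of res_pairs has its own log2FoldChange and padj columns. The adjustment is done within each contrast only, so it does not correct across the three comparisons; that is the part where a stage-wise approach such as stageR may help.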
Thanks Michael. As suggested, I will perform the pairwise analysis and then look at stageR.
One thing that still confuses me: when I performed the LRT as you had suggested, why were the comparisons against age 12M?
[1] "Intercept" "Age_M18_vs_M12" "Age_M21_vs_M12" "Age_M6_vs_M12"
Thanks
This was already answered in the thread here and it is discussed in the vignette.
Thanks Michael.
I will look into stageR and ImpulseDE2.