DESeq2, problems using list argument to extract results from DESeq
2
0
Entering edit mode
Guest User ★ 13k
@guest-user-4897
Last seen 10.4 years ago
Hi, I have gene expression from several embryogenesis stages as well as non-reproductive stages. I have done DESeq on my DESeqDataSet and want to extract the results from all the embryogenesis stages vs non- reproductive. Heres the error I get: > resNonReprVsAll = results(dds, contrast=list(c("Early vitellogenesis", "Late vitellogenesis", "Fertilization", "Early cleavage", "Late cleavage", "Early preinversion", "Late preinversion", "Early postinversion", "Late postinversion"), "Non reproductive")) Error in cleanContrast(object, contrast, expanded = isExpanded, listValues = listValues) : all elements of the contrast as a list of length 2 should be elements of 'resultsNames(object)' Here's the output of resultsNames(dds): > resultsNames(dds) [1] "Intercept" "conditionEarly.vitellogenesis" "conditionLate.vitellogenesis" [4] "conditionFertilization" "conditionEarly.cleavage" "conditionLate.cleavage" [7] "conditionEarly.preinversion" "conditionLate.preinversion" "conditionEarly.postinversion" [10] "conditionLate.postinversion" "conditionNon.reproductive" -- output of sessionInfo(): > sessionInfo() R version 3.1.0 (2014-04-10) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] C attached base packages: [1] parallel stats graphics grDevices utils datasets methods base other attached packages: [1] DESeq2_1.4.5 RcppArmadillo_0.4.300.8.0 Rcpp_0.11.1 GenomicRanges_1.16.3 [5] GenomeInfoDb_1.0.2 IRanges_1.22.8 BiocGenerics_0.10.0 loaded via a namespace (and not attached): [1] AnnotationDbi_1.26.0 Biobase_2.24.0 DBI_0.2-7 RColorBrewer_1.0-5 RSQLite_0.11.4 [6] XML_3.98-1.1 XVector_0.4.0 annotate_1.42.0 genefilter_1.46.1 geneplotter_1.42.0 [11] grid_3.1.0 lattice_0.20-29 locfit_1.5-9.1 splines_3.1.0 stats4_3.1.0 [16] survival_2.37-7 tools_3.1.0 xtable_1.7-3 -- Sent via the guest posting facility at bioconductor.org.
DESeq DESeq • 5.8k views
ADD COMMENT
0
Entering edit mode

Hi,

i'm having some trouble and seems to be similar to yours.

Can you help me?

ADD REPLY
0
Entering edit mode
Simon Anders ★ 3.8k
@simon-anders-3855
Last seen 4.4 years ago
Zentrum für Molekularbiologie, Universi…
Hi Jon On 11/07/14 10:36, Jon Br?te [guest] wrote: > I have gene expression from several embryogenesis stages as well as non-reproductive stages. I have done DESeq on my DESeqDataSet and want to extract the results from all the embryogenesis stages vs non- reproductive. > > Heres the error I get: > >> resNonReprVsAll = results(dds, contrast=list(c("Early vitellogenesis", "Late vitellogenesis", "Fertilization", "Early cleavage", "Late cleavage", "Early preinversion", "Late preinversion", "Early postinversion", "Late postinversion"), "Non reproductive")) > Error in cleanContrast(object, contrast, expanded = isExpanded, listValues = listValues) : > all elements of the contrast as a list of length 2 should be elements of 'resultsNames(object)' The 'results' function gives you a results table for _one_ contrast. If you want several contrasts, you have to call 'results' several times. Also note that 'results' now supports two ways of specifying contrasts: For main effects, we suggest that you use the new format of passing a vector with three strings, namely the factor which you want to contrast and then the two levels you want to compare, e.g., results( dds, contrast = c( "condition", "Fertilization", "Non reproductive" ) ) # (or perhaps "Non.reproductive", depending how your level is # actually names) The old way of specifying _two_ elements from 'resultsNames' needs to be used only for more complicated contrasts, e.g., those involving interactions. If this is unclear, ask again, and explain the biology of your experiment and what contrasts you are actually interested in. Simon
ADD COMMENT
0
Entering edit mode
Jon Bråte ▴ 260
@jon-brate-6263
Last seen 6 months ago
Norway
Simon Anders <anders at="" ...=""> writes: > > Hi Jon > > On 11/07/14 10:36, Jon Br?te [guest] wrote: > > I have gene expression from several embryogenesis stages as well as non-reproductive stages. I have done > DESeq on my DESeqDataSet and want to extract the results from all the embryogenesis stages vs non-reproductive. > > > > Heres the error I get: > > > >> resNonReprVsAll = results(dds, contrast=list(c("Early vitellogenesis", "Late vitellogenesis", > "Fertilization", "Early cleavage", "Late cleavage", "Early preinversion", "Late preinversion", > "Early postinversion", "Late postinversion"), "Non reproductive")) > > Error in cleanContrast(object, contrast, expanded = isExpanded, listValues = listValues) : > > all elements of the contrast as a list of length 2 should be elements of 'resultsNames(object)' > > The 'results' function gives you a results table for _one_ contrast. If > you want several contrasts, you have to call 'results' several times. > > Also note that 'results' now supports two ways of specifying contrasts: > For main effects, we suggest that you use the new format of passing a > vector with three strings, namely the factor which you want to contrast > and then the two levels you want to compare, e.g., > > results( dds, contrast = c( "condition", > "Fertilization", "Non reproductive" ) ) > # (or perhaps "Non.reproductive", depending how your level is > # actually names) > > The old way of specifying _two_ elements from 'resultsNames' needs to be > used only for more complicated contrasts, e.g., those involving > interactions. > > If this is unclear, ask again, and explain the biology of your > experiment and what contrasts you are actually interested in. > > Simon > > _______________________________________________ > Bioconductor mailing list > Bioconductor at ... > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > > Dear Simon, thank you for your answer. What I am interested in is to find genes that possibly play a part in embryogenesis. In other words, genes that are up-regulated in any of the embryogenesis stages compared to the non-reproductive stages. I have tried three different approaches, but struggle to understand what they mean, how they differ, and if they actually make sense. I try to explain what I did, sorry if it is difficult to read: First I tried to contrast all the embryogenesis stages compared to the non reproductive using the Wald test in DESeq: dds = DESeq(dds) resCtrst = results(dds, contrast=list(c("conditionEarly.vitellogenesis", "conditionLate.vitellogenesis", "conditionFertilization", "conditionEarly.cleavage", "conditionLate.cleavage", "conditionEarly.preinversion", "conditionLate.preinversion", "conditionEarly.postinversion", "conditionLate.postinversion"), "conditionNon.reproductive")) resCtrst = resCtrst[order(resCtrst$padj),] resSigCtrst = resCtrst[which(resCtrst$padj < 0.1), ] resSigCtrstPositive = resSigCtrst[which(resSigCtrst$log2FoldChange > 0), ] write.csv2(resSigCtrstPositive, file="DESeq2_Wald_test_Sig_upreg_all_cond_vs_non_repr.csv") This gave me 1736 upregulated genes. Second I ran an LTR test like this: ddsLRT = DESeq(dds, test="LRT", full=~condition, reduced=~1) resLRT = results(ddsLRT) resLRT = resLRT[order(resLRT$padj),] resSigLRT = resLRT[which(resLRT$padj < 0.1), ] resSigLRTPositive = resSigLRT[which(resSigLRT$log2FoldChange > 0), ] write.csv2(resSigLRT, file="DESeq2_Sig_LRT_upreg.csv") This gave me 3158 upregulated genes (all genes have positive log2fold changes so I guess I cannot see which are up or down regulated?) And I don't understand exactly what is contained in resLRT, are the significant genes here significantly expressed between Non reproductive samples and all others compared to the null hypothesis? Or is it just for the two conditions Late.postinversion vs Non reproductive?: mcols(resLRT) DataFrame with 6 rows and 2 columns type description <character> <character> 1 intermediate the base mean over all rows 2 results log2 fold change: condition Late.postinversion vs Non.reproductive 3 results standard error: condition Late.postinversion vs Non.reproductive 4 results LRT statistic: '~ condition' vs '~ 1' 5 results LRT p-value: '~ condition' vs '~ 1' 6 results BH adjusted p-values Third I tried to compare the Wald test and the LTR: res = results(dds) #should I use resCtrst instead of res?? tab = table(Wald=res$padj < .1, LRT=resLRT$padj < .1) addmargins(tab) LRT Wald FALSE TRUE Sum FALSE 1027 1766 2793 TRUE 526 747 1273 Sum 1553 2513 4066 sessionInfo() R version 3.1.0 (2014-04-10) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] C attached base packages: [1] parallel stats graphics grDevices utils datasets methods base other attached packages: [1] DESeq2_1.4.5 RcppArmadillo_0.4.320.0 Rcpp_0.11.2 GenomicRanges_1.16.3 [5] GenomeInfoDb_1.0.2 IRanges_1.22.9 BiocGenerics_0.10.0 loaded via a namespace (and not attached): [1] AnnotationDbi_1.26.0 Biobase_2.24.0 DBI_0.2-7 RColorBrewer_1.0-5 RSQLite_0.11.4 [6] XML_3.98-1.1 XVector_0.4.0 annotate_1.42.0 genefilter_1.46.1 geneplotter_1.42.0 [11] grid_3.1.0 lattice_0.20-29 locfit_1.5-9.1 splines_3.1.0 stats4_3.1.0 [16] survival_2.37-7 tools_3.1.0 xtable_1.7-3
ADD COMMENT
0
Entering edit mode
Hi Jon, On Jul 30, 2014 3:27 PM, "Jon Bråte" > thank you for your answer. What I am interested in is to find genes that > possibly play a part in embryogenesis. In other words, genes that are > up-regulated in any of the embryogenesis stages compared to the > non-reproductive stages. Can you tell us the levels of condition which are embryogenesis vs those which are non-reproductive? > > I have tried three different approaches, but struggle to understand what > they mean, how they differ, and if they actually make sense. I try to > explain what I did, sorry if it is difficult to read: > > > First I tried to contrast all the embryogenesis stages compared to the non > reproductive using the Wald test in DESeq: > > dds = DESeq(dds) > resCtrst = results(dds, contrast=list(c("conditionEarly.vitellogenesis", > "conditionLate.vitellogenesis", "conditionFertilization", > "conditionEarly.cleavage", > "conditionLate.cleavage", > "conditionEarly.preinversion", "conditionLate.preinversion", > "conditionEarly.postinversion", > "conditionLate.postinversion"), "conditionNon.reproductive")) > resCtrst = resCtrst[order(resCtrst$padj),] > resSigCtrst = resCtrst[which(resCtrst$padj < 0.1), ] > resSigCtrstPositive = resSigCtrst[which(resSigCtrst$log2FoldChange > 0), ] > write.csv2(resSigCtrstPositive, > file="DESeq2_Wald_test_Sig_upreg_all_cond_vs_non_repr.csv") > > This gave me 1736 upregulated genes. > > > Second I ran an LTR test like this: > > ddsLRT = DESeq(dds, test="LRT", full=~condition, reduced=~1) > resLRT = results(ddsLRT) > resLRT = resLRT[order(resLRT$padj),] > resSigLRT = resLRT[which(resLRT$padj < 0.1), ] > resSigLRTPositive = resSigLRT[which(resSigLRT$log2FoldChange > 0), ] > write.csv2(resSigLRT, file="DESeq2_Sig_LRT_upreg.csv") > > This gave me 3158 upregulated genes (all genes have positive log2fold > changes so I guess I cannot see which are up or down regulated?) > > And I don't understand exactly what is contained in resLRT, are the > significant genes here significantly expressed between Non reproductive > samples and all others compared to the null hypothesis? Or is it just for > the two conditions Late.postinversion vs Non reproductive?: > > mcols(resLRT) > DataFrame with 6 rows and 2 columns > type > description > <character> > <character> > 1 intermediate the base mean over all > rows > 2 results log2 fold change: condition Late.postinversion vs > Non.reproductive > 3 results standard error: condition Late.postinversion vs > Non.reproductive > 4 results LRT statistic: '~ condition' vs > '~ 1' > 5 results LRT p-value: '~ condition' vs > '~ 1' > 6 results BH adjusted > p-values > > > Third I tried to compare the Wald test and the LTR: > > res = results(dds) #should I use resCtrst instead of res?? > tab = table(Wald=res$padj < .1, LRT=resLRT$padj < .1) > addmargins(tab) > > LRT > Wald FALSE TRUE Sum > FALSE 1027 1766 2793 > TRUE 526 747 1273 > Sum 1553 2513 4066 > > > sessionInfo() > R version 3.1.0 (2014-04-10) > Platform: x86_64-apple-darwin10.8.0 (64-bit) > > locale: > [1] C > > attached base packages: > [1] parallel stats graphics grDevices utils datasets methods > base > > other attached packages: > [1] DESeq2_1.4.5 RcppArmadillo_0.4.320.0 Rcpp_0.11.2 > GenomicRanges_1.16.3 > [5] GenomeInfoDb_1.0.2 IRanges_1.22.9 BiocGenerics_0.10.0 > > loaded via a namespace (and not attached): > [1] AnnotationDbi_1.26.0 Biobase_2.24.0 DBI_0.2-7 > RColorBrewer_1.0-5 RSQLite_0.11.4 > [6] XML_3.98-1.1 XVector_0.4.0 annotate_1.42.0 > genefilter_1.46.1 geneplotter_1.42.0 > [11] grid_3.1.0 lattice_0.20-29 locfit_1.5-9.1 > splines_3.1.0 stats4_3.1.0 > [16] survival_2.37-7 tools_3.1.0 xtable_1.7-3 > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
Hi, > dds$condition [1] Early vitellogenesis Early vitellogenesis Early vitellogenesis Late vitellogenesis [5] Late vitellogenesis Late vitellogenesis Late vitellogenesis Fertilization [9] Fertilization Early cleavage Early cleavage Late cleavage [13] Late cleavage Early preinversion Early preinversion Late preinversion [17] Late preinversion Early postinversion Early postinversion Late postinversion [21] Non reproductive Non reproductive Non reproductive Non reproductive [25] Non reproductive Non reproductive 10 Levels: Non reproductive Early vitellogenesis Late vitellogenesis Fertilization ... Late postinversion > dds class: DESeqDataSet dim: 6137 26 exptData(0): assays(3): counts mu cooks rownames(6137): scign000005 scign000006 ... scign021663 scign021669 rowData metadata column names(57): baseMean baseVar ... deviance maxCooks colnames(26): 05.apr 11.apr ... R2D0T1 R3D0T1 colData names(2): condition sizeFactor On 31. juli 2014, at 02:28, Michael Love wrote: Hi Jon, On Jul 30, 2014 3:27 PM, "Jon Bråte" > thank you for your answer. What I am interested in is to find genes that > possibly play a part in embryogenesis. In other words, genes that are > up-regulated in any of the embryogenesis stages compared to the > non-reproductive stages. Can you tell us the levels of condition which are embryogenesis vs those which are non-reproductive? > > I have tried three different approaches, but struggle to understand what > they mean, how they differ, and if they actually make sense. I try to > explain what I did, sorry if it is difficult to read: > > > First I tried to contrast all the embryogenesis stages compared to the non > reproductive using the Wald test in DESeq: > > dds = DESeq(dds) > resCtrst = results(dds, contrast=list(c("conditionEarly.vitellogenesis", > "conditionLate.vitellogenesis", "conditionFertilization", > "conditionEarly.cleavage", > "conditionLate.cleavage", > "conditionEarly.preinversion", "conditionLate.preinversion", > "conditionEarly.postinversion", > "conditionLate.postinversion"), "conditionNon.reproductive")) > resCtrst = resCtrst[order(resCtrst$padj),] > resSigCtrst = resCtrst[which(resCtrst$padj < 0.1), ] > resSigCtrstPositive = resSigCtrst[which(resSigCtrst$log2FoldChange > 0), ] > write.csv2(resSigCtrstPositive, > file="DESeq2_Wald_test_Sig_upreg_all_cond_vs_non_repr.csv") > > This gave me 1736 upregulated genes. > > > Second I ran an LTR test like this: > > ddsLRT = DESeq(dds, test="LRT", full=~condition, reduced=~1) > resLRT = results(ddsLRT) > resLRT = resLRT[order(resLRT$padj),] > resSigLRT = resLRT[which(resLRT$padj < 0.1), ] > resSigLRTPositive = resSigLRT[which(resSigLRT$log2FoldChange > 0), ] > write.csv2(resSigLRT, file="DESeq2_Sig_LRT_upreg.csv") > > This gave me 3158 upregulated genes (all genes have positive log2fold > changes so I guess I cannot see which are up or down regulated?) > > And I don't understand exactly what is contained in resLRT, are the > significant genes here significantly expressed between Non reproductive > samples and all others compared to the null hypothesis? Or is it just for > the two conditions Late.postinversion vs Non reproductive?: > > mcols(resLRT) > DataFrame with 6 rows and 2 columns > type > description > <character> > <character> > 1 intermediate the base mean over all > rows > 2 results log2 fold change: condition Late.postinversion vs > Non.reproductive > 3 results standard error: condition Late.postinversion vs > Non.reproductive > 4 results LRT statistic: '~ condition' vs > '~ 1' > 5 results LRT p-value: '~ condition' vs > '~ 1' > 6 results BH adjusted > p-values > > > Third I tried to compare the Wald test and the LTR: > > res = results(dds) #should I use resCtrst instead of res?? > tab = table(Wald=res$padj < .1, LRT=resLRT$padj < .1) > addmargins(tab) > > LRT > Wald FALSE TRUE Sum > FALSE 1027 1766 2793 > TRUE 526 747 1273 > Sum 1553 2513 4066 > > > sessionInfo() > R version 3.1.0 (2014-04-10) > Platform: x86_64-apple-darwin10.8.0 (64-bit) > > locale: > [1] C > > attached base packages: > [1] parallel stats graphics grDevices utils datasets methods > base > > other attached packages: > [1] DESeq2_1.4.5 RcppArmadillo_0.4.320.0 Rcpp_0.11.2 > GenomicRanges_1.16.3 > [5] GenomeInfoDb_1.0.2 IRanges_1.22.9 BiocGenerics_0.10.0 > > loaded via a namespace (and not attached): > [1] AnnotationDbi_1.26.0 Biobase_2.24.0 DBI_0.2-7 > RColorBrewer_1.0-5 RSQLite_0.11.4 > [6] XML_3.98-1.1 XVector_0.4.0 annotate_1.42.0 > genefilter_1.46.1 geneplotter_1.42.0 > [11] grid_3.1.0 lattice_0.20-29 locfit_1.5-9.1 > splines_3.1.0 stats4_3.1.0 > [16] survival_2.37-7 tools_3.1.0 xtable_1.7-3 > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org<mailto:bioconductor@r-project.org> > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor ---------------------------------------------------------------- Jon Bråte Section for Genetics and Evolutionary Biology (EVOGENE) Department of Biosciences University of Oslo P.B. 1066 Blindern N-0316, Norway Email: jon.brate@ibv.uio.no<mailto:jon.brate@ibv.uio.no> Phone: 922 44 582 Web: mn.uio.no/ibv/english/people/aca/jonbra/index.html<http: mn.uio.="" no="" ibv="" english="" people="" aca="" jonbra="" index.html=""> [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
hi Jon, On Thu, Jul 31, 2014 at 2:13 AM, Jon Br?te <jon.brate at="" ibv.uio.no=""> wrote: > Hi, > >> dds$condition > [1] Early vitellogenesis Early vitellogenesis Early vitellogenesis Late > vitellogenesis > [5] Late vitellogenesis Late vitellogenesis Late vitellogenesis > Fertilization > [9] Fertilization Early cleavage Early cleavage Late > cleavage > [13] Late cleavage Early preinversion Early preinversion Late > preinversion > [17] Late preinversion Early postinversion Early postinversion Late > postinversion > [21] Non reproductive Non reproductive Non reproductive Non > reproductive > [25] Non reproductive Non reproductive > 10 Levels: Non reproductive Early vitellogenesis Late vitellogenesis > Fertilization ... Late postinversion > >> dds > class: DESeqDataSet > dim: 6137 26 > exptData(0): > assays(3): counts mu cooks > rownames(6137): scign000005 scign000006 ... scign021663 scign021669 > rowData metadata column names(57): baseMean baseVar ... deviance maxCooks > colnames(26): 05.apr 11.apr ... R2D0T1 R3D0T1 > colData names(2): condition sizeFactor > > > On 31. juli 2014, at 02:28, Michael Love wrote: > > Hi Jon, > > On Jul 30, 2014 3:27 PM, "Jon Br?te" >> thank you for your answer. What I am interested in is to find genes that >> possibly play a part in embryogenesis. In other words, genes that are >> up-regulated in any of the embryogenesis stages compared to the >> non-reproductive stages. > > Can you tell us the levels of condition which are embryogenesis vs those > which are non-reproductive? > >> >> I have tried three different approaches, but struggle to understand what >> they mean, how they differ, and if they actually make sense. I try to >> explain what I did, sorry if it is difficult to read: >> >> >> First I tried to contrast all the embryogenesis stages compared to the non >> reproductive using the Wald test in DESeq: >> >> dds = DESeq(dds) >> resCtrst = results(dds, contrast=list(c("conditionEarly.vitellogenesis", >> "conditionLate.vitellogenesis", "conditionFertilization", >> "conditionEarly.cleavage", >> "conditionLate.cleavage", >> "conditionEarly.preinversion", "conditionLate.preinversion", >> "conditionEarly.postinversion", >> "conditionLate.postinversion"), "conditionNon.reproductive")) Here you need to include the results() argument listValues=c(1/9, -1). This is to compare the average of the 9 embryogenic levels to the 1 non-reproductive level. This seems like the correct results table for your question. Additionally you can perform individual contrasts of one level with the non-reproductive level: res = results(dds, contrast=c("condition","Early.vitellogenesis","Non.reproductive")) >> resCtrst = resCtrst[order(resCtrst$padj),] >> resSigCtrst = resCtrst[which(resCtrst$padj < 0.1), ] >> resSigCtrstPositive = resSigCtrst[which(resSigCtrst$log2FoldChange > 0), ] >> write.csv2(resSigCtrstPositive, >> file="DESeq2_Wald_test_Sig_upreg_all_cond_vs_non_repr.csv") >> >> This gave me 1736 upregulated genes. >> >> >> Second I ran an LTR test like this: >> >> ddsLRT = DESeq(dds, test="LRT", full=~condition, reduced=~1) >> resLRT = results(ddsLRT) >> resLRT = resLRT[order(resLRT$padj),] >> resSigLRT = resLRT[which(resLRT$padj < 0.1), ] >> resSigLRTPositive = resSigLRT[which(resSigLRT$log2FoldChange > 0), ] >> write.csv2(resSigLRT, file="DESeq2_Sig_LRT_upreg.csv") >> >> This gave me 3158 upregulated genes (all genes have positive log2fold >> changes so I guess I cannot see which are up or down regulated?) >> The log2 fold change which is shown is for just one of the 9 coefficients in the model (the last level over the base level). In the ?results help file we have: For analyses using the likelihood ratio test (using ?nbinomLRT?), the p-values are determined solely by the difference in deviance between the full and reduced model formula. A log2 fold change is included, which can be controlled using the ?name? argument, or by default this will be the estimated coefficient for the last element of ?resultsNames(object)?. >> And I don't understand exactly what is contained in resLRT, are the >> significant genes here significantly expressed between Non reproductive >> samples and all others compared to the null hypothesis? The LRT in this case is testing for any changes across *any* levels of condition. So this sounds like not the test which you are interested in answering. >> Or is it just for >> the two conditions Late.postinversion vs Non reproductive?: >> >> mcols(resLRT) >> DataFrame with 6 rows and 2 columns >> type >> description >> <character> >> <character> >> 1 intermediate the base mean over >> all >> rows >> 2 results log2 fold change: condition Late.postinversion vs >> Non.reproductive >> 3 results standard error: condition Late.postinversion vs >> Non.reproductive >> 4 results LRT statistic: '~ condition' >> vs >> '~ 1' >> 5 results LRT p-value: '~ condition' >> vs >> '~ 1' >> 6 results BH adjusted >> p-values >> >> >> Third I tried to compare the Wald test and the LTR: >> >> res = results(dds) #should I use resCtrst instead of res?? from the ?results help file: If ?results? is run without specifying ?contrast? or ?name?, it will return the comparison of the last level of the last variable in the design formula over the first level of this variable. Mike >> tab = table(Wald=res$padj < .1, LRT=resLRT$padj < .1) >> addmargins(tab) >> >> LRT >> Wald FALSE TRUE Sum >> FALSE 1027 1766 2793 >> TRUE 526 747 1273 >> Sum 1553 2513 4066 >> >> >> sessionInfo() >> R version 3.1.0 (2014-04-10) >> Platform: x86_64-apple-darwin10.8.0 (64-bit) >> >> locale: >> [1] C >> >> attached base packages: >> [1] parallel stats graphics grDevices utils datasets methods >> base >> >> other attached packages: >> [1] DESeq2_1.4.5 RcppArmadillo_0.4.320.0 Rcpp_0.11.2 >> GenomicRanges_1.16.3 >> [5] GenomeInfoDb_1.0.2 IRanges_1.22.9 BiocGenerics_0.10.0 >> >> loaded via a namespace (and not attached): >> [1] AnnotationDbi_1.26.0 Biobase_2.24.0 DBI_0.2-7 >> RColorBrewer_1.0-5 RSQLite_0.11.4 >> [6] XML_3.98-1.1 XVector_0.4.0 annotate_1.42.0 >> genefilter_1.46.1 geneplotter_1.42.0 >> [11] grid_3.1.0 lattice_0.20-29 locfit_1.5-9.1 >> splines_3.1.0 stats4_3.1.0 >> [16] survival_2.37-7 tools_3.1.0 xtable_1.7-3 >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor > > > > > ---------------------------------------------------------------- > Jon Br?te > > Section for Genetics and Evolutionary Biology (EVOGENE) > Department of Biosciences > University of Oslo > P.B. 1066 Blindern > N-0316, Norway > Email: jon.brate at ibv.uio.no > Phone: 922 44 582 > Web: mn.uio.no/ibv/english/people/aca/jonbra/index.html > > > >
ADD REPLY
0
Entering edit mode
Great! Thanks! Jon On 31. juli 2014, at 15:50, Michael Love wrote: hi Jon, On Thu, Jul 31, 2014 at 2:13 AM, Jon Bråte <jon.brate@ibv.uio.no<mailto:jon.brate@ibv.uio.no>> wrote: Hi, dds$condition [1] Early vitellogenesis Early vitellogenesis Early vitellogenesis Late vitellogenesis [5] Late vitellogenesis Late vitellogenesis Late vitellogenesis Fertilization [9] Fertilization Early cleavage Early cleavage Late cleavage [13] Late cleavage Early preinversion Early preinversion Late preinversion [17] Late preinversion Early postinversion Early postinversion Late postinversion [21] Non reproductive Non reproductive Non reproductive Non reproductive [25] Non reproductive Non reproductive 10 Levels: Non reproductive Early vitellogenesis Late vitellogenesis Fertilization ... Late postinversion dds class: DESeqDataSet dim: 6137 26 exptData(0): assays(3): counts mu cooks rownames(6137): scign000005 scign000006 ... scign021663 scign021669 rowData metadata column names(57): baseMean baseVar ... deviance maxCooks colnames(26): 05.apr 11.apr ... R2D0T1 R3D0T1 colData names(2): condition sizeFactor On 31. juli 2014, at 02:28, Michael Love wrote: Hi Jon, On Jul 30, 2014 3:27 PM, "Jon Bråte" thank you for your answer. What I am interested in is to find genes that possibly play a part in embryogenesis. In other words, genes that are up-regulated in any of the embryogenesis stages compared to the non-reproductive stages. Can you tell us the levels of condition which are embryogenesis vs those which are non-reproductive? I have tried three different approaches, but struggle to understand what they mean, how they differ, and if they actually make sense. I try to explain what I did, sorry if it is difficult to read: First I tried to contrast all the embryogenesis stages compared to the non reproductive using the Wald test in DESeq: dds = DESeq(dds) resCtrst = results(dds, contrast=list(c("conditionEarly.vitellogenesis", "conditionLate.vitellogenesis", "conditionFertilization", "conditionEarly.cleavage", "conditionLate.cleavage", "conditionEarly.preinversion", "conditionLate.preinversion", "conditionEarly.postinversion", "conditionLate.postinversion"), "conditionNon.reproductive")) Here you need to include the results() argument listValues=c(1/9, -1). This is to compare the average of the 9 embryogenic levels to the 1 non-reproductive level. This seems like the correct results table for your question. Additionally you can perform individual contrasts of one level with the non-reproductive level: res = results(dds, contrast=c("condition","Early.vitellogenesis","Non.reproductive")) resCtrst = resCtrst[order(resCtrst$padj),] resSigCtrst = resCtrst[which(resCtrst$padj < 0.1), ] resSigCtrstPositive = resSigCtrst[which(resSigCtrst$log2FoldChange > 0), ] write.csv2(resSigCtrstPositive, file="DESeq2_Wald_test_Sig_upreg_all_cond_vs_non_repr.csv") This gave me 1736 upregulated genes. Second I ran an LTR test like this: ddsLRT = DESeq(dds, test="LRT", full=~condition, reduced=~1) resLRT = results(ddsLRT) resLRT = resLRT[order(resLRT$padj),] resSigLRT = resLRT[which(resLRT$padj < 0.1), ] resSigLRTPositive = resSigLRT[which(resSigLRT$log2FoldChange > 0), ] write.csv2(resSigLRT, file="DESeq2_Sig_LRT_upreg.csv") This gave me 3158 upregulated genes (all genes have positive log2fold changes so I guess I cannot see which are up or down regulated?) The log2 fold change which is shown is for just one of the 9 coefficients in the model (the last level over the base level). In the ?results help file we have: For analyses using the likelihood ratio test (using ‘nbinomLRT’), the p-values are determined solely by the difference in deviance between the full and reduced model formula. A log2 fold change is included, which can be controlled using the ‘name’ argument, or by default this will be the estimated coefficient for the last element of ‘resultsNames(object)’. And I don't understand exactly what is contained in resLRT, are the significant genes here significantly expressed between Non reproductive samples and all others compared to the null hypothesis? The LRT in this case is testing for any changes across *any* levels of condition. So this sounds like not the test which you are interested in answering. Or is it just for the two conditions Late.postinversion vs Non reproductive?: mcols(resLRT) DataFrame with 6 rows and 2 columns type description <character> <character> 1 intermediate the base mean over all rows 2 results log2 fold change: condition Late.postinversion vs Non.reproductive 3 results standard error: condition Late.postinversion vs Non.reproductive 4 results LRT statistic: '~ condition' vs '~ 1' 5 results LRT p-value: '~ condition' vs '~ 1' 6 results BH adjusted p-values Third I tried to compare the Wald test and the LTR: res = results(dds) #should I use resCtrst instead of res?? from the ?results help file: If ‘results’ is run without specifying ‘contrast’ or ‘name’, it will return the comparison of the last level of the last variable in the design formula over the first level of this variable. Mike tab = table(Wald=res$padj < .1, LRT=resLRT$padj < .1) addmargins(tab) LRT Wald FALSE TRUE Sum FALSE 1027 1766 2793 TRUE 526 747 1273 Sum 1553 2513 4066 sessionInfo() R version 3.1.0 (2014-04-10) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] C attached base packages: [1] parallel stats graphics grDevices utils datasets methods base other attached packages: [1] DESeq2_1.4.5 RcppArmadillo_0.4.320.0 Rcpp_0.11.2 GenomicRanges_1.16.3 [5] GenomeInfoDb_1.0.2 IRanges_1.22.9 BiocGenerics_0.10.0 loaded via a namespace (and not attached): [1] AnnotationDbi_1.26.0 Biobase_2.24.0 DBI_0.2-7 RColorBrewer_1.0-5 RSQLite_0.11.4 [6] XML_3.98-1.1 XVector_0.4.0 annotate_1.42.0 genefilter_1.46.1 geneplotter_1.42.0 [11] grid_3.1.0 lattice_0.20-29 locfit_1.5-9.1 splines_3.1.0 stats4_3.1.0 [16] survival_2.37-7 tools_3.1.0 xtable_1.7-3 _______________________________________________ Bioconductor mailing list Bioconductor@r-project.org<mailto:bioconductor@r-project.org> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor ---------------------------------------------------------------- Jon Bråte Section for Genetics and Evolutionary Biology (EVOGENE) Department of Biosciences University of Oslo P.B. 1066 Blindern N-0316, Norway Email: jon.brate@ibv.uio.no<mailto:jon.brate@ibv.uio.no> Phone: 922 44 582 Web: mn.uio.no/ibv/english/people/aca/jonbra/index.html<http: mn.uio.="" no="" ibv="" english="" people="" aca="" jonbra="" index.html=""> ---------------------------------------------------------------- Jon Bråte Section for Genetics and Evolutionary Biology (EVOGENE) Department of Biosciences University of Oslo P.B. 1066 Blindern N-0316, Norway Email: jon.brate@ibv.uio.no<mailto:jon.brate@ibv.uio.no> Phone: 922 44 582 Web: mn.uio.no/ibv/english/people/aca/jonbra/index.html<http: mn.uio.="" no="" ibv="" english="" people="" aca="" jonbra="" index.html=""> [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 890 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6