DESeq multifactor design paired analysis
1
0
Entering edit mode
Natasha ▴ 440
@natasha-4640
Last seen 10.2 years ago
Dear List, I have a dataset with 3 conditions (control and 2 treatments), 2 replicates each. Initially I carried out a normal analysis in DESeq - each treatment vs the control, but later was told it needed to be a paired analysis. Thus, just to be consistent I decided to use DESeq itself. However, I am unsure how to extract my comparisons of interest. ------ Code: >design condition pair 1 Cont 1 2 Cont 3 3 Trt1 1 4 Trt1 3 5 Trt2 1 6 Trt2 3 >cds2 <- newCountDataSet(gct, design) >cds2 <- estimateSizeFactors(cds2) >sizeFactors(cds2) Cont_1 Cont_3 Trt1_1 Trt1_3 Trt2_1 Trt2_3 0.9373964 1.1328686 1.0097990 1.0596419 0.8562104 1.0645546 .cds2 <- estimateDispersions(cds2,"pooled-CR",modelFormula=count ~ pair + condition) >fit1 = fitNbinomGLMs(cds2, count ~ pair + condition) >fit0 = fitNbinomGLMs(cds2, count ~ pair) >str(fit1) 'data.frame': 4765 obs. of 6 variables: $ (Intercept): num 10.19 9.99 6.89 6.48 4.46 ... $ pair3 : num -0.3018 -0.3222 0.068 0.0468 0.1504 ... $ conditionTrt1: num 0.17 0.142 -0.495 0.125 0.319 ... $ conditionTrt2: num 0.1882 0.00112 -0.2704 -0.09412 0.10651 ... $ deviance : num 0.0742 0.1569 0.9072 2.0612 0.3702 ... $ converged : logi TRUE TRUE TRUE TRUE TRUE TRUE ... - attr(*, "df.residual")= num 2 >head(fit1) (Intercept) pair3 conditionTrt1 conditionTrt2 deviance converged gene0 10.188336 -0.30184662 0.1698471 0.188197344 0.0741656 TRUE gene1 9.992372 -0.32221027 0.1415406 0.001119304 0.1568869 TRUE gene10 6.893693 0.06795915 -0.4954365 -0.270402107 0.9072335 TRUE gene100 6.477847 0.04684995 0.1250743 -0.094116036 2.0611685 TRUE gene1002 4.463446 0.15038277 0.3186187 0.106512332 0.3701619 TRUE gene1003 4.090079 -0.05247162 -0.3560189 -0.054765642 1.9631768 TRUE >pvalsGLM = nbinomGLMTest( fit1, fit0 ) >padjGLM = p.adjust(pvalsGLM, method="BH" ) --------- Based on the above how do I extract: 1) Trt1 vs Cont? - is this the 3rd column of fit1, depicted as log2 FoldChange? 2) Trt2 vs Cont? - is this the 4th column of fit1, depicted as log2 FoldChange? 3) What would the p-values for each comparison be? As pvalsGLM and padjGLM gives only one set, and I don't think it will be the same for both comparisons. Many Thanks, Natasha [[alternative HTML version deleted]]
DESeq DESeq • 2.1k views
ADD COMMENT
0
Entering edit mode
Natasha ▴ 440
@natasha-4640
Last seen 10.2 years ago
Sorry forgot the sessionInfo! Added at the end now -- From: Natasha Sahgal Sent: 14 May 2013 12:29 To: bioconductor@r-project.org Subject: DESeq multifactor design paired analysis Dear List, I have a dataset with 3 conditions (control and 2 treatments), 2 replicates each. Initially I carried out a normal analysis in DESeq - each treatment vs the control, but later was told it needed to be a paired analysis. Thus, just to be consistent I decided to use DESeq itself. However, I am unsure how to extract my comparisons of interest. ------ Code: >design condition pair 1 Cont 1 2 Cont 3 3 Trt1 1 4 Trt1 3 5 Trt2 1 6 Trt2 3 >cds2 <- newCountDataSet(gct, design) >cds2 <- estimateSizeFactors(cds2) >sizeFactors(cds2) Cont_1 Cont_3 Trt1_1 Trt1_3 Trt2_1 Trt2_3 0.9373964 1.1328686 1.0097990 1.0596419 0.8562104 1.0645546 .cds2 <- estimateDispersions(cds2,"pooled-CR",modelFormula=count ~ pair + condition) >fit1 = fitNbinomGLMs(cds2, count ~ pair + condition) >fit0 = fitNbinomGLMs(cds2, count ~ pair) >str(fit1) 'data.frame': 4765 obs. of 6 variables: $ (Intercept): num 10.19 9.99 6.89 6.48 4.46 ... $ pair3 : num -0.3018 -0.3222 0.068 0.0468 0.1504 ... $ conditionTrt1: num 0.17 0.142 -0.495 0.125 0.319 ... $ conditionTrt2: num 0.1882 0.00112 -0.2704 -0.09412 0.10651 ... $ deviance : num 0.0742 0.1569 0.9072 2.0612 0.3702 ... $ converged : logi TRUE TRUE TRUE TRUE TRUE TRUE ... - attr(*, "df.residual")= num 2 >head(fit1) (Intercept) pair3 conditionTrt1 conditionTrt2 deviance converged gene0 10.188336 -0.30184662 0.1698471 0.188197344 0.0741656 TRUE gene1 9.992372 -0.32221027 0.1415406 0.001119304 0.1568869 TRUE gene10 6.893693 0.06795915 -0.4954365 -0.270402107 0.9072335 TRUE gene100 6.477847 0.04684995 0.1250743 -0.094116036 2.0611685 TRUE gene1002 4.463446 0.15038277 0.3186187 0.106512332 0.3701619 TRUE gene1003 4.090079 -0.05247162 -0.3560189 -0.054765642 1.9631768 TRUE >pvalsGLM = nbinomGLMTest( fit1, fit0 ) >padjGLM = p.adjust(pvalsGLM, method="BH" ) --------- Based on the above how do I extract: 1) Trt1 vs Cont? - is this the 3rd column of fit1, depicted as log2 FoldChange? 2) Trt2 vs Cont? - is this the 4th column of fit1, depicted as log2 FoldChange? 3) What would the p-values for each comparison be? As pvalsGLM and padjGLM gives only one set, and I don't think it will be the same for both comparisons. sessionInfo() R version 2.15.2 (2012-10-26) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] gdata_2.12.0 WriteXLS_2.3.0 DESeq_1.10.1 lattice_0.20-6 [5] locfit_1.5-8 Biobase_2.18.0 BiocGenerics_0.4.0 loaded via a namespace (and not attached): [1] annotate_1.36.0 AnnotationDbi_1.20.2 DBI_0.2-5 [4] genefilter_1.40.0 geneplotter_1.36.0 grid_2.15.2 [7] gtools_2.7.0 IRanges_1.16.4 MASS_7.3-22 [10] parallel_2.15.2 RColorBrewer_1.0-5 RSQLite_0.11.2 [13] splines_2.15.2 stats4_2.15.2 survival_2.36-14 [16] tcltk_2.15.2 tools_2.15.2 XML_3.95-0.1 [19] xtable_1.7-0 Many Thanks, Natasha [[alternative HTML version deleted]]
ADD COMMENT
0
Entering edit mode
Hi Natasha, On Tue, May 14, 2013 at 1:30 PM, Natasha Sahgal <nsahgal@well.ox.ac.uk>wrote: > > >head(fit1) > (Intercept) pair3 conditionTrt1 conditionTrt2 deviance > converged > gene0 10.188336 -0.30184662 0.1698471 0.188197344 0.0741656 > TRUE > gene1 9.992372 -0.32221027 0.1415406 0.001119304 0.1568869 > TRUE > gene10 6.893693 0.06795915 -0.4954365 -0.270402107 0.9072335 > TRUE > gene100 6.477847 0.04684995 0.1250743 -0.094116036 2.0611685 > TRUE > gene1002 4.463446 0.15038277 0.3186187 0.106512332 0.3701619 > TRUE > gene1003 4.090079 -0.05247162 -0.3560189 -0.054765642 1.9631768 > TRUE > > >pvalsGLM = nbinomGLMTest( fit1, fit0 ) > >padjGLM = p.adjust(pvalsGLM, method="BH" ) > --------- > > Based on the above how do I extract: > > 1) Trt1 vs Cont? - is this the 3rd column of fit1, depicted as log2 > FoldChange? > > > > 2) Trt2 vs Cont? - is this the 4th column of fit1, depicted as log2 > FoldChange? > > Yes, you are right. The 3rd and 4th columns of fit1 are the log2 fold changes for trt1 vs control and trt2 vs control. > > 3) What would the p-values for each comparison be? As pvalsGLM and > padjGLM gives only one set, and I don't think it will be the same for both > comparisons. > > > The p-values and adjusted p-values here are from a likelihood ratio test of a model including the condition and pair variables vs a model which only includes the pair variable. However, we can provide tests of significance of each treatment from a model including both now in the DESeq2 package. We tried to simplify the interface, so it shouldn't take too much time to set up the same analysis. Note that the methods have been updated from DESeq, and these changes are described in the vignette in Appendix A. To use DESeq2, you would create a DESeqDataSet and extract the results (log2 fold changes, p-values and adjusted p-values) for each treatment, with something like: library(DESeq2) dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~ pair + condition) dds <- DESeq(dds) resTrt1 <- results(dds, "condition_Trt1_vs_Cont") resTrt2 <- results(dds, "condition_Trt2_vs_Cont") Mike [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
Dear Michael, Thank you for your quick response, I shall try what you say after upgrading R and relevant packages. Many Thanks, Natasha -- From: Michael Love [mailto:michaelisaiahlove@gmail.com] Sent: 14 May 2013 14:39 To: Natasha Sahgal Cc: bioconductor@r-project.org Subject: Re: [BioC] DESeq multifactor design paired analysis Hi Natasha, On Tue, May 14, 2013 at 1:30 PM, Natasha Sahgal <nsahgal@well.ox.ac.uk<mailto:nsahgal@well.ox.ac.uk>> wrote: >head(fit1) (Intercept) pair3 conditionTrt1 conditionTrt2 deviance converged gene0 10.188336 -0.30184662 0.1698471 0.188197344 0.0741656 TRUE gene1 9.992372 -0.32221027 0.1415406 0.001119304 0.1568869 TRUE gene10 6.893693 0.06795915 -0.4954365 -0.270402107 0.9072335 TRUE gene100 6.477847 0.04684995 0.1250743 -0.094116036 2.0611685 TRUE gene1002 4.463446 0.15038277 0.3186187 0.106512332 0.3701619 TRUE gene1003 4.090079 -0.05247162 -0.3560189 -0.054765642 1.9631768 TRUE >pvalsGLM = nbinomGLMTest( fit1, fit0 ) >padjGLM = p.adjust(pvalsGLM, method="BH" ) --------- Based on the above how do I extract: 1) Trt1 vs Cont? - is this the 3rd column of fit1, depicted as log2 FoldChange? 2) Trt2 vs Cont? - is this the 4th column of fit1, depicted as log2 FoldChange? Yes, you are right. The 3rd and 4th columns of fit1 are the log2 fold changes for trt1 vs control and trt2 vs control. 3) What would the p-values for each comparison be? As pvalsGLM and padjGLM gives only one set, and I don't think it will be the same for both comparisons. The p-values and adjusted p-values here are from a likelihood ratio test of a model including the condition and pair variables vs a model which only includes the pair variable. However, we can provide tests of significance of each treatment from a model including both now in the DESeq2 package. We tried to simplify the interface, so it shouldn't take too much time to set up the same analysis. Note that the methods have been updated from DESeq, and these changes are described in the vignette in Appendix A. To use DESeq2, you would create a DESeqDataSet and extract the results (log2 fold changes, p-values and adjusted p-values) for each treatment, with something like: library(DESeq2) dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~ pair + condition) dds <- DESeq(dds) resTrt1 <- results(dds, "condition_Trt1_vs_Cont") resTrt2 <- results(dds, "condition_Trt2_vs_Cont") Mike [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 1028 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6