No significant p-values
Guest User ★ 13k
@guest-user-4897
Last seen 10.3 years ago
Hello,

I have constructed the following dataset for analysis using DESeq2:

    class: DESeqDataSet
    dim: 57396 10
    exptData(0):
    assays(1): counts
    rownames(57396): ENSG00000223972 ENSG00000227232 ... ENSG00000210195 ENSG00000210196
    rowData metadata column names(0):
    colnames(10): 1 2 ... 10 11
    colData names(1): condition

    > colData(ddsHTSeq)
    DataFrame with 10 rows and 1 column
       condition
        <factor>
    1         na
    2         na
    3  Resistant
    4         na
    5  Resistant
    6  Resistant
    7         na
    8         na
    10 Sensitive
    11 Sensitive

I am interested in the differential expression between the drug resistant and sensitive samples ('na' are control samples).

I've clustered the samples and plotted a PCA as described in the vignette. However, in each of these plots the samples do not cluster by their drug sensitivity but are distributed across the plot. I don't have any more information about the samples with which to model any potential covariates.

I was wondering if there were any pointers as to how I could extract some useful meanings from these data please? As might be expected, when I try a DESeq on these data I get no significant p-values.
Thanks in advance,
Dave

-- output of sessionInfo():

    R version 3.1.0 (2014-04-10)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods
    [8] base

    other attached packages:
     [1] pasilla_0.4.0           matrixStats_0.8.14      gplots_2.13.0
     [4] vsn_3.32.0              Biobase_2.24.0          DESeq2_1.4.5
     [7] RcppArmadillo_0.4.300.0 Rcpp_0.11.1             GenomicRanges_1.16.3
    [10] GenomeInfoDb_1.0.2      IRanges_1.22.7          BiocGenerics_0.10.0

    loaded via a namespace (and not attached):
     [1] affy_1.42.2           affyio_1.32.0         annotate_1.42.0
     [4] AnnotationDbi_1.26.0  BiocInstaller_1.14.2  bitops_1.0-6
     [7] caTools_1.17          DBI_0.2-7             DESeq_1.16.0
    [10] gdata_2.13.3          genefilter_1.46.1     geneplotter_1.42.0
    [13] grid_3.1.0            gtools_3.4.0          KernSmooth_2.23-12
    [16] lattice_0.20-29       limma_3.20.4          locfit_1.5-9.1
    [19] preprocessCore_1.26.1 RColorBrewer_1.0-5    R.methodsS3_1.6.1
    [22] RSQLite_0.11.4        splines_3.1.0         stats4_3.1.0
    [25] survival_2.37-7       tcltk_3.1.0           tools_3.1.0
    [28] XML_3.98-1.1          xtable_1.7-3          XVector_0.4.0
    [31] zlibbioc_1.10.0

--
Sent via the guest posting facility at bioconductor.org.
DESeq DESeq • 2.0k views
@mikelove
Last seen 2 days ago
United States
Hi Dave,

If you don't find a set of genes with low FDR, the experiment may have been underpowered to detect small differences, i.e. the sample size was too small.

Did you compare sensitive vs resistant using the contrast argument to results()? The default comparison is the last level versus the first level of the last variable in the design, but with three groups there are three possible pairs.

Mike
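To make the point about the contrast argument concrete, here is a minimal sketch in base R. It rebuilds the `condition` factor from the original post by hand; the level order is an assumption (R's default ordering depends on the locale), and the DESeq2 calls are left as comments because they need the full count data in `ddsHTSeq`.

```r
# Rebuild the condition factor from the original post (samples 1..8, 10, 11).
condition <- factor(
  c("na", "na", "Resistant", "na", "Resistant",
    "Resistant", "na", "na", "Sensitive", "Sensitive"),
  levels = c("na", "Resistant", "Sensitive")  # explicit order; an assumption
)

# Three groups give three possible pairwise comparisons.
pairs <- combn(levels(condition), 2)
print(pairs)

# Without a contrast, the comparison is last level vs first level.
lv <- levels(condition)
default_pair <- c(lv[length(lv)], lv[1])
print(default_pair)

# Explicit contrast: variable name, numerator level, denominator level.
contrast <- c("condition", "Sensitive", "Resistant")

# With DESeq2 loaded and the model fitted, this would be used as:
# dds <- DESeq(ddsHTSeq)
# res <- results(dds, contrast = contrast)
```

With the level order assumed here, leaving out the contrast would compare Sensitive against the 'na' controls rather than against Resistant, which is why it is worth passing the contrast explicitly.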
Thanks Mike - yes I used the sensitive vs resistant contrast argument to results()
@sean-davis-490
Last seen 4 months ago
United States
Hi, Dave.

With an n of only 5, you might simply be underpowered to find significant genes, so increasing your sample size might be warranted. You could try using gene set analysis to look for coordinately regulated sets of genes, each with small effects. Alternatively, you could use the p-values for ranking the genes and try to validate a few genes of interest on a larger set of samples using PCR or some other technology.

Sean
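As a rough illustration of both suggestions, the sketch below runs a simple rank-based gene set test and then ranks genes by p-value. Everything in it is simulated: the per-gene statistics, the gene names and the 50-gene set are hypothetical, and the Wilcoxon rank-sum test is only a simplified stand-in for a dedicated gene set analysis package.

```r
set.seed(1)  # simulated data only; nothing here comes from the thread

n_genes <- 1000
# Hypothetical per-gene test statistics (e.g. one Wald statistic per gene).
stat <- rnorm(n_genes)
names(stat) <- paste0("gene", seq_len(n_genes))

# Hypothetical gene set of 50 genes, each with a small upward shift.
gene_set <- names(stat)[1:50]
stat[gene_set] <- stat[gene_set] + 0.5

# Competitive test: do genes in the set tend to rank higher than the rest?
in_set <- names(stat) %in% gene_set
gs_test <- wilcox.test(stat[in_set], stat[!in_set], alternative = "greater")
print(gs_test$p.value)

# Ranking for follow-up validation: smallest per-gene p-values first.
pval <- 2 * pnorm(-abs(stat))
top_candidates <- names(sort(pval))[1:10]
print(top_candidates)
```

The idea is that a set of genes can show a detectable shift as a group even when no single gene in it passes a multiple-testing threshold.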
Lucia Peixoto ▴ 330
@lucia-peixoto-4203
Last seen 10.3 years ago
Hi Dave,

If in your PCA your samples do not cluster by treatment, you likely have some sort of unwanted variation or batch effect masking the effect of the treatment in your data. I am not sure more samples will help.

Have you taken a look at the PC loadings past 1 and 2 to see if there is any PC that captures your treatment? Do you have any positive controls? Are you sure your treatment actually causes measurable differences in gene expression?

The only thing I believe will help is RUVSeq:

http://www.bioconductor.org/packages/devel/bioc/html/RUVSeq.html

Lucia

--
Lucia Peixoto PhD
Postdoctoral Research Fellow
Laboratory of Dr. Ted Abel
Department of Biology
School of Arts and Sciences
University of Pennsylvania

"Think boldly, don't be afraid of making mistakes, don't miss small details, keep your eyes open, and be modest in everything except your aims."
Albert Szent-Gyorgyi
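One way to look past PC1 and PC2 is to run the PCA directly with base R's `prcomp`. The sketch below is only an illustration: the expression matrix is simulated random noise standing in for a transformed count matrix, and the group labels are copied from the original post.

```r
set.seed(42)  # simulated stand-in for a transformed expression matrix

group <- factor(c("na", "na", "Resistant", "na", "Resistant",
                  "Resistant", "na", "na", "Sensitive", "Sensitive"))

# 500 hypothetical genes x 10 samples on a log-like scale.
expr <- matrix(rnorm(500 * 10), nrow = 500,
               dimnames = list(NULL, as.character(c(1:8, 10, 11))))

# prcomp expects samples in rows, so transpose the genes x samples matrix.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# Percent of total variance explained by each component.
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)
print(round(pct_var, 1))

# Plot PC1 against each of PC2..PC4, coloured by group.
op <- par(mfrow = c(1, 3))
for (k in 2:4) {
  plot(pca$x[, 1], pca$x[, k], col = as.integer(group), pch = 19,
       xlab = "PC1", ylab = paste0("PC", k))
}
par(op)
```

With real data, `expr` would be replaced by the transformed counts; a component on which the Resistant and Sensitive samples separate is a candidate for carrying the treatment effect.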
Hi,

> The only think I believe will help is RUVSeq:
>
> http://www.bioconductor.org/packages/devel/bioc/html/RUVSeq.html

Not the only thing ... this is slightly different, but also something to keep an eye on "in this context" (i.e. removing nuisance effects):

svaseq: removing batch effects and other unwanted noise from sequencing data
http://biorxiv.org/content/early/2014/06/25/006585

Thank you for bringing my attention to RUVSeq, though, as I haven't seen it before.

HTH,
-steve

--
Steve Lianoglou
Computational Biologist
Genentech
Thanks Lucia; I've checked that PCs 1 and 2 capture 55% and 15% of the total variance, respectively. Could you explain how, if I did find that the treatment effect was present in another PC, that would help me please? I don't have any positive control because it's an experiment to characterise a response to a drug treatment.

Thanks,
Dave
Hi Dave,

I assume you have only plotted PC1 vs PC2. You can make the same type of plot for PC1 vs PC3, PC1 vs PC4 and so on, to see if any PC captures the grouping by treatment. This is regardless of how much variance each PC explains. I usually don't use DESeq to do the PCA plots, so I am not sure how you would do this within DESeq.

I do understand that you are "characterizing" a response to a drug, but your underlying assumption is that part of that response is differences in gene expression that can be observed at the time point you are measuring. It could simply be that the differences between being drug resistant and sensitive have nothing to do with gene expression differences at the steady state, and that's why you don't get any significant p-values. Positive controls assure you that there are differences you can measure.

Have you plotted the p-value distribution? You can find how to do it in the Nature Protocols tutorial:

http://www.nature.com/nprot/journal/v8/n9/full/nprot.2013.099.html

Lucia
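For the p-value distribution, a histogram is usually enough. The sketch below uses simulated p-values only (a hypothetical mix of null genes and genes with a real effect), so it shows the shape to look for rather than anything from this dataset; with real results the raw p-value column would take the place of `pvals`.

```r
set.seed(7)  # simulated p-values only; not results from the thread

# Hypothetical mixture: 9000 null genes (uniform p-values) plus
# 1000 genes with a real effect (p-values concentrated near zero).
pvals <- c(runif(9000), rbeta(1000, shape1 = 0.5, shape2 = 10))

# 20 equal-width bins on [0, 1]; a flat histogram with a peak near
# zero suggests there is signal on top of well-behaved null tests.
h <- hist(pvals, breaks = seq(0, 1, length.out = 21),
          main = "p-value distribution", xlab = "p-value")

# Benjamini-Hochberg adjustment, the default method for DESeq2's padj.
padj <- p.adjust(pvals, method = "BH")
print(sum(padj < 0.1))
```

A histogram that is flat all the way across suggests there is little detectable signal, while a strongly skewed or humped shape can point to a problem with the model or to unmodelled variation.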