Fwd: Conservative results using DEXSeq

Levi Waldron ★ 1.1k

@levi-waldron-3429

CUNY Graduate School of Public Health a…

I have also noticed the kind of p-value histograms that Gu describes in other situations, even when using the same technologies and bioinformatic methods as in situations where it doesn't occur. I am not sure why it happens, but it could have to do with a batch effect that is *not* confounded with the outcome variable?

As an example I'm attaching raw p-value histograms of Cox regressions for each of 14 ovarian cancer datasets, code below. At least one of these has the monotonic increase described. That experiment used the same microarray platform as many of the other datasets (Affy hgu133plus2), but it is the only experiment using microdissected tissues. The point is just that the effect could be magnified for some reason relating to the experiment.

    library(survival)
    library(affy)
    library(curatedOvarianData)

    # Install the pinned survHD version if it is missing or differs
    if( !require("survHD") || packageVersion("survHD") != "0.99.1" ){
        library(devtools)
        install_url("https://bitbucket.org/lwaldron/survhd/downloads/survHD_0.99.1.tar.gz")
    }

    # Build the list of ExpressionSets ('esets') using the package's own scripts
    source(system.file("extdata", "patientselection.config", package = "curatedOvarianData"))
    source(system.file("extdata", "createEsetList.R", package = "curatedOvarianData"))

    # One vector of raw p-values per dataset (column 3 of the rowCoxTests() result)
    pvals <- lapply(esets, function(eset) rowCoxTests(exprs(eset), eset$y)[, 3])

    # One histogram per dataset on a 4 x 4 grid
    png("Cox_p-values.png")
    par(mfrow = c(4, 4))
    for (i in seq_along(pvals))
        hist(pvals[[i]], main = names(pvals)[i], xlab = "raw p-value")
    dev.off()

On Wed, Jul 24, 2013 at 3:55 AM, Simon Anders <anders at embl.de> wrote:

> Hi
>
> On 23/07/13 14:47, Gu [guest] wrote:
>
>> By checking the histogram of raw p-values of exons (NOT genes), I
>> find that it is monotonically increasing from 0 to 1, with relatively
>> few counting bins falling into the bins from 0 to 0.2.
>
> You are right, DEXSeq sometimes tends to be overly conservative, which
> then results in a skewed p value histogram as you describe it. Usually, it
> is, however, only a rather slight skew, and it seems that the performance
> is unusually bad for your specific dataset.
>
> The main reason for the conservative results is the way we estimate
> dispersion. Since the release of DEXSeq, we have made quite some progress
> in improving the dispersion estimation by now using an empirical-Bayes
> shrinkage estimator, and DESeq2 now offers a much better solution, at least
> for gene-level tests. We are working on applying the same changes to
> DEXSeq, and this should solve your issue. I'm afraid, however, that I have
> to ask you for some patience until we are finished with these changes.
>
> Simon

[Attachment scrubbed from the archive: Cox_p-values.png, image/png, 13580 bytes]

Cancer Ovarian DEXSeq DESeq2 • 1.1k views

ADD COMMENT • link updated 11.5 years ago by Wolfgang Huber ★ 13k • written 11.5 years ago by Levi Waldron ★ 1.1k

Wolfgang Huber ★ 13k

@wolfgang-huber-3550

EMBL European Molecular Biology Laborat…

Dear Levi,

Thanks, you are right: batch effects can inflate within-group variation relative to between-group variation, and thus produce p-value distributions that are more concentrated towards 1 than the uniform distribution. Such an effect could play a role in addition to the one that Simon described. In Gu's case, further diagnostics are needed to disentangle and potentially fix the problem.

Best wishes
Wolfgang

On 24 Jul 2013, at 17:06, Levi Waldron <lwaldron.research at gmail.com> wrote:

> I have noticed the kind of p-value histograms that Gu describes in other
> situations also, even using the same technologies and bioinformatic methods
> as other situations where it doesn't occur. I am not sure why it happened,
> but it could have to do with a batch effect that is *not* confounded with
> the outcome variable?
>
> As an example I'm attaching raw p-value histograms of Cox regressions for
> each of 14 ovarian cancer datasets, code below. At least one of these has
> the monotonic increase described. This experiment used the same microarray
> platform as many of the other datasets (Affy hgu133plus2), but is the only
> experiment using microdissected tissues. Point is just that the effect
> could be magnified for some reason relating to the experiment.
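The mechanism discussed in this thread can be illustrated with a small simulation. This is only a sketch under made-up assumptions, not an analysis of Gu's data or of the ovarian cancer datasets: a hypothetical two-group design with 12 samples, 2000 null "genes", and a gene-specific shift between two batches that are balanced across the groups. The sample sizes, the standard deviations, and all names (sim_gene, p_group, p_naive, p_adj) are invented for the illustration, and plain linear models stand in for Cox or DEXSeq tests. Because the batches are balanced, the shift cancels in the group difference but still inflates the residual variance whenever batch is left out of the model.

```r
set.seed(1)

n_genes <- 2000
# Balanced design: each group has three samples from each batch
group <- factor(rep(c("A", "B"), each = 6))
batch <- factor(rep(rep(c("b1", "b2"), each = 3), times = 2))

# One null gene: unit-variance noise plus a gene-specific shift in batch b2.
# There is no true group effect.
sim_gene <- function() {
  shift <- rnorm(1, sd = 2)
  rnorm(12) + ifelse(batch == "b2", shift, 0)
}
expr <- t(replicate(n_genes, sim_gene()))  # genes in rows, samples in columns

# p-value of the group coefficient under a given model formula
p_group <- function(y, formula) {
  d <- data.frame(y = y, group = group, batch = batch)
  coef(summary(lm(formula, data = d)))["groupB", "Pr(>|t|)"]
}

p_naive <- apply(expr, 1, p_group, formula = y ~ group)          # batch ignored
p_adj   <- apply(expr, 1, p_group, formula = y ~ group + batch)  # batch modelled

# Compare the two p-value histograms side by side
par(mfrow = c(1, 2))
hist(p_naive, main = "batch ignored", xlab = "raw p-value")
hist(p_adj, main = "batch in model", xlab = "raw p-value")
```

If the sketch behaves as intended, the left histogram should lean towards 1 while the right one looks roughly flat. Real data are messier: batches are rarely known exactly or perfectly balanced, so this shows only the direction of the effect, not its size.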

ADD COMMENT • link 11.5 years ago Wolfgang Huber ★ 13k
