Is it possible to adopt the CQN normalization method (Hansen, 2012) as
an option of the edgeR function 'calcNormFactors'? And what about the new
shrinkage estimator for dispersion (Wu, 2012), which seems to be better
than the one currently used by edgeR?
Thanks,
-- output of sessionInfo():
any
--
Sent via the guest posting facility at bioconductor.org.
Dear Aec,
Is it possible you mean this paper:
Lund SP, Nettleton D, McCarthy DJ, Smyth GK. Detecting Differential
Expression in RNA-sequence Data Using Quasi-likelihood with Shrunken
Dispersion Estimates.
Stat Appl Genet Mol Biol. 2012 Oct 22;11(5).
doi: 10.1515/1544-6115.1826. PubMed PMID: 23104842.
If not, please give the complete reference to the Wu paper.
Thanks and best wishes,
Rich
Richard A. Friedman, PhD
Associate Research Scientist,
Biomedical Informatics Shared Resource
Herbert Irving Comprehensive Cancer Center (HICCC)
Lecturer,
Department of Biomedical Informatics (DBMI)
Educational Coordinator,
Center for Computational Biology and Bioinformatics (C2B2)/
National Center for Multiscale Analysis of Genomic Networks (MAGNet)
Room 824
Irving Cancer Research Center
Columbia University
1130 St. Nicholas Ave
New York, NY 10032
(212)851-4765 (voice)
friedman@cancercenter.columbia.edu
http://cancercenter.columbia.edu/~friedman/
In memoriam, Ray Bradbury
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
I think he means

A new shrinkage estimator for dispersion improves differential expression
detection in RNA-seq data.
http://www.ncbi.nlm.nih.gov/pubmed/23001152

which outlines an empirical Bayes method to improve estimation of the Gamma
parameters in the Gamma-Poisson (i.e. negative binomial) formulation of the
count model. Other authors have proposed generalized Poisson, beta-binomial,
Laplace mixture models, etc. for similar purposes, and Dr. Smyth has
presented extensive empirical results for the existing edgeR formulations
via Cox-Reid estimation along a sliding scale from "individual" to
"completely shared" variance (Gamma).
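
To make the "sliding scale" idea concrete, here is a minimal Python sketch. It is not the edgeR or DSS estimator; it only assumes a simple linear interpolation on the log scale between each gene's own dispersion and a shared (geometric-mean) value. The function names and the example dispersions are hypothetical.

```python
import math


def nb_variance(mu, dispersion):
    """Variance of a Gamma-Poisson (negative binomial) count with mean mu."""
    return mu + dispersion * mu ** 2


def shrink_dispersions(genewise, weight):
    """Move each gene-wise dispersion toward a shared value on the log scale.

    weight = 0.0 keeps the individual estimates; weight = 1.0 gives every
    gene the same (geometric-mean) dispersion. This interpolation is an
    illustrative assumption, not the estimator of any package.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    logs = [math.log(d) for d in genewise]
    shared = sum(logs) / len(logs)
    return [math.exp((1.0 - weight) * value + weight * shared) for value in logs]


genewise = [0.01, 0.1, 1.0]  # hypothetical gene-wise dispersion estimates
for w in (0.0, 0.5, 1.0):
    print(w, [round(d, 4) for d in shrink_dispersions(genewise, w)])
```

Intermediate weights pull extreme gene-wise values toward the middle, which is the trade-off the different estimators are tuning in their own ways.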
On the other hand, the authors (Hao Wu and Jean Wu, at least) are Hopkins
alumni, if I'm not mistaken, and the corresponding author wrote the SQN
package, so I can't imagine an implementation of DSS for use in edgeR is
too terribly far off. However:

"We find that most of the improvement obtained in DSS is due to the
different dispersion estimate, as passing the DSS estimates to edgeR/DESeq
yields very similar results as in DSS (supplementary material available at
Biostatistics online, Figure S9). Both the edgeR and the DESeq methods have
been expanded to now accommodate multiclass comparisons. Our test is
currently limited to two-class comparison and it is our immediate plan to
extend the dispersion estimators to multifactor designs. With an estimate
of the dispersion, one can use generalized linear models as done in
McCarthy and others (2012)."

The paper is open-access and thus attached. Perhaps the authors (of any of
the above works) can comment.
--
*A model is a lie that helps you see the truth.*
Howard Skipper <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Biostat-2012-Wu-biostatistics_kxs033.pdf
Type: application/pdf
Size: 1118372 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20121126/21c401eb/attachment-0001.pdf>
The supplement mentioned by the authors in their final paragraph is also
attached (here). It is worth at least a glance. I wonder whether the
notoriously anticonservative Wald test is responsible for recovering
information lost to over-shrinkage.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kxs033supp.pdf
Type: application/pdf
Size: 1973404 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20121126/35e5b52a/attachment-0001.pdf>
It doesn't seem that the Wald test is compensating for anything, since
the authors show that putting their dispersion estimates into edgeR
causes it to produce nearly identical results.
Good point -- thank you for catching that.
Tim,
Thanks.
Jean,
Sorry.
I thought aec might have meant Gordon's paper on QuasiSeq, and there
was a paper by Wu and Smyth right after the QuasiSeq one on PubMed.
I looked for a Wu Z paper, but there were so many.
Anyway, any thoughts about how the method cited compares with QuasiSeq
would be appreciated.
Best wishes,
Rich
Anyway, to address the question of using DSS to estimate dispersions and
then plugging those dispersions into an edgeR DGEList object: this should
be perfectly possible, with one LARGE caveat. DSS only supports the
simplest possible experimental design: two classes, unpaired samples. If
your experiment fits this design, you can use DSS to estimate dispersions,
copy those dispersions into a DGEList object, and use edgeR's significance
tests. However, doing so would not necessarily be useful because, as
discussed, the DSS paper shows that it would give basically the same
results as using the waldTest function of DSS.
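
The "plug in someone else's dispersion" idea can be sketched outside of R. The Python below is a simplified illustration, not the DSS waldTest code and not an edgeR test: it assumes equal library sizes, a two-class unpaired design, and a hypothetical function wald_statistic() that accepts the dispersion as a plain input, so any estimator could supply it.

```python
import math


def wald_statistic(group_a, group_b, dispersion):
    """Two-group Wald statistic for one gene under a negative binomial model.

    Assumes equal library sizes, so no normalization factors are applied.
    The dispersion is a plug-in value and may come from any estimator.
    """
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    # The NB variance of a single count is mu + dispersion * mu^2;
    # dividing by the group size gives the variance of the group mean.
    var_a = (mean_a + dispersion * mean_a ** 2) / len(group_a)
    var_b = (mean_b + dispersion * mean_b ** 2) / len(group_b)
    total = var_a + var_b
    if total == 0.0:
        return 0.0  # all counts are zero, so there is no evidence either way
    return (mean_a - mean_b) / math.sqrt(total)


def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


counts_a = [10, 12, 8]   # hypothetical counts, class A
counts_b = [20, 22, 18]  # hypothetical counts, class B
for phi in (0.05, 0.5):  # two hypothetical plug-in dispersions
    z = wald_statistic(counts_a, counts_b, phi)
    print(phi, round(z, 3), round(two_sided_p(z), 4))
```

The same counts give a weaker statistic under the larger dispersion, which is why the choice of dispersion estimate, rather than the test wrapped around it, can dominate the results.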
For normalization with CQN, Kasper (the CQN maintainer) has example code
showing how to import the offset into edgeR
(http://www.bioconductor.org/packages/release/bioc/vignettes/cqn/inst/doc/cqn.pdf).
The new shrinkage estimate provided by DSS can also be used to replace
the edgeR estimate, and Hao Wu (the author of the DSS package) will add
the example code to the vignette (soon hopefully, when he gets back from
vacation).
Whether these two will be included as options of functions within edgeR
can only be determined by the edgeR maintainers (Mark, Davis, Yunshun
and Gordon).
Jean Wu
Dear Anna,
> Date: Mon, 26 Nov 2012 09:30:19 -0800 (PST)
> From: "aec [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, aesteve at pcb.ub.es
> Subject: [BioC] normalization method & dispersion estimation RNA-seq data
>
> Is it possible to adopt the CQN normalization method (Hansen, 2012) as
> an option of the edgeR function 'calcNormFactors' ?
No, it isn't possible, because calcNormFactors() implements scale
normalization methods, and cqn is not of this type.
But why do you need this anyway? The cqn package has always worked with
edgeR, and the cqn package provides code examples of how to do this.
What are you looking for that is not already provided?
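
The distinction between the two kinds of normalization can be illustrated with a small Python sketch. It is a conceptual toy, not code from edgeR or cqn: it assumes that scale normalization amounts to one factor per sample, while an offset-based method supplies a separate value for every gene and sample. All function names and numbers are hypothetical.

```python
import math


def scale_offsets(lib_sizes, norm_factors, n_genes):
    """Log-offset matrix implied by one scaling factor per sample.

    Every gene (row) receives the same offsets, log(lib_size * norm_factor).
    """
    row = [math.log(n * f) for n, f in zip(lib_sizes, norm_factors)]
    return [list(row) for _ in range(n_genes)]


def is_scale_normalization(offsets, tol=1e-12):
    """True if the offset matrix could come from per-sample factors alone."""
    first = offsets[0]
    return all(
        abs(value - ref) <= tol
        for row in offsets
        for value, ref in zip(row, first)
    )


# Hypothetical library sizes and factors for two samples and three genes.
per_sample = scale_offsets([1.0e6, 2.0e6], [1.1, 0.9], n_genes=3)

# Hypothetical gene-specific offsets, as might result from an adjustment
# that differs from gene to gene within the same sample.
gene_specific = [
    [13.9, 14.4],
    [13.7, 14.6],
    [14.0, 14.3],
]

print(is_scale_normalization(per_sample))
print(is_scale_normalization(gene_specific))
```

A matrix whose rows differ cannot be collapsed into a single number per sample, which is why such offsets are passed alongside the counts instead of through a per-sample factor.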
> And the new shrinkage estimator for dispersion (Wu, 2012) that seems to
> be better than the currently used by edgeR ?
It is inevitable that each new paper that is published claims to have the
best method. In our own (unpublished) simulations with the DSS package
that goes with Wu et al (Biostatistics, 2012), we find that it is similar
in performance to DESeq, but worse than BBSeq, PoissonSeq, BaySeq, voom
and edgeR, the latter two being the best. Of course DSS may do better in
other simulation scenarios, and it may have been improved since our
simulations were done in April 2012. I don't expect you to believe this
until we publish our results, but it is not my intention to change the
methods used in edgeR with every new published paper.
Best wishes
Gordon