QQ plot 450k data
@khadeeja-ismail-4711
Hi All,

What would be the right distribution to use for the expected p-values in a QQ plot of results from a 450k analysis? I have been searching, but have not been able to find it mentioned anywhere.

Thanks in advance,
Khadeeja
@james-w-macdonald-5106
Hi Khadeeja,

The distribution of p-values under the null hypothesis should always be uniform.

Best,
Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099

_______________________________________________
Bioconductor mailing list
Bioconductor at r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
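The uniformity claim can be checked with a small simulation. This is only a sketch under assumed settings: the probe count, group sizes, normally distributed data, and the use of `t.test()` are illustrative choices, not anything taken from the 450k analysis discussed in this thread.

```r
# Simulate p-values when the null hypothesis is true for every probe.
set.seed(1)                      # arbitrary seed, for reproducibility
n_probes <- 5000                 # hypothetical number of probes
n_per_group <- 10                # hypothetical samples per group

# Each row is one probe; both groups are drawn from the same distribution.
null_data <- matrix(rnorm(n_probes * 2 * n_per_group), nrow = n_probes)
group <- rep(c(1, 2), each = n_per_group)

null_p <- apply(null_data, 1, function(x) {
  t.test(x[group == 1], x[group == 2])$p.value
})

# For uniform p-values, about 5% fall below 0.05 and about 50% below 0.5.
frac_05 <- mean(null_p < 0.05)
frac_50 <- mean(null_p < 0.50)
print(c(below_0.05 = frac_05, below_0.50 = frac_50))

# A roughly flat histogram is the visual counterpart of the same check.
hist(null_p, breaks = 20, main = "Null p-values", xlab = "p-value")
```

If the test's assumptions hold, the histogram should look roughly flat; a spike near zero would indicate an excess of small p-values.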
Hi Jim,

If (groups of) samples differ in their global methylation profile (i.e. the majority of probes differ significantly in one direction), would you not expect to see a deviation from this uniform distribution?

Best,
Martin

--
M.A. (Martin) Rijlaarsdam MSc. MD
Erasmus MC - University Medical Center Rotterdam
Department of Pathology
Hi all,

I think this depends on what question Khadeeja wants to answer. Traditionally, a QQ plot of the p-values from many statistical tests is used to assess the amount of signal in the data, i.e. an excess of low p-values (though this could be due either to true biological signal or to biases). I assume this is what she wants to assess. The p-values from a genomic study can be compared to the uniform distribution using the following code:

    library("qqman")
    qq(pvals)

where `pvals` is a vector of the p-values. If there is a strong signal in the data, we do expect a deviation from the uniform distribution. I am not aware of a potential use for knowing the exact distribution of these p-values.

John
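For readers without `qqman` installed, the same kind of comparison can be sketched in base R. The helper name `qq_uniform` is hypothetical (it is not part of `qqman` or any other package), the -log10 scale is a common convention rather than a requirement, and `pvals` below is a simulated placeholder to be replaced with the p-values from your own analysis.

```r
# Hypothetical helper: QQ plot of observed p-values against the uniform
# expectation, drawn on the -log10 scale so small p-values stand out.
qq_uniform <- function(pvals, ...) {
  # Keep only valid p-values and sort them in ascending order.
  pvals <- sort(pvals[is.finite(pvals) & pvals > 0 & pvals <= 1])
  n <- length(pvals)
  expected <- ppoints(n)        # expected uniform quantiles, ascending
  plot(-log10(expected), -log10(pvals),
       xlab = "Expected -log10(p)", ylab = "Observed -log10(p)", ...)
  abline(0, 1, col = "red")     # points on this line match the expectation
  invisible(data.frame(expected = expected, observed = pvals))
}

set.seed(2)                     # arbitrary seed, for reproducibility
pvals <- runif(1000)            # placeholder: null p-values for illustration
res <- qq_uniform(pvals)
```

Points that rise above the red line at the right-hand end of the plot correspond to p-values smaller than the uniform distribution would predict.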
Hi Martin,

That's exactly what you would expect to see, which is why you do a QQ plot in the first place. On one axis you plot the expected (uniform) distribution of p-values, and on the other axis you plot the observed p-values. The p-values that deviate from the expectation are then possible true positives.

Best,
Jim
Thank you very much. It was suggested that I use QQ plots to look for biases in two different analyses.

One seems to follow the uniform distribution, while the other seems to deviate substantially. Is this probably due to different global methylation profiles? Can I conclude from it that there is a bias?

Thanks again,
Khadeeja
Hi Khadeeja,

I don't think you can conclude much at all from a QQ plot, other than the obvious: that the observed p-values don't follow the expected distribution. Any number of things could cause that, including systemic differential methylation (e.g., there really might be lots of differences in the methylation between your samples).

Best,
Jim
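To illustrate why a deviation alone does not identify its cause, here is a small simulated comparison. Everything in it is an assumption made for illustration: the effect sizes, the 20% of affected probes, and the batch layout are invented, and neither scenario is meant to represent the analyses discussed in this thread.

```r
set.seed(3)                        # arbitrary seed, for reproducibility
n_probes <- 2000                   # hypothetical number of probes
n_per_group <- 10                  # hypothetical samples per group
group <- rep(c(0, 1), each = n_per_group)

# Per-probe two-group t-test p-values for a probes x samples matrix.
p_from <- function(mat) {
  apply(mat, 1, function(x) t.test(x[group == 0], x[group == 1])$p.value)
}
noise <- function() {
  matrix(rnorm(n_probes * 2 * n_per_group), nrow = n_probes)
}

# Scenario A: real group differences in 20% of the probes.
signal <- noise()
affected <- seq_len(400)           # 20% of the 2000 probes
signal[affected, group == 1] <- signal[affected, group == 1] + 1.5

# Scenario B: no real group differences, but an unmodelled batch effect
# that is partly confounded with group (8 of 10 samples in group 1 are in
# batch 1, versus 2 of 10 in group 0).
batch <- c(rep(0, 8), rep(1, 2), rep(0, 2), rep(1, 8))
batch_shift <- rnorm(n_probes, sd = 1)    # probe-specific batch shift
confounded <- noise() + outer(batch_shift, batch)

# Both scenarios are expected to show more than the nominal 5% of
# p-values below 0.05, so both QQ plots would bend away from the diagonal.
frac_small <- c(signal = mean(p_from(signal) < 0.05),
                confounded = mean(p_from(confounded) < 0.05))
print(frac_small)
```

In this sketch both the real signal and the technical artifact produce an excess of small p-values, which is why the plot by itself cannot tell a bias apart from genuine differences.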
Very helpful. Thank you, Jim, Martin and John.

Best,
Khadeeja