Theoretical Question

James W. MacDonald 67k

@james-w-macdonald-5106

Last seen 1 day ago

United States

Naomi,

The limma package fits an ANOVA with an adjusted denominator, based on an empirical Bayes procedure. Literature describing the procedure can be found here: http://www.statsci.org/smyth/pubs/ebayes.pdf

Best,

Jim

James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

>>> Naomi Altman <naomi@stat.psu.edu> 05/30/04 12:25AM >>>

I would use ANOVA (lm or lme) followed by a contrast. It would likely be better to adjust the denominator (like SAM) but I don't think there is any software for this (or literature on exactly how to do it). So, probably the best thing for now is to treat this as a 1-way ANOVA with say a Bonferroni correction (for each gene). Once you have the Bonferroni-corrected p-values, you use FDR to determine an appropriate p-value to select genes.

--Naomi

At 02:10 PM 5/19/2004 -0400, Luckey, John wrote:

> I posted a similar question last week and received some help with this problem, but I am still a bit unclear on the best way to proceed- any insights would be greatly appreciated.
>
> I want to identify a set of genes that are co-regulated with a given phenotype that is observed across various tissue types -to ID the 'signature' that corresponds to the phenotype regardless of tissue-
>
> Here is the simplest set up: (all data is affymetrix and has been pre-processed/normalized by rma)
>
> Tissue type A has 3 conditions: 1A, 2A, 3A
>
> Type B has 4 conditions: 1B, 2B, 3B, 4B
>
> My phenotype of interest is observed only in 1A and 1B.
>
> I am interested in knowing what is common (both up and down regulated) between 1A (relative only to 2A and 3A) and 1B (relative to 2B, 3B, and 4B). I have varying numbers of replicates per condition (2-5).
>
> I have done unsupervised clustering using all genes, and 1A and 1B don't cluster together (not really surprising since they are quite different in many respects, I am interested only in their overlapping phenotypes). I am not entirely sure how best to proceed.
>
> I have used straight fold change to ID unique genes in 1A vs 2A and 1A vs 3A. I then select those genes up (or down) in 1A in both comparisons. I then look at how the '1A specific' genes are expressed in 1B vs all other B's- and there is a general positive skewing- but the concern is where to draw cutoffs- how to estimate FDR, etc in such a comparison. Basically, how does one go about saying that the skewing in a different comparison of a subset of genes is significant?
>
> Any insights you might have would be appreciated.
>
> Thx
>
> John Luckey, MD PhD
> Clinical Pathology Resident - Brigham and Womens Hospital
> Post Doctoral Fellow - Mathis - Benoist Lab
> Joslin Diabetes Center
> One Joslin Place, Rm. 474
> Boston, MA 02215

Naomi S. Altman
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics
Penn State University
University Park, PA 16802-2111
814-865-3791 (voice)
814-863-7114 (fax)
814-865-1348 (Statistics)

_______________________________________________
Bioconductor mailing list
Bioconductor@stat.math.ethz.ch
https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
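To make the "contrast with an adjusted denominator" idea in this post concrete, here is a minimal Python sketch for a single gene. It compares 1A with the average of 2A and 3A in a one-way layout and shrinks the residual variance toward a prior value using the posterior-variance form from the empirical Bayes paper linked above, s̃² = (d0·s0² + d·s²) / (d0 + d). This is not limma's code: limma estimates the prior degrees of freedom d0 and prior variance s0² from all genes, whereas here they are simply passed in as assumed inputs, and the function names and toy expression values are hypothetical.

```python
from math import sqrt


def group_stats(groups):
    """Return per-group means, pooled residual variance, and residual df.

    groups maps a condition label (e.g. "1A") to a list of log-expression
    values for one gene.
    """
    means = {k: sum(v) / len(v) for k, v in groups.items()}
    rss = sum((x - means[k]) ** 2 for k, v in groups.items() for x in v)
    df = sum(len(v) for v in groups.values()) - len(groups)
    return means, rss / df, df


def moderated_contrast_t(groups, weights, d0, s0_sq):
    """Contrast t-statistic with a variance shrunk toward the prior s0_sq.

    weights maps condition labels to contrast coefficients (summing to 0).
    d0 and s0_sq are ASSUMED prior values; with d0 = 0 this reduces to the
    ordinary ANOVA contrast t-statistic.
    """
    means, s_sq, df = group_stats(groups)
    estimate = sum(w * means[k] for k, w in weights.items())
    # Unscaled variance of the contrast: sum of w_k^2 / n_k.
    v = sum(w ** 2 / len(groups[k]) for k, w in weights.items())
    # Posterior (moderated) variance: weighted average of prior and sample.
    s_tilde_sq = (d0 * s0_sq + df * s_sq) / (d0 + df)
    return estimate / sqrt(s_tilde_sq * v), d0 + df


# Toy data for one gene in tissue A (hypothetical values).
gene = {"1A": [8.1, 8.4, 8.0], "2A": [7.2, 7.0], "3A": [7.1, 7.4, 7.3]}
contrast = {"1A": 1.0, "2A": -0.5, "3A": -0.5}  # 1A vs mean of 2A and 3A

ordinary_t, ordinary_df = moderated_contrast_t(gene, contrast, d0=0, s0_sq=0.0)
moderated_t, moderated_df = moderated_contrast_t(gene, contrast, d0=4, s0_sq=0.05)
print(ordinary_t, ordinary_df)
print(moderated_t, moderated_df)
```

The same function could be applied to tissue B with weights of 1 for 1B and -1/3 for each of 2B, 3B, and 4B. The moderated statistic is referred to a t-distribution with d0 + d degrees of freedom rather than d.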

Microarray GO Clustering Cancer limma • 1.3k views

Updated 20.6 years ago by A.J. Rossini • written 20.6 years ago by James W. MacDonald

A.J. Rossini ▴ 810

@aj-rossini-209

Last seen 10.3 years ago

Some tools that help:

1. limma will do empirical Bayes adjustments for the linear models (ANOVA), so that would be one approach.
2. EBarrays as well (different methodology).
3. There is always siggenes for doing SAM-style analyses within R.

best,
-tony

Naomi Altman <naomi@stat.psu.edu> writes:

> I would use ANOVA (lm or lme) followed by a contrast. It would likely be better to adjust the denominator (like SAM) but I don't think there is any software for this (or literature on exactly how to do it). So, probably the best thing for now is to treat this as a 1-way ANOVA with say a Bonferroni correction (for each gene). Once you have the Bonferroni-corrected p-values, you use FDR to determine an appropriate p-value to select genes.
>
> --Naomi

--
rossini@u.washington.edu http://www.analytics.washington.edu/
Biomedical and Health Informatics, University of Washington
Biostatistics, SCHARP/HVTN, Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC (M/W): 206-667-7025 FAX=206-667-4812 | use Email

Written 20.6 years ago by A.J. Rossini

I know that you can get adjusted denominators for the F-tests from these packages, but what about for the contrasts?

Also, suppose you do something like "multiple comparisons with control" or "all pairwise comparisons". Should you feed the adjusted p-values into FDR, or feed all of your p-values to FDR?

--Naomi
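The two options raised in this reply can be written out explicitly. The Python sketch below does not say which option is preferable; it only shows what each one computes. Option (a) follows the earlier suggestion in the thread: Bonferroni-adjust the contrast p-values within each gene, then apply a Benjamini-Hochberg FDR adjustment across genes. Option (b) pools every p-value from every gene and contrast into one Benjamini-Hochberg adjustment. The choice of Benjamini-Hochberg as the FDR method, the function names, and the toy p-values are all assumptions made for illustration.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def bonferroni_per_gene(p_matrix):
    """One p-value per gene: the smallest contrast p-value times the
    number of contrasts for that gene, capped at 1."""
    return [min(1.0, min(row) * len(row)) for row in p_matrix]


# Rows are genes, columns are contrasts (hypothetical p-values).
p_matrix = [
    [0.001, 0.20],
    [0.030, 0.04],
    [0.500, 0.60],
]

# Option (a): Bonferroni within each gene, then FDR across genes.
option_a = bh_adjust(bonferroni_per_gene(p_matrix))

# Option (b): pool all p-values and apply FDR once.
flat = [p for row in p_matrix for p in row]
option_b = bh_adjust(flat)

print(option_a)
print(option_b)
```

Option (a) yields one adjusted value per gene, while option (b) yields one per gene-contrast pair, so the two answer slightly different questions about what counts as a discovery.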

Reply written 20.6 years ago by Naomi Altman

Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 3.7 years ago

United States

I am not sure who is responsible for the bioconductor web page, but I would find it useful to have these supporting papers listed on the web page along with the Release Packages.

Thanks,

Naomi

At 10:32 AM 6/1/2004 -0400, James MacDonald wrote:

> Naomi,
>
> The limma package fits an ANOVA with an adjusted denominator, based on an empirical Bayes procedure. Literature describing the procedure can be found here: http://www.statsci.org/smyth/pubs/ebayes.pdf
>
> Best,
>
> Jim

Written 20.6 years ago by Naomi Altman

A.J. Rossini ▴ 810

@aj-rossini-209

Last seen 10.3 years ago

Q1: Limma, yes. The others, I should know, sigh...

Q2: Limma will compute an FDR. I need to recall which one though (you'd think I'd remember it....).

--tony

Naomi Altman <naomi@stat.psu.edu> writes:

> I know that you can get adjusted denominators for the F-tests from these packages, but what about for the contrasts?
>
> Also, suppose you do something like "multiple comparisons with control" or "all pairwise comparisons". Should you feed the adjusted p-values into FDR, or feed all of your p-values to FDR?
>
> --Naomi

Written 20.6 years ago by A.J. Rossini
