globaltest question


mike Ad. ▴ 30

@mike-ad-1876

Last seen 10.2 years ago

Dear list,

I am new to the "globaltest" package (version "4.2.0"). I have 10 mouse arrays from two groups (control and treated), and I tested them against all the KEGG pathways. The result looks strange to me: among the 171 pathways tested, most have an identical p-value, and that p-value is the smallest one. The code I used is listed below. Could someone tell me what went wrong with my code?

Thanks!
/Mike

    kegg <- as.list(mouse4302PATH2PROBE)
    gtkegg <- globaltest(affy_expression, diagno, kegg)
    ## The first argument, "affy_expression", is the affy expression data set I got
    ## by using the function "exprs()"; each row is one affy probe and each column
    ## is from one array.
    ## The second argument, "diagno", is a vector containing 10 group names
    ## ("treated" or "control") for the 10 arrays, in the same order as the
    ## 10 columns of the expression data.
    gtkegg <- sort(gtkegg)
    # Just list the top 5 of the result; the p-values are identical. What's wrong?
    gtkegg[1:5]

    Global Test result:
    Data: 10 samples with 45101 genes; 5 pathways tested
    Model: logistic
    Method: All 210 permutations

          Genes Tested Statistic Q Expected Q sd of Q   P-value
    00623    12     12      37.552     9.1318  9.2135 0.0047619
    00440    47     47      13.010     3.3143  1.7585 0.0047619
    00624    43     43      57.812     9.1819  8.2350 0.0047619
    00625    19     19      71.404    12.6820 10.4620 0.0047619
    00626    28     28      15.648     3.5587  2.2039 0.0047619

Pathways probe affy • 2.4k views

ADD COMMENT • link written 18.2 years ago by mike Ad. ▴ 30


Oosting, J. PATH ▴ 550

@oosting-j-path-412

Last seen 10.2 years ago

On small groups, globaltest automatically uses permutation tests. You can see in your result that in this case there are 210 permutations, so all p-values are multiples of 1/210, with a minimum of 1/210 (= 0.0047619). You can force it to use a more continuous scale with the argument method="asymptotic".

Jan

-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch On Behalf Of mike Ad.
Sent: Tuesday 12 September 2006 16:23
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] globaltest question

> The result looks strange to me because among the 171 pathways tested, most of them have the identical p-value. And that p-value is the smallest.
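The count of 210 is what an exhaustive enumeration of group relabelings gives when 4 of the 10 arrays are in one group and 6 in the other. The thread does not state the group sizes, so the 4-versus-6 split below is an assumption inferred from the number 210 (a 5-versus-5 split would give 252). A minimal base-R sketch of the arithmetic:

```r
# Assumption: 4 arrays in one group and 6 in the other (not stated in the thread).
n_perm <- choose(10, 4)          # distinct ways to assign the group labels: 210
min_p  <- 1 / n_perm             # smallest p-value an exhaustive permutation test can return

# Every attainable p-value is a multiple of 1 / n_perm.
attainable <- (1:n_perm) / n_perm

n_perm                           # 210
signif(min_p, 5)                 # 0.0047619
head(attainable, 3)
```

Any pathway whose observed statistic is the most extreme among all 210 relabelings lands on this same minimum, which is why so many rows share it.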

ADD COMMENT • link 18.2 years ago Oosting, J. PATH ▴ 550


Hi,

Thanks for the reply! I have two follow-up questions:

1. Is it OK to use globaltest for small groups? (10 arrays in total for 2 groups in my case.)
2. Different methods ("auto", "asymptotic", ...) in globaltest give different p-values. How should one set the threshold to pick out the significant pathways?

Thanks,
/Mike

>From: "Oosting, J. (PATH)" <j.oosting at lumc.nl>
>To: "mike Ad." <mikeaddr at hotmail.com>, <bioconductor at stat.math.ethz.ch>
>Subject: RE: [BioC] globaltest question
>Date: Tue, 12 Sep 2006 16:42:25 +0200
>
>You can force it to use a more gliding scale by using
>the argument method="asymptotic".

ADD REPLY • link 18.2 years ago mike Ad. ▴ 30


Dear Mike,

The permutation version of globaltest is safe but conservative for small sample sizes. It can always be used, even in small groups, but it is not so useful if you want to test many pathways, because a Bonferroni or FDR correction may leave you with no significant pathways at all due to the conservatism of the permutation test.

In that case you may therefore want to use the asymptotic version. This is like using the t-test for small samples when you are not completely sure that the data are normally distributed, so some care should be taken when interpreting the results. But for mining pathways for strong association with your phenotype this works quite well. Except in unusual situations, the most asymptotically significant pathways will also have the smallest possible permutation p-value.

For multiple testing correction, see the gt.multtest function.

Jelle

-----Original Message-----
From: mike Ad. [mailto:mikeaddr at hotmail.com]
Sent: Tuesday 12 September 2006 20:57
To: Oosting, J. (PATH); bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] globaltest question

> 1. It is ok to use globaltest for small groups? (totally 10 arrays for 2 groups in my case.)
> 2. Different method ("auto", "asymptotic"...) in the globaltest gives different p-values, how should one set threshold to pick out the significant pathways?
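To make the conservatism concrete, here is a small base-R sketch that applies a Bonferroni correction to the five permutation p-values from the original post, taking the number of tests to be the 171 pathways mentioned there. It uses p.adjust() from the stats package as a stand-in; the interface of gt.multtest is not shown in this thread, so it is not used here.

```r
# The five permutation p-values shown in the original post (all equal to 1/210).
p_perm <- rep(1 / 210, 5)
names(p_perm) <- c("00623", "00440", "00624", "00625", "00626")

n_tests <- 171                                    # pathways tested, per the original post

# Bonferroni: multiply each p-value by the number of tests (capped at 1).
p_bonf <- p.adjust(p_perm, method = "bonferroni", n = n_tests)

# Equivalent view: the per-test cutoff at a family-wise level of 0.05.
bonf_cutoff  <- 0.05 / n_tests
any_can_pass <- (1 / 210) <= bonf_cutoff          # can the smallest attainable p-value pass?

round(p_bonf, 4)
bonf_cutoff
any_can_pass
```

With only 210 permutations the smallest attainable p-value is larger than the Bonferroni cutoff, so no pathway can pass regardless of how strong its signal is. An FDR-style correction (for example method = "BH") depends on how many pathways tie at the minimum, so its outcome can differ.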

ADD REPLY • link 18.2 years ago Goeman, J.J. MSTAT ▴ 150


Dear list,

I am using Rgraphviz ("1.10.0") to lay out a graph. I would like to change the positions of some nodes (the node centers). But it seems the node center is not a node attribute, and I cannot set it in the "nodeAttrs" argument of "agopen".

The reason for doing this is that when I plot the graph, the warning message says "zero-length arrow is of indeterminate angle and so skipped". I checked the resulting graph and found that some nodes are very close together and the arrows between them were not drawn. So I would like to change the node positions for them and replot the graph.

Could someone help with this?

Thanks,
/Mike

ADD REPLY • link 18.2 years ago mike Ad. ▴ 30


I don't think you can specify the node position. The node position is calculated by the selected layout algorithm. You could try making a bigger plot, or plotting the graph part by part.

Li

> I would like to change the positions of some nodes (node center). But it seems
> the node center is not an attribute for the node, and I can not set them in the
> "nodeAttrs" in the "agopen".
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor

ADD REPLY • link 18.2 years ago Li.Long@isb-sib.ch ▴ 510


Oosting, J. PATH ▴ 550

@oosting-j-path-412

Last seen 10.2 years ago

An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20060912/a226da50/attachment.pl

ADD COMMENT • link 18.2 years ago Oosting, J. PATH ▴ 550


Hi,

I am trying to use GOstats (1.6.0) for GOSlim analyses, but have just realised that several of my selected Slim terms (chosen via AmiGO, http://www.godatabase.org/) are not in GOstats.

    > GOTERM$"GO:0031975"
    NULL
    > GOTERM$"GO:0007067"
    GOID = GO:0007067
    Term = mitosis
    Definition = The division of the eukaryotic cell nucleus to produce two
    daughter nuclei that, usually, contain the identical chromosome
    complement to their mother.
    Ontology = BP

I have tried the text search methods as suggested by Nianhua recently, also drew a blank. The unifying feature of all the "failures" is that they have high number GOIDs.

Is there a simple way of finding what the highest GOID in GOstats is? Also, I guess, when might GOstats be updated!?

Many thanks,
al

    > sessionInfo()
    Version 2.3.1 (2006-06-01)
    i386-pc-mingw32

    attached base packages:
    [1] "splines"   "tools"     "methods"   "stats"     "graphics"
    [6] "grDevices" "utils"     "datasets"  "base"

    other attached packages:
     GOstats  Category  hgu95av2      KEGG   multtest genefilter  survival
     "1.6.0"   "1.4.1"   "1.6.5"   "1.6.5"   "1.10.2"   "1.11.7"    "2.28"
      xtable      RBGL  annotate        GO      graph      Ruuid   Biobase
     "1.3-2"   "1.8.1"  "1.10.0"   "1.6.5"   "1.10.6"   "1.10.0" "1.11.17"
      gplots     gdata    gtools   lattice       MASS    statmod       sma
     "2.0.2"   "2.0.2"   "2.0.2" "0.13-10" "7.2-27.1"    "1.2.4"  "0.5.15"
       limma     Hmisc
    "2.7.10"  "3.0-12"

ADD REPLY • link 18.2 years ago Al Ivens ▴ 270


Hi Al,

GOstats depends on the GO package, which is updated at the same time as the other annotation packages. That happens once per stable release, if I'm not mistaken.

My suggestion would be to update your annotations (e.g. KEGG, hgu95av2 and GO), because they're quite out of date: version 1.6.5. The current version for bioC 1.8 is 1.12.0.

Running your query on an up-to-date version of the GO annotation gives me the following:

    > GOTERM$"GO:0031975"@Term
    [1] "envelope"

I think you have a mix of different versions, because your Biobase is version 1.11.17, while the current version for bioC 1.8 is 1.10.1. Strange behavior can sometimes happen when packages from different releases are installed together.

You can update your annotations by sourcing the biocLite script:

    source("http://www.bioconductor.org/biocLite.R")

and then updating the package that you want:

    biocLite('GO')

I strongly suggest that you update all of your annotation packages to the same version, because code generally breaks when your chip annotation, for example, points to a KEGG pathway that the KEGG annotation does not know about.

Hope this helps,

Francois

On Tue, 2006-09-12 at 22:43 +0100, Al Ivens wrote:
> I am trying to use GOstats (1.6.0) for GOSlim analyses, but have just
> realised that several of my selected Slim terms (chosen via AmiGO
> http://www.godatabase.org/) are not in GOstats.
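One way to spot a mixed installation is to list the installed versions of the relevant packages side by side. The sketch below relies only on installed.packages() from the utils package. The helper name pkg_versions is made up for illustration, and the package names passed to it are simply the ones mentioned in this thread.

```r
# Hypothetical helper (not part of any package): look up installed versions by name.
pkg_versions <- function(pkgs) {
  ip <- installed.packages()                      # matrix with one row per installed package
  found <- pkgs %in% rownames(ip)
  out <- setNames(rep(NA_character_, length(pkgs)), pkgs)
  out[found] <- ip[pkgs[found], "Version"]        # packages that are not installed stay NA
  out
}

# Packages named in this thread; any that are not installed show up as NA.
pkg_versions(c("GO", "KEGG", "hgu95av2", "Biobase"))
```

Annotation packages that report different versions from one another, or a core package newer than the release in use, would point to the kind of mix described above.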

ADD REPLY • link 18.2 years ago Francois Pepin ★ 1.3k


Hi Francois,

Thanks for the prompt response. I thought it might be a versioning issue, but I religiously update every Monday AM!

I use:

* Windows GUI Menu
* Melbourne CRAN mirror
* all 4 repositories.

then run "update packages" from the menu. This should do the trick, shouldn't it, normally?

The Biobase version was to make affyPLM and affyIO work with the latest version of RMAExpress.

I shall source("http://www.bioconductor.org/biocLite.R") as you suggest.

One other quick question: once one has done a

    require(mylibraryname)

is there a way of unloading it from RAM, so that an update can be done without restarting the session?

Many thanks,
a

> -----Original Message-----
> From: Francois Pepin [mailto:fpepin at cs.mcgill.ca]
> Sent: 12 September 2006 23:08
> To: Al Ivens
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] GOstats
>
> You can update your annotations using the biocLite
> source("http://www.bioconductor.org/biocLite.R")
> and then updating the package that you want:
> biocLite('GO')

ADD REPLY • link 18.2 years ago Al Ivens ▴ 270


Hi Al,

> Thanks for the prompt response. I thought it might be a versioning
> issue, but I religiously update every Monday AM!
>
> I use:
> * Windows GUI Menu
> * Melbourne CRAN mirror
> * all 4 repositories.
>
> then run update packages from the menu. This should do the trick,
> shouldn't it, normally?

I'm unfamiliar with the Windows GUI menu, but the annotation and the experiment data packages each have a repository of their own. I'm guessing they're not in your list. You might want to add them manually; these are the relevant lines from biocLite.R (after sourcing getBioC and biocinstall):

    ## CRAN-style repositories where we'll look for packages
    repos <- c("bioc", "data/annotation", "data/experiment", "omegahat", "lindsey")
    repos <- paste("http://bioconductor.org/packages/1.8", repos, sep="/")
    repos <- c(repos, "http://cran.fhcrc.org")

This isn't recommended, though, because it is more likely to break when things change. biocLite gets updated to make sure it keeps working as it should (for example by verifying that you have the right R version).

> is there a way of unloading it from RAM, so that an update can be done
> without restarting the session?

You might want to look at ?detach. I am not familiar enough with the R internals to know whether it frees the RAM or not.

Francois
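As a sketch of the ?detach suggestion: the snippet below attaches a package, detaches it with unload = TRUE, and checks the search path before and after. The splines package is used only because it ships with R. This does not show that memory is released, nor that it is enough for a clean update of a package that other loaded packages depend on; restarting R remains the safer route.

```r
library(splines)                                   # attach a package that ships with R

attached_before <- "package:splines" %in% search()

# Remove the package from the search path and try to unload its namespace.
detach("package:splines", unload = TRUE)

attached_after <- "package:splines" %in% search()

attached_before
attached_after
```

If another attached package imports the one being detached, R keeps the namespace loaded and issues a warning, so the unload step is best treated as a best-effort request.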

ADD REPLY • link 18.2 years ago Francois Pepin ★ 1.3k
