(no subject)
Paul Evans ▴ 180
@paul-evans-2716
Last seen 10.4 years ago
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20080409/0974cdc6/attachment.pl
rgentleman ★ 5.5k
@rgentleman-7725
Last seen 9.7 years ago
United States
Hi,

Paul Evans wrote:
> Hi Robert,
>
> Two questions.
>
> First, does that mean that I will be able to use the org.XX packages and
> KEGG if I download the GOstats package from the devel download page in
> bioconductor (instead of the release version)? Alternatively, is there

Yes, but you need to use the release candidate for R 2.7.0. In about two weeks R 2.7.0 will be released, so you may want to wait, and shortly after that BioC 2.2 will come out, and at that time all of this will work, in the "new" release branches.

> any way I can get the hyperG test to function with a set of Entrez IDs
> only (for example, if I get data from SMD I will not have the chip
> details but only entrez ids).

Yes, of course, but then you are not using KEGG or GO or any of those things for your gene sets, unless you do the mapping for them. You should be able to use Entrez IDs from the SMD together with the org.Sc.sgd.db, by simply restricting attention to those that are contained in the org.Sc.sgd.db package.

> Second, I tried the same test with several affy and agilent arrays. For
> the code given below the 'hgug4110b' package returned the same error. I
> am reproducing the code below:
>
> ---------------------------------------------------------------------------
> ### TEST HYPERGTEST FOR AFFY/AGILENT CHIPS##
> rm(list = ls())
> library("hgug4110b")
> library("KEGG.db")
> library("GOstats")
>
> chips <- c("hgug4110b")
> pvalue <- 1
> for(i in 1:length(chips)){
>   y <- get(paste(chips[i],"ENTREZID",sep=''))
>   print(chips[i])
>   xx <- as.list(y)
>   # Remove probe identifiers that do not map to any ENTREZID
>   xx <- xx[!is.na(xx)]
>   if(length(xx) > 0){
>     # The ENTREZIDs for the first two elements of XX
>     xx[1:2]
>     # Get the first one
>     xx[[1]]
>   }
>   allGenes <- unique(unlist(xx))
>   geneUniverse <- allGenes[1:7000]
>   set.seed(37688)
>   ## Create random cluster of 13 genes
>   geneCluster <- sample(1:7000,13,replace=F)
>   geneCluster <- unique(unlist(geneUniverse[geneCluster]))
>   print(geneCluster)
>   paramsGO <- new("GOHyperGParams", geneIds = geneCluster,
>     universeGeneIds = geneUniverse, annotation = chips[i],
>     ontology = "BP",
>     pvalueCutoff = pvalue, conditional = FALSE, testDirection = "over")
>
>   paramsKEGG <- new("KEGGHyperGParams", geneIds = geneCluster,
>     universeGeneIds = geneUniverse, annotation = chips[i],
>     pvalueCutoff = pvalue, testDirection = "over")
>   #tryCatch(hgOverGO <- hyperGTest(paramsGO),error = function(e) {print('error GO')})
>   tryCatch(hgOverKEGG <- hyperGTest(paramsKEGG),error = function(e) {print('error KEGG')})
> }
> ---------------------------------------------------------------------------
>
> The output I get is:
>
> [1] "hgug4110b"
> [1] "4644" "55630" "BX647822" "9933" "79016" "5774" "7274" "6331" "51249" "55515" "AK096394" "28299" "AF116641"
> [1] "error KEGG"
>
> i.e. for this chip I get the same error ("Error in numW - numWdrawn :
> non-numeric argument to binary operator"). Am I doing something wrong?

No you are not doing anything wrong, there is a bug. You will need to either wait for the next release (about 3 weeks), or use the devel versions of everything.

best wishes
Robert

> regards.
>
> Hi Paul,
> Thanks for the report. Please, if you use sample also set a seed,
> otherwise your example is not reproducible.
>
> The short answer is that you cannot use KEGG with the org.XX packages
> in release. Based on your report I have modified the Category package
> (which is doing most of the work), so that this now should work in the
> devel branch, and that change should propagate in the next day or so to
> the web (version 2.5.9).
>
> best wishes
> Robert
>
> Paul Evans wrote:
>> Thanks Robert. I tried the KEGG.db package and tried the
>> KEGGHyperGParams again. The code I used is:
>>
>> -----------------------------------------------------------------------------
>> ############ TEST hyperGTest for HOMO SAPIENS ######
>> library("KEGG.db")
>> library("GOstats")
>> library("org.Hs.eg.db")
>>
>> x <- org.Hs.egACCNUM
>> # Get the entrez gene identifiers that are mapped to an ACCNUM
>> mapped_genes <- mappedkeys(x)
>> geneUniverse <- mapped_genes[1:1200]
>>
>> ## Create random cluster of 13 genes
>> geneCluster <- sample(1:1200,13,replace=F)
>> geneCluster <- unique(unlist(geneUniverse[geneCluster]))
>>
>> print(geneCluster)
>>
>> paramsGO <- new("GOHyperGParams", geneIds = geneCluster,
>>   universeGeneIds = geneUniverse, annotation = "org.Hs.eg.db",
>>   ontology = "BP",
>>   pvalueCutoff = 1, conditional = FALSE, testDirection = "over")
>>
>> paramsKEGG <- new("KEGGHyperGParams", geneIds = geneCluster,
>>   universeGeneIds = geneUniverse, annotation = "org.Hs.eg.db",
>>   pvalueCutoff = 1, testDirection = "over")
>>
>> tryCatch(hgOverGO <- hyperGTest(paramsGO),error = function(e) {print('error GO')})
>> tryCatch(hgOverKEGG <- hyperGTest(paramsKEGG),error = function(e) {print('error KEGG')})
>> -----------------------------------------------------------------------------
>>
>> The output/error I got now is:
>>
>> [1] "901" "599" "435" "100" "1525" "25" "204" "1159" "865" "1195" "1629" "912" "998"
>>
>> Error in get(paste(lib, name, sep = "")) :
>>   no function to return from, jumping to top level
>> [1] "error KEGG"
>>
>> My sessionInfo() is:
>>
>> > sessionInfo()
>> R version 2.6.2 (2008-02-08)
>> i386-pc-mingw32
>>
>> locale:
>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] splines tools stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>>  [1] org.Hs.eg.db_2.0.2 GOstats_2.4.0 Category_2.4.0 genefilter_1.16.0 survival_2.34 RBGL_1.14.0 annotate_1.16.1
>>  [8] xtable_1.5-2 GO.db_2.0.2 graph_1.16.1 KEGG.db_2.0.2 AnnotationDbi_1.0.6 RSQLite_0.6-8 DBI_0.2-4
>> [15] Biobase_1.16.3
>>
>> loaded via a namespace (and not attached):
>> [1] cluster_1.11.10
>>
>> My apologies if I have missed something elementary!
>>
>> thanks!
>>
>> ----- Original Message ----
>> From: Robert Gentleman <rgentlem at fhcrc.org>
>> To: Paul Evans <p.evans48 at yahoo.com>
>> Cc: Bioconductor at stat.math.ethz.ch
>> Sent: Monday, March 31, 2008 3:45:11 PM
>> Subject: Re: [BioC] GOstats - hyperGTest using "KEGGHyperGParams"
>>
>> Hi Paul,
>> Thanks for the bug report, it seems that there is an issue when all
>> values are zero, which shows up intermittently. You can solve it by
>> using try or tryCatch around the call to hyperGTest. You can simply use
>> a p-value of 1, which is what it will be.
>>
>> You should not be loading the GO package for this (KEGG if anything, and
>> even then, please use KEGG.db, not KEGG).
>>
>> I will fix the bug, but given how close the release is I won't back
>> port it, and it will only be available in the devel branch (soon to be
>> the release branch),
>>
>> best wishes
>> Robert
>>
>> Paul Evans wrote:
>>> Hi all,
>>>
>>> I was trying to understand the hyperGTest for KEGG, and used the
>>> following code:
>>>
>>> -----------------------------------------------------------------------------
>>> ## TEST HYPERGTEST FOR KEGG
>>>
>>> library("YEAST")
>>> library("GOstats")
>>> library("GO")
>>>
>>> # Convert to a list
>>> xx <- as.list(YEASTGENENAME)
>>> # Remove probes that do not map to any GENENAME
>>> xx <- xx[!is.na(xx)]
>>> if(length(xx) > 0){
>>>   # Gets the gene names for the first five probe identifiers
>>>   xx[1:5]
>>>   # Get the first one
>>>   xx[[1]]
>>> }
>>>
>>> ## Create gene universe
>>> allGenes <- names(xx)
>>> print(length(allGenes))
>>> geneUniverse <- allGenes[1:800]
>>> for(i in 1:20){
>>>   ## Create random cluster of 13 genes
>>>   geneCluster <- sample(1:800,13,replace=F)
>>>   geneCluster <- geneUniverse[geneCluster]
>>>   print(i)
>>>   print(geneCluster)
>>>   params <- new("KEGGHyperGParams", geneIds = geneCluster,
>>>     universeGeneIds = geneUniverse, annotation = "YEAST",
>>>     pvalueCutoff = 0.1, testDirection = "over")
>>>   hgOver <- hyperGTest(params)
>>>   dfrm <- summary(hgOver)
>>>   #print(dfrm)
>>> }
>>> -----------------------------------------------------------------------------
>>>
>>> The output/error that I got is:
>>>
>>> [1] 1
>>> [1] "YKR067W" "MOF9" "YDR518W" "YPR074C" "YCL011C" "YCR069W" "YDL104C" "YGR136W" "YAR003W" "YFR013W" "YOR116C" "YDR507C" "YGR167W"
>>> [1] 2
>>> [1] "YJR112W" "CEN8" "YPL005W" "YHR081W" "YLR323C" "YBR131W" "YLR347C" "YHR098C" "YOR107W" "YCL027W" "YNR012W" "CRL16" "YLR329W"
>>> [1] 3
>>> [1] "YNL327W" "YEL056W" "YNL321W" "YDL111C" "YMR284W" "YLR338W" "YPL008W" "CRL17" "YEL065W" "YFR027W" "YMR269W" "YPL019C" "YML038C"
>>> Error in numW - numWdrawn : non-numeric argument to binary operator
>>>
>>> My sessionInfo():
>>>
>>> > sessionInfo()
>>> R version 2.6.2 (2008-02-08)
>>> i386-pc-mingw32
>>> locale:
>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>> attached base packages:
>>> [1] splines tools stats graphics grDevices utils datasets methods base
>>> other attached packages:
>>>  [1] KEGG_2.0.1 GOstats_2.4.0 Category_2.4.0 genefilter_1.16.0 survival_2.34 RBGL_1.14.0 GO.db_2.0.2
>>>  [8] graph_1.16.1 goTools_1.10.0 annotate_1.16.1 xtable_1.5-2 AnnotationDbi_1.0.6 RSQLite_0.6-8 DBI_0.2-4
>>> [15] Biobase_1.16.3 GO_2.0.1 hu6800_2.0.1 hgu95a_2.0.1 hgu95av2_2.0.1 hgu133plus2_2.0.1 hgu133b_2.0.1
>>> [22] hgu133a_2.0.1 som_0.3-4 YEAST_2.0.1 cluster_1.11.10
>>>
>>> thanks!
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> --
>> Robert Gentleman, PhD
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M2-B876
>> PO Box 19024
>> Seattle, Washington 98109-1024
>> 206-667-7700
>> rgentlem at fhcrc.org

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org
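
The workaround suggested in the thread (wrap the test in tryCatch, treat a failure as a p-value of 1, set a seed before sample, and restrict the IDs to those present in the annotation) can be sketched in base R. This is only a minimal sketch under stated assumptions, not the GOstats or Category implementation: safe_pvalue and over_rep_pvalue are hypothetical helpers, the gene IDs and the gene set are made up, and phyper stands in for hyperGTest on a single gene set so that the example runs without any Bioconductor packages.

```r
# Hypothetical helper (not part of GOstats): run a test and fall back to
# a p-value of 1 if it raises an error, as suggested in the thread.
safe_pvalue <- function(test_fun, ...) {
  tryCatch(
    test_fun(...),
    error = function(e) {
      message("test failed, using p-value 1: ", conditionMessage(e))
      1
    }
  )
}

# Hypothetical stand-in for hyperGTest on one gene set: the
# over-representation p-value from the hypergeometric distribution.
over_rep_pvalue <- function(cluster, category, universe) {
  universe <- unique(universe)
  # Restrict attention to IDs that are present in the universe.
  cluster  <- intersect(cluster, universe)
  category <- intersect(category, universe)
  hits <- length(intersect(cluster, category))
  # P(X >= hits) when drawing length(cluster) genes from the universe.
  phyper(hits - 1,
         length(category),
         length(universe) - length(category),
         length(cluster),
         lower.tail = FALSE)
}

set.seed(37688)                          # fix the seed so sample() is reproducible
gene_universe <- as.character(1:7000)    # made-up IDs, for illustration only
gene_cluster  <- sample(gene_universe, 13)
gene_category <- gene_universe[1:100]    # made-up gene set

# Assign the result of the wrapper, so the variable exists even on failure.
p_ok  <- safe_pvalue(over_rep_pvalue, gene_cluster, gene_category, gene_universe)
p_err <- safe_pvalue(function() stop("non-numeric argument to binary operator"))
print(c(p_ok = p_ok, p_err = p_err))
```

Assigning the value returned by the wrapper, rather than assigning inside tryCatch as in the posted code, means the result variable is always defined after the call, whether or not the test failed.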