GOHyperGResult in a data.frame, possible NAMESPACE problem?
Robert Castelo ★ 3.4k
@rcastelo
Last seen 2 days ago
Barcelona/Universitat Pompeu Fabra
dear list,

i have the following GO reporting function that builds a data frame with the output of the call to summary() on a GOHyperGResult object resulting from testing for GO enrichment with the package GOstats, plus the gene symbols and entrez ids of the genes that provide enrichment to each category, all nicely ordered by odds ratio of enrichment. the third argument allows one to produce latex code that highlights genes of interest in bold face, but it can be set to NULL to avoid that:

    GOreport <- function(goHypGresult, chip, highlightedGenes=NULL) {
      cats <- sigCategories(goHypGresult)
      reportGenes <- vector()
      for (i in 1:length(cats)) {
        reportGenes <- append(reportGenes, geneIdsByCategory(goHypGresult, cats[i]))
      }

      reportGeneSyms <- sapply(reportGenes, function(egIDs) {
        syms <- as.vector(unlist(mget(egIDs, getAnnMap(map="SYMBOL", chip=chip, type="db"))))
        syms <- sort(syms)
        syms <- sapply(1:length(egIDs), function(i, egIDs) {
          s <- syms[i]
          if (!is.null(highlightedGenes) && !is.na(match(egIDs[i], highlightedGenes))) {
            s <- sprintf("{\\bf %s}", s)
          }
          s
        }, egIDs)
        paste(syms, collapse=", ")
      })

      reportGenes <- sapply(reportGenes, function(x) {
        paste(x, collapse=",")
      })

      report <- data.frame(summary(goHypGresult), GeneSyms=reportGeneSyms, Genes=reportGenes)
      rownames(report) <- NULL
      report[sort(report$"OddsRatio", decreasing=TRUE, index.return=TRUE)$ix, ]
    }

this function forms part of a package i'm building, and when i use it as follows it gives the error shown below:

    library(annotate)
    library(org.Hs.eg.db)
    library(GOstats)
    library(myPkg) ## this would be the package where GOreport resides

    ## use the genes from the death GO category as test
    deathEGs <- org.Hs.egGO2EG[["GO:0016265"]]

    ## sample 100 genes randomly and add the death genes as universe
    set.seed(123)
    universeEGs <- unique(c(sample(mappedLkeys(org.Hs.egGO2EG), size=100), deathEGs))

    ## test for GO enrichment
    goHypGparams <- new("GOHyperGParams", geneIds=deathEGs,
                        universeGeneIds=universeEGs,
                        annotation="org.Hs.eg.db", ontology="BP",
                        pvalueCutoff=0.05, conditional=TRUE,
                        testDirection="over")
    goHypGcond <- hyperGTest(goHypGparams)

    ## call the problematic function
    report <- GOreport(goHypGcond, "org.Hs.eg.db", NULL)
    Error in do.call("expand.grid", dimnames(x)) :
      second argument must be a list
    > traceback()
    9: stop("second argument must be a list")
    8: do.call("expand.grid", dimnames(x))
    7: data.frame(do.call("expand.grid", dimnames(x)), Freq = c(x),
           row.names = row.names)
    6: eval(expr, envir, enclos)
    5: eval(ex)
    4: as.data.frame.table(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors)
    3: as.data.frame(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors)
    2: data.frame(summary(goHypGresult), GeneSyms = reportGeneSyms,
           Genes = reportGenes)
    1: GOreport(goHypGcond, "org.Hs.eg.db", NULL)

so there is something wrong with the instruction that builds the data.frame at the very end of the function.

however, if i paste the function into the R shell and call it again, it works smoothly:

    GOreport(goHypGcond, "org.Hs.eg.db", NULL)
    1 GO:0016265 7.966717e-05      Inf 0.4500000 4  9
    3 GO:0032501 6.718598e-03      Inf 1.2000000 4 24
    6 GO:0050896 3.279279e-02      Inf 0.3733333 2 14
    2 GO:0007517 5.598832e-03 75.00000 0.1500000 2  3
    4 GO:0007610 1.802312e-02 24.33333 0.2500000 2  5
    5 GO:0048731 1.956272e-02 16.00000 0.7500000 3 15
    7 GO:0009653 4.784456e-02 11.66667 0.4000000 2  8
                                    Term                     GeneSyms
    1                              death AFG3L2, RAG1, SLC18A2, TCF15
    3   multicellular organismal process AFG3L2, RAG1, SLC18A2, TCF15
    6               response to stimulus                 AFG3L2, RAG1
    2                 muscle development                AFG3L2, TCF15
    4                           behavior               SLC18A2, TCF15
    5                 system development          AFG3L2, RAG1, TCF15
    7 anatomical structure morphogenesis                AFG3L2, TCF15
                     Genes
    1 10939,5896,6571,6939
    3 10939,5896,6571,6939
    6           10939,5896
    2           10939,6939
    4            6571,6939
    5      10939,5896,6939
    7           10939,6939

this is really puzzling for me and i suspect that i'm confronted with some general problem regarding namespaces or so, thus these are the contents of the NAMESPACE file from this package myPkg containing the GOreport function:

    exportPattern("^[[:alpha:]]+")
    importFrom(annotate, getAnnMap)
    importFrom(AnnotationDbi, mget)
    importFrom(IRanges, unique)
    importMethodsFrom(GOstats)

and these are the contents of the DESCRIPTION file:

    Package: myPkg
    Type: Package
    Title: What the package does (short line)
    Version: 1.0
    Date: 2010-01-11
    Author: Who wrote it
    Description: More about what it does (maybe more than one line)
    Depends: methods, annotate, Biobase (>= 2.4.1), AnnotationDbi, GOstats, IRanges
    Imports: methods, annotate, Biobase (>= 2.4.1), AnnotationDbi, GOstats, IRanges
    Maintainer: Who to complain to <yourfault at somewhere.net>
    License: What license is it under?

i guess this is difficult to reproduce, since if you simply paste the function it is going to work for you and the problem arises only within the context of a package, but any hint on the possible reason related to the namespace or whatever else you think of will be highly appreciated.

thanks!
robert.

    sessionInfo()
    R version 2.9.1 (2009-06-26)
    x86_64-unknown-linux-gnu

    locale:
    C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
     [1] GO.db_2.2.11        myPkg_1.0
     [3] IRanges_1.2.3       GOstats_2.10.0
     [5] graph_1.22.3        Category_2.10.1
     [7] org.Hs.eg.db_2.2.11 RSQLite_0.7-2
     [9] DBI_0.2-4           annotate_1.22.0
    [11] AnnotationDbi_1.6.1 Biobase_2.4.1

    loaded via a namespace (and not attached):
    [1] GSEABase_1.6.1 RBGL_1.20.0 XML_2.6-0 genefilter_1.24.2
    [5] splines_2.9.1 survival_2.35-7 tools_2.9.1 xtable_1.5-6
GO Biobase annotate PROcess GOstats Category AnnotationDbi • 1.4k views
Seth Falcon ★ 7.4k
@seth-falcon-992
Last seen 10.2 years ago
Hi Robert,

This sort of post might have a better home on the bioc-devel list, but since we've started here, some comments below...

On 1/11/10 11:05 AM, Robert Castelo wrote:
<snip>
> ## call the problematic function
> report <- GOreport(goHypGcond, "org.Hs.eg.db", NULL)
> Error in do.call("expand.grid", dimnames(x)) :
>   second argument must be a list
> > traceback()
> 9: stop("second argument must be a list")
> 8: do.call("expand.grid", dimnames(x))
> 7: data.frame(do.call("expand.grid", dimnames(x)), Freq = c(x),
>        row.names = row.names)
> 6: eval(expr, envir, enclos)
> 5: eval(ex)
> 4: as.data.frame.table(x[[i]], optional = TRUE, stringsAsFactors =
>        stringsAsFactors)
> 3: as.data.frame(x[[i]], optional = TRUE, stringsAsFactors =
>        stringsAsFactors)
> 2: data.frame(summary(goHypGresult), GeneSyms = reportGeneSyms,
>        Genes = reportGenes)
> 1: GOreport(goHypGcond, "org.Hs.eg.db", NULL)
>
> so there is something wrong with the instruction that builds the
> data.frame at the very end of the function.
>
> however, if i paste the function on the R shell and i call it again
> works smoothly:

Yes, that sounds like a name space sort of issue.

> this is really puzzling for me and i suspect that i'm confronted with
> some general problem regarding namespaces or so, thus these are the
> contents of the NAMESPACE file from this package myPkg containing the
> GOreport function:
>
> exportPattern("^[[:alpha:]]+")
> importFrom(annotate, getAnnMap)
> importFrom(AnnotationDbi, mget)
> importFrom(IRanges, unique)
> importMethodsFrom(GOstats)

I would try:

    import(GOstats)

You need to import the classes as well as the methods. You could try declaring just those that you use, but I would try the above catch-all first.

> and these are the contents of the DESCRIPTION file
>
> Package: myPkg
> Type: Package
> Title: What the package does (short line)
> Version: 1.0
> Date: 2010-01-11
> Author: Who wrote it
> Description: More about what it does (maybe more than one line)
> Depends: methods, annotate, Biobase (>= 2.4.1), AnnotationDbi, GOstats,
>   IRanges
> Imports: methods, annotate, Biobase (>= 2.4.1), AnnotationDbi, GOstats,
>   IRanges
> Maintainer: Who to complain to <yourfault at somewhere.net>
> License: What license is it under?
>
> i guess this is difficult to reproduce since if you simply paste the
> function is going to work for you and the problem arises only within the
> context of a package, but any hint on the possible reason related to the
> namespace or whatever else you think will be highly appreciated.

If you are able to post a .tar.gz of your source package to a public URL, we could take a closer look.

> sessionInfo()
> R version 2.9.1 (2009-06-26)
> x86_64-unknown-linux-gnu

Especially since you are developing a package of your own, I would recommend upgrading to the latest R and BioC releases. You might also consider using R-devel and Bioc-devel versions to make sure that what you develop remains compatible as the project(s) evolve.

+ seth

--
Seth Falcon
Bioconductor Core Team | FHCRC
hi Seth,

> This sort of post might have a better home on the bioc-devel list, but
> since we've started here, some comments below...

i was afraid this would be something too trivial for bioc-devel

> I would try:
>
> import(GOstats)
>
> You need to import the classes as well as the methods. You could try
> declaring just those that you use, but I would try the above catch-all
> first.

I've tried it out but it doesn't work either

> If you are able to post a .tar.gz of your source package to a public
> URL, we could take a closer look.

thanks!! that's very nice of you, please see below

> > sessionInfo()
> > R version 2.9.1 (2009-06-26)
> > x86_64-unknown-linux-gnu
>
> Especially since you are developing a package of your own, I would
> recommend upgrading to the latest R and BioC releases. You might also
> consider using R-devel and Bioc-devel versions to make sure that what
> you develope remains compatible as the project(s) evolve.

i know. rather than a software package, this is a "packaged" analysis pipeline that has been evolving since last year, and i still have to port it to the latest release, which i haven't done yet due to the major changes i'll have to make in re-making a custom annotation package to adapt to the new .db annotation-package standards.

anyway, i've isolated the piece of code that produces the error and made a package running under the devel version, and as you'll see below the error also reproduces in the current devel version. i've put the package at the following public url:

http://functionalgenomics.upf.edu/myPkg_1.0.tar.gz

it was created by calling package.skeleton("myPkg", "GOreport") from the R shell with the GOreport function definition in the workspace. then i added something to \title{ } in myPkg/man/GOreport.rd (otherwise it cannot be installed) and added the DESCRIPTION and NAMESPACE files i pasted in my previous email.

these are the commands and session info that reproduce the problem using this package with the devel version:

    library(annotate)
    library(org.Hs.eg.db)
    library(GOstats)
    library(myPkg)

    ## use the genes from the death GO category as test
    deathEGs <- org.Hs.egGO2EG[["GO:0016265"]]

    ## sample 100 genes randomly and add the death genes as universe
    universeEGs <- unique(c(sample(mappedLkeys(org.Hs.egGO2EG), size=100), deathEGs))

    ## test for GO enrichment
    goHypGparams <- new("GOHyperGParams", geneIds=deathEGs,
                        universeGeneIds=universeEGs,
                        annotation="org.Hs.eg.db", ontology="BP",
                        pvalueCutoff=0.05, conditional=TRUE,
                        testDirection="over")
    goHypGcond <- hyperGTest(goHypGparams)

    ## call the problematic function
    report <- GOreport(goHypGcond, "org.Hs.eg.db")
    Error in do.call("expand.grid", c(dimnames(x), stringsAsFactors = stringsAsFactors)) :
      second argument must be a list

    traceback()
    9: stop("second argument must be a list")
    8: do.call("expand.grid", c(dimnames(x), stringsAsFactors = stringsAsFactors))
    7: data.frame(do.call("expand.grid", c(dimnames(x), stringsAsFactors = stringsAsFactors)),
           Freq = c(x), row.names = row.names)
    6: eval(expr, envir, enclos)
    5: eval(ex)
    4: as.data.frame.table(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors)
    3: as.data.frame(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors)
    2: data.frame(summary(goHypGresult), GeneSyms = reportGeneSyms,
           Genes = reportGenes)
    1: GOreport(goHypGcond, "org.Hs.eg.db")

    sessionInfo()
    R version 2.11.0 Under development (unstable) (2009-10-06 r49948)
    x86_64-unknown-linux-gnu

    locale:
    [1] C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
     [1] GO.db_2.3.5        myPkg_1.0           IRanges_1.5.16
     [4] GOstats_2.13.0     graph_1.25.3        Category_2.13.0
     [7] org.Hs.eg.db_2.3.6 RSQLite_0.7-3       DBI_0.2-4
    [10] annotate_1.25.0    AnnotationDbi_1.9.2 Biobase_2.7.2

    loaded via a namespace (and not attached):
    [1] GSEABase_1.9.0 RBGL_1.23.0 XML_2.6-0 genefilter_1.29.5
    [5] splines_2.11.0 survival_2.35-7 tools_2.11.0 xtable_1.5-6

thanks!!
robert.
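A minimal sketch of one way the same function could behave differently at the prompt and inside a package, assuming the cause is that `summary` resolves to the plain base function instead of the S4 generic. The traceback is consistent with that reading, since it goes through as.data.frame.table, but the thread does not confirm it. The class `ToyResult` and the function `inside_namespace_like` are hypothetical stand-ins, and an environment whose parent is `baseenv()` only roughly imitates a namespace that has not imported the generic:

```r
library(methods)

# Hypothetical toy class standing in for GOHyperGResult.
setClass("ToyResult", representation(pvalue = "numeric"))

# Defining an S4 method for summary() creates an S4 generic named 'summary'
# in the current top-level environment; base::summary itself is unchanged.
setMethod("summary", "ToyResult", function(object, ...) {
  data.frame(Pvalue = object@pvalue)
})

res <- new("ToyResult", pvalue = c(0.01, 0.2))

# Called from the prompt, 'summary' is found as the S4 generic.
at_prompt <- summary(res)

# Assumption: a function whose enclosing environment only reaches base
# imitates package code that does not see the S4 generic, so the name
# 'summary' resolves to base::summary and falls through to summary.default.
inside_namespace_like <- function(x) summary(x)
environment(inside_namespace_like) <- new.env(parent = baseenv())
in_package <- inside_namespace_like(res)

print(class(at_prompt))
print(class(in_package))
```

In this toy setting the first call gives a data frame, while the second gives the generic Length/Class/Mode summary, whose class includes "table". Handing such an object to data.frame() is what would route through as.data.frame.table.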
diff --git a/R/GOreport.R b/R/GOreport.R
index aa3d2a5..e7f6c8d 100644
--- a/R/GOreport.R
+++ b/R/GOreport.R
@@ -24,6 +24,8 @@ GOreport <- function(goHypGresult, chip, highlightedGenes=NULL) {
   reportGenes <- sapply(reportGenes, function(x) {
     paste(x, collapse=",")
   })
+  ## XXX: please fix this temporary workaround
+  summary <- getGeneric("summary")
   report <- data.frame(summary(goHypGresult), GeneSyms=reportGeneSyms, Genes=reportGenes)
   rownames(report) <- NULL
   report[sort(report$"OddsRatio", decreasing=TRUE, index.return=TRUE)$ix, ]
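The effect of the workaround in the diff above can be tried on a toy class without GOstats. This is only a sketch under assumptions: `ToyResult2` and `report_with_workaround` are hypothetical names, and the base-only enclosing environment stands in for a package namespace that does not see the S4 generic for `summary`:

```r
library(methods)

# Hypothetical toy class standing in for GOHyperGResult.
setClass("ToyResult2", representation(pvalue = "numeric"))

# S4 summary method that returns a data frame, as the real method is
# expected to do for a GOHyperGResult.
setMethod("summary", "ToyResult2", function(object, ...) {
  data.frame(Pvalue = object@pvalue)
})

res2 <- new("ToyResult2", pvalue = c(0.01, 0.2))

# Same idea as the patch: look the S4 generic up explicitly and bind it to
# a local variable, so the call below no longer depends on which 'summary'
# the enclosing environment happens to see.
report_with_workaround <- function(x) {
  summary <- methods::getGeneric("summary")
  summary(x)
}

# Assumption: only base is reachable from this function's enclosure.
environment(report_with_workaround) <- new.env(parent = baseenv())

fixed <- report_with_workaround(res2)
print(fixed)
```

As the comment in the patch says, this is a temporary workaround. It shadows `summary` locally and leaves the package's imports as they were.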