topGO and Arabidopsis data


Johannes Hanson ▴ 30

@johannes-hanson-3621


Dear All,

I am analyzing Affymetrix expression data using the topGO package. I basically follow the script in the topGO vignette. It works fine for the CC and BP ontologies, but for MF I get the following error:

> sampleGOdata <- new("topGOdata", description = "Simple session", ontology = "MF",
      allGenes = geneList, geneSel = topDiffGenes, nodeSize = 10,
      annot = annFUN.db, affyLib = affyLib)

Building most specific GOs ..... ( 1348 GO terms found. )

Build GO DAG topology ..........
 There are no adj nodes for node:  GO:0010241
Error in switch(type, isa = 0, partof = 1, -1) :
  EXPR must be a length 1 vector

I guess that there is something wrong in the annotation packages, but honestly, I might well have misunderstood the error message. Using the examples in the topGO vignette with the human annotation, no errors are given.

Thanks in advance,
Johannes

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] topGO_2.2.0          SparseM_0.86         GO.db_2.4.5          graph_1.28.0
 [5] ath1121501.db_2.4.5  org.At.tair.db_2.4.6 RSQLite_0.9-3        DBI_0.2-5
 [9] AnnotationDbi_1.12.0 ath1121501cdf_2.7.0  affy_1.28.0          Biobase_2.10.0

loaded via a namespace (and not attached):
[1] affyio_1.18.0         grid_2.12.0           lattice_0.19-13       preprocessCore_1.12.0
[5] tools_2.12.0

Annotation GO ath1121501 topGO

updated 14.0 years ago by Adrian Alexa • written 14.0 years ago by Johannes Hanson


Adrian Alexa ▴ 400

@adrian-alexa-936


Hi Johannes,

I apologise for the late reply. I had a chance to look into the issue you reported, and apparently the problem is with org.At.tair.db_2.4.6. Namely, the GO ID reported in your error belongs to the BP ontology and not to the MF ontology, at least based on GO.db_2.4.5 and some other online resources. For example, if you run:

    library(ath1121501.db)
    goID <- "GO:0010241"
    .sql <- "SELECT gene_id, go_id FROM genes INNER JOIN go_mf USING('_id')"
    retVal <- dbGetQuery(org.At.tair_dbconn(), .sql)
    go2prob <- split(retVal[["gene_id"]], retVal[["go_id"]])
    go2prob[goID]

you will get:

    $`GO:0010241`
    [1] "AT5G25900"

But if you look up this GO term in the GO.db database, you will find that it belongs to the BP ontology:

    GOTERM[[goID]]

I hope this bug will be solved in the org.At.tair.db package. Until then, I did a quick fix so that you can keep using the topGO package. The function annFUN.db2 keeps only the GO terms that GO.db lists for the chosen ontology:

    annFUN.db2 <- function(whichOnto, feasibleGenes = NULL, affyLib) {

      ## we add the .db ending if needed
      affyLib <- paste(sub(".db$", "", affyLib), ".db", sep = "")
      require(affyLib, character.only = TRUE) ||
        stop(paste("package", affyLib, "is required", sep = " "))
      affyLib <- sub(".db$", "", affyLib)

      ## attach the organism-level database to the chip database connection
      orgFile <- get(paste(get(paste(affyLib, "ORGPKG", sep = "")), "_dbfile", sep = ""))
      try(dbGetQuery(get(paste(affyLib, "dbconn", sep = "_"))(),
                     paste("ATTACH '", orgFile(), "' as org;", sep = "")),
          silent = TRUE)

      ## probe-to-GO mapping for the chosen ontology
      .sql <- paste("SELECT DISTINCT probe_id, go_id FROM probes INNER JOIN ",
                    "(SELECT * FROM org.genes INNER JOIN org.go_",
                    tolower(whichOnto), " USING('_id')) USING('gene_id');", sep = "")
      retVal <- dbGetQuery(get(paste(affyLib, "dbconn", sep = "_"))(), .sql)

      ## restrict to the set of feasibleGenes
      if(!is.null(feasibleGenes))
        retVal <- retVal[retVal[["probe_id"]] %in% feasibleGenes, ]

      ## split the table into a named list of GOs
      retVal <- split(retVal[["probe_id"]], retVal[["go_id"]])

      ## return only the GOs mapped in GO.db
      return(retVal[names(retVal) %in% ls(get(paste("GO", toupper(whichOnto), "Term", sep = "")))])
    }

Using this function you can build a topGOdata object as before (just replace annFUN.db with annFUN.db2):

    library(topGO)

    allProbes <- ls(ath1121501GO)

    ## generate a random set of interesting probes
    myInterestingGenes <- sample(allProbes, 100)
    geneList <- factor(as.integer(allProbes %in% myInterestingGenes))
    names(geneList) <- allProbes

    ## build a topGOdata object
    sampleGOdata <- new("topGOdata", description = "Simple session", ontology = "MF",
                        allGenes = geneList, nodeSize = 10,
                        annot = annFUN.db2, affyLib = "ath1121501.db")
    sampleGOdata

This works without a problem on the latest stable R/Bioconductor version. Let me know if you have further questions.

Best regards,
Adrian

> sessionInfo()
R version 2.12.1 beta (2010-12-06 r53802)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US       LC_NUMERIC=C         LC_TIME=en_US
 [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=en_US
 [7] LC_PAPER=en_US       LC_NAME=C            LC_ADDRESS=C
[10] LC_TELEPHONE=C       LC_MEASUREMENT=en_US LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] ath1121501.db_2.4.5  org.At.tair.db_2.4.6 topGO_2.2.0
 [4] SparseM_0.86         GO.db_2.4.5          RSQLite_0.9-4
 [7] DBI_0.2-5            AnnotationDbi_1.12.0 Biobase_2.10.0
[10] graph_1.28.0

loaded via a namespace (and not attached):
[1] grid_2.12.0     lattice_0.19-13 tools_2.12.0

On Thu, Dec 2, 2010 at 5:44 PM, Johannes Hanson <s.j.hanson at uu.nl> wrote:
> [...] It works fine for the CC and BP ontologies but for MF I get the following error:
> [...]
> There are no adj nodes for node: GO:0010241
> Error in switch(type, isa = 0, partof = 1, -1) :
>   EXPR must be a length 1 vector
> [...]
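The last step of annFUN.db2 is the part that actually works around the problem, so a small stand-alone sketch of that filtering idea may help. It uses base R only and toy data: apart from GO:0010241 and AT5G25900, every ID below is made up for illustration, and `validMF` is just a hypothetical stand-in for the ontology-specific term list that annFUN.db2 looks up.

```r
## Toy sketch of the filtering step; not the real annotation data.
## Only GO:0010241 / AT5G25900 come from the thread, all other IDs are invented.

## GO-to-gene mapping as it might come out of a faulty MF table
go2gene <- list(
  "GO:MF_toy1" = c("geneA", "geneB"),
  "GO:0010241" = "AT5G25900",      # BP term wrongly listed under MF
  "GO:MF_toy2" = "geneC"
)

## terms the reference ontology lists for MF (hypothetical stand-in)
validMF <- c("GO:MF_toy1", "GO:MF_toy2", "GO:MF_toy3")

## terms present in the mapping but unknown to the MF ontology
misplaced <- setdiff(names(go2gene), validMF)

## keep only the terms known to the chosen ontology
filtered <- go2gene[names(go2gene) %in% validMF]

print(misplaced)
print(names(filtered))
```

Listing the misplaced terms before dropping them shows how many annotations the workaround discards, which is worth checking when the annotation packages are updated.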

