GSEABase error in parsing msigdb_v2.5.xml
@vladimir-morozov-2740
Last seen 10.3 years ago
Hi,

I get an error reading the latest version of the Broad msigdb. Is it supposed to work?

> gss <- getBroadSets('/data/PathDB/msigdb_v2.5.xml')
Error: 'getBroadSets' failed to create gene sets:
 invalid BroadCollection category: 'c5'
> traceback()
6: stop("'getBroadSets' failed to create gene sets:\n ", conditionMessage(err),
       call. = FALSE)
5: value[[3]](cond)
4: tryCatchOne(expr, names, parentenv, handlers[[1]])
3: tryCatchList(expr, classes, parentenv, handlers)
2: tryCatch({
       geneSets <- unlist(mapply(.fromXML, uri, "//GENESET", factories,
           SIMPLIFY = FALSE, USE.NAMES = FALSE))
   }, error = function(err) {
       stop("'getBroadSets' failed to create gene sets:\n ", conditionMessage(err),
           call. = FALSE)
   })
1: getBroadSets("/data/PathDB/msigdb_v2.5.xml")
> packageDescription('GSEABase')
Package: GSEABase
Type: Package
Title: Gene set enrichment data structures and methods
Version: 1.2.0
Author: Martin Morgan, Seth Falcon, Robert Gentleman
Maintainer: Biocore Team c/o BioC user list <bioconductor@stat.math.ethz.ch>
Description: This package provides classes and methods to support Gene Set Enrichment Analysis (GSEA).
License: Artistic-2.0
Depends: R (>= 2.6.0), methods, AnnotationDbi, Biobase, annotate
Suggests: Ruuid, hgu95av2.db, GO.db, org.Hs.eg.db
Imports: methods, XML, graph
LazyLoad: yes
biocViews: Infrastructure, Statistics
Collate: utilities.R AAA.R AllClasses.R AllGenerics.R getObjects.R
         methods-CollectionType.R methods-ExpressionSet.R
         methods-GeneColorSet.R methods-GeneIdentifierType.R
         methods-GeneSet.R methods-GeneSetCollection.R
         methods-OBOCollection.R zzz.R
Packaged: Wed Apr 30 02:43:40 2008; biocbuild
Built: R 2.7.0; ; 2008-05-14 16:18:51; unix

-- File: /usr/local/lib64/R/library/GSEABase/Meta/package.rds

However, getBroadSets('/data/PathDB/msigdb_v2.1.xml') works, and I don't see obvious signs of corruption in the 2.5 XML file:

[rstats:GeneLogic070523] head -n 2 /data/PathDB/*.xml
==> /data/PathDB/msigdb_v2.1.xml <==

==> /data/PathDB/msigdb_v2.5.xml <==

tail -n 2 /data/PathDB/*.xml
==> /data/PathDB/msigdb_v2.1.xml <==
<GENESET STANDARD_NAME="GNF2_ZAP70" SYSTEMATIC_NAME="c4:526"
  ORGANISM="Human" CHIP="GENE_SYMBOL" CATEGORY_CODE="c4"
  CONTRIBUTOR="Broad Institute" CONTRIBUTOR_ORG="Broad Institute"
  DESCRIPTION_BRIEF="Neighborhood of ZAP70"
  DESCRIPTION_FULL="Neighborhood of ZAP70 zeta-chain (TCR) associated protein kinase 70kDa in the GNF2 expression compendium"
  TAGS=""
  MEMBERS="ZAP70,PTPN4,UNC84B,TUSC4,CTSW,RARRES3,BTN3A2,NKG7,PRKCH,KLRK1,BTN3A3,MYBL1,GZMA,ARL4C,SH2D1A,TXK,CD7,RORA,CD247,IL18RAP,CD96,RASGRP1,GZMM,TRD@,MATK,ITGAL,KLRB1"
  MEMBERS_SYMBOLIZED="ZAP70,PTPN4,UNC84B,TUSC4,CTSW,RARRES3,BTN3A2,NKG7,PRKCH,KLRK1,BTN3A3,MYBL1,GZMA,ARL4C,SH2D1A,TXK,CD7,RORA,CD247,IL18RAP,CD96,RASGRP1,GZMM,TRD@,MATK,ITGAL,KLRB1"/>
</MSIGDB>

==> /data/PathDB/msigdb_v2.5.xml <==
<GENESET STANDARD_NAME="INOSITOL_OR_PHOSPHATIDYLINOSITOL_KINASE_ACTIVITY"
  SYSTEMATIC_NAME="c5:1203" ORGANISM="Homo sapiens"
  AUTHORS="Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC,Richardson JE, Ringwald M, Rubin GM, Sherlock G."
  EXTERNAL_DETAILS_URL="http://amigo.geneontology.org/cgi-bin/amigo/go.cgi?view=details&search_constraint=terms&depth=0&query=GO:0004428"
  CHIP="GENE_SYMBOL" CATEGORY_CODE="c5"
  CONTRIBUTOR="Gene Ontology" CONTRIBUTOR_ORG="Gene Ontology"
  DESCRIPTION_BRIEF="Genes annotated by the GO term GO:0004428. Catalysis of the phosphorylation of myo-inositol (1,2,3,5/4,6-cyclohexanehexol) or a phosphatidylinositol."
  DESCRIPTION_FULL="" TAGS="Molecular function"
  MEMBERS="FXN,SMG1,PIP4K2B,PIP5K3,ATM,PIK3C2A,PIK3C3,PIK3CA,PIK3CB,PIK3CG,PIK3R2,PIK3R3,IPPK,PI4KA,PI4KB,PI4K2A,ITPKA,ITPKB"
  MEMBERS_SYMBOLIZED="FXN,SMG1,PIP4K2B,PIP5K3,ATM,PIK3C2A,PIK3C3,PIK3CA,PIK3CB,PIK3CG,PIK3R2,PIK3R3,IPPK,PI4KA,PI4KB,PI4K2A,ITPKA,ITPKB"/>
</MSIGDB>

Best,
Vlad

Vladimir Morozov
ALS Therapy Development Institute
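Since the error names a category code rather than a parse failure, one quick diagnostic is to tabulate the CATEGORY_CODE values a file contains. The following is only a rough sketch in base R: the two-record excerpt, the set names SET_A and SET_B, and the list of "known" codes (c1 to c4) are illustrative assumptions, not taken from the real files or from GSEABase itself.

```r
# Hypothetical two-record excerpt standing in for an MSigDB XML file.
# For a real file one would use something like: xml_lines <- readLines(path)
xml_lines <- c(
  '<MSIGDB>',
  '<GENESET STANDARD_NAME="SET_A" CATEGORY_CODE="c4" MEMBERS="ZAP70,PTPN4"/>',
  '<GENESET STANDARD_NAME="SET_B" CATEGORY_CODE="c5" MEMBERS="FXN,SMG1"/>',
  '</MSIGDB>'
)

# Extract each CATEGORY_CODE="..." attribute; lines without one are dropped.
hits  <- regmatches(xml_lines, regexpr('CATEGORY_CODE="[^"]*"', xml_lines))
codes <- sub('CATEGORY_CODE="([^"]*)"', "\\1", hits)

# Count gene sets per category.
category_counts <- table(codes)
print(category_counts)

# Assumption: the older reader only recognises c1 through c4.
known      <- c("c1", "c2", "c3", "c4")
unexpected <- setdiff(names(category_counts), known)
print(unexpected)
```

A line-based regular expression is crude and assumes at most one CATEGORY_CODE attribute per line; a proper XML parser would be more robust on the full file.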
GO Infrastructure hgu95av2 Biobase Ruuid AnnotationDbi • 1.5k views
@martin-morgan-1513
Last seen 5 months ago
United States
Thanks, Vladimir, for the report; more below.

"Vladimir Morozov" <vmorozov at als.net> writes:

> Hi,
>
> I get an error reading the latest version of the Broad msigdb. Is it supposed
> to work?
>
>> gss <- getBroadSets('/data/PathDB/msigdb_v2.5.xml')
> Error: 'getBroadSets' failed to create gene sets:
>  invalid BroadCollection category: 'c5'

The Broad added a category; I've updated GSEABase in both the devel and release branches. The update should be available with biocLite after 12 noon Friday; look for GSEABase 1.2.1 in the release.

One aspect that is a little unsatisfactory is that the subcategories (CC/BP/MF for c5, for instance) are not encoded in the XML, and so are not present in the gene sets.

Martin

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M2 B169
Phone: (206) 667-2793
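To tell whether an installation already includes the fix, the installed version can be compared with the 1.2.1 release named in the reply. This is a minimal sketch using base R only; the version strings are hard-coded from this thread so that the snippet runs without GSEABase installed, and the message text is illustrative.

```r
# Version reported in the question and the release named in the reply.
# In practice: installed <- as.character(packageVersion("GSEABase"))
installed <- "1.2.0"
required  <- "1.2.1"

# compareVersion() returns -1, 0, or 1 when the first version is
# older than, equal to, or newer than the second.
needs_update <- utils::compareVersion(installed, required) < 0

if (needs_update) {
  message("GSEABase ", installed, " predates ", required,
          "; update before reading msigdb_v2.5.xml")
}
```

Comparing with compareVersion() rather than plain string comparison avoids mis-ordering versions such as 1.10.0 and 1.2.1.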