BioPAX parsing
@martin-preusse-5224
Many biological pathway resources provide their data in the BioPAX format (http://www.biopax.org/index.php), a special XML format for biological interaction networks. Examples are Pathway Commons (http://www.pathwaycommons.org/pc/) and Reactome (http://www.reactome.org/).

A Java library for parsing BioPAX files exists: http://www.biopax.org/paxtools.php

Has anybody used BioPAX files with R? Is it possible to read BioPAX files into any R-based graph structure? A solution similar to the KEGGgraph package for KEGG pathways would be great, since more and more databases are starting to use BioPAX.

Any ideas are appreciated!

Cheers
Martin
@oliver-ruebenacker-5312
Hello Martin,

I'm currently looking into reading BioPAX into R using rJava and OpenRDF Sesame. If there is interest, I may look into submitting a package to Bioconductor.

It would be very helpful if you could tell me what you need the BioPAX data for, and in what form it would be best for you. Possible options are:

- A data frame of the RDF/OWL triples
- A graph of the RDF/OWL triples
- A data frame with one row for each reaction-participant
- A bi-partite graph with nodes for reactions and nodes for substances
- A graph with nodes for substances only, with edges for interactions
- A genetic interaction graph

This list is roughly sorted from the easiest to the most difficult to provide.

Take care
Oliver

--
Oliver Ruebenacker
Bioinformatics Consultant (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
Knowomics, The Bioinformatics Network (http://www.knowomics.com)
SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)
Hi Oliver,

I think there is a lot of interest in a Bioconductor package!

Personally, I would like to read pathways stored in the BioPAX format into any kind of graph. It's a philosophical question whether reactions should have nodes or should sit on the edges :) So far I have not used any R graph package, but I assume there are some very generic packages which are flexible enough to support both direct and bi-partite pathway structures. For example, I have used the JUNG graph API for Java extensively.

I'm not sure what you mean by RDF/OWL triples. For me, BioPAX is only a format to store a pathway, and I would like to bring it back into its natural form: a network!

Do you have any code to test? I have used rJava before. All this RDF and XML file format stuff kind of puzzles me though :)

Cheers
Martin
Hello Martin,

I don't have code in R to test yet, but I do have extensive experience handling BioPAX in Java, so I'm assuming reading BioPAX using rJava should not be too difficult.

The best target format depends on what people would like to do with the data. For visualization, a bi-partite graph in a popular graph-layout package should be best. Is there any particular graph package in Bioconductor or R in general you would recommend?

For actual analysis, people probably have more specific requirements.

BioPAX is a format based on RDF/OWL, which in turn is based on organizing data in triples. These could be stored in a three-column data frame (or perhaps with a fourth column for data type). For example (incomplete, for illustration only):

    ex:mapPhosphorylization   rdf:type   bp:BiochemicalReaction.
    ex:atp                    rdf:type   bp:SmallMolecule.
    ex:adp                    rdf:type   bp:SmallMolecule.
    ex:map                    rdf:type   bp:Protein.
    ex:mapPhosphorylized      rdf:type   bp:Protein.
    ex:mapPhosphorylization   bp:left    ex:atp.
    ex:mapPhosphorylization   bp:left    ex:map.
    ex:mapPhosphorylization   bp:right   ex:adp.
    ex:mapPhosphorylization   bp:right   ex:mapPhosphorylized.

Take care
Oliver
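The three-column layout described above can be sketched in base R. This is only an illustration, not code from the package under discussion: the column names `subject`, `predicate` and `object`, and all variable names, are assumptions, and the rows are just the nine example triples from the post. The second half derives the "one row for each reaction-participant" table mentioned earlier in the thread.

```r
# The nine example triples from the post, stored as a three-column data frame.
# Column names (subject, predicate, object) are an assumption for illustration.
triples <- data.frame(
  subject = c("ex:mapPhosphorylization", "ex:atp", "ex:adp", "ex:map",
              "ex:mapPhosphorylized",
              rep("ex:mapPhosphorylization", 4)),
  predicate = c(rep("rdf:type", 5),
                "bp:left", "bp:left", "bp:right", "bp:right"),
  object = c("bp:BiochemicalReaction", "bp:SmallMolecule", "bp:SmallMolecule",
             "bp:Protein", "bp:Protein",
             "ex:atp", "ex:map", "ex:adp", "ex:mapPhosphorylized"),
  stringsAsFactors = FALSE
)

# One row per reaction-participant: keep only the bp:left / bp:right triples.
is_participant <- triples$predicate %in% c("bp:left", "bp:right")
participants <- data.frame(
  reaction = triples$subject[is_participant],
  side = sub("^bp:", "", triples$predicate[is_participant]),
  substance = triples$object[is_participant],
  stringsAsFactors = FALSE
)

# Look up the rdf:type of each participant from the same triple table.
types <- triples[triples$predicate == "rdf:type", ]
participants$type <- types$object[match(participants$substance, types$subject)]

print(participants)
```

Read row by row, `participants` is also the edge list of a bi-partite graph: each row links a reaction node to a substance node, with `side` giving the direction.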
Oliver and Martin,

It would be very helpful to have easy access to BioPAX data in Bioconductor.

Just now, at the weekly Bioconductor dev-team meeting, we discussed your ideas and want to endorse them. Oliver's proposal to parse the RDF triples into a data.frame has a lot to recommend it. It would be immediately useful, and yet also allow for more sophisticated uses later. With these relationships in R, annotated as BioPAX data often are, we can imagine interested parties writing S4 classes which use the data. Such classes might provide flexible querying capabilities and be able to transform the triples into graphs and networks for further computation and display.

Please let us know if we can help.

- Paul
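As a rough idea of what an S4 class over a triple data.frame with simple querying could look like, here is a minimal sketch using only the `methods` package. The class name `TripleStore`, its slot `triples`, and the generic `matchTriples` are hypothetical names invented for this illustration; they are not part of any existing Bioconductor package.

```r
library(methods)

# Hypothetical S4 class holding RDF triples in a data frame; the class name,
# slot name and generic below are illustrative, not an existing API.
setClass("TripleStore", slots = c(triples = "data.frame"))

# Generic for a simple pattern query: NULL means "match anything".
setGeneric("matchTriples",
           function(store, subject = NULL, predicate = NULL, object = NULL)
             standardGeneric("matchTriples"))

setMethod("matchTriples", "TripleStore",
          function(store, subject = NULL, predicate = NULL, object = NULL) {
            df <- store@triples
            keep <- rep(TRUE, nrow(df))
            if (!is.null(subject))   keep <- keep & df$subject == subject
            if (!is.null(predicate)) keep <- keep & df$predicate == predicate
            if (!is.null(object))    keep <- keep & df$object == object
            df[keep, , drop = FALSE]
          })

# A tiny store built from three of the example triples in this thread.
store <- new("TripleStore", triples = data.frame(
  subject = c("ex:atp", "ex:map", "ex:mapPhosphorylization"),
  predicate = c("rdf:type", "rdf:type", "bp:left"),
  object = c("bp:SmallMolecule", "bp:Protein", "ex:atp"),
  stringsAsFactors = FALSE
))

# All subjects typed as proteins.
proteins <- matchTriples(store, predicate = "rdf:type", object = "bp:Protein")
print(proteins$subject)
```

Keeping the raw triples in a slot means that graph or network views can be derived later without re-parsing the BioPAX file.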
FWIW I have added BioPAX Level 3 OWL/RDF to the current image of Rredland. I think you can get it via the Bioc devel svn with the usual credentials readonly/readonly. If we run the Rredland.Rnw:

    > bp3m
    RDFModel instance
    > as(bp3m, "data.frame") -> bp3df
    > dim(bp3df)
    [1] 1611    3
    > as(bp3m, "graphNEL") -> bp3g
    > bp3g
    A graphNEL graph with directed edges
    Number of Nodes = 554
    Number of Edges = 1606
    > nodes(bp3g)[1:10]
     [1] "<http://www.biopax.org/release/biopax-level3.owl>"
     [2] "<http://www.biopax.org/release/biopax-level3.owl#absoluteRegion>"
     [3] "_:r1339811872r5333r238"
     [4] "_:r1339811872r5333r239"
     [5] "<http://www.biopax.org/release/biopax-level3.owl#DnaRegionReference>"
     [6] "_:r1339811872r5333r240"
     [7] "<http://www.biopax.org/release/biopax-level3.owl#RnaRegionReference>"
     [8] "<http://www.biopax.org/release/biopax-level3.owl#author>"
     [9] "<http://www.biopax.org/release/biopax-level3.owl#availability>"
    [10] "<http://www.biopax.org/release/biopax-level3.owl#BindingFeature>"
    > adj(bp3g, "<http://www.biopax.org/release/biopax-level3.owl#RnaRegionReference>")
    $`<http://www.biopax.org/release/biopax-level3.owl#RnaRegionReference>`
    [1] "<http://www.w3.org/2002/07/owl#Class>"
    [2] "<http://www.biopax.org/release/biopax-level3.owl#EntityReference>"
    [3] "_:r1339811872r5333r366"
    [4] "<http://www.biopax.org/release/biopax-level3.owl#DnaReference>"
    [5] "<http://www.biopax.org/release/biopax-level3.owl#DnaRegionReference>"
    [6] "<http://www.biopax.org/release/biopax-level3.owl#ProteinReference>"
    [7] "<http://www.biopax.org/release/biopax-level3.owl#RnaReference>"
    [8] "<http://www.biopax.org/release/biopax-level3.owl#SmallMoleculeReference>"
    [9] "\"Definition: A RNARegion reference is a grouping of several RNARegion entities that are common in sequence and genomic position. Members can differ in celular location, sequence features, mutations and bound partners.\"^^<http://www.w3.org/2001/XMLSchema#string>"

So various useful transformations are available with the Rredland package, which I withdrew from distribution because I wanted to expose more of the librdf RDF model query facilities, and that takes more work than I have had time for.
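The transcript above coerces an RDF model to a data frame and to a graphNEL, then lists the neighbours of one node. As a rough base-R analogue that deliberately avoids Rredland and the graph package, a triple data frame can be turned into a directed adjacency list keyed by node name. The four triples, the shortened URIs and the column names below are assumptions chosen for illustration; they are not taken from the actual BioPAX OWL file.

```r
# Illustrative triples; shortened URIs and column names are assumptions.
triples <- data.frame(
  subject = c("bp:RnaRegionReference", "bp:RnaRegionReference",
              "bp:DnaRegionReference", "bp:EntityReference"),
  predicate = c("rdf:type", "rdfs:subClassOf",
                "rdfs:subClassOf", "rdf:type"),
  object = c("owl:Class", "bp:EntityReference",
             "bp:EntityReference", "owl:Class"),
  stringsAsFactors = FALSE
)

# Node set: every distinct subject or object.
node_names <- unique(c(triples$subject, triples$object))

# Directed adjacency list: for each node, the objects it points to.
adjacency <- lapply(
  setNames(node_names, node_names),
  function(n) unique(triples$object[triples$subject == n])
)

# Distinct directed edges, ignoring the predicate label.
edges <- unique(triples[, c("subject", "object")])

print(length(node_names))
print(nrow(edges))
print(adjacency[["bp:RnaRegionReference"]])
```

Dropping the predicate, as done here, loses the edge labels; a fuller version would keep `predicate` as an edge attribute.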
Hello,

Thanks a lot for the endorsement!

I will try to create a prototype in the next few days, and then you can probably advise me on how to turn that into a package of the desired quality.

Take care
Oliver
Were you guys planning on using Rredland for this?
Hello Michael, I'm planning to use RJava to drive OpenRDF Sesame, with which I am very familiar. Take care Oliver On Sat, Jun 16, 2012 at 9:54 AM, Michael Lawrence <lawrence.michael at gene.com> wrote: > Were you guys planning on using Rredland for this?
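To make the triple-table idea discussed in this thread concrete, here is a minimal sketch. It is written in plain Python rather than R/RJava, it does not use Sesame or any real BioPAX parser, and it is not the planned package. The triples are the illustrative ones quoted in the thread; the function names `node_types` and `bipartite_edges` are hypothetical, and reading `bp:left` as "input" and `bp:right` as "output" is a simplifying assumption made only for this example.

```python
# Triples from the thread's illustration, stored as (subject, predicate, object)
# rows. This is the same layout as a three-column data frame.
TRIPLES = [
    ("ex:mapPhosphorylization", "rdf:type", "bp:BiochemicalReaction"),
    ("ex:atp", "rdf:type", "bp:SmallMolecule"),
    ("ex:adp", "rdf:type", "bp:SmallMolecule"),
    ("ex:map", "rdf:type", "bp:Protein"),
    ("ex:mapPhosphorylized", "rdf:type", "bp:Protein"),
    ("ex:mapPhosphorylization", "bp:left", "ex:atp"),
    ("ex:mapPhosphorylization", "bp:left", "ex:map"),
    ("ex:mapPhosphorylization", "bp:right", "ex:adp"),
    ("ex:mapPhosphorylization", "bp:right", "ex:mapPhosphorylized"),
]


def node_types(triples):
    """Map each resource to its class, using the rdf:type rows only."""
    return {s: o for s, p, o in triples if p == "rdf:type"}


def bipartite_edges(triples):
    """Build directed edges of a reaction/substance bipartite graph.

    Assumption for this sketch: bp:left participants point into the
    reaction node, and the reaction node points to bp:right participants.
    """
    edges = []
    for s, p, o in triples:
        if p == "bp:left":
            edges.append((o, s))   # substance -> reaction
        elif p == "bp:right":
            edges.append((s, o))   # reaction -> substance
    return edges


if __name__ == "__main__":
    types = node_types(TRIPLES)
    for source, target in bipartite_edges(TRIPLES):
        print(f"{source} ({types[source]}) -> {target} ({types[target]})")
```

The point of the sketch is only that the flat triple table already carries enough information to derive a reaction/substance graph later, so the two representations do not compete with each other.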