Bioconductor Digest, Vol 83, Issue 4
@lavinia-gordon-2959
Hi Waverley,

Your input also contains Ensembl protein ids (e.g. ENSP00000231338). You could extract these, use them as your input list, use biomaRt to match them with molecular function GO ids, and calculate the frequency of each id, e.g.

    molecular_function%binding%protein binding 10
    molecular_function%molecular transducer activity 6

then:

    pie.mf <- c(10, 6, ...)
    names(pie.mf) <- c("binding%protein binding", "molecular transducer activity", ...)
    pie(pie.mf)

You could also use your SWISS-PROT or TREMBL ids (again, via biomaRt). Note that genes often have multiple GO terms associated with them. Have a look at some of the other Bioconductor GO packages ([1]http://bioconductor.org/packages/2.5/GO.html and http://www.bioconductor.org/packages/release/bioc/html/topGO.html), which suggest some other ways of visualizing GO terms.

regards
Lavinia Gordon

Waverley @ Palo Alto wrote:
> Hi,
>
> I have a list of IPI gene IDs. I want to find out whether there is a
> package which can map the gene ontology to these IPIs, and plot the
> pie chart to demonstrate the molecular function distributions.
>
> The input is like the following gene IPI IDs:
> IPI:IPI00008860.1|SWISS-PROT:Q9BXJ4-1|TREMBL:Q542Y2|ENSEMBL:ENSP00000231338;EN
> IPI:IPI00019922.5|SWISS-PROT:Q8N0Y2-1|TREMBL:Q53F81|ENSEMBL:ENSP00000338860;ENSP00000375594|REFSEQ:NP_060807|H-INV:HIT000028861|VEGA:OTTHUMP00000078377
> Tax_Id=9606 Gene_Symbol=ZN
> IPI:IPI00647423.2|SWISS-PROT:Q8N819-1|REFSEQ:NP_001073870|VEGA:OTTHUMP00000076687
> Tax_Id=9606 Gene_Symbol=FLJ40125 Isoform 1 of
> IPI:IPI00219000.2|SWISS-PROT:P27658|TREMBL:Q53XI6|ENSEMBL:ENSP00000261037|REFS
> IPI:IPI00291878.4|SWISS-PROT:P35247|ENSEMBL:ENSP00000361366|REFSEQ:NP_003010|H-INV:HIT000039466|VEGA:OTTHUMP00000019944
> IPI:IPI00013945.1|SWISS-PROT:P07911-1|TREMBL:Q8NHW8|ENSEMBL:ENSP00000306279|RE
> IPI:IPI00000634.1|SWISS-PROT:Q16204|TREMBL:Q6GSG7|ENSEMBL:ENSP00000263102|REFS
>
> I want to plot the distribution of these genes across the GO
> molecular function categories as a pie chart. An example is shown in the
> following link [2]http://www.proteomesci.com/content/7/1/6/figure/F2?highres=y
>
> Can some one help?

Not sure that it is this easy. The IPI ids are protein identifiers. GO categories classify genes. Neither the mapping from protein to gene nor the mapping from gene to GO category is 1:1. GO categories form a hierarchy. So there are significant decisions to be made in representing IPI identifiers in a pie chart of GO terms.

Bioconductor maintains 'org' and 'GO' database packages that provide the necessary link between IPI protein ids and GO gene ontology categories, via ENTREZ gene ids.
Code might look like

  ## once only, to install packages
  source('http://bioconductor.org/biocLite.R')
  biocLite(c('org.Hs.eg.db', 'GO.db'))

  ## from IPI to ENTREZ id, not 1:1
  ## (assumes the org.Hs.egPFAM map, whose value names are IPI accessions)
  library(org.Hs.eg.db)
  ipi2eg = revmap(eapply(org.Hs.egPFAM, names)) ## NOT 1:1 map

  ## Assume ipiIds is, e.g., c('IPI00008860', 'IPI00019922')
  egIds = revmap(ipi2eg[ipiIds])

  ## get GO terms, also not 1:1
  goIds = eapply(org.Hs.egGO[names(egIds)], names)

You're still left with the problem of resolving multiple mappings and the hierarchical relationship between GO terms. Asking on the Bioconductor mailing list [3]http://bioconductor.org/docs/mailList.html is likely to lead to helpful answers.

Martin

Lavinia Gordon
Research Officer
Bioinformatics
Murdoch Childrens Research Institute
Royal Children's Hospital
Flemington Road
Parkville Victoria 3052 Australia
telephone: +61 3 8341 6221
[4]www.mcri.edu.au

References

1. http://bioconductor.org/packages/2.5/GO.html and http://www.bioconductor.org/packages/release/bioc/html/topGO.html
2. http://www.proteomesci.com/content/7/1/6/figure/F2?highres=y
3. http://bioconductor.org/docs/mailList.html
4. http://www.mcri.edu.au/
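
The extract, tally and plot steps suggested in the reply above can be tried out in base R before any annotation package is involved. The following is only a rough sketch: the two header lines are taken from the question, but `extract_ensp` is a made-up helper and `go_map` is a hypothetical toy table standing in for whatever a biomaRt query would actually return (its column names and GO assignments are assumptions, not real annotation).

```r
## Two header lines from the question (truncated tail of the first kept as posted)
headers <- c(
  "IPI:IPI00008860.1|SWISS-PROT:Q9BXJ4-1|TREMBL:Q542Y2|ENSEMBL:ENSP00000231338;EN",
  "IPI:IPI00019922.5|SWISS-PROT:Q8N0Y2-1|TREMBL:Q53F81|ENSEMBL:ENSP00000338860;ENSP00000375594|REFSEQ:NP_060807"
)

## Hypothetical helper: pull out every Ensembl protein id (ENSP + 11 digits)
extract_ensp <- function(x) {
  unique(unlist(regmatches(x, gregexpr("ENSP[0-9]{11}", x))))
}
ensp <- extract_ensp(headers)

## HYPOTHETICAL protein-to-GO table (toy values, not real annotation);
## one row per protein/term pair, so the mapping is not 1:1
go_map <- data.frame(
  ensembl_peptide_id = c("ENSP00000231338", "ENSP00000231338",
                         "ENSP00000338860", "ENSP00000375594"),
  go_term = c("protein binding", "molecular transducer activity",
              "protein binding", "protein binding"),
  stringsAsFactors = FALSE
)

## Keep rows for our proteins, drop duplicate protein/term pairs, then tally
hits <- unique(go_map[go_map$ensembl_peptide_id %in% ensp, ])
pie.mf <- sort(table(hits$go_term), decreasing = TRUE)

## One slice per molecular function term
pie(pie.mf, main = "GO molecular function (toy data)")
```

A protein with several terms is counted once under each of them here, so the slices can add up to more than the number of proteins. That is one of the decisions about multiple mappings mentioned above, and this sketch does nothing about the GO hierarchy.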