Lavinia Gordon
Hi Waverley
In your input you also have Ensembl protein ids (e.g. ENSP00000231338).
You could extract these, use them as your input list, use biomaRt to
match them up with molecular function GO ids, and calculate the
frequency of the ids, e.g.
molecular_function%binding%protein binding 10
molecular_function%molecular transducer activity 6
then:
pie.mf <- c(10,6,...)
names(pie.mf) <- c("binding%protein binding", "molecular transducer activity", ...)
pie(pie.mf)
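The frequencies do not have to be typed in by hand. As a minimal sketch, assuming the protein-to-GO mapping has already been retrieved into a data frame (the `go.map` table, its column names and its ids below are made up for illustration), `table()` can do the counting:

```r
# Hypothetical protein-to-GO table, as might come back from a mapping query.
# The protein ids and term assignments are invented for illustration.
go.map <- data.frame(
  protein = c("P1", "P1", "P2", "P3", "P3", "P4"),
  term    = c("protein binding", "molecular transducer activity",
              "protein binding", "protein binding",
              "catalytic activity", "molecular transducer activity"),
  stringsAsFactors = FALSE
)

# Drop duplicated protein/term pairs, then count proteins per term.
go.map <- unique(go.map)
counts <- sort(table(go.map$term), decreasing = TRUE)

# Convert the table to a named numeric vector, the form used with pie() above.
pie.mf <- setNames(as.vector(counts), names(counts))
pie(pie.mf, main = "GO molecular function")
```

A protein with several GO terms is counted once per term here, so the slices add up to more than the number of proteins.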
You could also use your SWISS-PROT or TREMBL ids (again, via biomaRt).
Note that genes can often have multiple GO terms associated with them.
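Pulling the ids out of header lines like those in the original question could look roughly like this. It is a sketch in base R only; the two headers are shortened copies of lines from the question, and the regular expressions assume ids of the form shown there (ENSP plus 11 digits, SWISS-PROT accessions with an optional isoform suffix such as -1).

```r
# Two header lines in the style of the original question (shortened).
headers <- c(
  "IPI:IPI00219000.2|SWISS-PROT:P27658|TREMBL:Q53XI6|ENSEMBL:ENSP00000261037",
  "IPI:IPI00291878.4|SWISS-PROT:P35247|ENSEMBL:ENSP00000361366|REFSEQ:NP_003010"
)

# Ensembl protein ids: "ENSP" followed by 11 digits.
ensp <- unique(unlist(regmatches(headers, gregexpr("ENSP[0-9]{11}", headers))))

# SWISS-PROT accessions: the text after "SWISS-PROT:" up to the next "|" or ";".
sp <- unlist(regmatches(headers,
                        gregexpr("(?<=SWISS-PROT:)[^|;]+", headers, perl = TRUE)))

# Strip an isoform suffix such as "-1" so that only the accession remains.
sp <- unique(sub("-[0-9]+$", "", sp))
```

Either vector can then serve as the input list for the id-to-GO lookup.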
Have a look at some of the other Bioconductor GO packages
([1]http://bioconductor.org/packages/2.5/GO.html and
http://www.bioconductor.org/packages/release/bioc/html/topGO.html),
which suggest some other ways of visualizing GO terms.
regards
Lavinia Gordon.
Waverley @ Palo Alto wrote:
> Hi,
>
> I have a list of IPI gene IDs. I want to find out whether there is a
> package which can map the gene ontology to these IPIs, and plot the
> pie chart to demonstrate the molecular function distributions.
>
> The input is like the following gene IPI IDs:
>
> IPI:IPI00008860.1|SWISS-PROT:Q9BXJ4-1|TREMBL:Q542Y2|ENSEMBL:ENSP00000231338;EN
>
> IPI:IPI00019922.5|SWISS-PROT:Q8N0Y2-1|TREMBL:Q53F81|ENSEMBL:ENSP00000338860;ENSP00000375594|REFSEQ:NP_060807|H-INV:HIT000028861|VEGA:OTTHUMP00000078377
> Tax_Id=9606 Gene_Symbol=ZN
>
> IPI:IPI00647423.2|SWISS-PROT:Q8N819-1|REFSEQ:NP_001073870|VEGA:OTTHUMP00000076687
> Tax_Id=9606 Gene_Symbol=FLJ40125 Isoform 1 of
>
> IPI:IPI00219000.2|SWISS-PROT:P27658|TREMBL:Q53XI6|ENSEMBL:ENSP00000261037|REFS
>
> IPI:IPI00291878.4|SWISS-PROT:P35247|ENSEMBL:ENSP00000361366|REFSEQ:NP_003010|H-INV:HIT000039466|VEGA:OTTHUMP00000019944
>
> IPI:IPI00013945.1|SWISS-PROT:P07911-1|TREMBL:Q8NHW8|ENSEMBL:ENSP00000306279|RE
>
> IPI:IPI00000634.1|SWISS-PROT:Q16204|TREMBL:Q6GSG7|ENSEMBL:ENSP00000263102|REFS
>
> I want to plot the distribution of these genes across the GO
> molecular function categories as a pie chart. An example is shown in the
> following link:
> [2]http://www.proteomesci.com/content/7/1/6/figure/F2?highres=y
>
>
> Can someone help?
Not sure that it is this easy. The IPI are protein identifiers. GO
categories classify genes. Neither the mapping from protein to gene nor
the mapping from gene to GO category is 1:1. GO categories form a
hierarchy. So there are significant decisions to be made in representing
IPI identifiers in a pie chart of GO terms.
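One of those decisions is how to count a protein that maps to several GO terms. The sketch below contrasts two options on a made-up table (the `go.map` data frame and its ids are hypothetical): counting every protein/term pair, or giving each protein a total weight of 1 split evenly across its terms. Neither is the "right" answer; they simply make the choice explicit.

```r
# Hypothetical mapping: one row per (protein, GO term) pair.
go.map <- data.frame(
  protein = c("P1", "P1", "P2", "P3", "P3", "P3"),
  term    = c("binding", "catalytic activity",
              "binding",
              "binding", "catalytic activity", "transporter activity"),
  stringsAsFactors = FALSE
)

# Option 1: count every pair; proteins with many terms weigh more.
raw <- table(go.map$term)

# Option 2: each protein contributes a total weight of 1,
# split evenly across the terms it maps to.
n.terms <- table(go.map$protein)
go.map$weight <- 1 / as.vector(n.terms[go.map$protein])
weighted <- tapply(go.map$weight, go.map$term, sum)
```

With the second option the slices sum to the number of proteins, so the pie describes the protein list and not the list of annotations.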
Bioconductor maintains 'org' and 'GO' database packages that provide the
necessary link between IPI protein ids and GO gene ontology categories,
via ENTREZ gene ids. Code might look like
## once only, to install packages
source('http://bioconductor.org/biocLite.R')
biocLite(c('org.Hs.eg.db', 'GO.db'))
## from IPI to ENTREZ id, not 1:1
library(org.Hs.eg.db)
ipi2eg = revmap(eapply(org.Hs.eg.db, names)) ## NOT 1:1 map
## Assume ipiIds is, e.g., c('IPI00008860', 'IPI00019922')
egIds = revmap(ipi2eg[ipiIds])
## get GO terms, also not 1:1
goIds = eapply(org.Hs.egGO[names(egIds)], names)
You're still left with the problem of resolving multiple mappings and
the hierarchical relationship between GO terms. Asking on the
Bioconductor mailing list
[3]http://bioconductor.org/docs/mailList.html
is likely to lead to helpful answers.
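For the hierarchy, one possible approach is to pick a handful of broad categories as pie slices and roll every specific term up to whichever of them appears among its ancestors. The sketch below uses a toy, hand-written ancestor list (hypothetical, for illustration only); in practice the ancestor sets might come from the GO.db package, for instance its GOMFANCESTOR map, which works on GO ids and not on term names.

```r
# Toy ancestor sets (hypothetical): each specific term with itself and its ancestors.
ancestors <- list(
  "protein binding" = c("protein binding", "binding"),
  "DNA binding"     = c("DNA binding", "nucleic acid binding", "binding"),
  "kinase activity" = c("kinase activity", "transferase activity",
                        "catalytic activity")
)

# Broad categories chosen as the pie slices.
slices <- c("binding", "catalytic activity")

# Map one specific term to the first chosen slice among its ancestors, or "other".
roll.up <- function(term) {
  hit <- intersect(slices, ancestors[[term]])
  if (length(hit) > 0) hit[1] else "other"
}

terms <- c("protein binding", "DNA binding", "kinase activity", "protein binding")
pie.mf <- table(vapply(terms, roll.up, character(1)))
```

A term with ancestors in more than one slice is assigned to the first match here, which is itself a choice that should be stated alongside the chart.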
Martin
Lavinia Gordon
Research Officer
Bioinformatics
Murdoch Childrens Research Institute
Royal Children's Hospital
Flemington Road Parkville Victoria 3052 Australia
telephone: +61 3 8341 6221
[4]www.mcri.edu.au
References
1. http://bioconductor.org/packages/2.5/GO.html and http://www.bioconductor.org/packages/release/bioc/html/topGO.html
2. http://www.proteomesci.com/content/7/1/6/figure/F2?highres=y
3. http://bioconductor.org/docs/mailList.html
4. http://www.mcri.edu.au/