Graphite's native Biocarta pathways seem to have a different node list
than that given by the Biocarta "PROTEIN LIST" link on Biocarta
pathway pages (presumably what the pathway authors consider the 'true'
pathway membership).
There seem to be 2 categories of difference:
(1) Some genes listed by Biocarta are absent from graphite's version
(see ??? marks in the example below).
(2) Because the native format nodes are annotated variously, it's
necessary to do a node conversion. In particular, Biocarta's "PROTEIN
LIST" gives _specific_ members of enzyme families, whereas graphite
seems to replace EC numbers with all family members. However, I have
trouble explaining how some enzymes are on/off the list (see --- marks
in the example below).
Am I misinterpreting things? If not, is there any way to get pathway
graphs with node lists more closely matching what Biocarta lists
online?
Thanks,
Hamid Bolouri
--
http://labs.fhcrc.org/bolouri
Example:
> biocarta[["epo signaling pathway"]]
"epo signaling pathway" pathway from BioCarta
Number of nodes = 10
Number of edges = 24
Type of identifiers = native
Retrieved on = 2011-05-12
> nodes(biocarta[["epo signaling pathway"]])
[1] "EntrezGene:2056" "EntrezGene:2057"
[3] "EntrezGene:2885" "EntrezGene:3265"
[5] "EntrezGene:6464" "EntrezGene:6654"
[7] "EnzymeConsortium:2.7.1.112" "EnzymeConsortium:3.1.3.48"
[9] "EnzymeConsortium:3.1.4.11" "STAT5"
> PE <- convertIdentifiers(biocarta[["epo signaling pathway"]], type="entrez")
> nodes(PE)
 [1] "2056"   "2057"   "2885"   "3265"   "6464"   "6654"   "52"     "993"
 [9] "994"    "995"    "1843"   "1844"   "1845"   "1846"   "1847"   "1848"
[17] "1849"   "1850"   "1852"   "5770"   "5777"   "5778"   "5781"   "5787"
[25] "5788"   "5792"   "5795"   "5797"   "5798"   "5799"   "5801"   "5803"
[33] "8555"   "8556"   "11072"  "11221"  "56940"  "80824"  "84867"  "5330"
[41] "5331"   "5332"   "5333"   "5335"   "5336"   "23236"  "84812"  "113026"
> PS <- convertIdentifiers(biocarta[["epo signaling pathway"]], type="symbol")
> nodes(PS)
 [1] "EPO"    "EPOR"   "GRB2"   "HRAS"   "SHC1"   "SOS1"   "ACP1"   "CDC25A"
 [9] "CDC25B" "CDC25C" "DUSP1"  "DUSP2"  "DUSP3"  "DUSP4"  "DUSP5"  "DUSP6"
[17] "DUSP7"  "DUSP8"  "DUSP9"  "PTPN1"  "PTPN6"  "PTPN7"  "PTPN11" "PTPRB"
[25] "PTPRC"  "PTPRF"  "PTPRJ"  "PTPRM"  "PTPRN"  "PTPRN2" "PTPRR"  "PTPRZ1"
[33] "CDC14B" "CDC14A" "DUSP14" "DUSP10" "DUSP22" "DUSP16" "PTPN5"  "PLCB2"
[41] "PLCB3"  "PLCB4"  "PLCD1"  "PLCG1"  "PLCG2"  "PLCB1"  "PLCD4"  "PLCD3"
Compare the above with what I get from:
http://www.biocarta.com/pathfiles/PathwayProteinList.asp?showPFID=69
NB: the header is mine & I reordered the table to group similar cases.
GeneDescription   EntrezID   HBcomment
erythropoietin 2056 ***
erythropoietin receptor 2057 ***
growth factor receptor-bound protein 2 2885 ***
son of sevenless homolog 1 (Drosophila) 6654 ***
v-Ha-ras Harvey rat sarcoma viral oncogene homolog 3265 ***
signal transducer and activator of transcription 5A 6776 ***
signal transducer and activator of transcription 5B 6777 ***
SHC (Src homology 2 domain containing) transforming protein 1 6464 ***
v-fos FBJ murine osteosarcoma viral oncogene homolog 2353 ???
v-raf-1 murine leukemia viral oncogene homolog 1 5894 ???
ELK1, member of ETS oncogene family 2002 ???
jun oncogene 3725 ???
casein kinase 2, alpha 1 polypeptide 1457 ???
Janus kinase 2 (a protein tyrosine kinase) 3717 ???
mitogen-activated protein kinase 3 5595 ---
mitogen-activated protein kinase 8 5599 ---
mitogen-activated protein kinase kinase 1 5604 ---
phospholipase C, gamma 1 5335 ok
protein tyrosine phosphatase, non-receptor type 6 5777 ok
HBcomment: ***== in graphite, ???==missing from graphite,
---==specific enzymes in Biocarta are mapped to large (& unrelated?)
families in graphite
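The comparison above can also be made mechanically. Below is a minimal sketch in base R, with both ID lists copied by hand from this post (the variable names are mine, not part of graphite). With graphite loaded, graphite_ids could presumably be taken from nodes(PE) instead. Note that in the converted output above the native "STAT5" node has no Entrez ID, so 6776 and 6777 also show up as missing here.

```r
# Entrez IDs from the Biocarta "PROTEIN LIST" table in this post
biocarta_ids <- c("2056", "2057", "2885", "6654", "3265", "6776", "6777",
                  "6464", "2353", "5894", "2002", "3725", "1457", "3717",
                  "5595", "5599", "5604", "5335", "5777")

# Entrez IDs copied from the nodes(PE) output in this post
graphite_ids <- c("2056", "2057", "2885", "3265", "6464", "6654", "52", "993",
                  "994", "995", "1843", "1844", "1845", "1846", "1847", "1848",
                  "1849", "1850", "1852", "5770", "5777", "5778", "5781", "5787",
                  "5788", "5792", "5795", "5797", "5798", "5799", "5801", "5803",
                  "8555", "8556", "11072", "11221", "56940", "80824", "84867",
                  "5330", "5331", "5332", "5333", "5335", "5336", "23236",
                  "84812", "113026")

# Genes Biocarta lists that the converted graphite pathway lacks
missing_from_graphite <- setdiff(biocarta_ids, graphite_ids)

# Genes present in both lists
shared <- intersect(biocarta_ids, graphite_ids)

# Genes graphite adds (EC family expansion) that Biocarta does not list
extra_in_graphite <- setdiff(graphite_ids, biocarta_ids)

print(missing_from_graphite)
print(shared)
print(length(extra_in_graphite))
```

The same three set operations should work unchanged on symbols (nodes(PS)) if the Biocarta table is first mapped to gene symbols.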
###
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] graphite_1.2.0 AnnotationDbi_1.18.1 Biobase_2.16.0
[4] BiocGenerics_0.2.0 RSQLite_0.11.1 DBI_0.2-5
[7] graph_1.34.0
loaded via a namespace (and not attached):
[1] IRanges_1.14.3 org.Hs.eg.db_2.7.1 stats4_2.15.0 tools_2.15.0
###