Sara JC Gosline
Hello again,
I have recently installed RpsiXML and used it to successfully parse the
latest XML files from IntAct. However, when I try the same functions
with the latest version of BioGRID (to obtain assay-specific
interactions instead of experiment-specific ones), I get a graph with a
single node "NA" and 1 interaction. SessionInfo is at the end of the
email.
***Parsing xml files to graph:
I used the "PCA" file since it is relatively short:
> g<-psimi25XML2Graph('../biogrid/psiml25/BIOGRID-SYSTEM-PCA-2.0.59.psi25.xml',BIOGRID.PSIMI25,type='interaction',verbose=T)
1 Entries found
Parsing entry 1
Parsing experiments: ...............................................
Parsing interactors:
100% ========================================>
Parsing interactions:
100% ========================================>
> g
[1] "psimi25Graph"
attr(,"package")
[1] "RpsiXML"
> nodes(g)
[1] "NA"
> edges(g)
$`NA`
[1] "NA"
***Parsing xml file without graph:
To determine if this is something wrong with the parsing, I redo the
parsing without formatting to a graph object:
> g<-parsePsimi25Interaction('../biogrid/psiml25/BIOGRID-SYSTEM-PCA-2.0.59.psi25.xml',BIOGRID.PSIMI25,verbose=T)
Here is the first bit of output:
> g
==================================
interaction entry ( 2009-11-25 ):
==================================
[ organism ]: Arabidopsis thaliana Saccharomyces cerevisiae Schizosaccharomyces pombe
[ taxonomy ID ]: 3702 4932 4896
[ interactors ]: there are 1214 interactors in total, here are the first few ones:
     sourceDb sourceId shortLabel uniprotId organismName               taxId
<NA> ""       "1"      "BZR1"     NA        "Arabidopsis thaliana"     "3702"
<NA> ""       "2"      "GRF6"     NA        "Arabidopsis thaliana"     "3702"
<NA> ""       "3"      "FUN14"    NA        "Saccharomyces cerevisiae" "4932"
<NA> ""       "4"      "UIP4"     NA        "Saccharomyces cerevisiae" "4932"
<NA> ""       "5"      "ALO1"     NA        "Saccharomyces cerevisiae" "4932"
<NA> ""       "6"      "SPO7"     NA        "Saccharomyces cerevisiae" "4932"
...
[ interactions ]: there are 2736 interactions in total, here are the first few ones:
[[1]]
interaction ( NA ):
---------------------------------
[ source database ]:
[ source experiment ID ]: 1
[ interaction type ]: protein complementation assay
[ experiment ]: pubmed 17681130
[ participant ]: NA NA
[ bait ]: 1
[ bait UniProt ]: NA
[ prey ]: 2
[ prey UniProt ]: NA
So the interactors and interactions are being parsed correctly, but
they are not being retrieved properly. When I look at the attributes of
each interaction, I get mostly NAs:
> attributes(g@interactions[[1]])
$sourceDb
[1] ""
$sourceId
[1] NA
$interactionType
[1] "protein complementation assay"
$expPubMed
[1] "17681130"
$expSourceId
[1] "1"
$confidenceValue
[1] NA
$participant
<NA> <NA>
  NA   NA
$bait
[1] "1"
$baitUniProt
[1] NA
$prey
[1] "2"
$preyUniProt
[1] NA
$inhibitor
[1] NA
$neutralComponent
[1] NA
$class
[1] "psimi25Interaction"
attr(,"package")
[1] "RpsiXML"
***Conclusion:
Is there an easy workaround for this? Maybe where I can manually look
up identifiers?
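To make the idea concrete, here is a rough sketch of the kind of manual lookup I have in mind. It assumes that the bait and prey values ("1" and "2" above) refer to the sourceId column of the interactor table, which the output suggests but which I have not confirmed. The data frames are small mock-ups typed in from the output above rather than the real parsed object, the second bait/prey pair is made up for illustration, and idToLabel is a hypothetical helper name, not an RpsiXML function.

```r
# Mock-up of the first rows of the interactor table printed above.
interactors <- data.frame(
  sourceId   = c("1", "2", "3", "4"),
  shortLabel = c("BZR1", "GRF6", "FUN14", "UIP4"),
  stringsAsFactors = FALSE
)

# Bait/prey ids per interaction. The first pair is interaction [[1]] above;
# the second pair is invented purely for illustration.
# With the real object these might be collected with something like
#   sapply(g@interactions, function(x) x@bait)
# (assumption: bait and prey are plain slots, as attributes() suggests).
pairs <- data.frame(
  bait = c("1", "3"),
  prey = c("2", "4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map internal ids to short labels via sourceId.
# Ids with no match come back as NA.
idToLabel <- function(ids, table) {
  table$shortLabel[match(ids, table$sourceId)]
}

# Edge list with readable names instead of internal ids.
edgeList <- data.frame(
  bait = idToLabel(pairs$bait, interactors),
  prey = idToLabel(pairs$prey, interactors),
  stringsAsFactors = FALSE
)
print(edgeList)
```

If that assumption holds, an edge list like this could presumably be turned into a graph by hand, but I would rather not bypass the package if there is a proper fix.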
Thanks,
sara
***SessionInfo:
> sessionInfo()
R version 2.8.1 (2008-12-22)
x86_64-unknown-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] grid splines tools stats graphics grDevices utils
[8] datasets methods base
other attached packages:
[1] gtools_2.5.0-1 multicore_0.1-3 ppiStats_1.8.0
[4] RColorBrewer_1.0-2 lattice_0.17-17 ScISI_1.14.0
[7] apComplex_2.8.0 ppiData_0.1.13 Rgraphviz_1.20.4
[10] org.Sc.sgd.db_2.2.6 GOstats_2.8.0 Category_2.8.4
[13] genefilter_1.22.0 survival_2.34-1 GO.db_2.2.5
[16] RSQLite_0.7-1 DBI_0.2-4 RpsiXML_1.0.0
[19] RBGL_1.20.0 hypergraph_1.14.0 graph_1.20.0
[22] XML_2.3-0 annotate_1.20.1 xtable_1.5-6
[25] AnnotationDbi_1.4.3 Biobase_2.2.2
loaded via a namespace (and not attached):
[1] cluster_1.11.11 GSEABase_1.4.0