STRING_id mismatch with STRINGdb
Samuel Lee • 0

Hi, I'm trying to use STRINGdb to parse the STRING PPI network in R. However, I've run into an issue where the get_neighbors() method fails with the following error: "Error in as.igraph.vs(graph, v) : Invalid vertex names". It seems that none of the mapped STRING_ids are included in the graph, even though the network can be plotted.

My end goal is to be able to programmatically retrieve neighbors (and their interactions) of an arbitrary path length from specified genes.

Any insight as to how I can fix this, or if there is an alternative method I should use, would be appreciated.

library(STRINGdb)

sbd <- STRINGdb$new(
  version = "10",
  species = 9606, 
  score_threshold = 0, 
  input_directory = "A:/~~~~~~~"
  )

data(diff_exp_example1)
# example data from STRINGdb vignette 

example1_mapped <- sbd$map(diff_exp_example1, "gene", removeUnmappedRows = TRUE )

dim(example1_mapped)
# [1] 17748     4

sbd$plot_network(example1_mapped$STRING_id[1:100])
# This works, 100 vertices, 106 edges

sbd$get_neighbors(example1_mapped$STRING_id[1:100])
# Error in as.igraph.vs(graph, v) : Invalid vertex names

sgrph <- sbd$get_graph()

length(igraph::V(sgrph))
# [1] 19247

sum(example1_mapped$STRING_id %in% igraph::V(sgrph))
# [1] 0

sessionInfo()
# R version 3.5.3 (2019-03-11)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows >= 8 x64 (build 9200)
# 
# Matrix products: default
# 
# locale:
#   [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
# [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    
# 
# attached base packages:
#   [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
#   [1] STRINGdb_1.22.0
# 
# loaded via a namespace (and not attached):
#   [1] igraph_1.2.4       hash_2.2.6.1       Rcpp_1.0.1         magrittr_1.5       bit_1.1-14         blob_1.1.1        
# [7] plyr_1.8.4         caTools_1.17.1.2   tools_3.5.3        png_0.1-7          plotrix_3.7-4      KernSmooth_2.23-15
# [13] DBI_1.0.0          gtools_3.8.1       yaml_2.2.0         bit64_0.9-7        digest_0.6.18      RColorBrewer_1.1-2
# [19] bitops_1.0-6       RCurl_1.95-4.12    memoise_1.1.0      RSQLite_2.1.1      gsubfn_0.7         gdata_2.18.0      
# [25] compiler_3.5.3     gplots_3.0.1.1     chron_2.3-53       sqldf_0.4-11       proto_1.0.0        pkgconfig_2.0.2 

I realised that the line

sum(example1_mapped$STRING_id %in% igraph::V(sgrph))

should be

sum(example1_mapped$STRING_id %in% get.vertex.attribute(sgrph, "name"))
# [1] 17528

which does show that the vertex names are in fact comparable... sadly it doesn't get me any closer to working out why the get_neighbors() method fails.
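A small sketch of the same check that also lists which IDs are absent from the graph. It uses a toy igraph graph in place of the STRING network; the names `toy`, `ids`, `present` and `missing` are illustrative only, and `V(g)$name` is used here as an equivalent way to read the "name" vertex attribute.

```r
library(igraph)

# Toy graph standing in for sbd$get_graph(); vertices are A, B, C
toy <- graph_from_literal(A - B, B - C)

# Illustrative IDs; "Z" is deliberately not a vertex of the graph
ids <- c("A", "C", "Z")

# Compare against the vertex *names*, not the vertex sequence itself
vertex_names <- V(toy)$name
present <- ids[ids %in% vertex_names]
missing <- setdiff(ids, vertex_names)
```

With the real data, `ids` would be example1_mapped$STRING_id and `toy` would be sgrph; any entries in `missing` are candidates for triggering the "Invalid vertex names" error.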


It seems that igraph throws an error when one of the requested nodes (proteins) is not connected to anything and is therefore not part of the graph. A workaround is to call the get_neighbors() method with one node at a time and wrap each call in tryCatch().
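A minimal sketch of that per-node workaround. The helper `safe_neighbors()` is hypothetical (it is not part of STRINGdb), and a toy igraph graph stands in for the STRING network so the example runs without downloading anything.

```r
library(igraph)

# Hypothetical helper: call `neighbor_fun` on one ID at a time and
# return character(0) for any ID whose lookup raises an error.
safe_neighbors <- function(ids, neighbor_fun) {
  result <- lapply(ids, function(id) {
    tryCatch(
      neighbor_fun(id),
      error = function(e) character(0)
    )
  })
  names(result) <- ids
  result
}

# Toy graph standing in for the STRING network; vertices are A, B, C
toy <- graph_from_literal(A - B, B - C)
toy_neighbors <- function(id) as_ids(neighbors(toy, id))

# "Z" is not in the graph, so its lookup fails and is caught
res <- safe_neighbors(c("A", "B", "Z"), toy_neighbors)
```

With the STRINGdb object from the question, the same helper could presumably be called as safe_neighbors(example1_mapped$STRING_id[1:100], function(id) sbd$get_neighbors(id)). IDs that are not in the graph then yield an empty result instead of aborting the whole query.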

We are in the process of updating the whole package. This problem should be solved in the next release.
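Until then, for the stated goal of retrieving neighbors at an arbitrary path length together with their interactions, one possible alternative is to work directly on the igraph object returned by get_graph(). The sketch below assumes igraph's ego() and induced_subgraph() and uses a toy graph in place of the STRING network; the seed names and the order of 2 are illustrative.

```r
library(igraph)

# Toy path graph A - B - C - D - E standing in for sbd$get_graph()
toy <- graph_from_literal(A - B - C - D - E)

# Illustrative seeds; "Z" is deliberately absent from the graph
seeds <- c("A", "Z")

# Drop seeds that are not vertices, to avoid "Invalid vertex names"
seeds_in_graph <- intersect(seeds, V(toy)$name)

# All vertices within 2 steps of each seed (each seed is included)
hoods <- ego(toy, order = 2, nodes = seeds_in_graph)
hood_names <- unique(unlist(lapply(hoods, as_ids)))

# Interactions among those vertices, as an edge table
sub <- induced_subgraph(toy, hood_names)
edges <- as_data_frame(sub, what = "edges")
```

With the real network, `toy` would be replaced by sbd$get_graph() and `seeds` by the mapped STRING_id values. Changing `order` changes the path length.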
