Hello! I am using GOstats to do gene set enrichment and I would like to make subset GO term graphs (graphNEL) to then import into Cytoscape. My problem is that when I select desirable nodes from the html report or a CSV file made from the summary I get an error that the nodes aren't present. The nodes could be in any of the other graphNEL made by termGraphs Maybe this is not the best way to do this. Is there a way to get the graphNEL for a group of nodes (GOBPIDs) in an easier way?
params1 <- new("GOHyperGParams", geneIds=q_selectedEntrezIds, universeGeneIds=chipEntrezUniverse, annotation="org.Rn.eg.db", ontology="BP", pvalueCutoff=hgCutoff, conditional=FALSE, testDirection="over")
hgOver1 <- hyperGTest(params1)
## Generate hmtl result files
htmlReport(hgOver1, file=paste(dt,"_MH_malevfemale1_hgOver_BP_q_v_all_over.html", sep=""))
# Get the tables out
dat <- as.data.frame(summary(hgOver1))
write.table(dat, file=paste(dt, "_MH_malevfemale1_hgOver_BP_q_v_all_over.txt", sep=""), sep="\t", col.names=T, row.names=T)
# Ok get the graph out
y1 = termGraphs(hgOver1, use.terms=FALSE, pvalue=hgCutoff)
a1 <- y1$`1`
sub_list <- read.csv("GOBP_list_subset.txt", sep="\t", header=T)
sub <- as.vector(unlist(sub_list$GOBPID))
## I've tried a number of ways to subset and the common error (if it works) is snodes missing. I can confirm that the nodes exist as results in the html report.
test <- subGraph(sub, a1)
Error in subGraph(sub, a1) :
'snodes' contains nodes not in graph: ‘GO:0051649’, ‘GO:0051654’, ‘GO:0051668’, ‘GO:0071294’, ‘GO:0071420’, ‘GO:0071702’, ‘GO:0071705’, ‘GO:0072384’, ‘GO:0072657’, ‘GO:0097106’, ‘GO:0097112’, ‘GO:0098693’, ‘GO:0098814’, ‘GO:0098930’, ‘GO:0099185’, ‘GO:0099538’, ‘GO:0099552’, ‘GO:0099553’, ‘GO:0140056’, ‘GO:0140238’, ‘GO:1900273’, ‘GO:1901385’, ‘GO:1902632’, ‘GO:1903421’, ‘GO:1905874’, ‘GO:1990778’, ‘GO:1903937’, ‘GO:1904071’, ‘GO:1990708’, ‘GO:0098953’, ‘GO:0098970’, ‘GO:0099628’, ‘GO:0007196’, ‘GO:0099011’, ‘GO:0099525’, ‘GO:0045162’, ‘GO:0098943’, ‘GO:0006682’, ‘GO:0019375’, ‘GO:0031632’, ‘GO:1901634’, ‘GO:1990926’, ‘GO:1990504’, ‘GO:0046684’, ‘GO:0099502’, ‘GO:0106020’, ‘GO:1902474’, ‘GO:0007158’, ‘GO:0048791’, ‘GO:2000969’, ‘GO:0014051’, ‘GO:1990709’, ‘GO:0021578’,
## OR this
testgr <- inducedTermGraph(hgOver1, id=sub)
Error in subGraph(wantedNodes, g) :
'snodes' contains nodes not in graph: ‘GO:0051654’, ‘GO:0071294’, ‘GO:0071420’, ‘GO:0072384’, ‘GO:0097112’, ‘GO:0098930’, ‘GO:0099185’, ‘GO:0099538’, ‘GO:0099552’, ‘GO:0099553’, ‘GO:1901385’, ‘GO:1903421’, ‘GO:1905874’, ‘GO:1903937’, ‘GO:1904071’, ‘GO:0099011’, ‘GO:0099525’, ‘GO:0006682’, ‘GO:0019375’, ‘GO:0031632’, ‘GO:1901634’, ‘GO:1990926’, ‘GO:1990504’, ‘GO:0099502’, ‘GO:0106020’, ‘GO:1902474’, ‘GO:0048791’, ‘GO:2000969’, ‘GO:1990709’, ‘GO:0046476’, ‘GO:0099541’, ‘GO:0099542’, ‘GO:0031630’, ‘GO:1901632’, ‘GO:0006681’, ‘GO:0019374’, ‘GO:0048790’, ‘GO:0061669’, ‘GO:0016082’, ‘GO:0014049’, ‘GO:0048172’, ‘GO:0098917’, ‘GO:0021516’, ‘GO:0034776’, ‘GO:0099151’, ‘GO:0031340’, ‘GO:0033604’, ‘GO:0045956’, ‘GO:0097091’
sessionInfo( )
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] Rgraphviz_2.40.0 xtable_1.8-4 genefilter_1.78.0
[4] GOstats_2.62.0 graph_1.74.0 Category_2.62.0
[7] Matrix_1.5-3 annotate_1.74.0 XML_3.99-0.12
[10] GO.db_3.15.0 org.Rn.eg.db_3.15.0 AnnotationDbi_1.58.0
[13] IRanges_2.30.1 S4Vectors_0.34.0 Biobase_2.56.0
[16] BiocGenerics_0.42.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.9 compiler_4.2.1 GenomeInfoDb_1.32.4
[4] XVector_0.36.0 bitops_1.0-7 tools_4.2.1
[7] zlibbioc_1.42.0 bit_4.0.5 RSQLite_2.2.19
[10] memoise_2.0.1 lattice_0.20-45 pkgconfig_2.0.3
[13] png_0.1-7 rlang_1.0.6 DBI_1.1.3
[16] cli_3.4.1 fastmap_1.1.0 GenomeInfoDbData_1.2.8
[19] httr_1.4.4 Biostrings_2.64.1 vctrs_0.5.1
[22] bit64_4.0.5 GSEABase_1.58.0 R6_2.5.1
[25] survival_3.4-0 RBGL_1.72.0 blob_1.2.3
[28] splines_4.2.1 KEGGREST_1.36.3 AnnotationForge_1.38.1
[31] RCurl_1.98-1.9 cachem_1.0.6 crayon_1.5.2
Thanks!
Hi James! Thank you! So interestingly if you make an html report or csv from the summary it will have GO terms that you would like as nodes in a directed acyclic graph. The subset comes from the summary. when I try to make a subset graph, I get 'contains nodes not in graph error' even though the nodes were taken from the summary. Is this not possible. Is there another way to make DAG from GO IDs using a bioconductor resource (GO.db)?
Did the code I provided not do what you are asking?
Hello! that is the code I used, except I am looking at Rat data. Its hard without real data. So there are GOBPID that are in the html reports and summary that are not accessible in the graphNEL. There can be a number of subgraphs
1
2
and so on. Which there isn't really a good explanation for them, but in the one with the largest numbers of nodes you would expect to have nodes with the largest odds ratio, but they are not there AND they are not in any of the subgraphs. Basically, I was wondering if there was a way to 'hack' it to have all of the nodes for GO BP in one graphNEL to then subset however. I really appreciate your help.I still have no idea what you are talking about. It works identically for rat.
Those are either the
graphNEL
for the first GO term, or thegraphNEL
for all of the GO terms in the 'top' object.Hi James! I apologize for not being clear. Is there a way to get all of the potential nodes into a single graphNEL which can subsequently be subset? In your example above it splits the top 6 into a number of graphs.
OK, maybe I get what you want. I think you want all of the nodes from a summary object in one graph, right?
The above assumes you have already run the code based on the Rat
orgDb
that I already supplied.Hi James! yup that was it. thank you!