Hi,
I wanted to replicate the GO analysis done through the Gene Ontology website (http://geneontology.org/), but using R/Bioconductor. For the 'original' GO analysis, I just copied and pasted my genes into the website, used all defaults (i.e. 'biological process', 'Homo sapiens', panther.db), and clicked 'Launch'. This resulted in a set of significant GO terms.
I am now trying to replicate this analysis in R/Bioconductor with the topGO package, but I get an error. My code is:
library(topGO)
library(PANTHER.db)
pthOrganisms(PANTHER.db) <- "HUMAN"
PANTHER.db
allpanther <- keys(PANTHER.db,keytype="ENTREZ")
## myentrezGenes - my genes of interest
idx <- allpanther %in% myentrezGenes
genesidx <- factor(as.integer(idx))
names(genesidx) <- allpanther
tgd <- new( "topGOdata", ontology="BP", allGenes = genesidx, nodeSize=5,
annot=annFUN.org, mapping="PANTHER.db")
> Building most specific GOs .....
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'conn' in selecting a method for function 'dbGetQuery': object 'PANTHER_dbconn' not found
My questions:
Is this the best way to replicate the results I get from the GO website (http://geneontology.org/)? Or is there an API that I can use to get the results programmatically? Or some other Bioconductor package?
How can I rectify my code?
Thanks for your help!!
My sessionInfo() output is:
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods
[9] base
other attached packages:
[1] PANTHER.db_1.0.10 RSQLite_2.2.0 AnnotationHub_2.20.2 BiocFileCache_1.12.1
[5] dbplyr_1.4.4 org.Hs.eg.db_3.11.4 topGO_2.40.0 SparseM_1.78
[9] GO.db_3.11.4 AnnotationDbi_1.50.3 IRanges_2.22.2 S4Vectors_0.26.1
[13] Biobase_2.48.0 graph_1.66.0 BiocGenerics_0.34.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 later_1.1.0.1
[3] BiocManager_1.30.10 compiler_4.0.2
[5] pillar_1.4.6 tools_4.0.2
[7] digest_0.6.25 bit_4.0.4
[9] memoise_1.1.0 tibble_3.0.3
[11] lifecycle_0.2.0 lattice_0.20-41
[13] pkgconfig_2.0.3 rlang_0.4.7
[15] shiny_1.5.0 DBI_1.1.0
[17] rstudioapi_0.11 yaml_2.2.1
[19] curl_4.3 fastmap_1.0.1
[21] httr_1.4.2 dplyr_1.0.2
[23] rappdirs_0.3.1 generics_0.0.2
[25] vctrs_0.3.4 bit64_4.0.5
[27] grid_4.0.2 tidyselect_1.1.0
[29] glue_1.4.2 R6_2.4.1
[31] purrr_0.3.4 blob_1.2.1
[33] magrittr_1.5 promises_1.1.1
[35] htmltools_0.5.0 matrixStats_0.56.0
[37] ellipsis_0.3.1 assertthat_0.2.1
[39] xtable_1.8-4 mime_0.9
[41] interactiveDisplayBase_1.26.3 httpuv_1.5.4
[43] BiocVersion_3.11.1 crayon_1.3.4
If you provide a function to geneSel, it will not be used (see these other questions: https://support.bioconductor.org/p/91273/, https://support.bioconductor.org/p/105667/#105733). The geneSel argument isn't applicable if you are doing a KS test, because that test uses the scores directly instead of generating a contingency table. Try your example using the Fisher test.
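To make the difference concrete, here is a minimal base-R sketch of the two kinds of test. It is not topGO's internal code: the gene names, scores, GO term membership, and the helper names fisher_p and ks_p are all made up for illustration. It only shows that a selection function changes a Fisher-type result, while a KS-type test never looks at it.

```r
# Hypothetical per-gene scores (e.g. p-values); names are made-up gene IDs.
scores <- c(g1 = 0.001, g2 = 0.004, g3 = 0.01, g4 = 0.03,
            g5 = 0.20,  g6 = 0.35,  g7 = 0.50, g8 = 0.60,
            g9 = 0.70,  g10 = 0.80, g11 = 0.90, g12 = 0.95)

# Hypothetical membership of one GO term.
in_term <- names(scores) %in% c("g1", "g2", "g3", "g6")

# Fisher-type test: the selection function splits genes into
# "significant" / "not significant", giving a 2x2 contingency table.
fisher_p <- function(scores, in_term, gene_sel) {
  selected <- gene_sel(scores)
  tab <- table(
    selected = factor(selected, levels = c(TRUE, FALSE)),
    in_term  = factor(in_term,  levels = c(TRUE, FALSE))
  )
  fisher.test(tab, alternative = "greater")$p.value
}

# KS-type test: compares the score distributions directly,
# so there is no selection step at all.
ks_p <- function(scores, in_term) {
  ks.test(scores[in_term], scores[!in_term],
          alternative = "greater")$p.value
}

# Two different selection functions give two different Fisher p-values ...
p_strict <- fisher_p(scores, in_term, function(s) s < 0.05)
p_loose  <- fisher_p(scores, in_term, function(s) s < 0.40)

# ... while the KS-type p-value does not depend on any selection function.
p_ks <- ks_p(scores, in_term)

print(c(fisher_strict = p_strict, fisher_loose = p_loose, ks = p_ks))
```

In topGO terms, this is why a geneSel function has an effect with the Fisher statistic but none with the KS statistic.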