Hello all,
I am having a lot of trouble accessing biomaRt/Ensembl at the moment.
I have GRCh38 chr:pos SNPs that I would like the dbSNP151 rsIDs for, and to this end have been trying to poll version 96 of Ensembl.
This worked once earlier today for 10 SNPs: 7:128935744:128935744, 3:119518880:119518880, 12:132463597:132463597, 4:40305571:40305571, 17:45379362:45379362, 6:32143247:32143247, 2:191084261:191084261, 6:32286483:32286483, 6:31331721:31331721, 6:32714358:32714358.
Subsequently, I have been unable to change the attributes I request, as the query returns:
Error: biomaRt has encountered an unexpected server error.
Consider trying one of the Ensembl mirrors (for more details look at ?useEnsembl)
I cannot use a different mirror, as it is an archived version of Ensembl.
In an attempt to debug, I have moved to the most recent version of Ensembl, and am now getting:
Error in curl::curl_fetch_memory(url, handle = handle) :
Timeout was reached: [asia.ensembl.org:443] Operation timed out after 300004 milliseconds with 0 bytes received
despite trying different mirrors (www, asia, useast).
The current pipeline for GRCh38 SNPs returning dbSNP154 rsIDs is:
library(gwasrapidd)
library(biomaRt)
SLE <- get_variants(efo_id = "EFO_0002690")
# Build chr:start:end region strings for the first 10 variants
# (column 4 = chromosome, column 5 = position; start and end are the same base)
SLEsnps <- c(paste(SLE@variants[1,4][[1]], SLE@variants[1,5][[1]], SLE@variants[1,5][[1]], sep = ":"))
for (i in 2:10){
  SLEsnps <- append(SLEsnps, paste(SLE@variants[i,4][[1]], SLE@variants[i,5][[1]], SLE@variants[i,5][[1]], sep = ":"))
}
# Connect to the current Ensembl variation mart and look up rsIDs by region
ensembl <- useEnsembl(biomart = 'snps', dataset = 'hsapiens_snp', mirror = "www")
getBM(attributes = c("refsnp_id"), #stable code
      filters = c("chromosomal_region"),
      values = list(SLEsnps),
      mart = ensembl)
Obviously for dbSNP151 rsIDs I would use ensembl <- useEnsembl(biomart = 'snps', dataset = 'hsapiens_snp', version = 96)
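For reference, the region strings could also be built without the loop. This is only a sketch: it assumes the variants table has columns named chromosome_name and chromosome_position (columns 4 and 5 in the loop above), and both the helper name make_regions and the toy data frame are made up for illustration.

```r
# Hypothetical helper: build "chr:start:end" strings for single-base SNPs.
# Assumes `variants` has columns chromosome_name and chromosome_position.
make_regions <- function(variants, n = nrow(variants)) {
  v <- head(variants, n)
  paste(v$chromosome_name, v$chromosome_position, v$chromosome_position,
        sep = ":")
}

# Toy stand-in for SLE@variants, using three of the positions listed above
toy <- data.frame(
  chromosome_name = c("7", "3", "12"),
  chromosome_position = c(128935744L, 119518880L, 132463597L),
  stringsAsFactors = FALSE
)
regions <- make_regions(toy)
print(regions)
```

With the real object this would presumably be called as make_regions(SLE@variants, 10), provided the column names match.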
What on earth am I doing wrong to have recurrent time-outs and internal server errors?
sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods
[9] base
other attached packages:
[1] BSgenome_1.58.0 rtracklayer_1.50.0 Biostrings_2.58.0
[4] XVector_0.30.0 GenomicRanges_1.42.0 GenomeInfoDb_1.26.7
[7] IRanges_2.24.1 S4Vectors_0.28.1 BiocGenerics_0.36.1
[10] Matrix_1.3-4 biomaRt_2.46.3 gwasrapidd_0.99.11
loaded via a namespace (and not attached):
[1] Rcpp_1.0.7 lattice_0.20-44
[3] prettyunits_1.1.1 Rsamtools_2.6.0
[5] assertthat_0.2.1 utf8_1.2.2
[7] BiocFileCache_1.14.0 R6_2.5.1
[9] RSQLite_2.2.8 httr_1.4.2
[11] pillar_1.6.2 zlibbioc_1.36.0
[13] rlang_0.4.11 progress_1.2.2
[15] curl_4.3.2 rstudioapi_0.13
[17] blob_1.2.2 BiocParallel_1.24.1
[19] stringr_1.4.0 RCurl_1.98-1.4
[21] bit_4.0.4 tinytex_0.33
[23] DelayedArray_0.16.3 compiler_4.0.4
[25] xfun_0.25 pkgconfig_2.0.3
[27] askpass_1.1 SummarizedExperiment_1.20.0
[29] openssl_1.4.5 tidyselect_1.1.1
[31] tibble_3.1.4 GenomeInfoDbData_1.2.4
[33] matrixStats_0.60.1 XML_3.99-0.7
[35] fansi_0.5.0 withr_2.4.2
[37] crayon_1.4.1 dplyr_1.0.7
[39] dbplyr_2.1.1 GenomicAlignments_1.26.0
[41] bitops_1.0-7 rappdirs_0.3.3
[43] grid_4.0.4 lifecycle_1.0.0
[45] DBI_1.1.1 magrittr_2.0.1
[47] cli_3.0.1 stringi_1.7.4
[49] cachem_1.0.6 xml2_1.3.2
[51] ellipsis_0.3.2 generics_0.1.0
[53] vctrs_0.3.8 tools_4.0.4
[55] bit64_4.0.5 Biobase_2.50.0
[57] glue_1.4.2 purrr_0.3.4
[59] MatrixGenerics_1.2.1 hms_1.1.0
[61] fastmap_1.1.0 AnnotationDbi_1.52.0
[63] BiocManager_1.30.16 memoise_2.0.0
Hey team,
Brief update --
I think the errors I am having are related to Ensembl (?) server load and time-outs.
By limiting the request to 5 SNPs I get reliable responses on Ensembl 104 (most recent build), with no time-outs.
Unfortunately, using archived versions is still a bit flaky, e.g.
useEnsembl(biomart = 'snps', dataset = 'hsapiens_snp', version = 96)
throws an internal server error after trying several servers, but
useEnsembl(biomart = 'snps', dataset = 'hsapiens_snp', version = 95)
works fine for the useEnsembl() portion, but then throws
Error: biomaRt has encountered an unexpected server error.
when a getBM() query is submitted for 5 SNPs. It works fine for a single SNP.
All a little odd -- is biomaRt usually this limited in its throughput?
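In case it helps anyone hitting the same thing, the 5-SNP workaround could be wrapped up roughly as below. This is a sketch under assumptions, not a confirmed fix for the archive errors: the helper name query_in_batches and its arguments (batch_size, max_tries, wait) are hypothetical, and the retry-with-pause behaviour is my own addition rather than anything biomaRt provides.

```r
# Hypothetical helper: run `query_fun` on small batches of `values`,
# retrying a failed batch a few times before giving up on it.
query_in_batches <- function(values, query_fun, batch_size = 5,
                             max_tries = 3, wait = 10) {
  # Split values into consecutive groups of at most batch_size
  batches <- split(values, ceiling(seq_along(values) / batch_size))
  results <- lapply(batches, function(batch) {
    for (attempt in seq_len(max_tries)) {
      res <- tryCatch(query_fun(batch), error = function(e) NULL)
      if (!is.null(res)) return(res)
      if (attempt < max_tries) Sys.sleep(wait)  # pause before retrying
    }
    warning("Batch failed after ", max_tries, " tries: ",
            paste(batch, collapse = ", "))
    NULL
  })
  # Drop failed batches (NULL) and row-bind whatever succeeded
  results <- Filter(Negate(is.null), unname(results))
  do.call(rbind, results)
}

# Usage with biomaRt (needs network access; not run here):
# rsids <- query_in_batches(SLEsnps, function(batch) {
#   getBM(attributes = "refsnp_id", filters = "chromosomal_region",
#         values = list(batch), mart = ensembl)
# })
```

Passing the query as a function keeps the batching logic separate from biomaRt, so it can be tried offline with a dummy function first. A batch that keeps failing is reported with a warning and skipped rather than stopping the whole run.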