customProDB: function PrepareAnnotationEnsembl() not working (biomart issue?)
@laurafancello-22293
Last seen 5.0 years ago

Hi, I am trying to use the PrepareAnnotationEnsembl() function from the customProDB package with this code:

library(customProDB)
options(download.file.method = "libcurl")
ensembl <- useMart("ENSEMBL_MART_ENSEMBL", dataset = "hsapiens_gene_ensembl")
annotation_path <- getwd()
customProDB::PrepareAnnotationEnsembl(mart = ensembl,
                                      annotation_path = annotation_path,
                                      dbsnp = NULL,
                                      splice_matrix = TRUE,
                                      COSMIC = FALSE)

I tried several times and sometimes I got this error:

done
Build TranscriptDB object (txdb.sqlite) ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Batch submitting query [===================================================================>--------------------] 78% eta: 8m
Error in curl::curl_fetch_memory(url, handle = handle) :
  Timeout was reached: [www.ensembl.org:80] Operation timed out after 300000 milliseconds with 0 bytes received

and sometimes this other error:

Batch submitting query [============================================>-----------------------------------------------------] 46% eta: 23m
Error in getBM(attributes = attributes.id, mart = mart, filters = "ensembl_transcript_id", :
  The query to the BioMart webservice returned an invalid result: biomaRt expected a character string of length 1.
  Please report this on the support site at http://support.bioconductor.org

This is my sessionInfo():

R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)

Matrix products: default

locale:
[1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252
[3] LC_MONETARY=French_France.1252 LC_NUMERIC=C
[5] LC_TIME=French_France.1252

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] customProDB_1.24.0   biomaRt_2.40.5       AnnotationDbi_1.46.1 Biobase_2.44.0       IRanges_2.18.3       S4Vectors_0.22.1
[7] BiocGenerics_0.30.0

loaded via a namespace (and not attached):
 [1] SummarizedExperiment_1.14.1 progress_1.2.2              VariantAnnotation_1.30.1    lattice_0.20-38             vctrs_0.2.0
 [6] rtracklayer_1.44.4          GenomicFeatures_1.36.4      blob_1.2.0                  XML_3.98-1.20               rlang_0.4.0
[11] pillar_1.4.2                DBI_1.0.0                   BiocParallel_1.18.1         bit64_0.9-7                 matrixStats_0.55.0
[16] GenomeInfoDbData_1.2.1      plyr_1.8.4                  stringr_1.4.0               zlibbioc_1.30.0             Biostrings_2.52.0
[21] AhoCorasickTrie_0.1.0       memoise_1.1.0               GenomeInfoDb_1.20.0         curl_4.2                    Rcpp_1.0.2
[26] backports_1.1.5             BSgenome_1.52.0             DelayedArray_0.10.0         XVector_0.24.0              bit_1.1-14
[31] Rsamtools_2.0.3             hms_0.5.1                   digest_0.6.21               stringi_1.4.3               GenomicRanges_1.36.1
[36] grid_3.6.1                  tools_3.6.1                 bitops_1.0-6                magrittr_1.5                RCurl_1.95-4.12
[41] tibble_2.1.3                RSQLite_2.1.2               crayon_1.3.4                pkgconfig_2.0.3             zeallot_0.1.0
[46] Matrix_1.2-17               xml2_1.2.2                  prettyunits_1.0.2           assertthat_0.2.1            httr_1.4.1
[51] rstudioapi_0.10             R6_2.4.0                    GenomicAlignments_1.20.1    compiler_3.6.1

Do you have any idea how to solve this issue? Thank you!

customProDB Ensembl connection biomart • 897 views
Mike Smith ★ 6.6k
@mike-smith
Last seen 1 hour ago
EMBL Heidelberg

This has been an increasing problem recently with Ensembl BioMart, where it doesn't seem to cope with large queries.

In the short term you can try one of the mirror sites, which seem a little more responsive, like this:

ensmart <- useEnsembl(biomart = "ensembl", 
                      dataset = 'hsapiens_gene_ensembl', 
                      mirror = 'useast')
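
If a single mirror is also slow, one option is to try several in turn. The following is only a sketch: `connect_first_mirror()` is a hypothetical helper, not part of biomaRt, and the mirror names other than 'useast' are my assumption of what `useEnsembl()` accepts in your installed version, so check its help page first.

```r
# Hypothetical helper: try each Ensembl mirror in turn and return the first
# connection that succeeds. `connect` defaults to biomaRt::useEnsembl and is
# only a parameter so that another connection function can be swapped in.
connect_first_mirror <- function(mirrors = c("useast", "uswest", "asia", "www"),
                                 connect = biomaRt::useEnsembl) {
  for (m in mirrors) {
    mart <- tryCatch(
      connect(biomart = "ensembl",
              dataset = "hsapiens_gene_ensembl",
              mirror = m),
      error = function(e) {
        # Report the failure and fall through to the next mirror.
        message("Mirror '", m, "' failed: ", conditionMessage(e))
        NULL
      }
    )
    if (!is.null(mart)) return(mart)
  }
  stop("No Ensembl mirror could be reached")
}
```

The result of `ensmart <- connect_first_mirror()` can then be passed as the `mart` argument of PrepareAnnotationEnsembl() in place of the original connection.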

I would also suggest upgrading to biomaRt version 2.42.0 as this implements a cache which should allow you to resume failed queries from the point at which they stopped, hopefully saving some time.
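
With the cache in place, simply re-running the failed call becomes cheaper, so a small retry loop may help. This is a minimal sketch: `retry()` is a hypothetical helper written here for illustration, and I am assuming (without having checked) that the queries customProDB issues internally benefit from the biomaRt cache.

```r
# Hypothetical helper: call f() up to max_tries times, pausing `wait` seconds
# between attempts, and return the first successful result.
retry <- function(f, max_tries = 5, wait = 30) {
  for (i in seq_len(max_tries)) {
    # Capture an error as a value instead of aborting the loop.
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    message("Attempt ", i, " failed: ", conditionMessage(result))
    if (i < max_tries) Sys.sleep(wait)
  }
  stop("All ", max_tries, " attempts failed; last error: ",
       conditionMessage(result))
}

# Usage sketch (not run here; needs a working mart connection `ensmart`):
# retry(function() customProDB::PrepareAnnotationEnsembl(
#   mart = ensmart, annotation_path = getwd(),
#   splice_matrix = TRUE, dbsnp = NULL, COSMIC = FALSE))
```

The wait between attempts is arbitrary; a longer pause may be kinder to the Ensembl servers when they are under load.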

I don't know customProDB, but in the longer term there may be a more efficient way for that package to grab this data. BioMart is not really designed for bulk data download, and Ensembl has other ways of accessing its data.
