I am trying to build an up-to-date EnsDb following the vignette. I have the Ensembl Perl API installed. fetchTablesFromEnsembl() is running, but extremely slowly: after about 2 hours, I have only about 3 MB of text files.
> fetchTablesFromEnsembl(90, species = "mouse")
Connecting to ensembldb.ensembl.org at port 5306
# get_gene_transcript_exon_tables.pl version 0.3.0:
Retrieve gene models for Ensembl version 90, species mouse from Ensembl database at host: ensembldb.ensembl.org
Start fetching data
$ du -shc *.txt
512B ens_chromosome.txt
15K ens_entrezgene.txt
611K ens_exon.txt
36K ens_gene.txt
860K ens_protein.txt
477K ens_protein_domain.txt
249K ens_tx.txt
604K ens_tx2exon.txt
79K ens_uniprot.txt
2.9M total
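For scale, here is a rough back-of-the-envelope estimate of the transfer rate implied by those numbers. It assumes that "about 2 hours" means 7200 seconds and that the "2.9M" reported by du means 2.9 MiB; the variable names are purely illustrative.

```r
# Approximate throughput of the fetch so far (illustrative estimate only)
total_bytes   <- 2.9 * 1024^2      # "2.9M total" from du, assumed to be MiB
elapsed_sec   <- 2 * 60 * 60       # "about 2 hours" of run time
bytes_per_sec <- total_bytes / elapsed_sec

# Report the rate in bytes per second
cat(sprintf("%.0f bytes/s\n", bytes_per_sec))
```

That is only a few hundred bytes per second, far below what the link itself should sustain.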
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
76085 esiefker 1 20 0 139M 76464K sbwait 0 1:28 0.91% perl
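Perl is spending all its time in the 'sbwait' state (this is on FreeBSD 11), which as far as I can tell means it is blocked waiting for data on a network socket rather than doing any work locally. Any ideas on how to improve this?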
My sessionInfo() output follows. (The forum initially rejected it with the message:
"Language "af" is not one of the supported languages ['en']!")
> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: amd64-portbld-freebsd11.0 (64-bit)
Running under: FreeBSD bio 11.0-STABLE FreeBSD 11.0-STABLE #0 r321665+25fe8ba8d06(freenas/11.0-stable): Mon Sep 25 06:24:11 UTC 2017 root@gauntlet:/freenas-11-releng/freenas/_BE/objs/freenas-11-releng/freenas/_BE/os/sys/FreeNAS.amd64 amd64
Matrix products: default
LAPACK: /usr/local/lib/R/lib/libRlapack.so
locale:
[1] C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] ensembldb_2.0.4 AnnotationFilter_1.0.0 GenomicFeatures_1.28.5
[4] AnnotationDbi_1.38.2 Biobase_2.36.2 GenomicRanges_1.28.6
[7] GenomeInfoDb_1.12.3 IRanges_2.10.5 S4Vectors_0.14.7
[10] BiocGenerics_0.22.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.13 BiocInstaller_1.26.1
[3] AnnotationHub_2.8.3 compiler_3.4.2
[5] XVector_0.16.0 ProtGenerics_1.8.0
[7] bitops_1.0-6 tools_3.4.2
[9] zlibbioc_1.22.0 biomaRt_2.32.1
[11] digest_0.6.12 bit_1.1-12
[13] RSQLite_2.0 memoise_1.1.0
[15] tibble_1.3.4 lattice_0.20-35
[17] rlang_0.1.2 Matrix_1.2-11
[19] shiny_1.0.5 DelayedArray_0.2.7
[21] DBI_0.7 curl_3.0
[23] yaml_2.1.14 GenomeInfoDbData_0.99.0
[25] httr_1.3.1 rtracklayer_1.36.6
[27] Biostrings_2.44.2 bit64_0.9-7
[29] grid_3.4.2 R6_2.2.2
[31] XML_3.98-1.9 BiocParallel_1.10.1
[33] blob_1.1.0 htmltools_0.3.6
[35] Rsamtools_1.28.0 matrixStats_0.52.2
[37] GenomicAlignments_1.12.2 SummarizedExperiment_1.6.5
[39] xtable_1.8-2 mime_0.5
[41] interactiveDisplayBase_1.14.0 httpuv_1.3.5
[43] RCurl_1.95-4.8 lazyeval_0.2.0
>