I have 26 GB of RAM available on my laptop and I am trying to import a 1.8 GB VCF file using the readVcf function from the VariantAnnotation package. So far the process has been killed every time. I guess it might be a memory issue, but I don't understand why, because I think I should have enough memory. I am running this in a terminal on Ubuntu.
Any ideas?
# include your problematic code here with any corresponding output
vcf <- readVcf('EA1EA2_all90_filt_map.recode.vcf')
Processus arrêté (French shell message, in English: "Process killed")
# please also include the results of running the following in an R session
sessionInfo( )
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
[3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
[5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
[7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] rtracklayer_1.46.0 VariantAnnotation_1.32.0
[3] GenomicAlignments_1.22.1 Rsamtools_2.2.3
[5] Biostrings_2.54.0 XVector_0.26.0
[7] SummarizedExperiment_1.16.1 DelayedArray_0.12.3
[9] BiocParallel_1.20.1 matrixStats_0.61.0
[11] Biobase_2.46.0 GenomicRanges_1.38.0
[13] GenomeInfoDb_1.22.1 IRanges_2.20.2
[15] S4Vectors_0.24.4 BiocGenerics_0.32.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.7 lattice_0.20-44 prettyunits_1.1.1
[4] assertthat_0.2.1 utf8_1.2.2 BiocFileCache_1.10.2
[7] R6_2.5.1 RSQLite_2.2.8 httr_1.4.2
[10] pillar_1.6.2 zlibbioc_1.32.0 rlang_0.4.11
[13] GenomicFeatures_1.38.2 progress_1.2.2 curl_4.3.2
[16] blob_1.2.2 Matrix_1.3-4 stringr_1.4.0
[19] RCurl_1.98-1.5 bit_4.0.4 biomaRt_2.42.1
[22] compiler_3.6.3 pkgconfig_2.0.3 askpass_1.1
[25] openssl_1.4.5 tidyselect_1.1.1 tibble_3.1.4
[28] GenomeInfoDbData_1.2.2 XML_3.99-0.3 fansi_0.5.0
[31] crayon_1.4.1 dplyr_1.0.7 dbplyr_2.1.1
[34] bitops_1.0-7 rappdirs_0.3.3 grid_3.6.3
[37] lifecycle_1.0.0 DBI_1.1.1 magrittr_2.0.1
[40] stringi_1.7.4 cachem_1.0.6 ellipsis_0.3.2
[43] generics_0.1.0 vctrs_0.3.8 tools_3.6.3
[46] bit64_4.0.5 BSgenome_1.54.0 glue_1.4.2
[49] purrr_0.3.4 hms_1.1.0 fastmap_1.1.0
[52] AnnotationDbi_1.48.0 memoise_2.0.0
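On the memory question above: the size of a VCF on disk is not a reliable guide to its size once parsed. As a rough sketch (base R only), the snippet below estimates how much memory a single genotype field such as GT could occupy when held as a character matrix of variants by samples. The helper name `estimate_geno_bytes` and the values of `n_variants` and `n_samples` are placeholders for illustration, not numbers taken from the real file, and the estimate ignores any temporary copies made while parsing.

```r
# Hypothetical helper: estimate the in-memory size of one genotype field
# (e.g. GT) stored as a character matrix with one row per variant and
# one column per sample.
estimate_geno_bytes <- function(n_variants, n_samples, example_value = "0/1") {
  # Measure a small probe matrix and scale up, instead of allocating
  # the full-size matrix.
  probe_rows <- 1000L
  probe <- matrix(example_value, nrow = probe_rows, ncol = n_samples)
  bytes_per_row <- as.numeric(object.size(probe)) / probe_rows
  bytes_per_row * n_variants
}

# Placeholder dimensions; replace with the real variant and sample counts.
est <- estimate_geno_bytes(n_variants = 5e6, n_samples = 90)
cat(sprintf("about %.1f GB for one genotype field\n", est / 1024^3))
```

Each additional FORMAT or INFO field adds its own matrix or column on top of this, so the total for a full `readVcf()` call may be several times larger than the estimate for one field.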
Hi,
Thanks for the comment. I updated R and Bioconductor; here is the new sessionInfo():
sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
[3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
[5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
[7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] rtracklayer_1.52.1 VariantAnnotation_1.38.0
[3] GenomicAlignments_1.28.0 Rsamtools_2.8.0
[5] Biostrings_2.60.2 XVector_0.32.0
[7] SummarizedExperiment_1.22.0 Biobase_2.52.0
[9] MatrixGenerics_1.4.3 matrixStats_0.61.0
[11] GenomicRanges_1.44.0 GenomeInfoDb_1.28.4
[13] IRanges_2.26.0 S4Vectors_0.30.0
[15] BiocGenerics_0.38.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.7 lattice_0.20-45 prettyunits_1.1.1
[4] png_0.1-7 assertthat_0.2.1 digest_0.6.27
[7] utf8_1.2.2 BiocFileCache_2.0.0 R6_2.5.1
[10] RSQLite_2.2.8 httr_1.4.2 pillar_1.6.2
[13] zlibbioc_1.38.0 rlang_0.4.11 GenomicFeatures_1.44.2
[16] progress_1.2.2 curl_4.3.2 rstudioapi_0.13
[19] blob_1.2.2 Matrix_1.3-4 BiocParallel_1.26.2
[22] stringr_1.4.0 RCurl_1.98-1.5 bit_4.0.4
[25] biomaRt_2.48.3 DelayedArray_0.18.0 compiler_4.1.1
[28] pkgconfig_2.0.3 tidyselect_1.1.1 KEGGREST_1.32.0
[31] tibble_3.1.4 GenomeInfoDbData_1.2.6 XML_3.99-0.8
[34] fansi_0.5.0 crayon_1.4.1 dplyr_1.0.7
[37] dbplyr_2.1.1 bitops_1.0-7 rappdirs_0.3.3
[40] grid_4.1.1 lifecycle_1.0.0 DBI_1.1.1
[43] magrittr_2.0.1 stringi_1.7.4 cachem_1.0.6
[46] xml2_1.3.2 ellipsis_0.3.2 filelock_1.0.2
[49] vctrs_0.3.8 generics_0.1.0 rjson_0.2.20
[52] restfulr_0.0.13 tools_4.1.1 bit64_4.0.5
[55] BSgenome_1.60.0 glue_1.4.2 purrr_0.3.4
[58] hms_1.1.0 yaml_2.2.1 fastmap_1.1.0
[61] AnnotationDbi_1.54.1 memoise_2.0.0 BiocIO_1.2.0
I also tried using the VcfFile function, but I still have the same issue.
In parallel, I ran the same script on a cluster and it worked without any problem, so it is either a memory problem (but again, I have 26 GB of RAM available and the VCF file is 1.6 GB) or maybe something is wrong with my installation?
Any ideas?
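For reference, one way to test the memory hypothesis is to read the file in chunks and import only the fields that are needed. The sketch below is a minimal example, not a confirmed fix: it uses `VcfFile()` with `yieldSize` and a `ScanVcfParam()` that keeps only the GT field. The helper name `count_vcf_records` and the chunk size are assumptions made for illustration, and the demo runs on the small bgzip-compressed example file shipped with VariantAnnotation, not on the file from this post. Whether the loop behaves the same on an uncompressed .vcf should be checked; `Rsamtools::bgzip()` can compress the file first if needed.

```r
library(VariantAnnotation)

# Hypothetical helper: stream a VCF in chunks and count its records, so that
# at most `yield_size` records are held in memory at any one time.
count_vcf_records <- function(path, yield_size = 10000L,
                              param = ScanVcfParam()) {
  vcf_file <- VcfFile(path, yieldSize = yield_size)
  open(vcf_file)
  on.exit(close(vcf_file), add = TRUE)

  total <- 0L
  repeat {
    # Each call returns the next chunk; an empty chunk marks the end of file.
    chunk <- readVcf(vcf_file, param = param)
    if (nrow(chunk) == 0L) break
    total <- total + nrow(chunk)
  }
  total
}

# Import no INFO fields and only the GT genotype field to reduce memory use.
param <- ScanVcfParam(info = NA, geno = "GT")

# Demo on the example file bundled with the package.
example_vcf <- system.file("extdata", "chr22.vcf.gz",
                           package = "VariantAnnotation")
n_records <- count_vcf_records(example_vcf, yield_size = 4000L, param = param)
cat("records read:", n_records, "\n")
```

If the chunked loop completes on the laptop while a plain `readVcf()` call is killed, that points towards memory rather than a broken installation. In real use, the body of the loop would process or summarize each chunk instead of only counting rows.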