Hello,
I have used recount3 several times to download TCGA and GTEx data, and it worked perfectly. Thanks for the package! I now wanted to download a different dataset. I located it in the study explorer and used the R code that it generates to access the data:
library("recount3")
rse_data <- recount3::create_rse_manual(
project = "SRP045225",
project_home = "data_sources/sra",
organism = "human",
annotation = "gencode_v29",
type = "gene"
)
traceback()
sessionInfo()
Unfortunately, I run into the following error and, as a result, the data cannot be downloaded. I have tried the other available annotations, but I keep getting the same error messages. This did not happen when I downloaded TCGA and GTEx data or other datasets from data_sources/sra. I am wondering whether there is a problem with this particular dataset. Has anybody seen a similar error? Any suggestions for solving it? Many thanks in advance to everybody and to the package developers/maintainers!
Here is the evaluated code:
> rse_data <- recount3::create_rse_manual(
+ project = "SRP045225",
+ project_home = "data_sources/sra",
+ organism = "human",
+ annotation = "gencode_v29",
+ type = "gene"
+ )
2024-05-21 16:34:55.952222 downloading and reading the metadata.
2024-05-21 16:34:56.595507 caching file sra.sra.SRP045225.MD.gz.
adding rname 'http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/25/SRP045225/sra.sra.SRP045225.MD.gz'
Error in BiocFileCache::bfcrpath(bfc, url, exact = TRUE, verbose = verbose) :
not all 'rnames' found or unique.
In addition: Warning messages:
1: download failed
web resource path: 'http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/25/SRP045225/sra.sra.SRP045225.MD.gz'
local file path: 'C:\Users\Garcia\AppData\Local/R/cache/R/recount3/5f5410c23fc1_sra.sra.SRP045225.MD.gz'
reason: Received HTTP/0.9 when not allowed
2: bfcadd() failed; resource removed
rid: BFC352
fpath: 'http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/25/SRP045225/sra.sra.SRP045225.MD.gz'
reason: download failed
3: In value[[3L]](cond) :
trying to add rname 'http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/25/SRP045225/sra.sra.SRP045225.MD.gz' produced error:
bfcadd() failed; see warnings()
> traceback()
8: stop("not all 'rnames' found or unique.")
7: BiocFileCache::bfcrpath(bfc, url, exact = TRUE, verbose = verbose)
6: BiocFileCache::bfcrpath(bfc, url, exact = TRUE, verbose = verbose)
5: FUN(X[[i]], ...)
4: vapply(url, file_retrieve, character(1), bfc = bfc, verbose = verbose)
3: file_retrieve(url = locate_url(project = project, project_home = project_home,
type = "metadata", organism = organism, annotation = annotation,
recount3_url = recount3_url), bfc = bfc, verbose = verbose)
2: read_metadata(file_retrieve(url = locate_url(project = project,
project_home = project_home, type = "metadata", organism = organism,
annotation = annotation, recount3_url = recount3_url), bfc = bfc,
verbose = verbose))
1: recount3::create_rse_manual(project = "SRP045225", project_home = "data_sources/sra",
organism = "human", annotation = "gencode_v29", type = "gene")
> sessionInfo()
R version 4.3.3 (2024-02-29 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.utf8 LC_CTYPE=English_United Kingdom.utf8 LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.utf8
time zone: Europe/Berlin
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] recount3_1.12.0 SummarizedExperiment_1.32.0 Biobase_2.62.0 GenomicRanges_1.54.1
[5] GenomeInfoDb_1.38.8 IRanges_2.36.0 S4Vectors_0.40.1 BiocGenerics_0.48.1
[9] MatrixGenerics_1.14.0 matrixStats_1.3.0
loaded via a namespace (and not attached):
[1] rjson_0.2.21 lattice_0.22-6 vctrs_0.6.5 tools_4.3.3 bitops_1.0-7
[6] generics_0.1.3 curl_5.2.1 parallel_4.3.3 tibble_3.2.1 fansi_1.0.6
[11] RSQLite_2.3.6 blob_1.2.4 pkgconfig_2.0.3 R.oo_1.26.0 Matrix_1.6-5
[16] data.table_1.15.4 dbplyr_2.5.0 lifecycle_1.0.4 GenomeInfoDbData_1.2.11 compiler_4.3.3
[21] Rsamtools_2.18.0 Biostrings_2.70.2 codetools_0.2-20 RCurl_1.98-1.14 yaml_2.3.8
[26] pillar_1.9.0 crayon_1.5.2 R.utils_2.12.3 BiocParallel_1.36.0 DelayedArray_0.28.0
[31] cachem_1.0.8 sessioninfo_1.2.2 abind_1.4-5 tidyselect_1.2.1 purrr_1.0.2
[36] dplyr_1.1.4 restfulr_0.0.15 fastmap_1.1.1 grid_4.3.3 cli_3.6.2
[41] SparseArray_1.2.4 magrittr_2.0.3 S4Arrays_1.2.1 XML_3.99-0.16.1 utf8_1.2.4
[46] withr_3.0.0 filelock_1.0.3 bit64_4.0.5 XVector_0.42.0 httr_1.4.7
[51] bit_4.0.5 R.methodsS3_1.8.2 memoise_2.0.1 BiocIO_1.12.0 BiocFileCache_2.10.2
[56] rtracklayer_1.62.0 rlang_1.1.3 glue_1.7.0 DBI_1.2.2 rstudioapi_0.16.0
[61] R6_2.5.1 GenomicAlignments_1.38.2 zlibbioc_1.48.0
Hello Leonardo,
Thanks for commenting on possible solutions. Comparing with your session info, I noticed that I was using the previous version of R. I first installed the newest version of R and Bioconductor 3.19, updated, and tried your suggested solutions. However, I still get the same error:
I tried using recount3_cache_rm() and still got the error:
Do you have any other guesses about what might be going on? Thanks a lot for the help!
Hi,
What do you get if you run the following httr::HEAD() command? I noticed that you have libcurl version 8.3.0 from curl::curl_version()$version. I just installed version 8.8.0 earlier today (see https://github.com/Bioconductor/BiocFileCache/issues/48#issuecomment-2124935008 for all the details) but when I replied earlier I was using version 8.6.0. Given what I learned recently about BiocFileCache's internal functions, eventually it uses httr::HEAD() so I'm curious if that's where the issue lies given the error message you get about "reason: Received HTTP/0.9 when not allowed".
libcurl version 8.8.0 is available from https://github.com/curl/curl/releases/tag/curl-8_8_0 and https://snyk.io/blog/how-to-update-curl/ has some instructions for Windows users. Note that it says that "anything less than 8.4.0 will need to be updated", so maybe there's a known issue with version 8.3.0 that I'm unaware of (I'm by far not a libcurl connoisseur).
Best, Leo
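For anyone following along, the two checks discussed in this reply could be sketched in R roughly as below. This is only a sketch under assumptions: the helper names needs_libcurl_update() and head_status() are hypothetical, the 8.4.0 threshold is simply the one quoted from the snyk post, and the URL is the metadata URL from the error message in the original post. The exact httr::HEAD() call that was requested is not shown in this thread, so the call here is a guess at its likely form.

```r
# Hypothetical helper: TRUE if a libcurl version string is older than `minimum`.
# The default threshold (8.4.0) is the one quoted from the snyk post above.
needs_libcurl_update <- function(version, minimum = "8.4.0") {
  # Keep only the leading dotted-number part (drops suffixes such as "-DEV").
  numeric_part <- sub("^([0-9]+(\\.[0-9]+)*).*$", "\\1", version)
  utils::compareVersion(numeric_part, minimum) < 0
}

# Hypothetical helper: send a HEAD request and return the HTTP status code,
# or the error message if the request itself fails (e.g. the HTTP/0.9 error).
head_status <- function(url) {
  tryCatch(
    httr::status_code(httr::HEAD(url)),
    error = function(e) conditionMessage(e)
  )
}

# Versions mentioned in this thread.
needs_libcurl_update("8.3.0")
needs_libcurl_update("8.8.0")

# URL copied from the error message in the original post.
metadata_url <- paste0(
  "http://duffel.rail.bio/recount3/human/data_sources/sra/",
  "metadata/25/SRP045225/sra.sra.SRP045225.MD.gz"
)

# The live checks need the curl and httr packages plus network access,
# so they only run in an interactive session.
if (interactive()) {
  installed <- curl::curl_version()$version
  message("libcurl version in use: ", installed)
  message("older than 8.4.0: ", needs_libcurl_update(installed))
  print(head_status(metadata_url))
}
```

If head_status() returns an error message instead of a status code on the affected machine, that would suggest the problem sits in the libcurl/httr layer rather than in recount3 or BiocFileCache.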