But once the data is saved and loaded again, almost every EnsDb-specific method throws an error.
I am not sure what the problem could be, as the file appears to be saved and loaded correctly (is(aqh) works).
Some search results point to C++ pointers or something similar, but I don't see how that would cause an error in R code.
I would appreciate any help understanding what is going on and how to improve my code (perhaps using BiocFileCache?). Thanks.
I'm curious as to why you are doing a saveRDS and readRDS. AnnotationHub already caches and saves files in the background, so it isn't necessary. This also isn't using BiocFileCache, as you are just saving locally and trying to re-read the object. It would help to show the ERROR you get from one of the methods you mentioned, and to provide your sessionInfo() so we know what versions you are running.
aqh <- ah[["AH116291"]]
> aqh
EnsDb for Ensembl:
|Backend: SQLite
|Db type: EnsDb
|Type of Gene ID: Ensembl Gene ID
|Supporting package: ensembldb
|Db created by: ensembldb package from Bioconductor
|script_version: 0.3.10
|Creation time: Tue Jan 16 10:37:47 2024
|ensembl_version: 111
|ensembl_host: localhost
|Organism: Homo sapiens
|taxonomy_id: 9606
|genome_build: GRCh38
|DBSCHEMAVERSION: 2.2
|common_name: human
|species: homo_sapiens
| No. of genes: 72035.
| No. of transcripts: 278721.
|Protein data available.
I am using saveRDS and readRDS because, when executing this inside a quarto report, it sometimes failed to connect to the remote server (before I started using the localHub = TRUE argument).
I know I am not using BiocFileCache.
The error message is the one I posted above: after loading the object with readRDS and printing it with aqh, I see "Error: bad_weak_ptr" instead of the data information you provided.
Here is the sessionInfo() of the project. I'm using renv, so if you wish I could post the lock file to reproduce the R environment:
You cannot save and reuse an EnsDb object like that, as it's mostly just a pointer to a SQLite Db and functions to query it.
library(AnnotationHub)
hub <- AnnotationHub()
z <- hub[["AH116291"]]
> class(z)
[1] "EnsDb"
attr(,"package")
[1] "ensembldb"
> z
EnsDb for Ensembl:
|Backend: SQLite
|Db type: EnsDb
|Type of Gene ID: Ensembl Gene ID
|Supporting package: ensembldb
|Db created by: ensembldb package from Bioconductor
|script_version: 0.3.10
|Creation time: Tue Jan 16 10:37:47 2024
|ensembl_version: 111
|ensembl_host: localhost
|Organism: Homo sapiens
|taxonomy_id: 9606
|genome_build: GRCh38
|DBSCHEMAVERSION: 2.2
|common_name: human
|species: homo_sapiens
| No. of genes: 72035.
| No. of transcripts: 278721.
|Protein data available.
> dbconn(z)
<SQLiteConnection>
Path: C:\Users\jmacdon\AppData\Local\R\cache\R\AnnotationHub\599041ba51cf_123037
Extensions: TRUE
## try saving
> saveRDS(z, "tmp.Rds")
> zz <- readRDS("tmp.Rds")
> zz
Error: bad_weak_ptr
> class(zz)
[1] "EnsDb"
attr(,"package")
[1] "ensembldb"
> dbconn(zz)
<SQLiteConnection>
DISCONNECTED
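The same behaviour can be reproduced without ensembldb, which suggests it comes from how database connections are serialized rather than from the hub. The following is only a minimal sketch, assuming the DBI and RSQLite packages are installed; the table name and contents are made up for illustration.

```r
library(DBI)

# Open an in-memory SQLite database and put something in it
# (the "genes" table is purely illustrative).
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "genes", data.frame(id = "ENSG00000139618"))

# Serialize the connection object and read it back.
f <- tempfile(fileext = ".rds")
saveRDS(con, f)
con2 <- readRDS(f)

# The R object survives the round trip, but the underlying
# C-level handle does not, so the restored connection is not usable.
valid_before <- dbIsValid(con)
valid_after <- dbIsValid(con2)

dbDisconnect(con)
```

Here valid_before and valid_after record whether each connection is still live, mirroring the DISCONNECTED state shown by dbconn(zz) above.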
I sometimes do what you are trying to do, to ensure that a downloaded AnnotationHub object remains static (for clients who won't understand if any annotations change), by going off-reservation and doing something like this:
> file.copy(dbconn(z)@dbname, "./ensdb.sqlite")
## and then later, in an Rmd file
> library(ensembldb)
> zzz <- EnsDb("ensdb.sqlite")
> zzz
EnsDb for Ensembl:
|Backend: SQLite
|Db type: EnsDb
|Type of Gene ID: Ensembl Gene ID
|Supporting package: ensembldb
|Db created by: ensembldb package from Bioconductor
|script_version: 0.3.10
|Creation time: Tue Jan 16 10:37:47 2024
|ensembl_version: 111
|ensembl_host: localhost
|Organism: Homo sapiens
|taxonomy_id: 9606
|genome_build: GRCh38
|DBSCHEMAVERSION: 2.2
|common_name: human
|species: homo_sapiens
| No. of genes: 72035.
| No. of transcripts: 278721.
|Protein data available.
> dbconn(zzz)
<SQLiteConnection>
Path: C:\Users\jmacdon\Desktop\ensdb.sqlite
Extensions: TRUE
But ideally you would just use the cached SQLite file that AnnotationHub should find for you.
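If you do want a pinned local copy, the two steps above can be wrapped in one function so that a report only touches AnnotationHub the first time it runs. This is a rough sketch rather than an official pattern: load_ensdb and its default arguments are hypothetical names chosen for illustration, and it relies on the same dbconn(...)@dbname trick shown above.

```r
library(AnnotationHub)
library(ensembldb)

# Hypothetical helper (not part of any package): return an EnsDb backed
# by a local SQLite file, fetching it from AnnotationHub only if the
# file does not exist yet.
load_ensdb <- function(id = "AH116291", path = "ensdb.sqlite") {
  if (!file.exists(path)) {
    hub <- AnnotationHub()
    edb <- hub[[id]]
    # Copy the cached SQLite file that backs the EnsDb object.
    ok <- file.copy(dbconn(edb)@dbname, path)
    if (!ok) stop("Could not copy the SQLite file to ", path)
  }
  # Always open a fresh connection instead of deserializing an old one.
  EnsDb(path)
}
```

Calling load_ensdb() at the top of an Rmd or quarto document would then give a working EnsDb on every render, because the connection is created anew each time.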
I suspected this. Thanks for the answer, that solution might work! And at least now I understand why this happens (but I wish there were a better error message from R).
My main goal is to avoid problems connecting to the server and re-downloading the resource when it has not changed (I think I already avoided that with localHub = TRUE), but I'm fine with updating the references.
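For the connection problem specifically, one option is to try the remote hub first and fall back to the local cache when it cannot be reached. This is a sketch under assumptions: open_hub is a hypothetical helper name, and it assumes the resource was downloaded at least once before, since localHub = TRUE only exposes what is already cached.

```r
library(AnnotationHub)

# Hypothetical helper: prefer the online hub, fall back to the local
# cache if the remote server cannot be reached.
open_hub <- function() {
  tryCatch(
    AnnotationHub(),
    error = function(e) {
      message("Remote hub unavailable, using local cache: ",
              conditionMessage(e))
      AnnotationHub(localHub = TRUE)
    }
  )
}
```

In the report this would be used as hub <- open_hub() followed by hub[["AH116291"]], with no saveRDS or readRDS involved.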
Many thanks for your assistance Lori.
It seems to be an issue with the saveRDS and readRDS and not the hubs.