You're correct that H5Ldelete is not available in the current version of rhdf5, and I don't think there's a straightforward way to do this with the current package version. For data types other than compound you can overwrite existing entries, so I guess this hasn't come up before.
Given this, I've added an implementation of H5Ldelete() and a higher-level h5delete() to the very developmental version of the package. You can install this from GitHub using:
BiocInstaller::biocLite('grimbough/Rhdf5lib')
BiocInstaller::biocLite('grimbough/rhdf5', ref = 'H5Ldelete')
Now we can test with your example:
library(rhdf5)
h5fl <- tempfile(fileext=".h5")
h5createFile(file=h5fl)
df <- data.frame(a=1:4, b=c(1.1, 2.1, 3.1, 4.1), d=42:45)
df_mod <- data.frame(a=1:4, b=c(5.1, 2.1, 3.1, 4.1), d=42:45)
h5write(df, h5fl, "dfcompound")
h5read(h5fl, "dfcompound")
a b d
1 1 1.1 42
2 2 2.1 43
3 3 3.1 44
4 4 4.1 45
Now use h5delete() to remove the "dfcompound" dataset and verify it no longer exists.
h5delete(file = h5fl, name = "dfcompound")
h5read(h5fl, "dfcompound")
Error in h5read(h5fl, "dfcompound") :
Object 'dfcompound' does not exist in this HDF5 file.
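If you'd rather check for the dataset without triggering an error, something like the following should work. It's only a sketch: it rebuilds a small file so it can run on its own, and it relies on h5ls() returning one row per object with a 'name' column.

```r
library(rhdf5)

# Self-contained setup: create a file holding one compound dataset.
h5fl <- tempfile(fileext = ".h5")
h5createFile(h5fl)
h5write(data.frame(a = 1:4, d = 42:45), h5fl, "dfcompound")

# h5ls() lists the objects in the file; 'name' holds the object names.
exists_before <- "dfcompound" %in% h5ls(h5fl)$name

# Delete the dataset, then list the file contents again.
h5delete(file = h5fl, name = "dfcompound")
exists_after <- "dfcompound" %in% h5ls(h5fl)$name
```

Here exists_before should be TRUE and exists_after FALSE, which avoids relying on the h5read() error message.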
It's now possible to write a new dataset with the same name.
h5write(df_mod, h5fl, "dfcompound")
h5read(h5fl, "dfcompound")
a b d
1 1 5.1 42
2 2 2.1 43
3 3 3.1 44
4 4 4.1 45
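If you need to do this repeatedly, the delete-then-write steps could be wrapped in a small helper. Note that h5overwrite() below is a hypothetical name of my own and not part of rhdf5, and this sketch only handles datasets at the top level of the file.

```r
library(rhdf5)

# Hypothetical helper (NOT part of rhdf5): replace a top-level dataset by
# deleting any existing object with that name and then writing the new one.
h5overwrite <- function(obj, file, name) {
  # Only look at top-level objects; nested paths are not handled here.
  if (name %in% h5ls(file, recursive = FALSE)$name) {
    h5delete(file = file, name = name)
  }
  h5write(obj, file, name)
}

h5fl <- tempfile(fileext = ".h5")
h5createFile(h5fl)
df     <- data.frame(a = 1:4, b = c(1.1, 2.1, 3.1, 4.1), d = 42:45)
df_mod <- data.frame(a = 1:4, b = c(5.1, 2.1, 3.1, 4.1), d = 42:45)

h5overwrite(df, h5fl, "dfcompound")      # first write, nothing to delete
h5overwrite(df_mod, h5fl, "dfcompound")  # second write replaces the dataset
res <- h5read(h5fl, "dfcompound")
```

The whole dataset is rewritten each time, so this is only sensible for data that is small enough to hold in memory.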
It would be nice to only overwrite the subset of the dataset that's changed, but I don't know why the original rhdf5 maintainer prevented this - there may be a technical limitation in HDF5 that I'm not aware of for compound datasets. For now hopefully this is sufficient for your needs.
Please let me know if you experience any unexpected issues with it, and if it seems stable I'll incorporate it into the main branch of rhdf5.
On a tangential note, I really don't recommend running H5close() with this version of the package - its behaviour is likely to break everything. There's now the h5closeAll() function that achieves the same goal, although in an ideal world you wouldn't have to use it at all. If you find you get lots of references to files already being open please let me know; I'm trying to stop that happening.
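For completeness, here is roughly how the low-level route looks when you manage the file handle yourself. This is a sketch that assumes H5Ldelete() takes an open location handle and a link name, mirroring the underlying HDF5 C function; the dataset name "x" is just for illustration.

```r
library(rhdf5)

# Self-contained setup: a file with a single integer dataset.
h5fl <- tempfile(fileext = ".h5")
h5createFile(h5fl)
h5write(1:10, h5fl, "x")

fid <- H5Fopen(h5fl)   # open a handle to the file
H5Ldelete(fid, "x")    # remove the link named "x" (assumed signature)
H5Fclose(fid)          # close the handle explicitly when finished

# Fallback only: closes any HDF5 handles that were left open.
h5closeAll()

remaining <- h5ls(h5fl)$name
```

Closing each handle with its matching close function as soon as you're done should mean h5closeAll() is rarely needed.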
My plan was to let you test the new version of the code, then move it into the release version. You should be able to install from GitHub by running the two biocLite() lines shown above.
If you get an error message, report it back here and we'll work through it.