h5ls reports traversal errors when listing an HDF5 file produced by the MinION. The errors don't seem to affect the output, but there's no way to hide them:
> library(rhdf5);
> fast5.files <- list.files(pattern="\\.fast5");
> fastFile <- fast5.files[1];
> fastH5F <- H5Fopen(fastFile);
> h5ls(fastH5F, datasetinfo=FALSE, recursive=3)
HDF5-DIAG: Error detected in HDF5 (1.8.7) thread 0:
  #000: H5O.c line 246 in H5Oopen(): unable to open object
    major: Symbol table
    minor: Can't open object
  #001: H5O.c line 1355 in H5O_open_name(): object not found
    major: Symbol table
    minor: Object not found
  #002: H5Gloc.c line 430 in H5G_loc_find(): can't find object
    major: Symbol table
    minor: Object not found
  #003: H5Gtraverse.c line 905 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #004: H5Gtraverse.c line 664 in H5G_traverse_real(): special link traversal failed
    major: Links
    minor: Link traversal failure
  #005: H5Gtraverse.c line 467 in H5G_traverse_special(): symbolic link traversal failed
    major: Links
    minor: Link traversal failure
  #006: H5Gtraverse.c line 329 in H5G_traverse_slink(): unable to follow symbolic link
    major: Symbol table
    minor: Object not found
  #007: H5Gtraverse.c line 799 in H5G_traverse_real(): component not found
    major: Symbol table
    minor: Object not found
HDF5-DIAG: Error detected in HDF5 (1.8.7) thread 0:
  #000: H5O.c line 1078 in H5Oclose(): not a valid file object ID (dataset, group, or datatype)
    major: Invalid arguments to routine
    minor: Unable to release object
                          group                  name       otype dclass dim
0                             /              Analyses   H5I_GROUP
1                     /Analyses       Basecall_2D_000   H5I_GROUP
2     /Analyses/Basecall_2D_000         BaseCalled_2D   H5I_GROUP
3     /Analyses/Basecall_2D_000 BaseCalled_complement   H5I_GROUP
4     /Analyses/Basecall_2D_000   BaseCalled_template   H5I_GROUP
5     /Analyses/Basecall_2D_000         Configuration   H5I_GROUP
6     /Analyses/Basecall_2D_000          HairpinAlign   H5I_GROUP
7     /Analyses/Basecall_2D_000           InputEvents   H5I_BADID
8     /Analyses/Basecall_2D_000                   Log H5I_DATASET
9     /Analyses/Basecall_2D_000               Summary   H5I_GROUP
10                    /Analyses    EventDetection_000   H5I_GROUP
11 /Analyses/EventDetection_000         Configuration   H5I_GROUP
12 /Analyses/EventDetection_000                 Reads   H5I_GROUP
13                            /             Sequences   H5I_GROUP
14                   /Sequences                  Meta   H5I_GROUP
15                            /       UniqueGlobalKey   H5I_GROUP
16             /UniqueGlobalKey            channel_id   H5I_GROUP
17             /UniqueGlobalKey          context_tags   H5I_GROUP
18             /UniqueGlobalKey           tracking_id   H5I_GROUP
An example fast5 file can be found here:
http://www.gringene.org/data/mimr_minion_James_4T1p0SC_mtDNA_2014_Oct_03_4652_1_ch132_file25_strand.fast5
I'm trying to write a nanopore library using rhdf5, and these errors fill the screen with messages that are completely meaningless to people who just want to extract event data. I need to traverse the tree because some group names are not predictable (e.g. the basecall group could be Basecall_2D_001 instead of Basecall_2D_000). Does anyone know of a way to traverse the HDF5 tree without producing errors in the output?
----
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux stretch/sid
locale:
[1] LC_CTYPE=en_NZ.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_NZ.UTF-8 LC_COLLATE=en_NZ.UTF-8
[5] LC_MONETARY=en_NZ.UTF-8 LC_MESSAGES=en_NZ.UTF-8
[7] LC_PAPER=en_NZ.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_NZ.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rhdf5_2.12.0 BiocInstaller_1.18.4 RichPoreTK_1.0.5
[4] devtools_1.8.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.0 digest_0.6.8 R6_2.1.1 git2r_0.11.0
[5] httr_1.0.0 zlibbioc_1.14.0 curl_0.9.3 xml2_0.1.2
[9] tools_3.2.2 stringr_0.6.2 compiler_3.2.2 rversions_1.0.2
[13] tcltk_3.2.2 memoise_0.2.1
Better error handling was introduced in version 2.13.2. However, it does not remove the error message; it only makes the message readable.
The problem was that h5ls could not handle soft links. I have changed the code so that h5ls, h5dump, and h5read now handle soft links appropriately. The update will be in version 2.13.6, which should appear in the development version of Bioconductor in about a day.
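Once h5ls returns a clean listing, the unpredictable basecall group name mentioned in the question can be picked out of the returned data frame. The following is only a minimal sketch: findBasecallGroups is a hypothetical helper, not part of rhdf5, and the small data frame stands in for the result of h5ls(fastH5F, datasetinfo=FALSE, recursive=3), using a few rows copied from the listing above.

```r
# Hypothetical helper (not part of rhdf5): pick out the basecall groups
# from a listing shaped like the data frame that h5ls() returns.
findBasecallGroups <- function(listing, parent = "/Analyses",
                               pattern = "^Basecall_2D_[0-9]+$") {
  # Keep rows that sit directly under `parent` and whose name matches.
  hits <- listing$group == parent & grepl(pattern, listing$name)
  # Build full HDF5 paths, e.g. "/Analyses/Basecall_2D_000".
  paste(listing$group[hits], listing$name[hits], sep = "/")
}

# Stand-in for the h5ls() result; rows copied from the listing in the question.
listing <- data.frame(
  group = c("/", "/Analyses", "/Analyses/Basecall_2D_000", "/Analyses"),
  name  = c("Analyses", "Basecall_2D_000", "BaseCalled_2D", "EventDetection_000"),
  otype = "H5I_GROUP",
  stringsAsFactors = FALSE
)

basecallPaths <- findBasecallGroups(listing)
print(basecallPaths)
```

With a real file, the listing would come from h5ls itself, and each returned path could then be used as the name argument to h5read. The pattern assumes the Basecall_2D_NNN naming seen in the example file; other basecall group names would need a different pattern.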