Short Description
An HDF5 file that was written on a Windows machine cannot be opened on a Linux machine. The error message is "OSError: Unable to open file (bad superblock version number)". (As such, this issue may not be related to h5py at all, but could instead be a general Linux/Windows compatibility problem when opening files in Python.)
Long Description
A python virtual environment with the following packages was used on both Windows and Linux:
- Flask-0.12.2
- Flask-RESTful-0.3.6
- Jinja2-2.10
- MarkupSafe-1.0
- Werkzeug-0.14.1
- aniso8601-3.0.0
- click-6.7
- h5py-2.7.1
- h5py-cache-1.0
- itsdangerous-0.24
- lockfile-0.12.2
- numpy-1.14.0
- pytz-2018.3
- six-1.11.0
On Windows, the file could be opened and read without issues, but on Linux it couldn't, throwing an OSError. Simply starting a new python session and typing the following is enough:
import h5py
f1 = h5py.File("myfile.hdf5", "r")
Full error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/h5py/_hl/files.py", line 312, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "/usr/local/lib/python3.6/site-packages/h5py/_hl/files.py", line 142, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 78, in h5py.h5f.open
OSError: Unable to open file (bad superblock version number)
The HDF5 file is written on a Windows machine by a program that I cannot modify, using the HDF5 Java library 1.10.0 in SWMR (single-writer/multiple-reader) mode.
It is possible that the program doesn't close the file properly before passing it on to my program, a lightweight Linux application.
In http://web.mit.edu/fwtools_v3.1.0/www/H5.format.html, the "Version Number of the Super Block" is described as follows...
This value is used to determine the format of the information in the super block. When the format of the information in the super block is changed, the version number is incremented to the next integer and can be used to determine how the information in the super block is formatted.
Values of 0 and 1 are defined for this field.
This field is present in version 0+ of the superblock.
...which doesn't help me understand what the bad superblock version number error could be about.
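To see which version number the file actually carries, the byte can be read directly without going through h5py. The sketch below assumes the layout from the HDF5 file format specification: an 8-byte signature (`\x89HDF\r\n\x1a\n`) at offset 0, 512, 1024, 2048, and so on, followed immediately by a one-byte superblock version. The function name `read_superblock_version` and the `max_offset` search limit are my own and purely illustrative. Note that the page quoted above describes an older revision of the format; later revisions define version numbers beyond 0 and 1, so a value of 2 or 3 here would point to a file that is newer than the reading library rather than to a corrupt one.

```python
import os

# 8-byte format signature that starts every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def read_superblock_version(path, max_offset=1 << 20):
    """Return (offset, version) of the first superblock found, or None.

    The superblock may sit at byte 0, 512, 1024, 2048, ... (doubling each
    time), so those offsets are probed in order. `max_offset` is an
    arbitrary search limit chosen for this sketch.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset < min(size, max_offset):
            f.seek(offset)
            header = f.read(len(HDF5_SIGNATURE) + 1)
            if len(header) == 9 and header[:8] == HDF5_SIGNATURE:
                # The byte right after the signature is the version number.
                return offset, header[8]
            offset = 512 if offset == 0 else offset * 2
    return None
```

Calling `read_superblock_version("myfile.hdf5")` on both the failing file and one of the files that opens correctly should show whether the two differ in superblock version. A result of `None` would mean no signature was found at all, which is a different failure from the one reported here.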
Here's a sample file I'm trying to open: https://drive.google.com/open?id=10hpbWj4HBwIMq0X6Rq7yVzJATOiYHJcc
Why make a Stack Overflow question out of this?
This issue might affect anyone who, on a Linux machine, wants to read HDF5 files that were generated on a Windows machine and not correctly closed/formatted/etc. I'd like to know the reason(s) why this happens and how to work around it on my end, on Linux. If the only solution is "It needs to be fixed by the Windows program generating the HDF5 file, as this cannot be fixed afterwards", then that is an acceptable answer as well. Is that the case here?
Actions Taken
- Upgrading h5py to 2.8.0rc1 doesn't solve the issue
- Other HDF5 files can be opened as expected
Related Topics
I've looked at the following topics and sites for possible reasons, but have come up empty-handed:
1) https://support.hdfgroup.org/HDF5/faq/bkfwd-compat.html
2) h5py OSError: Unable to open file (File signature not found)
3) HDF5 file created with h5py can't be opened by h5py
4) https://github.com/h5py/h5py/issues/757
5) http://web.mit.edu/fwtools_v3.1.0/www/H5.format.html
Edit 1:
Thanks to @Tom de Geus, I tried HDFView on both Linux and Windows and discovered that the sample file cannot be opened with HDFView on Linux, but it can be opened with HDFView on Windows. This suggests that the issue lies in the file or in the HDF5 library, not in h5py.
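Since the same file behaves differently under HDFView on the two systems, one thing worth comparing is the version of the underlying HDF5 library on each machine. The snippet below is a minimal sketch of that check: `h5py.version.version` and `h5py.version.hdf5_version` are existing h5py attributes, while the helper names `parse_version` and `is_hdf5_1_10_or_newer` are made up for this example. The 1.10 threshold is an assumption on my part, based on SWMR having been introduced in HDF5 1.10; I have not confirmed that this is the cause of the error.

```python
import re


def parse_version(text):
    """Turn a dotted version string such as '1.8.18' into a tuple of ints."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def is_hdf5_1_10_or_newer(hdf5_version):
    """True if the given HDF5 library version is at least 1.10.0."""
    return parse_version(hdf5_version) >= (1, 10, 0)


try:
    import h5py
except ImportError:
    h5py = None  # the helpers above still work without h5py installed

if h5py is not None:
    # Run this on both the Windows and the Linux machine and compare.
    print("h5py version:", h5py.version.version)
    print("HDF5 version:", h5py.version.hdf5_version)
    print("HDF5 >= 1.10:", is_hdf5_1_10_or_newer(h5py.version.hdf5_version))
```

If the Linux environment reports an HDF5 1.8.x library while the Windows side reports 1.10.x, that difference would be a plausible explanation for why only Linux rejects the file.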