Hierarchical Data Format

==Early history==
 
The quest for a portable scientific data format, originally dubbed AEHOO (All Encompassing Hierarchical Object Oriented format), was begun in 1987 by the Graphics Foundations Task Force (GFTF) at the National Center for Supercomputing Applications (NCSA). NSF grants received in 1990 and 1992 were important to the project. Around this time [[NASA]] investigated 15 different file formats for use in the [[Earth Observing System]] (EOS) project. After a two-year review process, HDF was selected as the standard data and information system.<ref>{{cite web|url=http://www.hdfgroup.org/about/history.html|title=History of HDF Group|archive-url=https://web.archive.org/web/20160821013712/http://www.hdfgroup.org/about/history.html | access-date=15 July 2014|archive-date=21 August 2016 }}</ref>
 
==HDF4==
HDF4 is the older version of the format, although still actively supported by The HDF Group. It supports several different data models, including multidimensional arrays, [[Raster graphics|raster images]], and tables. Each defines a specific aggregate data type and provides an [[Application Programming Interface|API]] for reading, writing, and organizing the data and metadata. New data models can be added by the HDF developers or users.
 
HDF is self-describing, allowing an application to interpret the structure and contents of a file with no outside information. One HDF file can hold a mix of related objects which can be accessed as a group or as individual objects. Users can create their own grouping structures called "vgroups."<ref name="foldoc">{{foldoc|Hierarchical+Data+Format}}</ref>
The HDF4 format has many limitations.<ref>[http://www.hdfgroup.org/h5h4-diff.html How is HDF5 different from HDF4?] {{webarchive|url=https://web.archive.org/web/20090330052722/http://www.hdfgroup.org/h5h4-diff.html |date=2009-03-30 }}</ref><ref>{{Cite web |url=http://www.hdfgroup.org/HDF-FAQ.html#6b |title=Are there limitations to HDF4 files? |access-date=2009-03-29 |archive-url=https://web.archive.org/web/20160419122423/http://www.hdfgroup.org/HDF-FAQ.html#6b |archive-date=2016-04-19 |url-status=dead }}</ref> It lacks a clear object model, which makes continued support and improvement difficult. Supporting many different interface styles (images, tables, arrays) leads to a complex API. Support for metadata depends on which interface is in use; ''SD'' (Scientific Dataset) objects support arbitrary named attributes, while other types only support predefined metadata. Perhaps most importantly, the use of 32-bit signed integers for addressing limits HDF4 files to a maximum of 2 GB, which is unacceptable in many modern scientific applications.
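
The 2 GB figure follows directly from the width of the address field. The following is a minimal sketch of that arithmetic, not HDF4 library code; the helper name <code>fits_in_int32</code> is hypothetical, and the sketch assumes a byte offset stored as a signed 32-bit integer, as described above.

```python
import struct

# Largest value a signed 32-bit integer can represent.
INT32_MAX = 2**31 - 1


def fits_in_int32(offset):
    """Return True if `offset` can be packed as a signed 32-bit integer.

    Hypothetical helper for illustration only; it checks representability,
    not whether the value is a sensible file offset.
    """
    try:
        struct.pack(">i", offset)  # ">i" = big-endian signed 32-bit
        return True
    except struct.error:
        return False


if __name__ == "__main__":
    print(INT32_MAX)                     # highest addressable byte offset
    print(fits_in_int32(INT32_MAX))      # last offset that still fits
    print(fits_in_int32(INT32_MAX + 1))  # one byte further does not fit
```

Because the highest representable offset is 2<sup>31</sup> − 1, a file addressed this way cannot grow beyond 2<sup>31</sup> bytes, which is the 2 GB limit.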
*Dataset data cannot be freed in a file without generating a file copy using an external tool (h5repack).<ref>{{cite web|last1=Rossant|first1=Cyrille|title=Moving away from HDF5|url=http://cyrille.rossant.net/moving-away-hdf5/|website=cyrille.rossant.net|access-date=21 April 2016}}</ref>
 
==Interfaces==
 
===Officially supported APIs===
* [[C (programming language)|C]]
* [[C++]]
* [[Common Language Infrastructure|CLI]] - .NET
* [[Fortran]], [[Fortran 90]]
* HDF5 Lite (H5LT) – a light-weight interface for C
* HDF5 Dimension Scale (H5DS) – allows dimension scales to be added to HDF5
*[[Java (programming language)|Java]]
 
===Third-party bindings===
* [[CGNS]] uses HDF5 as main storage
* [[Common Lisp]] library [https://github.com/HDFGroup/hdf5-cffi hdf5-cffi]
* [[D (programming language)|D]] offers [https://github.com/Laeeth/d_hdf5 bindings to the C API], with a high-level h5py-style D wrapper under development
* [[Dymola]] introduced support for HDF5 export using an implementation called [[Scientific Data Format|SDF]] (Scientific Data Format) with release Dymola 2016 FD01
* [[Erlang (programming language)|Erlang]], [[Elixir (programming language)|Elixir]], and [[LFE (programming language)|LFE]] may use the [https://github.com/RomanShestakov/erlhdf5 bindings for BEAM languages]
* [[GNU Data Language]]
* [[Go (programming language)|Go]] - [https://github.com/gonum gonum]'s [https://github.com/gonum/hdf5 hdf5] package.
* [http://www.hdfql.com HDFql] enables users to manage HDF5 files through a high-level language (similar to SQL) in C, C++, Java, Python, C#, Fortran and R.
* [[Haskell]] offers [https://hackage.haskell.org/package/hdf5 bindings to the C API].
* [[Huygens Software]] has used HDF5 as its primary storage format since version 3.5
* [[IDL (programming language)|IDL]]
* [[IGOR Pro]] offers [http://www.wavemetrics.com/products/igorpro/dataaccess/hdf5.htm full support of HDF5] files.
* JHDF5,<ref>[https://wiki-bsse.ethz.ch/display/JHDF5 JHDF5 library]</ref> an alternative [[Java (programming language)|Java]] binding whose approach differs from that of the official HDF5 Java binding and which some users find simpler
* [http://jhdf.io jHDF] A pure [[Java (programming language)|Java]] implementation providing read-only access to HDF5 files
* [[JSON]] through [http://hdf5-json.readthedocs.org hdf5-json].
* [[Julia (programming language)|Julia]] support for HDF5 is available through the [https://github.com/JuliaIO/HDF5.jl HDF5] and [https://github.com/JuliaIO/JLD2.jl JLD2] packages.
* [[LabVIEW]] can gain HDF support through third-party libraries, such as [http://h5labview.sourceforge.net/ h5labview] and [http://www.upvi.net/main/index.php/products/lvhdf5 lvhdf5].
* [[Lua (programming language)|Lua]] through the [http://colberg.org/lua-hdf5 lua-hdf5] library.
* [[MATLAB]], [[Scilab]] or [[GNU Octave|Octave]] – use HDF5 as primary storage format in recent releases
* [[Mathematica]]<ref>[http://reference.wolfram.com/mathematica/ref/format/HDF.html HDF Import and Export] Mathematica documentation</ref> offers immediate analysis of HDF and HDF5 data
* [[Perl]]<ref>[https://metacpan.org/release/PDL-IO-HDF5 PDL::IO::HDF5]</ref>
* [[Python (programming language)|Python]] supports HDF5 via [http://www.h5py.org h5py] (both high- and low-level access to HDF5 abstractions) and via [https://pytables.github.io/index.html PyTables] (a high-level interface with advanced indexing and database-like query capabilities). HDF4 is available via [https://pypi.python.org/pypi/python-hdf4 Python-HDF4] or [http://hdfeos.org/software/pyhdf.php PyHDF] for both Python 2 and Python 3. The popular data manipulation package [[Pandas (software)|pandas]] can import from and export to HDF5 via {{Proper name|PyTables}}.
* [[R (programming language)|R]] offers support in the [http://bioconductor.org/packages/release/bioc/html/rhdf5.html rhdf5] and [https://CRAN.R-project.org/package=hdf5r hdf5r] packages.
* [[Rust (programming_language)|Rust]] can gain HDF support through third-party libraries like [https://crates.io/crates/hdf5 hdf5].
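
As an illustration of what using one of these bindings looks like, the sketch below uses the h5py package listed above to write a small file and read it back. It assumes h5py and NumPy are installed; the group name <code>experiment</code>, the dataset name <code>temperature</code>, the <code>units</code> attribute and the function name <code>roundtrip_demo</code> are all hypothetical and chosen only for the example.

```python
import os
import tempfile

import h5py
import numpy as np


def roundtrip_demo(path):
    """Write a small hierarchical HDF5 file at `path`, then read it back."""
    data = np.arange(12, dtype="int32").reshape(3, 4)

    # Write: a group containing one dataset with one named attribute.
    with h5py.File(path, "w") as f:
        grp = f.create_group("experiment")                    # hypothetical name
        dset = grp.create_dataset("temperature", data=data)   # hypothetical name
        dset.attrs["units"] = "kelvin"                         # hypothetical attribute

    # Read: objects are addressed with slash-separated paths.
    with h5py.File(path, "r") as f:
        dset = f["experiment/temperature"]
        return dset[...], dset.attrs["units"], dset.shape


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        values, units, shape = roundtrip_demo(os.path.join(tmp, "demo.h5"))
        print(shape, units)
        print(values)
```

The file carries its own structure and metadata, so the reading half needs only the path to the dataset, not the array's shape or type.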
 
==Tools==
* [https://github.com/HDFGroup/hdf5-spark-connector Apache Spark HDF5 Connector] HDF5 Connector for Apache Spark
* [https://github.com/apache/drill/tree/master/contrib/format-hdf5 Apache Drill HDF5 Plugin] HDF5 Plugin for Apache Drill enables SQL Queries over HDF5 Files.
* [https://wiki.earthdata.nasa.gov/display/HPD/HDF+Product+Designer/ HDF Product Designer] Interoperable HDF5 data product creation GUI tool
* [http://www.space-research.org/ HDF Explorer] A data visualization program that reads the HDF, HDF5 and netCDF data file formats
* [http://www.hdfgroup.org/hdf-java-html/hdfview/ HDFView] A browser and editor for HDF files
* [http://www.vitables.org/ ViTables] A browser and editor for HDF5 and PyTables files written in Python
* [https://www.giss.nasa.gov/tools/panoply/ Panoply] A netCDF, HDF and GRIB Data Viewer
* [https://github.com/silx-kit/silx silx] A browser for HDF files specifically for synchrotron X-ray data
 
==See also==
==External links==
*{{Official website}}
*[https://web.archive.org/web/20180806024407/https://support.hdfgroup.org/HDF5/whatishdf5.html What is HDF5?]
*[http://hdfeos.org/ HDF-EOS Tools and Information Center]
*[http://www.opennavsurf.org/ Open Navigation Surface]