Transcript Slide 1

The HDF Group
Introduction to HDF5
Session Two
Data Model Comparison
HDF5 File Format
Copyright © 2010 The HDF Group. All Rights Reserved
1
www.hdfgroup.org
Our Purpose Today
1) Familiarize you with HDF5 and its capabilities.
2) Help you understand how HDF5 might be applied to your data management challenges.
HDF5 Data Model
File
Link
Dataset
Group
Datatype
Dataspace
Attribute
HDF5 Objects
Developing a Project Data Model
Project Domain Concepts
Logical Data Model: Relational, HDF5 Data Model
Physical Instantiation: A Relational Database, HDF5 File
Logical Data Models
HDF5 / Directories and Files
HDF5         Directories (Folders) and Files
file         filesystem
dataset      file
datatype     ~ file type or extension
dataspace    ~ file size
attribute    ~ properties (Windows)
group        directory (Unix) or folder (Windows)
link         hard links & symbolic links (Unix); ~ shortcuts (Windows)
• Both support hierarchies for organizing information (and to some degree, directed graphs)
HDF5 / XML
HDF5         XML
file         document
dataset      element
datatype     simple or complex type definitions in XML Schema
dataspace    ~ minOccurs, maxOccurs in XML Schema
attribute    attribute
group        ~ element with sub-elements
link         ~ IDREF
• Both support rich metadata and allow new types to be defined
• HDF5 objects designed for numeric data; XML objects designed for text
HDF5 / Relational Databases
HDF5         Relational Database
file         database
dataset      data table
datatype     char, varchar, number, blob, raw, date, …
dataspace    ~ records
attribute    ?
group        ?
link         ?
• HDF5 supports multi-dimensional arrays with common datatypes in the cells; locate by offset
• RDBs support rows with different data types in fields; locate by primary key
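The two ways of locating data named in the bullets above can be contrasted in a small sketch. It uses only Python's standard library. The array shape, the `offset_of` helper, and the `readings` table are made up for illustration; the sketch says nothing about how HDF5 or any particular database engine works internally.

```python
import sqlite3

# Array-style access: every cell has the same type, so a cell is located
# purely by arithmetic on its indices (row-major order is assumed here).
ROWS, COLS = 3, 4
cells = [float(r * COLS + c) for r in range(ROWS) for c in range(COLS)]

def offset_of(row, col, cols=COLS):
    """Row-major offset of cell (row, col) in the flat sequence (hypothetical helper)."""
    return row * cols + col

array_value = cells[offset_of(2, 1)]

# Table-style access: each row mixes types, and a row is located by its key.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, room TEXT, temp REAL)")
db.executemany(
    "INSERT INTO readings (id, room, temp) VALUES (?, ?, ?)",
    [(101, "lobby", 20.5), (102, "lab", 22.0), (103, "attic", 17.25)],
)
table_row = db.execute(
    "SELECT room, temp FROM readings WHERE id = ?", (102,)
).fetchone()
db.close()
```

In the array case the position of a value is implied by its indices; in the table case the key has to be looked up, but each row can carry fields of different types.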
HDF5 Technology Platform
• HDF5 data model
  • The “building blocks” for data organization and specification
• HDF5 software
  • Library, language interfaces, tools
• HDF5 file format
  • Bit-level organization of HDF5 file
HDF5 File Format
• Defined by the HDF5 File Format Specification
  • Specifies the bit-level organization of an HDF5 file on storage media
  • Maps the data model objects to a linear address space
  • Other representations of the data model objects are also possible, but those are not the HDF5 format
• Self-describing
  • All the information necessary to read and reconstruct the data model objects is specified by the format
• Designed to work well with other technologies
• Designed for speed and storage efficiency
• Binary format
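One concrete consequence of a bit-level specification is that an HDF5 file can be recognized from its first bytes: the superblock begins with a fixed 8-byte signature. The sketch below checks for that signature using only Python's standard library. The helper name and the throwaway files are assumptions for illustration, and the check is a quick sanity test, not a substitute for opening the file with the HDF5 library.

```python
import os
import tempfile

# The 8-byte signature that begins an HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

def starts_with_hdf5_signature(path):
    """Return True if the file at `path` begins with the HDF5 signature.

    Hypothetical helper: only offset 0 is checked. The format also allows
    the superblock to sit at later offsets (for example after a user
    block), which this sketch ignores.
    """
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE

# Demonstration on two throwaway files; neither is a complete HDF5 file.
with tempfile.TemporaryDirectory() as tmp:
    fake_hdf5 = os.path.join(tmp, "fake.h5")
    plain_text = os.path.join(tmp, "notes.txt")
    with open(fake_hdf5, "wb") as f:
        f.write(HDF5_SIGNATURE + b"\x00" * 8)
    with open(plain_text, "wb") as f:
        f.write(b"just some text\n")
    fake_result = starts_with_hdf5_signature(fake_hdf5)
    text_result = starts_with_hdf5_signature(plain_text)
```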
HDF5 File Format Specification
Introduction
You can have the power of the format without worrying about the details of the specification.
Developing a Project Data Model
Project Domain Concepts
Logical Data Model: Relational, HDF5 Data Model
Physical Instantiation: A Relational Database, HDF5 File
Physical Instantiations
HDF5 / Filesystem
• Both allow traversal of objects in the hierarchy
• Both include internal metadata for fast access to subsets of the data
• Both can handle a variety of data
• An HDF5 file can be easily migrated or shared
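Traversal on the filesystem side of this comparison can be sketched with Python's standard library. In the sketch, directories stand in for groups and files for datasets. The directory names and the `list_tree` helper are invented for illustration, and no HDF5 library is involved.

```python
import os
import tempfile

def list_tree(root):
    """Return every directory and file under `root` as sorted relative paths."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)

# Hypothetical layout: one directory per floor and room, one file of readings.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "floor1", "room101"))
    with open(os.path.join(root, "floor1", "room101", "temps.bin"), "wb") as f:
        f.write(b"\x00" * 8)
    tree = list_tree(root)
```

The hierarchy here is spread over several filesystem objects, whereas an HDF5 hierarchy lives inside a single file, which is what makes the last bullet above possible.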
HDF5 / “Binary Flat File”
• “Binary Flat File” = A sequence of bytes representing (primarily) numeric data. Often written by scientific and engineering applications to save results from simulations or experiments.
• A binary flat file is usually the fastest way to write numeric data. Read performance varies depending on access patterns.
• Unlike HDF5, binary flat files are not self-describing or portable across architectures.
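The last bullet can be made concrete with a short sketch using Python's `struct` module. The values and the layouts are chosen arbitrarily for illustration. The point is that the same bytes decode without error under several different assumptions, and nothing in the file says which one is right.

```python
import struct

values = [1.5, -2.0, 1024.0]

# Write three 32-bit floats in little-endian order, with no header at all.
flat = struct.pack("<3f", *values)

# A reader that knows the layout recovers the values exactly.
correct = list(struct.unpack("<3f", flat))

# A reader that assumes big-endian order gets different numbers,
# and nothing in the bytes themselves signals the mistake.
wrong_order = list(struct.unpack(">3f", flat))

# A reader that assumes 16-bit integers also "succeeds" with the same bytes.
wrong_type = list(struct.unpack("<6h", flat))
```

A self-describing format stores the datatype, byte order, and shape alongside the data, so a reader does not have to guess.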
HDF5 / XML
• Both HDF5 and XML are self-describing and portable
• XML is text-based and requires contents to be accessed sequentially
• HDF5 is binary and supports random access and subsetting
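The difference between sequential and random access can be sketched with Python's standard library. The `<readings>` document and the fixed-size binary record layout are assumptions made for illustration. The binary layout stands in for binary formats in general and is not how HDF5 actually stores data.

```python
import io
import struct
import xml.etree.ElementTree as ET

readings = [20.5, 21.0, 19.75, 22.25, 18.0]

# Text form: one <r> element per reading inside a <readings> document.
xml_doc = "<readings>" + "".join(f"<r>{v}</r>" for v in readings) + "</readings>"

def nth_from_xml(text, n):
    """Stream through the document until the n-th <r> element is reached."""
    seen = 0
    source = io.BytesIO(text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "r":
            if seen == n:
                return float(elem.text)
            seen += 1
    raise IndexError(n)

# Binary form: fixed-size records, one little-endian double each.
RECORD = struct.Struct("<d")
binary_doc = b"".join(RECORD.pack(v) for v in readings)

def nth_from_binary(buf, n):
    """Seek straight to record n; earlier records are never read."""
    stream = io.BytesIO(buf)
    stream.seek(n * RECORD.size)
    return RECORD.unpack(stream.read(RECORD.size))[0]

xml_value = nth_from_xml(xml_doc, 3)
binary_value = nth_from_binary(binary_doc, 3)
```

Both calls return the same reading, but the XML reader has to parse everything that precedes it, while the binary reader computes a position and jumps there.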
HDF5 / PDF
• Both HDF5 and PDF formats are published and open
• Both can include heterogeneous types of information
  • PDF is focused on documents
  • HDF5 is focused on collections of different types, with strong support for multi-dimensional arrays of numeric data
• Both are portable across architectures
HDF5 / Relational Databases
• RDBs provide access control features; HDF5 does not
• RDBs are transaction-based; HDF5 is not
  • Transactions / logging introduce overhead that may not be needed
  • HDF5 is not designed for many writers to ‘random’ locations
• RDBs provide built-in indices to values
  • HDF5 provides navigation to datasets / subsets within datasets
• HDF5 files are portable across platforms
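The relational features listed above, transactions and value indices, can be seen in a small sketch with Python's built-in `sqlite3` module. The `temps` table, its columns, and the sample rows are invented for illustration. SQLite is used only because it ships with Python, not because the slides refer to it.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE temps (sensor TEXT, day TEXT, celsius REAL)")
# A value index: the engine can find rows by sensor without a full scan.
db.execute("CREATE INDEX temps_by_sensor ON temps (sensor)")

# Committed transaction: both rows become visible together.
with db:
    db.execute("INSERT INTO temps VALUES ('lobby', '2010-06-01', 20.5)")
    db.execute("INSERT INTO temps VALUES ('lab', '2010-06-01', 22.0)")

# Failed transaction: the insert is rolled back when the error propagates.
try:
    with db:
        db.execute("INSERT INTO temps VALUES ('attic', '2010-06-01', 17.0)")
        raise RuntimeError("simulated failure before commit")
except RuntimeError:
    pass

row_count = db.execute("SELECT COUNT(*) FROM temps").fetchone()[0]
lab_temp = db.execute(
    "SELECT celsius FROM temps WHERE sensor = ?", ("lab",)
).fetchone()[0]
db.close()
```

Only the two committed rows remain after the simulated failure. That guarantee is the kind of bookkeeping the slide describes as overhead when an application does not need it.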
Discussion
• How could daily temperature measurements made at various locations throughout a building be modeled in different formats? Filesystem, Binary Flat File, XML, PDF, Relational Database
• What are some pros/cons of each?
Review
• HDF5 consists of
  • file format
    • self-describing
    • many internal structures to support high performance
  • software
  • data model
    • file, dataset, datatype, dataspace, attribute, group, link
• HDF5 is designed to support
  • management of high-volume, complex data
  • data sharing and preservation
The HDF Group
HDF5 Data Model
Example
ENSIGHT
Automotive Crash Simulation
Automotive Crash Simulation
Solid Modeling
Modeled in HDF5
Mesh Example in HDFView
Stretch Break