The HDF Group
Introduction to HDF5
Session ?
HDF5 Mathematical Concepts
Copyright © 2010 The HDF Group. All Rights Reserved
www.hdfgroup.org
Fundamental HDF5 Objects
• Groups
• Containers of links
• Allow creating arbitrary directed graphs, including non-tree-like and cyclic structures
• Datasets
• Multi-dimensional arrays (currently)
• Based on mathematical concept of “fiber bundle” –
representing the values of a field over a space
Groups - Overview
• Groups are container objects in a file that follow a
“set” data structure semantic:
• Groups contain links
• No two links in a group can have the same name
• Links have two components:
• Name
• Destination
• Three types of links currently:
• Hard – Destination is object in same file
• Soft – Destination is path to object in same file
• External – Destination is path to object in another file
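The rules above can be sketched as a small in-memory model. This is only an illustration of the "set of named links" semantic, not the HDF5 library API: the `Link` and `Group` classes, the `add_link` method and the file name `other.h5` are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    name: str
    kind: str            # "hard", "soft", or "external"
    destination: object  # hard: the object itself; soft: a path string;
                         # external: a (file name, path) pair


@dataclass
class Group:
    # Link name -> Link. Using a dict enforces unique names per group.
    links: dict = field(default_factory=dict)

    def add_link(self, name, kind, destination):
        if name in self.links:
            raise ValueError(f"link name already in use: {name!r}")
        self.links[name] = Link(name, kind, destination)


root = Group()
data = Group()
root.add_link("data", "hard", data)                            # object in same file
root.add_link("alias", "soft", "/data")                        # path in same file
root.add_link("remote", "external", ("other.h5", "/results"))  # path in another file

# A second link named "data" in the same group is rejected.
try:
    root.add_link("data", "hard", Group())
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

Note that in this model the name belongs to the link, not to the destination object, which is what lets one object be reached under several names.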
Groups
• Tree, with hard links
Groups
• Non-Tree, with hard links
Groups
• Cyclic, with hard links
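One practical consequence of cyclic structures is that any code walking the links has to remember what it has already visited. A minimal sketch, assuming a toy graph in which each group is identified by a hypothetical label (`root`, `A`, `B`) used only by this model:

```python
def reachable(graph, start):
    """Return the set of groups reachable from start by following hard links."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited: this is what stops an endless loop
        seen.add(node)
        stack.extend(graph[node].values())
    return seen


# group label -> {link name -> destination group label}
graph = {
    "root": {"a": "A"},
    "A": {"b": "B"},
    "B": {"back": "root"},  # this hard link closes a cycle
}

visited = reachable(graph, "root")
```

Without the `seen` set, the traversal would follow `root -> A -> B -> root` forever.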
Groups
• Tree, with soft links
Groups
• Tree, with external links
Groups - Discussion
• What would happen if links didn’t have names, but
objects had names?
• What other types of links are useful?
Datasets - Overview
• Datasets are objects in an HDF5 file that represent “real”
application data
• Array-like currently
• Datasets have three components:
• Dataspace describes current and maximum dimensions of
array
• Datatype describes type of elements in array
• Elements are the values stored in the array
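The three components can be pictured with a small model. This is a sketch under stated assumptions, not how the HDF5 library stores anything: the `Dataspace` and `Dataset` classes are hypothetical, elements are kept in a flat row-major list, and `None` stands for an unlimited maximum dimension.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Dataspace:
    dims: tuple     # current dimensions
    maxdims: tuple  # maximum dimensions (None = unlimited in this sketch)


@dataclass
class Dataset:
    space: Dataspace
    dtype: type     # type of every element
    elements: list  # the values, flattened in row-major order

    def __post_init__(self):
        if len(self.elements) != prod(self.space.dims):
            raise ValueError("element count does not match dataspace")
        if not all(isinstance(v, self.dtype) for v in self.elements):
            raise TypeError("element does not match datatype")

    def __getitem__(self, index):
        if len(index) != len(self.space.dims):
            raise IndexError("wrong number of coordinates")
        offset = 0
        for i, n in zip(index, self.space.dims):
            if not 0 <= i < n:
                raise IndexError("coordinate out of range")
            offset = offset * n + i  # row-major offset
        return self.elements[offset]


temps = Dataset(
    space=Dataspace(dims=(2, 3), maxdims=(None, 3)),
    dtype=float,
    elements=[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
)
value = temps[1, 2]  # last element of the second row
```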
Datasets – Measurement Example
• Think of algebraic concept of independent and
dependent variables
• X-Y Plot:
Dataset – Measurement Example, 2
• X-Y Plot data in Database:
Dataset – Measurement Example, 3
• X-Y Plot data in HDF5 Dataset:
Dataset – Measurement Example, 4
• In HDF5, independent variables are implicit and not
stored (they are the coordinates of elements in array)
• In Database, independent variables are explicitly
stored in each record
• A “packed” HDF5 dataset of N dimensions is up to N
times smaller than a database table storing the same
data.
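The size argument can be checked with simple arithmetic. The sketch below assumes every coordinate is stored as a 64-bit integer and every value as a 64-bit float, and it ignores metadata, indexes, padding and compression on both sides; the function names are hypothetical. Under these assumptions each table record holds N coordinates plus one value, so the ratio comes out at N + 1, the same order as the figure quoted above.

```python
import struct


def element_count(shape):
    """Number of elements in an array of the given shape."""
    count = 1
    for n in shape:
        count *= n
    return count


def table_bytes(shape):
    """One record per element: N int64 coordinates plus one float64 value."""
    record_size = struct.calcsize("<" + "q" * len(shape) + "d")
    return element_count(shape) * record_size


def packed_bytes(shape):
    """Packed array: only the float64 values; coordinates are implicit."""
    return element_count(shape) * struct.calcsize("<d")


shape = (10, 20, 30)  # N = 3 independent variables
ratio = table_bytes(shape) / packed_bytes(shape)
```

With narrower coordinate types or wider values the ratio shrinks, which is why the claim is best read as an upper bound.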
Datasets - Discussion
• When would storing data in a database table be better
than storing the same data in an HDF5 dataset?
• If you were measuring two dependent values at each
coordinate, what are the trade-offs between storing
them as a pair for each element in a single dataset and
storing each one in a separate dataset?
Review
• Fundamental HDF5 Objects are:
• Groups
• Containers of links to objects
• Create arbitrary directed graph structures
• Datasets
• Multi-dimensional arrays of elements
• Based on mathematical concept of fiber bundles, but can be
thought of in terms of independent and dependent variables
Stretch Break
Dataset – Fiber Bundles
• HDF5 datasets are actually based on the mathematical
concept of “fiber bundles”
A fiber bundle consists of the data (E, B, π, F), where E, B, and F are
topological spaces and π : E → B is a continuous surjection satisfying
a local triviality condition outlined below. The space B is called the
base space of the bundle, E the total space, and F the fiber.
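The quoted definition mentions a local triviality condition without stating it. For reference, the standard formulation is given here; the symbols U, φ and proj₁ are introduced for this note and do not appear in the slides. Every point of the base space has an open neighborhood over which the bundle looks like a plain product with the fiber:

```latex
\[
\forall x \in B \;\; \exists\, U \subseteq B \text{ open},\; x \in U,
\;\; \exists\, \varphi : \pi^{-1}(U) \to U \times F \text{ a homeomorphism}
\quad \text{such that} \quad
\operatorname{proj}_1 \circ \varphi = \pi\big|_{\pi^{-1}(U)} ,
\]
\[
\text{where } \operatorname{proj}_1 : U \times F \to U,\;
\operatorname{proj}_1(u, f) = u .
\]
```

A section is then a continuous map s : B → E with π ∘ s = id_B, that is, a choice of one point of the fiber above each point of the base space.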
Dataset – Fiber Bundles, 2
Mathematics      HDF5
Fiber Bundle     Dataset
Base Space       Dataspace
Fiber Space      Datatype
Section          Elements
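The correspondence can be made concrete with a discrete toy model. This is a loose illustration only: it treats the base space as a finite grid of index tuples, takes the fiber to be the Python floats, and ignores topology entirely. The names `base_space`, `section` and `projection` are chosen here to mirror the mathematical terms.

```python
from itertools import product

# Base space (dataspace): every coordinate of a 2 x 3 array.
shape = (2, 3)
base_space = list(product(*(range(n) for n in shape)))


def section(point):
    """Section (elements): picks one value from the fiber above each point."""
    i, j = point
    return float(10 * i + j)  # fiber (datatype): floats


def projection(total_space_point):
    """pi: maps a (coordinate, value) pair back down to its coordinate."""
    point, _value = total_space_point
    return point


# Defining property of a section: projecting s(p) recovers p.
is_section = all(projection((p, section(p))) == p for p in base_space)

# The stored elements are just the section evaluated over the base space.
elements = [section(p) for p in base_space]
```

Here the independent variables are the points of `base_space` and the dependent variable is the value returned by `section`, which matches the X-Y plot picture from the measurement example.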