ODB Training 2005
Download
Report
Transcript ODB Training 2005
Introduction to
Observational DataBase
(ODB)
Sami Saarinen, Paul Burton
ECMWF
22-Mar-2006
ODB Training 2006
slide 1
ECMWF
Overview
Introduction to ODB
Creating a simple database
Use of simulobs2odb –program
Visualizing data using odbviewer, ODBTk
The bigger picture
ODB within IFS/4DVAR-system
A more complex database
Manipulating ODB from Fortran90
Tools: odbless, odbdiff, odbcompress, odbdup, odb2netcdf
ODB Training 2006
slide 2
ECMWF
Overview
Introduction to ODB
Creating a simple database
Use of simulobs2odb –program
Visualizing data using odbviewer, ODBTk
The bigger picture
ODB within IFS/4DVAR-system
A more complex database
Manipulating ODB from Fortran90
Tools: odbless, odbdiff, odbcompress, odbdup, odb2netcdf
ODB Training 2006
slide 3
ECMWF
Introduction to ODB
ODB is a tailor made database software developed at
ECMWF to manage very large observational data volumes
through the IFS/4DVAR-system, and to enable flexible postprocessing of observational data
Observational database usually contains following items:
Observation identification, position and time coordinates
Observation value, pressure levels, channel numbers
Various quality control flags
Obs. departures from background and analysis fields
Satellite specific information
Other closely related information
ODB Training 2006
slide 4
ECMWF
AMSU-A data before screening
ODB Training 2006
slide 5
ECMWF
Basic components of ODB
ODB/SQL-language
Data Definition Language: To describe what data items
belong to database, what are their data types and how
they are related (if any) to each other
Data Query Language: To query and return a subset of
data which satisfies certain user specified conditions.
This is the key feature of the ODB software !!
Fortran90 interface layer
Data manipulation : create, update & remove data
Execute ODB/SQL-queries and retrieve filtered data
To control MPI and OpenMP-parallelization
ODB Training 2006
slide 6
ECMWF
ODB/SQL compilation system
ODB Training 2006
slide 7
ECMWF
Typical ODB usage patterns
Database can be created interactively or in batch mode
We usually run our in-house BUFR2ODB in batch
New observation types can also be fed in via text file
Complete database manipulation currently prefers using
Fortran90-interface, but read/only database can also be
accessed via rudimentary client-server –interface (C/C++)
When database has been created, the application program
normally queries data and places the result (also known as
view) into a data matrix allocated by the user
There can be virtually any number of active views at any
given time. These can be updated and fed back to database
ODB Training 2006
slide 8
ECMWF
Overview
Introduction to ODB
Creating a simple database
Use of simulobs2odb –program
Visualizing data using odbviewer, ODBTk
The bigger picture
ODB within IFS/4DVAR-system
A more complex database
Manipulating ODB from For
Tools: odbless, odbdif, odbcompress, odbdup, odb2netcdf
ODB Training 2006
slide 9
ECMWF
Creating a simple database
We will create a very simple database using text files
The 3 text files describe
Data layout i.e. what data items comprise this ODB
Location and time information of observations
Actual observation measurement information for each
location at the given pressure levels
Feed these files into simulobs2odb-program
Discover the data values in database by using odbviewer
ODB Training 2006
slide 10
ECMWF
Data definition layout : MYDB.ddl
CREATE TABLE hdr AS (
CREATE TABLE body AS
(
seqno
pk1int,
entryno
pk1int,
obstype
pk1int,
varno
pk1int,
codetype pk1int,
vertco_type
pk1int,
lat
pk9real,
press
pk9real,
lon
pk9real,
obsvalue
pk9real,
date
yyyymmdd,
time
hhmmss,
body
@LINK,
);
);
ODB Training 2006
slide 11
ECMWF
Input file#2 : hdr.txt
#hdr
obstype = 2
codetype = 141
seqno lat lon
1
45 -15
ODB Training 2006
date
time
body.len
20041101
000000
1
slide 12
ECMWF
Input file#3 : body.txt
#body
entryno
varno
vertco_type
press
obsvalue
2
1
50000
251.0
1
ODB Training 2006
slide 13
ECMWF
Running simulobs2odb
Initialize ODB interactive environment :
use odb
Create database using the following simple command :
simulobs2odb –l MYDB –i hdr.txt –i body.txt
As a result of these commands, a small database called
MYDB has been created and it contains one data pool with
two tables hdr and body, which are linked (related) to each
other via special @LINK data type
It is now easy to extend database by providing more data,
or specifying more data items, or adding more tables, or all
above at the same time
ODB Training 2006
slide 14
ECMWF
Visualizing with odbviewer
History: odbviewer was originally written to be used as a
debugging tool for ODB software development
Linked with ECMWF graphics package MAGICS/MAGICS++
it displays coverage plots
Also a textual report generator
Displays output of data queries
“Sensitive” to ODB/SQL-language : tries automatically
produce both coverage plot and textual report for the user
Textual report itself can be invaluable source of information
for further post-processing tasks
ODB Training 2006
slide 15
ECMWF
Running odbviewer
Go to database directory
cd MYDB
Run
odbviewer –q ‘SELECT lat,lon,press,obsvalue\
FROM hdr, body \
WHERE obstype = 2’
ODB Training 2006
slide 16
ECMWF
odbviewer coverage plot
Our observation !!
ODB Training 2006
slide 17
ECMWF
Some odbviewer [options]
-h
List of options (gimme some “help” !)
-q ‘SQL-stmt’
Provide ODB/SQL-statement inline
-v viewname/poolno
Choose SQL name (& optionally pool number)
-p “1-10,12,15”
Choose from a subset of pools
-R
No radians-to-degrees conversion for (lat,lon)
-r
Enforce radians-to-degrees conversion
-c
Clean start (i.e. recompile all)
-e editor
Choose preferred editor
-e batch
Run in batch mode (same as –e pipe)
-N
Do not produce a report at all
-I
Do not show plot immediately
-P projection
Change projection
-C file.cmap
Supply a color map file
-A plot_area
Choose plotting area
ODB Training 2006
slide 18
ECMWF
ODBTk : The ODB Toolkit
GUI based ODB visualisation tool
Easy way for non-experts to build SQL
Interactive viewing of observational data
Can refine SQL “WHERE” statement as you
view the data
Portable, lightweight application
Requires ODB, perl, Fortran90 & C compilers
ODB Training 2006
slide 19
ECMWF
ODBTk : Building an SQL
Twin views on structure
Hierarchical structure
Allows relationship between
tables/columns to be seen
“Flat structure”
Easy to find a given column/member
or table
Allows user to sort structure
SQL library
Both local & shared
ODB Training 2006
slide 20
ECMWF
Visualising Coverage
ODB Training 2006
slide 21
ECMWF
Visualising X-Y plots
ODB Training 2006
slide 22
ECMWF
Overview
Introduction to ODB
Creating a simple database
Use of simulobs2odb –program
Visualizing data using odbviewer, ODBTk
The bigger picture
ODB within IFS/4DVAR-system
A more complex database
Manipulating ODB from Fortran90
Tools: odbless, odbdiff, odbcompress, odbdup, odb2netcdf
ODB Training 2006
slide 23
ECMWF
AMSU-A data after screening
Under 10% left active !!
ODB Training 2006
slide 24
ECMWF
ODB within IFS/4DVAR-system
ECMA/ODB
CCMA/ODB
Output BUFRs
ODB Training 2006
slide 25
ECMWF
A more complex database
In the real world a database may contain many more tables
(>>5) than in the simple example earlier
Each table can contain 10—50 data columns
There can also be a sophisticated data hierarchy (next
slide) to describe potentially complex relationships
between tables
In order to provide a good parallel performance on
supercomputers, data tables are furthermore divided into
data pools
They behave like sub-databases within a database
Allows much bigger data sets than otherwise possible
ODB Training 2006
slide 26
ECMWF
Comprehensive data hierarchy
ODB Training 2006
slide 27
ECMWF
ECMWF BUFR to ODB conversion
ODBs at ECMWF are normally created by using bufr2odb
Enables MPI-parallel database creation efficient
Allows retrospective inspection of Feedback BUFR data
by converting it into ODB
bufr2odb can also be used interactively, for example:
bufr2odb –i bufr_input_file –I 1-20 –n 4
The preceding example creates 4 pools of ECMA database
from the given BUFR input file, but includes only BUFR
subtypes from 1 to 20 (inclusive)
Feedback BUFR to ODB works similarly:
fb2odb –i feedback_bufr_file –n 8 –u 2
ODB Training 2006
slide 28
ECMWF
Overview
Introduction to ODB
Creating a simple database
Use of simulobs2odb –program
Visualizing data using odbviewer, ODBTk
The bigger picture
ODB within IFS/4DVAR-system
A more complex database
Manipulating ODB from Fortran90
Tools: odbless, odbdiff, odbcompress, odbdup, odb2netcdf
ODB Training 2006
slide 29
ECMWF
Manipulating ODB from Fortran90
Currently Fortran90 is the only way to fill an ODB database
simulobs2odb is also a Fortran90-program underneath
likewise odbviewer or practically any other ODB-tool
Also: to fetch and update data, Fortran90 is necessary
ODB Fortran90 interface layer offers a comprehensive set
of functions to
Open & close database
Attach to & execute precompiled ODB/SQL queries
Load, update & store queried data
ODB Training 2006
slide 30
ECMWF
An example ODB program
program main
use odb_module
implicit none
integer(4) :: h, rc, nra, nrows, ncols, npools, j, jp
real(8), allocatable :: x(:,:)
npools = 0
h = ODB_open(‘MYDB’, ’OLD’, npools=npools)
< data manipulation loop ; see next page >
rc = ODB_close(h, save=.TRUE.)
end program main
ODB Training 2006
slide 31
ECMWF
Data manipulation loop
DO jp=1,npools
! Execute SQL, allocate space, get data into matrix
rc = ODB_select(h,’sqlview’,nrows,ncols,poolno=jp)
allocate(x(nrows,0:ncols))
rc = ODB_get(h,’sqlview’,x,nrows,ncols,poolno=jp)
! Update data, put back to DB, deallocate space
call update(x,nrows,ncols) ! Not an ODB-routine
rc = ODB_put(h,’sqlview’,x,nrows,ncols,poolno=jp)
deallocate(x)
rc = ODB_cancel(h,’sqlview’,poolno=jp)
! Use the following only with READONLY-databases
! rc = ODB_release(h,poolno=jp)
ENDDO
ODB Training 2006
slide 32
ECMWF
Compile, link and run
(1) use odb
# once per session
(2) odbcomp MYDB.ddl
# once only;often from file MYDB.sch
(3) odbcomp sqlview.sql # recompile only when changed
(4) odbf90 main.F90 update.F90 –lMYDB –o main.x # link
(5) ./main.x
ODB Training 2006
# run
slide 33
ECMWF
Overview
Introduction to ODB
Creating a simple database
Use of simulobs2odb –program
Visualizing data using odbviewer, ODBTk
The bigger picture
ODB within IFS/4DVAR-system
A more complex database
Manipulating ODB from Fortran90
Tools: odbless, odbdiff, odbcompress, odbdup, odb2netcdf
ODB Training 2006
slide 34
ECMWF
odbless
A textual browser that allows to look at ODB data page-bypage –basis (a little like Unix less-command):
By default calculates statistical summary for each
retrieved data column
Cheap with near-optimal ODB data access pattern
User has a choice of specifying starting row
Usage:
odbless –q ‘SELECT column(s) FROM table(s) WHERE …’ \
–s starting_row –n number_of_rows_to_display \
[–b buffer_size –X]
ODB Training 2006
slide 35
ECMWF
odbdiff
Enables to compare two ODB databases for differences
Very useful tool when trying to identify errors/differences
between operational and experimental 4DVAR runs
Usage:
odbdiff –q ‘SELECT …’ DATABASE1 DATABASE2
By default brings up an xdiff-window with respect to diffs
If latitude and longitude were given in the data query, then
also produces a difference plot using odbviewer-tool
ODB Training 2006
slide 36
ECMWF
odbcompress
Enables creation of very compact database from the
existing one for
archiving purposes, or for smaller footprint
Makes post-processing considerably faster
At this point the user has choices of both
Truncating the data precision
Leaving out columns that are less of importance
Early tests show that this new tool achieves compression
factors from 2.5X to 11X
the higher compression being for satellite data !!
ODB Training 2006
slide 37
ECMWF
odbdup
Duplicates database(s) by copying metadata (low volume),
but shares the actual data (high volume)
Allows database sharing between multiple users
Over shared (e.g. NFS mounted) disk
Enables creation of time-series database, for example:
odbdup –i “200601*/ECMA.conv” –o USERDB
The previous example creates a new database labelled as
USERDB, which presumably spans over all the
conventional observations during January 2006
The heureka is : user has now access to a whole month
of data as if it was situated in one single database !!
ODB Training 2006
slide 38
ECMWF
odb2netcdf
Translates the given ODB-query (or whole ODB-table) into a
series of NetCDF-files, by default one file for each ODB
data pool
Usage:
odb2netcdf –q ‘SELECT …’
The result files can be viewed with standard NetCDF tools
like ncdump and ncview
The files can also be produced in NetCDF packed format
(with a caveat of truncated precision)
ODB Training 2006
slide 39
ECMWF
Also … Some interesting facts
Written mainly in C-language
Except Fortran90-interface and IFS/4DVAR interface
Except BUFRODB (by Milan Dragosavac)
ODB/SQL is currently converted into C-code
10 lines of SQL generates >> 100 lines of C-code
Standalone ODB installation (w/o IFS) is also available
Can be built in about 30 minutes for Linux/laptop
Tested at least on the following machines
SGI/Altix, IBM Power3/4, Linux Intel/AMD, VPP, …
Automatic binary data conversion guarantees database
portability between different machines
ODB Training 2006
slide 40
ECMWF
… and some ODB “limitations”
ODB software is clearly meant for large scale computation
since – given lots of memory and disk space, fast CPUs:
A single program can handle up to 2^31 ODB databases
A single database can have up to 2^31 data pools
A single database can have any number of tables
A single table in a data pool can have up to 2^31 rows
and (by default) 9999 columns
A single ODB/SQL-query over active data pools can
retrieve up to 2^31 rows in one go
These really big numbers show that ODBs potential is on
parallel computers, but we haven’t forgotten desktop PCs!
ODB Training 2006
slide 41
ECMWF
Finally…
ODB software is developed to allow unprecedented amounts
of satellite data through the IFS/4DVAR system
Software has been operational at ECMWF since
June’2000, but is still evolving
Emphasis is now on graphical post-processing and how
to enable fast access to very large amounts of data
Other ECMWF member states and co-operating countries
that are also using or just becoming users of ODB
MeteoFrance, DWD, Hungary, Aladin/HIRLAM-nations
MetOffice is considering via collaboration with BoM
University of Vienna via re-analysis ERA40 collaboration
ODB Training 2006
slide 42
ECMWF