Towards Service Oriented Geoscience SEE Grid and APAC Grid

Towards Service Oriented Geoscience
SEE Grid and APAC Grid
Dr Robert Woodcock
Executive Manager, e-Science
www.csiro.au
Outline
• Industry drivers
• Inefficiencies in “geoscience” modelling workflow
• The Solid Earth and Environment Grid
• The APAC (Geoscience) Grid
• Putting it all together: pmd*CRC Modelling Workflow for Industry problems
• Results and what might the future hold?
Australian National Research Priorities
Frontier Technologies for Building and Transforming Australian Industries:
Stimulating the growth of world-class Australian industries using innovative technologies developed from cutting-edge research
Priority Goal 4: Smart information use
Improved data management for existing and new business applications and creative applications for digital technologies
• ICT applications are providing huge opportunities to deliver new systems, products, business solutions, and to make more efficient use of infrastructure
• The ability of organisations to operate virtually and collaborate across huge distances in Australia and internationally hinges on our capabilities in this area
Key points from case studies and support letters
• Show the diversity of use cases for the same data type throughout the mining value chain
• Show a strong business case for interoperability for management of your data in the external world
• Show an even stronger business case for interoperability for internal data management
• Show why standards need to be developed by groups working together as part of a community
• Highlight the emerging issue of responsibility for data quality becoming a legislative matter
Key Driver: Input to the Minerals Exploration Action Agenda – July 2003
Industry input:
• highlighted problems in gaining access to pre-competitive geoscience information
• described existing information as commonly incomplete and fragmented across eight government agencies, each with its own information management systems and structures
• noted that the disparate systems lead to inefficiencies causing higher costs, reduced effectiveness and increased risk incurred by the industry and its service providers
Source: http://www.industry.gov.au/assets/documents/itrinternet/minerals_aa_finalreport_July2003.pdf
Modelling Workflow
Define the geological problem
Build the model
Run the model
View and Interpret Results
Iterate to achieve Understanding
Report and feed into knowledge base
…Must be repeatable, robust and timely
What is the role of:
• Competency contrasts?
• Permeability?
• Pore fluid pressure & flow fields?
[Figure: Block model of dilation showing impact of Fault set “A” Dip variation; panel labels: very weak, weak, mod., strong; Tensile failure]
Inefficiencies in the Workflow
• Information is scattered across:
  - Organisations – company, geological survey, etc
  - Resources – different hardware and software platforms
  - Geography – geological surveys in each state and territory (region) in Australia
  - Cost of data integration is high, in some situations exceeding all other costs
• Computational resources:
  - Different architectures suit different numerical codes better
  - Are often available but outside your organisation's direct control
  - Are set up in different ways
  - Cost of adapting an investigator's specific toolkit to use multiple sites is often prohibitive
Can these issues be removed?
The Solid Earth and Environment Grid
Obtaining information…
The SEE Grid Community
Working together (loosely) to develop a toolkit for interoperability for the Solid Earth and Environmental Sciences
• Together… because our information and services need to be shared more easily to achieve our goals
• Loosely… because ultimately we are separated by political and economic boundaries
• Toolkit… because our World is dynamic and we need tools that can be reconfigured and chained together quickly to answer our questions
…in this context we must reduce the barriers to becoming a part of the community
Pre-competitive geoscience data - The trouble is…
[Diagram labels: Client; Proprietary Software; Versions of Software; Data Structures]
Slide courtesy of Stuart Girvan
Our aim…
[Diagram labels: Client; XML; GML/XMML]
Slide courtesy of Stuart Girvan
[Architecture diagram, top to bottom]
CLIENT APPLICATIONS: GA Reports Application; WebMap Composer
Common Interface Binding – GML/XMML
DATA ACCESS SERVICES (translation to standards here): DOIR Web Feature Service (WFS), Geoserver (Open Source); GA Web Feature Service (WFS); PIRSA Web Feature Service (WFS)
DATA SOURCES (little or no change required here): DOIR Geochemistry Feature Data Source, PostGIS (Open Source); PIRSA Geochemistry Feature Data Source, PostGIS (Open Source); GA Geochemistry Feature Data Source, Oracle
[Architecture diagram, top to bottom]
CLIENTS: pmd*CRC Model Tools; GA Reports Application; WebMap Composer; FracSIS; ?
Common Interface Binding – GML/XMML
DATA SERVICES: DOIR WFS; GA WFS; PIRSA WFS; NSWDPI WFS; NRM WFS; MRT WFS; NTGS WFS; VICDPI WFS
DATA SOURCES
The Solid Earth and Environment Grid
Information - Implementation and Examples
Common Interface Binding - Details
Two parts
1. Service interface standard – how you communicate with the service, sending requests and receiving results
2. Information standards – how information is encoded in a community agreed form
We use and develop Open Geospatial Consortium standards, together with the Exploration and Mining Mark-up Language (XMML) and its successor, GeoSciML
Open Geospatial Consortium Web Feature Service (WFS)
Application (web based or desktop) ↔ Web Feature Service, over the http protocol
• Get Capabilities Request (XML/KVP) → Get Capabilities Response (XML)
• Describe Feature Type Request (XML/KVP) → Describe Feature Type Response (GML Schema)
• Get Feature Request (XML/KVP) → Get Feature Response (GML)
The Web Feature Service draws on Config Files and a Data Source
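The three request types in the WFS exchange can be sketched as key-value-pair (KVP) URLs. This is a minimal illustration only: the endpoint, the feature type name `xmml:Borehole`, the bounding box and the choice of WFS version 1.1.0 are assumptions made for the example, not details from the slides.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; every WFS deployment has its own URL.
WFS_ENDPOINT = "http://example.org/geoserver/wfs"


def wfs_url(request, **params):
    """Build a key-value-pair (KVP) WFS request URL."""
    query = {"service": "WFS", "version": "1.1.0", "request": request}
    query.update(params)
    return WFS_ENDPOINT + "?" + urlencode(query)


# 1. Ask the service what it offers.
capabilities_url = wfs_url("GetCapabilities")
# 2. Ask for the schema of one feature type (the type name is hypothetical).
describe_url = wfs_url("DescribeFeatureType", typeName="xmml:Borehole")
# 3. Fetch features inside a bounding box, capped at 10 results.
feature_url = wfs_url(
    "GetFeature",
    typeName="xmml:Borehole",
    bbox="115.0,-35.0,129.0,-14.0",
    maxFeatures=10,
)

for url in (capabilities_url, describe_url, feature_url):
    print(url)
```

The same three requests can also be sent as XML documents over HTTP POST; KVP is used here only because it is the shorter of the two encodings.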
Response in Geography Mark-up Language (GML) - Or more usefully, a GML Application Schema
Features – Geoscience Community (XMML & GeoSciML)
Borehole
• collar location
• shape
• collar diameter
• length
• operator
• logs
• related observations
• …
Fault
• shape
• surface trace
• displacement
• age
• …
Basin?
• formations
• shape – time dependent
• resource estimate
• …
Ore-body
• commodity
• deposit type
• host formation
• shape
• resource estimate
• …
Observation
• location
• subject/specimen/station
• property/theme
• method
• operator
• date/time
• result (+ type/reference system/scale/classification)
• …
Data source to community schemas
• Community schemas provide the common or shared model
• All data providers have their own local data model
• All data providers must map data from local source (database) to community schema, irrespective of technology implementation
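The mapping from a local data model to a community schema can be sketched as a simple rename-and-serialise step. The column names, the property names and the flat `Borehole` element below are all hypothetical; a real XMML or GeoSciML mapping would use namespaces and a richer structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one provider's local column names to
# community-schema property names (both sides are illustrative only).
LOCAL_TO_COMMUNITY = {
    "HOLE_ID": "name",
    "DRILLER": "operator",
    "DEPTH_M": "length",
}


def to_community_feature(row, mapping, feature_name="Borehole"):
    """Re-express a local database row as a community-schema XML element."""
    feature = ET.Element(feature_name)
    for local_name, community_name in mapping.items():
        if local_name in row:
            child = ET.SubElement(feature, community_name)
            child.text = str(row[local_name])
    return feature


# A stand-in for one row read from a provider's own database.
local_row = {"HOLE_ID": "BH-001", "DRILLER": "Survey A", "DEPTH_M": 152.5}
element = to_community_feature(local_row, LOCAL_TO_COMMUNITY)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

Each provider keeps its own table layout and supplies only the mapping, which is why the approach works irrespective of the database technology underneath.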
Why XML?
• Extensibility
• Self describing
• Ability to be (remotely) validated against schema
• XML Schema provides “loose tolerances”
• All software languages have tools to deal with XML
But…
Problematic for large data sets… though nobody said you can’t use binary as well (even over WFS)
→ Community agreement is what matters
How would you use an interoperable service?
A user makes a request and gets back GML based data which can be…
• …rendered into a map layer AND queried by a user, or…
• …formatted into a report, or…
• …read and used by any enabled application
Slides courtesy Stuart Girvan – Geoscience Australia
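As a rough sketch of the "formatted into a report" use, the snippet below parses a small hand-written feature collection and prints one table row per feature. The `ex` namespace, the `Sample` element and its properties are invented for the example and are not taken from XMML or GeoSciML.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a Get Feature response. The "ex" namespace and
# its element names are invented for this sketch.
RESPONSE = """\
<wfs:FeatureCollection xmlns:wfs="http://www.opengis.net/wfs"
                       xmlns:gml="http://www.opengis.net/gml"
                       xmlns:ex="http://example.org/ex">
  <gml:featureMember>
    <ex:Sample><ex:source>GA</ex:source><ex:element>Au</ex:element><ex:ppm>0.8</ex:ppm></ex:Sample>
  </gml:featureMember>
  <gml:featureMember>
    <ex:Sample><ex:source>PIRSA</ex:source><ex:element>Cu</ex:element><ex:ppm>120</ex:ppm></ex:Sample>
  </gml:featureMember>
</wfs:FeatureCollection>
"""

NS = {"gml": "http://www.opengis.net/gml", "ex": "http://example.org/ex"}


def report_rows(gml_text):
    """Extract (source, element, ppm) tuples from a feature collection."""
    root = ET.fromstring(gml_text)
    rows = []
    for sample in root.findall("gml:featureMember/ex:Sample", NS):
        rows.append((
            sample.findtext("ex:source", namespaces=NS),
            sample.findtext("ex:element", namespaces=NS),
            float(sample.findtext("ex:ppm", namespaces=NS)),
        ))
    return rows


rows = report_rows(RESPONSE)
for source, element, ppm in rows:
    print(f"{source:<8}{element:<4}{ppm:>8.1f}")
```

Because every provider returns the same community schema, the same parsing code serves all sources; only the request URL changes.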
Web Map Interface (courtesy of Social Change Online)
[Screenshot labels: Bounding Box; Known Layers]
Tabular Reports by Source (courtesy of Geoscience Australia)
Desktop Visualisation (courtesy of Fractal Technologies)
High Performance Computing in Exploration
and Mining
Why use simulation and modelling?
• Mineral exploration has considerable uncertainty
• We use simulation and modelling to analyse an ensemble of possible geological structures and histories that could have produced the observations seen today
• The result is reduced uncertainty and some quantification of risk
This same approach applies to many fields – hazards, environment, … which is why we formed the SEE Grid community
Our toolkit…
Our toolkit contains a variety of codes (usually more than one of each type) for
• Mechanics
• Chemistry
• Transport
• Thermal
• Fluid flow
[Figure: Darcy flow and Streamlines]
Some of these can be coupled together:
• Reactive Transport – Chemistry + Transport + Thermal + Fluid flow
Some scenarios only require a subset…
It becomes very computationally intensive when using many…
AND we run many scenarios
Grid Computing provides a solution
[Architecture diagram, top to bottom]
Client Applications: Drill Core Analysis Workflow; Tsunami Workflow; Mantle Convection Modelling Workflow; Reactive Transport Workflow
Community Agreed Service Interfaces and Information Models
Gateway Services: APAC Web Feature Service (WFS); Industry Web Feature Service (WFS); Geological Survey Web Feature Service (WFS)
Facilities: APAC Data and Compute Grid; Government Geological Surveys Data and Knowledge Grid; Industry Data and Knowledge Grid
Grid Technology Layers
• pmd*CRC: Community-specific Knowledge Environments and Networks for Research and Education. Customised for discipline- and project-specific applications, eg, 3D models, Geophysics, Thermodynamics, Fluids, Geochronology
• SEE Grid: e-Science and e-Geoscience Layer. Data and Information Infrastructure; Visualisation 3-D models; Application Portals; Data and Knowledge Portals
• APAC Grid: Base Computing Technologies. Networks, Communications; High performance computing; High Volume Storage; Middleware Architecture
The Grid Application… Service Interactions
[Diagram of service interactions]
User Workflow (Client): Login; Edit Problem Description; Run Simulation; Job Monitor; Local Repository; Archive Search
Community Infrastructure: Authentication; Resource Registry; Job Management Service; Data Management Service
Information (Physical Resource): Geology W.A; Geology S.A; Geochem W.A; Geochem N.S.W
Computation (Physical Resource): FastfloRT Service; Escript Service; HPC Repository
Traditional Mechanical Modelling Workflow
• Models (mesh + data files) are individually and laboriously constructed
• The manual process is error prone
• “Powerful” desktop computes several models at a time
• Throughput is limited to the order of ~2 models per week
• Results are manually visualised one at a time
• Screenshots are manually taken and made into “movies”
• Very little, if any, standardised data archiving is done. This results in potential confusion or loss of the originating conditions of the experiments, making them unrepeatable in the long term
Slide courtesy of Robert Cheung and Warren Potma
New Refined Workflow
Parameterised Geometry Creation
• Parameterised template or wizard driven model geometry/mesh creation
• Boundary condition & model properties parameter sweep utilities
  - automatically creates a “family” of model, data files based on varying a set of parameters
• Inversion algorithms
  - determine input parameters of future iterations automatically based on the user ranking of previous results
Automated generation of visualisations: 3D time-varying volume visualisation
Automated movie generation
Automated archiving: multi-site data storage via Storage Resource Broker
Slide courtesy of Robert Cheung and Warren Potma
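A parameter sweep that creates a “family” of models can be sketched as the Cartesian product of the values chosen for each parameter. The parameter names and values below are hypothetical, since the slides do not list the actual model inputs, and the real utilities would also write mesh and data files for each member.

```python
from itertools import product

# Hypothetical parameters and values; the real model inputs are not
# specified in the slides.
SWEEP = {
    "fault_dip_deg": [30, 45, 60],
    "permeability_contrast": [10, 100],
    "pore_pressure_ratio": [0.4, 0.8],
}


def model_family(sweep):
    """Yield one parameter set per combination of the swept values."""
    names = list(sweep)
    for values in product(*(sweep[name] for name in names)):
        yield dict(zip(names, values))


family = list(model_family(SWEEP))
print(len(family), "models in the family")
print(family[0])
```

The family size is the product of the number of values per parameter (here 3 × 2 × 2 = 12), which is why even a modest sweep quickly outgrows a single desktop.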
Results to Date
For one Investigator, on one investigation:
• 500 Models in 4 months (100x more!)
• Inversion/parameter sweep algorithms – semi-automated model creation; faster, fewer errors
• Automated post-processing/visualisation – all views × all timescales × all models await the investigator automatically
• Automated archiving – metadata searchable, more accurate store of experimental conditions, delivered to your store!
Results
Major inefficiencies have been removed by:
• Integrating the pmd*CRC geoscience modelling workflow with the:
  - Solid Earth and Environment Grid, and
  - APAC (Geoscience) Grid
Industry response to the approach is supportive, as evidenced by SEE Grid Roadshow survey results and pmd*CRC applications
Name: Dr Robert Woodcock
Title: Executive Manager, e-Science
Phone: +61 8 6436 8780
Email: [email protected]
Web: www.csiro.au, www.seegrid.csiro.au
Thank You
Contact CSIRO
Phone: 1300 363 400, +61 3 9545 2176
Email: [email protected]
Web: www.csiro.au