Mining Complex Evolutionary Phenomena
M. Jiang, M. Coatney, S. Mehta, S. Parthasarathy, R. Machiraju
Computer and Information Science
The Ohio State University
D. Thompson, B. Gatlin
Center for Computational Systems
Mississippi State University
T-S. Choy, S. Barr, J. Wilkins
Department of Physics
Ohio State University
Complex Evolutionary Data
2
Insights Into Evolutions
Study evolution through simulations
Model them using continuum models
Obtain discrete models and solve
Generate data
However, …
3
Data Horror Stories …
4.5 million points
1500 time steps with full volume
output every 4 time steps (375 solutions)
750 MB per solution
281.25 GB of data
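The totals in this first example follow directly from the figures on the slide. A minimal check, assuming decimal units (1 GB = 1000 MB), which the slide does not state:

```python
# Figures taken from the slide; the decimal unit convention is an assumption.
time_steps = 1500
output_interval = 4        # one solution is written every 4 time steps
mb_per_solution = 750

solutions = time_steps // output_interval          # number of stored solutions
total_gb = solutions * mb_per_solution / 1000      # MB -> GB (decimal)

print(solutions, "solutions,", total_gb, "GB")
```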
O(10^8) grid points
Generates >10 Terabytes per day (every day)
Write to disk every 1/1000 time steps (99.9% discarded)
Final database ~1 Terabyte
All analysis is done after final database is obtained
…
4
Analysis, Mining, Visualization
5
Solutions !
Get the rings of the smoke
Track them in time
Mine their properties
Use some science drivers
6
Driver 1 - CFD
Vortices
7
Swirling Features
8
Swirling Features
9
CFD Of Interest – Bronchial Flow
Complex Non-rigid, Fractal-like Geometry
Deep recursive branching structure
Need insights into how flow changes
Study Vortices, swirling flow
Q: Persistence of vortex ?
Implications
Pulmonary drug delivery
Carcinogen Deposition
10
Flow Evolution – Internal Flow
11
Object of Study: Vortices
Swirling regions
Core (Center of vortex) and swirling streamlines …
12
Driver 2 - Material Formation
Grain
Grain
Boundary
13
MD Of Interest – Defect Evolution
Active device sizes (Si-based transistors) and passive
components (alloys) are shrinking
At sub-micron levels extended defects affect
performance
Extended defects
Si is doped with Boron in a “Hot Bath”
Non-uniform solidification
Arise from point defects
Study evolution of point defects and formation of
extended defects
Q: What structures finally remain ?
14
Defect Evolution
15
Object of Study: Defects
Defect Atoms - Red !
Point defects – interstitial and vacancy
Interstitial – Si atoms located at non-bulk position
16
Problem Statement
Need – Locating, Characterizing & Tracking Structures in Large
Domains.
Acts of Discovery and Perseverance!
Approach desired
Tied to simulations
Multiple time scales
Organized Search
Encode Structure, dynamics and relationships
Incorporate complex physics in discovery
Classification and categorization (similarity)
Verification of discovered entities for veracity
Generalize to other domains
17
Framework
Application
CFD, MD, …
Sensor
Multires
Transforms
Meta-stability
Detection
Transient
Detection
Feature Mining
Event Detection
Feature Tracking
Spatio-temporal
Rule Mining
Catalog
18
Components
Sensors –
Monitoring a stream
Swirl (CFD), Energy (MD)
Multiresolution Analysis
Temporal wavelet transform
Causal transforms
Eulerian Framework
Can be used with a spatial sub-division
Event Detection
Changes in Feature Demographics
Birth, death, continuation
Aggregation, bifurcation
Has impact on tracking
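The event types listed above can be illustrated by comparing the features found in two consecutive time steps. The sketch below is not the authors' method; it assumes each feature is a set of grid-cell ids and that two features correspond when their cell sets overlap. The function name `detect_events` and the sample ids are hypothetical.

```python
def detect_events(prev, curr):
    """Label changes in feature demographics between two time steps.

    prev, curr: dict mapping feature id -> set of grid-cell ids (assumed format).
    Returns a list of (event, previous_ids, current_ids) tuples.
    """
    # Link features across the two steps by spatial overlap (an assumption).
    succ = {p: [c for c, cells in curr.items() if cells & prev[p]] for p in prev}
    pred = {c: [p for p, cells in prev.items() if cells & curr[c]] for c in curr}

    events = []
    for p, cs in succ.items():
        if not cs:
            events.append(("death", (p,), ()))
        elif len(cs) > 1:
            events.append(("bifurcation", (p,), tuple(sorted(cs))))
    for c, ps in pred.items():
        if not ps:
            events.append(("birth", (), (c,)))
        elif len(ps) > 1:
            events.append(("aggregation", tuple(sorted(ps)), (c,)))
        elif len(succ[ps[0]]) == 1:
            # One-to-one link: the feature simply continues.
            events.append(("continuation", (ps[0],), (c,)))
    return events


prev = {"a": {1, 2}, "b": {5, 6}, "c": {9}, "e": {20, 21}, "f": {40}}
curr = {"x": {2, 3, 5}, "y": {30}, "z1": {20}, "z2": {21}, "w": {40, 41}}
events = detect_events(prev, curr)
for event in events:
    print(event)
```

Because aggregation and bifurcation break the one-to-one link between features, a tracker has to handle them explicitly, which is the impact on tracking noted above.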
19
Tracking - Correspondence
Lagrangian Framework
20
Feature Mining Mechanics
Do not just use raw data
Features – A feature is a manifestation of the
correlations between various parameters
Feature Mining –
Extract meta-stable features using underlying
physics
Describe features as tangible shapes
21
Shapes
Point cloud
Proximity graphs
Conical frusta
22
Similar Efforts - CFD
Marusic, Kumar, Karypis, Interrante, U of Minn.
Frequent subgraphs
23
Similar Efforts - MD
I1 Defect !
Alloys (Ni3Al)
Defect is infrequent, atomsets of bulk are not !
Run common substructure discovery algorithm
Get bulk !
Remove atoms contained in common substructure atomsets
Remainder of structure is defect!
24
Our Efforts
Finding Needles in a Haystack
25
Feature Mining 1
Data
Transform
Tour Grid
Aggregate
Denoise
Operator
Classify Points
Rank
Track
Catalog
ROIs
Classify-Aggregate
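The classify-aggregate pipeline can be sketched on a small 2D grid. This is a minimal illustration, not the authors' implementation: it assumes the sensor output is a scalar field, that classification is a simple threshold, that aggregation is 4-connected region growing, and that denoising drops regions below a minimum size. The names `classify_aggregate`, `threshold` and `min_size` are hypothetical.

```python
from collections import deque


def classify_aggregate(field, threshold, min_size=2):
    """Return regions of interest (lists of grid cells), largest first."""
    rows, cols = len(field), len(field[0])

    # Classify points: tour the grid and flag cells above the threshold.
    flagged = {(r, c) for r in range(rows) for c in range(cols)
               if field[r][c] > threshold}

    # Aggregate: grow 4-connected regions from the flagged cells.
    regions, seen = [], set()
    for start in sorted(flagged):
        if start in seen:
            continue
        seen.add(start)
        queue, region = deque([start]), []
        while queue:
            r, c = queue.popleft()
            region.append((r, c))
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in flagged and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        regions.append(region)

    # Denoise: discard regions that are too small to be meaningful.
    regions = [reg for reg in regions if len(reg) >= min_size]

    # Rank: largest regions first.
    regions.sort(key=len, reverse=True)
    return regions


field = [
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.9, 0.0],
]
rois = classify_aggregate(field, threshold=0.5)
print([len(region) for region in rois])
```

Tracking and cataloguing would then operate on the returned regions rather than on the raw field.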
26
Applying To Defect Detection
Visit all atom sites
Atom-site: Is it part of defect ?
Spatially aggregate atoms in located areas !
Works for quenched defects (local equilibria)
27
Feature Mining for Defects
Build spatially local classifiers
Define Bulk
Form Rules to define Bulk --C1, C2,…,Cn
Typical Rules:
C1 = prescribed bond length
C2 = prescribed bond angle
Defect is not bulk
28
Feature Mining for Defects
Core Defect Atoms will satisfy
C = ~C1 AND ~C2 AND ~C3 … AND ~Cn
Find neighborhood by locating atoms which
satisfy
D = ~C1 OR ~C2 OR ~C3 … OR ~Cn
Defect = Embed C graph in D graph
D is needed to deal with noise and uncertainty
of conditions Ci
Cluster all atoms in D
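The C and D sets can be sketched as follows. This is one possible reading of the slide, not the authors' code: each rule Ci is modelled as a predicate that is true for bulk-like atoms, atoms in D are clustered by a distance cutoff, and a cluster is reported as a defect only if it contains at least one core atom from C. The atom records, tolerances and `cutoff` value are illustrative assumptions.

```python
import math


def find_defects(atoms, rules, cutoff):
    """atoms: list of dicts with a 'pos' tuple plus measured properties.
    rules: predicates C1..Cn, each True when an atom looks like bulk.
    Returns sorted index lists, one per defect cluster."""
    violated = [[not rule(atom) for rule in rules] for atom in atoms]
    core = {i for i, v in enumerate(violated) if all(v)}   # C: every rule fails
    hood = {i for i, v in enumerate(violated) if any(v)}   # D: some rule fails

    # Cluster all atoms in D by spatial proximity.
    clusters, unvisited = [], set(hood)
    while unvisited:
        stack = [unvisited.pop()]
        cluster = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unvisited
                    if math.dist(atoms[i]["pos"], atoms[j]["pos"]) <= cutoff}
            unvisited -= near
            cluster |= near
            stack.extend(near)
        clusters.append(cluster)

    # Keep only D clusters that embed at least one core atom from C.
    return [sorted(c) for c in clusters if c & core]


# Illustrative rules: bond length and bond angle close to assumed bulk values.
rules = [
    lambda a: abs(a["bond"] - 2.35) < 0.1,     # C1: prescribed bond length
    lambda a: abs(a["angle"] - 109.5) < 5.0,   # C2: prescribed bond angle
]
atoms = [
    {"pos": (0.0, 0.0, 0.0), "bond": 2.35, "angle": 109.5},   # bulk
    {"pos": (10.0, 0.0, 0.0), "bond": 2.90, "angle": 90.0},   # fails C1 and C2
    {"pos": (11.0, 0.0, 0.0), "bond": 2.36, "angle": 95.0},   # fails C2 only
    {"pos": (30.0, 0.0, 0.0), "bond": 2.60, "angle": 109.0},  # isolated, fails C1
]
defects = find_defects(atoms, rules, cutoff=1.5)
print(defects)
```

In this reading, the isolated atom that fails only one rule is treated as noise, while the atom next to the core is kept as part of the defect neighborhood.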
29
Results – I3 Defects
I3A Defect
I3B Defect
30
Related Work - SAL
Original
Yip & Zhao 96
Aggregate
Classify
Redescribe
31
Does It Work Always ?
Compute Swirl (surviving equation terms: Im( ), V_conv)
Local Classification Method
Swirling regions contain
vortices
False Positives !
Cannot extract structures !
Classify-Aggregate
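A point-wise swirl classifier can be illustrated in 2D. The formula on this slide is not fully recoverable, so the sketch below uses a common eigenvalue-based definition as an assumption: a point is flagged as swirling when the velocity gradient tensor has complex eigenvalues, and the imaginary part is taken as the swirl strength. It omits any normalization by a convective velocity, and the function name `swirl_strength` is hypothetical.

```python
import math


def swirl_strength(dudx, dudy, dvdx, dvdy):
    """Imaginary part of the eigenvalues of the 2D velocity gradient tensor
    [[dudx, dudy], [dvdx, dvdy]]; 0.0 when the eigenvalues are real."""
    half_trace = 0.5 * (dudx + dvdy)
    det = dudx * dvdy - dudy * dvdx
    disc = half_trace ** 2 - det       # eigenvalues: half_trace +/- sqrt(disc)
    return math.sqrt(-disc) if disc < 0 else 0.0


# Rigid rotation u = -2y, v = 2x: complex eigenvalues, so the point swirls.
rotation = swirl_strength(0.0, -2.0, 2.0, 0.0)
# Pure shear u = y, v = 0: real eigenvalues, no swirl.
shear = swirl_strength(0.0, 1.0, 0.0, 0.0)
print(rotation, shear)
```

Because the test is purely local, it can flag points in swirling regions that do not belong to a coherent vortex, which is the source of the false positives noted on this slide.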
32
Solution - Feature Mining 2
Data
Transform
Tour Grid
Operator
Verify
Denoise
Aggregate
Rank
Track
Catalog
ROIs
Aggregate-Classify (Verify)
33
Classify-Aggregate
Yellow: Good, Green: Bad
Yellow ones really swirl !
34
Classifier
Simple and efficient !
Can be error prone, since one verifies
Point-based approach:
Label neighbors
Combinatorial:
Locally check for
complete triangles
35
Verification
2 Swirling Criteria
Verification Tools
36
Non-verifiable Regions
37
Defects at Finite Temp.
Visit all atom sites
Atom-site: Is it part of defect ?
Spatially aggregate atoms in located areas !
Quench defect to verify
38
Current Work
Streaming
Tracking and Correspondence
Shape Descriptors
Data Structures for Data Management
Spatio-temporal associations
39
Summary
Computational Sciences need computational
instruments
Need to be scalable and use all lessons learned from
parallel, distributed, streaming and out-of-core
implementations
Need to exploit underlying source of data
Should provide good hooks to data-mining and
intelligent systems
Need very interdisciplinary work !
40
Questions ?
41