Mining Complex Evolutionary Phenomena

M. Jiang, M. Coatney, S. Mehta, S. Parthasarathy, R. Machiraju
Computer and Information Science
The Ohio State University
D. Thompson, B. Gatlin
Center for Computational Systems
Mississippi State University
T-S. Choy, S. Barr, J. Wilkins
Department of Physics
Ohio State University
Complex Evolutionary Data
2
Insights Into Evolutions
 Study evolution through simulations
 Model them using continuum models
 Obtain discrete models and solve
 Generate data
 However, …
3
Data Horror Stories …
4.5 million points
1500 time steps with full volume
output every 4 time steps (375 solutions)
750 MB per solution
281.25GB of data
O(10^8) grid points
Generates >10 Terabytes per day (every day)
Write to disk once every 1000 time steps (99.9% discarded)
Final database ~1 Terabyte
All analysis is done after final database is obtained
…
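The first set of figures above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming decimal units (1 GB = 1000 MB); the variable names are illustrative only.

```python
# Back-of-the-envelope check of the first data-volume figures.
# Assumption: decimal units, 1 GB = 1000 MB.
time_steps = 1500
output_interval = 4          # one solution written every 4 time steps
mb_per_solution = 750

solutions = time_steps // output_interval
total_gb = solutions * mb_per_solution / 1000

# Second story: keeping one step in every 1000 discards the rest.
discarded_fraction = 1 - 1 / 1000

print(solutions, total_gb, discarded_fraction)
```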
4
Analysis, Mining, Visualization
5
Solutions !
 Get the rings of the smoke
 Track them in time
 Mine their properties
 Use some science drivers
6
Driver 1 - CFD
Vortices
7
Swirling Features
8
Swirling Features
9
CFD Of Interest – Bronchial Flow
 Complex Non-rigid, Fractal-like Geometry
 Deep recursive branching structure
 Need insights into how flow changes
 Study Vortices, swirling flow
 Q: Persistence of vortex ?
 Implications
 Pulmonary drug delivery
 Carcinogen Deposition
10
Flow Evolution – Internal Flow
11
Object of Study: Vortices

Swirling regions

Core (Center of vortex) and swirling streamlines …
12
Driver 2 - Material Formation
Grain
Grain Boundary
13
MD Of Interest – Defect Evolution
 Sizes of active devices (Si-based transistors) and passive components (alloys) are shrinking
 At sub-micron levels, extended defects affect performance
 Extended defects
 Si is doped with Boron in a “Hot Bath”
 Non-uniform solidification
 Arise from point defects
 Study evolution of point defects and formation of
extended defects
 Q: What structures finally remain ?
14
Defect Evolution
15
Object of Study: Defects
Defect Atoms - Red !


Point defects – interstitial and vacancy
Interstitial – Si atoms located at non-bulk position
16
Problem Statement
 Need – Locating, Characterizing & Tracking Structures in Large
Domains.
 Acts of Discovery and Perseverance!
 Approach desired
 Tied to simulations
 Multiple time scales
 Organized Search
 Encode Structure, dynamics and relationships
 Incorporate complex physics in discovery
 Classification and categorization (similarity)
 Verification of discovered entities for veracity
 Generalize to other domains
17
Framework
Application
CFD, MD, …
Sensor
Multires
Transforms
Meta-stability
Detection
Transient
Detection
Feature Mining
Event Detection
Feature Tracking
Spatio-temporal
Rule Mining
Catalog
18
Components
 Sensors –
 Monitoring a stream
 Swirl (CFD), Energy (MD)
 Multiresolution Analysis
 Temporal wavelet transform
 Causal transforms
 Eulerian Framework
 Can be used with a spatial sub-division
 Event Detection
 Changes in Feature Demographics
 Birth, death, continuation
 Aggregation, bifurcation
 Has impact on tracking
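A causal transform uses only samples that have already arrived, which is what lets a sensor monitor a stream. The sketch below is one possible illustration, not the transform used in this work: a single level of an undecimated Haar transform over a stream of sensor values. The name `causal_haar` and the readings are hypothetical.

```python
from math import sqrt

def causal_haar(stream):
    """Yield (smooth, detail) pairs from a stream using only past samples.

    One level of an undecimated Haar transform: each output depends on the
    current sample and the one before it, never on future samples.
    """
    previous = None
    for x in stream:
        if previous is not None:
            smooth = (x + previous) / sqrt(2)
            detail = (x - previous) / sqrt(2)
            yield smooth, detail
        previous = x

# Made-up sensor readings: a steady signal with one abrupt change.
readings = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
details = [d for _, d in causal_haar(readings)]
# The largest detail coefficient marks where the change arrived.
jump_index = max(range(len(details)), key=lambda i: abs(details[i]))
```

In this toy stream every detail coefficient is zero except the one that straddles the jump, so a threshold on the detail magnitude would act as a simple transient detector.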
19
Tracking - Correspondence
Lagrangian
Framework
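The feature events listed earlier (birth, death, continuation, aggregation, bifurcation) can be read off a correspondence between features in consecutive time steps. The sketch below assumes a simple overlap test: two features correspond when they share grid cells. That test, the function name `classify_events`, and the data are assumptions for illustration, not the correspondence method of this work.

```python
def classify_events(prev, curr):
    """Label events between two frames.

    prev, curr: dict mapping a feature id to the set of grid cells it covers.
    Two features correspond when their cell sets overlap.
    """
    forward = {p: [c for c in curr if prev[p] & curr[c]] for p in prev}
    backward = {c: [p for p in prev if prev[p] & curr[c]] for c in curr}

    events = []
    for p, matches in forward.items():
        if not matches:
            events.append(("death", p))
        elif len(matches) > 1:
            events.append(("bifurcation", p, tuple(matches)))
    for c, matches in backward.items():
        if not matches:
            events.append(("birth", c))
        elif len(matches) > 1:
            events.append(("aggregation", tuple(matches), c))
        elif len(forward[matches[0]]) == 1:
            events.append(("continuation", matches[0], c))
    return events

# Made-up frames: a and b merge into x, c disappears, d splits into z and w,
# e continues as v, and y appears from nothing.
prev = {"a": {1, 2}, "b": {5, 6}, "c": {9}, "d": {20, 21, 22}, "e": {40}}
curr = {"x": {2, 3, 5}, "y": {30}, "z": {20}, "w": {22}, "v": {40, 41}}
events = classify_events(prev, curr)
```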
20
Feature Mining Mechanics
 Do not just use raw data
 Features – A feature is a manifestation of the
correlations between various parameters
 Feature Mining –
 Extract meta-stable features using underlying
physics
 Describe features as tangible shapes
21
Shapes
Point cloud
Proximity graphs
Conical frusta
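As an illustration of moving from a point cloud to a proximity graph, the sketch below links every pair of points that lie within a cutoff distance. The cutoff rule, the name `proximity_graph`, and the coordinates are assumptions; other proximity graphs would be built differently.

```python
from itertools import combinations
from math import dist

def proximity_graph(points, cutoff):
    """Return adjacency lists linking points within `cutoff` of each other."""
    graph = {i: [] for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= cutoff:
            graph[i].append(j)
            graph[j].append(i)
    return graph

# Made-up cloud: three nearby points and one far away.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
graph = proximity_graph(cloud, cutoff=1.1)
```

The all-pairs loop is quadratic in the number of points; a spatial grid or tree would be the usual replacement for large clouds.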
22
Similar Efforts - CFD
Marusic, Kumar, Karypis, Interrante, U of Minn.
Frequent subgraphs
23
Similar Efforts - MD
I1 Defect !





Alloys (Ni3Al)
Defect is infrequent, atomsets of bulk are not !
Run common substructure discovery algorithm
Get bulk !
Remove atoms contained in common substructure atomsets
Remainder of structure is defect!
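The "find the bulk, remove it, keep the remainder" idea can be sketched in a few lines. This is a deliberately simplified stand-in: instead of a common substructure discovery algorithm, it counts how often each per-atom local signature occurs and treats the frequent ones as bulk. The signature (a coordination number), the `min_support` threshold, and the name `remove_bulk` are all hypothetical.

```python
from collections import Counter

def remove_bulk(signatures, min_support):
    """Return ids of atoms whose local signature is infrequent.

    signatures: dict mapping an atom id to a hashable description of its
    local environment (here, hypothetically, its coordination number).
    min_support: fraction of atoms a signature needs to count as bulk.
    """
    counts = Counter(signatures.values())
    threshold = min_support * len(signatures)
    bulk = {sig for sig, n in counts.items() if n >= threshold}
    return sorted(a for a, sig in signatures.items() if sig not in bulk)

# Made-up example: most atoms are 4-coordinated (bulk-like), two are not.
signatures = {i: 4 for i in range(20)}
signatures[7] = 5
signatures[13] = 3
defect_atoms = remove_bulk(signatures, min_support=0.5)
```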
24
Our Efforts
Finding Needles in a Haystack
25
Feature Mining 1
Data
Transform
Tour Grid
Aggregate
Denoise
Operator
Classify Points
Rank
Track
Catalog
ROIs
Classify-Aggregate
26
Applying To Defect Detection
Visit all atom sites
Atom-site: Is it part of defect ?
Spatially aggregate atoms in located areas !
Works for quenched defects (local equilibria)
27
Feature Mining for Defects
 Build spatially local classifiers
 Define Bulk
 Form Rules to define Bulk --C1, C2,…,Cn
 Typical Rules:
 C1 = prescribed bond length
 C2 = prescribed bond angle
 Defect is not bulk
28
Feature Mining for Defects
 Core Defect Atoms will satisfy
C = ~C1 AND ~C2 AND ~C3 … AND ~Cn
 Find neighborhood by locating atoms which satisfy
D = ~C1 OR ~C2 OR ~C3 … OR ~Cn
 Defect = Embed C graph in D graph
 D is needed to deal with noise and uncertainty of conditions Ci
 Cluster all atoms in D
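The two conditions translate directly into code: C holds when every bulk rule fails, D when at least one fails. The sketch below then clusters the D atoms by connectivity and keeps only clusters that contain a C atom, which is one possible reading of "embed C graph in D graph" and is an assumption here. The name `find_defects` and the input layout are hypothetical.

```python
def find_defects(rule_results, neighbors):
    """Cluster neighborhood atoms and keep clusters holding a core atom.

    rule_results: atom id -> list of booleans, True where bulk rule Ci holds.
    neighbors: atom id -> iterable of adjacent atom ids.
    """
    core = {a for a, r in rule_results.items() if not any(r)}  # C: all Ci fail
    near = {a for a, r in rule_results.items() if not all(r)}  # D: some Ci fails

    defects, seen = [], set()
    for start in sorted(near):
        if start in seen:
            continue
        cluster, stack = set(), [start]
        seen.add(start)
        while stack:                      # depth-first walk inside D
            atom = stack.pop()
            cluster.add(atom)
            for n in neighbors.get(atom, ()):
                if n in near and n not in seen:
                    seen.add(n)
                    stack.append(n)
        if cluster & core:                # drop D-only clusters as noise
            defects.append(cluster)
    return defects

# Made-up chain of 7 atoms with two bulk rules each.
rule_results = {0: [True, True], 1: [True, False], 2: [False, False],
                3: [False, True], 4: [True, True], 5: [True, False],
                6: [True, True]}
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
defects = find_defects(rule_results, neighbors)
```

In this toy chain, atom 5 fails one rule but has no core atom nearby, so it is treated as noise, while atoms 1 to 3 form a single defect around core atom 2.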
29
Results – I3 Defects
I3A Defect
I3B Defect
30
Related Work - SAL
Original
Yip&Zhao 96
Aggregate
Classify
Redescribe
31
Does It Work Always ?
Im( )

V
conv


Compute Swirl
Local Classification Method
Swirling regions contain
vortices


False Positives !

Cannot extract structures !
Classify-Aggregate
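A point-wise swirl test of this kind can be sketched for the 2D case. The assumption here is a common criterion: a point swirls when the velocity-gradient tensor has complex eigenvalues, and the imaginary part measures how strongly. Any normalization by a convective velocity is not reproduced, and the name `swirl_strength` is hypothetical.

```python
from math import sqrt

def swirl_strength(dudx, dudy, dvdx, dvdy):
    """Imaginary part of the eigenvalues of a 2D velocity-gradient tensor.

    Returns 0.0 when the eigenvalues are real (no local swirl).
    """
    trace = dudx + dvdy
    det = dudx * dvdy - dudy * dvdx
    discriminant = trace * trace - 4.0 * det
    return sqrt(-discriminant) / 2.0 if discriminant < 0 else 0.0

# Solid-body rotation u = -y, v = x: eigenvalues are +/- i.
rotation = swirl_strength(0.0, -1.0, 1.0, 0.0)
# Pure shear u = y, v = 0: real eigenvalues, no swirl detected.
shear = swirl_strength(0.0, 1.0, 0.0, 0.0)
```

Because the test looks at one point at a time, it says nothing about whether neighboring points form a coherent structure, which is why a later aggregation or verification step is needed.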
32
Solution - Feature Mining 2
Data
Transform
Tour Grid
Operator
Verify
Denoise
Aggregate
Rank
Track
Catalog
ROIs
Aggregate-Classify (Verify)
33
Classify-Aggregate
Yellow: Good, Green: Bad
Yellow ones really swirl !
34
Classifier
Simple and efficient !
Can be error-prone, since one verifies
Point-based approach:
 Label neighbors
 Combinatorial:
 Locally check for
complete triangles




35
Verification
2 Swirling Criteria
Verification Tools
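The two swirling criteria are not spelled out here, so the following is only a hypothetical example of what a region-level verification test could look like: after aggregation, check that the velocity directions sampled inside a candidate region cover the full circle. The function name, the four-sector choice, and the sample vectors are all assumptions.

```python
from math import atan2, pi

def directions_span_circle(vectors, sectors=4):
    """True if the 2D vectors point into every one of `sectors` angular bins."""
    hit = set()
    for vx, vy in vectors:
        angle = atan2(vy, vx) % (2 * pi)          # map to [0, 2*pi)
        # Clamp guards against rounding that lands exactly on 2*pi.
        hit.add(min(int(angle / (2 * pi) * sectors), sectors - 1))
    return len(hit) == sectors

# Velocities (-y, x) sampled around a rotation center: directions wrap around.
swirling = [(-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (1.0, 1.0)]
# Nearly parallel velocities, as in a shear layer: one direction only.
parallel = [(1.0, 0.0), (2.0, 0.0), (0.5, 0.1)]

verified = directions_span_circle(swirling)
rejected = not directions_span_circle(parallel)
```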
36
Non-verifiable Regions
37
Defects at Finite Temp.
Visit all atom sites
Atom-site: Is it part of defect ?
Spatially aggregate atoms in located areas !
Quench defect to verify
38
Current Work
 Streaming
 Tracking and Correspondence
 Shape Descriptors
 Data Structures for Data Management
 Spatio-temporal associations
39
Summary





Computational Sciences need computational
instruments
Need to be scalable and use all lessons learned from
parallel, distributed, streaming and out-of-core
implementations
Need to exploit underlying source of data
Should provide good hooks to data-mining and
intelligent systems
Need very Interdisciplinary work !
40
Questions ?
41