Impromptu Data Extraction and Analysis
Data Mining and Analytics Framework for VLSI Designs
Sandeep P ([email protected]); +91 80 2507 5492
Anand Ananthanarayanan ([email protected]); +91 80 2507 5774
Intel Corporation
Author Affiliations
Author | Affiliation | Phone Number | Email Address
Sandeep P | Intel Corporation | +91 80 2507 5492 | [email protected]
Anand Ananthanarayanan | Intel Corporation | +91 80 2507 5774 | [email protected]
FOR INTERNAL USE ONLY
Intel Information Technology
Abstract
Design processes, from logic design and validation through backend implementation and verification, require a plethora of CAD tools. Each tool
generates reports and debug information in its own form and content. Designers need to parse and review data from multiple sources and tools to make
design calls. When implementing a backend design, a designer or methodology owner needs to understand the patterns seen in the design. Data such as
the number of paths dominated by low leakage, the slope profile for cells with margin > x ps, and the drive strength profile of cells in a timing path
are critical for making design decisions, optimizing design collaterals and ensuring a design with robust electrical functionality. Much of this data
can be obtained only by mining the results and logs of multiple tools. Data mining is also a constant activity, from technology readiness through
execution to post-silicon debug. The problem is compounded when data is needed from different PV domains. For example, a designer looking to optimize
power needs dynamic power information, path margin and max cap information, all generated by different tools in different formats in different
locations. Historically, data mining has been done with ad hoc scripts that parse different reports and log files; the extracted data is then
post-processed and visualized. Any change in data mining requirements means changes to the scripts. There is no data mining model that supports
multiple tools with different output formats, and no methodology that supports cross-domain analysis.
We present IDEA (Impromptu Data Extraction and Analysis). IDEA is a data mining and data analysis framework on a highly interactive web application
platform. It assimilates data from different tools and formats into a single data organization in the form of SQL tables. SQL enables compact
organization and faster queries. The IDEA framework is built using the Linux-Apache™-MySQL™-Perl (LAMP) packages and uses the R language for
statistical analysis of the data. R can handle large amounts of data and supports statistical plots such as pie charts, histograms, box plots,
scatter plots and linear regression. IDEA data mining completes in minutes, compared to hours or days with conventional approaches such as scripts.
All data extraction and plotting functionality is abstracted behind interactive widgets. IDEA has been used to data mine power savings post
optimization, analyze power distribution, profile speed paths, review standard cell usage, assess utilization of cell sizes across the design space,
and examine RC delays per path stage, among other uses.
Large amounts of precious, unorganized data lie unexploited. Structured data mining is essential for competitive VLSI design, and increasing
complexity makes data analytics a must-have for quality design. No EDA tool exists today to do this critical data mining. IDEA fills this gap and
provides valuable data mining capability. It is time to think of data mining as an EDA product.
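The SQL-table organization described in the abstract can be illustrated with a minimal sketch. It is not IDEA's implementation: it uses Python and an in-memory SQLite database as stand-ins for the Perl and MySQL stack the abstract names, and the report formats, table names, column names and cell names are all hypothetical. It shows two reports in different formats being loaded into tables, then one cross-domain query (timing margin joined with dynamic power) answering a question that would otherwise need a custom script.

```python
import sqlite3

# Hypothetical report snippets; real tool reports will look different.
TIMING_REPORT = """\
# path_id cell margin_ps
p1 NAND2_X2 12.5
p2 INV_X1 -3.0
p3 NOR2_X4 40.0
"""

POWER_REPORT = """\
cell,dynamic_power_uw
NAND2_X2,1.8
INV_X1,0.6
NOR2_X4,3.1
"""


def parse_timing(text):
    """Parse the whitespace-separated (made-up) timing report."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        path_id, cell, margin = line.split()
        rows.append((path_id, cell, float(margin)))
    return rows


def parse_power(text):
    """Parse the comma-separated (made-up) power report, skipping its header."""
    rows = []
    for line in text.strip().splitlines()[1:]:
        cell, power = line.split(",")
        rows.append((cell, float(power)))
    return rows


def build_db():
    """Load both reports into one in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE timing (path_id TEXT, cell TEXT, margin_ps REAL)")
    conn.execute("CREATE TABLE power (cell TEXT, dynamic_power_uw REAL)")
    conn.executemany("INSERT INTO timing VALUES (?, ?, ?)", parse_timing(TIMING_REPORT))
    conn.executemany("INSERT INTO power VALUES (?, ?)", parse_power(POWER_REPORT))
    conn.commit()
    return conn


def cells_with_margin_above(conn, min_margin_ps):
    """Cross-domain query: cells with spare margin, highest power first."""
    query = """
        SELECT t.path_id, t.cell, t.margin_ps, p.dynamic_power_uw
        FROM timing AS t
        JOIN power AS p ON p.cell = t.cell
        WHERE t.margin_ps > ?
        ORDER BY p.dynamic_power_uw DESC
    """
    return conn.execute(query, (min_margin_ps,)).fetchall()


conn = build_db()
result = cells_with_margin_above(conn, 10.0)
for row in result:
    print(row)
```

Once the data sits in tables, changing the question (for example, a different margin threshold) means changing a query parameter rather than a parsing script.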
Design Process
Design Reports
• Timing reports
• Extraction reports
• Route utilization
• Cell utilization
• DRC reports
• Layout checks
Logic
Implement
Verify
Functional Circuit Design
Multiple Tools
Multiple Reports
Multiple Formats
Large amounts of data get generated, requiring interpretation and analysis
Data Conundrum
Design Quality Increasingly Dependent on Multiple Parameters
Data Mining - A Constant Activity
Tech Readiness
Design execution
Post silicon
• Data mining is done to generate design heuristics
• Data mining is done to determine delta changes on design limits
• Data mining is needed for optimization
• Data mining is done to root cause and understand the PV-Silicon miscorrelation
Formal Data Mining Tool or Model Not Currently Available In Industry
To solve this Data Mining problem, we present
IDEA
Impromptu Data Extraction and Analysis (IDEA) is
• Web application for data mining on an open architecture
• Linked data caching SQL databases
• Common XML interface for data manipulation
• Statistical analysis capability with the 'R' language
• Practically unlimited capacity with the 'R' language
• Data visualization capability
  • Histograms, pie charts, density/scatter plots, dot charts
• Faster turnaround time (no text parsing scripts)
• Intuitive, web-based user interface
Highly Interactive Application for Data Mining
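The statistical analysis and histogram capability listed above can be sketched in a few lines. IDEA uses R for this; the sketch below swaps in Python's standard library purely for illustration, and the margin values, bin width and function names are made up.

```python
import statistics
from collections import Counter

# Made-up path margins in picoseconds, for illustration only.
margins_ps = [-3.0, 2.5, 4.0, 12.5, 18.0, 22.0, 40.0]


def summarize(values):
    """Basic summary statistics for one numeric column."""
    return {
        "count": len(values),
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.fmean(values),
        "max": max(values),
    }


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter()
    for value in values:
        bin_start = (value // bin_width) * bin_width
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


def render(hist, bin_width):
    """Text rendering of the histogram, one bar per non-empty bin."""
    lines = []
    for bin_start, count in hist.items():
        label = f"[{bin_start:6.1f}, {bin_start + bin_width:6.1f})"
        lines.append(f"{label} {'#' * count}")
    return "\n".join(lines)


summary = summarize(margins_ps)
hist = histogram(margins_ps, 10.0)
print(summary)
print(render(hist, 10.0))
```

A text rendering stands in here for the plots the slide lists; a real deployment would hand the binned data to a plotting layer.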
IDEA
Web Based Data Mining Platform
IDEA Architecture
Presentation Tier
Application Tier
Storage Tier (Database)
Three-Tiered Web Application
Architecture – IDEA Client
Data Extraction
Control Center
Data Manipulation
Data Viewer
AJAX Calls
JSON for data transfer
Experiments
Apps
IDEA Server
Report Viewer
Statistical Analysis
Spreadsheet Generation
PDF Converter
Simple Client with Powerful Capabilities
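The client/server exchange on this slide (AJAX calls carrying JSON) can be sketched as a server-side function that takes a JSON request body and returns a JSON response. This is an assumption-laden illustration, not IDEA's server code: the real server side is Perl on Apache, and the request field `block`, the `cells` table and the response shape are all hypothetical.

```python
import json
import sqlite3


def make_db():
    """In-memory stand-in for the storage tier, with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cells (block TEXT, cell TEXT)")
    conn.executemany(
        "INSERT INTO cells VALUES (?, ?)",
        [("alu", "INV_X1"), ("alu", "INV_X1"), ("alu", "NAND2_X2"), ("fpu", "NOR2_X4")],
    )
    conn.commit()
    return conn


def handle_request(conn, body):
    """Answer one AJAX-style call: JSON text in, JSON text out."""
    try:
        request = json.loads(body)
        block = request["block"]
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, a missing field, or a non-object payload.
        return json.dumps({"ok": False, "error": "expected a JSON object with a 'block' field"})

    rows = conn.execute(
        "SELECT cell, COUNT(*) FROM cells WHERE block = ? GROUP BY cell ORDER BY cell",
        (block,),
    ).fetchall()
    return json.dumps(
        {"ok": True, "block": block, "rows": [{"cell": c, "count": n} for c, n in rows]}
    )


conn = make_db()
response = json.loads(handle_request(conn, '{"block": "alu"}'))
print(response)
```

Keeping the handler a plain function of (connection, request text) makes it easy to exercise without a web server; a browser widget would send the same JSON over HTTP.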
Basic Usage Flow
Open IDEA App
Select Project
Select Blocks
Run pre-selected Queries OR Query interactively
Manipulate Data
Run Statistical Analysis and Reports
Generate PDF or export to spreadsheets
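Two steps of the flow above, running a pre-selected query for a chosen block and exporting the result to a spreadsheet-friendly format, can be sketched as follows. The query names, the `timing` table and its columns are hypothetical, SQLite stands in for the real database, and CSV stands in for whatever spreadsheet format IDEA actually produces.

```python
import csv
import io
import sqlite3

# Hypothetical catalogue of pre-selected queries, keyed by a short name.
PRESELECTED = {
    "negative_margin_paths": (
        "SELECT path_id, margin_ps FROM timing "
        "WHERE block = ? AND margin_ps < 0 ORDER BY margin_ps"
    ),
    "path_count": "SELECT COUNT(*) AS paths FROM timing WHERE block = ?",
}


def make_db():
    """In-memory database with made-up timing rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE timing (block TEXT, path_id TEXT, margin_ps REAL)")
    conn.executemany(
        "INSERT INTO timing VALUES (?, ?, ?)",
        [("alu", "p1", 12.5), ("alu", "p2", -3.0), ("alu", "p3", -7.5), ("fpu", "p4", 5.0)],
    )
    conn.commit()
    return conn


def run_preselected(conn, name, block):
    """Run a named query for one block; return (column names, rows)."""
    cursor = conn.execute(PRESELECTED[name], (block,))
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


def to_csv(header, rows):
    """Serialize a result set as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


conn = make_db()
header, rows = run_preselected(conn, "negative_margin_paths", "alu")
csv_text = to_csv(header, rows)
print(csv_text)
```

An interactive query would follow the same path, with the SQL text coming from the user interface instead of the catalogue.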
IDEA Usage and Benefits
IDEA Usage
• Data mine power savings post optimization
• RC delays per path stage
• Utilization of cell sizes across the design space
• Analysis of power distribution
• Profile the speed paths
• Review standard cells usage
Data Mining Simplified
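One of the usages above, utilization of cell sizes across the design, amounts to grouping cell instances by drive strength and counting them. The sketch below assumes a hypothetical naming convention in which the drive strength is a trailing `_X<number>` suffix; real libraries name cells differently, and the cell list is made up.

```python
import re
from collections import Counter

# Assumed convention: drive strength is encoded as a trailing "_X<number>".
DRIVE_RE = re.compile(r"_X(\d+)$")


def drive_strength(cell_name):
    """Return the drive strength encoded in the name, or None if absent."""
    match = DRIVE_RE.search(cell_name)
    return int(match.group(1)) if match else None


def drive_profile(cell_names):
    """Count cell instances per drive strength."""
    counts = Counter()
    for name in cell_names:
        strength = drive_strength(name)
        key = "unknown" if strength is None else f"X{strength}"
        counts[key] += 1
    return counts


# Made-up instance list for illustration.
cells = ["NAND2_X2", "INV_X1", "NOR2_X4", "INV_X1", "BUF_X2", "FILLER"]
profile = drive_profile(cells)
for key, count in sorted(profile.items()):
    print(f"{key}: {count}")
```

The same grouping could be done in SQL once cell names are stored in a table; it is shown in Python here only to keep the example self-contained.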
Summary
• Large amounts of precious, unorganized data lie unexploited
• Structured data mining is essential for competitive VLSI design
• Increasing complexity makes data analytics a must-have for quality design
• No EDA tool exists today to do this critical data mining
• IDEA fills this gap and provides valuable data mining capability
It is time to think of data mining as an EDA product
Acknowledgements
• Everyone at Intel who contributed to this work
