
Data East company profile
Company of about 85 employees
Based in Akademgorodok (Novosibirsk, Russia),
Founded from the “Novosibirsk Regional Center
of Geoinformation Technologies of the Russian
Academy of Sciences”
Own products and services
GIS software development service
Data preparation service
Extensions for ArcGIS
Drive Time Engine
Personal Internet Map Server
Map Engine
Well Tracking
DoubleGis products’ line:
– Desktop system
– PocketPC application
Map Engine
Atlas of Siberian Region
- Navigation system for Siberian region
- Data East products (CityExplorer, PersonalIMS, etc.)
Personal IMS
Data preparation service
Partners and customers worldwide
ESRI, Inc. (USA)
GlobeXplorer, Inc. (USA)
NewFields, Inc. (USA)
Exponent (USA)
InstallShield, Inc. (USA)
The Crown Estate (UK)
ChevronTexaco (USA)
Shell Group
De Beers Group
U.S. Army Corps of Engineers (USA)
Bowater (Canada)
Rotorua District Council (New Zealand)
Geoscience Australia (Australia)
Bristol City Council (UK)
Newcastle City Council (UK)
Bureau of Land Management (USA)
U.S. Fish and Wildlife Service (USA)
Tauw bv (Netherlands)
Washington State Department of Ecology (USA)
and more…
Data Mining in Geoinformation Systems
Data Mining Tasks:
Associations Discovery
Sequence-based Analysis
On-Line Analytical Processing (OLAP)
Forecast sales for new store location
Target variable – sales
Properties of stores:
• Size
• Number of employees
• Number of parking spaces
Trade area attributes:
• Demographic variables such as
income, age, educational
attainment, ethnicity
• Intersections with
Prediction Task: 7 Steps to Glory
Step 1: Preparation of datasets
• The set of objects must be homogeneous
• The same measurement must be recorded on the same scale
for all objects
• The set of measurements should be complete for every object
• The target variable must not be used when calculating the
values of source variables
• The number of objects should be large enough
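The Step 1 requirements can be checked mechanically before any modeling. What follows is a minimal Python sketch, not Data East's tooling: the record layout, the function name check_dataset, and the min_objects threshold are all hypothetical, and only completeness and sample size are tested.

```python
def check_dataset(records, source_vars, target_var, min_objects):
    """Return a list of human-readable problems found in the dataset.

    Hypothetical helper: records is a list of dicts, one per object (store).
    """
    problems = []
    if len(records) < min_objects:
        problems.append(f"only {len(records)} objects, need at least {min_objects}")
    required = list(source_vars) + [target_var]
    for index, record in enumerate(records):
        # A measurement counts as missing when the key is absent or None.
        missing = [name for name in required if record.get(name) is None]
        if missing:
            problems.append(f"object {index} is missing {', '.join(missing)}")
    return problems


# Made-up store records, for illustration only.
stores = [
    {"size": 1200.0, "employees": 14, "parking": 40, "sales": 310.0},
    {"size": 800.0, "employees": 9, "parking": None, "sales": 190.0},
]
report = check_dataset(stores, ["size", "employees", "parking"], "sales", min_objects=3)
```

An empty report would mean the dataset passes these two checks; homogeneity and consistent scales still need a manual review.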
Step 2: Calibration of variables
Types of variables:
• Boolean variable (multi-valued logic is allowed)
• Nominal variable
• Ordered nominal variable
• Discrete variable
• Continuous variable
• Continuous variable with constraints
• Continuous variable of exp-type
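One possible reading of calibration is that each variable type gets its own numeric encoding. The Python sketch below illustrates that reading only. The function names are hypothetical, and treating an "exp-type" variable as one that is log-transformed is an assumption rather than something the slides state.

```python
import math


def encode_boolean(value):
    """Boolean variable: True -> 1.0, False -> 0.0."""
    return 1.0 if value else 0.0


def encode_nominal(value, categories):
    """Nominal variable: one-hot vector, since categories have no order."""
    return [1.0 if value == category else 0.0 for category in categories]


def encode_ordered(value, ordered_levels):
    """Ordered nominal variable: position of the level in its ordering."""
    return float(ordered_levels.index(value))


def encode_exp_type(value):
    """Assumed handling of an exp-type variable: take the natural log."""
    return math.log(value)
```

For example, encode_nominal("b", ["a", "b", "c"]) gives [0.0, 1.0, 0.0], and encode_ordered("medium", ["low", "medium", "high"]) gives 1.0.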
Step 3: Statistical Analysis
• Calculate the mean value and the standard deviation for every variable
• Calculate the correlation matrix
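The statistics in Step 3 can be sketched with the Python standard library alone. The column names and numbers below are made up, and the use of the sample standard deviation and the Pearson coefficient is an assumption, since the slides do not say which variants are meant.

```python
import math
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - my) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up variables, one list of values per column.
columns = {"size": [1.0, 2.0, 3.0, 4.0], "sales": [2.0, 4.0, 6.0, 8.0]}
summary = {name: (mean(values), stdev(values)) for name, values in columns.items()}
matrix = correlation_matrix(columns)
```

Source variables that correlate strongly with each other are natural candidates for the reduction in Step 5.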
Step 4: Normalization of source variables
Step 5: Reduction of source variables
Step 6: Thinning data and finding outliers
Step 7: Constructing a predictor
• Calculate the predictor with minimal complexity
• Test the predictor on independent sample dataset
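Steps 4 and 7 can be illustrated together. In this hedged Python sketch a straight line stands in for "the predictor with minimal complexity", which is an assumption. The data are synthetic, the function names are hypothetical, and Steps 5 and 6 are not covered.

```python
import math
from statistics import mean, pstdev


def make_normalizer(values):
    """Step 4: build a z-score transform from the training values."""
    center, spread = mean(values), pstdev(values)
    return lambda value: (value - center) / spread


def fit_line(xs, ys):
    """Step 7: least-squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return my - slope * mx, slope


def rmse(predicted, actual):
    """Root mean squared error between predictions and observed values."""
    return math.sqrt(mean((p - a) ** 2 for p, a in zip(predicted, actual)))


# Synthetic data following y = 2x + 1; the test sample is kept separate.
train_x, train_y = [1.0, 2.0, 3.0, 4.0, 5.0], [3.0, 5.0, 7.0, 9.0, 11.0]
test_x, test_y = [6.0, 7.0], [13.0, 15.0]

normalize = make_normalizer(train_x)  # fitted on training data only
intercept, slope = fit_line([normalize(x) for x in train_x], train_y)
predictions = [intercept + slope * normalize(x) for x in test_x]
test_error = rmse(predictions, test_y)
```

The normalizer is built from the training sample and reused on the test sample, so the independent data never influence the fitted predictor.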
On-Line Analytical Processing
Datasets for Analysis
• Fact table
• Categorization of
columns to be
mapped to
dimensions of the
cube
Cube structure:
• Measures
• Dimensions
categorized in
• Attributes of
Query language:
• Specialized
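The relationship between a fact table, its dimensions, and its measures can be sketched in a few lines of Python. This is only an illustration of the roll-up idea, not an OLAP provider or its query language. The rows, the column names, and the roll_up function are hypothetical.

```python
from collections import defaultdict


def roll_up(fact_table, dimensions, measure):
    """Sum one measure over every combination of the chosen dimensions."""
    cells = defaultdict(float)
    for row in fact_table:
        key = tuple(row[name] for name in dimensions)
        cells[key] += row[measure]
    return dict(cells)


# Made-up fact table: each row is one fact with dimension columns
# (region, year) and a measure column (sales).
facts = [
    {"region": "North", "year": 2004, "sales": 10.0},
    {"region": "North", "year": 2005, "sales": 20.0},
    {"region": "South", "year": 2005, "sales": 5.0},
]
by_region = roll_up(facts, ["region"], "sales")
by_region_year = roll_up(facts, ["region", "year"], "sales")
```

Here by_region collapses the year dimension, while by_region_year keeps one cell per region and year pair.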
Spatial OLAP for ArcGIS Desktop
• Select a spatial dimension
• Select a geoprocessor
• Specify a request to OLAP provider
• Select dimension
• Select attributes of feature layer
Splines for Data Mining under
SDM Data:
Core objects (vectors, vector collections)
Solvers of systems of linear algebraic equations (SLAEs)
SDM Mining:
Core Data Mining (statistics, outlier analysis, Least Squares fitter)
Transformations of variables
Approximation (polynomial regression, radial basis functions)
SDM Splines:
Univariate polynomial splines (interpolation, smoothing, averaging)
Multivariate analytic splines (interpolation, smoothing, regression,
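To show how a linear solver and univariate spline interpolation fit together, here is a small Python sketch of a natural cubic spline. It is a textbook construction and does not use or reproduce the SDM library's API. All function names are hypothetical, and the natural boundary condition (zero second derivative at both ends) is an assumption.

```python
import bisect


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system of linear equations."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    solution = [0.0] * n
    solution[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        solution[i] = d[i] - c[i] * solution[i + 1]
    return solution


def natural_cubic_spline(xs, ys):
    """Return a function that interpolates (xs, ys); xs must be increasing."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    lower, diag = [0.0] * (n + 1), [1.0] * (n + 1)
    upper, rhs = [0.0] * (n + 1), [0.0] * (n + 1)
    # Interior rows enforce continuity of the first derivative; the first
    # and last rows keep the second derivative at zero (natural spline).
    for i in range(1, n):
        lower[i], diag[i], upper[i] = h[i - 1], 2.0 * (h[i - 1] + h[i]), h[i]
        rhs[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    m = solve_tridiagonal(lower, diag, upper, rhs)  # second derivatives

    def evaluate(x):
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        left, right = xs[i + 1] - x, x - xs[i]
        return (
            m[i] * left ** 3 / (6.0 * h[i])
            + m[i + 1] * right ** 3 / (6.0 * h[i])
            + (ys[i] / h[i] - m[i] * h[i] / 6.0) * left
            + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * right
        )

    return evaluate


spline = natural_cubic_spline([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 4.0, 9.0])
```

The returned function passes through every input point; for data that lie on a straight line the second derivatives are all zero and the spline reduces to that line.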
Contact information
At Data East we are always open to cooperation and new partnerships!
Data East, LLC
P.O. Box 664, Novosibirsk 630090, Russia
+7 (383) 3-320-320
+7 (383) 3-325-785
[email protected]
[email protected]