intro_to_ai - CIMMS - OU Cooperative Institute for


Data-driven methods in
Environmental Sciences
Exploration of Artificial Intelligence Techniques
[email protected]
Data Driven Methods
What is Artificial Intelligence?
Common AI techniques
Choosing between AI techniques
Pre and post processing
What is AI?

- Machines that perceive, understand and react to their environment
  - Goal of Babbage, etc.
  - Oldest endeavor in computer science
- Machines that think
  - Robots: factory floors, home vacuums
  - Still quite impractical

AI vs. humans

- AI applications built on Aristotelian logic
  - Induction, semantic queries, system of logic
  - Human reasoning involves more than just induction
- Computers never as good as humans
  - In reasoning and making sense of data
  - In obtaining a holistic view of a system
- Computers much better than humans
  - In processing reams of data
  - In performing complex calculations

Successful AI applications

- Targeted tasks more amenable to automated methods
  - Build special-purpose AI systems
    - Determine appropriate dosage for a drug
    - Classify cells as benign or cancerous
- Called “expert systems”
  - Methodology based on expert reasoning
  - Quick and objective ways to obtain answers

Data Driven Methods
What is Artificial Intelligence?
Common AI techniques
Choosing between AI techniques
Pre and post processing
Fuzzy logic

- Fuzzy logic addresses key problem in expert systems
  - How to represent domain knowledge
  - Humans use imprecisely calibrated terms
  - How to build decision trees on imprecise thresholds
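
As a rough illustration of how imprecisely calibrated terms can be turned into rules, the sketch below encodes two verbal rules with ramp-shaped membership functions. The input variables, thresholds, output levels (0.9 and 0.1) and the function names `ramp_up` and `storm_risk` are all hypothetical; none of them come from the slides.

```python
def ramp_up(x, lo, hi):
    """Degree (0..1) to which x is 'high': 0 below lo, 1 above hi, linear between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def storm_risk(reflectivity_dbz, wind_ms):
    """Hypothetical two-rule fuzzy system; every threshold here is an assumption."""
    strong_echo = ramp_up(reflectivity_dbz, 30.0, 50.0)
    high_wind = ramp_up(wind_ms, 5.0, 15.0)

    # Rule 1: IF echo is strong AND wind is high THEN risk is high (AND = min).
    fire_high = min(strong_echo, high_wind)
    # Rule 2: IF echo is weak OR wind is low THEN risk is low (OR = max).
    fire_low = max(1.0 - strong_echo, 1.0 - high_wind)

    # Defuzzify: average of the rule outputs, weighted by how strongly each fires.
    risk_high, risk_low = 0.9, 0.1
    return (fire_high * risk_high + fire_low * risk_low) / (fire_high + fire_low)


print(storm_risk(40.0, 10.0))
```

Inputs between the thresholds fire both rules partially, so the output changes smoothly rather than jumping at a hard cut-off.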

Fuzzy logic example
Source: Matlab fuzzy logic toolbox tutorial
http://www.mathworks.com/access/helpdesk/help/toolbox/fuzzy/fp350.html
Advantages of fuzzy logic

- Considerable skill for little investment
  - Fuzzy logic systems piggyback on human analysis
    - Humans encode rules after intelligent analysis of lots of data
    - Verbal rules generated by humans are robust
- Simple to create
  - Not much need for data or ground truth
  - Logic tends to be easy to program
- Fuzzy rules are human understandable

Where not to use fuzzy logic

- Do not use fuzzy logic if:
  - Humans do not understand the system
  - Different experts disagree
  - Knowledge can not be expressed with verbal rules
  - Gut instinct is involved
    - Not just objective analysis
- A fuzzy logic system is limited
  - Piece-wise linear approximation to a system
  - Non-linear systems can not be approximated
    - Many environmental applications are non-linear

Neural Networks

- Neural networks can approximate nonlinear systems
- Evidence-based
  - Weights chosen through optimization procedure on known dataset (“training”)
  - Works even if experts can’t verbalize their reasoning, as long as there is ground truth

An example neural network
Diagram from:
http://www.codeproject.com/useritems/GA_ANN_XOR.asp

Advantages of neural networks

- Can approximate any smooth function
  - The three-layer neural network
- Can yield true probabilities
  - If output node is a sigmoid node
- Not hard to train
  - Training process is well understood
- Fast in operations
  - Training is slow, but once trained, the network can calculate the output for a set of inputs quite fast
- Easy to implement
  - Just a sum of exponential functions
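
To make the "easy to implement" point concrete, here is a minimal sketch of the run-time (forward) pass of a three-layer network with sigmoid hidden nodes and a sigmoid output node. The layer sizes and the random, untrained weights are assumptions for illustration only; training is not shown.

```python
import numpy as np


def sigmoid(z):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> sigmoid hidden layer -> single sigmoid output node."""
    hidden = sigmoid(x @ w_hidden + b_hidden)
    return sigmoid(hidden @ w_out + b_out)


# Hypothetical sizes: 4 samples, 3 input features, 5 hidden nodes.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
w_hidden = rng.normal(size=(3, 5))
b_hidden = np.zeros(5)
w_out = rng.normal(size=5)
b_out = 0.0

p = forward(x, w_hidden, b_hidden, w_out, b_out)
print(p)  # one value in (0, 1) per sample
```

Because the output node is a sigmoid, each output lies between 0 and 1, which is what allows a trained network to be read as a probability.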
Disadvantages of neural networks

- A black box
  - The final set of weights yields no insights
  - Magnitude of weights doesn’t mean much
- Measure of skill needs to be differentiable
  - RMS error, etc.
  - Can not use Probability of Detection, for example
- Training set has to be complete
  - Unpredictable output on data unlike training
  - Need lots of data
  - Need expert willing to do a lot of truthing
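
The differentiability point can be seen in a small sketch that compares RMS error with Probability of Detection (POD) on two made-up sets of predictions. The data values and the 0.5 decision threshold are assumptions chosen for illustration.

```python
import numpy as np


def rms_error(pred, truth):
    """Root-mean-square difference between predictions and ground truth."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((pred - truth) ** 2)))


def probability_of_detection(pred, truth, threshold=0.5):
    """POD = hits / (hits + misses), after thresholding the predictions."""
    pred = np.asarray(pred, dtype=float)
    event = np.asarray(truth).astype(bool)
    hits = np.sum((pred >= threshold) & event)
    misses = np.sum((pred < threshold) & event)
    return float(hits / (hits + misses))


truth = [1, 1, 0, 1, 0]
pred_a = [0.60, 0.70, 0.40, 0.20, 0.10]
pred_b = [0.51, 0.90, 0.49, 0.45, 0.30]

print(rms_error(pred_a, truth), rms_error(pred_b, truth))
print(probability_of_detection(pred_a, truth), probability_of_detection(pred_b, truth))
```

The two prediction sets differ in RMS error but have the same POD, because POD only changes when a prediction crosses the threshold. A score that stays flat under small changes in the output gives gradient-based training nothing to follow.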
Recap:

- Fuzzy logic
  - Humans provide the rules
  - Not optimal
- Neural network
  - Humans can not understand system
  - Optimal
- Middle ground?
  - Genetic Algorithms
  - Decision Trees

Genetic algorithms

- In genetic algorithms
  - One fixes the model (rule base, equations, class of functions, etc.)
  - Optimize the parameters of the model on a training data set
  - Use the optimal set of parameters for unknown cases

An example genetic algorithm
Sources:
http://tx.technion.ac.il/~edassau/web/genetic_algorithms.htm
http://cswww.essex.ac.uk/research/NEC/
Advantages of genetic algorithms

- Near-optimal parameters for given model
  - Human-understandable rules
  - Best parameters for them
- Cost function need not be differentiable
  - The process of training uses natural selection, not gradient descent
- Requires less data than a neural network
  - Search space is more limited
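
A minimal sketch of the idea, assuming a fixed power-law model y = a * x**b and a non-differentiable cost (median absolute error). The population size, parameter ranges, blend crossover and mutation scale are illustrative choices, not values from the slides.

```python
import numpy as np


def cost(params, x, y):
    """Median absolute error of the fixed model y = a * x**b (not differentiable)."""
    a, b = params
    return float(np.median(np.abs(a * x ** b - y)))


def genetic_fit(x, y, pop_size=60, generations=80, seed=0):
    """Continuous GA: selection, blend crossover and Gaussian mutation."""
    rng = np.random.default_rng(seed)
    # Each chromosome is a parameter pair (a, b) drawn from assumed ranges.
    pop = rng.uniform([0.0, 0.0], [10.0, 3.0], size=(pop_size, 2))
    for _ in range(generations):
        scores = np.array([cost(p, x, y) for p in pop])
        # Selection: the better half survives unchanged (elitism).
        parents = pop[np.argsort(scores)[: pop_size // 2]]
        n_children = pop_size - len(parents)
        # Crossover: each child is a random blend of two parents.
        i, j = rng.integers(0, len(parents), size=(2, n_children))
        mix = rng.uniform(size=(n_children, 1))
        children = mix * parents[i] + (1.0 - mix) * parents[j]
        # Mutation: small random perturbation of every child.
        children += rng.normal(scale=0.05, size=children.shape)
        pop = np.vstack([parents, children])
    scores = np.array([cost(p, x, y) for p in pop])
    return pop[np.argmin(scores)]


x = np.linspace(1.0, 10.0, 50)
y = 2.0 * x ** 1.5          # synthetic "truth" generated with a = 2, b = 1.5
best = genetic_fit(x, y)
print(best, cost(best, x, y))
```

The cost is only ever evaluated and never differentiated, so it could be swapped for any skill score. Because the better half of each generation is kept, the best cost never gets worse from one generation to the next.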
Disadvantages of genetic algorithms

- Highly dependent on class of functions
  - If poor model is chosen, poor results
  - Optimization may not help at all
- Known model does not always lead to better understanding
  - Magnitude of weights, etc. may not be meaningful if inputs are correlated
  - Problem may have multiple parametric solutions

Decision trees

- Can automatically build decision trees from known data
  - Prune trees
  - Select thresholds
  - Choose operators
- Advantages:
  - Fast to train
- Disadvantages
  - Piece-wise linear, so typically less skilled than neural networks
  - Large decision trees are effectively a black box
  - Can not do regression, only classification
- New advances: bagged, boosted decision trees approach skill of neural networks, but are no longer fast to train

[Diagram: example decision tree from the root, splitting on T (T < 10C / T > 10C), Z (Z > 45 / Z < 45) and V (V < 5 / V > 5), with class counts shown at each node]
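
The "select thresholds" step can be sketched for a single split: try every midpoint between observed values and keep the one that leaves the purest two groups. The Gini impurity criterion and the toy temperature data are assumptions for illustration; the slides do not say which criterion is used.

```python
import numpy as np


def gini(labels):
    """Gini impurity of a set of 0/1 labels (0 means the set is pure)."""
    if len(labels) == 0:
        return 0.0
    p = float(np.mean(labels))
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return (threshold, impurity) of the best single split 'value < threshold'."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels, dtype=int)
    distinct = np.unique(values)                    # sorted distinct values
    candidates = (distinct[:-1] + distinct[1:]) / 2.0
    best_t, best_score = None, np.inf
    for t in candidates:
        left, right = labels[values < t], labels[values >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_t, best_score = float(t), score
    return best_t, best_score


# Hypothetical data: the event occurs only at low temperatures.
temps = [2, 5, 8, 9, 12, 15, 18, 21]
event = [1, 1, 1, 1, 0, 0, 0, 0]
threshold, impurity = best_threshold(temps, event)
print(threshold, impurity)
```

A full tree repeats this search on each resulting group and over every input variable, then prunes branches that do not help on held-out data.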
Radial Basis Functions
Diagram from: A. W. Jayawardena & D. Achela K. Fernando 1998: Use of Radial Basis Function Type Artificial Neural Networks for Runoff Simulation, Computer-Aided Civil and Infrastructure Engineering 13:2

- Radial Basis Functions are a form of neural network
  - Localized gaussians
  - Linear sum of non-linear functions
- Advantage: Can be solved by inverting a matrix, so very fast
- Disadvantage: Not a general-enough model
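
A minimal sketch of why the fit is fast: with the Gaussian centers and width held fixed, the output is linear in the weights, so they follow from one linear solve. The centers, the width, the sine target and the use of least squares (a pseudo-inverse, standing in for the matrix inversion) are assumptions for illustration.

```python
import numpy as np


def rbf_design(x, centers, width):
    """Matrix of localized Gaussians: one row per sample, one column per center."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))


def fit_rbf(x, y, centers, width):
    """Solve for the linear output weights in a single least-squares step."""
    phi = rbf_design(x, centers, width)
    weights, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return weights


x = np.linspace(0.0, 2.0 * np.pi, 40)
y = np.sin(x)                                  # toy target function
centers = np.linspace(0.0, 2.0 * np.pi, 8)     # hypothetical fixed centers
weights = fit_rbf(x, y, centers, width=1.0)
pred = rbf_design(x, centers, 1.0) @ weights
print(np.max(np.abs(pred - y)))
```

No iterative training is involved. The price is that the centers and width have to be chosen beforehand, which limits how general the model is.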
Data Driven Methods
What is Artificial Intelligence?
Common AI techniques
Choosing between AI techniques
Pre and post processing
Typical data-driven application
[Diagram: Input Data -> Features -> f(features) -> Result, where f() is the AI application at run-time. Questions posed on the diagram: Which features? How do we find f()?]

What is the role of the data?

- Validation
  - Test known model
  - Technique: difference between model output and ground truth helps to validate the model
- Calibration
  - Find parameters to model with desired structure
  - Technique: tuned fuzzy logic method, genetic algorithms
- Induction
  - Find model and parameters from just data
  - Technique: neural network methods, bagged/boosted decision trees, support vector machines, etc.

What is the problem to solve?

- Do you have a bunch of data and want to:
  - Estimate an unknown parameter from it?
    - True rainfall based on radar observations?
    - Amount of liquid content from in-situ measurements of temperature, pressure, etc.?
    - Regression
  - Classify what the data correspond to?
    - A water surge?
    - A temperature inversion?
    - A boundary?
    - Classification
- Regression and classification aren’t that different
  - Classification: estimate probability of an event
    - A function with values between 0 and 1

Which AI technique?

- Do you have expert knowledge?
  - Humans have a “model” in their head? Should the final f() be understandable?
  - Create fuzzy logic rules from experts’ reasoning
    - Aggregate the individual fuzzy logic rules
  - Can tune the fuzzy rules based on data
    - Using regression, decision trees or neural networks for RMS error criterion
    - Genetic algorithms for error criteria like ROC, economic cost, etc.
    - Many times the original rules are just fine
- Do you already know the model?
  - A power-law relationship? Gaussian? Quadratic? Rules?
  - Just need to find parameters to this model?
    - If linear, just use linear regression
    - If non-linear: use genetic algorithms
      - Use continuous GAs
- Both of these can be used for regression (therefore, also classification)

Which AI technique (contd.)

- Do you know nothing about the data?
  - Not the suspected equation/model (GA)?
  - Not the suspected rules (fuzzy logic)?
  - Use an AI technique that supplies its equations/rules
- For classification, use:
  - Bagged decision trees or Support Vector Machines
    - If the output needs to be probabilistic, remember to apply Platt scaling
    - Summary statistics on bagged DTs can help answer “why”
  - Neural Networks
    - “black box”.
- For regression, use:
  - Neural networks

Where do your data come from?

- Observed data
  - Compute features
  - Choose AI technique
    - The 4 choices in the previous two slides
- Simulated data:
  - Example: trying to replicate a very complex model
  - Throw randomly-generated data at model
  - Compute features
  - Choose AI technique:
    - GA for parametric approximations
    - NN when you don’t know how to approximate

Where do you get your inputs?

- What type of data do you have?
  - Individual observations?
    - Sample them (choose at random) and use directly
  - Sparse observations in a time series?
    - Generate time-based features (1D moving windows)
    - Signal processing features from time series
  - Data from remotely sensed 2D grids?
    - Generate image-based features using convolution filters
- Do you need:
  - Pixel-based regression/classification?
    - Use convolution features directly
  - Object-based regression/classification?
    - Identify regions using region growing
    - Use region-aggregate features
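
As one possible reading of "1D moving windows", the sketch below turns a time series into per-window mean and standard deviation features using a convolution with a box filter. The window length and the choice of these two statistics are assumptions; other features are equally plausible.

```python
import numpy as np


def moving_window_features(series, window=5):
    """Per-window mean and standard deviation, one row per window position."""
    series = np.asarray(series, dtype=float)
    kernel = np.ones(window) / window          # box filter = moving average
    mean = np.convolve(series, kernel, mode="valid")
    mean_sq = np.convolve(series ** 2, kernel, mode="valid")
    # Variance = E[x^2] - E[x]^2; clip tiny negative values from round-off.
    std = np.sqrt(np.maximum(mean_sq - mean ** 2, 0.0))
    return np.column_stack([mean, std])


feats = moving_window_features([1, 2, 3, 4, 5, 6, 7], window=3)
print(feats)
```

The same pattern extends to 2D grids by swapping the 1D box filter for a 2D convolution kernel.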




Typical data-driven application
[Diagram: a data-driven application at run-time. Observed data -> signal/image processing; sampling -> Features -> normalize / create chromosome / determine confidences -> f() (FzLogic / GenAlg / NN / DecTree) -> Platt method / region-average / threshold -> Result]

Data Driven Methods
What is Artificial Intelligence?
Common AI techniques
Choosing between AI techniques
Pre and post processing
Preprocessing

- Often can not use pixel data directly
  - Different data sets may not be collocated
    - Need to interpolate to line them up
    - Mapping, objective analysis
  - Noise in data may need to be reduced
    - Smoothing
    - Present statistic of data, rather than data itself
  - Too much data, too highly correlated
    - May need to segment pixels into objects and use features computed on the objects
- Features need to be extracted from data
  - Human experts often good source of ideas on signatures to extract from data
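
Computing features on objects rather than pixels can be sketched as a per-region average. This assumes the grid has already been segmented into integer region labels (the segmentation itself is not shown), and the toy grid and labels below are made up.

```python
import numpy as np


def region_means(values, labels):
    """Mean of `values` over each labelled region (labels are 0, 1, 2, ...)."""
    values = np.asarray(values, dtype=float).ravel()
    labels = np.asarray(labels, dtype=int).ravel()
    sums = np.bincount(labels, weights=values)   # per-region sum of pixel values
    counts = np.bincount(labels)                 # per-region pixel count
    return sums / np.maximum(counts, 1)          # guard against unused labels


grid = [[1.0, 2.0, 10.0],
        [3.0, 4.0, 20.0]]
labels = [[0, 0, 1],
          [0, 0, 1]]
means = region_means(grid, labels)
print(means)
```

Each object is then described by a handful of aggregate numbers instead of many highly correlated pixel values.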
Postprocessing

- The output of an expert system may be grid point by grid point
- May need to provide output on objects
  - Storms, forests, etc.
  - Can average outputs over objects’ pixels
- May need probabilistic output
  - Scale output of maximum-margin techniques
  - Use a sigmoid function
    - Called Platt scaling
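
A minimal sketch of the sigmoid scaling step, assuming raw classifier scores and 0/1 ground truth are available for a calibration set. The two sigmoid parameters are fitted here by plain gradient descent on the log loss; the learning rate, step count and toy data are assumptions, and a production implementation would normally use a more careful optimizer.

```python
import numpy as np


def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit p = 1 / (1 + exp(-(a * score + b))) by gradient descent on log loss."""
    s = np.asarray(scores, dtype=float)
    y = np.asarray(labels, dtype=float)
    a, b = 1.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(a * s + b)))
        # Gradients of the mean log loss with respect to a and b.
        a -= lr * np.mean((p - y) * s)
        b -= lr * np.mean(p - y)
    return a, b


def platt_probability(scores, a, b):
    """Map raw scores to probabilities with the fitted sigmoid."""
    return 1.0 / (1.0 + np.exp(-(a * np.asarray(scores, dtype=float) + b)))


# Hypothetical raw scores (e.g. signed distances from a decision boundary).
scores = np.array([-2.0, -1.0, -0.5, -0.2, 0.3, 0.4, 0.8, 1.5])
labels = np.array([0, 0, 0, 1, 1, 0, 1, 1])
a, b = fit_platt(scores, labels)
probs = platt_probability(scores, a, b)
print(a, b, probs)
```

The fitted sigmoid keeps the ordering of the raw scores and only rescales them into the 0 to 1 range, so the result can be read as a probability.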
Summary

- What is Artificial Intelligence?
  - Data-driven methods to perform specific targeted tasks
- Common AI techniques
  - Fuzzy logic, neural networks, genetic algorithms, decision trees
- Choosing between AI techniques
  - Understand the role of your data
  - Do experts understand the system? (have a model)
  - Do experts expect to understand the system? (readability)
- Pre and post processing
  - Image processing techniques on spatial grids