Metody Inteligencji Obliczeniowej (Computational Intelligence Methods)

Neural network applications:
The present and the future
Włodzisław Duch
Department of Informatics,
Nicolaus Copernicus University, Toruń, Poland
Google: W. Duch
ICONIP’08 Panel Discussion
In the year 1900, at the International Congress of Mathematicians in Paris, David Hilbert delivered what is now considered the most important talk ever given in the history of mathematics, proposing 23 major problems worth working on in the future. A hundred years later, the impact of this talk is still strong: some problems have been solved, new problems have been added, but the direction once set - identify the most important problems and focus on them - is still important.
It became quite obvious that this new field also requires a series of challenging problems that will give it a sense of direction.
• Wlodzislaw Duch, What Is Computational Intelligence and Where Is It Going?
• Jurgen Schmidhuber, New Millennium AI and the Convergence of History
• Ron Sun, The Challenges of Building Computational Cognitive Architectures
• James A. Anderson et al., Programming a Parallel Computer: The Ersatz Brain Project
• JG Taylor, The Human Brain as a Hierarchical Intelligent Control System
• Soo-Young Lee, Artificial Brain and OfficeMateTR based on Brain Information Processing Mechanism
• Stan Gielen, Natural Intelligence and Artificial Intelligence: Bridging the Gap between Neurons and Neuro-Imaging to Understand Intelligent Behaviour
• DeLiang Wang, Computational Scene Analysis
• Nikola Kasabov, Brain-, Gene-, and Quantum Inspired Computational Intelligence: Challenges and Opportunities
• Robert P.W. Duin, Elżbieta Pękalska, The Science of Pattern Recognition. Achievements and Perspectives
• Wlodzislaw Duch, Towards Comprehensive Foundations of Computational Intelligence
• Witold Pedrycz, Knowledge-Based Clustering in Computational Intelligence
• Vera Kurkova, Generalization in Learning from Examples
• Lei Xu, A Trend on Regularization and Model Selection in Statistical Learning: A Bayesian Ying Yang Learning Perspective
• Jacek Mańdziuk, Computational Intelligence in Mind Games
• Xindi Cai and Donald C. Wunsch II, Computer Go: A Grand Challenge to AI
• Lipo Wang and Haixiang Shi, Noisy Chaotic Neural Networks for Combinatorial Optimization
Grand challenges
Our discipline is broad, and there are many grand challenges for the next 20 years:
• Foundations for CI theory, integrating all methods.
• Learning from data in difficult cases.
• Complex models, structured data, natural perception.
• Understanding brain/mind relations, neuromorphic models.
• Natural language processing.
• Combining CI (perception) with AI (systematic reasoning).
• Towards artificial minds.
Artificial Minds (AMs), or personoids, are software and robotic agents that humans can talk to and relate to in much the same way as they relate to other humans.
Neurocognitive informatics!
Current projects
• Learning data with inherent complex logic; general theory of CI and meta-learning.
• Infant lab for developing perfect babies, testing for problems/talents, and other neuroengineering projects – observing real behavior and understanding these observations.
• Understanding real brains: breaking the neural code, brain stem model, priming in cortex, generative disease models.
• Brain-inspired cognitive architectures, avatars with artificial minds, emotions, creativity & high-level cognition.
• Neurocognitive inspirations in natural language processing: large-scale semantic memories, word games, structuring information, precisiation of queries, semantic web, text annotation, bibliography, literature-based discovery.
• Interactive art projects, computer games.
What is there to learn?
Brains ... what is in the EEG? What happens in the brain?
Industry: what happens?
Genetics, proteins ...
What can we learn?
A good part of CI is about learning.
What can we learn?
Neural networks are universal approximators and evolutionary
algorithms solve global optimization problems – so everything
can be learned? Not quite ...
Duda, Hart & Stork, Ch. 9, No Free Lunch + Ugly Duckling Theorems:
• Uniformly averaged over all target functions the expected error for all
learning algorithms [predictions by economists] is the same.
• Averaged over all target functions no learning algorithm yields
generalization error that is superior to any other.
• There is no problem-independent or “best” set of features.
“Experience with a broad range of techniques is the best insurance for
solving arbitrary new classification problems.”
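The NFL claim can be checked directly on a toy problem. The sketch below is not from the talk: it assumes 3-bit Boolean inputs, a fixed training set of four of the eight inputs, and two made-up learners, then averages off-training-set error over every possible target function.

# Toy No Free Lunch check (assumed setup, not from the talk).
from itertools import product

inputs = list(product([0, 1], repeat=3))   # the 8 possible 3-bit inputs
train_x = inputs[:4]                       # fixed training inputs
test_x = inputs[4:]                        # off-training-set inputs

def learner_constant(train_pairs):
    """Predict the majority label seen during training (ties -> 0)."""
    ones = sum(y for _, y in train_pairs)
    label = 1 if ones > len(train_pairs) / 2 else 0
    return lambda x: label

def learner_nearest(train_pairs):
    """Predict the label of the closest training input (Hamming distance)."""
    def predict(x):
        dist = lambda a, b: sum(ai != bi for ai, bi in zip(a, b))
        return min(train_pairs, key=lambda pair: dist(pair[0], x))[1]
    return predict

def avg_off_training_error(make_learner):
    total = 0.0
    # average over all 2^8 = 256 Boolean target functions on 3 bits
    for labels in product([0, 1], repeat=len(inputs)):
        target = dict(zip(inputs, labels))
        h = make_learner([(x, target[x]) for x in train_x])
        total += sum(h(x) != target[x] for x in test_x) / len(test_x)
    return total / 2 ** len(inputs)

print(avg_off_training_error(learner_constant))   # 0.5
print(avg_off_training_error(learner_nearest))    # 0.5

Both averages come out to exactly 0.5: every labelling of the unseen inputs is equally represented in the average, so no learner can beat any other without problem-specific assumptions.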
Data mining packages
GhostMiner, data mining tools from our lab + Fujitsu:
http://www.fqspl.com.pl/ghostminer/
• DM packages: Weka, Yale, RapidMiner, Orange, Knime ...
>180 packages on the-data-mine.com list!
Hundreds of components ... thousands of combinations ...
Our treasure box is full, although computer vision, BCI
and other problems are not solved.
• We can data mine forever … and publish forever!
• Neural networks are universal approximators and evolutionary
algorithms solve global optimization problems – so everything
can be learned? Not quite ...
Are we really so good? Surprise! Almost nothing can be learned using such tools!
What have we tried: SBM
Similarity-Based Methods (SBMs) organized in a framework: p(C_i|X;M) posterior classification probabilities or y(X;M) approximators, with models M parameterized in an increasingly sophisticated way.
Why? (Dis)similarity:
• more general than feature-based description,
• no need for vector spaces (structured objects),
• more general than fuzzy approach (F-rules are reduced to P-rules),
• includes kNN, MLPs, RBFs, separable function networks, SVMs,
kernel methods and many others!
Components => Models: systematic search selects the optimal combination of parameters and procedures, opening different types of optimization channels and trying to discover the appropriate bias for a given problem.
Start from kNN, k=1, all data & features, Euclidean distance; end with a model that is a novel combination of procedures and parameterizations.
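As a rough illustration only – generic scikit-learn components standing in for the framework described here, with a standard benchmark dataset as a placeholder – such a search could start from 1-NN with Euclidean distance on all features and then open further parameterization channels:

# Rough sketch only: scikit-learn stand-ins, not GhostMiner/Intemi.
from sklearn.datasets import load_breast_cancer          # placeholder dataset
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

X, y = load_breast_cancer(return_X_y=True)

pipe = Pipeline([
    ("select", SelectKBest(f_classif)),     # optional feature filter
    ("knn", KNeighborsClassifier()),        # similarity-based model
])

# the reference model is k=1, all features, Euclidean distance;
# the grid opens further optimization channels around that starting point
grid = {
    "select__k": [10, 20, X.shape[1]],
    "knn__n_neighbors": [1, 3, 5, 9],
    "knn__metric": ["euclidean", "manhattan", "chebyshev"],
    "knn__weights": ["uniform", "distance"],
}

search = GridSearchCV(pipe, grid, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))

A real meta-learner searches over procedures as well as parameters, not just over a fixed grid, but the starting point and the idea of widening the search are the same.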
Transformation-based framework
Extend SBM by adding fine granulation of methods and relations between them, to enable meta-learning by search in the model space.
For example, transformations (layers) frequently do:
• linear projection: unsupervised – PCA, ICA ... – or supervised – FDA, LDA, linear SVM – to generate useful linear components;
• non-linear preprocessing transformation, e.g. MLP;
• feature selection, based on an information filter;
• matching pursuit network for signal decomposition;
• logical rules to handle unusual situations;
• similarity evaluation (RBF).
DM requires more transformations!
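A hedged sketch of such a stack of transformations, again with generic scikit-learn components rather than the framework itself: a linear projection, an information-filter feature selector, an RBF similarity layer, and a simple decision layer on top.

# Illustrative stack of transformation layers (assumed components, not the
# authors' system): projection -> information filter -> RBF similarity -> linear model.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)

model = Pipeline([
    ("project", PCA(n_components=30)),                        # linear projection layer
    ("filter", SelectKBest(mutual_info_classif, k=20)),       # information-filter feature selection
    ("similarity", RBFSampler(gamma=0.02, random_state=0)),   # RBF similarity features
    ("decide", LogisticRegression(max_iter=1000)),            # final decision layer
])

print(cross_val_score(model, X, y, cv=5).mean())

Each layer can be swapped independently, which is exactly what makes a systematic, meta-level search over such compositions possible.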
More meta-learning
Meta-learning: learning how to learn, replacing experts who search for the best models by running many experiments.
The search space of models is too large to explore exhaustively, so the system architecture is designed to support knowledge-based search:
• Abstract view, uniform I/O, uniform results management.
• Directed acyclic graphs (DAGs) of boxes representing scheme placeholders and particular models, interconnected through I/O.
• Configuration level for meta-schemes, expanded at the runtime level.
• An exercise in software engineering for data mining!
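A toy sketch of that idea – the node names and machine generators below are invented for illustration, not Intemi's actual API – with a configuration-level DAG whose placeholder boxes are expanded into concrete machines at the runtime level:

# Toy meta-scheme (illustrative only): a DAG of boxes, some of them
# placeholders that are expanded into concrete machines at runtime.
from itertools import product

# configuration level: nodes are concrete components or "?" placeholders
meta_scheme = {
    "data":       {"inputs": [],                     "box": "dataset"},
    "transform":  {"inputs": ["data"],               "box": "?projection"},
    "classifier": {"inputs": ["transform"],          "box": "?classifier"},
    "evaluate":   {"inputs": ["classifier", "data"], "box": "cross-validation"},
}

# machine generators: the alternatives that may fill each placeholder
generators = {
    "?projection": ["PCA", "ICA", "LDA"],
    "?classifier": ["kNN", "SVM", "decision tree"],
}

def expand(scheme):
    """Runtime level: substitute every placeholder with each concrete machine."""
    slots = [(name, node["box"]) for name, node in scheme.items()
             if node["box"] in generators]
    for choice in product(*(generators[box] for _, box in slots)):
        concrete = {name: dict(node) for name, node in scheme.items()}
        for (name, _), machine in zip(slots, choice):
            concrete[name]["box"] = machine
        yield concrete

for i, candidate in enumerate(expand(meta_scheme), 1):
    chain = " -> ".join(candidate[n]["box"]
                        for n in ["data", "transform", "classifier", "evaluate"])
    print(f"candidate {i}: {chain}")

Nine candidate models come out of this single template; nesting templates and adding more generators lets the same mechanism produce much richer model families.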
Intemi, Intelligent Miner
Meta-schemes: templates with placeholders.
• May be nested; the role is decided by the input/output types.
• Machine learning generators based on meta-schemes.
• Granulation level allows novel methods to be created.
• Complexity control: Length + log(time).
• A unified meta-parameters description ...
• InteMi, intelligent miner, coming “soon”.
How much can we learn?
Linearly separable or almost separable problems are relatively
simple – deform or add dimensions to make data separable.
How to define “slightly non-separable”?
There is only separable and the vast realm of the rest.
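The standard XOR example (not from the talk) illustrates the "deform or add dimensions" remark: a linear model cannot fit the raw 2-D data, but adding the product feature x1*x2 as a third dimension makes the same data linearly separable.

# XOR: not linearly separable in 2D, separable after adding x1*x2.
import numpy as np
from sklearn.linear_model import Perceptron

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])                       # XOR labels

flat = Perceptron(max_iter=1000, tol=None).fit(X, y)
print("2D accuracy:", flat.score(X, y))          # stays below 1.0: not separable

X3 = np.hstack([X, (X[:, 0] * X[:, 1]).reshape(-1, 1)])   # added dimension
lifted = Perceptron(max_iter=1000, tol=None).fit(X3, y)
print("3D accuracy:", lifted.score(X3, y))       # 1.0: now linearly separable

Kernel methods perform this kind of lifting implicitly; which deformation works depends entirely on the problem.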
Spying on networks
After initial transformation, what still needs to be done?
Conclusion: separability in the hidden space is perhaps too much to ask for ... use rules, similarity or linear separation, depending on the case.
Parity n=9
Simple gradient learning; the quality index is shown in the accompanying plot.
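A minimal sketch of such an experiment (assumed setup, not the original one): enumerate all 512 bit-strings of length 9, label each by its parity, and train a small MLP by plain gradient descent. Parity is exactly the kind of inherently complex logic that simple gradient learning struggles with.

# Assumed setup for a 9-bit parity experiment (not the original code).
from itertools import product
import numpy as np
from sklearn.neural_network import MLPClassifier

n = 9
X = np.array(list(product([0, 1], repeat=n)), dtype=float)   # all 512 inputs
y = X.sum(axis=1).astype(int) % 2                            # parity labels

mlp = MLPClassifier(hidden_layer_sizes=(n,), activation="tanh",
                    solver="sgd", learning_rate_init=0.1,
                    max_iter=5000, random_state=0)
mlp.fit(X, y)
print("training accuracy:", mlp.score(X, y))   # parity is hard for plain gradient learning; often below 1.0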