brochure - Department of Knowledge Technologies


Our mission is to advance cutting-edge research and
applications of knowledge technologies that support the
analysis, modeling and management of knowledge and
data.
We have authored and edited numerous scientific books
and coordinated several EU projects.
Our technologies have been successfully applied to many
practical problems.
We are active in education and transfer of knowledge, and
act as a bridge between science and industry in Slovenia
and abroad.
Department of
KNOWLEDGE TECHNOLOGIES
Jožef Stefan Institute
Contents
– Basic Information
– Scientific highlights
– Relevance highlights
– Vision
Basic information
30 years of research tradition
– Founded as Department of Artificial Intelligence in 1979
– Department of Knowledge Technologies since 2004
– 30 researchers, 20 students/external, 5 support staff
Research areas
– Data Mining
– Text, Web and Multimedia Mining
– Semantic Web
– Human Language Technologies
– Decision Support
– Knowledge Management
Application areas
– Ecology, Geology
– Medicine, Health care
– Biomedicine, Systems biology
– Agriculture, Forestry
– Telecommunications
– Digital libraries
– Cultural heritage
– eGov, eBusiness, eLearning
Collaborations
In Slovenia
– Center for Knowledge Transfer in IT at Jožef
Stefan Institute (JSI)
– Jožef Stefan International Postgraduate
School
– Spin-offs: Temida and Quintelligence
– Cycorp Europe established in 2007 at JSI
International
– Collaboration with over 100 partners of EU
projects, academic and industrial
– Strong ties with over 20 other partners,
including CMU, Stanford University, NASA
Ames, Microsoft Research and Osaka
University
Industry
• British Telecom
• France Telecom
• Siemens Business Solutions
• Empolis/Bertelsmann
• Atos Origin
• Software AG
• UN FAO
• Dassault Aviation
• BRGM (Bureau de recherches géologiques et minières)
• SINTEF (Norway)
• iSOCO (Spain)
• Ontoprise (Germany)
• SIRMA (Bulgaria)
• Helsinki Institute of Technology
• FZI Karlsruhe
• CRF FIAT
Education and knowledge transfer
Teaching
– MSc and PhD courses in major research areas of
knowledge technologies
– Supervision of BSc, MSc, and PhD students
Institutions
– Jožef Stefan International Postgraduate School
– Universities of Nova Gorica, Maribor, Ljubljana and Primorska;
Graz University
videolectures.net
– World-leading video lecture Web portal
Summer schools
– Semantic Web Summer School 2004 – 100 attendees
– AI Summer School ACAI-05 – 100 attendees
High school student competitions
– Yearly Computer Science competitions
– Books of tasks and solutions
Selected publications before 2004
Scientific Highlights
2004-2008
Scientific results 2004 - 2008
Publications in prestigious journals
– Journal of Machine Learning Research (3), Machine Learning (6),
Decision Support Systems
– Ecological Modeling (10), Journal of Biomedical Informatics (3)
Awards
– Two elected ECCAI fellows (2007, 2008)
– Prešeren BSc thesis award (2007)
– Best software award (ESWC-2006)
PC chairs of major scientific events
– DS-06, ESSLLI-07, ILP-08
– ECML/PKDD-07, ECML/PKDD-09
High SCI citations of group members
Editors/authors of numerous books
and proceedings
Scientific highlight: Subgroup discovery
New methods and systems
– Discovering interesting subgroups in
tabular data:
• SD, CN2-SD, APRIORI-SD
– Discovering interesting subgroups in
multi-relational data:
• RSD
Breakthrough technology
– Effective method for using ontologies in relational data mining
• Using GO, KEGG, ENTREZ to form relational features
• Successful discovery of new scientific knowledge in functional
genomics
Journal papers
– MLJ 2004a and 2004b, JMLR 2004, …, IEEE TSMC 2006, MLJ 2007,
JBI 2007, JMLR 2008, JBI 2008
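As an illustration of how the interestingness of a subgroup can be scored, the sketch below computes weighted relative accuracy (WRAcc), the rule quality measure used by CN2-SD. The function name, the dictionary-based row format and the toy data are assumptions made for this example only; they are not taken from the department's software.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def wracc(rows: List[Row],
          condition: Callable[[Row], bool],
          target: Callable[[Row], bool]) -> float:
    """Weighted relative accuracy of the rule `condition -> target`.

    WRAcc = p(Cond) * (p(Class | Cond) - p(Class))
    A positive value means the target is more frequent inside the
    subgroup than in the whole data set.
    """
    n = len(rows)
    covered = [r for r in rows if condition(r)]
    if n == 0 or not covered:
        return 0.0  # an empty subgroup carries no information
    p_cond = len(covered) / n
    p_class = sum(1 for r in rows if target(r)) / n
    p_class_given_cond = sum(1 for r in covered if target(r)) / len(covered)
    return p_cond * (p_class_given_cond - p_class)


# Hypothetical toy table: does the subgroup "smoker" stand out for "ill"?
toy_rows: List[Row] = [
    {"smoker": True, "ill": True},
    {"smoker": True, "ill": True},
    {"smoker": False, "ill": False},
    {"smoker": False, "ill": True},
]
score = wracc(toy_rows, lambda r: bool(r["smoker"]), lambda r: bool(r["ill"]))
```

A subgroup discovery system would evaluate many candidate conditions in this way and keep the highest-scoring ones; the search strategy itself is not shown here.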
Scientific highlight: Equation discovery
New methods
– Integrating process-based domain knowledge and models
Breakthrough technology
– Integrating knowledge-based and data-driven modeling of
dynamic systems
Systems LAGRAMGE 2.0, IPM
Numerous successful applications
– Modeling aquatic ecosystems
• Lake Bled, Ohrid, Kasumigaura, Greifensee, Glumsoe
Journal papers in MLJ, Ecological Modeling, …
State-of-the-art-survey book
– Computational Discovery of Scientific Knowledge
Scientific highlight: Text mining and visualization
New methods
– Text processing, clustering, SVM,
ontology construction, …
– Graph and text visualization
Breakthrough technology
– Open-source text mining software
Systems
– Text-Garden – text-mining library
(http://www.textmining.net)
– Document-Atlas – text visualization
(http://docatlas.ijs.si/)
– OntoGen – semi-automated ontology
construction (http://ontogen.ijs.si/)
• Award winner at ESWC 2006
Content-Land
Scientific highlight: Qualitative decision support
New methods
– Qualitative DS modeling
– Truly hierarchical, probabilistic
Systems DEXi 2.0, proDEX
Monograph on Decision Support
Journal papers in Decision Support Systems,
Journal of Operational Research, Ecological Modeling, …
Successful applications
– GM farming models and DS systems,
Highway control, …
– Software tools: SIGMEA Maize Coexistence Advisor (SMAC Advisor),
ECOGEN Soil Quality Index (ESQI)
[Figure: hierarchical qualitative decision model. Labelled groups: ECOLOGY, CONTEXT, CROP MANAGEMENT, CROP PROTECTION. Attributes include water quality, greenhouse gases, soil biodiversity, pesticide use, fertilizer use, fuel use, farm type, soil tillage, weed control, pest control and disease control.]
Relevance Highlights
2004-2008
Relevance highlight: European projects
“… Knowledge Technologies is the most successful Slovenian
program in terms of EU projects.”
F. Demšar, National Research Fund director, Oct. 25, 2007
FP6
20 EU projects
– 4 IP projects, 1 NoE, 3 SSA, 1 CA
– 11 STREP projects
– Coordination of one STREP project (IQ)
In FP6 we acquired about 25% of Slovenian FP6-IST funds
(over 5.1 million EUR), i.e. about 7% of all Slovenian
FP6 funds
FP6 European projects
• SEKT - Semantically-Enabled Knowledge Technologies (IP, 2004–06)
• ECOLEAD - European Collaborative Networked Organizations Leadership Initiative (IP, 2004–08)
• NeOn - Lifecycle Support for Networked Ontologies (IP, 2006–10)
• Co-Extra - GM and non-GM Supply Chains: Their Co-existence and Traceability (IP, 2008–09)
• PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning (NoE, 2003–07)
• CEC-WYS - Central European Centre for Women and Youth in Science (SSA, 2004–07)
• IST-World - Knowledge Base for RTD Competencies (SSA, 2005–07)
• WS DEBATE - Stimulating Policy Debate on Women and Science Issues in Central Europe (SSA, 2006–08)
• KD-ubiq - A blueprint for ubiquitous knowledge discovery systems (CA, 2005–08)
• ALVIS - Superpeer Semantic Search Engine (STREP, 2004–06)
• SIGMEA - Sustainable Introduction of GMOs into European Agriculture (STREP, 2004–07)
• IMAGINATION - Image-based Navigation in Multimedia Archives (STREP, 2006–09)
• SMART - Statistical Multilingual Analysis for Retrieval and Translation (STREP, 2006–09)
• SWING - Semantic Web Services Interoperability for Geospatial Decision Making (STREP, 2006–09)
• TAO - Transitioning Applications to Ontologies (STREP, 2006–09)
• E.E.T Pipeline - European Embryonal Tumor Pipeline (STREP, 2007–09)
• E4 - Extended Enterprise management in Enlarged Europe (STREP, 2006–08)
• Tool-East - Open Source Enterprise Resource Planning and Order Management System for Eastern European Tool and Die Making Workshops (STREP, 2006–08)
• IQ - Inductive Queries for Mining Patterns and Models (STREP, 2005–08), Coordinator
• HEALTHREATS - Integrated Decision Support System for HEALTH THREATS and crises management (STREP, 2007–10)
FP7 European projects (in 2008)
FP7
– 3 IP, 2 STREP, 1 NoE, 1 CSA
– In FP7 we have acquired ~30% of Slovenian FP7-ICT funds (2.5+ Mio EUR)
Projects
– COIN - COllaboration and INteroperability for networked enterprises (IP, 2008–12)
– ACTIVE - Enabling the Knowledge Powered Enterprise (IP, 2008–11)
– EURIDICE - European Inter-Disciplinary Research on Intelligent Cargo for Efficient, Safe and Environment-friendly Logistics (IP, 2008–11)
– PASCAL2 - Pattern Analysis, Statistical Modelling and Computational Learning 2 (NoE, 2008–13)
– BISON - Bisociation Networks for Creative Information Discovery (FET, 2008–11)
– PHAGOSYS - Systems biology of phagosome formation and maturation modulation by intracellular pathogens (STREP, 2008–11)
Industrial participation in European projects
We have helped 11 Slovenian companies become partners
in FP6 and FP7 EU projects. The total EC contribution to
these industrial partners exceeds 2 million EUR.
– Orodjarski grozd
– Avtomobilski grozd
– Grozd visokotehnološke opreme
– Kogast Grosuplje
– Emo orodjarna
– Valji Štore
– Tecos
– Quintelligence
– Cycorp
– Amebis
– Hermes Softlab
Relevance highlight: Slovene language and heritage
nl.ijs.si portal
• Largest public repository of Slovene language resources
• ~10,000 requests/day
• Annotated language corpora
• Lexicons and dictionaries
• On-line tools for language processing: concordancers, lemmatisers, taggers
• eZISS digital library of critical editions of Slovene literature
Relevance highlight: Environmental data analysis
Applied projects
– Agriculture: modeling co-existence of genetically
modified and conventional crops
– Forestry (automated forest mapping):
• from satellite images instead of LIDAR
• cost reduction: from 660 to 0.01 US$/km²
– Fire risk model: Deployed in e-GIS UJME
Events
– ECEM/EAML-04: European Conf. on Ecological
Modeling: Env. App. of Machine Learning
– Special issues of Ecological Modeling journal
Postgraduate education
– University of Nova Gorica
– Jožef Stefan International Postgraduate School
– University of Trento
Relevance highlight: Healthcare data analysis
Projects MediNet and MediNet+ for
the Slovenian Ministry of Health
– Qualifications of physicians
• Modeling and exception finding
– Planning of needs for physicians
– Accessibility of primary
healthcare
Relevance highlight: Public portals
videolectures.net
World-leading video lecture Web portal
• 4,500+ videos of 3,000+ lectures at 150+ events
• About 3,000 views/day
• Collaboration with CMU, Cambridge, Oxford,
Max Planck, Berkeley, INRIA
• To include all MIT OpenCourseWare and
CERN lectures base in 2008
Relevance highlight: Public portals
www.ist-world.org
World leading Web portal for
analyzing European science
• 90,000 RTD organizations,
68,000 RTD projects,
1.6 Mio experts and
2.5 Mio publications
• About 15,000 visits/day
• Extending coverage to
Russia, India, SEE countries
Organization of events
Slovenian events
– Solomon seminar – regular public seminar,
running for 9 years (200+ seminars)
– SiKDD – yearly Slovenian Conference on Data Mining
– Language Technologies - biennial conferences
International events
– ECEM/EAML-04: Eur. Conf. on Ecol. Modeling – 100 attendees
– European Semantic Web Conference 2006 – 350 attendees
– IDA 2007 – 100 attendees
– 10+ international meetings and workshops (~50 attendees)
International events planned
– ECML/PKDD 2009 – est. 400 attendees
– WWW 2012 (in process) – est. 1500 attendees, the largest CS event
to be organized in Slovenia
Vision
Future advances in basic research (1)
Data analytics
– Structured data analysis (structured prediction,
bisociation analysis, …)
– Sensor network analysis, social network analysis
(large graph data)
– Multi-modal data analysis (information fusion,
different data types)
– Complex data visualization
Text analytics
– Extending Text-Garden to multimedia mining
(text, image, Web) and social network analysis
– Advancing information extraction, machine translation
– Ontologies and Semantic Web
Future advances in basic research (2)
Human language technologies
– Semantic annotation of Slovene language corpora
– Integrated digital library of Slovene text-critical editions
– Slovene cultural heritage – processing old (19th century)
language
Decision support
– Integration of qualitative and quantitative methods
– Handling incompleteness, uncertainty and imprecision
Knowledge management
– Web 2.0, Semantic Web services
– Networked organizations
– eLearning – videolectures.net
Impact on other sciences through
applied research
Environmental sciences
– Ecology (aquatic ecology, modeling the response of ecosystems to
climate change)
– Forestry
– Environmental epidemiology
Agriculture
Biomedicine
– Bioinformatics, Functional genomics, Systems biology
– Medicine
Linguistics, Humanities and Social sciences
Engineering
Impact will be achieved in collaboration with partners of EU projects
National relevance
Developing IT and building a knowledge-based society
– Through basic research
– Training competent researchers in this area
– Education (at graduate and post-graduate level)
Applied research
– Impact of the potential introduction of GM crops,
environmental epidemiology of tick-borne diseases,
introduction of ML technology for Slovene-English machine
translation systems, analytic techniques for enterprise
knowledge management, systems biology
Continue opening new high-tech jobs in Slovenia
– Through direct industrial applications
– Through inclusion of Slovenian industry in EU projects
Means for achieving our vision
• Clear scientific focus on
knowledge technologies
• Excellent links with scientists
abroad
• Excellent links with industry
• Young and visionary staff
• Available equipment
• Secured funding:
– About 25 % from national long-term
research program Knowledge
Technologies and other national
and international projects
– About 75 % from EU funded
projects
[Chart: total funding per year, 2004–2008 (axis from 0 to 2,500,000), by source: EU projects, Knowledge Technologies program, national projects]
Funding in 2008
– EU projects: 71%
– Knowledge Technologies: 17%
– Targeted research projects: 7%
– Basic research projects: 3%
– Applied research projects: 2%
Principal researchers – project leaders
Nada Lavrač (Head of Department)
Marko Bohanec
Sašo Džeroski
Marko Debeljak
Tomaž Erjavec
Marko Grobelnik
Dunja Mladenić
Mitja Jermol
Department members - Bohinj 2008
Jožef Stefan Institute and Postgraduate School
Jožef Stefan (1835-1893) was one of the
most distinguished physicists of the
19th century. He originated the law
of the total radiation from a black body, j = σT⁴.
Founded in 1949, Jožef Stefan Institute is the leading Slovenian
scientific research institute, covering a broad spectrum of basic
and applied research. The staff of more than 850 specializes in
natural sciences, life sciences and engineering.
The Department of Knowledge Technologies is one of the seven ICT
departments of the institute. Other departments work in the areas of
chemistry, biochemistry, ecotechnology, nanotechnology, physics,
nuclear technology and safety.
Founded in 2004, Jožef Stefan International Postgraduate School offers
MSc and PhD programs: ICT, nanotechnology and ecotechnology.
Courses are taught in English.