DutchSemanticWebGettogether$DutchSWSlides

Dutch Semantic Web Get-together
Vrije Universiteit Amsterdam, March 16th
http://esw.w3.org/topic/DutchSemanticWebGettogether
Organization: Antoine Isaac, Eyal Oren
Sponsor: the Network Institute
Agenda
• 09:30-09:45: welcome
• 09:45-10:30: big talk (Ivan Herman)
• 10:30-10:45: coffee + koekjes
• 10:30-11:30: speeddating
• 11:30-12:30: lightning talks + discussion
• 12:30-13:15: lunch
• 13:15-14:15: lightning talks + discussion
• 14:15-14:30: coffee
• 14:30-15:30: lightning talks + discussion
• 15:30-16:30: speeddating and drinks
Lightning talks - 1
• Marshall
• Hoekstra
• Deursen
• Top
• Ossenbruggen
• Amin
• Hildebrand
• Wang Yiwen
• Brugman
Scott Marshall
http://www.leibnizcenter.org
Rinke Hoekstra (VU/UvA) and Saskia van de Ven (UvA)
Law and Semantic Web
• Representation of regulations using Semantic Web languages
• Legal reasoning using standard (DL) reasoners
Issue: “What if two sources say different things?”
• Specificity (…)
• Lex Superior
• Temporal Validity (overwrite)
• Lex Posterior
• Applicability to old cases
• Authority (implicit)
• Lex Superior
• Jurisdiction
• Location (…)
• Jurisdiction
• Scope (deeming provision)
• Import
• References to definitions, not documents (documents)
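A minimal sketch of how two of the principles listed above could settle a conflict between sources. It assumes the usual reading of lex superior (the higher-ranking source prevails) and lex posterior (among equals, the later source prevails), applied in that order. The Norm class, the numeric authority scale and the sample norms are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Norm:
    source: str
    claim: str
    authority: int  # assumed scale: higher number = higher-ranking source
    enacted: date


def resolve(a: Norm, b: Norm) -> Norm:
    """Pick the prevailing norm: lex superior first, then lex posterior."""
    if a.authority != b.authority:
        return a if a.authority > b.authority else b
    return a if a.enacted >= b.enacted else b


# Hypothetical sample norms that contradict each other.
statute = Norm("national statute", "permit required", 2, date(2001, 1, 1))
bylaw = Norm("municipal bylaw", "no permit required", 1, date(2008, 6, 1))
amendment = Norm("amended statute", "permit required above 10 m2", 2, date(2009, 1, 1))

print(resolve(statute, bylaw).source)      # decided by authority
print(resolve(statute, amendment).source)  # decided by enactment date
```

Specificity, jurisdiction and scope would need further fields and rules; they are left out here to keep the sketch short.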
ELIS – Multimedia Lab
Semantic Web vs. Multimedia Annotation
[Diagram: feature extraction, feature DB, find the best match]
• feature extraction results in low-level concepts
• matching algorithms use a number of rules to propose a high-level concept
• use SW technologies for this purpose
– formally described feature DB
– formally described rules
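A minimal sketch of the matching step described above, in which rules map low-level concepts to a proposed high-level concept. The rule table, the concept names and the coverage score are assumptions made for illustration; the talk does not specify the actual rules or the matching algorithm.

```python
# Hypothetical rules: high-level concept -> low-level concepts it requires.
RULES = [
    ("beach", {"sand", "water", "sky"}),
    ("forest", {"tree", "grass"}),
    ("city", {"building", "road"}),
]


def best_match(features):
    """Return the high-level concept whose rule is best covered by the features.

    The score is the fraction of required low-level concepts that were
    detected; None is returned when no rule is covered at all.
    """
    scored = [
        (len(required & features) / len(required), concept)
        for concept, required in RULES
    ]
    score, concept = max(scored)
    return concept if score > 0 else None


print(best_match({"sand", "water", "sky", "person"}))
```

In a Semantic Web setting the rule table and the feature DB would be formally described resources rather than Python literals.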
Metadata modeling
http://dbpedia.org/resource/Barack_Obama
Davy Van Deursen, Sam Coppens, Erik Mannens
Dutch Semantic Web – 16.03.2009
Jan Top
e-Science for Food Research
• Semi-open innovation in food
• RDF/OWL model of the scientific workflow: research question, preparation, experiment, data analysis, reporting, …
• Food Thesaurus
• Web application Tiffany – Sesame plus .NET
• Openness or ‘stimulated-disclosure’?
• Flexibility of the model, but how flexible is the user?
The new Luxaflex® powerpoint template
Who are the users?
Why would they use the cloud?
What tasks can be supported?
How will the semantics help?
Jacco van Ossenbruggen
Alia Amin
Comparison Search
Who: CH conservators, researchers, students
Why: Important Information Gathering task (JCDL’08)
What: Compare sets
using multiple thesauri, heterogeneous dataset
alignment between properties and values
How: semantic search & visualization
Subject Annotation
Who: Professional annotators
Why: Subject matter annotation of 700.000 prints
What: Search in multiple thesauri for annotation terms
How: Autocompletion on who/what/where/when
Michiel Hildebrand
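A minimal sketch of autocompletion over several thesauri, with suggestions grouped by who/what/where/when as on the Subject Annotation slide. The term lists are hypothetical placeholders, and the sorted-prefix lookup is one possible technique, not necessarily the one used in the project.

```python
from bisect import bisect_left

# Hypothetical thesauri, one per facet; real terms would come from the
# vocabularies used for annotation.
THESAURI = {
    "who": ["Rembrandt", "Rubens", "Vermeer"],
    "what": ["portrait", "print", "river"],
    "where": ["Rome", "Rotterdam", "Venice"],
    "when": ["Renaissance", "Romanticism"],
}

# Sort lower-cased terms once so that every prefix maps to a contiguous slice.
INDEX = {
    facet: sorted((term.lower(), term) for term in terms)
    for facet, terms in THESAURI.items()
}


def autocomplete(prefix, limit=5):
    """Return {facet: [terms starting with prefix]} across all thesauri."""
    prefix = prefix.lower()
    suggestions = {}
    for facet, entries in INDEX.items():
        start = bisect_left(entries, (prefix,))
        hits = []
        for key, term in entries[start:]:
            if not key.startswith(prefix) or len(hits) == limit:
                break
            hits.append(term)
        if hits:
            suggestions[facet] = hits
    return suggestions


print(autocomplete("ro"))
```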
Patterns of Semantic Relations in Content-based Recommender Systems
Accuracy
Frequency
Serendipity
teachOf/studentOf
Yiwen Wang, CHIP Project www.chip-project.org
16/03/2009
Hennie Brugman
[Architecture diagram. Components: texts, video, catalog; annotation service (GATE, Apolda); annotations; annotation repository; ranking service; term suggestions; thesaurus (SKOS); vocabulary; conversion; enrichment. Stages: semantic annotation (based on GATE); recommendation & ranking.]
Lightning talks - 2
• Brickley
• Omelayenko
• Cimiano
• Cornet
• Willems
• Koenderink
• Rijgersberg
• Rutledge
• Nederbragt
• Bocconi
Dan Brickley
Borys Omelayenko
AnnoCultor
porting collections and vocabularies to the Semantic Web
[Diagram. Museums with various models: Louvre, Tropenmuseum, RKD, Rijksmuseum, Volkenkunde, etc. e-culture: DC / SKOS. Labels: Work, Concept, Image. AnnoCultor.]
AnnoCultor
Converter in Java or XML*
• 100s of properties and 100.000 concepts per institution
• Structural conversion: from simple to very complex
• Semantic enrichment: term lookup, disambiguation
• Up to 80% of terms found in vocabulary lookup
annocultor.sourceforge.net
CATCH day, 28.02.2008
Philipp Cimiano
Dutch SW Day @ VU, Amsterdam
16th March 2009
Web Information Systems (WIS) - EWI
TU Delft
Towards Linguistically Grounded Ontologies
(joint work w. P. Buitelaar, P. Haase and M. Sintek)
The Need
Ontologies do not need labels “per se”.
We need labels for:
• human consumption
• linking textual data to ontologies (ontology population)
• generating descriptions from ontologies through NL
• etc.
We need a general and principled model to associate linguistic information to ontologies.
Related Work
Does SKOS do the job?
No, SKOS was defined for totally different purposes.
It provides a datamodel (hijacking RDF/OWL) to represent classification schemas:
ex:animals rdf:type skos:Concept;
    skos:prefLabel "animals"@en;
    skos:altLabel "creatures"@en;
    skos:prefLabel "animaux"@fr;
    skos:altLabel "créatures"@fr.
There are other models which are more in line with our work, e.g. the LIR Model from UPM/Madrid.
Requirements
1. capture morphological relations between terms, e.g., inflection (animal, animals), separately from the domain ontology;
2. represent the morphological or syntactic decomposition of composite terms and the linking of the components to the ontology;
3. model complex linguistic patterns, such as subcategorization frames for specific verbs together with their mapping to arbitrary ontological structures;
4. specify the meaning of linguistic constructions with respect to an arbitrary (domain) ontology, and
5. clearly separate the linguistic and semantic (ontological) representation levels.
The Goal
• The goal of this research is to yield a principled and generic model that allows one to declaratively specify a lexicon for an ontology.
• The main goal is to avoid that all applications have to re-specify the connection between language and an ontology in an “ad hoc” fashion.
• The vision is one where we can also publish lexica for ontologies (in addition to the ontologies themselves) and people can search and reuse these “ontology lexica”.
The Model
Future Work
• Spread the model and make people use it (first version of an API is available)
• Develop techniques that automatically instantiate the model
• Investigate relation to other models (e.g. LIR)
Ronald Cornet - Department of Medical Informatics
Academic Medical Center – University of Amsterdam
Understanding & Evaluation
Implementation
• GUI Design
• Functionality
• Classifications (rules)
• Information models
• Large-scale reasoning
Collaborations
• VU
• IHTSDO
• NEN/CEN/ISO
Terminological Systems
SNOMED CT
Development
• Formalization
• Standardization
• Architecture
Auditing & Maintenance
• (DL-based) Quality Assurance
Domains
• Intensive Care
• Anesthesiology
• Nephrology
ERDSS
• Emerging Risks
• Holistic
• Ontology
• Forward chaining
• Risk assessment
• Uncertainty
Don Willems WUR/IM
ROC
Nicole Koenderink, WUR -- IM
• Specify scope
• Identify sources
• Extract triples
[Diagram: ROC Tool, Sesame repository, protoontology]
Interviews cost time for KE and DE → DE mature role
Difficult for DE to provide knowledge → prompting by tool
Models created from scratch → reuse existing sources
Task-specific knowledge required → monitor scope
[Diagram: subject layers 1, 2, ..., n and x, each ranging from generic to specific; domain, discipline and application perspectives; protoontology with nodes A to H; scope]
Hajo Rijgersberg
Design and use of a quantitative research vocabulary for e-science
• Problem
• Approach
– Vocabulary
– Web services and web apps
– Evaluate use
• Lessons learnt
– Support simple, recurring actions
– Focus on those who actually need support
– Integrate in popular tools
– Excel add-in
Semantic Friendly Forms
RDFS/OWL functionality in form-based wiki
Now
- Semantic MediaWiki enables crowd semantics (and displays)
- Semantic Forms facilitates crowd entry (at ~RDF level)
Semantic Friendly Forms: RDFS&OWL-based menus/autocompletion
- Entry is quicker and with fewer errors (?)
- Process RDFS&OWL for form-based input
- Input form value selection
- Property selection for class instance input & infobox
- Domain, range, cardinality, restrictions, symmetry, …
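A minimal sketch of how RDFS statements could drive form-based input as described above: rdfs:domain selects the properties to show for a class, and rdfs:range selects the values to offer in a menu. The schema, the instances and the function names are hypothetical; treating domain and range as form constraints is an assumption of this sketch, and it is not the Semantic Forms or Semantic MediaWiki API.

```python
# Schema and data as (subject, predicate, object) triples; all names are
# hypothetical examples.
SCHEMA = {
    ("ex:Artist", "rdfs:subClassOf", "ex:Person"),
    ("ex:name", "rdfs:domain", "ex:Person"),
    ("ex:painted", "rdfs:domain", "ex:Artist"),
    ("ex:painted", "rdfs:range", "ex:Painting"),
    ("ex:title", "rdfs:domain", "ex:Painting"),
}
INSTANCES = {
    ("ex:NightWatch", "rdf:type", "ex:Painting"),
    ("ex:Rembrandt", "rdf:type", "ex:Artist"),
}


def superclasses(cls):
    """Return cls plus every class reachable via rdfs:subClassOf."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo += [o for s, p, o in SCHEMA
                     if s == current and p == "rdfs:subClassOf"]
    return seen


def form_fields(cls):
    """Properties to offer when entering an instance of cls (via rdfs:domain)."""
    classes = superclasses(cls)
    return sorted(s for s, p, o in SCHEMA
                  if p == "rdfs:domain" and o in classes)


def value_menu(prop):
    """Existing instances to offer as values for prop (via rdfs:range)."""
    ranges = {o for s, p, o in SCHEMA if s == prop and p == "rdfs:range"}
    return sorted(s for s, p, o in INSTANCES
                  if p == "rdf:type" and o in ranges)


print(form_fields("ex:Artist"))
print(value_menu("ex:painted"))
```

Cardinality, OWL restrictions and symmetry would add further checks on top of this.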
Questions
- Do RDFS&OWL-based menus accelerate crowd entry?
- Can crowds engagingly and effectively design ontologies?
- What is an effective pattern and scenario for use?
JWS special issue on Interaction: deadline April 20th!
Lloyd Rutledge
RNA infrastructure / Sterna project
Hans Nederbragt
[Architecture diagram. Components: web interfaces; APIs; RNA toolset; repository connectors; rdf-stores (rdf-records, metadata, reference structures); data connectors; conversion; collections (records, unstructured files); legacy reference structures; local applications; content and metadata.]
Sterna project / RNA infrastructure
Launched in 2008, the Sterna project is an eContentplus best practice network that aims to
contribute to the further development of the European Digital Library initiative. Sterna’s
participants, mostly European institutions that are concerned with collecting and managing
content on biodiversity, wildlife and nature in general, join forces to explore new ways of
providing their content to the public. The project was initiated by the Netherlands natural
history museum Naturalis and major technical contributor Trezorix.
Sterna is short for Semantic web-based Thematic European Reference Network Application.
Sterna is also the scientific name for the bird genus of terns. Not coincidentally, because birds
are the central theme of Sterna with respect to the content that will be made accessible by
project partners via the semantic information network, which is a genuine RNA environment.
This content can be any type, from scientific articles and imagery to MP3 files of bird sounds,
field recordings and artefacts with bird feathers in them.
Multimodeling: The RNA environment is very flexible in the creation and use of different
data models. Harmonisation of data modeling focuses on the use of common properties,
rather than on trying to end up with one common data model.
Heterogeneous reference structures: In the RNA environment reference structures can
accommodate both reference items (SKOS concepts) and content items (XML and RDF structures).
Content items can be based on different data models; they can even combine different data
models.
Inferencing: In the RNA environment inferencing is used to create mappings based on schemas,
rather than mapping the data itself. Inferencing can also be applied to the heterogeneous
reference structures to realise interesting modes of findability, but this raises some difficult
questions as well.
www.sterna-net.eu
www.trezorix.nl
Stefano Bocconi
Entity-based data integration
• The concept of identity for entities: is identity between two entities a matter of context?
• Entity-based data integration, i.e. how to integrate different knowledge sources about the same entity. Need for:
– handling inconsistencies?
– a quality mechanism to discard less trustworthy information in case of conflicts?
• Identifier lifecycle: guarantee persistency of identifiers (e.g. duplicate cases)
• The domain is scientific publishing (particularly in Biology) and news publishing (event detection).
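A minimal sketch of one possible quality mechanism for the conflict question raised above: statements about the same entity are merged per property, and on conflict the value from the more trusted source is kept. The trust scores, source names and statements are hypothetical, and a fixed per-source score is only one of many ways to model trust.

```python
# Hypothetical trust scores per source, in [0, 1].
TRUST = {"publisher_db": 0.9, "news_feed": 0.6, "scraped_page": 0.3}

# Hypothetical statements about one entity: (source, property, value).
STATEMENTS = [
    ("publisher_db", "name", "J. Smith"),
    ("news_feed", "affiliation", "Institute A"),
    ("scraped_page", "affiliation", "Institute B"),
    ("scraped_page", "homepage", "http://example.org/~smith"),
]


def integrate(statements, trust):
    """Merge statements per property; on conflict keep the most trusted value.

    Returns the merged record and the list of conflicts that were seen, so
    that discarded information stays inspectable.
    """
    merged, conflicts = {}, []
    for source, prop, value in statements:
        if prop not in merged:
            merged[prop] = (value, source)
            continue
        kept_value, kept_source = merged[prop]
        if kept_value == value:
            continue
        conflicts.append((prop, kept_value, value))
        if trust[source] > trust[kept_source]:
            merged[prop] = (value, source)
    record = {prop: value for prop, (value, _) in merged.items()}
    return record, conflicts


record, conflicts = integrate(STATEMENTS, TRUST)
print(record)
print(conflicts)
```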

Lightning talks - 3
• Jellema
• Wang, Shenghui
• Tordai
• Groenouwe
• Groth
• Siebes
• Brussee
• Hollink
• Oren
• Jijkoun
• Schreiber
STITCH @ CATCH
SemanTic Interoperability To Cultural Heritage
• Thesaurus alignment techniques (lexical, structural, extensional and using background knowledge)
• Alignment deployment and evaluation in real-world scenarios (book reindexing and search, thesaurus merging and collection navigation, etc.)
• Challenges
– Heterogeneity
– Scalability
– Multilingualism
Shenghui Wang
Towards a Methodology for Vocabulary Alignment
E-Culture project: Semantic search engine for CH collections and vocabularies
We do not want to create new techniques.
We want to use existing techniques and their combinations.
• Select multiple alignment techniques
• Combine for higher recall
• Evaluate
• Apply disambiguation techniques to improve precision
Anna Tordai
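A minimal sketch of the four steps above on toy data: two simple alignment techniques are combined by union (raising recall), evaluated against a reference alignment, and then disambiguated to raise precision. The vocabularies, the crude plural stripping and the broader-term disambiguation rule are assumptions made for illustration, not the techniques evaluated in the project.

```python
# Hypothetical vocabularies: concept id -> label and broader term.
SOURCE = {
    "a:bank": {"label": "Bank", "broader": "furniture"},
    "a:prints": {"label": "Prints", "broader": "artworks"},
}
TARGET = {
    "b:bank-finance": {"label": "bank", "broader": "institutions"},
    "b:bank-seat": {"label": "bank", "broader": "furniture"},
    "b:print": {"label": "print", "broader": "artworks"},
}
REFERENCE = {("a:bank", "b:bank-seat"), ("a:prints", "b:print")}


def align(source, target, normalise):
    """Align concepts whose normalised labels are identical."""
    return {
        (s, t)
        for s, s_info in source.items()
        for t, t_info in target.items()
        if normalise(s_info["label"]) == normalise(t_info["label"])
    }


def lowercase(label):
    return label.lower()


def strip_plural(label):
    """Deliberately crude: lower-case and drop one trailing 's'."""
    label = label.lower()
    return label[:-1] if label.endswith("s") else label


def disambiguate(candidates, source, target):
    """When a source concept has several targets, prefer equal broader terms."""
    by_source = {}
    for s, t in candidates:
        by_source.setdefault(s, set()).add(t)
    result = set()
    for s, targets in by_source.items():
        if len(targets) > 1:
            agreeing = {t for t in targets
                        if target[t]["broader"] == source[s]["broader"]}
            targets = agreeing or targets
        result |= {(s, t) for t in targets}
    return result


def precision_recall(found, reference):
    """Assumes both sets are non-empty."""
    correct = len(found & reference)
    return correct / len(found), correct / len(reference)


lexical = align(SOURCE, TARGET, lowercase)
combined = lexical | align(SOURCE, TARGET, strip_plural)
final = disambiguate(combined, SOURCE, TARGET)
for name, found in [("lexical", lexical), ("combined", combined), ("final", final)]:
    print(name, precision_recall(found, REFERENCE))
```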
Game “SWiFT”: Semantic Web in Fast Translation
Chide Groenouwe, Jan Top, Mark van Assem
vrije Universiteit amsterdam
• Goal: Translate all information into high-quality SW representations.
• Problem: Not enough knowledge engineers, A.I. is too “stupid”.
• Towards Solution: Fostering capability in information creators.
• Means: multi-player online game SWiFT?
• Case studies: TIFN scientific collaboration (Jan Top et al), Wikipedia translation.
• Background information: Towards a Constitution Based Game for Fostering Fluency in “Semantic Web Writing”, http://km.aifb.uni-karlsruhe.de/ws/insemtive2008/
• Sign up to play! [email protected]
Paul Groth
From pipes.deri.org
From Chris Bizer
Triples
• Who’s responsible?
• How were they produced?
• Which ones should I trust?
Ronald Siebes
The web is not about anglebrackets
• RDF’s layering on top of XML is the single largest obstacle for the adoption of Semantic Web technology.
• Mismatch in datatypes: Trees vs. Graphs
• It stimulates bad practices: e.g. URIs
• Far too hard to read for human beings, NEED tools
• XML causes real practical problems
– No unique way to represent an RDF graph as an XML tree
– If you need an XML parser anyway, RDF becomes an extra burden
– Parsing is seriously inefficient or (worse) chokes your XML parser
• XML makes RDF unnecessarily hard to understand
– Try to find a tutorial RDF/XML example which is not a tree
– Obscures triple model
– Where does XML stop and RDF begin?
• Better alternatives exist: please make Turtle the official preferred RDF serialisation.
Rogier Brussee
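A minimal sketch of the trees-versus-graphs mismatch named on the slide: a small set of triples that contains a cycle and a node with two incoming edges, plus a check showing that it cannot be laid out as a single rooted tree. The sample triples and the is_tree helper are hypothetical illustrations.

```python
# Hypothetical triples; every line is one edge of the graph.
TRIPLES = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:alice"),  # cycle between alice and bob
    ("ex:carol", "foaf:knows", "ex:bob"),  # ex:bob has two incoming edges
]


def is_tree(triples):
    """True if the triples form one rooted tree.

    That requires exactly one root, at most one incoming edge per node,
    and a path from the root to every other node.
    """
    nodes = {n for s, _, o in triples for n in (s, o)}
    parents = {}
    for s, _, o in triples:
        if o in parents:  # second incoming edge
            return False
        parents[o] = s
    roots = nodes - parents.keys()
    if len(roots) != 1:
        return False
    (root,) = roots
    for node in nodes:
        steps = 0
        while node != root:  # walk up; a cycle never reaches the root
            node = parents[node]
            steps += 1
            if steps > len(nodes):
                return False
    return True


print(is_tree(TRIPLES))
print(is_tree([("ex:a", "ex:p", "ex:b"), ("ex:a", "ex:p", "ex:c")]))
```

A triple-per-statement syntax such as Turtle or N-Triples can list the three edges directly, whereas a tree serialisation has to pick a root and break the remaining edges into references.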
MuNCH
Multimedia Analysis, Semantic Technologies, Language Technology
Enrich thesaurus structure
sidewalk – pavement
sand – concrete
pearls – jewelry
fjords – seas
barbecues – picnics
queens – aristocrats
acupuncture – negotiation
User Behavior In Audiovisual Archive
- Does the thesaurus represent the user queries?
- Popular programs over time.
- Can we use queries to automatically annotate shots?
Eyal Oren
Large-scale distributed RDF(S) reasoning
http://larkc.eu/marvin
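A minimal single-machine sketch of RDFS reasoning by forward chaining, limited to two standard entailment rules: subclass transitivity (rdfs11) and type inheritance through subclasses (rdfs9). It only illustrates what gets derived; the distribution and scaling aspects of the system in the talk are not modelled, and the sample triples are hypothetical.

```python
SUBCLASS, TYPE = "rdfs:subClassOf", "rdf:type"


def rdfs_closure(triples):
    """Apply rdfs9 and rdfs11 repeatedly until no new triple is derived."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                if o != s2 or p2 != SUBCLASS:
                    continue
                if p == SUBCLASS:  # rdfs11: subClassOf is transitive
                    new.add((s, SUBCLASS, o2))
                elif p == TYPE:    # rdfs9: instances inherit superclasses
                    new.add((s, TYPE, o2))
        if new <= closure:  # fixpoint reached
            return closure
        closure |= new


# Hypothetical input triples.
DATA = {
    ("ex:Painter", SUBCLASS, "ex:Artist"),
    ("ex:Artist", SUBCLASS, "ex:Person"),
    ("ex:rembrandt", TYPE, "ex:Painter"),
}

for triple in sorted(rdfs_closure(DATA) - DATA):
    print(triple)
```

The nested loop is quadratic per round, which is fine for a toy example and is exactly the cost that large-scale approaches try to avoid.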
Valentin Jijkoun
• Web service for text information processing:
– Extraction (terms, names, reported speech,…)
– Cross-document name normalization and linking
– Analysis (compare, track dynamic changes)
• Protocols and standards
– REST (HTTP POST/GET) + XML
– SOAP
– RDF/XML (on demand)
• Basic application loop:
– Upload your documents (text, html, pdf,…)
– Specify processing type
– Access the results of processing
• Example: what themes played in Dutch news in the past month around Ikea?
Questions/contacts UvA:
• Maarten de Rijke
• Valentin Jijkoun
• mdr,[email protected]
• http://ilps.science.uva.nl
Guus Schreiber
http://www.europeana.eu/portal/thought-lab.html