Some questions
- What is metadata?
- Data about data
Some questions
- How do we know it is
metadata?
- Intuition or marked as metadata
Some questions
- How does a machine know that
it reads metadata?
- Marked as metadata, formalized
in e.g. RDF(S) or OWL
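
To make "marked as metadata" concrete, here is a minimal sketch of statements written as subject-predicate-object triples, the basic shape RDF uses. The Dublin Core namespace is a real metadata vocabulary; the document URI, the title, and the helper names `to_ntriples` and `metadata_of` are assumptions made up for illustration.

```python
# Dublin Core element namespace (a real, widely used metadata vocabulary).
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical document URI, used only for illustration.
DOC = "http://example.org/doc/1"

# Each statement is a (subject, predicate, object) triple.
# The predicate is what tells a machine "this value is the title".
triples = [
    (DOC, DC + "title", "Example Document"),
    (DOC, DC + "creator", "Christopher Thomas"),
]

def to_ntriples(triples):
    """Serialize triples with literal objects in N-Triples style."""
    lines = []
    for s, p, o in triples:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return lines

def metadata_of(subject, triples):
    """Collect every property that was explicitly marked on a subject."""
    return {p: o for s, p, o in triples if s == subject}

for line in to_ntriples(triples):
    print(line)
```

A program that reads these lines does not have to guess which string is the title: the predicate URI says so.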
Some questions
- How can we extract metadata?
- Manually
- Known places in structured
documents
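
Extraction from a known place can be sketched as follows, assuming a hypothetical XML layout in which all metadata lives in a `<head>` element. The element names and the sample values are invented for illustration; only the standard-library parser is real.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the <head> block is the "known place" for metadata.
XML = """
<document>
  <head>
    <title>Semantic Annotation</title>
    <author>Christopher Thomas</author>
  </head>
  <body>Free text that is much harder to mine for metadata.</body>
</document>
"""

def extract_metadata(xml_text):
    """Return {tag: text} for every child of <head>, or {} if there is no <head>."""
    root = ET.fromstring(xml_text.strip())
    head = root.find("head")
    if head is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in head}

metadata = extract_metadata(XML)
print(metadata)
```

The body text is ignored on purpose: this approach only works because the structure tells us where to look.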
Some questions
- How can we use metadata?
- Annotate data
- Finding relationships (later)
Some questions
- How do we annotate data with
metadata?
- Manually (e.g. write XML tags)
- Identify instances automatically,
then machine annotates
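
The automatic route can be sketched as a lookup against a list of known names (a gazetteer), followed by wrapping each hit in an XML tag. The gazetteer entries, the type names `Person` and `Organization`, and the function `annotate` are assumptions for illustration, not a description of any particular annotation tool.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical gazetteer: surface form -> entity type.
GAZETTEER = {
    "Christopher Thomas": "Person",
    "LSDIS": "Organization",
}

def annotate(text, gazetteer):
    """Wrap every known surface form in an XML tag named after its type."""
    # Longest names first, so longer matches win over their substrings.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(n) for n in names))
    out, last = [], 0
    for m in pattern.finditer(text):
        out.append(escape(text[last:m.start()]))   # untouched text before the hit
        tag = gazetteer[m.group(0)]
        out.append(f"<{tag}>{escape(m.group(0))}</{tag}>")
        last = m.end()
    out.append(escape(text[last:]))                # remaining tail
    return "".join(out)

annotated = annotate("Christopher Thomas is listed in the LSDIS ontology.", GAZETTEER)
print(annotated)
```

Plain string matching like this identifies names but cannot tell which entity a name refers to.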
Some questions
- Problems with automatic identification
- Disambiguation
- Same name, different entities
- Which “Christopher Thomas”?
- Same entity, different role
- “Christopher Thomas” can be an entity in the
LSDIS ontology and also in the Friendster
FOAF ontology. The two entries are not yet merged.
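
One simple way to approach "Which Christopher Thomas?" is to compare the words around the name with words associated with each candidate. The sketch below assumes two invented candidate records; the URIs, the context words, and the scoring rule are all hypothetical, and real systems use much richer evidence.

```python
import re

# Hypothetical candidates that share one name; URIs and context words are invented.
CANDIDATES = [
    {"uri": "http://example.org/lsdis#ChristopherThomas",
     "name": "Christopher Thomas",
     "context": {"ontology", "semantic", "research"}},
    {"uri": "http://example.org/foaf#ChristopherThomas",
     "name": "Christopher Thomas",
     "context": {"friend", "profile", "network"}},
]

def disambiguate(name, sentence, candidates):
    """Return the URI whose context words overlap the sentence most, or None if undecided."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = sorted(
        ((len(c["context"] & words), c["uri"]) for c in candidates if c["name"] == name),
        reverse=True,
    )
    if not scores or scores[0][0] == 0:
        return None                      # no evidence at all
    if len(scores) > 1 and scores[0][0] == scores[1][0]:
        return None                      # tie: refuse to guess
    return scores[0][1]

print(disambiguate("Christopher Thomas",
                   "Christopher Thomas published research on ontology alignment",
                   CANDIDATES))
```

Returning `None` when the evidence is missing or tied keeps the machine from silently annotating the wrong entity.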
Taxonomies
- What is a taxonomy?
- From the Greek ταξινομία, formed from the words
taxis = order and nomos = law
- Hierarchical classification of things
- Mathematically, a taxonomy is a tree structure
of classifications for a given set of objects
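
The tree view of a taxonomy can be sketched with a child-to-parent map. The class names below are a made-up example, and `path_to_root` and `is_a` are illustrative helper names.

```python
# Hypothetical taxonomy stored as child -> parent; the root has parent None.
PARENT = {
    "Thing": None,
    "Animal": "Thing",
    "Mammal": "Animal",
    "Bird": "Animal",
    "Dog": "Mammal",
}

def path_to_root(node, parent):
    """List the node and all of its ancestors, most specific first."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

def is_a(node, ancestor, parent):
    """True if `ancestor` lies on the path from `node` up to the root."""
    return ancestor in path_to_root(node, parent)

print(path_to_root("Dog", PARENT))
print(is_a("Bird", "Mammal", PARENT))
```

Because every node has exactly one parent, each object has a single path to the root, which is what makes the structure a tree.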
Ontologies
- What is an Ontology?
- In computer science, an ontology is an attempt
to formulate an exhaustive and rigorous
conceptual schema for a given domain: a
typically hierarchical data structure containing
all the relevant entities, their relationships,
and the rules (theorems, regulations) that hold
within that domain
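
What an ontology adds over a taxonomy (relationships plus rules) can be sketched with a handful of facts and two inference rules. The facts and the name `alice` are invented; the two rules (subclass transitivity, and instances inheriting superclass membership) follow the usual RDFS reading of subclassing, but the code is only a toy forward-chainer, not an RDFS or OWL reasoner.

```python
# Hypothetical mini-ontology as (subject, relation, object) facts.
facts = {
    ("Professor", "subClassOf", "Person"),
    ("Person", "subClassOf", "Agent"),
    ("alice", "type", "Professor"),
}

def infer(facts):
    """Apply two rules until no new fact appears:
    1. subClassOf is transitive;
    2. an instance of a class is also an instance of its superclasses.
    """
    facts = set(facts)
    while True:
        new = set()
        for (a, p, b) in facts:
            for (c, q, d) in facts:
                if b != c:
                    continue
                if p == "subClassOf" and q == "subClassOf":
                    new.add((a, "subClassOf", d))
                if p == "type" and q == "subClassOf":
                    new.add((a, "type", d))
        if new <= facts:          # nothing new was derived: fixed point reached
            return facts
        facts |= new

closed = infer(facts)
print(("alice", "type", "Agent") in closed)
```

The derived fact that `alice` is an `Agent` was never stated; it follows from the rules, which is the part a plain hierarchy cannot express.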
Machine Learning
- What is Machine Learning?
- an area of artificial intelligence concerned with
the development of techniques which allow
computers to "learn"
Machine Learning techniques
– supervised learning --- where the algorithm
generates a function that maps inputs to desired
outputs. One standard formulation of the
supervised learning task is the classification
problem: the learner is required to learn (to
approximate the behavior of) a function which
maps a vector into one of several classes by
looking at several input-output examples of the
function.
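
The classification problem described above can be sketched with a 1-nearest-neighbour learner: it "approximates the function" by returning the class of the closest known example. The training points and class names are invented, and nearest neighbour is just one convenient choice of technique.

```python
import math

# Hypothetical labelled input-output examples: (feature vector, class).
TRAIN = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.5), "B"),
]

def classify(x, examples):
    """Return the class of the training example closest to x (Euclidean distance)."""
    _, label = min(examples, key=lambda ex: math.dist(x, ex[0]))
    return label

print(classify((1.2, 1.4), TRAIN))
print(classify((5.5, 5.0), TRAIN))
```

The learner never sees the true function, only the input-output pairs in `TRAIN`.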
Machine Learning techniques
– unsupervised learning --- which models a set
of inputs when labeled examples are not available.
– reinforcement learning --- where the
algorithm learns a policy of how to act given an
observation of the world. Every action has
some impact in the environment, and the
environment provides feedback that guides the
learning algorithm.
Machine Learning techniques
• Classification
– Supervised Learning
– Reinforcement Learning
– Artificial Neural Networks
– Nearest Neighbor/Bayesian approaches
• Group entities around a point of reference
Machine Learning techniques
• Clustering
– Unsupervised
– Try to find functions that split a dataset into
groups in a meaningful way
– Needs an evaluation function that tells
what is meaningful and what is not.
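
A minimal sketch of clustering with an explicit evaluation function, using k-means as the assumed technique. "Meaningful" is defined here as a low within-cluster sum of squared distances; both that choice and the sample points and starting centroids are assumptions for illustration.

```python
import math

def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        i = min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        clusters[i].append(p)
    return clusters

def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # Keep the old centroid when a cluster ends up empty.
        centroids = [mean(c) if c else old for c, old in zip(clusters, centroids)]
    return assign(points, centroids), centroids

def evaluate(clusters, centroids):
    """Evaluation function: within-cluster sum of squared distances (lower is better)."""
    return sum(math.dist(p, c) ** 2 for cl, c in zip(clusters, centroids) for p in cl)

# Hypothetical unlabelled data and starting centroids.
POINTS = [(1.0, 1.0), (1.0, 2.0), (5.0, 5.0), (6.0, 5.0)]
clusters, centroids = kmeans(POINTS, centroids=[(0.0, 0.0), (6.0, 6.0)])
score = evaluate(clusters, centroids)
print(clusters, centroids, score)
```

No labels are used anywhere; the evaluation function alone decides which split counts as good, so a different function would favour a different grouping.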
