Meta Tagging / Metadata

Lindsay Berard
Assisted by: Li Li
Research Question
• Define and describe state-of-the-art meta tagging
technologies, with at a minimum a full outline of
the Dublin Core proposal.
• Describe how meta tagging should be
implemented to create new value in the online
information enterprise.
Meta Tagging
Slide 1 of 20
Definitions
• Metadata: data that describes other data; it can describe what
a document is, how it was created, how it should be used, and
much more
• Meta tagging: a convention for storing metadata, usually
embedded in an HTML document between the head tags and
hidden from users
• Both describe the content of pages in machine-readable form
• Example of a meta tag:
<meta name="title" content="Lindsay's PowerPoint" />
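To make "machine readable" concrete, here is a minimal sketch of how a program might read meta tags out of a page. It uses Python's standard html.parser module. The class name, function name, sample page and the "description" value are assumptions made for illustration, not part of the original slides.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name is not None and content is not None:
            self.metadata[name] = content


def extract_metadata(html_text):
    """Return a dict of meta tag names to their content values."""
    parser = MetaTagCollector()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


# Hypothetical sample page; only the "title" tag comes from the slide.
PAGE = """<html><head>
<meta name="title" content="Lindsay's PowerPoint" />
<meta name="description" content="Slides on meta tagging" />
</head><body><p>Visible text</p></body></html>"""

print(extract_metadata(PAGE))
```

The visible body text is ignored; only what the author declared in the head is returned, which is the point of meta tagging.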
Meta Tagging – Past Use
• Meta tags were once used mainly for search engine
optimization: search engines used the “keywords” and
“description” tags to index pages
• Because people abused the “keywords” meta tag, it has
become nearly useless and most search engines no longer
look at it
• “Keywords” and “description” can still be useful for internal
search
Metadata Future – The Semantic Web
• Metadata is one of the backbones of the Semantic Web
• A key feature of the Semantic Web is software agents
that can use the web on your behalf
• For this to work, these software agents must be able
to read and understand the content of web pages and
recognize relationships between data
• Metadata makes this possible
How to get to the Semantic Web
• People need to start using metadata when they create web
pages so that computers can understand the information
within those pages; this will take a lot of work at first
• However, the process also needs to be easy, otherwise people
won’t do it
• For example, people enjoy tagging photos on Flickr or
articles on Digg because it is so simple
• Implementing metadata on one’s web site must be as easy
as this, and users must understand the reason behind the work
they are doing
Ways to Use Metadata
• Three options:
• Put metadata in XHTML meta tags
• Put metadata in a separate RDF document and link it to the
XHTML
• Put metadata in a separate XML document and link it to the
XHTML
• In the future, we will likely move away from putting the
metadata directly in the XHTML page: there will be so many
tags and so much metadata that a separate document will be
cleaner (much like separating design from content with CSS)
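The "separate RDF document" option can be sketched as follows: a minimal example, assuming Dublin Core properties in RDF/XML and a link element in the XHTML head that points at the file. The function names, the example URL and the file name are hypothetical, and the rel="alternate" link form is one convention among several, not something the slides prescribe.

```python
import xml.etree.ElementTree as ET
from html import escape

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def build_rdf(page_url, properties):
    """Return an RDF/XML string describing page_url with Dublin Core properties."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(root, f"{{{RDF_NS}}}Description")
    description.set(f"{{{RDF_NS}}}about", page_url)
    for element, value in properties.items():
        ET.SubElement(description, f"{{{DC_NS}}}{element}").text = value
    return ET.tostring(root, encoding="unicode")


def link_tag(rdf_href):
    """Return the XHTML <link> that points a page at its metadata file."""
    href = escape(rdf_href, quote=True)
    return f'<link rel="alternate" type="application/rdf+xml" href="{href}" />'


# Hypothetical page and metadata values, for illustration only.
print(build_rdf("http://example.org/page.html",
                {"title": "Meta Tagging / Metadata", "creator": "Lindsay Berard"}))
print(link_tag("page-metadata.rdf"))
```

The page itself then carries a single link element, and the metadata can grow in its own file without cluttering the markup.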
Five Types of Metadata
• Descriptive: helps the end user, describes the content of the page
• Administrative: helps with the management of the page
• Technical: specifications such as file size and format
• Structural: defines relationships between objects
• Preservation: details how the page should be maintained in
the future
Levels of Metadata
• Deciding how granular metadata should be is essential
• Granularity determines how a page or object can be found or
searched, and how relationships between objects are described
• All objects using metadata must meet at least a minimum
standard of detail; the level of granularity beyond that will vary
from project to project
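A "minimum standard of detail" could be enforced with a simple check like the sketch below. The required fields and the sample records are assumptions chosen for illustration; a real project would define its own list.

```python
# Hypothetical project policy: these required fields are NOT from the slides.
REQUIRED_FIELDS = ("title", "creator", "date")


def missing_required(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in record."""
    return [field for field in required
            if not str(record.get(field, "")).strip()]


# Placeholder records: one goes beyond the minimum, one falls short of it.
detailed = {
    "title": "Campus photo",
    "creator": "Lindsay Berard",
    "date": "2000-01-01",      # placeholder value
    "format": "image/jpeg",    # extra detail beyond the minimum
}
sparse = {"title": "Untitled", "creator": "  "}

print(missing_required(detailed))  # []
print(missing_required(sparse))    # ['creator', 'date']
```

Extra fields such as format are allowed; the check only guarantees the floor, which leaves each project free to go more granular.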
Effective Metadata
• To be effective, there must be a universal standard that
uses a controlled vocabulary
• Example: mediaType could be “photo”, “image”, “picture”,
“photograph” – there needs to be a controlled vocabulary so
that everyone describes things in the same way
• Without a controlled vocabulary, objects created by different
users cannot reliably interact with or relate to one another
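The mediaType example can be sketched as a small normalizer that maps every accepted variant to one preferred term. Choosing "image" as the preferred term, and the names used here, are assumptions for illustration only.

```python
# Hypothetical controlled vocabulary: preferred term -> accepted variants.
MEDIA_TYPE_VOCABULARY = {
    "image": {"photo", "picture", "photograph"},
}


def build_lookup(vocabulary):
    """Map every variant (and the preferred term itself) to the preferred term."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        for term in variants | {preferred}:
            lookup[term.lower()] = preferred
    return lookup


LOOKUP = build_lookup(MEDIA_TYPE_VOCABULARY)


def normalize_media_type(value):
    """Return the preferred term, or raise ValueError for unknown values."""
    key = value.strip().lower()
    if key not in LOOKUP:
        raise ValueError(f"{value!r} is not in the controlled vocabulary")
    return LOOKUP[key]


for raw in ("photo", "Picture", " photograph ", "image"):
    print(raw, "->", normalize_media_type(raw))
```

Rejecting unknown values instead of passing them through is a deliberate choice in this sketch: a vocabulary is only "controlled" if new terms have to be added on purpose.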
Metadata Standards
• Standards are also needed for the tags themselves
• Example: to describe the person who created the object you
could use the tags “author”, “creator”, “writer”
• There needs to be a standard so that objects across the web
can interact with one another
• Which standard should you use?
Dublin Core Metadata Initiative - Background
• Named for Dublin, Ohio, not Dublin, Ireland
• Some great minds met and started talking about semantics
and the web at the International World Wide Web Conference
in 1994
• A mix of people from industry and people involved with
OCLC (Online Computer Library Center) and NCSA
(National Center for Supercomputing Applications)
• Even in 1994 they realized how hard it was to find
resources and relate them to one another
• NCSA and OCLC held a joint workshop on
metadata and semantics in Dublin, Ohio in 1995, and DCMI
was born!
Dublin Core Metadata Initiative
• Part of the push towards the semantic web
• Premise: computers need to know what pages MEAN in
order to give users what they want and need
• Dublin Core provides a way to catalog objects on
the web
• Dublin Core tags look much like regular meta tags but are
more standardized: they use standard formats and
controlled vocabularies
• For example, subject should come from a controlled vocabulary
such as the Library of Congress Subject Headings
DCMI – Core Elements
• Contributor
• Publisher
• Coverage
• Relation
• Creator
• Rights
• Date
• Source
• Description
• Subject
• Format
• Title
• Identifier
• Type
• Language
DCMI – Element Recommendation
Dublin Core Metadata Initiative
• Dublin Core tags try to describe the content itself, as
opposed to the usual practice of listing keywords in meta tags
• DCMI has defined a set of 15 core metadata elements
and provides recommendations on their use
• Example of a Dublin Core meta tag (to be used in an HTML page):
<meta name="DC.Creator" content="Lindsay Berard" />
Also see: http://www.education.vic.gov.au/beyondschool/tafe/default.htm
• Can either manually add DCMI metadata, or use one of the
many generators that have been created
Example: Dublin Core Meta Tag Generator:
http://www.ukoln.ac.uk/cgi-bin/dcdot.pl
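What such a generator does can be sketched in a few lines. This is a minimal illustration, not the DC-dot tool's actual logic: the function name and sample record are hypothetical, the capitalized "DC.Creator" naming follows the example tag above, and the leading schema.DC link line is an assumption based on DCMI's guidance for HTML encodings.

```python
from html import escape

# The 15 Dublin Core elements listed on the core elements slide.
DC_ELEMENTS = (
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
)


def dc_meta_tags(record):
    """Return HTML meta tags for a dict of Dublin Core element -> value."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />']
    for element, value in record.items():
        if element.lower() not in DC_ELEMENTS:
            raise ValueError(f"{element!r} is not a Dublin Core element")
        name = "DC." + element.capitalize()
        # Escape the value so quotes or angle brackets cannot break the tag.
        content = escape(value, quote=True)
        lines.append(f'<meta name="{name}" content="{content}" />')
    return "\n".join(lines)


# Hypothetical record for illustration.
print(dc_meta_tags({
    "title": "Meta Tagging / Metadata",
    "creator": "Lindsay Berard",
    "language": "en",
}))
```

Rejecting names outside the 15 elements keeps hand-entered records from drifting back to ad hoc tags such as "author" or "writer".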
NewsML
• There are MANY metadata standards out there
right now – which creates a problem because they
are not generally interoperable
• NewsML: an extremely complex XML standard for
describing news and all of its components
(the standard runs to more than 200 pages!)
• Started by Reuters, then handed over to the
International Press Telecommunications Council
• Great for news organizations that need to transfer
news documents described in great
detail; however, the learning curve for
NewsML is quite steep
Metadata and Newshouse
• Newshouse must have a metadata standard to
follow
• Should create a Metadata Standards Document that
explains which standard to follow and how to use
metadata, whether within the XHTML
document itself or linked via RDF or XML
• A metadata generator that complies with
the Newshouse standards would be extremely
helpful to those using the site
• Simple interface where the user just fills in
fields or selects from drop-downs