INFORMATION RETRIEVAL & NATURAL LANGUAGE PROCESSING

SEMANTIC WEB
• What is the Semantic Web?
• Why the Semantic Web is needed
• Introduction to XML
• XML Documents
• Semantically structured XML
• Examples of interfaces
• Questions
WHAT IS THE SEMANTIC WEB?
• Built on RDF (Resource Description Framework) and its syntaxes
• Puts data on the web together with its meaning
• Information is linked up so that data can be shared & reused on a global scale
• Can be thought of as a globally linked database
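To make the "globally linked database" idea concrete, here is a minimal sketch in plain Python that stores RDF-style statements as (subject, predicate, object) triples and follows a link from one statement to the next. It is not a real RDF library. Every example.org URI, the name "Rex" and the helper names are assumptions invented for illustration; only http://aaronsw.com/ appears in these slides.

```python
# Hypothetical predicate URIs; none of these come from a real vocabulary.
OWNS = "http://example.org/terms/owns"
TYPE = "http://example.org/terms/type"
NAME = "http://example.org/terms/name"

# Each statement is a (subject, predicate, object) triple.
TRIPLES = [
    ("http://aaronsw.com/", OWNS, "http://example.org/pets/dog1"),
    ("http://example.org/pets/dog1", TYPE, "http://example.org/terms/Dog"),
    ("http://example.org/pets/dog1", NAME, "Rex"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def names_of_things_owned_by(triples, owner):
    """Follow owner -> owned thing -> name, i.e. two linked statements."""
    names = []
    for (_, _, thing) in match(triples, subject=owner, predicate=OWNS):
        names.extend(o for (_, _, o) in match(triples, subject=thing, predicate=NAME))
    return names


owned_names = names_of_things_owned_by(TRIPLES, "http://aaronsw.com/")
```

Because subjects and objects are URIs, statements published by different sources can refer to the same thing, which is what allows the data to be shared & reused.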
WHY THE SEMANTIC WEB IS NEEDED
• Designed to meet the large-scale challenges of electronic publishing
• Makes it easier to publish data on a global scale
• Offers an opportunity for better information retrieval
INTRODUCTION TO XML
• Extensible Markup Language (XML) is a simple & flexible text format
• Derived from SGML (Standard Generalized Markup Language)
• Designed to meet the large-scale challenges of electronic publishing
• Expresses the semantics of data in a documented, structured, machine-readable format
XML DOCUMENTS
• The main purpose of XML is to exchange documents in a simple way. Anyone can design their own document format
• Programs can read & understand XML documents. Take a plain text document as an example:
‘I just got a new pet dog’
Using XML:
<sentence>
  <person href="http://aaronsw.com/">I</person> just got a new pet <animal>dog</animal>.
</sentence>
• Contents are labelled: each element consists of an opening tag, such as <sentence>, and a closing tag, such as </sentence>
• URIs are used to uniquely identify specific markup elements
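The sketch below shows how a program might read the labelled sentence, using Python's standard xml.etree.ElementTree module. The function name describe is an assumption chosen for illustration; the markup is the example from the slide.

```python
import xml.etree.ElementTree as ET

# The sentence from the slide, written on one logical line.
SENTENCE_XML = (
    "<sentence>"
    '<person href="http://aaronsw.com/">I</person> just got a new pet '
    "<animal>dog</animal>."
    "</sentence>"
)


def describe(xml_text):
    """Return the labelled parts, the plain text and the person's URI."""
    root = ET.fromstring(xml_text)
    # Map each child tag to the text it labels.
    labels = {child.tag: child.text for child in root}
    # Joining all text nodes recovers the original plain sentence.
    plain_text = "".join(root.itertext())
    # The href attribute carries the URI that identifies the person.
    person_uri = root.find("person").get("href")
    return labels, plain_text, person_uri


labels, plain_text, person_uri = describe(SENTENCE_XML)
```

Here labels maps "person" to "I" and "animal" to "dog", so a program can tell which word names the person and which names the animal. The plain sentence alone does not expose this.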
• Tag names are disambiguated: each tag has its own URI
• URIs are assigned to each element and attribute by the use of XML Namespaces
• RDF is used to give meaning to words or sentences
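As an illustration of giving each tag its own URI, the sketch below declares two XML namespaces that both define a title tag and shows that a parser keeps them apart. The namespace URIs, the prefixes and the element contents are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs for two unrelated vocabularies.
BOOK_NS = "http://example.org/book#"
JOB_NS = "http://example.org/job#"

DOC = (
    f'<record xmlns:book="{BOOK_NS}" xmlns:job="{JOB_NS}">'
    "<book:title>An Example Book</book:title>"
    "<job:title>Editor</job:title>"
    "</record>"
)


def qualified_tags(xml_text):
    """Return (tag, text) pairs; ElementTree expands prefixes to {uri}name."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]


tags = qualified_tags(DOC)

# Look up one specific "title" by mapping a prefix to its namespace URI.
root = ET.fromstring(DOC)
job_title = root.find("job:title", {"job": JOB_NS}).text
```

Both elements are called title in the markup, but after parsing their tags differ because each carries the URI of its namespace.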
SEMANTICALLY STRUCTURED XML
Semantically structured XML can be used in information retrieval to increase recall & precision in the following ways:
• Documents can be grouped by schema
• Words that have more than one meaning are distinguished by the way they are marked up
• Queries can make use of rich data types
• Relevant parts of a document can be returned rather than the whole document
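The four points above can be sketched with a toy collection of XML documents. This is only an illustration, not a retrieval system. The documents, tag names, prices and function names are all assumptions, and the root tag is used as a stand-in for a schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents; "jaguar" is an animal in one and a car in the other.
DOCS = {
    "zoo.xml": (
        "<article><p>The zoo has a new <animal>jaguar</animal>.</p>"
        "<p>Opening hours are unchanged.</p></article>"
    ),
    "cars.xml": (
        "<listing>"
        '<item><car>jaguar</car><price currency="GBP">42000</price></item>'
        '<item><car>mini</car><price currency="GBP">15000</price></item>'
        "</listing>"
    ),
}


def group_by_root(docs):
    """Group document names by root tag (a stand-in for their schema)."""
    groups = {}
    for name, xml_text in docs.items():
        groups.setdefault(ET.fromstring(xml_text).tag, []).append(name)
    return groups


def find_tagged(docs, tag, word):
    """Find a word only where it is labelled with the given tag."""
    hits = []
    for name, xml_text in docs.items():
        for elem in ET.fromstring(xml_text).iter(tag):
            if (elem.text or "").strip().lower() == word.lower():
                hits.append((name, elem.text))
    return hits


def items_cheaper_than(docs, limit):
    """Compare <price> as a number and return only the matching <item> parts."""
    fragments = []
    for xml_text in docs.values():
        for item in ET.fromstring(xml_text).iter("item"):
            price = item.find("price")
            if price is not None and float(price.text) < limit:
                fragments.append(ET.tostring(item, encoding="unicode"))
    return fragments


groups = group_by_root(DOCS)
animal_hits = find_tagged(DOCS, "animal", "jaguar")
car_hits = find_tagged(DOCS, "car", "jaguar")
cheap_items = items_cheaper_than(DOCS, 20000)
```

A plain full-text search for "jaguar" would return both documents. Searching by tag returns only the intended sense, which is how markup can raise precision.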
EXAMPLES OF INTERFACES
• The first example shows an unstructured full-text query submitted by the user
• The information is searched
• A search form is presented to the user
• The form contains a list of schemas; the user selects the relevant one
• Blank entry fields for the user to fill out to refine the search
• The result is a fully structured query
• The user is given the results
SUMMARY
Why XML?
• Ideal for using richly structured documents over the web
• The development of XML helped make the Semantic Web achievable
• Provides the syntactic foundation
• Opens new possibilities for interaction & communication combined
• Creates opportunities for large-scale decentralization
QUESTION?
• What will be the significance of the Semantic Web in the future?