Digital Enterprise Research Institute
www.deri.ie
Anatomy of a Semantic Virus
Peyman Nasirifard
[email protected]
Nature inspired Reasoning for the Semantic Web (NatuReS)
7th International Semantic Web Conference (ISWC 2008)
Karlsruhe, Germany
27th October 2008
© 2008 Digital Enterprise Research Institute. All rights reserved.
What Do We Have Now?
We currently have Semantic-Web-oriented applications and APIs
– Semantic digital libraries
– SIOC-enabled shared workspaces
– Semantic URL-shortening tools
– Semantic wikis
– Semantic blogs
– Lots more...
The applications "talk" in RDF
– Importing and exporting RDF
These applications partly provide "food" for Semantic search engines
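
As an illustration of what "importing and exporting RDF" can look like, here is a minimal sketch that writes and reads the IRI-only subset of the N-Triples syntax. The `http://example.org/` IRIs and the function names are assumptions made for this example; they are not taken from the talk, and real applications would normally use an RDF library.

```python
import re

# Hypothetical namespace for illustration only.
EX = "http://example.org/"


def export_ntriples(triples):
    """Serialize (subject, predicate, object) IRI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Matches one N-Triples statement whose three terms are all IRIs.
_LINE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.$")


def import_ntriples(text):
    """Parse the IRI-only subset of N-Triples produced by export_ntriples."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = _LINE.match(line)
        if match is None:
            raise ValueError(f"unsupported N-Triples line: {line!r}")
        triples.append(match.groups())
    return triples


# A made-up blog post described by one application...
blog_post = [
    (EX + "post/1", EX + "hasCreator", EX + "user/alice"),
    (EX + "post/1", EX + "hasTopic", EX + "topic/rdf"),
]
exported = export_ntriples(blog_post)

# ...and read back by another application.
reimported = import_ntriples(exported)
```

Literals and blank nodes are deliberately left out to keep the sketch short.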
What Do We Have Now? (2)
Semantic-Web-oriented researchers (including me :-) encourage others
– Use RDF, Publish RDF, Talk RDF!
Semantic search engines
– Find RDF-related material on the Web
– Index it
– Query and reason over the data
Semantic search engines are RDF-hungry
– "Submit RDF to us"
  – Crawl the deep Web
– "Tell us where you saw an RDF document"
– They monitor services like "pingthesemanticweb.com"
What Do We Have Now? (3)
Users can submit their RDF data using services like Ping The Semantic Web (PTSW)
The PTSW feeds are then used further
– Search engines follow the links and index the RDF data
We have services like DBpedia
– DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web (source: http://dbpedia.org/About)
– Can be used for reasoning
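
The submit-and-index flow described on this slide can be sketched with a toy, in-memory model. Everything here is an assumption made for illustration: the `PingService` and `SearchEngine` classes, the simulated Web and its URLs do not reflect the actual PTSW API or any real search engine. The point is only that nothing in this flow checks whether a submitted triple is true.

```python
from collections import deque

# Simulated Web: URL -> triples found in the RDF document at that URL.
# All URLs and triples are made up for illustration.
FAKE_WEB = {
    "http://example.org/alice.rdf": [("Alice", "knows", "Bob")],
    "http://example.org/zoo.rdf": [("Lion", "eats", "Meat")],
}


class PingService:
    """Stand-in for a ping service: it only records submitted URLs."""

    def __init__(self):
        self.feed = deque()

    def ping(self, url):
        self.feed.append(url)


class SearchEngine:
    """Follows pinged links and indexes every triple it finds, unchecked."""

    def __init__(self):
        self.index = {}  # subject -> list of (predicate, object, source URL)

    def crawl(self, ping_service):
        while ping_service.feed:
            url = ping_service.feed.popleft()
            for s, p, o in FAKE_WEB.get(url, []):
                self.index.setdefault(s, []).append((p, o, url))


ptsw = PingService()
ptsw.ping("http://example.org/alice.rdf")
ptsw.ping("http://example.org/zoo.rdf")

engine = SearchEngine()
engine.crawl(ptsw)
```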
Real World
Common sense facts
– Milk is white
– Lions eat meat
The Web (e.g. Wikipedia) is for humans, whereas the Semantic Web (e.g. DBpedia) aims to be for machines.
Humans have the judgment to recognize ridiculous "common sense" facts, but machines cannot detect them and will use them in reasoning.
Do you trust Wikipedia articles?
– How much?
– Why is Wikipedia not cited in scientific articles?
What about DBpedia?
– Can we really benefit from the generated RDF?
– If we cannot trust Wikipedia articles, how can we use DBpedia for further reasoning?
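
To make the "machines will use them in reasoning" point concrete, here is a minimal sketch of a naive forward-chaining reasoner. The two rules, the predicate names and the injected triple are invented for this example; they are not taken from the talk or from DBpedia. The reasoner has no notion of plausibility, so one false triple produces further false conclusions.

```python
# Hypothetical rules: (if predicate, if object) -> (then predicate, then object).
RULES = [
    (("eats", "Grass"), ("type", "Herbivore")),
    (("type", "Herbivore"), ("isDangerousTo", "Nobody")),
]


def forward_chain(triples, rules):
    """Apply the rules repeatedly until no new triple can be derived."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in list(facts):
            for (if_pred, if_obj), (then_pred, then_obj) in rules:
                if (predicate, obj) == (if_pred, if_obj):
                    derived = (subject, then_pred, then_obj)
                    if derived not in facts:
                        facts.add(derived)
                        changed = True
    return facts


honest = [("Lion", "eats", "Meat"), ("Milk", "hasColor", "White")]
injected = [("Lion", "eats", "Grass")]  # the fake "common sense" fact

clean = forward_chain(honest, RULES)
infected = forward_chain(honest + injected, RULES)

# Triples that exist only because of the injected fact.
damage = infected - clean
```

Here `damage` holds the injected triple plus the two conclusions derived from it, which a human reader would reject at once.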

Triple-based Infection
Possible Attack
Some Facts and Discussions
Fake knowledge can exist on the Semantic Web
– Maliciously: Semantic Virus
– Non-maliciously: human faults (machines will not have faults)
The Semantic Web is NOT just FOAF and FOAF-based computing
The Semantic Web does not grow as fast as the Web
– Google has indexed one trillion pages (source: Google official blog)
Such attacks are not feasible on the Web
– Because we as humans can understand some common sense facts, but machines do not have the common sense facts that we have
Some Facts and Discussions (2)
Trust and Proof (and perhaps Logic) are the layers that the possible virus targets
Digital signatures cannot address such issues
– Information quality issues (e.g. validity)
Trusting RDF sources?
– We mostly trust sources (e.g. we trust LiveJournal RDF files, because LiveJournal is a trusted party)
– We trust SIOC plugins that generate SIOC
– But can we limit knowledge providers to just some sources?
  – The Internet does not do it, so we need to accept RDF from everyone!
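
The claim that signatures do not address validity can be illustrated with a short sketch. It uses an HMAC from the Python standard library as a stand-in for a real digital signature (an assumption made to keep the example self-contained; HMAC uses a shared secret rather than a public/private key pair). The key, the triple and the helper names are hypothetical. Verification confirms who produced the triple and that it was not altered; it says nothing about whether the triple is true.

```python
import hashlib
import hmac


def sign(triple, key):
    """Return an HMAC-SHA256 tag over the triple (stand-in for a signature)."""
    message = "\x1f".join(triple).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(triple, signature, key):
    """Check that the tag matches the triple, using a constant-time compare."""
    return hmac.compare_digest(sign(triple, key), signature)


publisher_key = b"hypothetical-shared-secret"

# A publisher signs a statement that happens to be false.
false_fact = ("Milk", "hasColor", "Black")
signature = sign(false_fact, publisher_key)

# The signature verifies: origin and integrity are fine, validity is not checked.
signature_ok = verify(false_fact, signature, publisher_key)

# Changing the triple (even to the true statement) breaks verification.
tampered_ok = verify(("Milk", "hasColor", "White"), signature, publisher_key)
```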
Conclusions
Future work based on developing the virus is not really recommended!
The paper opens some research areas in the trust layer of the Semantic Web tower
– How much do you trust DBpedia?
How can we ensure that RDF is not fake?
– Should we revise all RDF statements using some references?
  – Where do we get the references?
– Do we need a global, peer-reviewed and always up-to-date common sense facts repository?
  – Sounds very difficult or even impossible
– Can we benefit from nature-inspired reasoning?
  – Can we use statistical approaches?
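
One possible reading of "statistical approaches" is to accept a value only when most independent sources agree on it. The sketch below is an assumption made for illustration, not a method proposed in the talk: the `majority_values` function, the source names and the threshold are all hypothetical, and each (subject, predicate) pair is assumed to have a single correct value.

```python
from collections import defaultdict


def majority_values(quads, min_share=0.5):
    """Keep a value for (subject, predicate) only if more than min_share
    of the distinct sources that mention the pair agree on it."""
    supporters = defaultdict(lambda: defaultdict(set))
    for s, p, o, source in quads:
        supporters[(s, p)][o].add(source)

    accepted = {}
    for key, by_value in supporters.items():
        total = len(set().union(*by_value.values()))
        value, sources = max(by_value.items(), key=lambda item: len(item[1]))
        if len(sources) / total > min_share:
            accepted[key] = value
    return accepted


# (subject, predicate, object, source) with made-up source names.
quads = [
    ("Milk", "hasColor", "White", "siteA"),
    ("Milk", "hasColor", "White", "siteB"),
    ("Milk", "hasColor", "White", "siteC"),
    ("Milk", "hasColor", "Black", "attacker"),
    ("Lion", "eats", "Meat", "siteA"),
    ("Lion", "eats", "Grass", "attacker"),
]

accepted = majority_values(quads)
```

With this data the milk colour is accepted, while the lion's diet stays undecided because the two sources split evenly. Such a vote only helps while the attacker controls a minority of sources, so it moves the trust problem rather than solving it.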
Thank you for your attention!
Questions? Comments?
Please contact Peyman:
[email protected]