Talk PPT - icadl 2015


“As we may think”
(Vannevar Bush, 1945)
www.webat25.org
Source: http://w3.org/Proposal.html
Lance Ulanoff, Mashable.com, December 4 2015
Hiro: Who worshipped Asherah?
Librarian: Everyone who lived between India and Spain, from the second millennium B.C. up into the Christian era. With the exception of the Hebrews, who only worshipped her until the religious reforms....
Hiro: I thought the Hebrews were monotheists….
Librarian: Monolatrists. They did not deny the existence of other gods. Asherah was venerated as the consort of Yahweh.
Hiro: I don't remember anything about God having a wife in the Bible.
Librarian: The Bible didn't exist at that point. Judaism was just a loose collection of Yahwistic cults, each with different shrines and practices.
Hiro and the Librarian, Chapter 30, Snow Crash (1992), Neal Stephenson
In Arabia
In Ugarit
In Egypt
In Israel and Judah
Semantic Web
(Tim Berners-Lee, 2000)
“The intelligent agent that people have touted for ages will finally materialize.”
http://www.w3.org/2000/Talks/1206-xml2k-tbl/slide10-0.html
Semantic Web vs. Knowledge Web
Semantic Web:
• Human readable vs machine readable contents
• Human defines standard for data formats and models
• Explicit and precise specification of knowledge representation that everyone has to agree upon
Knowledge Web:
• Machine reads human readable contents
• Machine learns to conflate different formats of the same thing
• Latent and fuzzy representation of knowledge learned by mining big data
Paradigm Shift in Web Search (the “Librarian”)
TRADITIONAL WEB SEARCH:
• Index Keywords in Documents
• Match Keywords in Queries
• Relevance of “10 blue links”
KNOWLEDGE WEB SEARCH:
• Digest World’s Knowledge
• Match User Intent
• Dialog Experience
1. “Bing Dialog Model: Knowledge, Intent and Dialog”, MSR Faculty Summit, July 2010
2. “Introducing the Knowledge Graph: things, not strings”, Official Google Blog, May 2012
3. “Chinese Search Engine – Baidu’s Practice”, SIRIP, SIGIR 2014, July 2014
“Dialog Acts” in Bing/Cortana
• Answer
• Confirmation
• Disambiguation
• Suggestion
• Progressive: Refinement
• Digressive: Recommendation (reactive + proactive)
• Key difference from human-to-human dialog
• Not limited to anthropomorphic natural language dialogs
• Each dialog turn can present multiple acts
• Can overload back channel communications
Closed-loop Dynamic Bayesian Inference
Bayesian Minimum Risk
$I_t = \arg\max_I P(I \mid U_t, K, I_{t-1})$
$A_t = \arg\min_A E[\mathrm{Cost}(A, I_t)]$
[Figure: closed-loop block diagram. Blocks: Intent Model, Behavior Model, Interaction Model. Signals: knowledge + history and previous inferences $(K, I_{t-1})$, user behavior $U_t$, expected behavior $U'_t$, inferred intent $I_t$, inferred action $A_t$.]
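The two decision rules on this slide can be illustrated with a minimal sketch. It reads the expectation in the second rule as taken over the posterior on intents, which is an assumption about the slide's notation. The intent names, action names, probabilities, and cost values are all hypothetical and chosen only for illustration.

```python
from typing import Dict

# Hypothetical posterior P(I | U_t, K, I_{t-1}) over three intents (made-up numbers).
posterior: Dict[str, float] = {"find_author": 0.6, "find_paper": 0.3, "find_venue": 0.1}
# A second, more confident hypothetical posterior.
confident: Dict[str, float] = {"find_author": 0.9, "find_paper": 0.05, "find_venue": 0.05}

# Hypothetical cost table: COST[action][intent] is the cost of taking
# `action` when the user's true intent is `intent`.
COST: Dict[str, Dict[str, float]] = {
    "answer_author": {"find_author": 0.0, "find_paper": 1.0, "find_venue": 1.0},
    "answer_paper": {"find_author": 1.0, "find_paper": 0.0, "find_venue": 1.0},
    "disambiguate": {"find_author": 0.3, "find_paper": 0.3, "find_venue": 0.3},
}


def map_intent(post: Dict[str, float]) -> str:
    """I_t: the intent with the highest posterior probability."""
    return max(post, key=post.get)


def expected_cost(action: str, post: Dict[str, float]) -> float:
    """E[Cost(A, I)] with the expectation taken over the intent posterior."""
    return sum(p * COST[action][intent] for intent, p in post.items())


def min_risk_action(post: Dict[str, float]) -> str:
    """A_t: the action with the lowest expected cost."""
    return min(COST, key=lambda action: expected_cost(action, post))


print(map_intent(posterior), min_risk_action(posterior))
print(map_intent(confident), min_risk_action(confident))
```

With these made-up numbers, the less certain posterior favors the disambiguation act, while the confident one favors a direct answer, even though the most probable intent is the same in both cases.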
Digital Librarian for Researchers:
How far are we?
A Case Study on Microsoft Academic Search
Predictive Completion and Disambiguation
Knowledge Driven Suggestions
Research Challenges
• How to complete academic queries that have never been seen before?
• How to rank completion suggestions?
• How to avoid suggesting completions that lead to no search results?
More on Intent Inference
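• Generative model approach:
$I_t = \arg\max_I P(I \mid U_t, K, I_{t-1})$
$\phantom{I_t} = \arg\max_I P(U_t \mid I)\, P(I \mid K, I_{t-1})$
(by Bayes' rule, dropping the normalizer, which does not depend on $I$, and assuming $U_t$ depends on $K$ and $I_{t-1}$ only through $I$)
• Dynamic ranking, $P(U_t \mid I)$: score depending on user behavior (e.g., query)
• Static ranking, $P(I \mid K, I_{t-1})$: score determined by knowledge and dialog history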
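The product of a dynamic and a static score can be sketched as follows. The candidate intents, the static scores, and the prefix-match stand-in for the dynamic score are all assumptions made for illustration, not the scoring used in the actual system.

```python
from typing import Dict, Tuple

# Hypothetical static scores P(I | K, I_{t-1}) for three candidate intents.
STATIC: Dict[str, float] = {
    "machine learning": 0.5,
    "machine translation": 0.3,
    "macroeconomics": 0.2,
}


def dynamic_score(query: str, intent: str, miss: float = 0.01) -> float:
    """Toy stand-in for P(U_t | I): high when the typed query is a prefix
    of the intent name, a small constant otherwise."""
    return 1.0 if intent.startswith(query.lower()) else miss


def infer_intent(query: str, static: Dict[str, float]) -> Tuple[str, Dict[str, float]]:
    """Return the arg max of dynamic * static, plus all scores for inspection."""
    scores = {intent: dynamic_score(query, intent) * prior
              for intent, prior in static.items()}
    return max(scores, key=scores.get), scores


best_short, _ = infer_intent("mac", STATIC)
best_long, _ = infer_intent("machine t", STATIC)
print(best_short)
print(best_long)
```

When the query is short and matches every candidate, the static score decides the ranking; as the user types more, the dynamic score takes over.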
Special Case 1: Static Rank at Onset
• Given the knowledge graph, find $P(I \mid K)$ for all entity types
• Journal, article, conference, author, institution
• Journal impact factor: E. Garfield, Science, 1972
• PageRank: a paper is important if cited by important papers
• G. Pinski and F. Narin, Information Processing and Management, 1976
• N. Geller, Information Processing and Management, 1978
• Rediscovery of Perron-Frobenius theorem (Perron, 1907; Frobenius, 1912)
• How to make better use of heterogeneity of the graph?
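The idea that a paper is important if cited by important papers can be sketched with a short power iteration. This uses the damped PageRank formulation, not the original Pinski and Narin weighting; the damping value 0.85, the iteration count, and the four-paper toy citation graph are assumptions for illustration.

```python
from typing import Dict, List


def pagerank(citations: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power iteration on a citation graph.

    citations[p] lists the papers that p cites, so rank flows from the
    citing paper to the cited papers.
    """
    nodes = set(citations) | {c for cited in citations.values() for c in cited}
    n = len(nodes)
    rank = {p: 1.0 / n for p in nodes}
    for _ in range(iterations):
        # Rank held by papers that cite nothing is spread uniformly.
        dangling = sum(rank[p] for p in nodes if not citations.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in nodes}
        for paper, cited in citations.items():
            for c in cited:
                new_rank[c] += damping * rank[paper] / len(cited)
        rank = new_rank
    return rank


# Hypothetical toy graph: A and B cite C, and C cites D.
toy = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
ranks = pagerank(toy)
for paper in sorted(ranks, key=ranks.get, reverse=True):
    print(paper, round(ranks[paper], 3))
```

In this toy graph, D has a single citation but receives it from the well-cited paper C, so it ends up ranked above C, which in turn ranks above the uncited papers A and B.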
Static Rank of a Paper
• Inaugural WSDM Cup, Autumn 2015
• Industry organizers: MSR and Elsevier
• http://www.wsdm-conference.org/2016/wsdm-cup.html
Microsoft Academic Graph
Author (> 40M)
Paper (> 100M)
Event (> 46K)
Venue (> 23K)
Citations (billions)
Institution (20K)
Field of Study (> 50K)
Microsoft Academic Graph (MAG)
• Data releases on Azure
• Free Azure access for research
• http://research.microsoft.com/
MAG
• Web Service API coming!
• Community properties
Special Case 2: “Zero-query” Suggestion
• Digital librarian notifies me of new materials I should read
• Find $I_t = \arg\max_I P(I \mid K, I_{t-1})$ whenever the knowledge graph grows
• Best if
• Tailored to user based on interests inferred from aggregated behaviors
• Following user wherever, whenever and whatever
• Cortana: intelligent personal assistant
• Windows, Android, iOS
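A zero-query suggestion step, run whenever new papers enter the graph, might be sketched as below. Modeling the score as a static rank multiplied by a user-interest affinity is an assumption, as are the function name `suggest_on_growth`, the paper identifiers, the field names, and all numbers.

```python
from typing import Dict, List


def suggest_on_growth(new_papers: Dict[str, List[str]],
                      static_rank: Dict[str, float],
                      user_interest: Dict[str, float],
                      top_k: int = 1) -> List[str]:
    """Rank newly added papers with no query from the user.

    new_papers maps a paper id to its fields of study. The score is the
    paper's static rank times the user's summed interest in its fields.
    """
    def score(paper_id: str) -> float:
        affinity = sum(user_interest.get(f, 0.0) for f in new_papers[paper_id])
        return static_rank.get(paper_id, 0.0) * affinity

    ranked = sorted(new_papers, key=score, reverse=True)
    # Never notify about papers with zero relevance to this user.
    return [p for p in ranked[:top_k] if score(p) > 0.0]


# Hypothetical batch of papers added to the graph since the last check.
new_batch = {
    "p1": ["speech recognition"],
    "p2": ["databases"],
    "p3": ["speech recognition", "dialog systems"],
}
static = {"p1": 0.5, "p2": 0.9, "p3": 0.4}
# Hypothetical interests inferred from the user's aggregated behavior.
interests = {"speech recognition": 0.7, "dialog systems": 0.3}

print(suggest_on_growth(new_batch, static, interests))
print(suggest_on_growth(new_batch, static, interests, top_k=3))
```

Here the highest static rank belongs to a paper outside the user's interests, so it is never suggested; tailoring to the user changes which new material is worth a notification.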
Summary
• Intelligent agent at web scale (“digital librarian”):
• From keyword matching to intent/knowledge understanding
• One year old for academic services!
• Conduct interactive dialog or forage on behalf of users behind the scenes
• Albeit without an anthropomorphic façade
• Microsoft Academic Services:
• Search (reactive)
• Cortana notification (proactive)
• Data and Intelligent API
• We want to build a community