Slate's Reading Process - Rensselaer Polytechnic Institute


Learning by Reading
Micah Clark & Selmer Bringsjord
Rensselaer AI & Reasoning (RAIR) Laboratory
Department of Cognitive Science
Department of Computer Science
Rensselaer Polytechnic Institute (RPI)
Troy NY 12180 USA
03.20.06
Turning to written text and diagrams to learn isn’t
learning as it has been rigorously studied in computer
science, cognitive science, and AI. In those fields, to
learn is almost invariably to induce an underlying
function f from a restricted set of input-output pairs.
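For contrast, here is a minimal sketch of that classical notion: learning as function induction from labeled pairs. The data and the hypothesis class (lines, fit by least squares) are our own illustration, not part of Slate.

    # Recover an unknown f from a restricted set of (x, f(x)) pairs.
    pairs = [(0, 1), (1, 3), (2, 5), (3, 7)]  # samples of the unknown f

    # Hypothesis class: lines f(x) = a*x + b, fit by least squares.
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    sxy = sum(x * y for x, y in pairs)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    print(a, b)  # 2.0 1.0 -- the learner has "produced" f(x) = 2x + 1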
Yet this form of Learning by Reading (LBR)
underpins much of modern human life – e.g. the
educational system, job training, IRS tax forms,
product manuals.
Shallow vs. Deep Learning
Shallow Learning:
Absorb the semantic content explicitly present in the
surface structure and form of the medium (texts)
Deep Learning:
Reflective contemplation of semantic content with
respect to prior knowledge, experience, and beliefs as
well as imaginative hypothetical projections
Example: Book Reports!
Shallow LBR for Slate
Reading Process
Process: Intelligence Reports → Multi-Sorted Logic
Reading Process Implementation
Process: Intelligence Reports → Multi-Sorted Logic
Reading Process – Phase 1
• ACE (Fuchs et al.)
• WordNet previously served as the lexicon database for CELT, an ACE-like controlled language (Pease et al.)
• Manual transcription/authoring in controlled
languages is viable at scale (Allen & Barthe)
• Techniques for automated conversion from natural
English to controlled English are being developed
(Mollá & Schwitter)
Attempto Controlled English
ACE is an unambiguous proper subset of full English
• Vocabulary of reserved function words and user-defined content words
• Grammar is context-free, phrase-structured, and
definite clause
• Principles of Interpretation deterministically
disambiguate otherwise ambiguous phrases (see the sketch below)
• Direct translation into Discourse Representation
Structures
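A toy sketch, ours rather than the ACE implementation, of how a Principle of Interpretation works: both attachments of an ambiguous prepositional phrase are enumerated, and a fixed rule deterministically keeps one. The rule shown mirrors ACE's documented principle that a prepositional phrase modifies the verb; the sentence and the output encoding are our own.

    # Two candidate readings of "John sees a man with a telescope".
    readings = {
        "verb-attachment": "see(John, Man) & instrument(see, Telescope)",
        "noun-attachment": "see(John, Man) & with(Man, Telescope)",
    }

    def interpret(readings):
        # Principle of Interpretation: a prepositional phrase modifies
        # the verb, so only the verb-attachment reading survives.
        return readings["verb-attachment"]

    print(interpret(readings))  # the single, deterministic ACE reading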
Reading Process – Phase 2
• ACE Parser (APE)
• Discourse Representation Structures (DRSs) are central to Discourse Representation Theory (DRT) (Kamp & Reyle)
• DRT is a linguistic theory for assigning meaning to
discourse by sequential additive contribution
• A DRS is a syntactic variant of first-order logic for the resolution of unbounded anaphora
• A DRS is a structure ((referents), (conditions))
DRS Example
“John talks to Mary.”
((A, B), (John(A), Mary(B), talk(A, B)))
…“He smiles at her.”
((A, B, C, D),
(John(A), Mary(B), talk(A, B),
smile(C, D), C=A, D=B))
DRS Example
…“She does not smile at him.”
((A, B, C, D),
 (John(A), Mary(B), talk(A, B),
  smile(C, D), C=A, D=B,
  ¬((E, F), (smile(E, F), E=B, F=A))))
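A minimal sketch, our own rather than APE's internals, of the ((referents), (conditions)) structure and of DRT's additive construction, using the example above. Anaphora resolution is modeled by equating new referents with accessible antecedents.

    # "John talks to Mary."
    drs = (["A", "B"], ["John(A)", "Mary(B)", "talk(A, B)"])

    def extend(drs, new_refs, new_conds):
        # Each sentence contributes additively to the discourse so far.
        refs, conds = drs
        return (refs + new_refs, conds + new_conds)

    # ..."He smiles at her."  Resolve 'he' -> A and 'her' -> B.
    drs = extend(drs, ["C", "D"], ["smile(C, D)", "C=A", "D=B"])
    print(drs)
    # (['A', 'B', 'C', 'D'],
    #  ['John(A)', 'Mary(B)', 'talk(A, B)', 'smile(C, D)', 'C=A', 'D=B'])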
Reading Process – Phase 3
• Transformation from DRS to MSL/FOL is well understood (Blackburn & Bos; sketched below)
• ACE uses an extended form of DRS
• Small, domain-neutral encoding scheme & ontology to capture semantic content
• A straightforward translation would inject ACE’s ontology/encoding scheme
• Translation must map from ACE’s ontology to another, perhaps PSL
• Similar to CELT’s mapping of WordNet to the Suggested Upper Merged
Ontology (SUMO)
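A sketch of the standard DRS-to-FOL translation in the style of Blackburn & Bos, under simplifying assumptions of our own: conditions are strings, a nested DRS is read as a negated sub-DRS (as in the running example above), and the ontology remapping is omitted.

    def drs_to_fol(drs):
        refs, conds = drs
        parts = []
        for c in conds:
            if isinstance(c, tuple):     # nested DRS: a negated sub-DRS
                parts.append("~" + drs_to_fol(c))
            else:
                parts.append(c)
        body = " & ".join(parts)
        return f"exists {','.join(refs)}.({body})" if refs else body

    # "John talks to Mary. He smiles at her. She does not smile at him."
    drs = (["A", "B", "C", "D"],
           ["John(A)", "Mary(B)", "talk(A, B)",
            "smile(C, D)", "C=A", "D=B",
            (["E", "F"], ["smile(E, F)", "E=B", "F=A"])])
    print(drs_to_fol(drs))
    # exists A,B,C,D.(John(A) & Mary(B) & talk(A, B) & smile(C, D)
    #   & C=A & D=B & ~exists E,F.(smile(E, F) & E=B & F=A))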
Encoding Scheme Examples
• Nouns and verbs have semantic type; person, object,
time, or unspecified for nouns, event, state, or
unspecified for verbs
– e.g. object(A, named_entity, person)
• Properties are encoded using property
– e.g. green(A) → property(A, green)
• Predicates are encoded using predicate
– e.g. enter(A, B) → predicate(P, event, enter, A, B)
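A sketch, our illustration only, of the inverse encoding map that strips ACE's encoding scheme back off; developing the real mapping to 'vanilla' MSL is one of the Immediate Objectives below.

    # Atoms are (functor, args) pairs in ACE's encoding scheme.
    def decode(atom):
        functor, args = atom
        if functor == "property":          # property(A, green) -> green(A)
            subj, prop = args
            return (prop, [subj])
        if functor == "predicate":         # predicate(P, event, enter, A, B)
            _p, _sort, verb, *rest = args  #   -> enter(A, B)
            return (verb, rest)
        return atom                        # e.g. object(...) kept as a sort

    print(decode(("property", ["A", "green"])))
    # ('green', ['A'])
    print(decode(("predicate", ["P", "event", "enter", "A", "B"])))
    # ('enter', ['A', 'B'])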
Slate Reading Example
Input Text
Security searches every foreigner that boards a
plane. Abdul is an Iranian. He boards DL846.
Parse Tree
DRS
Multi-Sorted Logic
(Using Inverse Encoding Map)
1. ∀A (Security(A) → ∀B,C ((foreigner(B) ∧ plane(C) ∧ board(B, C)) → search(A, B)))
2. ∃A,B (Abdul(A) ∧ Iranian(A) ∧ DL846(B) ∧ board(A, B))
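To see why the MSL form is useful to a reasoner like Slate, here is a toy forward-chaining check. It is entirely our own and adds two background facts the report leaves implicit: that Iranians are foreigners, and that DL846 is a plane.

    facts = {"Security(sec)", "Iranian(abdul)", "DL846(dl846)",
             "board(abdul, dl846)"}

    # Background assumptions, not stated in the report:
    facts.add("foreigner(abdul)")   # every Iranian is a foreigner
    facts.add("plane(dl846)")       # DL846 is a plane

    # Formula 1 as a rule: security searches every foreigner
    # that boards a plane.
    if {"Security(sec)", "foreigner(abdul)", "plane(dl846)",
        "board(abdul, dl846)"} <= facts:
        facts.add("search(sec, abdul)")

    print("search(sec, abdul)" in facts)  # True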
Comparison to KANI and HITIQA
High-Quality Interactive Question Answering (HITIQA)
Knowledge Associates for Novel Intelligence (KANI)
Technical Accomplishments
• Proof-of-concept demonstration of automatic
translation of a controlled English to FOL for the IA
domain
• Demonstration leverages third-party technologies as
previously discussed
• Effort has identified specific aspects of the approach
in need of novel research
Programmatic Accomplishments
• Bringsjord, S. & Clark, M. (2006) ‘For Problems Sufficiently Hard . . . AI Needs CogSci.’ To appear in Proceedings of the American Association for Artificial Intelligence’s Spring Symposium on Cognitive Science and AI (“Between a Rock and a Hard Place: Cognitive Science Principles Meet AI-Hard Problems”).
• Clark, M. & Bringsjord, S. (2006) ‘Learning by Reading’, Invited talk for the Institute for Informatics, Logics, and Security Studies, State University of New York at Albany, Albany, NY.
• Bringsjord, S. & Clark, M. (2006) ‘Solomon: A Next Generation Q&A System’, Blue-sky proposal in response to BAA N61339-06-R-0034 (DTO AQUAINT Program phase 3).
• Clark, M. (2006) ‘Method for Detecting Infinite Ambiguity in Context-Sensitive Generative Grammars’, Research Note, Rensselaer AI & Reasoning (RAIR) Laboratory, Department of Cognitive Science, Rensselaer Polytechnic Institute, Troy, NY.
Future Research
• Interpretation of ‘natural style’ proofs as DRSs
• Ontologically neutral DRSs
• Ambiguous referents and incremental resolution
• Conversational DRT
• Non-monotonic transitions in DRT
• Restatements in conversational discourse
Immediate Objectives
• Develop inverse mapping and translation from ACE
ontology and encoding to ‘vanilla’ MSL (with
Bettina)
• Develop basic translation/reformulation of natural
deductive proofs (NDL?, Athena?, Slate?) into DRSs
(with Sunny)
References
Allen, J. & Barthe, K. (2004), ‘Introductory Overview of Controlled Languages’, Invited talk for the Society for Technical Communication.
Blackburn, P. & Bos, J. (Forthcoming), Working with Discourse Representation Theory: An Advanced Course in Computational Semantics.
Fuchs, N. E., Hoefler, S., Kaljurand, K., Schneider, G. & Schwertel, U. (2005), Extended Discourse Representation Structures in Attempto Controlled English, Technical Report ifi-2005.08, Department of Informatics, University of Zurich, Zurich, Switzerland.
Fuchs, N. E., Kaljurand, K., Rinaldi, F. & Schneider, G. (2005), A Parser for Attempto Controlled English, Technical Report IST506779/Zurich/I2D3/D/PU, REWERSE.
Fuchs, N. E., Schwertel, U. & Schwitter, R. (1999), Attempto Controlled English (ACE) Language Manual, Version 3.0, Technical Report 99.03, Department of Computer Science, University of Zurich, Zurich, Switzerland.
Hoefler, S. (2004), The Syntax of Attempto Controlled English: An Abstract Grammar for ACE 4.0, Technical Report ifi-2004.03, Department of Informatics, University of Zurich, Zurich, Switzerland.
ISO (2001), Industrial automation systems and integration — Process specification language, Committee Draft ISO/CD 18629-1, International Organization for Standardization (ISO).
Kamp, H. & Reyle, U. (1993), From Discourse to Logic: Introduction to Model-theoretic Semantics of Natural Language, Formal
Logic and Discourse Representation Theory, 1 edn, Springer.
Mollá, D. & Schwitter, R. (2001), From Plain English to Controlled English, in ‘Proceedings of the 2001 Australasian Natural
Language Processing Workshop’, Macquarie University, Sydney, Australia, pp. 77–83.
Pease, A. & Fellbaum, C. (2004), Language to Logic Translation with PhraseBank, in ‘Proceedings of the Second International
WordNet Conference (GWC2004)’, Masaryk University Brno, Czech Republic, pp. 187–192.
Pease, A. & Murray, W. (2003), An English to Logic Translator for Ontology-based Knowledge Representation Languages, in
‘Proceedings of the 2003 IEEE International Conference on Natural Language Processing and Knowledge Engineering’,
Beijing, China, pp. 777–783.