Learning Attributes and Relations


Transcript: Learning Attributes and Relations

Presented by
Rani Qumsiyeh & Andrew Zitzelberger

Common approaches:
 Collocation analysis: producing anonymous relations without a label.
 Syntactic dependencies: the dependencies between verbs and their arguments.
 Hearst’s approach: matching lexico-syntactic patterns.

Definition: A pair of words which occur
together more often than expected by
chance within a certain boundary.

Can be detected by Student’s t-test or the χ² (chi-square) test.

Examples of such techniques are presented in
the related work section.
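As a rough illustration of the χ² idea, the score for a candidate bigram can be computed from a 2×2 contingency table of bigram counts. This is a minimal sketch; the toy token sequence and the function name are invented here:

```python
def chi_square_bigram(corpus_tokens, w1, w2):
    """Score how strongly w1 and w2 co-occur as a bigram,
    using the chi-square statistic over a 2x2 contingency table."""
    bigrams = list(zip(corpus_tokens, corpus_tokens[1:]))
    n = len(bigrams)
    o11 = sum(1 for a, b in bigrams if a == w1 and b == w2)   # w1 w2
    o12 = sum(1 for a, b in bigrams if a == w1 and b != w2)   # w1 !w2
    o21 = sum(1 for a, b in bigrams if a != w1 and b == w2)   # !w1 w2
    o22 = n - o11 - o12 - o21                                 # neither
    num = n * (o11 * o22 - o12 * o21) ** 2
    den = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
    return num / den if den else 0.0

tokens = "new york is big , new york is old , the dog barks".split()
print(chi_square_bigram(tokens, "new", "york"))   # high: a collocation
print(chi_square_bigram(tokens, "is", "big"))     # lower: chance co-occurrence
```

A pair whose score exceeds the critical value would be kept as an (unlabeled) collocation.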

“A person works for some employer”
 Relation: work-for
 Concepts: person, employer

 The acquisition of selectional restrictions.
 Detecting verbs denoting the same ontological relation.
 Hierarchical ordering of relations.

Discussed later in detail.



Used to discover very specific relations such as part-of, cause,
purpose.

Charniak employed part-of-speech tagging to detect such
patterns.

Other approaches to detect causation and purpose relations
are discussed later.
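A minimal sketch of one Hearst pattern ("NP such as NP, NP and NP") written as a plain-text regex; a real implementation matches over POS-tagged noun phrases rather than raw words, and the example sentence is invented:

```python
import re

# "X such as Y1, Y2 and Y3" -> (Y_i, X) hyponym/hypernym pairs
PATTERN = re.compile(r"(\w+) such as ((?:\w+(?:, | and | or ))*\w+)")

def hearst_hyponyms(sentence):
    pairs = []
    for m in PATTERN.finditer(sentence):
        hypernym = m.group(1)
        for hyponym in re.split(r", | and | or ", m.group(2)):
            if hyponym:
                pairs.append((hyponym, hypernym))
    return pairs

print(hearst_hyponyms("works by authors such as Herrick and Shakespeare"))
# -> [('Herrick', 'authors'), ('Shakespeare', 'authors')]
```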

Learning Attributes relying on the syntactic relation
between a noun and its modifying adjectives.

Learning Relations on the basis of verbs and their
arguments.

Learning qualia structures for nouns by matching lexico-syntactic patterns.

Attributes are defined as relations with a datatype
as range.

Attributes are typically expressed in texts using the
preposition of, the verb have or genitive constructs:




the color of the car
every car has a color
the car's color
Peter bought a new car. Its color [...]

attitude adjectives, expressing the opinion of the speaker
such as in 'good house'

temporal adjectives, such as the 'former president' or the
'occasional visitor'

membership adjectives, such as the 'alleged criminal', a 'fake
cowboy'

event-related adjectives, such as 'abusive speech', in which
either the agent of the speech is abusive or the event itself

Find the corresponding description for the adjective by
looking up its corresponding attribute in WordNet.

Consider only those adjectives which have such an attribute relation.

This increases the probability that the adjective being
considered denotes the value of some attribute, quality or
property.

Tokenize and part-of-speech tag the corpus
using TreeTagger.

Match to the following two expressions and
extract adjective/noun pairs:
 (\w+{DET})? (\w+{NN})+ is{VBZ} \w+{JJ}
 (\w+{DET})? \w+{JJ} (\w+{NN})+

Cond (n, a) := f(n, a)/f(n)
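Putting the two steps together: a sketch of the pattern matching plus the Cond(n, a) = f(n, a)/f(n) filter over a pre-tagged token stream. The tiny tagged example and the simplifications (only the noun adjacent to the adjective or to "is" is paired) are my own:

```python
from collections import Counter

def attribute_candidates(tagged_tokens, threshold=0.01):
    """Extract adjective/noun pairs matching the two slide patterns and
    keep pairs with Cond(n, a) = f(n, a) / f(n) above the threshold.
    Tags follow the slide's convention: DET, NN, VBZ ('is'), JJ."""
    pair_freq, noun_freq = Counter(), Counter()
    for i, (word, tag) in enumerate(tagged_tokens):
        if tag == "NN":
            noun_freq[word] += 1
            # pattern 2: (DET)? JJ NN+  -> attributive adjective
            if i > 0 and tagged_tokens[i - 1][1] == "JJ":
                pair_freq[(word, tagged_tokens[i - 1][0])] += 1
            # pattern 1: (DET)? NN+ is JJ  -> predicative adjective
            if i + 2 < len(tagged_tokens) and tagged_tokens[i + 1] == ("is", "VBZ") \
                    and tagged_tokens[i + 2][1] == "JJ":
                pair_freq[(word, tagged_tokens[i + 2][0])] += 1
    return {(n, a): f / noun_freq[n]
            for (n, a), f in pair_freq.items()
            if f / noun_freq[n] > threshold}

tagged = [("the", "DET"), ("car", "NN"), ("is", "VBZ"), ("fast", "JJ"),
          ("a", "DET"), ("new", "JJ"), ("car", "NN")]
print(attribute_candidates(tagged))
# -> {('car', 'fast'): 0.5, ('car', 'new'): 0.5}
```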



Example: adjectives extracted for the concept 'car' from a tourism corpus with threshold 0.01.

For each of the adjectives we look up the
corresponding attribute in WordNet
 age is one of {new, old}
 value is one of {black}
 numerousness/numerosity/multiplicity is one of {many}
 otherness/distinctness/separateness is one of {other}
 speed/swiftness/fastness is one of {fast}
 size is one of {small, little, big}
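The lookup step can be sketched with a hand-coded subset of WordNet's adjective-to-attribute links; a real implementation would query WordNet itself (e.g. NLTK's `Synset.attributes()`). The entries below simply mirror the slide's example:

```python
# Hand-coded subset of WordNet's adjective -> attribute links,
# mirroring the slide's 'car' example (a real system queries WordNet)
WN_ATTRIBUTE = {
    "new": "age", "old": "age",
    "black": "value",
    "many": "numerousness",
    "other": "otherness",
    "fast": "speed",
    "small": "size", "little": "size", "big": "size",
}

def attributes_for(adjectives):
    """Group a concept's adjectives by the attribute they denote,
    dropping adjectives that have no attribute link in WordNet."""
    out = {}
    for adj in adjectives:
        attr = WN_ATTRIBUTE.get(adj)
        if attr is not None:
            out.setdefault(attr, set()).add(adj)
    return out

# 'red' has no attribute link in this subset, so it is dropped
print(attributes_for(["new", "old", "fast", "red", "big"]))
```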

Evaluate every domain concept according to (i) its attributes and (ii) their corresponding ranges, by assigning a rating from '0' to '3':
▪ '3' means that the attribute or its range is totally reasonable and correct.
▪ '0' means that the attribute or the range does not make any sense.
A new approach that not only lists relations but finds the general relation:
 work-for (man, department), work-for (employee, institute), work-for (woman, store)
 generalizes to work-for (person, organization)





Conditional probability.
Pointwise mutual information (PMI).
A measure based on the χ²-test.
Evaluated by applying their approach to the Genia corpus using the Genia ontology.
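PMI, the second of the three measures, is straightforward to compute from frequency counts; the counts in the example below are invented:

```python
import math

def pmi(f_vc, f_v, f_c, n):
    """Pointwise mutual information between a verb v and a concept c:
    PMI(v, c) = log2( P(v, c) / (P(v) * P(c)) ),
    estimated from joint count f_vc, marginals f_v and f_c, and
    corpus size n."""
    return math.log2((f_vc / n) / ((f_v / n) * (f_c / n)))

# toy counts: verb seen 30 times, concept 40 times, together 25 times
print(round(pmi(25, 30, 40, 1000), 2))   # -> 4.38 (strong association)
```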



Extract verb frames using Steven Abney's
chunker.
Extract tuples NP-V-NP and NP-V-P-NP.
Construct binary relations from tuples.
 Use the lemmatized verb V as corresponding
relation label
 Use the head of the NP phrases as concepts.
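A sketch of the tuple-to-relation step, with toy stand-ins for the chunker output, the lemmatizer, and the head extraction (all three are simplified placeholders, not Abney's chunker):

```python
def relations_from_tuples(tuples, lemma, head):
    """Turn NP-V-NP / NP-V-P-NP tuples into binary relations:
    the lemmatized verb becomes the relation label and the NP heads
    become the domain and range concepts."""
    relations = []
    for t in tuples:
        if len(t) == 3:                   # NP-V-NP
            np1, verb, np2 = t
        elif len(t) == 4:                 # NP-V-P-NP: fold the
            np1, verb, prep, np2 = t      # preposition into the label
            verb = verb + "-" + prep
        else:
            continue
        relations.append((lemma(verb), head(np1), head(np2)))
    return relations

# toy lemmatizer and head extraction (a real system uses a chunker
# and a proper lemmatizer)
lemma = lambda v: {"works-for": "work-for", "binds": "bind"}.get(v, v)
head = lambda np: np.split()[-1]

tuples = [("the employee", "works", "for", "the institute"),
          ("this protein", "binds", "the receptor")]
print(relations_from_tuples(tuples, lemma, head))
# -> [('work-for', 'employee', 'institute'), ('bind', 'protein', 'receptor')]
```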



protein_molecule: 5
protein_family_or_group: 10
amino-acid: 10


Take into account the frequency of occurrence and choose the highest one.

Penalize concepts c which occur too frequently.

P(amino-acid) = 0.27, P(protein) = 0.14

The χ²-test compares the contingencies between two variables (are the two variables statistically independent or not?).

We can generalize c to c_i if the χ²-test reveals the verb v and c to be statistically dependent.

Level of significance = 0.05
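The dependence check can be sketched as a 2×2 χ² test compared against the critical value 3.841 (1 degree of freedom at the 0.05 level); the counts in the example are invented:

```python
def generalize_ok(f_vc, f_v, f_c, n, critical=3.841):
    """Decide whether verb v and concept c are statistically dependent
    via a chi-square test on the 2x2 contingency table of their
    occurrences; 3.841 is the critical value for 1 degree of freedom
    at significance level 0.05. Only then is c generalized to c_i."""
    o11 = f_vc                  # v and c together
    o12 = f_v - f_vc            # v without c
    o21 = f_c - f_vc            # c without v
    o22 = n - o11 - o12 - o21   # neither
    den = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
    if den == 0:
        return False
    chi2 = n * (o11 * o22 - o12 * o21) ** 2 / den
    return chi2 > critical

# toy counts: verb 30x, concept 40x, together 25x, in 1000 observations
print(generalize_ok(25, 30, 40, 1000))   # dependent -> generalize
print(generalize_ok(1, 30, 40, 1000))    # independent -> do not
```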
The Genia corpus contains 18,546 sentences with 509,487 words and 51,170 verbs.
 Extracted 100 relations; 15 were regarded as inappropriate by a biologist evaluator.
 The remaining 85 were evaluated using:

 Direct matches for domain and range (DM),
 Average distance in terms of number of edges between
correct and predicted concept (AD)
 A symmetric variant of the Learning Accuracy (LA)


Nature of Objects
Aristotle's four causes:
 Material cause (made of)
 Agentive cause (movement, creation, change)
 Formal cause (form, type)
 Final cause (purpose, intention, aim)
 Generative Lexicon framework [Pustejovsky,
1991]
 Qualia Structures
 Constitutive (components)
 Agentive (created)
 Formal (hypernym)
 Telic (function)
Example: knife
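The knife example written out as the four qualia roles; the role fillers follow the standard Generative Lexicon illustration, simplified:

```python
# Qualia structure for 'knife', following Pustejovsky's four roles
knife = {
    "constitutive": ["blade", "handle"],  # its components
    "agentive": "make",                   # how it comes into being
    "formal": "artifact",                 # what kind of thing it is
    "telic": "cut",                       # its function / purpose
}
print(knife["telic"])   # -> cut
```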

Human
 Subjective decisions

Web
 Linguistic errors
 Ranking errors
 Commercial Bias
 Erroneous information
 Lexical Ambiguity

Pattern library: tuples (p, c)
 p is a pattern
 c is a clue (c: string -> string)

Given a term t and a clue c
 c(t) is sent to the search engine

π(x) refers to the plural form of x
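A sketch of how clues turn a term t into search-engine query strings, with a deliberately naive π (pluralizer) and a few invented constitutive clues for illustration:

```python
def pi(x):
    """Naive pluralizer standing in for the plural function pi(x)."""
    return x + "es" if x.endswith(("s", "ch", "sh", "x")) else x + "s"

# clue functions c: term -> quoted query string whose hits are
# likely to match the constitutive pattern p (invented examples)
CLUES = [
    lambda t: f'"{t} is made up of"',
    lambda t: f'"{pi(t)} are made up of"',
    lambda t: f'"{t} consists of"',
]

def queries_for(term):
    """c(t) for every clue; each result would be sent to the engine."""
    return [c(term) for c in CLUES]

print(queries_for("conversation"))
```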

Amount words:
 variety, bundle, majority, thousands, millions,
hundreds, number, numbers, set, sets, series,
range

Example:
 “A conversation is made up of a series of
observable interpersonal exchanges.”
▪ Constitutive role = exchange
PURP := (\w+{VB} NP | NP | be{VB} \w+{VBD})

No good patterns for the agentive role: candidates such as
 X is made by Y
 X is produced by Y
work poorly. Instead, a closed set of agentive verbs is used:
 Agentive_verbs = {build, produce, make, write, plant, elect, create, cook, construct, design}


e = element
t = term

Lexical elements: knife, beer, book, computer

Abstract Noun: conversation

Specific multi-word terms:
 Natural language processing
 Data mining

Students score each answer:
 0 = incorrect
 1 = not totally wrong
 2 = still acceptable
 3 = totally correct
Reasoning: Formal and constitutive patterns are more ambiguous.

Maedche and Staab, 2000
 Find relations using association rules
 A transaction is defined as words occurring together in a syntactic dependency
 Calculate support and confidence
 Precision = 11%, Recall = 13%
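The support/confidence step can be sketched as a generic association-rule computation over toy transactions (not Maedche and Staab's exact algorithm; the transaction data is invented):

```python
from collections import Counter
from itertools import combinations

def support_confidence(transactions):
    """For each pair of items co-occurring in a transaction, compute
    support = f(a, b) / N and confidence = f(a, b) / f(antecedent)."""
    n = len(transactions)
    item_count, pair_count = Counter(), Counter()
    for t in transactions:
        for item in set(t):
            item_count[item] += 1
        for a, b in combinations(sorted(set(t)), 2):
            pair_count[(a, b)] += 1
    rules = {}
    for (a, b), f in pair_count.items():
        support = f / n
        rules[(a, b)] = (support, f / item_count[a])  # rule a -> b
        rules[(b, a)] = (support, f / item_count[b])  # rule b -> a
    return rules

# toy transactions: word sets linked by a syntactic dependency
tx = [{"hotel", "room"}, {"hotel", "room"}, {"hotel", "bar"}, {"room", "bed"}]
rules = support_confidence(tx)
print(rules[("hotel", "room")])   # (support, confidence)
```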

Kavalec and Svatek, 2005
 Added ‘above expectation’ heuristic
▪ Measure association between verb and pair of concepts

Gamallo et al., 2002
 Map syntactic dependencies to semantic relations
 1) shallow parser + heuristics to derive syntactic
dependencies
 2) cluster based on syntactic positions
 Problems
▪ Mapping is underspecified
▪ Largely domain dependent

Ciaramita et al., 2005
 Statistical dependency parser to extract:
▪ SUBJECT-VERB-DIRECT_OBJECT
▪ SUBJECT-VERB-INDIRECT_OBJECT
 χ2 test – keep those occurring significantly more
often than by chance
 83% of learned relations are correct
 53.1% of generalized relations are correct

Heyer et al., 2001
 Calculate 2nd order collocations
 Use set of defined rules to reason

Ogata and Collier, 2004
 HEARST patterns for extraction
 Use heuristic reasoning rules

Yamaguchi, 2001
 Word space algorithm using 4 word window
 Cos(angle) measure for similarity
▪ If similarity > threshold, a relationship is assumed
 Precision = 59.89% for legal corpus
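The word-space idea (co-occurrence vectors from a 4-word window, compared by the cosine measure) in miniature; the toy sentence and function names are invented:

```python
import math
from collections import Counter

def context_vectors(tokens, window=4):
    """Build a co-occurrence vector for each word from a 4-word
    window on either side."""
    vecs = {}
    for i, w in enumerate(tokens):
        ctx = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        vecs.setdefault(w, Counter()).update(ctx)
    return vecs

def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in set(u) | set(v))
    return dot / (math.sqrt(sum(x * x for x in u.values()))
                  * math.sqrt(sum(x * x for x in v.values())))

tokens = "the court ruled that the tribunal ruled on appeal".split()
vecs = context_vectors(tokens)
sim = cosine(vecs["court"], vecs["tribunal"])
print(round(sim, 2))   # similar contexts -> candidate relationship
```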

Poesio and Almuhareb, 2005
 Classify attributes into one of six categories:
▪ Quality, part, related-object, activity, related-agent, non-attribute
 Classifier was trained using:
▪ Morphological information, clustering results, search engine
results, and heuristics
 Better results from combining related-object and part
 F-measure = 53.8% for non-attribute class, and
between 81-95% for other classes

Claveau et al., 2003
 Inductive Logic Programming Approach
 Doesn’t distinguish between different qualia roles



Learning relations from non-verbal structures
Gold standard of qualia structures
Deriving a reasoning calculus

Strengths
 Explained their methods in detail

Weaknesses
 Required a lot of NLP background knowledge
 Short summaries of others’ work