Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution
Ryu Iida, Kentaro Inui and Yuji Matsumoto
Nara Institute of Science and Technology
{ryu-i,inui,matsu}@is.naist.jp
June, 20th, 2006
Zero-anaphora resolution
Zero-anaphor = a gap with an anaphoric function
Zero-anaphora resolution is becoming important in many
applications
In Japanese, even obligatory arguments of a predicate
are often omitted when they are inferable from the
context
45.5% of the nominative arguments of verbs are omitted in
newspaper articles
Zero-anaphora resolution (cont’d)
Three sub-tasks:
Zero-pronoun detection: detect a zero-pronoun
Antecedent identification: identify antecedent from the set of candidate
antecedents for a given zero-pronoun
Anaphoricity determination: classify whether a given zero-pronoun is
anaphoric or non-anaphoric
anaphoric zero-pronoun (antecedent: John)
Mary-wa   John-ni   (φ-ga)   tabako-o     yameru-youni  it-ta
Mary-TOP  John-DAT  (φ-NOM)  smoking-OBJ  quit-COMP     say-PAST
[Mary asked John to quit smoking.]
non-anaphoric zero-pronoun
(φ-ga)   ie-ni     kaeri-tai
(φ-NOM)  home-DAT  want to go back
[(φ=I) want to go home.]
Previous work on anaphora resolution
The research trend has been shifting from rule-based approaches
(Baldwin, 95; Lappin and Leass, 94; Mitkov, 97, etc.) to empirical, or
learning-based, approaches (Soon et al., 2001; Ng, 04; Yang et al., 05,
etc.)
Cost-efficient solution for achieving performance comparable to the
best-performing rule-based systems
Learning-based approaches represent a problem, anaphoricity
determination and antecedent identification, as a set of feature
vectors and apply machine learning algorithms to them
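As a rough illustration of the feature-vector representation, the sketch below maps a (candidate antecedent, zero-pronoun) pair onto a fixed-order binary vector. The feature names are hypothetical and chosen only for this example; the slides do not list the actual feature inventory.

```python
from typing import Dict, List

# Hypothetical feature inventory (assumption: not taken from the slides).
FEATURES: List[str] = [
    "cand_is_topic",       # candidate is marked with the topic particle -wa
    "cand_is_dative",      # candidate is marked with the dative particle -ni
    "cand_precedes_zero",  # candidate appears before the zero-pronoun
    "same_sentence",       # candidate and zero-pronoun share a sentence
]

def to_vector(properties: Dict[str, bool]) -> List[int]:
    """Map named boolean properties of a pair onto a fixed-order 0/1 vector."""
    return [1 if properties.get(name, False) else 0 for name in FEATURES]

# The two candidates from the example sentence, paired with the zero-pronoun.
john = to_vector({"cand_is_dative": True, "cand_precedes_zero": True, "same_sentence": True})
mary = to_vector({"cand_is_topic": True, "cand_precedes_zero": True, "same_sentence": True})
```

Any off-the-shelf classifier can then be trained on such vectors together with their labels.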
Syntactic pattern features
Useful clues for both anaphoricity determination and
antecedent identification
Mary-wa   John-ni   (φ-ga)   tabako-o     yameru-youni  it-ta
Mary-TOP  John-DAT  (φ-NOM)  smoking-OBJ  quit-COMP     say-PAST
Antecedent: John-ni; zero-pronoun: φ-ga; predicates: yameru-youni, it-ta
Questions
How to encode syntactic patterns as features
How to avoid data sparseness problem
Talk outline
1. Zero-anaphora resolution: Background
2. Selection-then-classification model (Iida et al., 05)
3. Proposed model
   Represents syntactic patterns based on dependency trees
   Uses a tree mining technique to seek useful sub-trees
   to solve data sparseness problem
   Incorporates syntactic pattern features in the
   selection-then-classification model
4. Experiments on Japanese zero-anaphora
5. Conclusion and future work
Selection-then-Classification Model
(SCM) (Iida et al., 05)
A federal judge in Pittsburgh issued a temporary restraining order preventing Trans
World Airlines from buying additional shares of USAir Group Inc. The order,
requested in a suit filed by USAir, …
candidate anaphor: USAir
candidate antecedents: federal judge, order, …, USAir Group Inc, suit
Step 1: the tournament model (Iida et al., 03) selects the most likely candidate antecedent: USAir Group Inc
Step 2: the anaphoricity determination model scores the pair (USAir Group Inc, USAir)
    score ≧ θana: USAir is anaphoric and USAir Group Inc is the antecedent of USAir
    score < θana: USAir is non-anaphoric
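The two steps of the selection-then-classification model can be sketched as follows. This is a minimal illustration, not the authors' implementation: the pairwise preference function and the anaphoricity scorer stand in for trained classifiers, and the toy versions below (based on word overlap) are assumptions made only so the example runs.

```python
from typing import Callable, List, Optional

def tournament(candidates: List[str],
               prefer_right: Callable[[str, str], bool]) -> str:
    """Pairwise matches: the current winner meets each later candidate in turn."""
    winner = candidates[0]
    for challenger in candidates[1:]:
        if prefer_right(winner, challenger):
            winner = challenger
    return winner

def resolve(anaphor: str,
            candidates: List[str],
            prefer_right: Callable[[str, str], bool],
            anaphoricity_score: Callable[[str, str], float],
            theta_ana: float) -> Optional[str]:
    """Select the best candidate first, then classify the (candidate, anaphor) pair."""
    best = tournament(candidates, prefer_right)
    if anaphoricity_score(best, anaphor) >= theta_ana:
        return best   # anaphoric: best is the antecedent
    return None       # non-anaphoric

# Toy stand-ins for the two trained classifiers (assumption: word overlap).
def overlap(candidate: str, anaphor: str) -> int:
    return len(set(candidate.split()) & set(anaphor.split()))

ANAPHOR = "USAir"
CANDIDATES = ["federal judge", "order", "USAir Group Inc", "suit"]

def toy_prefer_right(left: str, right: str) -> bool:
    return overlap(right, ANAPHOR) > overlap(left, ANAPHOR)

def toy_score(candidate: str, anaphor: str) -> float:
    return float(overlap(candidate, anaphor))

result = resolve(ANAPHOR, CANDIDATES, toy_prefer_right, toy_score, theta_ana=0.5)
```

With these toy functions the tournament picks USAir Group Inc, and the threshold test then decides whether USAir is treated as anaphoric.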
Training the anaphoricity determination
model
NPi: candidate antecedent
Anaphoric case: the anaphoric noun phrase ANP has candidate antecedents NP1 … NP5, and its antecedent is NP4
    Anaphoric instance: (NP4, ANP)
Non-anaphoric case: the non-anaphoric noun phrase NANP has candidate antecedents NP1 … NP5, and the tournament model selects NP3 as the candidate antecedent
    Non-anaphoric instance: (NP3, NANP)
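The instance-creation scheme on this slide can be sketched in a few lines. The function and type names are hypothetical, and `toy_select` merely pretends to be the trained tournament model so that the example runs.

```python
from typing import Callable, List, Optional, Tuple

# (candidate antecedent, noun phrase, label); +1 = anaphoric, -1 = non-anaphoric
Instance = Tuple[str, str, int]

def make_instance(noun_phrase: str,
                  candidates: List[str],
                  antecedent: Optional[str],
                  select: Callable[[List[str]], str]) -> Instance:
    """Pair an anaphoric NP with its true antecedent; pair a non-anaphoric NP
    with whatever candidate the selection model prefers."""
    if antecedent is not None:
        return (antecedent, noun_phrase, +1)
    return (select(candidates), noun_phrase, -1)

def toy_select(candidates: List[str]) -> str:
    # Assumption: stands in for the tournament model and always picks the third candidate.
    return candidates[2]

nps = ["NP1", "NP2", "NP3", "NP4", "NP5"]
positive = make_instance("ANP", nps, "NP4", toy_select)
negative = make_instance("NANP", nps, None, toy_select)
```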
New model
SCM with syntactic pattern features
Tournament model: compares LeftCand and RightCand using three sub-trees
    (TL): LeftCand, zero-pronoun, predicate
    (TR): RightCand, zero-pronoun, predicate
    (TI): LeftCand, RightCand, predicate
    Output: the most likely candidate antecedent (USAir Group Inc)
Anaphoricity determination model: uses a sub-tree over the most likely candidate antecedent, the zero-pronoun and the predicate
    score ≧ θana: USAir is anaphoric and USAir Group Inc is the antecedent of USAir
    score < θana: USAir is non-anaphoric
Use of syntactic pattern features
Encoding parse tree features
Learning useful sub-trees
Encoding parse tree features
Mary-wa   John-ni   (φ-ga)   tabako-o     yameru-youni  it-ta
Mary-TOP  John-DAT  (φ-NOM)  smoking-OBJ  quit-COMP     say-PAST
Antecedent: John-ni; zero-pronoun: φ-ga; predicates: yameru-youni, it-ta
Encoding parse tree features (cont’d)
Keep only the nodes on the path between the antecedent and the zero-pronoun:
    Antecedent: John-ni (John-DAT)
    zero-pronoun: φ-ga (φ-NOM)
    predicate: yameru-youni (quit-COMP)
    predicate: it-ta (say-PAST)
Replace content words with role labels and keep the functional words:
    Antecedent + ni (DAT)
    zero-pronoun + ga (NOM)
    predicate + youni (CONJ)
    predicate + ta (PAST)
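A minimal sketch of this encoding on the example sentence: it extracts the nodes on the dependency path between the antecedent and the zero-pronoun, then replaces each content word with its role label while keeping the functional word. The head attachments in `HEAD` are assumed for illustration, and all function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumed dependency heads for the example sentence (None marks the root).
HEAD: Dict[str, Optional[str]] = {
    "Mary-wa": "it-ta",
    "John-ni": "it-ta",
    "φ-ga": "yameru-youni",
    "tabako-o": "yameru-youni",
    "yameru-youni": "it-ta",
    "it-ta": None,
}
ROLE: Dict[str, str] = {
    "John-ni": "Antecedent",
    "φ-ga": "zero-pronoun",
    "yameru-youni": "predicate",
    "it-ta": "predicate",
}

def chain_to_root(node: str, head: Dict[str, Optional[str]]) -> List[str]:
    """Return the node followed by all of its ancestors up to the root."""
    chain = [node]
    while head[chain[-1]] is not None:
        chain.append(head[chain[-1]])
    return chain

def path_between(a: str, b: str, head: Dict[str, Optional[str]]) -> List[str]:
    """Nodes on the path from a up to the lowest common ancestor and down to b."""
    up_a, up_b = chain_to_root(a, head), chain_to_root(b, head)
    on_b = set(up_b)
    lca = next(n for n in up_a if n in on_b)
    return up_a[:up_a.index(lca) + 1] + up_b[:up_b.index(lca)][::-1]

def generalize(path: List[str], role: Dict[str, str]) -> List[Tuple[str, str]]:
    """Replace each content word with its role label; keep the functional word."""
    return [(role[n], n.rsplit("-", 1)[1]) for n in path]

path = path_between("John-ni", "φ-ga", HEAD)
pattern = generalize(path, ROLE)
```

Nodes off the path, such as tabako-o, are dropped, which is what keeps the resulting patterns general.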
Encoding parse trees
Mary-wa   John-ni   (φ-ga)   tabako-o     yameru-youni  it-ta
Mary-TOP  John-DAT  (φ-NOM)  smoking-OBJ  quit-COMP     say-PAST
LeftCand: Mary-wa; RightCand: John-ni; zero-pronoun: φ-ga; predicates: yameru-youni, it-ta
(TL): sub-tree containing LeftCand, zero-pronoun and predicates
(TR): sub-tree containing RightCand, zero-pronoun and predicates
(TI): sub-tree containing LeftCand, RightCand and predicates
Encoding parse trees
Antecedent identification
Label: Left or right
root
    Three sub-trees
        (TL): LeftCand, zero-pronoun, predicate
        (TR): RightCand, zero-pronoun, predicate
        (TI): LeftCand, RightCand, predicate
    f1, f2, …, fn: Lexical, Grammatical, Semantic, Positional and
    Heuristic binary features
Learning useful sub-trees
Kernel methods:
Tree kernel (Collins and Duffy, 01)
Hierarchical DAG kernel (Suzuki et al., 03)
Convolution tree kernel (Moschitti, 04)
Boosting-based algorithm:
BACT (Kudo and Matsumoto, 04) system learns a list of
weighted decision stumps with the Boosting algorithm
Learning useful sub-trees
Boosting-based algorithm: BACT
Learns a list of weighted decision stumps with Boosting
Classifies a given input tree by weighted voting
learn: training instances (trees with labels) → list of decision stumps
    each decision stump: sub-tree, label, weight (e.g. label positive, weight 0.4)
apply: input tree → weighted vote of the decision stumps → Score: +0.34 → positive
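The weighted voting step can be sketched as below. This is a simplification, not BACT itself: sub-tree containment is replaced by membership in a set of pattern strings, and the stumps and weights are made up for the example rather than learned by Boosting.

```python
from typing import List, Set, Tuple

# (pattern, label in {+1, -1}, weight); hypothetical stumps, not learned ones.
Stump = Tuple[str, int, float]

def stump_vote(pattern: str, label: int, patterns_in_tree: Set[str]) -> int:
    """A stump votes for its label if its pattern occurs in the input,
    and for the opposite label otherwise."""
    return label if pattern in patterns_in_tree else -label

def score(stumps: List[Stump], patterns_in_tree: Set[str]) -> float:
    """Weighted sum of the stump votes; the sign gives the predicted class."""
    return sum(w * stump_vote(p, y, patterns_in_tree) for p, y, w in stumps)

stumps: List[Stump] = [
    ("Antecedent-ni > predicate", +1, 0.4),
    ("zero-pronoun-ga > predicate", +1, 0.2),
    ("LeftCand-wa > predicate", -1, 0.1),
]
tree = {"Antecedent-ni > predicate", "zero-pronoun-ga > predicate"}

total = score(stumps, tree)
predicted = "positive" if total >= 0 else "negative"
```

Because each stump is tied to one sub-tree pattern, the learned list doubles as a ranked inventory of the patterns the classifier found useful.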
Overall process
Input: a zero-pronoun φ in the sentence S
Intra-sentential model (uses syntactic patterns)
    scoreintra ≧ θintra: output the most-likely candidate antecedent appearing in S
    scoreintra < θintra: go to the inter-sentential model
Inter-sentential model
    scoreinter ≧ θinter: output the most-likely candidate appearing outside of S
    scoreinter < θinter: return ‘‘non-anaphoric’’
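The cascade above can be written as a short function. The (candidate, score) pairs are assumed to come from the intra-sentential and inter-sentential models; the function name and the return format are hypothetical.

```python
from typing import Optional, Tuple

def resolve_zero_pronoun(intra: Tuple[str, float],
                         inter: Tuple[str, float],
                         theta_intra: float,
                         theta_inter: float) -> Tuple[str, Optional[str]]:
    """Try the intra-sentential model first, then fall back to the
    inter-sentential model, and finally to 'non-anaphoric'."""
    candidate, s = intra
    if s >= theta_intra:
        return ("intra", candidate)
    candidate, s = inter
    if s >= theta_inter:
        return ("inter", candidate)
    return ("non-anaphoric", None)

# Toy scores chosen only to exercise the three branches.
first = resolve_zero_pronoun(("John", 0.8), ("Mary", 0.9), 0.5, 0.5)
second = resolve_zero_pronoun(("John", 0.2), ("Mary", 0.9), 0.5, 0.5)
third = resolve_zero_pronoun(("John", 0.2), ("Mary", 0.1), 0.5, 0.5)
```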
Table of contents
1. Zero-anaphora resolution
2. Selection-then-classification model (Iida et al., 05)
3. Proposed model
   Parse encoding
   Tree mining
4. Experiments
5. Conclusion and future work
Experiments
Japanese newspaper article corpus comprising zero-anaphoric relations: 197 texts (1,803 sentences)
995 intra-sentential anaphoric zero-pronouns
754 inter-sentential anaphoric zero-pronouns
603 non-anaphoric zero-pronouns
Recall = (# of correctly resolved zero-anaphoric relations) / (# of anaphoric zero-pronouns)
Precision = (# of correctly resolved zero-anaphoric relations) / (# of anaphoric zero-pronouns the model detected)
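The two measures can be computed directly from three counts. The counts in the example are made up for illustration and are not results from the experiments.

```python
from typing import Tuple

def recall_precision(n_correct: int, n_anaphoric: int, n_detected: int) -> Tuple[float, float]:
    """n_correct: correctly resolved zero-anaphoric relations;
    n_anaphoric: anaphoric zero-pronouns in the gold data;
    n_detected: zero-pronouns the model judged anaphoric."""
    recall = n_correct / n_anaphoric if n_anaphoric else 0.0
    precision = n_correct / n_detected if n_detected else 0.0
    return recall, precision

# Toy counts: 10 anaphoric zero-pronouns, 8 detected, 6 resolved correctly.
recall, precision = recall_precision(n_correct=6, n_anaphoric=10, n_detected=8)
```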
Experimental settings
Conducting five-fold cross validation
Comparison among four models
BM: the model of Ng and Cardie (02)
Identifies an antecedent with candidate-wise classification
Determines the anaphoricity of a given anaphor as a by-product of the search for its antecedent
BM_STR: BM + syntactic pattern features
SCM: Selection-then-classification model (Iida et al., 05)
SCM_STR: SCM + syntactic pattern features
Results of intra-sentential ZAR
Antecedent identification (accuracy)
Model          Accuracy
BM (Ng02)      48.0% (478/995)
BM_STR         63.5% (632/995)
SCM (Iida05)   65.1% (648/995)
SCM_STR        70.5% (701/995)
The performance of antecedent identification improved by
using syntactic pattern features
Results of intra-sentential ZAR
antecedent identification + anaphoricity determination
Impact on overall ZAR
Evaluate the overall performance for both intra-sentential and inter-sentential ZAR
Baseline model: SCM
resolves intra-sentential and inter-sentential zero-anaphora
simultaneously with no syntactic pattern features.
Results of overall ZAR
AUC curve
AUC (Area Under the recall-precision Curve) plotted by
altering θintra
The curve is not peaky, so optimizing the parameter θintra is not difficult
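One way to obtain such an area is to sweep a threshold, collect a (recall, precision) point at each setting, and integrate with the trapezoid rule. The slides do not say how the area was computed, so the integration method, the function names, and the toy data below are all assumptions.

```python
from typing import List, Tuple

def pr_points(scored: List[Tuple[float, bool]],
              n_anaphoric: int,
              thresholds: List[float]) -> List[Tuple[float, float]]:
    """scored: (model score, resolution is correct) per proposed resolution.
    thresholds must be in decreasing order so that recall never decreases."""
    points = []
    for theta in thresholds:
        detected = [ok for s, ok in scored if s >= theta]
        if not detected:
            continue  # precision is undefined when nothing is detected
        n_correct = sum(detected)
        points.append((n_correct / n_anaphoric, n_correct / len(detected)))
    return points

def area_under_curve(points: List[Tuple[float, float]]) -> float:
    """Trapezoid rule over consecutive (recall, precision) points."""
    return sum((r2 - r1) * (p1 + p2) / 2
               for (r1, p1), (r2, p2) in zip(points, points[1:]))

# Toy data: five proposed resolutions, four anaphoric zero-pronouns in total.
scored = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
points = pr_points(scored, n_anaphoric=4, thresholds=[0.9, 0.8, 0.6, 0.4, 0.2])
area = area_under_curve(points)
```

Comparing this area across values of a second parameter, rather than a single operating point, is what makes a flat (non-peaky) curve informative.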
Conclusion
We have addressed the issue of how to use syntactic patterns
for zero-anaphora resolution.
How to encode syntactic pattern features
How to seek useful sub-trees
Incorporating syntactic pattern features into our selection-then-classification model improves the accuracy for intra-sentential
zero-anaphora, which consequently improves the overall
performance of zero-anaphora resolution
Future work
How to find zero-pronouns?
Designing a broader framework to interact with
analysis of predicate argument structure
How to find a globally optimal solution to the set
of zero-anaphora resolution problems in a given
discourse?
Exploring methods as discussed by McCallum and
Wellner (03)