Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution
Ryu Iida, Kentaro Inui and Yuji Matsumoto
Nara Institute of Science and Technology
{ryu-i,inui,matsu}@is.naist.jp
June 20th, 2006

Zero-anaphora resolution
- A zero-anaphor is a gap with an anaphoric function.
- Zero-anaphora resolution is becoming important in many applications.
- In Japanese, even obligatory arguments of a predicate are often omitted when they are inferable from the context: 45.5% of the nominative arguments of verbs are omitted in newspaper articles.

Zero-anaphora resolution (cont'd)
- Three sub-tasks:
  - Zero-pronoun detection: detect a zero-pronoun.
  - Antecedent identification: identify the antecedent from the set of candidate antecedents for a given zero-pronoun.
  - Anaphoricity determination: classify whether a given zero-pronoun is anaphoric or non-anaphoric.
- Anaphoric zero-pronoun (antecedent = John-ni):
  Mary-wa John-ni (φ-ga) tabako-o yameru-youni it-ta
  Mary-TOP John-DAT (φ-NOM) smoking-OBJ quit-COMP say-PAST
  [Mary asked John to quit smoking.]
- Non-anaphoric zero-pronoun:
  (φ-ga) ie-ni kaeri-tai
  (φ-NOM) home-DAT want to go back
  [(φ = I) want to go home.]

Previous work on anaphora resolution
- The research trend has been shifting from rule-based approaches (Baldwin, 95; Lappin and Leass, 94; Mitkov, 97, etc.) to empirical, learning-based approaches (Soon et al., 01; Ng, 04; Yang et al., 05, etc.).
- Learning-based approaches are a cost-efficient solution for achieving performance comparable to the best-performing rule-based systems.
- They represent a problem, i.e. anaphoricity determination and antecedent identification, as a set of feature vectors and apply machine learning algorithms to them.

Syntactic pattern features
- Syntactic patterns are useful clues for both anaphoricity determination and antecedent identification:
  Mary-wa | [antecedent] John-ni | [zero-pronoun] φ-ga | tabako-o | [predicate] yameru-youni | [predicate] it-ta
  Mary-TOP John-DAT φ-NOM smoking-OBJ quit-COMP say-PAST
- Questions:
  - How to encode syntactic patterns as features?
  - How to avoid the data sparseness problem?

Talk outline
1. Zero-anaphora resolution: background
2. Selection-then-classification model (Iida et al., 05)
3. Proposed model
   - Represents syntactic patterns based on dependency trees
   - Uses a tree mining technique to seek useful sub-trees, addressing the data sparseness problem
   - Incorporates syntactic pattern features into the selection-then-classification model
4. Experiments on Japanese zero-anaphora
5. Conclusion and future work
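Before moving on to the models, the three sub-tasks introduced above can be made concrete with a small sketch. This is only an illustrative toy, not the authors' system: the data layout, the function names and the trivial rules inside them are hypothetical.

```python
# Minimal, purely illustrative sketch of the three sub-tasks
# (names and rules are hypothetical, not the authors' system).

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ZeroPronoun:
    predicate: str   # predicate with a missing argument, e.g. "yameru" (quit)
    case: str        # the missing case slot, e.g. "ga" (nominative)

def detect_zero_pronouns(sentence: dict) -> List[ZeroPronoun]:
    """Sub-task 1: zero-pronoun detection (here we simply read pre-annotated gaps)."""
    return [ZeroPronoun(**gap) for gap in sentence["gaps"]]

def is_anaphoric(zp: ZeroPronoun, candidates: List[str]) -> bool:
    """Sub-task 3: anaphoricity determination (toy rule: anaphoric iff any candidate exists)."""
    return bool(candidates)

def select_antecedent(zp: ZeroPronoun, candidates: List[str]) -> Optional[str]:
    """Sub-task 2: antecedent identification (toy rule: take the closest preceding candidate)."""
    return candidates[-1] if candidates else None

sentence = {
    "text": "Mary-wa John-ni (phi-ga) tabako-o yameru-youni it-ta",
    "gaps": [{"predicate": "yameru", "case": "ga"}],
}
candidates = ["Mary", "John"]

for zp in detect_zero_pronouns(sentence):
    antecedent = select_antecedent(zp, candidates) if is_anaphoric(zp, candidates) else None
    print(zp.predicate, zp.case, "->", antecedent)   # yameru ga -> John
```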
Selection-then-Classification Model (SCM) (Iida et al., 05)
- Example text: "A federal judge in Pittsburgh issued a temporary restraining order preventing Trans World Airlines from buying additional shares of USAir Group Inc. The order, requested in a suit filed by USAir, …"
- Candidate anaphor: USAir. Candidate antecedents: federal judge, order, …, USAir Group Inc, suit.
- Step 1 (antecedent identification): the tournament model (Iida et al., 03) compares the candidate antecedents pairwise and selects the most likely candidate antecedent, here USAir Group Inc.
- Step 2 (anaphoricity determination): the anaphoricity determination model scores the pair (USAir Group Inc, USAir).
  - If score ≥ θ_ana: USAir is anaphoric and USAir Group Inc is its antecedent.
  - If score < θ_ana: USAir is non-anaphoric.
  (A schematic code sketch of this two-step flow is given below, after the training slide.)

Training the anaphoricity determination model
- For an anaphoric noun phrase (ANP), the candidate selected from its set of candidate antecedents (e.g. NP1, NP2, NP3, where NP3 is the antecedent) is paired with the ANP to form an anaphoric (positive) training instance.
- For a non-anaphoric noun phrase (NANP), the tournament model is applied to its candidate antecedents (e.g. NP4, NP5); the selected candidate is paired with the NANP to form a non-anaphoric (negative) training instance.
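The selection-then-classification flow just described can be sketched as follows. This is a schematic rendering only, assuming a pairwise preference classifier and an anaphoricity classifier exposed as scoring functions; the names, feature handling and toy scores are placeholders, not the authors' implementation.

```python
# Sketch of the selection-then-classification model (SCM): a tournament over
# candidate antecedents, followed by an anaphoricity check on the winner.
# The scoring callables stand in for trained classifiers.

from typing import Callable, Optional, Sequence

def tournament(candidates: Sequence[str],
               anaphor: str,
               prefer_right: Callable[[str, str, str], float]) -> str:
    """Pairwise matches in textual order; the winner of each match advances.
    prefer_right(left, right, anaphor) > 0.5 means the later candidate wins."""
    winner = candidates[0]
    for challenger in candidates[1:]:
        if prefer_right(winner, challenger, anaphor) > 0.5:
            winner = challenger
    return winner

def resolve_scm(candidates: Sequence[str],
                anaphor: str,
                prefer_right: Callable[[str, str, str], float],
                anaphoricity_score: Callable[[str, str], float],
                theta_ana: float) -> Optional[str]:
    """Select the most likely candidate first, then decide anaphoricity."""
    if not candidates:
        return None
    best = tournament(candidates, anaphor, prefer_right)
    if anaphoricity_score(best, anaphor) >= theta_ana:
        return best          # anaphoric: `best` is returned as the antecedent
    return None              # non-anaphoric

# Toy usage with hand-written scores standing in for trained models:
cands = ["federal judge", "order", "USAir Group Inc", "suit"]
prefer = lambda l, r, a: 0.9 if r == "USAir Group Inc" else 0.2
ana = lambda c, a: 0.8 if (c, a) == ("USAir Group Inc", "USAir") else 0.1
print(resolve_scm(cands, "USAir", prefer, ana, theta_ana=0.5))  # -> USAir Group Inc
```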
New model
- The SCM pipeline is kept, but both the tournament model and the anaphoricity determination model are given syntactic pattern features in addition to the conventional features: for each pairwise comparison, the dependency sub-trees T_L (left candidate, zero-pronoun, predicate), T_R (right candidate, zero-pronoun, predicate) and T_I (left candidate, right candidate, predicate); for anaphoricity determination, the sub-tree connecting the most likely candidate antecedent, the zero-pronoun and the predicate.

Use of syntactic pattern features
- Encoding parse tree features
- Learning useful sub-trees

Encoding parse tree features
- From the dependency parse, extract the sub-tree connecting the antecedent, the zero-pronoun and the predicates:
  Mary-wa | [antecedent] John-ni | [zero-pronoun] φ-ga | tabako-o | [predicate] yameru-youni | [predicate] it-ta
- Each node is then reduced to its role label plus its functional morpheme:
  antecedent → ni (DAT), zero-pronoun → ga (NOM), predicate → youni (COMP), predicate → ta (PAST)

Encoding parse trees (antecedent identification)
- In the example above, the left candidate is Mary-wa and the right candidate is John-ni.
- A training or test instance for the tournament model is a tree whose root carries the label "left" or "right" (which candidate is the antecedent) and has three sub-trees as children:
  - T_L: left candidate, zero-pronoun, predicate
  - T_R: right candidate, zero-pronoun, predicate
  - T_I: left candidate, right candidate, predicate
- The conventional lexical, grammatical, semantic, positional and heuristic binary features f1 … fn are attached as additional nodes under the root.

Learning useful sub-trees
- Possible approaches:
  - Kernel methods: tree kernel (Collins and Duffy, 01), hierarchical DAG kernel (Suzuki et al., 03), convolution tree kernel (Moschitti, 04)
  - Boosting-based algorithm: BACT (Kudo and Matsumoto, 04)
- BACT learns a list of weighted decision stumps (each stump is a sub-tree paired with a label, e.g. a sub-tree with weight 0.4 and label "positive") with the Boosting algorithm from labeled training trees.
- It classifies a given input tree by weighted voting of the stumps, e.g. score +0.34 → positive.
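The weighted-voting step can be illustrated with a simplified sketch. Real BACT enumerates sub-trees of labeled ordered trees during Boosting; here sub-trees are approximated as string-encoded fragments, and the stumps and their weights are invented purely so that the vote reproduces the +0.34 example above.

```python
# Simplified illustration of classification by weighted voting over decision
# stumps, in the spirit of BACT (Kudo and Matsumoto, 04). Each stump is a
# (sub-tree fragment, signed weight) pair; a stump votes with its weight if
# its fragment occurs in the input tree. Fragments and weights are invented.

from typing import Dict, Set

stumps: Dict[str, float] = {
    "(left-cand ni) -> (zero-pronoun ga)": +0.40,
    "(zero-pronoun ga) -> (pred youni)":   +0.15,
    "(right-cand wa) -> (pred ta)":        -0.21,
}

def classify(tree_fragments: Set[str], stumps: Dict[str, float]) -> float:
    """Weighted vote: sum the weights of all stumps whose fragment occurs in the tree."""
    return sum(w for frag, w in stumps.items() if frag in tree_fragments)

# A test instance encoded as the set of fragments it contains:
instance = {
    "(left-cand ni) -> (zero-pronoun ga)",
    "(zero-pronoun ga) -> (pred youni)",
    "(right-cand wa) -> (pred ta)",
}
score = classify(instance, stumps)
print(round(score, 2), "-> positive" if score > 0 else "-> negative")  # 0.34 -> positive
```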
Overall process
- Input: a zero-pronoun φ in a sentence S, together with its syntactic patterns.
- Intra-sentential model: if score_intra ≥ θ_intra, output the most likely candidate antecedent appearing in S.
- Otherwise (score_intra < θ_intra), inter-sentential model: if score_inter ≥ θ_inter, output the most likely candidate antecedent appearing outside of S.
- Otherwise (score_inter < θ_inter), return "non-anaphoric".
  (A schematic code sketch of this cascade appears after the results below.)

Experiments
- Japanese newspaper article corpus annotated with zero-anaphoric relations: 197 texts (1,803 sentences) containing
  - 995 intra-sentential anaphoric zero-pronouns,
  - 754 inter-sentential anaphoric zero-pronouns,
  - 603 non-anaphoric zero-pronouns.
- Recall = (# of correctly resolved zero-anaphoric relations) / (# of anaphoric zero-pronouns)
- Precision = (# of correctly resolved zero-anaphoric relations) / (# of anaphoric zero-pronouns the model detected)

Experimental settings
- Five-fold cross-validation.
- Comparison among four models:
  - BM: Ng and Cardie (02)'s model, which identifies an antecedent with candidate-wise classification and determines the anaphoricity of a given anaphor as a by-product of the search for its antecedent.
  - BM_STR: BM + syntactic pattern features.
  - SCM: selection-then-classification model (Iida et al., 05).
  - SCM_STR: SCM + syntactic pattern features.

Results of intra-sentential ZAR: antecedent identification (accuracy)
  BM (Ng02)      48.0% (478/995)
  BM_STR         63.5% (632/995)
  SCM (Iida05)   65.1% (648/995)
  SCM_STR        70.5% (701/995)
- Antecedent identification improves when syntactic pattern features are used.

Results of intra-sentential ZAR: antecedent identification + anaphoricity determination
  [Figure: recall-precision results]

Impact on overall ZAR
- Evaluate the overall performance on both intra-sentential and inter-sentential ZAR.
- Baseline model: the SCM resolving intra-sentential and inter-sentential zero-anaphora simultaneously, with no syntactic pattern features.

Results of overall ZAR
  [Figure: overall recall-precision results]

AUC curve
- AUC (area under the recall-precision curve), plotted while altering θ_intra.
- The curve is not peaky, so optimizing the parameter θ_intra is not difficult.
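For cross-reference, here is a schematic of the two-stage cascade from the "Overall process" slide, with the thresholds θ_intra and θ_inter made explicit. The model callables and the toy scores are placeholders; only the control flow is taken from the slide.

```python
# Schematic of the overall resolution cascade for a zero-pronoun phi in a
# sentence S (cf. the "Overall process" slide). The two models are passed in
# as stand-in callables returning (best candidate, confidence score).

from typing import Callable, Optional, Tuple

def resolve_zero_anaphor(
    zp,                                                       # the zero-pronoun to resolve
    intra_model: Callable[..., Tuple[Optional[str], float]],  # -> (best candidate in S, score_intra)
    inter_model: Callable[..., Tuple[Optional[str], float]],  # -> (best candidate outside S, score_inter)
    theta_intra: float,
    theta_inter: float,
) -> Optional[str]:
    cand, score_intra = intra_model(zp)
    if score_intra >= theta_intra:
        return cand                  # antecedent found within the same sentence S
    cand, score_inter = inter_model(zp)
    if score_inter >= theta_inter:
        return cand                  # antecedent found in the preceding context
    return None                      # neither model is confident: non-anaphoric

# Toy usage (theta_intra is the parameter swept when plotting the AUC curve):
intra = lambda zp: ("John", 0.3)
inter = lambda zp: ("Mary", 0.7)
print(resolve_zero_anaphor("phi-ga", intra, inter, theta_intra=0.5, theta_inter=0.6))  # -> Mary
```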
Conclusion
- We have addressed the issue of how to use syntactic patterns in zero-anaphora resolution:
  - how to encode syntactic pattern features, and
  - how to seek useful sub-trees.
- Incorporating syntactic pattern features into our selection-then-classification model improves the accuracy of intra-sentential zero-anaphora resolution, which in turn improves the overall performance of zero-anaphora resolution.

Future work
- How to find zero-pronouns: designing a broader framework that interacts with the analysis of predicate-argument structure.
- How to find a globally optimal solution to the set of zero-anaphora resolution problems in a given discourse: exploring methods such as those discussed by McCallum and Wellner (03).