Incorporating Contextual Cues in Trainable Models for Coreference Resolution
14 April 2003
Ryu Iida
Computational Linguistics Laboratory, Graduate School of Information Science, Nara Institute of Science and Technology

Background
Two approaches to coreference resolution:
- Rule-based approach [Mitkov 97, Baldwin 95, Nakaiwa 96, Okumura 95, Murata 97]
  Many attempts to encode linguistic cues into rules, significantly influenced by Centering Theory [Grosz 95, Walker et al. 94, Kameyama 86].
  Problem: further manual refinement is needed, but it will be prohibitively costly.
  Best-achieved performance in MUC (Message Understanding Conference): precision roughly 70%, recall roughly 60%.
- Corpus-based machine learning approach [Aone and Bennett 95, Soon et al. 01, Ng and Cardie 02, Seki 02]
  Cost effective, and has achieved performance comparable to the best-performing rule-based systems.
  Problem: these previous works tend to lack an appropriate reference to the theoretical linguistic work on coherence and coreference.

Background
Challenging issue: achieving a good union between theoretical linguistic findings and corpus-based empirical methods.

Outline of this Talk
- Background
- Problems with previous statistical approaches
- Two methods: centering features, tournament-based search model
- Experiments
- Conclusions

Statistical approaches [Soon et al. 01, Ng and Cardie 02]
- Reach a level of performance comparable to state-of-the-art rule-based systems
- Recast the task of anaphora resolution as a sequence of classification problems

Statistical approaches [Soon et al. 01, Ng and Cardie 02]
[MUC-6] "A federal judge in Pittsburgh issued a temporary restraining order preventing Trans World Airlines from buying additional shares of USAir Group Inc. The order, requested in a suit filed by USAir, dealt another blow to TWA's bid to buy the company for $52 a share."
The task is to classify pairs of noun phrases as positive or negative:
- positive instance: the pair of an anaphor and its antecedent
- negative instances: the pairs of the anaphor and the NPs located between the anaphor and the antecedent

  candidate         anaphor   output class
  USAir Group Inc   USAir     positive
  order             USAir     negative
  suit              USAir     negative

Statistical approaches [Soon et al. 01, Ng and Cardie 02]
Feature set [Ng and Cardie 02]: POS, DEMONSTRATIVE, STRING_MATCH, NUMBER, GENDER, SEM_CLASS, DISTANCE, SYNTACTIC ROLE.
Each candidate-anaphor pair is encoded as a feature vector (in the slide's Japanese example: Prp_noun:1, Organization:1, Person:1, ハ (topic particle wa):1, SENT_DIST:0, STR_MATCH:0, Pronoun:0 or 1) labeled positive or negative, and a model (a decision tree) is induced from these training vectors with C4.5.
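To make this instance-creation scheme concrete, here is a minimal sketch (added for illustration; the function name and the list-based NP representation are assumptions, not the authors' code):

```python
# Minimal sketch of Soon et al.-style training-instance creation.
# The NP representation (mention strings in document order) and the
# function name are illustrative assumptions, not the authors' code.

def make_training_pairs(nps, antecedent_idx, anaphor_idx):
    """Pair the anaphor with its antecedent (positive) and with every
    NP between the antecedent and the anaphor (negative)."""
    anaphor = nps[anaphor_idx]
    pairs = [(nps[antecedent_idx], anaphor, "positive")]
    for i in range(antecedent_idx + 1, anaphor_idx):
        pairs.append((nps[i], anaphor, "negative"))
    return pairs

# The MUC-6 example from the slide: "USAir Group Inc" is the antecedent
# of "USAir", with "order" and "suit" in between.
nps = ["USAir Group Inc", "order", "suit", "USAir"]
for cand, ana, label in make_training_pairs(nps, 0, 3):
    print(cand, ana, label)
# USAir Group Inc USAir positive
# order USAir negative
# suit USAir negative
```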
Statistical approaches [Soon et al. 01, Ng and Cardie 02]
Test phase [Ng and Cardie 02]:
- Extract NPs as candidates
- Input each pair of the given anaphor and one of the candidates to the decision tree
- Select the best-scored candidate as the output
In the slide's example, the candidates NP1-NP8 receive the scores -2.0, -1.1, -0.4, -1.0, -3.5, 1.5, -0.3 and -2.5, so NP6 (score 1.5) is selected as the antecedent.
Precision 78.0%, recall 64.2%: slightly better than the best-performing rule-based model at MUC-7.
We refer to Ng and Cardie's model as the baseline of our empirical evaluation.

A drawback of the previous statistical models
"Sarah went downstairs and received another curious shock, for when Glendora flapped into the dining room in her home made moccasins, Sarah asked her when she had brought coffee to her room, and Glendora said she hadn't." [Kameyama 98]
The previous models do not capture the local context appropriately:

  candidate   anaphor   output class
  Sarah       she       negative
  Glendora    she       positive

Positive and negative instances may have identical feature vectors; here both candidates yield POS: Noun, Prop_Noun: Yes, Pronoun: No, NE: PERSON, SEM_CLASS: Person, SENT_DIST: 0.

Two methods

Two methods
- Use more sophisticated linguistic cues (centering features): augment the feature set with new features inspired by Centering Theory that implement local contextual factors
- Improve the search algorithm (tournament model): a new model that makes pairwise comparisons between candidates

Centering Features
The problem is that the current feature set does not tell the difference between the two candidates (Sarah: negative, Glendora: positive, yet identical features).
In centering terms, the discourse runs from CHAIN(Cb = Cp = Sarah) at "Sarah went downstairs and received another curious shock," through a series of transitions to CHAIN(Cb = Cp = Glendora) at "she hadn't.", which separates the candidates.
- Introduce extra devices such as the forward-looking center list
- Encode state transitions on them into a set of additional features

Two methods
- Use more sophisticated linguistic cues (centering features): we augment the feature set with a set of new features inspired by Centering Theory that implement local contextual factors
- Improve the search algorithm (tournament model): we propose a new model that makes pairwise comparisons between antecedent candidates
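Before turning to the tournament model, the baseline search in the test phase above can be sketched as follows (a minimal illustration; `score` stands in for the confidence value of the induced decision tree, and all names are assumptions):

```python
# Sketch of the baseline search: score every (candidate, anaphor) pair
# independently and return the best-scored candidate.

def resolve_baseline(candidates, anaphor, score):
    return max(candidates, key=lambda cand: score(cand, anaphor))

# Toy scores mimicking the slide's diagram, where NP6 wins with 1.5.
toy_scores = {"NP1": -2.0, "NP2": -1.1, "NP3": -0.4, "NP4": -1.0,
              "NP5": -3.5, "NP6": 1.5, "NP7": -0.3, "NP8": -2.5}
print(resolve_baseline(list(toy_scores), "ANP",
                       lambda c, a: toy_scores[c]))   # -> NP6
```

Note that each candidate is scored in isolation, which is exactly why identical feature vectors (as in the Sarah/Glendora example) cannot be told apart.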
Tournament model
What we want to do is to answer the question: which is more likely to be coreferent with "she", Sarah or Glendora?
"Sarah went downstairs and received another curious shock, for when Glendora flapped into the dining room in her home made moccasins, Sarah asked her when she had brought coffee to her room, and Glendora said she hadn't."
- Conduct a tournament consisting of a series of matches in which candidates compete with each other
- Match victory is determined by a pairwise comparison between candidates, treated as a binary classification problem
- The most likely candidate is selected through a single-elimination tournament of matches

Tournament model
Training phase: in the tournament, the correct antecedent NP5 must prevail over any of the other four candidates.
- Extract four training instances
- Induce a pairwise classifier from the set of extracted training instances
- The classifier classifies a given pair of candidates into left or right; right means the right-hand side of the pair wins (is more likely to be the antecedent)

  training instance   class
  NP1  NP5  ANP       right
  NP4  NP5  ANP       right
  NP5  NP7  ANP       left
  NP5  NP8  ANP       left

Tournament model
Test phase:
1. The first match is arranged between the two candidates nearest the anaphor (NP7 and NP8)
2. Each of the following matches is arranged in turn between the winner of the previous match (e.g., NP8) and a new challenger (e.g., NP5)

Tournament model
Test phase (continued):
3. The winner is next matched against the next challenger (NP4)
4. This process is repeated until the last candidate has participated
5. The model selects the candidate that prevails through the final round as the answer (here, the antecedent NP5)
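A minimal sketch of this single-elimination search (illustrative only; `classify` stands in for the trained pairwise classifier, and the slide's chain-based skipping of some candidates is omitted):

```python
# Sketch of the tournament search: start from the candidate nearest the
# anaphor and let each earlier candidate challenge the current winner.
# classify(left, right, anaphor) returns "left" or "right", the side of
# the pair that is more likely to be the antecedent.

def resolve_tournament(candidates, anaphor, classify):
    """candidates are in document order, so we scan from the end."""
    winner = candidates[-1]                  # nearest candidate, e.g. NP8
    for challenger in reversed(candidates[:-1]):
        # the challenger precedes the current winner in the document,
        # so it always occupies the left slot of the pair
        if classify(challenger, winner, anaphor) == "left":
            winner = challenger
    return winner

# Toy classifier that always prefers NP5, reproducing the slide's example
# where NP5 prevails through the final round.
prefer_np5 = lambda left, right, anaphor: "left" if left == "NP5" else "right"
print(resolve_tournament([f"NP{i}" for i in range(1, 9)], "ANP",
                         prefer_np5))        # -> NP5
```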
Experiments

Experiments
Empirical evaluation on Japanese zero-anaphora resolution:
- Japanese does not normally use personal pronouns as anaphors; instead, it uses zero-pronouns
Comparison among four models:
1. Baseline model
2. Baseline model with centering features
3. Tournament model
4. Tournament model with centering features

Centering Features in Japanese
Japanese anaphora resolution model [Nariyama 02]:
- An expansion of Kameyama's work on the application of Centering Theory to Japanese zero-anaphora resolution
- Expands the original forward-looking center list into the Salience Reference List (SRL) to take broader contextual information into account
- Makes more use of linguistic information
In the experiments, we introduced two features to reflect the SRL-related contextual factors.

Method
Data: GDA-tagged Japanese newspaper article corpus

                                    GDA      MUC-6
  Texts                             2,176    60
  Sentences                         24,475   -
  Tags of anaphoric relation        14,743   8,946
  Tags of ellipsis (zero-anaphor)   5,966    0

As a preliminary test, we resolve only subject zero-anaphors: 2,155 instances in total.
We conduct five-fold cross-validation on this data set with support vector machines.

Feature set (see our paper for details)
1. Features simulating Ng and Cardie's feature set:
   POS, pronoun, particle, named entity, semantic class, animacy, selectional restrictions, distance between the anaphor and the candidate, number of anaphoric relations
2. Centering features:
   order in the SRL, heuristic rule of preference
3. Features capturing the relations between two candidates (introduced only in the tournament model, not in the baseline model):
   preference in the SRL between the two candidates, preference in animacy between the two candidates, distance between the two candidates

Results
Learning curves are compared for the four models: baseline model, baseline model + centering features, tournament model, and tournament model + centering features.

Results (1/3): the effect of incorporating centering features
Baseline model + centering features: 67.0%; baseline model: 64.0%.
The centering features were reasonably effective.

Results (2/3)
Tournament model: 70.8%; baseline model + centering features: 67.0%; baseline model: 64.0%.
Introducing the tournament model significantly improved the performance regardless of the size of the training data.

Results (3/3)
Tournament model: 70.8%; tournament model + centering features: 69.7%; baseline model + centering features: 67.0%; baseline model: 64.0%.
The most complex model did not outperform the tournament model without centering features; however, its improvement ratio against the data size is the best of all.

Results after cleaning the data (March '03)
Tournament model + centering features: 74.3%; tournament model: 72.5%.
On the cleaned data, the tournament model with centering features is more effective than the one without.

Conclusions
Our concern is achieving a good union between theoretical linguistic findings and corpus-based empirical methods. We presented a trainable coreference resolution model that is designed to incorporate contextual cues by means of centering features and a tournament-based search algorithm. These two improvements worked effectively in our experiments on Japanese zero-anaphora resolution.

Future Work
In Japanese zero-anaphora resolution:
1. Identification of relations between the topic and subtopics
2. Analysis of complex and quoted sentences
3. Refinement of the treatment of selectional restrictions

Tournament model (supplementary)
Training phase: in the tournament, the correct antecedent NP5 must prevail over any of the other four candidates.
- Extract four training instances: (NP1, NP5, ANP) -> right, (NP4, NP5, ANP) -> right, (NP5, NP7, ANP) -> left, (NP5, NP8, ANP) -> left
- Induce a pairwise classifier from the set of extracted training instances

Tournament model (supplementary)
Test phase: a tournament consists of a series of matches in which candidates compete with each other; in the slide's diagram, NP5 prevails through the final round and is selected as the antecedent.

Tournament model (supplementary)
What we want to do is to answer the question: which is more likely to be coreferent with "she", Sarah or Glendora? The discourse runs from CHAIN(Cb = Cp = Sarah) through a series of transitions to CHAIN(Cb = Cp = Glendora).
We implement the pairwise comparison between candidates as a binary classification problem: Sarah < Glendora for the anaphor "she".
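A minimal sketch of this training-instance extraction (illustrative; it assumes the candidate list in document order and the index of the correct antecedent are given):

```python
# Sketch of tournament-model training-instance extraction: pair the
# correct antecedent with every other candidate, keep document order
# within the pair, and label the side the antecedent occupies.

def make_tournament_instances(candidates, antecedent_idx, anaphor):
    instances = []
    ante = candidates[antecedent_idx]
    for i, cand in enumerate(candidates):
        if i == antecedent_idx:
            continue
        if i < antecedent_idx:   # antecedent is the later (right) element
            instances.append((cand, ante, anaphor, "right"))
        else:                    # antecedent is the earlier (left) element
            instances.append((ante, cand, anaphor, "left"))
    return instances

# The slide's example: NP5 is the antecedent, so NP1 and NP4 yield
# "right" instances while NP7 and NP8 yield "left" instances.
cands = ["NP1", "NP4", "NP5", "NP7", "NP8"]
for inst in make_tournament_instances(cands, 2, "ANP"):
    print(inst)
```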
Tournament model (supplementary)
"Sarah went downstairs and received another curious shock, for when Glendora flapped into the dining room in her home made moccasins, Sarah asked her when she had brought coffee to her room, and Glendora said she hadn't."
Training phase: extract the NPs (she, downstairs, Glendora, moccasins, Sarah, her, coffee, room, ...) and build training instances in which the coreferent antecedent Glendora must defeat every other candidate for the anaphor "she" (the "<" side loses):

  downstairs < Glendora   she
  moccasins  < Glendora   she
  coffee     < Glendora   she
  Sarah      < Glendora   she
  room       < Glendora   she

Conclusions (supplementary)
To incorporate linguistic cues into trainable approaches:
- Add features that take into consideration linguistic cues such as Centering Theory: centering features
- Propose a novel search model in which the candidates are compared in terms of their likelihood as antecedents: tournament model
In the Japanese zero-anaphora resolution task, the tournament model significantly outperforms earlier machine learning approaches [Ng and Cardie 02]. Incorporating linguistic cues in machine learning models is effective.

Data
GDA-tagged Japanese newspaper article corpus:

                                    GDA      MUC-6
  Texts                             2,176    60
  Sentences                         24,475   -
  Tags of anaphoric relation        14,743   8,946
  Tags of ellipsis (zero-anaphor)   5,966    0

Example of GDA-tagged text (anaphoric relations are marked by matching tag ids; the <v agt="tagid1"> tag marks an ellipted AGENT, i.e., a zero-pronoun):
<n id="tagid1">クリントン米大統領</n>の内政の最大課題のひとつである<n id="tagid2">包括犯罪対策法案</n>が十一日の下院本会議で、審議・表決に移ることを承認する動議が、反対二二五対賛成二一〇で否決された。 これで<n eq="tagid2">同法案</n>は事実上、大幅修正または廃案に追い込まれた。 <n eq="tagid1">同大統領</n>は緊急会見で怒りをあらわにして、法案の復活を要求。 <n eq="tagid1">同大統領</n>は中間選挙を前に得点を<v agt="tagid1">あげる</v>ことを目指したが、逆に大きな痛手を受けた。
(Roughly: "A motion to move the omnibus crime bill, one of President Clinton's top domestic priorities, to debate and a vote in the House plenary session on the 11th was rejected 225 to 210. The bill has thus effectively been driven to major revision or abandonment. The President angrily demanded the bill's revival at an emergency press conference. The President had aimed for (φ = he) to score points ahead of the midterm elections, but instead suffered a major blow.")
We extract 2,155 examples.

Statistical approaches [Soon et al. 01, Ng and Cardie 02] (supplementary)
- Reach a level of performance comparable to state-of-the-art rule-based systems
- Recast the task of anaphora resolution as a sequence of classification problems: classify pairs of noun phrases as positive or negative
[MUC-6] "A federal judge in Pittsburgh issued a temporary restraining order preventing Trans World Airlines from buying additional shares of USAir Group Inc. The order, requested in a suit filed by USAir, dealt another blow to TWA's bid to buy the company for $52 a share."
- Pair of the anaphor and its antecedent: positive instance
- Pairs of the anaphor and the NPs located between the anaphor and the antecedent: negative instances
(USAir Group Inc, USAir): positive; (order, USAir): negative; (suit, USAir): negative

*Centering Features
Centering Theory [Grosz 95, Walker et al. 94, Kameyama 86]:
- Part of an overall theory of discourse structure and meaning
- Two levels of discourse coherence: global and local
- Centering models the local-level component of attentional state
e.g., intrasentential centering [Kameyama 97] over: "Sarah went downstairs and received another curious shock, for when Glendora flapped into the dining room in her home made moccasins, Sarah asked her when she had brought coffee to her room, and Glendora said she hadn't."

*Centering Features in English [Kameyama 97]
  CHAIN(Cb = Cp = Sarah):           "Sarah went downstairs and received another curious shock,"
  ESTABLISH(Cb = Cp = Glendora):    "for when Glendora flapped into the dining room in her home made moccasins,"
  CHAIN(Cb = Glendora, Cp = Sarah): "Sarah asked her"
  CHAIN(Cb = Cp = Glendora):        "when she had brought coffee to her room,"
  CHAIN(Cb = NULL, Cp = Glendora):  "and Glendora said"
  CHAIN(Cb = Cp = Glendora):        "she hadn't."

*Centering Features in English [Kameyama 97]
The essence is that centering takes into account the preference between candidates. The discourse runs from CHAIN(Cb = Cp = Sarah) at "Sarah went downstairs and received another curious shock," through a series of transitions to CHAIN(Cb = Cp = Glendora) at "she hadn't.", so the Cb and Cp distinguish the two candidates. We implement this local contextual factor as the centering features.
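The paper's actual centering features for the experiments are SRL-based (see the feature-set slide above); purely as an illustration of how centering state could separate the two candidates, one could derive per-candidate features from the current Cb and Cp (a hypothetical sketch, not the authors' feature set):

```python
# Hypothetical sketch: turn the centering state carried forward through
# the discourse into per-candidate features. Feature names are invented
# for illustration; the paper's implemented features are SRL-based.

def centering_features(candidate, cb, cp):
    return {
        "IS_CB": int(candidate == cb),          # candidate is the backward-looking center
        "IS_CP": int(candidate == cp),          # candidate is the preferred center
        "CB_EQUALS_CP": int(cb is not None and cb == cp),
    }

# Before "she hadn't." the state is CHAIN(Cb = Cp = Glendora), so the
# two candidates that were indistinguishable now receive different vectors:
print(centering_features("Glendora", cb="Glendora", cp="Glendora"))
# {'IS_CB': 1, 'IS_CP': 1, 'CB_EQUALS_CP': 1}
print(centering_features("Sarah", cb="Glendora", cp="Glendora"))
# {'IS_CB': 0, 'IS_CP': 0, 'CB_EQUALS_CP': 1}
```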
*Tournament model
Test phase: a tournament consists of a series of matches in which candidates compete with each other. For the anaphor "she", the candidates (she, downstairs, shock, Glendora, room, her, moccasins, Sarah, coffee, ...) are matched in turn, and Glendora prevails as the antecedent.

Rule-based Approaches
Encoding linguistic cues into rules manually:
- Thematic roles of the candidates
- Order of the candidates
- Semantic relations between anaphors and antecedents, etc.
These approaches are influenced by Centering Theory [Grosz 95, Walker et al. 94, Kameyama 86].
Further manual refinement of rule-based models will be prohibitively costly.
The Coreference Resolution Task of the Message Understanding Conference (MUC-6/MUC-7): precision roughly 70%, recall roughly 60%.

Statistical Approaches with a Tagged Corpus
- The statistical approaches have achieved a performance comparable to the best-performing rule-based systems
- But they lack an appropriate reference to theoretical linguistic work on coherence and coreference
Goal: making a good marriage between theoretical linguistic findings and corpus-based empirical methods.

*Test Phase [Soon et al. 01]
Extract the NPs as candidates: judge, Pittsburgh, order, Trans World Airlines, ..., USAir Group Inc, order, a suit.
"A federal judge in Pittsburgh issued a temporary restraining order preventing Trans World Airlines from buying additional shares of USAir Group Inc. The order, requested in a suit filed by USAir, dealt another blow to TWA's bid to buy the company for $52 a share."
For the anaphor "USAir", the antecedent "USAir Group Inc" is selected (not "order" or "a suit").
Precision 67.3%, recall 58.6% on the MUC data set.

Improving Soon's model [Ng and Cardie 02]
- Expanding the feature set: 12 features ⇒ 53 features (POS, DEMONSTRATIVE, STRING_MATCH, NUMBER, GENDER, SEM_CLASS, DISTANCE, SYNTACTIC ROLE, ...)
- Introducing a new search algorithm

The Task of Coreference Resolution
Two processes: resolution of anaphors and resolution of antecedents.
Applications: machine translation, IR, etc.
[MUC-6] "A federal judge in Pittsburgh issued a temporary restraining order preventing Trans World Airlines from buying additional shares of USAir Group Inc. The order, requested in a suit filed by USAir, dealt another blow to TWA's bid to buy the company for $52 a share." (Coreferent NPs are marked in the same color on the slide.)
Future Work
Evaluate examples the tournament model does not deal with, such as direct quotes:
獄に下るモハンメドは妻にこう言い残した。「おれが刑務所にいる間、外で働いてはいけない」。貞節を守れ、という意味だ。さすがに刑務所で新しい子供に恵まれる可能性はないと思ったのだろうか。
(Roughly: "Before going to prison, Mohammed left his wife these words: 'While I am in prison, you must not work outside.' He meant that she should stay faithful. Did he perhaps think there was no chance of being blessed with a new child while in prison?")
At this point the SRL holds Topic: モハンメド, Focus: おれ, I-Obj: 刑務所, D-Obj: NULL, Others: 外.
The proposed methods cannot deal with such differing discourse structures.

Centering Features of Japanese
Adding the likelihood of candidates being antecedents to the features:
- In Japanese, wa-marked NPs tend to be topics, and topics tend to be omitted
Salience Reference List (SRL) [Nariyama 02]:
- Store NPs in the SRL from the beginning of the text
- Overwrite the old entity when a new entity fills the same slot
- Preference order: Topic/φ (wa) > Focus (ga) > I-Obj (ni) > D-Obj (wo) > Others
Example: "…NP1-wa NP2-wo…。 …NP3-ga…、NP4-wa…。 …NP5-ni…(φ-ga)V。" yields, sentence by sentence, the SRL states Topic: NULL ⇒ NP1 ⇒ NP4; Focus: NULL ⇒ NP3; I-Obj: NULL ⇒ NP5; D-Obj: NULL ⇒ NP2; Others: NULL.

Evaluation of the models
Introduce a confidence measure: the confidence coefficient is the classifier's value in the closest match of the tournament. In the slide's example, the matches over the candidates (President A, armistice, President B, this, action) for the anaphor "he" carry values such as 0.9, 2.4, 3.2 and 3.8.

Evaluation of the Tournament model
We investigate the tournament model (the best-performing model).

Centering Features: example
"President A proposed the armistice, but President B ignored this. And he started action."
(ドゥダエフ大統領は、正月休戦を提案したが、エリツィン・ロシア大統領はこれを黙殺し、(φ-ga)行動を開始した。)
The SRL is updated from ドゥダエフ大統領 (President Dudayev) to エリツィン・ロシア大統領 (Russian President Yeltsin), which is then preferred as the antecedent of the zero pronoun.

*Features (1/3): Ng's model and the tournament model
Features determined by a single candidate:
- POS, pronoun, particle, named entity, the number of anaphoric relations, first NP in a sentence, order in the SRL

*Features (2/3): Ng's model and the tournament model
Features determined by the pair of the anaphor and a candidate:
- Selectional restrictions: whether the candidate-anaphor pair satisfies a selectional restriction in Nihongo Goi Taikei; the log-likelihood ratio calculated from co-occurrence data
- Distance in sentences between the anaphor and the candidate

*Features (3/3): tournament model only
Features determined by the relation between the two candidates:
- Distance in sentences between the two candidates
- Animacy: whether or not each candidate belongs to the class PERSON or ORGANIZATION
- Which candidate is preferred in the SRL (e.g., Topic: NULL, Focus: NP1, I-Obj: NULL, D-Obj: NP2, Others: NULL)

Anaphoric relations
- Endophora: the antecedent exists in the context
  - Anaphora: antecedents precede anaphors
  - Cataphora: anaphors precede antecedents
- Exophora: the antecedent does not exist in the context
Variety of antecedents: noun phrase (NP), sentence, text, etc.
Many previous works deal with anaphora resolution, where antecedent-anaphor examples are the most numerous of all.

Results (examples: 2,155 ⇒ 2,681)
After modifying some tagging errors by hand, the number of examples grew from 2,155 to 2,681. Comparing the tournament model, the tournament model + centering features, Ng's model + centering features, and Ng's model on this set, the model using centering features gets worse than the model without centering features.
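As a supplementary illustration of the SRL update rule described on the "Centering Features of Japanese" slide above (the particle-to-slot mapping follows that slide; the code itself is an assumption-laden sketch, not the authors' implementation):

```python
# Minimal sketch of the Nariyama-style Salience Reference List (SRL):
# each slot keeps only the most recent NP whose case particle fills it,
# and slots are ranked Topic > Focus > I-Obj > D-Obj > Others.

SLOT_OF = {"wa": "Topic", "ga": "Focus", "ni": "I-Obj", "wo": "D-Obj"}
PREFERENCE = ["Topic", "Focus", "I-Obj", "D-Obj", "Others"]  # high -> low

def update_srl(srl, np, particle):
    srl[SLOT_OF.get(particle, "Others")] = np   # overwrite the old entity
    return srl

def most_salient(srl):
    """Return the filled slot highest in the preference order."""
    for slot in PREFERENCE:
        if srl.get(slot) is not None:
            return srl[slot]
    return None

# The slide's example: NP1-wa NP2-wo / NP3-ga NP4-wa / NP5-ni.
srl = dict.fromkeys(PREFERENCE)
for np, prt in [("NP1", "wa"), ("NP2", "wo"), ("NP3", "ga"),
                ("NP4", "wa"), ("NP5", "ni")]:
    update_srl(srl, np, prt)
print(srl)                # Topic: NP4, Focus: NP3, I-Obj: NP5, D-Obj: NP2
print(most_salient(srl))  # NP4, the preferred zero-pronoun antecedent
```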