LIMSI @ WMT’14 Medical Translation Task

Nicolas Pécheux (1,2), Li Gong (1,2), Quoc Khanh Do (1,2), Benjamin Marie (2,3), Yulia Ivanishcheva (2,4), Alexandre Allauzen (1,2), Thomas Lavergne (1,2), Jan Niehues (2), Aurélien Max (1,2), François Yvon (2)

(1) Université Paris-Sud, (2) LIMSI-CNRS, (3) Lingua et Machina, (4) Centre Cochrane français

Highlights

• Subtask of sentence translation from summaries, English → French
    In what circumstances do granulomatous and eosinophilic gastritis occur?
    What are the etiologies of dysphagia in gastroesophageal reflux disease?
• Successful approach that makes use of two flexible translation systems

Systems

Ncode: bilingual n-gram approach to SMT
OTF: on-the-fly estimation of the parameters of a standard phrase-based model
VSM: vector space model to perform domain adaptation
MIRA
Soul: continuous space models working on top of conventional language models (reranking); adapted language model (LM∗)
SysComb: combination of both systems (reranking)

Data sources

Corpus            Tokens (en)   Weight
Coppa             10M           -3
Emea              6M            26
Pattr-Abstracts   20M           22
Pattr-Claims      32M           6
Pattr-Titles      3M            4
UMLS              8M            -7
Wikipedia         17k           -5

NewsCommentary    4M            6
Europarl          54M           -7
Giga              260M          27

all               397M          33

• Combining both data sources drastically boosts performance

        medical      WMT’13       both
Devel   42.2 ± 0.1   43.0 ± 0.1   48.3 ± 0.1
Test    39.6 ± 0.1   41.0 ± 0.0   45.4 ± 0.0

BLEU scores obtained by Ncode

Part-of-Speech Tagging

• Medical data exhibit different syntactic constructions and a specific vocabulary
• We use a specific model trained on medical data

PoS tagging   Devel        Test
Standard      47.9 ± 0.0   44.8 ± 0.1
Specialized   48.3 ± 0.1   45.4 ± 0.0

Proxy Test Set

• Only a small development set is available (500 sentences)
• This makes both system design and tuning challenging
• We created an internal dev/test set (LmTest) by extracting sentences from Pattr-Abstracts

Devel        LmTest       NewsTest12   Test
48.3 ± 0.1   46.8 ± 0.1   26.2 ± 0.1   45.4 ± 0.0
41.8 ± 0.2   48.9 ± 0.1   18.5 ± 0.1   40.1 ± 0.1
39.8 ± 0.1   37.4 ± 0.2   29.0 ± 0.1   39.0 ± 0.3
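The Soul and SysComb entries under Systems are both described as reranking steps. As a rough illustration only, the sketch below rescores an n-best list by adding a weighted extra feature (standing in for something like a continuous space model score) to each hypothesis's baseline score. The `Hypothesis` class, the `rerank` function, the weights, and all scores are hypothetical and are not taken from the LIMSI systems.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    base_score: float   # log-score from the baseline system (made-up value)
    extra_score: float  # log-score from an additional model (made-up value)


def rerank(nbest, base_weight=1.0, extra_weight=0.5):
    """Return the hypotheses sorted by a weighted sum of both scores, best first."""
    def combined(hyp):
        return base_weight * hyp.base_score + extra_weight * hyp.extra_score
    return sorted(nbest, key=combined, reverse=True)


# Hypothetical 3-best list; the texts and scores are placeholders.
nbest = [
    Hypothesis("hypothesis A", base_score=-10.0, extra_score=-8.0),
    Hypothesis("hypothesis B", base_score=-10.5, extra_score=-4.0),
    Hypothesis("hypothesis C", base_score=-12.0, extra_score=-5.0),
]

best = rerank(nbest)[0]
print(best.text)  # prints "hypothesis B"
```

With `extra_weight=0.0` the baseline ranking is kept and hypothesis A stays on top, so the change in the winner comes entirely from the added feature. In a real system the weights would be tuned on a development set rather than fixed by hand.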
System               Devel   Test
Ncode                48.5    45.2
  + Soul LM∗         49.8    45.9
  + Soul LM∗ + TM    50.1    47.0
OTF                  46.6    42.5
  + VSM              46.9    42.8
  + Soul LM∗         48.4    44.2
  + Soul LM∗ + TM    49.7    44.9
SysComb              50.7    46.5

• Ncode outperforms OTF by 2.8 BLEU points
• The vector space model does not yield any improvement here
• Continuous space language models yield gains of up to 2 BLEU points
• System combination gain does not transfer to the test set

Error Analysis

Error type           SysComb   OTF+VSM+Soul
extra word           4         4
missing content      13        4
missing filler       20        31
incorrect disamb.    47        44
incorrect form       62        82
incorrect style      8         6
incorrect term       18        20
order                21        42
unknown word         1         3
unknown term         11        12
all                  205       248

Manual error analysis following (Vilar et al., 2006) for the first 100 test sentences.

Conclusions

• Moderate to high-quality translations
• Lack of an internal test set is challenging
• More careful integration of medical terminology proved necessary
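As a quick cross-check of the figures reported above, the following sketch recomputes a few BLEU differences and the per-system error totals. The numbers are copied from the results and error-analysis tables; the variable and function names are illustrative only, and the assignment of error counts to systems assumes that SysComb is the first of the two rows.

```python
# BLEU scores as (Devel, Test), copied from the results table.
bleu = {
    "Ncode": (48.5, 45.2),
    "Ncode + Soul LM* + TM": (50.1, 47.0),
    "SysComb": (50.7, 46.5),
}

# Error counts per category from the manual error analysis, in the order:
# extra word, missing content, missing filler, incorrect disamb.,
# incorrect form, incorrect style, incorrect term, order,
# unknown word, unknown term.
# Assumption: the first value of each pair belongs to SysComb.
errors = {
    "SysComb": [4, 13, 20, 47, 62, 8, 18, 21, 1, 11],
    "OTF+VSM+Soul": [4, 4, 31, 44, 82, 6, 20, 42, 3, 12],
}


def gain(system, baseline, column):
    """BLEU difference between two systems; column 0 = Devel, 1 = Test."""
    return round(bleu[system][column] - bleu[baseline][column], 1)


# Gain from the Soul models on top of Ncode, on the test set.
print(gain("Ncode + Soul LM* + TM", "Ncode", 1))    # 1.8
# System combination versus the best Ncode configuration.
print(gain("SysComb", "Ncode + Soul LM* + TM", 0))  # 0.6 on Devel
print(gain("SysComb", "Ncode + Soul LM* + TM", 1))  # -0.5 on Test

# Per-system error totals; these should match the "all" entries (205 and 248).
totals = {name: sum(counts) for name, counts in errors.items()}
print(totals)
```

The category counts add up to the reported totals, and the SysComb differences (positive on the development set, negative on the test set) agree with the statement that the combination gain does not transfer to the test set.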