Overview of MedNLP-2

Eiji Aramaki (Kyoto University)
Mizuki Morita (The University of Tokyo)
Tomoko Ohkuma (Fuji Xerox)
Yoshinobu Kano (Shizuoka University)

Why are we dealing with medical records?
• Medical records contain rich clinical information in text form
• BUT: the amount is more than one researcher can handle → requires ICT (NLP)
• Our goal is to develop the fundamental techniques for NLP in the medical field
• ALSO: we aim to develop methodology and to publish standard tools for medical NLP

Who are the organizers?
[Figure: organizer profiles, labeled NLP Researcher, Medical NLP, Bioinformatics, NLP Researcher (Fuji Xerox), Framework, Tool sharing, Company viewpoint]
• The organizers include both academic researchers and a company member
• They cover various fields (not only pure NLP/IR but also bioinformatics and framework building)
• Because MedNLP targets two aspects, computer science and medical application, this team is well suited to such multiple aspects

Overview
• Background
• Material
• Task Design
• Overview of Task 1
• Overview of Task 2
• What's Next

One & Only Non-English Medical Shared Task
• Medical shared tasks
  – ImageCLEFmed (2005-): Image
  – i2b2 NLP (2006): English
  – TREC Medical Records Track (2011): English
  – CLEFeHealth (2013): English
  – MedNLP (2011): Japanese (& non-English)
• Why are so few non-English medical tasks available?

Medical Records Contain Privacy Information
• In the US, HIPAA clearly defines what counts as privacy information: 18 items (name, telephone number, e-mail address, face photograph, ...)
  – Once the privacy information is removed, a record can be used freely → SO: many English health records are available
• In contrast, Japan is still conservative
• We do not have such a clear privacy guideline for medical text
• This is a heavy barrier to research use of medical text

To break the barrier: 2 types of dummy medical records
• (1) Dummy (virtual) records
  – We asked volunteers, who are MDs, to write records for dummy (virtual) patients
  – Then we bought the records
• (2) Exam texts
  – Question texts of the National Medical Exam (医師国家試験) for doctors
  – The exam basically consists of multiple-choice questions, like the SAT in the US or the "center shiken" in Japan
  – Most questions are given as short sentences

BUT... Exam Texts
• Some questions contain rich information on a patient; these are called "case-based questions"
• Their style is very similar to that of a clinical record
• So we converted the data into a corpus
• The conversion process is twofold:
  – Question-style expressions, such as multiple-choice options, are removed
  – We also add annotations (named entities, dates and times) to the corpus
[Figure: Part A, Medical Examination for Doctors (2005)]

Quantity & Quality of Dummy Data

Clinical domain | MedNLP-1 | MedNLP-2
--- | --- | ---
Disorders of the Alimentary Tract | 4 | 19
Liver, Biliary Tract & Pancreas | 2 | 12
Cardiovascular System | 12 | 23
Endocrinology, Metabolism & Nutrition | 5 | 17
Disorders of the Kidney & Urinary Tract | 4 | 14
Immune System & Immune-Mediated Injury | 5 | 17
Disorders of the Hematopoietic System | 1 | 13
Infectious Disease | 6 | 15
Disorders of the Respiratory System | 11 | 26
TOTAL | 50 | 156

• While MedNLP-1 does not cover several clinical domains sufficiently, MedNLP-2 covers all domains
• To validate the quality, we asked MDs and non-medical readers to classify the records of a mixed corpus (10 dummy records, 10 real records) as dummy or real

Rater | Accuracy
--- | ---
Medical (physician) (n=2) | 60.0%
Non-medical (n=3) | 56.3%

• It was hard to distinguish the two even for MDs

Task Design
• Raw text: 京大病院来院5日前から腹が痛むとのこと (the patient reports stomach pain since 5 days before visiting Kyoto University Hospital)
• De-identification: ■■大病院来院5日前から腹が痛むとのこと
• NER: ■■大病院来院5日前から腹痛とのこと
• Coding: ■■大病院来院5日前からR104とのこと
• Decision support
[Figure labels: MedNLP-1, MedNLP-2, Milestone]
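The Task Design chain above (raw text, de-identification, complaint normalization, ICD coding) can be sketched as three dictionary-lookup passes. This is only a toy illustration and not any participant's system: the three lookup tables, the function names `replace_all` and `run_pipeline`, and the single-entry dictionaries are assumptions made up to reproduce the one example sentence from the slide. Real systems need trained NER models and a full disease-name resource.

```python
from typing import Dict, Tuple

# Hypothetical one-entry lookup tables, written only to reproduce the
# slide's example sentence. They are NOT real task resources.
MASKS: Dict[str, str] = {"京大病院": "■■大病院"}    # de-identification
COMPLAINTS: Dict[str, str] = {"腹が痛む": "腹痛"}   # complaint normalization
ICD_CODES: Dict[str, str] = {"腹痛": "R104"}        # ICD coding


def replace_all(text: str, table: Dict[str, str]) -> str:
    """Replace every key of `table` found in `text`, longest key first."""
    for surface in sorted(table, key=len, reverse=True):
        text = text.replace(surface, table[surface])
    return text


def run_pipeline(text: str) -> Tuple[str, str, str]:
    """Return the de-identified, normalized, and coded versions of `text`."""
    deidentified = replace_all(text, MASKS)
    normalized = replace_all(deidentified, COMPLAINTS)
    coded = replace_all(normalized, ICD_CODES)
    return deidentified, normalized, coded


if __name__ == "__main__":
    for step in run_pipeline("京大病院来院5日前から腹が痛むとのこと"):
        print(step)
```

Each stage consumes the previous stage's output, which is why an error in an early step (for example a missed disease name) propagates to the coding step.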

What kind of task is required?
• MedNLP-2 targets the 2nd and 3rd steps
[Figure: output example for a given raw text, MedNLP-1 vs. MedNLP-2]

Participants
Participants increased!

Task | MedNLP-1 | MedNLP-2
--- | --- | ---
De-identification | 6 groups (11 systems) |
NER | 11 groups (15 systems) | 10 groups (24 systems)
ICD-coding | | 9 groups (19 systems)
Free | 1 group (1 system) | 2 groups (2 systems)

• The number of groups is the same as in the previous MedNLP-1
• The number of systems increased considerably
• Surprisingly, in total, MedNLP-2 had 12 groups and 45 systems!
• One of the most active tasks in NTCIR
• More surprisingly: the ICD-coding task, which is a medical-specific task, also received almost 20 submissions
• This indicates that NLP people pay close attention to finding a path to medical applications

List of MedNLP-2 Participants
• Academic (Japan)
  – JAIST (北陸先端科学技術大学院大学)
  – Hokkaido University (北海道大学)
  – Kyoto University (京都大学)
  – Okayama Prefectural University (岡山大学)
  – The University of Tokyo (東京大学)
  – Nara Institute of Science and Technology (奈良先端科学技術大学院大学)
  – Yasuda Women's College (安田女子大学)
• Overseas
  – National Central University (国立中央大学, Taiwan)
  – Chaoyang University of Technology (朝陽科技大学, Taiwan)
  – Nanjing University (南京大学, China)
  – Academia Sinica (中央研究院, Taiwan)
  – Dublin City University (ダブリン大学, Ireland)
• Company
  – Nihon Unisys, Ltd (日本ユニシス)
  – Hitachi, Ltd. (日立中央研究所)
  – NTT Science and Core Technology Laboratory Group (NTT研究所)
• Participants have various backgrounds, just like the organizers
• We are very happy to have five submissions from overseas → although the material is in Japanese only, the task does not depend on the language

Overview of Task 1
Extraction of complaint and diagnosis task (in short, NER task)

Two types of NER task
• (1) NER ONLY
  – Given a raw text, find the disease names
  – Example: 腹痛は認めず / 腹痛は認められず (Stomachache is not found)
• (2) NER + MODALITY
  – Given a raw text, find the disease names and their modality
  – Example: 腹痛は認めず / 腹痛は認められず (Stomachache is not found) → Negative

MedNLP-1 << MedNLP-2
• Seemingly, MedNLP-2 improved considerably
[Figure: result charts; MedNLP-1 (2011): 15 groups over baseline; MedNLP-2 (2014): 20 groups over baseline]
• The accuracy of the best system did not improve! 85% is still the maximum → we need a breakthrough
• On the other hand, the average performance improved considerably. This shows that participants had already learned the best approach from the previous MedNLP-1 and used it
→ We successfully raised the level of NLP in this field

STILL, we can improve modality detection
• In modality detection, we see divergence in performance
• Several systems suffer from negation
• Detection of suspicion is especially difficult: half of the systems score below 50% (F-measure)
• The next challenge of this task is how to deal with such rare modalities

Overview of Task 2
ICD coding task (in short, coding task)

ICD-Coding Task: 2 ways to join
• (1) TASK2ONLY
  – Given a text with disease names marked, assign ICD codes to them
• (2) TOTAL TASK
  – Given a text without any annotation, find the disease names and assign ICD codes to them

Divergence in performance
[Figure: scores range from 30% to 70%; the difference is 40%]
• A rare case in recent shared tasks

Much divergence in TASK2ONLY
[Figure: scores range from 30% to 80%; the difference is 50%]

Because
• Everything is unknown in a new task
  – What kind of tool or method is good?
    • Supervised or unsupervised
  – What kind of resource is good?
    • Extra corpus
    • Disease name dictionary
  – What is the "ICD-Coding" task all about?
    • Multi-labeling
    • Document classification
    • Term similarity design

Methods

Group | Method | Tool | Resource | Approach
--- | --- | --- | --- | ---
B | RNN | word2vec | MEDIS Hyojun Byomei Master, ICD-10 English dictionary | Supervised
C | SVM | Brown clustering, word2vec | Wikipedia | Supervised
D | Distance in ICD tree hierarchy | | MEDIS Hyojun Byomei Master | -
E | Full-text search | Lucene, Google Translate | MEDIS Hyojun Byomei Master, ICD-10 English dictionary | Unsupervised
F | Pattern match | | MEDIS Hyojun Byomei Master | Similarity design
G | Pattern match | Brown clustering | | Unsupervised
H | Logistic regression | | MEDIS Hyojun Byomei Master, LSD, T-Jisyo, MedDRA/J | -
J | Rule | | MEDIS Hyojun Byomei Master | Rule
K | Full-text search, exact match | Apache Solr | | Unsupervised

• There is much variety in tools and methods; state-of-the-art tools such as word2vec and RNN are used
• The most popular resource is the "MEDIS Hyojun Byomei Master", BUT: half of the groups do not use it
• An interesting approach (using English resources via machine translation) is used
• STILL: rule-based approaches are employed
• We'd like to discuss the details in the MedNLP session (to be held the day after tomorrow, in the morning, 9:20-). Please join us

Overview of Task 3: Free Task

What is the Free Task?
• MedNLP has a unique task, the FREE task, in which participants design their own tasks freely (any task is welcome!)
• We designed this task because we were frequently told "We'd like to join MedNLP, but the MedNLP task is NOT our target task" or "We do not have enough capability to develop NLP systems"
• To accommodate such groups, we proposed this task
• However, the "Free task" is much too open-ended
  – An NTCIR reviewer said "I'm a little pessimistic about whether anything concrete will come of this."
• I am not so pessimistic, because 2 groups joined this task and presented interesting work

F-group (Word Dictionary for Patients)
• Many medical terms are too difficult for non-medical people, including patients and NLP researchers, to understand
• To help readers understand medical words, the group built a word dictionary for non-medical people

L-group (Investigation of Dictionary Coverage)
• Does ATOK cover the corpus?

Conclusion

Summary

Item | MedNLP-1 | MedNLP-2
--- | --- | ---
Corpus amount | 50 documents | 150 documents
Material | Dummy records | Dummy records, Medical Doctor Exam
Task | De-identification, NER, Free | NER, ICD-coding, Free
# of systems | 12 groups (27 systems) | 12 groups (45 systems)

MedNLP-2 improved on MedNLP-1 by:
• Providing a larger corpus
• Designing a more complex task
• Although the number of groups is the same, the number of systems increased

Acknowledgment
• Adviser: MASUICHI Hiroshi, Ph.D.
• Annotators: SHIKATA Shuko, KUBO Kay, SHIMAMOTO Yumiko