Presentation Title: Semantic Computing and Standard Data Category Registry
9th Open Forum on Metadata Registries: Harmonization of Terminology, Ontology and Metadata
20th – 22nd March, 2006, Kobe, Japan
Day: 20060322
Slot No.: K3
Name: HASIDA Koiti (ISO/TC37/SC4/TDG3 Convener)
Organization: AIST & GSK

Semantic Gap
People and computers don't share meaning and value.
– We don't understand computers.
– Computers don't understand us.
So they cannot collaborate well.
9th Open Forum for Metadata Registry, Kobe, 2006

We Don't Understand Computers. (Computers Don't Understand Themselves, either.)
I installed Service Pack 2 on my PC running Windows XP. Since then I cannot connect to the wireless LAN. Why?
I cannot remove a strange line in MS Word.
We cannot coordinate workflow systems with each other in our intranet.

Computers Don't Understand Us.
I cannot find the information I want.
The search engine returns a lot of irrelevant information and little relevant information.
– Web sites are very hard to keep easy to use.
– The computer doesn't know what exactly I want to know.
The computer doesn't know what the Web content means.
Performance improved by banning intracorporate e-mails.
– E-mails poorly reflect contexts of real human communication.

Semantic Computing = Semantics-Oriented Architecture
Glassbox Computer
– design and operation of computer systems through semantics shared with people
– semantic model of data and process
Straightforward provision of services meaningful to people
People can understand, compose, and improve software.
– emergent total optimization by accumulation of improvements by many users

[Figure: diagram with four areas labeled Ubiquitous Info. Service, Semantic Service, Semantic Platform, and Ubiquitous Platform. Other labels: agent, device, home info. appliance, translation, enterprise project management, summarization, accounting, ITS, behavior mining, retrieval, semantic authoring, possible-world simulation, semantic Web service, ontology, ad-hoc wireless network, dialog, network robot, spatial reasoning, planning, speech, vision, multiagent architecture, semantic annotation, grid, sensor net, privacy, security]

Ontology

Ontology of Patent Claim
Each `claim’ class instance has one or more `constituent’ properties with `technology’ class instances as values.
The `claim’ class subsumes the `Jepson-type claim’ class.
[Figure labels: class (concept); property; claim; constituent+; technology; about*; description; Jepson-type claim; presupposes; other claim]

Semantic Structure of Patent Claim
[Figure: graph of a Jepson-type claim for a mass spectroscope (0)]
Nodes linked by `constituent’:
– ion source (1)
– mass analyzer (2)
– electron detector (3)
– ion-electron converter (4)
– subslit (10)
– voltage controller (12)
Descriptions linked by `about’:
– extract ion a from (1)
– (2) separates a
– (2) extracts ion b
– (4) converts b to electron c
– (3) detects c and extracts as electric signal
– place (10) between (2) and (4)
– (12) determines Vs and Vc according to V0
Other relation labels in the figure: presupposes, enables (four times), purpose, constraint
Constraint:
– Vs = V0 - k1
– Vc = V0 - k2
– V0 = ion-extraction voltage on (1)
– Vs = voltage on (10)
– Vc = converter voltage on (4)
– k1 and k2 are constants

Translation … Two-Day Work
Source (Japanese): 検索質問Qのノードxごとに、リンクy-zがデータベースDに含まれてyのラベルがLであるようなノードyとノードz∈F(x)が存在するような、ラベルLのリストを、表示部に表示する
wrong translation: displaying, on a display unit, a list of labels L in which are present a node z∈F(x) and a node y of which a link yz is contained in the database D and of which the label y is L, for each of the nodes x of a search question Q

Explicit Semantic Structure
[Figure: the same claim as a graph, shown in Japanese and in English]
Japanese:
– 検索質問Qの各ノードx
– 量化
– Lのリストを表示部に表示する
– 内包
– z∈F(x)。データベースDがリンクy-zを含む。yのラベルがLである。
English:
– each node x in retrieval query Q
– quantify
– display the list of L on the display unit
– intension
– z∈F(x). Database D contains link y-z. The label of y is L.

Semantic Authoring

The Right Question about Semantic Annotation
How to make many people do semantic annotation (in place of machines)?
How to raise intellectual productivity of people/society?

Traditional Authoring
[Figure labels: human; content; authoring; document; computer; Huge knowledge needed.; inaccurate (low accuracy); Information loss; Linearization cost]

Semantic Authoring
[Figure labels: human; content; semantic authoring; coarse-grain graphical content; fine-grain graphical content; computer; easy & accurate; accurate; low accuracy; Little information loss; No linearization cost]

Coarse-Grain Graphical Content
Result of semantic authoring
Easy for people to understand and compose
– explicit logical structure
– no intersentential order
[Figure: graph with the nodes "I was hungry.", "I had had a lunch.", "I had a snack.", and "I became full.", linked by one `concession’ edge and three `causes’ edges]

Fine-Grain Graphical Content
automatic analysis of coarse-grain graphical content
retrieval, translation, summarization, etc.
too fine for human browsing/editing
[Figure: the same content decomposed into word-level nodes (I, have, lunch, hungry, snack, become, full) linked by the role labels agt, obj, aen, and gol and by the relations concession and causes]

Semantic Authoring is Easier than Text Composition (1/2)
[Figure: the same graph as in Coarse-Grain Graphical Content, with the nodes "I was hungry.", "I had had a lunch.", "I had a snack.", and "I became full.", one `concession’ edge, and three `causes’ edges]
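The claim that coarse-grain graphical content has explicit logical structure and no intersentential order can be illustrated with a small sketch. It is only an illustration under assumptions: the class names `Relation` and `GraphicalContent` are hypothetical, and the edge endpoints are read off the text paraphrase given in the slides (lunch, concession, hungry; hungry causes snack; snack causes full). The figure shows a third `causes` edge whose endpoints are not recoverable here, so it is left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """One labeled edge between two short text nodes (hypothetical name)."""
    source: str
    label: str
    target: str


@dataclass
class GraphicalContent:
    """A set of labeled edges; a set has no order, so neither does the content."""
    relations: set = field(default_factory=set)

    def add(self, source, label, target):
        self.relations.add(Relation(source, label, target))
        return self  # allow chained calls

    def targets(self, source, label):
        """Nodes reached from `source` by one edge carrying `label`."""
        return sorted(r.target for r in self.relations
                      if r.source == source and r.label == label)

    def chain(self, start, label="causes"):
        """Breadth-first walk along `label` edges, starting at `start`."""
        path, seen, frontier = [], {start}, [start]
        while frontier:
            node = frontier.pop(0)
            path.append(node)
            for nxt in self.targets(node, label):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return path


LUNCH, HUNGRY = "I had had a lunch.", "I was hungry."
SNACK, FULL = "I had a snack.", "I became full."

# Assumed edges, taken from the text paraphrase of the graph.
content = (GraphicalContent()
           .add(LUNCH, "concession", HUNGRY)
           .add(HUNGRY, "causes", SNACK)
           .add(SNACK, "causes", FULL))

# The same edges entered in a different order.
reordered = (GraphicalContent()
             .add(SNACK, "causes", FULL)
             .add(LUNCH, "concession", HUNGRY)
             .add(HUNGRY, "causes", SNACK))

print(content == reordered)    # order of entry does not matter
print(content.chain(HUNGRY))   # inference chain along `causes`
```

The two objects compare equal because the content is a set of edges rather than a sequence of sentences, and `chain` follows the `causes` edges directly instead of recovering them from connectives such as "so" or "then".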
Semantic Authoring is Easier than Text Composition (2/2)
A text synonymous with the graph on the previous page:
* I had had a lunch. But I was hungry, and so I had a snack. Then I became full.
This relation is hard to reflect in the text.
I had had a lunch but I was hungry. So I had a snack. Then I became full.

Semantic Authoring
Authoring based on ontologies, together with explicit semantic structures
Easier authoring of better content than with MS Word, etc.
Accurate semantic structure in resulting content
– short text in box
– rhetorical structure
– anaphora/coreference

Improvement of Document Quality by Idea Processor
Yagishita’s (1998) experiment
Compose network-type content by idea processor
Compose text based on the network-type content
Fewer oversights – more points covered
Deeper thoughts – longer inference chains

Traditional Idea Processor
No standardized relations
– Only the author or the participants of the brainstorming can understand.
– hard to share and reuse
Cost of text composition
– big apparent cost → limited spread

Semantic Authoring
Standardization of relations
– ISO/TC37/SC4/TDG3
– easy to share and reuse
– retrieval, summarization, translation, etc.
Automatic text generation
– small cost → wide spread

Scalability
[Figure labels: section; paragraph (three times)]

Upgrading Semantic Levels in Software Architecture
– window system → semantic authoring
– operating system → semantic platform
– file system → RDF database

ISO/TC37/SC4/TDG3 Semantic Content Representation

ISO/TC37 Terminology and Other Language Resources
SC1: Principles and Methods
SC2: Terminography and Lexicography
SC3: Computer Applications for Terminology
– ISO12620: Data Categories
SC4: Language Resources Management

ISO/TC37/SC4 Language Resources Management
– Chair: Laurent Romary
– Secretariat: Key-Sun Choi
WG1: Basic descriptors and mechanisms for language resources (Laurent Romary)
WG2: Representation schemes (Kiyong Lee)
– Multimodal meaning representation scheme
WG3: Multilingual text representation
WG4: Lexical resources/database (Nicoletta Calzolari)
WG5: Workflow of LR management

ISO/TC37/SC4/Ad Hoc TDGs (TDG: Thematic Domain Group)
TDG1: Metadata (Peter Wittenburg)
TDG2: Morphosyntax (Gil Francopoulo)
TDG3: Semantic Content Representation (Koiti Hasida)
– Discourse relations (Koiti Hasida)
– Dialogue acts (Harry Bunt)
– Referential structures and links (Laurent Romary)
– Logico-semantic relations (Scott Farrar)
– Temporal entities and relations (Kiyong Lee)
– Semantic roles and argument structure (Thierry Declerck)
– More?

Expected Products
Not ISs (International Standards) in ISO’s official sense
But Standard Registries of Data Categories
– discourse relations, dialogue acts, etc.
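A registry of data categories can be pictured as a lookup table of named entries against which annotation labels are checked. The following is a minimal sketch under assumptions, not the structure of any registry that TDG3 has defined: the class names `DataCategory` and `Registry`, the root entry `discourse-relation`, and the placement of `causes` and `concession` under it are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataCategory:
    """One registry entry (hypothetical layout)."""
    name: str
    area: str                      # e.g. "discourse relations"
    parent: Optional[str] = None   # broader category, if any


class Registry:
    def __init__(self):
        self._entries = {}

    def register(self, name, area, parent=None):
        """Add an entry; the parent must already be registered."""
        if name in self._entries:
            raise ValueError(f"duplicate data category: {name}")
        if parent is not None and parent not in self._entries:
            raise KeyError(f"unknown parent: {parent}")
        self._entries[name] = DataCategory(name, area, parent)

    def unregistered(self, labels):
        """Labels used in some content that the registry does not define."""
        return sorted(set(labels) - set(self._entries))

    def ancestors(self, name):
        """Broader categories of `name`, nearest first."""
        chain = []
        parent = self._entries[name].parent
        while parent is not None:
            chain.append(parent)
            parent = self._entries[parent].parent
        return chain


registry = Registry()
# Assumed entries, for illustration only.
registry.register("discourse-relation", area="discourse relations")
registry.register("causes", area="discourse relations", parent="discourse-relation")
registry.register("concession", area="discourse relations", parent="discourse-relation")

print(registry.unregistered(["causes", "concession", "elaborates"]))
print(registry.ancestors("causes"))
```

Checking labels against a shared table like this is one way to picture why standardized relations are easier to share and reuse than relations invented by each author.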
Scope of TDG3
Semantics, Abstracting Syntax Away
– Semantic DCs usable with various annotation schemes
  • We’re not writing annotation manuals.
– We don’t care about syntax-semantics mapping, syntactic markup and markables, etc.
Deliverables
– Concrete Data Category Registries
  • semantic types of function words/morphemes and their taxonomy
– not full dictionaries or encyclopedias
– Documents on These DCs

Criteria on DC Registry
Purpose
– annotation/interpretation
  • Inter-Annotator Agreement
– authoring/composition/description
  • Descriptive Convenience
General Requirement
– ease of selection
  • clarity and coverage

Collaborative Semantic Authoring

Discussion-Supporting Groupware
[Figure: discussion graph]
Nodes:
– How to eliminate illegal bike-parking?
– Remove illegally parked bikes immediately.
– Prepare more bike-parking lots.
– That is not profitable.
– We have to keep them for six months.
– We don't have enough space to keep them.
Relation labels in the figure: solution (twice), con (twice), causes (twice)

Collaborative Semantic Authoring
Traditional Groupware
– IBIS, Coordinator, Open Meeting, etc.
– improved efficiency and quality of discussion
  • reduced redundancy
  • simultaneous utterances
  • better coverage of important points
  • deeper discussion
– weakness: usable only for group work
Collaborative SA
– seamless unification of individual SA as a major usual task and group work
– the above merits + advanced retrieval, summarization, etc.

From e-Mails to Collaborative SA
Perspicuous semantic structure develops.
No spam.
TODO
– user-account maintenance

Knowledge-Circulating Society

Knowledge Circulation
social sharing, reuse, and extended reproduction of knowledge
participation of everybody in every situation
[Figure labels: general public; users; producers; consumers; mediators; provision of knowledge; shared DB; acquisition of knowledge]

Semantic Enterprise System
System Design and Operation Based on Business-Process Semantics
Incremental and emergent total optimization (in the sense of Enterprise Architecture)
– accumulation of improvements by users
– Integration of business operation, regulation, and computer system
Transparent and fair procurement

Knowledge Circulation in Research
(Past)
Knowledge-Circulation period > 2 years
Papers are hard to read/write.
[Figure labels: research; writing paper; submission; review; publication; evaluation]
(Future)
Collaborative creation of huge graphical content
Publication of sentences rather than papers
Fast knowledge circulation
– In a week?
Evaluation better than IF and CI
– Network analysis
[Figure labels: visualization; retrieval, translation, summarization]

e-Knowledge Government
Limitation of representative system
– increasing diversity and complexity of social problems
Involvement of all the citizens
– collection and analysis of public opinions and knowledge
– policy making and consensus building
Given effective discussion by all the people:
– no need for representative/indirect democracy
– compositional democracy (KAWAKITA Jiro)
– deliberative democracy
IT-based support
– retrieval, summarization, translation, etc.
– Weblog not sufficient
  • no systematic support for the formation of long inference chains