Informationsintegration Übung 2 SS 2016 Yvonne Lichtblau Wettbewerb Übung 1 Yvonne Lichtblau – Informationsintegration SS 2016 2 Top Verdiener Sport 2014 Top Verdiener Sport 2014 10 9 8 Floyd Mayweather Manny Paquiao Christiao Roaldo Lyonel Messy Roger Federer LeBron James Kevin Durant Phil Michelson Tiger Woods Kobe Bryant 7 6 5 4 3 2 US PH PT AR CH US US US US US 1 0 Gruppe1 Gruppe2 Gruppe3 Gruppe4 Gruppe5 Gruppe6 Yvonne Lichtblau – Informationsintegration SS 2016 Gruppe8 3 5 7 5 3 7 6 6 6 7 7 Medaillengewinner Frauen Sport 200m Rücken Medaillengewinner Frauen Sport 200m Rücken 10 9 8 Diana Mocanu Roxana Mărăcineanu Miki Nakao Kirsty Coventry Stanislawa Komarowa Antje Buscschulte Reiko Nakamura Margaret Hoelzer Krisztina Egerszegi Elaine Tanner 7 6 5 4 3 2 RO RO JP ZW RU DE JP US HU CA 1 0 Gruppe1 Gruppe2 Gruppe3 Gruppe4 Gruppe5 Gruppe6 Yvonne Lichtblau – Informationsintegration SS 2016 Gruppe8 4 4 5 7 7 3 6 7 6 7 5 Top Fußballer Instagram Top Fußballer Instagram 10 9 Cristiano Ronaldo Neymar Junior Lionel Messi David Beckham James Rodriguez Gareth Bale David Luiz Zlatan Ibrahimovic Ronaldinho Luis Su├írez 8 7 6 5 4 3 2 PT BR AR GB CO GB BR SE BR UY 1 0 Gruppe1 Gruppe2 Gruppe3 Gruppe4 Gruppe5 Gruppe6 Gruppe8 UK = GB Yvonne Lichtblau – Informationsintegration SS 2016 5 6 5 6 5 5 4 6 6 7 6 Formel 1 nach Anzahl Titeln Formel 1 nach Anzahl Titeln 10 9 8 Micha Schumacher Juan Manuel Fangio Alain Prost Sebastian Vettel Jack Brabham Jackie Stewart Niki Lauda Nelson Piquet Ayrton Senna Lewis Hamilton 7 6 5 4 3 2 1 DE AR FR DE AU GB AT BR BR GB 0 Gruppe1 Gruppe2 Gruppe3 Gruppe4 Gruppe5 Gruppe6 Yvonne Lichtblau – Informationsintegration SS 2016 Gruppe8 6 5 6 6 7 5 4 7 7 6 3 Top 10 Athleten Frauen 2009 Top 10 Athleten Frauen 2009 10 9 8 Serena Williams Annika Sörenstam Yelena Isinbayeva Justine Henin Lisa Leslie Venus Williams Marta Lorena Ochoa Diana Taurasi Kerri Walsh 7 6 5 4 3 2 1 US SE RU BE US US BR MX US US 0 Gruppe1 Gruppe2 Gruppe3 Gruppe4 Gruppe5 Gruppe6 Yvonne Lichtblau – Informationsintegration SS 2016 Gruppe8 7 6 6 4 6 7 5 3 5 6 5 Triathleten mit meisten Olympiasiegen Triathleten mit meisten Olympiasiegen 10 9 8 Bevan Docherty Stephan Vuckovic Jan Frodeno Simon Whitfield Sven Riederer Jan Rehula Hamish Carter Eneko Llanos Andreas Raelert Pete Jacobs 7 6 5 4 3 2 1 NZ DE DE CA CH CZ NZ ES DE AU 0 Gruppe1 Gruppe2 Gruppe3 Gruppe4 Gruppe5 Gruppe6 Yvonne Lichtblau – Informationsintegration SS 2016 Gruppe8 8 6 5 7 4 4 3 7 5 6 3 Rallye Mexico 2016 Rallye Mexico 2016 10 9 8 Seebastien Ogier Jari Matti Latvala Thierry Neuville Dani Sordo Mads Ostberg Eric Camilli Andreas Mikkelsen Ott T├ñnak Benito Guerra jr. Hayden Paddon 7 6 5 4 3 2 1 FR FI BE ES NO FR NO EE MX NZ 0 Gruppe1 Gruppe2 Gruppe3 Gruppe4 Gruppe5 Gruppe6 Yvonne Lichtblau – Informationsintegration SS 2016 Gruppe8 9 3 6 3 4 3 2 3 5 2 4 World Cup Dressur Gewinner World Cup Dressur Gewinner 10 9 8 Valentina Truppa IT Tinne Vilhelmson-Silfv SE Anky van Grunsven NL Isabell Werth DE Steffen Peters US Edward Gal NL Adelinde Cornelissen NL Helen Langehanenberg DE Charlotte Dujardin GB Peter Minderhoud NL 7 6 5 4 3 2 1 0 Gruppe1 Gruppe2 Gruppe3 Gruppe4 Gruppe5 Gruppe6 Yvonne Lichtblau – Informationsintegration SS 2016 Gruppe8 10 3 4 2 5 2 3 4 4 3 1 Segeln Olympia 2012 Segeln Olympia 2012 10 9 8 Tom Slingsby Pavlos Kontides Rasmus Myrgrend Tonči Stipanović Andrew Murdoch Simon Grotelöschen Alejandro Foglia Juan Ignacio Maegli Toni Wilhelm Richard Stauffacher 7 6 5 4 3 2 AU CY SE HR NZ DE UY GT DE CH 1 0 Gruppe1 Gruppe2 Gruppe3 Gruppe4 Gruppe5 Gruppe6 Yvonne Lichtblau – Informationsintegration SS 2016 Gruppe8 11 3 2 4 2 1 4 6 2 4 3 Erfolgreiche E-Sportler Erfolgreiche E-Sportler 10 9 8 Saahil Arora Sumail Hassan Jiao Wang Zhang Pan Clement Ivanov Gustav Magnusson Danil Ishutin Kuro Takhasomi Lee Jae-dong Manuel Schenkhuizen 7 6 5 4 3 2 1 US PK CN CN EE SE UA DE KR NL 0 Gruppe1 Gruppe2 Gruppe3 Gruppe4 Gruppe5 Gruppe6 Yvonne Lichtblau – Informationsintegration SS 2016 Gruppe8 12 2 2 4 2 1 3 0 4 3 2 Wettbewerb - Korrektheit Gesamtergebnis 100 90 80 70 60 50 40 30 20 10 0 Gruppe1 Gruppe2 Gruppe3 Gruppe4 Gruppe5 Gruppe6 Gruppe8 1. Platz: Gruppe8 (93%) 2. Platz: Gruppe1 (78%) 3. Platz: Gruppe4 (73%) Yvonne Lichtblau – Informationsintegration SS 2016 13 Wettbewerb - Geschwindigkeit benötigte Zeit für 100 Personen 140 120 Zeit in sec 100 80 60 40 20 0 Gruppe 1 Gruppe2 Gruppe3 Gruppe4 Gruppe5 Gruppe6 Gruppe8 1. Platz: Gruppe1 (6.4s) 2. Platz: Gruppe2 (16,6s) 3. Platz: Gruppe5 (43,4s) Yvonne Lichtblau – Informationsintegration SS 2016 14 Vorstellung Lösungen Übung 1 Yvonne Lichtblau – Informationsintegration SS 2016 15 Übung 2 Database Setup/ Basic Queries Yvonne Lichtblau – Informationsintegration SS 2016 16 Background We start with genes, their location, and their function Types of information: ● Genes: Have a taxon ID (organism), have an ID, have a preferred name (symbol), have multiple other names (synonyms), have multiple functional annotations, have a connected protein (with a protein_id and a protein_version_id), have a status, are on a chromosome, have a start and end position, and a chromosomal location Yvonne Lichtblau – Informationsintegration SS 2016 http://thinnergene.com 17 Background ● ● Genes function: Are described by a taxonomy of terms which forms a DAG; each term has an ID, a name, a description, and can be annotated to multiple genes (Gene Ontology) Gene – function relationship: Has an evidence code Yvonne Lichtblau – Informationsintegration SS 2016 18 Task 1: Create the Schema ● ● Create a relational model for the information described on the previous page Implement this model in the Postgres database ● ● One account per group, access information will be sent Login from gruenau2: psql h delphi U gruppeX p 5432 ● Datataypes: Look at the data ● ● Be conservative Length: Guess a reasonable number which leaves some buffer Yvonne Lichtblau – Informationsintegration SS 2016 19 Task 2: Fill the schema (partly) ● From the web page download the following files: ● ● ● ● ● ● gene2refseq (see slide 17) [tax_id, geneID, status, protein_accession, protein_accession_version, start, end] gene_info (see slide 17) [tax_id, geneID, symbol, synonyms, chromosome, map location] gene2go (see slide 18) [tax_id, geneID, goID, evidence] README Import the given attributes into your schema ● This requires some processing: ● gene2refseq and gene_info should be merged ● gene_info.synonyms should be normalized The DAG structure of the GO will become relevant in assignment 3, but is not considered in this exercise!! Yvonne Lichtblau – Informationsintegration SS 2016 20 Task 3: Some queries ● Formulate and answer the following queries: ● How many genes does your gene table have? (13,644,002) ● How many relationships between genes and a GO term are there? ● How many distinct GO terms are annotated to at least one gene? ● How many gene names are present (symbol or snyonym)? ● How many synonyms are assigned to more than one gene? Yvonne Lichtblau – Informationsintegration SS 2016 21 Submit ● ● ● ● By Monday, 23.05.2016, 23:59 pm Send by mail to: [email protected] (questions are also welcome!) Submit: ● Create Table statements ● Relational schema as image (tables as boxes, foreign key – primary key links as arcs) ● Queries and answers of task 3 ● Notes of your workflow as PDF (preprocessing and import of tables) Criteria for passing the exercise: Submit everything from the list and correct answer of each query. Yvonne Lichtblau – Informationsintegration SS 2016 22
© Copyright 2024 ExpyDoc