
PICASSO Big Data Expert Group
Sören Auer
© Fraunhofer-Institut für Intelligente
Analyse- und Informationssysteme
IAIS
The three Big Data „V“ – Variety is often neglected
Source: Gesellschaft für Informatik
Semantic Web Layer Cake 2001
• Monolithic, based on XML
• Focus on heavyweight semantics (ontologies, logic, reasoning)
http://www.w3.org/2001/10/03-sww-1/slide7-0.html
The Semantic Web Layer Cake 2015: “A Little Semantics Goes a Long Way”
• Lingua franca of data integration with many technology interfaces (XML, HTML, JSON, CSV, RDB, …)
• Focus on lightweight vocabularies, rules, thesauri etc.
• Less “invasive”
Figure labels (layer cake diagram):
• Unicode, URIs
• XML, JSON, CSV, RDB, HTML
• RDF/XML, JSON-LD, CSV2RDF, R2RML, RDFa
• RDF, RDF-Schema, RDF Data Shapes, SPARQL
• Vocabularies, SKOS thesauri, SWRL rules, ontologies, logic
• (Access control), signature, encryption (HTTPS/CERT/DANE)
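To make the "lingua franca" idea concrete, here is a minimal sketch of lifting CSV rows into RDF triples serialized as N-Triples. It is an illustration only, not the CSV2RDF or CSVW standard mapping: the namespace `http://example.org/`, the function name `csv_to_ntriples` and the sample columns are assumptions, and column names are assumed to be usable as IRI path segments without escaping.

```python
import csv
import io

# Hypothetical namespace, used only for this illustration.
BASE = "http://example.org/"


def csv_to_ntriples(csv_text, id_column):
    """Turn each CSV row into N-Triples lines, one triple per non-empty cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The key column identifies the resource described by the row.
        subject = f"<{BASE}resource/{row[id_column]}>"
        for column, value in row.items():
            if column == id_column or not value:
                continue  # skip the key itself and empty or missing cells
            predicate = f"<{BASE}vocab/{column}>"
            # Escape backslashes and quotes so the literal stays valid.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} {predicate} "{literal}" .')
    return triples


sample = "id,name,city\n1,Alice,Bonn\n2,Bob,\n"
for line in csv_to_ntriples(sample, "id"):
    print(line)
```

The same pattern (pick a subject per record, turn each field into a predicate) applies to JSON objects or relational rows, which is what makes RDF usable as a common target format.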
INTEGRATING BIG DATA & LINKED DATA
Blueprint of the Data Aggregator Platform
Follows typical Lambda Architecture
Figure labels (Lambda Architecture diagram): Input data; Stream; Transactions …; Batch Layer; Batch View; Speed Layer; Real-time View; In-stream Mining; Data Storage; Real-time data & message passing; BDE Platform & Intelligence; Big Data Analytics; Domain-specific BDE apps; Spatial; Social; Statistical; Temporal; Transactional; Imagery; Real-time dashboards; Applications & Showcases
Integrated on top of existing Big Data distribution
+ Semantic layer (retaining semantics using the Linked Data approach)
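The Lambda Architecture pattern named on this slide can be sketched in a few lines. This is a toy illustration, not the platform's implementation: the class `LambdaCounter` and its methods are hypothetical, the "views" are simple per-key event counts, and events arriving while a batch run is in progress are ignored for simplicity.

```python
from collections import Counter


class LambdaCounter:
    """Toy Lambda Architecture that counts events per key."""

    def __init__(self):
        self.master_log = []            # append-only input data
        self.batch_view = Counter()     # precomputed from the full log
        self.realtime_view = Counter()  # increments since the last batch run

    def ingest(self, key):
        # New events go to the immutable log and to the speed layer.
        self.master_log.append(key)
        self.realtime_view[key] += 1

    def run_batch(self):
        # The batch layer recomputes its view from scratch; the speed
        # layer can then drop what the batch view now covers.
        self.batch_view = Counter(self.master_log)
        self.realtime_view = Counter()

    def query(self, key):
        # Queries merge the batch view with the real-time view.
        return self.batch_view[key] + self.realtime_view[key]


counter = LambdaCounter()
counter.ingest("sensor-a")
counter.ingest("sensor-a")
counter.run_batch()
counter.ingest("sensor-a")
print(counter.query("sensor-a"))  # 2 from the batch view + 1 from the real-time view
```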
Adding a Semantic Layer to Data Lakes
Outbound and Consumption: Accounting, Management Accounting, Regulatory Reporting, Risk, Treasury
Frontend to Access Relationship and KPI Definition / Documentation
Frontend to Access (ad hoc) Reports
Outbound Data Delivery to Target Systems
Knowledge Graph for Relationship Definition and Meta Data
Semantic Data Lake
• central place for model, schema and data historization
• Combination of Scale Out (cost reduction) and semantics (increased control & flexibility)
• grows incrementally (pay-as-you-go)
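The "pay-as-you-go" growth of a semantic data lake can be illustrated with a small sketch: raw records are stored unchanged, and mappings from source fields to vocabulary terms are added over time. Everything here is an assumption made for illustration (the class `SemanticDataLake`, the term `ex:customerName`, the source and field names); a real deployment would keep the mappings in a knowledge graph rather than a dictionary.

```python
class SemanticDataLake:
    """Raw store plus an incrementally growing semantic mapping layer."""

    def __init__(self):
        self.raw_store = []  # (source, record) pairs, kept unchanged
        self.mappings = {}   # (source, field) -> vocabulary term

    def ingest(self, source, record):
        # Inbound data lands in the raw store without any transformation.
        self.raw_store.append((source, dict(record)))

    def add_mapping(self, source, field, term):
        # Semantics are added later, one field at a time.
        self.mappings[(source, field)] = term

    def values_for(self, term):
        # Query by vocabulary term; only already mapped fields contribute.
        results = []
        for source, record in self.raw_store:
            for field, value in record.items():
                if self.mappings.get((source, field)) == term:
                    results.append((source, value))
        return results


lake = SemanticDataLake()
lake.ingest("crm", {"cust_name": "ACME"})
lake.ingest("erp", {"kunde": "ACME GmbH"})

lake.add_mapping("crm", "cust_name", "ex:customerName")
print(lake.values_for("ex:customerName"))  # only the CRM value so far

lake.add_mapping("erp", "kunde", "ex:customerName")
print(lake.values_for("ex:customerName"))  # now both sources answer
```

The raw data never has to be reloaded when a mapping is added, which is the point of combining a cheap scale-out store with a semantic layer.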
Inbound: XML2RDF, JSON-LD, CSVW, R2RML
Data Lake (order of magnitude cheaper scalable data store)
Inbound Raw Data Store
Data Sources
[1] Wrobel, Voss, Köhler, Beyer, Auer: Big Data, Big Opportunities - Anwendungssituation und Forschungsbedarf.7 Informa
[2] Debattista, Lange, Scerri, Auer: Linked 'Big' Data. IEEE/ACM Big Data Computing BDC 2015: 92-98
INDUSTRIAL DATA SPACE
Vocabulary-based Integration facilitates Data-driven Businesses
Vocabulary
The work on the Industrial Data Space is complementary to and closely interlinked with the Plattform Industrie 4.0
Figure labels:
• Industrie 4.0: focus on the manufacturing industry
• Insurance 4.0, Retail 4.0, Bank 4.0, …
• Smart Services
• Industrial Data Space: focus on data
• Data
• Transmission, networks
• Real-time systems
• …
The Industrial Data Space Initiative
Community of >30 large German and European companies
Pre-competitive, publicly funded innovation project involving 11 Fraunhofer institutes to develop the IDS reference architecture
Current signatories of the MoU to support the Industrial Data Space Association
Semantic Data Linking for Enterprise Data Value Chains
Data Lake: centralized, monopolistic
Industrial Data Space: federated, secure, „trusted“, standards-based
Pure Internet: completely decentralized, open, insecure
Comparison (values listed in the order Data Lake / Industrial Data Space / Pure Internet):
• Data management: central repository / decentralized / decentralized
• Data ownership: central / decentralized / decentralized
• Data linking: single provider / federated, on demand / missing
• Data security: bilateral / certified system / bilateral
• Market structure: central provider / role system / unstructured
• Transport infrastructure: Internet / Internet / Internet
Images: © Fotolia / Francesco De Paoli, Nmedia, hakandogu
Basic principles of the Industrial Data Space
• On-demand interlinking
• Linked Light Semantics
• Security with Industrial Data Container
• Certified roles
Images: © Fotolia 77260795 ∙ 73040142 ∙ 58947296 ∙ 68898041
Industrial Data Space:
On Demand Interlinking
All data stays with its owners and remains controlled and secured. Data is shared only on request for a service. No central platform.
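The principle can be sketched as follows: each enterprise keeps its own data and decides per request whether a service may read it, so no central store is involved. This is a minimal illustration, not the Industrial Data Space protocol; the class `Enterprise`, its methods, and the dataset name `stock_levels` are hypothetical.

```python
class Enterprise:
    """Data owner that shares a dataset only on request and only if allowed."""

    def __init__(self, name):
        self.name = name
        self._data = {}    # dataset name -> payload, held by the owner
        self._policy = {}  # dataset name -> services allowed to read it

    def publish(self, dataset, payload, allowed_services):
        # Publishing only registers the data locally with its usage policy.
        self._data[dataset] = payload
        self._policy[dataset] = set(allowed_services)

    def request(self, service, dataset):
        # Data leaves the enterprise only for an authorized service request.
        if service not in self._policy.get(dataset, set()):
            raise PermissionError(
                f"{service} may not access {dataset} of {self.name}"
            )
        return self._data[dataset]


enterprise_1 = Enterprise("Enterprise 1")
enterprise_1.publish("stock_levels", [12, 7], allowed_services={"Service A"})

print(enterprise_1.request("Service A", "stock_levels"))
try:
    enterprise_1.request("Service B", "stock_levels")
except PermissionError as error:
    print("denied:", error)
```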
Figure labels: Enterprise 1 to Enterprise 6; Service A to Service G
Image sources: iStockphoto
Linked Light Semantics
A lightweight approach to data interlinking
Classical enterprise systems: fixed data schema; global enforcement; closed; manual transformation; high cost
Linked Light Semantics: reference vocabularies; bridge between local representations; intelligent and structured interlinking; automatic translation/mapping; lightweight
Internet / WWW: web pages; only links; completely open; lack of standardization; no structure
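How a reference vocabulary can bridge two local representations is sketched below: each party maps its own field names to shared terms once, and records are then translated automatically via those terms. The vocabulary terms (`ref:productId`, `ref:quantity`), the field names and the function names are assumptions made for this illustration. Fields without a mapping are simply dropped.

```python
# Hypothetical local schemas, each mapped to the same reference vocabulary.
MAPPING_A = {"partNo": "ref:productId", "qty": "ref:quantity"}
MAPPING_B = {"artikel_nr": "ref:productId", "menge": "ref:quantity"}


def to_reference(record, mapping):
    """Lift a local record to reference vocabulary terms."""
    return {mapping[f]: v for f, v in record.items() if f in mapping}


def from_reference(ref_record, mapping):
    """Lower a reference-level record into a local representation."""
    inverse = {term: field for field, term in mapping.items()}
    return {inverse[t]: v for t, v in ref_record.items() if t in inverse}


def translate(record, source_mapping, target_mapping):
    """Translate between two local schemas via the shared vocabulary."""
    return from_reference(to_reference(record, source_mapping), target_mapping)


order_a = {"partNo": "X-42", "qty": 3, "internal_flag": True}
print(translate(order_a, MAPPING_A, MAPPING_B))
```

With n participants this needs n mappings to the reference vocabulary instead of a separate manual transformation for every pair of systems.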
Source: istockphoto.com
--- CONFIDENTIAL ---
IDS Architecture Overview
Figure components: Industrial Data Space Broker; Industrial Data Space Registry; Industrial Data Space App Store; Clearing; Vocabulary; Apps; Index; Company A; Company B; Third Party; Cloud Provider; Internal IDS Connector (twice); External IDS Connector (twice); Internet
Connection labels: Upload; Download; Upload / Download; Upload / Download / Search
Industry 4.0
Semantic Models as Bridge between Shop & Office Floor
Semantic Administrative Shell &
Reference Architecture for Industry 4.0 (RAMI4.0)
The Administrative Shell (Verwaltungsschale) provides a digital identity for arbitrary Industry 4.0 components (e.g. sensors, actuators/robots), exposing data covering the whole life-cycle
The Reference Architecture for Industry 4.0 (RAMI4.0) provides a conceptual framework for implementing comprehensive Industry 4.0 scenarios
We have implemented both concepts, along with a number of IEC and ISO standards, in a comprehensive information model ready to be deployed in production environments
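As a rough illustration of the Administrative Shell idea, the sketch below gives a component a globally unique identifier and collects data per life-cycle phase. It is not the information model mentioned above nor any standardized metamodel: the class `AdministrationShell`, the phase names and the property names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AdministrationShell:
    """Digital identity of a component plus data per life-cycle phase."""

    identifier: str   # globally unique identity, e.g. a URI
    asset_type: str   # kind of component, e.g. a sensor or robot
    lifecycle_data: dict = field(default_factory=dict)  # phase -> properties

    def record(self, phase, **properties):
        # Attach or update properties for one life-cycle phase.
        self.lifecycle_data.setdefault(phase, {}).update(properties)

    def describe(self):
        # Expose identity and life-cycle data in one structure.
        return {
            "id": self.identifier,
            "type": self.asset_type,
            "lifecycle": self.lifecycle_data,
        }


shell = AdministrationShell("http://example.org/assets/sensor-17", "TemperatureSensor")
shell.record("engineering", measuring_range="-40..125 C")
shell.record("operation", location="Hall 3")
print(shell.describe())
```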
Summary
Challenges and Opportunities - Interoperability and Standardization
• Adding a semantic layer to Big Data technology
• Integrating Linked Data and Big Data technology
• Towards Enterprise Knowledge Graphs and Data Spaces
• Applications e.g. in Manufacturing, Cultural Heritage, Finance