Data Quality, Master Data Management & Data Governance
Addressing the challenges using Oracle Enterprise Data Quality (EDQ)
Presented by: Bryn Davies, Managing Director, InfoBluePrint (Pty) Ltd

Introducing InfoBluePrint
InfoBluePrint is a company dedicated to helping businesses optimally manage their critical business information as a valuable corporate asset. We are uniquely and exclusively focused on DATA and we supply specialized Data Quality and Information Management Services.
www.infoblueprint.co.za

Our Focus Areas
Data Migrations - ECTL
© 2012 InfoBluePrint

Example Business Areas Requiring a Data Quality Focus
• Customer Relationship Management
• Customer Centricity & Single View
• Contactability
• Online/Digital Strategy
• Governance, Risk and Compliance
• POPI
• Watchlist Screening
• Supplier & Procurement Management
• Marketing & Sales
• Human Resources
• Business Intelligence
• EPM
• System Consolidations, New Systems, Data Migrations

Data Categorised
DATA
• Transactional Data
• Master Data: provides context to transactional data
  – Characteristic Data: defines individual characteristics of the master entity being defined
  – Reference Data: defines fixed value domain data for classification
Example:
Name         Address      Town  Region  ID             Status  Level  Amount
B.T. Davies  25 Short St  CT    West    6012315053081  Active  Gold   R251.87

Common Data Problems – Customer Information
• Typically key data for people and businesses such as:
  – Name, title
  – ID related information (ID no, company registration no, VAT no)
  – Address/location information (physical, postal, delivery, billing)
  – Contact details/types (address, phone, cell, email, website)
  – Status (e.g. active vs inactive)
  – Product/Service type/class
  – Marketing attributes, e.g. LSM, demographics, psychographics
• Inconsistent spelling, formatting and structure
• Incorrect
• Out of date
• Missing
• Duplication of customers within and across different silos
  – Account Centric rather than Customer Centric
  – Lack of cross-correlation and hierarchies: inability to achieve single view, householding, reliable segmentation

Consequences of Unreliable Customer Information
• Impossible to create a Single View of Customer
• Unable to communicate
• Wrong/inappropriate communication
• Ineffective marketing
• Unreliable reports
• Poor decisions
• Wrong decisions!
• Inefficiency – wasted time and effort doing “scrap and rework”
• Frustrated employees
• Issue resolution takes longer – costs more
• Frustrated customers
• Complex, difficult new system roll-outs (over time/budget due to data issues)
• Never ending “data clean-ups”
• Legal and compliance problems (e.g. PoPI, Sanctions/PEP, FATCA…)

InfoBluePrint’s 10 Key Points for Data Quality
1. (Quality) Data is an Asset
2. Data is a Product
3. Quality is Defined by the Data Customer
4. Data Quality is not just about Dirty Data
5. Data Quality Measurement is Mandatory
6. Don’t do DQ without DQ Software
7. Data Quality is Ultimately the Responsibility of Business
8. Get the First DQ Project Right First Time
9. Understand the Human Element
10. Doing Nothing will Cost you the Most

A Database is like a Lake*
• The water is the data
• The streams represent business processes feeding the database
• Factories upstream are sources of pollutants
• Information users drink the water
*Analogy courtesy Tom Redman

A Data Quality Management Framework for Success

Introducing Oracle Enterprise Data Quality (EDQ)
DQ-Based Solutions: Business Solutions, Domain Knowledge
• Pre-Built Solutions
  – Any scope – components to end-to-end solutions
  – Any pre-built/reusable item: processes, methods; knowledge, reference data; application integration
• Data Quality Platform
  – Complete range of DQ capabilities
  – Best-of-breed capabilities for party and product data
  – Easy to use, intuitive
  – Open, tunable, flexible
Delivered by: Customer-delivered, Partner-delivered, Oracle-delivered
Enterprise Data Quality: Governance Dashboards, Match/Merge, Product Data Extensions, Party Data Extensions, Standardization, Profile and Audit

Introducing Oracle Enterprise Data Quality (EDQ)
Enterprise DQ Platform, Common Access/UI
• Govern: monitor effectiveness & resolve problems
  – Process metrics
  – Quality metrics
  – Case Management
  – Remediation
• Match: identify & merge duplicates
  – Party (individuals, households) match
  – Semantic (category) match
  – Entity match
  – Statistical match
  – Match review
  – Merge/survivorship
• Standardize: drive conformance to standards
  – Global parse
  – Category parse
  – Extract
  – Transform
  – Substitute
  – Address verification & geocoding
  – Enrich
  – Classify
• Profile: quickly understand data content
  – Statistics
  – Patterns
  – Phrases
  – Duplicates
  – Completeness
  – Max/min values

Oracle EDQ – Denoise, Parse, Standardise
• Standardize, Transform and Parse
• Split names and name elements
• Identify individuals and businesses
• Derive additional attributes
Name: Dr Ellen Van Der Heijde
  Title: Dr | First: Ellen | Last: Van Der Heijde | Gender: Female
Name: Mr RJ & Mrs FB MacDonald
  Title: Mr | First: R | Middle: J | Last: MacDonald | Gender: Male
  Title: Mrs | First: F | Middle: B | Last: MacDonald | Gender: Female
Name: Jalila Abdul-Alim (Do Not Call)
  First: Jalila | Last: Abdul-Alim | Gender: Female | Note: Do Not Call

Oracle EDQ – Match, Merge, Enrich
• Match & Merge data from disparate sources
• Create ‘best’ record based on survivorship rules
Source record: Title: Mr | First: Robert | Last: Fulmar | Gender: Male | DoB: 12/05/1978 | Phone: 555-120-1329 | Address: 9405 Main St Fairfax Virginia 22030
Source record: First: Bob | Last: Fulmar | Gender: Male | Email: [email protected]
Source record: Title: Dr | First: R | Last: Fulmer | DoB: 01/01/1978 | Email: [email protected] | Address: 9407 Main Street Fairfax VA 22031-4001
Merged record: Title: Dr | First: Robert | Last: Fulmar | Gender: Male | DoB: 12/05/1978 | Email: [email protected] | Phone: 555-120-1329 | Address: 9407 Main St Fairfax VA 22031-4001

EDQ Roles
• Executives & Stakeholders
• Business Analysts
• Data Analysts
• Developers
• Data Stewards
• Match Reviewers
• Managers & Executives

Main EDQ Console, Focused on the User

Key Feature: Pre-built Processors
• Comprehensive DQ Functionality with a Single User Interface and Repository

Immediate drill-down to examine real data
Drill-down to see actual data values and determine required rules, standards etc.
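The three names on the Denoise, Parse, Standardise slide can be approximated in a few lines of Python. This is only a minimal sketch of the idea and not a description of how EDQ's own processors work: the title list, the first-name gender lookup and the function name parse_name are all assumptions made for illustration.

```python
import re

# Hypothetical reference data: title -> implied gender (None = not implied).
TITLES = {"mr": "Male", "mrs": "Female", "ms": "Female", "dr": None, "prof": None}
# Hypothetical first-name lookup, used only when the title implies no gender.
GENDER_BY_FIRST_NAME = {"ellen": "Female", "jalila": "Female"}


def parse_name(raw):
    """Split a free-text name into one record per person."""
    # Denoise: pull out a parenthesised comment such as "(Do Not Call)".
    note_match = re.search(r"\(([^)]*)\)", raw)
    note = note_match.group(1).strip() if note_match else None
    cleaned = re.sub(r"\([^)]*\)", " ", raw)

    people = []
    for part in cleaned.split("&"):
        tokens = part.split()
        person = {"title": None, "first": None, "middle": None,
                  "last": None, "gender": None, "note": note}
        if tokens and tokens[0].rstrip(".").lower() in TITLES:
            person["title"] = tokens.pop(0).rstrip(".")
        if tokens:
            head = tokens.pop(0)
            if head.isupper() and len(head) <= 3:
                # A run of initials, e.g. "RJ" -> first "R", middle "J".
                person["first"] = head[0]
                person["middle"] = head[1:] or None
            else:
                person["first"] = head
        person["last"] = " ".join(tokens) or None
        people.append(person)

    # Joint names ("Mr RJ & Mrs FB MacDonald") share the last surname given.
    shared_last = next((p["last"] for p in reversed(people) if p["last"]), None)
    for p in people:
        p["last"] = p["last"] or shared_last
        gender = TITLES.get((p["title"] or "").lower())
        if gender is None and p["first"]:
            gender = GENDER_BY_FIRST_NAME.get(p["first"].lower())
        p["gender"] = gender
    return people
```

Calling parse_name("Mr RJ & Mrs FB MacDonald") yields two records that share the surname MacDonald, one Male and one Female. Real-world parsing needs far larger reference data and language-aware rules, which is the argument for key point 6 above.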
EDQ Inbuilt Case Management for Governance
Review and resolve exceptions from the DQ process
Usage
• Cases/alerts are assigned a work queue and a priority
• Data specialists sign in and review/resolve issues
• Management reports allow monitoring of work queues and productivity
• Helpful for
  o One-time cleanse/migration
  o Ongoing governance program
Features
• Hierarchical Case/alert functionality
• Configurable Workflows
• Automatic prioritization of cases/alerts
• Timers
• Email Notification Support
• Comprehensive audit trail
• Immediate ad-hoc reporting

Example Dashboard

Batch and Online Data Quality Deployment
Clean your data – then keep it clean in real-time
Applications: CRM, Asset Management, Planned Maintenance, Billing, Service, Finance

Realtime DQ with EDQ Web Services
Enforce common DQ standards across the enterprise
• Applications: App 1, App 2, App 3
• Common Services: library of enterprise standard DQ services
• Any EDQ process may be called as a real-time web service
Call any process from any application to
1. Enforce common standards
2. Minimize architectural changes

InfoBluePrint’s Generic EDQ Processes for SA Party Data
• Party Type Classification: Natural Person vs Juristic Entity (Individual vs Organisation)
  – “Consumer” vs “Business” override rules can be incorporated
• Sub-Type Classification: SA ID, Temp Visa, Private Co., Trust, NGO, Medical, School etc.
• “Secondary Location” Parsing and Derivation of Missing Data
• Parsing, Cleansing, Standardising:
  – Name, Legal, Trading, Maiden
  – ID, Co Reg
  – Addresses – all classes
  – Telephone – all classes
  – Email
  – Banking details
  – And more
• Householding – various categorisations (e.g. name, address, email, banking etc.)
• PAMSS Data Preparation & Processing

Challenges with SA Address Data
• No Address Standards in SA (SANS1883 pending)
• SAPO data is generally unreliable
• PAMSS is very basic and used only for bulk mailing discounts
• Postcode system has very low granularity
• Postcode system highly ambiguous – no distinctions in hierarchies of city/town/suburb
• Informal addresses are plentiful
• Multiple languages used
• No National Address Database (NAD) – several commercial versions available at a cost – varying degrees of reliability
• Several PAMSS and “Address Cleansing” vendors – varying degrees of reliability, mostly offsite and not integrated into your environment

InfoBluePrint EDQ Address Classification
Invalid & Intl
• Strip for IDs, Names, Tel Nos.
• Invalid
• International (classify and parse)
BOX
• Classify Primary Indicator
• Classify Secondary Indicator
• Classify Language
BUILDING
• Classify Primary Indicator (if Null)
• Classify Language
FARM
• Classify Primary Indicator (if Null)
• Classify Language
STREET
• Classify Primary Indicator (if Null)
• Classify Language
CORNER OF
• Classify Primary Indicator (if Null)
• Classify Language
PLOT/ERF/SITE
• Classify Primary Indicator (if Null)
• Classify Secondary Indicator
• Classify Language
OTHERS
• Add Primary Indicator (if Null)
• Default language to English

Data Enrichment in SA
• Many data suppliers – generally:
  – Marketing (list brokers)
  – Credit Bureaus (sell data that they collect for various purposes)
• Some data suppliers do not have legal sources
• Many claims of attributes available prove to be false (low population)
• Varying degrees of reliability, especially with respect to currency of data
• Careful consideration is required as most supply on a subscription basis
• Some bureaus offer a service to manually collect and/or validate missing data
• Due diligence on SA data suppliers available as part of our service
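The address classes named on the Address Classification slide (BOX, BUILDING, FARM, STREET, CORNER OF, PLOT/ERF/SITE, OTHERS) lend themselves to a rule-based first pass. The sketch below is hypothetical and is not InfoBluePrint's EDQ implementation: the keyword lists, the order in which rules are tried and the function name classify_address are assumptions, and the stripping of IDs, names and telephone numbers and the handling of international addresses are left out.

```python
import re

# Hypothetical keyword rules, tried in order; the first match wins.
RULES = [
    ("BOX", r"\b(?:P\.?\s?O\.?\s?BOX|POSBUS|PRIVATE BAG|PRIVAATSAK)\b"),
    ("CORNER OF", r"\b(?:CORNER|CNR|HOEK VAN)\b"),
    ("PLOT/ERF/SITE", r"\b(?:PLOT|ERF|SITE)\b"),
    ("FARM", r"\b(?:FARM|PLAAS)\b"),
    ("BUILDING", r"\b(?:BUILDING|FLAT|GEBOU|WOONSTEL)\b"),
    ("STREET", r"\b(?:STREET|ST|ROAD|RD|AVENUE|AVE|STRAAT|WEG|LAAN)\b"),
]
# Hypothetical Afrikaans indicator words used for the language classification.
AFRIKAANS = re.compile(
    r"\b(?:POSBUS|PRIVAATSAK|HOEK VAN|PLAAS|GEBOU|WOONSTEL|STRAAT|WEG|LAAN)\b"
)


def classify_address(line):
    """Return (address class, language) for one free-text address line."""
    text = re.sub(r"\s+", " ", line or "").strip().upper()
    if not re.search(r"[A-Z0-9]", text):
        return ("INVALID", None)
    language = "Afrikaans" if AFRIKAANS.search(text) else "English"
    for category, pattern in RULES:
        if re.search(pattern, text):
            return (category, language)
    # No indicator found: class OTHERS, language defaults to English.
    return ("OTHERS", "English")
```

For example, classify_address("Posbus 123") returns ("BOX", "Afrikaans") and classify_address("25 Short St") returns ("STREET", "English"). Keyword rules like these misfire easily (a bare "ST" may mean "Saint"), so a production rule set would need much richer reference data.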
Data Governance
• Data Governance is not about governing data – it is about governing the people and processes that touch the data
• Data Governance is not a product, a service or a project – it is a formal organisational programme
“Management is the decisions you make. Governance is the structure for making them.” (CIO Magazine)

Law & Order Data Governance
• LAW: The system of rules and procedures for governing data.
• ORDER: Automation for monitoring & enforcing the rules and procedures to use and protect the data.
PEOPLE, PROCESS, TECHNOLOGY

Data Governance: Organisational Model
“Data Governance is the exercise of authority and control (planning, monitoring and enforcement) over the management of data assets” (DAMA DMBoK)
Data Governance Steering Committee (DG Steerco)
• Approves strategy and direction
• Resolves escalated issues
• Co-exists with other strategic Steerco’s
Data Governance Council (DG Council)
• Approve enterprise data definitions
• Formulate data governance program decisions
• Ratify principles, standards, policies & processes
• Strategic issue resolution
• Encourage and facilitate change
Data Governance Office
• The face of data governance across the enterprise
• Implements strategic data governance transformation
• Incorporated within the Data Governance Council
Data Steward Teams
• Point of contact for daily data issues
• Subject matter experts
• Supplies data stewards
• Day to day consumers of data

EDQ and Data Governance
Data Governance Capabilities for Data Stewards & Stakeholders
• Data Flow Explorer
• Sources, Data Flows, Targets
• Quality KPIs
• Case & Issue Management
• Exception Review
• Metadata Management
• Business Glossary
Oracle OpenWorld 2014

Master Data Management

DQ Spans MDM and Data Integration
[Diagram. Labels, grouped where the grouping is evident:]
• Applications: Oracle Applications, 3rd Party Applications, Custom Applications, Oracle BI/EPM, Watchlist Screening
• Services: Data Services, Transaction Processing Services, Business Intelligence Services, Content Management Services, Collaboration Services, Web and Event Services, SOA
• Information Management
  – Oracle MDM: Customer Hub, Product Hub, Supplier Hub, Financial Hub, Site Hub
  – Enterprise Data Quality: Profiling, Standardization, Match/Merge
  – Oracle Data Integration: ETL/E-LT, Data Federation, Replication, Transformation, Synchronization
• Storage: OLTP System, Data Warehouse/Data Mart, OLAP Cube

MDM Strategy Development
MDM Scope
• Business Goals
• Data Types
• Processing Requirements
MDM Business Solution
• Solution Functional Components
• Solution Patterns
• Integration Requirements
MDM Roadmap
• Functional Component Dependencies
• Business Benefit Realisation
MDM Technical Solution
• Technology
• Implemented Solution

MDM Scope

MDM Solution
[Diagram. Labels, grouped where the grouping is evident:]
• Front Ends: Customers, Inter Office, Email, Business, WEB, Admin; Master Data Direct Update and Enquiry – CRM Front End, Admin
• DATA MIGRATION (Initial): Extraction, Cleansing, Augmentation, Load
• Applications: NEW (Operational), LEGACY (Operational), NEW (Analytical), LEGACY (Analytical)
• eRA Master Data
• ENABLERS (Ongoing): Application Integration Adapters, Data Integration Adapters, Data Mapping, Matching, Merging
• DATA ALIGNMENT (Ongoing): eRA Data To Master, eRA Data From Master, Synchronisation, Broadcast Services
• SUPPORT SERVICES (Ongoing): Backup & Recovery, Business Continuity

Critical Interdependencies
DG needs DQ because:
• DQ provides the framework, processes and artefacts for measuring and managing data improvement
• DQ provides supporting artefacts and processes
• DQ monitoring is a dashboard for Data Governance effectiveness
MDM needs DQ because:
• Initial migration must take on quality master data (and external data)
• Consistency in format/value/rules is required
• DQ of hub data must be controlled and known!
DQ needs DG because:
• DG provides the structures for the preventive part of DQ, e.g. people/process
• DG drives out metadata issues
• DG provides direction and policies required to manage the data
DQ needs MDM because:
• MDM provides the technology platform for persisted quality data
• MDM forces enterprise view of Data Quality
• MDM drives common data models and hierarchies (e.g. party)
MDM needs DG because:
• DG resolves people & process issues for MDM
• DG drives ownership and stewardship
• DG forces preventive measures to be in place
DG needs MDM because:
• MDM provides a physical data DMZ
• MDM drives new roles and responsibilities for data
• MDM provides a technical platform to support Data Governance

Examples of What’s Needed to Get Data Under Control
[Matrix. Columns: Data Governance, Data Quality, MDM. Entries are listed per row in column order.]
• People: DG Steerco; Data Council; Data Stewards; DQ Forums; Data Quality Specialists; Data Analysts; DQ Tools Skills; Data Architect; Data Modellers; MDM Tools Skills
• Process, Practices, Artefacts: Master Data Inventory; Data Steward Matrix; Data Policies; Data issues: rules for identification, categorisation, prioritisation; Issue Resolution Workflows; Business Rules; DQ Assessments; DQ Improvement Processes & Systems; DQ Monitoring; MDM Models; MDM Hierarchies; Validations; System of Entry vs System of Record; Workflows
• Technology: DG Policy Admin; Metadata Repository; Workflow Tool; Rules Repository; Data Quality Tool – batch and realtime; MDM Hub(s); Data Integration; Data Quality

www.infoblueprint.co.za
[email protected]
http://www.oracle.com/technetwork/middleware/oedq/overview/index.html

InfoBluePrint Clients

A Data Quality Management Framework for Success
• Managing Data Quality properly requires both corrective and preventive actions. This approach provides:
  − A top-down, prevention-focused approach that defines and implements the management practices needed for sustainability.
  − A bottom-up, correction-based approach that addresses the problems already identified.
• Before starting Correction activities it is important to put in place the supporting governance and processes that will enable improvements:
  − Effective prioritisation
  − Adequate management
  − Appropriate monitoring and reporting
  − Consistent corrective action
• Ongoing Monitoring is required to highlight:
  − Improvements after correction activities
  − Incidence of new issues identified for the first time

InfoBluePrint Data Quality Improvement (DQI) - Process

InfoBluePrint DQ Assessment – Measure Data Quality
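Measuring data quality, as the assessment step calls for, usually starts with simple per-field profiling: completeness, distinct values, value patterns and min/max. The following is a minimal sketch of such a measurement in plain Python, not the EDQ profiler and not InfoBluePrint's assessment method; the function names value_pattern and profile and the record layout are assumptions made for illustration.

```python
import re
from collections import Counter


def value_pattern(value):
    """Reduce a value to its shape: letters become A, digits become 9."""
    shape = re.sub(r"[A-Za-z]", "A", str(value))
    return re.sub(r"\d", "9", shape)


def profile(records, field):
    """Profile one field across a list of dict records (string values assumed)."""
    values = [record.get(field) for record in records]
    present = [v for v in values if v not in (None, "")]
    return {
        # Share of records in which the field is populated.
        "completeness": len(present) / len(values) if values else 0.0,
        "distinct": len(set(present)),
        # Frequency of each value shape, e.g. "9999999999999" for a 13-digit ID.
        "patterns": Counter(value_pattern(v) for v in present),
        "min": min(present, default=None),
        "max": max(present, default=None),
    }
```

Run over a field such as an ID number, an unexpected pattern (for example "99-99-99" alongside "9999999999999") or a completeness below 1.0 points directly at records to drill into. Tracking the same figures after each correction cycle gives the ongoing monitoring described above.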