THE DATA LAKE DREAM
Edd Dumbill • @edd • [email protected] • svds.com/StrataNY2014
© 2014 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.

WHAT IS A DATA LAKE?
A scalable, accessible repository of data (in its natural or processed state)
[Diagram labels: DW, Analytics, Hadoop]

CONVENTIONAL DATA STRATEGY
“WHAT YOU DO TO DATA”
• CLEAN
• VALIDATE
• CONTROL
• PROTECT

MODERN DATA STRATEGY
“WHAT YOU DO WITH DATA”
• ATTRACT NEW CUSTOMERS
• TARGET VIP CUSTOMERS
• AUTOMATE

[Chart labels: growth potential; uncertainty; well understood systems; big data applications]

TOWARDS THE “DATA LAKE”: Step 1
[Diagram labels: DW]

TOWARDS THE “DATA LAKE”: Step 2
[Diagram labels: DW, Hadoop, Analytics]

TOWARDS THE “DATA LAKE”: Step 3
[Diagram labels: DW, Hadoop, Analytics]

TOWARDS THE “DATA LAKE”: Step 4
[Diagram labels: DW, Analytics, Hadoop]

UP vs. OUT: Enterprise Edition
Different use cases put different demands on the data infrastructure.
Increasing cost per unit of capability from scale-up architectures causes rationing of resources. Only the most valuable use cases are pursued.
[Chart labels: US Dollars; Data Resource Usage; Scale-up cost; Scale-out cost; use cases UC1 to UC5]

THE DATA VALUE CHAIN
DRAW VALUE FROM YOUR STRATEGIC DATA ASSETS
• Discover
• Ingest
• Process
• Persist
• Integrate
• Analyze
• Expose

BUILD FOR EXPERIMENTS
• Make it cheap
• Failure as a feature
• Ask good questions
• Make it quick
• Both learning and adaptation
• Enable the feedback loop
• Don’t break things
• Make operations a platform for innovation
• APIs, platforms, simulation

THE EXPERIMENTAL ENTERPRISE
Data science allows us to observe our experiments and respond to the changing environment.
We need to both support investigative work and build a solid layer for production.
The foundation of the experimental enterprise focuses on making infrastructure readily accessible.

Edd Dumbill • [email protected] • @edd • @SVDataScience
Yes, we’re hiring! [email protected]
Want these slides? Go to: svds.com/StrataNY2014