© 2014
The Emerging Data Lake IT Strategy
An Evolving Approach for Dealing with Big Data & Changing Environments
Thomas Kelly, Practice Director
Cognizant Technology Solutions
Sean Martin, Founder and CTO
Cambridge Semantics
We’re living in an amazing world of information sharing,
connecting with family, neighbors, vendors, and customers
all over the world
Telling the world
about what we like
and don’t like
… is now following Cognizant Technology Solutions
and Cambridge Semantics
What we’re doing and how we’re succeeding
We’re deciding what advertising we want to see…
… and what we don’t
how business
and customers
Many businesses have emerged that embrace this model of
customer engagement
and we’ve said Goodbye to businesses that didn’t
10 million stays in 2013, without owning a hotel
Grew to nearly $75B in annual retail revenue in 2013, without opening a storefront
Shares over 40 million photos each day
Engaging in a more personalized shopping experience, retailers are building a stronger relationship with each customer
Customer Service
Delivering a positive and
successful experience for
each customer
Life Sciences and Healthcare
Combining health, genetic,
clinical, and public sciences
data to bring effective
therapies to patients sooner
Financial Services
Delivering innovative
products and services,
based on a 360° view of
the Customer, across all
business lines, engaging
all available data assets,
internal and external
The Challenges That We're Addressing
Onboarding and Integrating Data is Slow and Expensive
• Transforming data from a growing variety of technologies
• Custom coded ETL
• Existing ETL processes are not reusable
• Optimization for analytics is time-consuming and costly
• Often wait until there is a defined need for a set of data, delaying benefits realization while waiting to onboard the data
Data Provenance is Often Poorly Recorded
• Data meaning is “lost in translation”
• Data transformations tracked in spreadsheets
• Post-onboarding, maintenance and analysis cost for onboarded data is high
• Recreating data lineage is manual, time-consuming, and error-prone
The Challenges That We're Addressing
Target Data is Difficult to Consume
• Optimization favors known analytics, but not well suited to new requirements
• A one-size-fits-all canonical view is used rather than fit-for-purpose views
• Or, lacks a conceptual model to easily consume the target data
• Difficult to identify what data is available, how to get access, and how to integrate the data to answer a question
Industrializing the Big Data Environment is Difficult to Manage
• Proliferation of data silos leads to inconsistency/syncing issues
• Conflicting objectives of opening access to data assets while managing security and privacy requirements
• Velocity of business change rapidly invalidates data organization and analytics
• Managing the integration/interaction with the multiple data management technologies that make up the Big Data environment
The Data Lake is made up of four key
Data Lake Management
Data Management
Query Management
• Low Cost, High Performance Storage
• Flexible, Easy-to-Use Data Organization
• Performance-Optimized Analytics
• Automation of most manual Development and Query Activities
• Self-Service End-User Features
• Intelligent Processing
Data Ingestion
Diagram labels: Data Lake Management; Data Management; Query Management; Data Sources; Linked Data; Internet of Things (IoT); Batch Load; Desktop and Mobile; Social Media and
Data Management
Diagram labels: Data Lake Management; Data Management; Query Management; Data Sources; Linked Data; Internet of Things (IoT); In Memory; Batch Load; Desktop and Mobile; NoSQL; Map Reduce; HDFS Storage; Structured and Unstructured Data; Social Media and
Data Lake Management
Data Assets
Data Mappings
• Source-to-Target
• Transformations
• Internal and External Data Assets
• Defined Data Orgs
taxonomies, thesauri)
• Authorization and Access Rules
• Rule-based Security
• Group, Role, and User Level
• Auditable Access
• Processes
• Schedules
• Provenance
• Business Unit Data Organization and Terms
• Optimized to Assist
• Monitor and Manage Data Lake Operations
Data Governance
• Focus on Shared Data
• Standard Models
• Controlled Vocabulary
• Common Definitions
• Standards-based Data
Diagram labels: Data Management; Query Management; Data Sources; Linked Data; Internet of Things (IoT); In Memory; Batch Load; Desktop and Mobile; NoSQL; Map Reduce; HDFS Storage; Structured and Unstructured Data; Social Media and
Query Management
Query Data, Metadata, and Provenance
Capture and Share Analytics Expertise
Semantic Search
Analytics Directed to the Best Query Engine
Data Discovery
Diagram labels: Data Lake Management; Data Management; Query Management; Data Sources; Linked Data; Internet of Things (IoT); In Memory; Batch Load; Desktop and Mobile; NoSQL; Map Reduce; HDFS Storage; Structured and Unstructured Data; Social Media and
Semantic Technology Delivers “Smart” Data
Integrates a network of internal and external data assets,
insulating end users from the details of the underlying
Captures expertise (logic, inferencing) and integrates it with
the data, delivering “smart” data to non-expert users
Manages a comprehensive inventory of the data assets
Secures access to the right data assets by the right users
Key W3C Standards in Semantic Technology
Resource Description Framework (RDF)
Framework for storing and integrating data and data definitions in the form of subject-predicate-object expressions, or “triples”. Relationships are organized in a logical graph model.
Reduced development time and cost; faster time-to-business value.
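As an illustration of the subject-predicate-object idea, the sketch below stores a few triples in plain Python and indexes them as a graph. It is not an RDF library and makes no claim about how any product stores triples; the identifiers (customer:42, ex:placed, and so on) and the build_graph helper are made up for this example.

```python
from collections import defaultdict

# Each fact is one (subject, predicate, object) triple.
# All identifiers below are invented for illustration.
triples = [
    ("customer:42", "rdf:type", "ex:Customer"),
    ("customer:42", "ex:name", "Alice"),
    ("customer:42", "ex:placed", "order:1001"),
    ("order:1001", "rdf:type", "ex:Order"),
    ("order:1001", "ex:total", 250),
]


def build_graph(facts):
    """Index triples as subject -> predicate -> list of objects."""
    graph = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in facts:
        graph[subject][predicate].append(obj)
    return graph


graph = build_graph(triples)

# Extending the data needs no schema change: just add another triple.
graph["customer:42"]["ex:email"].append("alice@example.com")

# Follow the relationship customer -> order -> total through the graph.
orders = graph["customer:42"]["ex:placed"]
order_totals = [graph[order]["ex:total"][0] for order in orders]
```

Adding the e-mail address required no change to a table definition, which is the kind of flexibility the slide attributes to the graph model.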
Web Ontology Language (OWL)
An ontology is a comprehensive model of data definitions and relationships that is human- and machine-readable. Ontologies are inheritable and extensible.
Improved application quality, flexible iterative / investigative approach, easily adapts to business change.
SPARQL Query Language
SQL-like query language for semantic data that can leverage the ontological relationships and constructs to execute smarter queries. Access multiple internal and external databases simultaneously in a single query.
Access and integrate data across business silos.
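To show in principle what a single query across silos looks like, here is a minimal pure-Python sketch of triple pattern matching, the core idea behind SPARQL graph patterns. It is a toy stand-in, not SPARQL itself; the two sources (crm, orders), the ex: predicates, and the match and query helpers are assumptions for illustration.

```python
def match(pattern, triple, binding):
    """Unify one pattern with one triple; return the extended binding or None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if isinstance(term, str) and term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def query(patterns, triples):
    """Return every variable binding that satisfies all patterns (a join)."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return bindings


# Two hypothetical silos that share customer identifiers.
crm = [("cust:1", "ex:name", "Alice"), ("cust:2", "ex:name", "Bob")]
orders = [
    ("order:10", "ex:placedBy", "cust:1"),
    ("order:10", "ex:total", 250),
    ("order:11", "ex:placedBy", "cust:2"),
    ("order:11", "ex:total", 40),
]

patterns = [
    ("?order", "ex:placedBy", "?cust"),
    ("?cust", "ex:name", "?name"),
    ("?order", "ex:total", "?total"),
]
results = query(patterns, crm + orders)
big_spenders = sorted(r["?name"] for r in results if r["?total"] > 100)
```

The shared variable ?cust is what joins the two sources; neither silo needs to know about the other's structure.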
Reasoning over data through business rules. Expertise is captured and embedded in the ontology model, accessible through user queries. This is the “smart” in Smart Data.
Easier end user access to expertise; intelligent systems
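A small sketch may help make "reasoning over data" concrete. The function below forward-chains a single rule (class membership is inherited through rdfs:subClassOf) until no new facts appear. Real reasoners support far richer rule sets; the class names, the instance cust:1, and the infer_types helper are hypothetical.

```python
def infer_types(triples):
    """Forward-chain one rule until no new facts appear:
    (x rdf:type C) and (C rdfs:subClassOf D)  =>  (x rdf:type D)
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current facts so the set can grow inside the loops.
        subclass = [(s, o) for s, p, o in facts if p == "rdfs:subClassOf"]
        typed = [(s, o) for s, p, o in facts if p == "rdf:type"]
        for instance, cls in typed:
            for child, parent in subclass:
                derived = (instance, "rdf:type", parent)
                if cls == child and derived not in facts:
                    facts.add(derived)
                    changed = True
    return facts


model = [
    ("ex:PremiumCustomer", "rdfs:subClassOf", "ex:Customer"),
    ("ex:Customer", "rdfs:subClassOf", "ex:Party"),
    ("cust:1", "rdf:type", "ex:PremiumCustomer"),
]
inferred = infer_types(model)

# A user asking for "all customers" now finds cust:1 without knowing
# that it was recorded only as a premium customer.
customers = sorted(s for s, p, o in inferred if p == "rdf:type" and o == "ex:Customer")
```

The expertise (a premium customer is a customer) lives in the model rather than in each report, so a query written by a non-expert benefits from it.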
Linked Data
Connects data contained in different databases, allowing queries to find, share and combine data so insights can be identified across the Web.
Connect disparate databases to navigate and integrate data regardless of location or technology platform.
RDB to RDF Mapping Language (R2RML)
Preserving current investments in relational technology, R2RML maps relational data to an ontology. SPARQL can query RDF and relational databases
Low cost of entry to use Semantic Technology to deliver high-value solutions
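The idea of mapping relational data to an ontology can be sketched with the standard-library sqlite3 module. The mapping dictionary below is a hypothetical, simplified stand-in and does not use real R2RML syntax; the table, column, and predicate names are invented for illustration.

```python
import sqlite3

# A throwaway relational source with one hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Alice", "Boston"), (2, "Bob", "Austin")],
)

# Simplified mapping: which table, how to mint a subject identifier,
# which class the rows belong to, and which predicate each column becomes.
mapping = {
    "table": "customer",
    "subject_template": "cust:{id}",
    "class": "ex:Customer",
    "columns": {"name": "ex:name", "city": "ex:city"},
}


def rows_to_triples(connection, rule):
    """Turn each relational row into triples according to the mapping rule."""
    # The table name comes from our own trusted mapping, not from user input.
    cursor = connection.execute(f"SELECT * FROM {rule['table']} ORDER BY id")
    column_names = [description[0] for description in cursor.description]
    result = []
    for values in cursor:
        row = dict(zip(column_names, values))
        subject = rule["subject_template"].format(**row)
        result.append((subject, "rdf:type", rule["class"]))
        for column, predicate in rule["columns"].items():
            result.append((subject, predicate, row[column]))
    return result


mapped = rows_to_triples(conn, mapping)
```

The relational table stays where it is; only the mapping is new, which is the "low cost of entry" point the slide makes.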
The Common Model is the “Data Glue”
Source Systems: (SFA system), (Quote system), (OMS system), (CMS system)
Common Model (“Data Glue”)
• Different business entities in physical systems actually share many of the same concepts, meanings, and relationships
• Semantic data science exposes common business concepts and connects them with their physical expression in production systems
• Data is “glued” together by its business meaning, rather than physical structures dictated by the underlying technologies
The conceptual model can be directly used by both business and IT users to
operationalize data services, understand the data landscape, track data lineage, and
conduct downstream analytics.
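One way to picture the "data glue" is a lookup from each system's physical field names to shared business concepts. The sketch below is an assumption-laden illustration: the field names (acct_id, cust_no, and so on), the concept names, and the to_common and glue helpers are all hypothetical, and only two of the source systems are shown.

```python
# Hypothetical mapping from physical fields to common business concepts.
FIELD_TO_CONCEPT = {
    "SFA": {"acct_id": "customer_id", "acct_nm": "customer_name"},
    "OMS": {"cust_no": "customer_id", "order_amt": "order_total"},
}


def to_common(system, record):
    """Rename a record's mapped fields to common-model concepts; drop the rest."""
    concept_for = FIELD_TO_CONCEPT[system]
    return {concept_for[field]: value for field, value in record.items() if field in concept_for}


def glue(records):
    """Merge records from different systems that describe the same customer."""
    merged = {}
    for system, record in records:
        common = to_common(system, record)
        merged.setdefault(common["customer_id"], {}).update(common)
    return merged


sfa_record = {"acct_id": "C-42", "acct_nm": "Acme Corp", "rep": "jdoe"}
oms_record = {"cust_no": "C-42", "order_amt": 1200}

view = glue([("SFA", sfa_record), ("OMS", oms_record)])
```

The two records join on the business concept customer_id even though the physical column names differ, and the mapping itself documents where each value came from.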
Semantic Models Relate Data by Business
Life Style
Implications to the Existing IT Architecture
and Practices
User Tools to Discover and Optimize Data
Structured and
Data, Voice, and Video
Data Analysis
Extends Existing Investments in IT Architecture
Secure Access
Builds Out Enterprise Data Models, with
Integration Hub
Self-Service Data Feeds and Analytics
Reduction of Data Mart Silos
Data Lake Approach to Meeting Business Needs
Business Need: Onboard New Data
Traditional Technologies and Practices
• Comprehensive analysis creates rigid structure that is difficult to change, or
• Minimal definition of data organization requires detailed understanding of data
Data Lake Technologies and Practices
• Flexible data model can be revised or extended without redesign of the database
• Agile, evolutionary refinement of the data organization, leveraging new insights as users work with the data

Business Need: Connect External Data
Traditional Technologies and Practices
• External data is collected and loaded into the analytics repository.
• Data is streamed, or is refreshed on a scheduled frequency.
Data Lake Technologies and Practices
• External data can be sourced from databases, spreadsheets, Web pages, news feeds, and more; data is queried through common methods, without regard to location, with real-time values delivered at query time.

Business Need: Integrate Data between Business Units or Business
Traditional Technologies and Practices
• Governance activities establish common vocabulary, and data definitions
• Shared data is copied to an integrated
• Organization-specific definitions may require duplicating certain data in marts
Data Lake Technologies and Practices
• And, systems of record publish existing data specifications or ontology model; each organization defines data in a manner that is best suited for its
• Federation and virtualization features provide choices in which data to copy and which data to retain in the system(s) of record
• All models can be supported through a single copy of the data, maintained in the data lake or system of

Business Need: Capture and Embed Expertise
Traditional Technologies and Practices
• Expertise often captured in the reporting and analytics; change management challenge when updates required.
Data Lake Technologies and Practices
• Expertise captured in the data definitions; single, shared definition minimizes change management
Lessons learned from early adopters
Prioritize data onboarding by the data’s ability to contribute to customer engagement
Onboard: Onboard data assets as they become available
Connect: Connect to available internal and external data assets
Load: Load the data unfiltered/untransformed
Organize: Use models to provide organization to the data
Create models that are tailored to the needs of the business groups
Search: Make it easy to find data
Manage security and privacy, but make it easy to authorize access to data that users need
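The "load unfiltered, then organize with models" lessons can be sketched as follows: raw records are kept exactly as received, and each business group reads them through its own model at query time. The record fields, the two group models, and the load and read_as helpers are assumptions made for this example, not a description of any particular product.

```python
import json

raw_store = []  # stand-in for the lake: payloads kept exactly as received


def load(raw_line, source):
    """Store the payload untouched, tagged only with where it came from."""
    raw_store.append({"source": source, "payload": raw_line})


# Hypothetical per-group models: output name -> field in the raw record.
MODELS = {
    "marketing": {"customer": "name", "region": "region"},
    "finance": {"customer_id": "id", "revenue": "spend"},
}


def read_as(group):
    """Apply a group's model when the data is read, not when it is loaded."""
    model = MODELS[group]
    rows = []
    for item in raw_store:
        record = json.loads(item["payload"])
        rows.append({alias: record.get(field) for alias, field in model.items()})
    return rows


load('{"id": 1, "name": "Alice", "region": "EMEA", "spend": 250}', "crm")
load('{"id": 2, "name": "Bob", "region": "APAC", "spend": 40}', "crm")

marketing_view = read_as("marketing")
finance_view = read_as("finance")
```

Because nothing is transformed at load time, a new business group can be served by adding a model rather than re-ingesting the data.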
Addressing Challenges
- Privacy vs Personal Value
- Granularity of customer understanding
- Delivering strategic objectives when projects tend to have a technical focus
- Opening access to data
- Need for executive sponsorship
- Access to external data
- Establishing firewalls
- Persistent, pervasive data quality issues
Clues to better customer engagement will be
found in the ever-growing volume of data that
we’re creating
A Data Lake Strategy helps you to create a
personalized, engaging experience with each
Visibility Self-Service
Open, yet Secure
Internet Scale
Data Access
Thank you!

More Related Content

What's hot

The Data Lake - Balancing Data Governance and Innovation
The Data Lake - Balancing Data Governance and Innovation The Data Lake - Balancing Data Governance and Innovation
The Data Lake - Balancing Data Governance and Innovation
Hadoop Big Data Lakes Keynote
Hadoop Big Data Lakes KeynoteHadoop Big Data Lakes Keynote
Hadoop Big Data Lakes Keynote
Mark van Rijmenam
Intro to Data Science on Hadoop
Intro to Data Science on HadoopIntro to Data Science on Hadoop
Intro to Data Science on Hadoop
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data Architecture
Perficient, Inc.
Big Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseBig Data's Impact on the Enterprise
Big Data's Impact on the Enterprise
The Big Data Journey – How Companies Adopt Hadoop - StampedeCon 2016
The Big Data Journey – How Companies Adopt Hadoop - StampedeCon 2016The Big Data Journey – How Companies Adopt Hadoop - StampedeCon 2016
The Big Data Journey – How Companies Adopt Hadoop - StampedeCon 2016
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They Need
Dunn Solutions Group
Moving Past Infrastructure Limitations
Moving Past Infrastructure LimitationsMoving Past Infrastructure Limitations
Moving Past Infrastructure Limitations
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecture
mark madsen
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Data Con LA
You're the New CDO, Now What?
You're the New CDO, Now What?You're the New CDO, Now What?
You're the New CDO, Now What?
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
Cloudera, Inc.
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike FergusonMapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Technologies
Developing a Strategy for Data Lake Governance
Developing a Strategy for Data Lake GovernanceDeveloping a Strategy for Data Lake Governance
Developing a Strategy for Data Lake Governance
Tony Baer
Mastering Customer Data on Apache Spark
Mastering Customer Data on Apache SparkMastering Customer Data on Apache Spark
Mastering Customer Data on Apache Spark
2012 10 bigdata_overview
2012 10 bigdata_overview2012 10 bigdata_overview
2012 10 bigdata_overview
The Emerging Role of the Data Lake
The Emerging Role of the Data LakeThe Emerging Role of the Data Lake
The Emerging Role of the Data Lake
Data Lake, Virtual Database, or Data Hub - How to Choose?
Data Lake, Virtual Database, or Data Hub - How to Choose?Data Lake, Virtual Database, or Data Hub - How to Choose?
Data Lake, Virtual Database, or Data Hub - How to Choose?
Taming Big Data With Modern Software Architecture
Taming Big Data  With Modern Software ArchitectureTaming Big Data  With Modern Software Architecture
Taming Big Data With Modern Software Architecture
Big Data User Group Karlsruhe/Stuttgart

What's hot (20)

The Data Lake - Balancing Data Governance and Innovation
The Data Lake - Balancing Data Governance and Innovation The Data Lake - Balancing Data Governance and Innovation
The Data Lake - Balancing Data Governance and Innovation
Hadoop Big Data Lakes Keynote
Hadoop Big Data Lakes KeynoteHadoop Big Data Lakes Keynote
Hadoop Big Data Lakes Keynote
Intro to Data Science on Hadoop
Intro to Data Science on HadoopIntro to Data Science on Hadoop
Intro to Data Science on Hadoop
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data Architecture
Big Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseBig Data's Impact on the Enterprise
Big Data's Impact on the Enterprise
The Big Data Journey – How Companies Adopt Hadoop - StampedeCon 2016
The Big Data Journey – How Companies Adopt Hadoop - StampedeCon 2016The Big Data Journey – How Companies Adopt Hadoop - StampedeCon 2016
The Big Data Journey – How Companies Adopt Hadoop - StampedeCon 2016
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They Need
Moving Past Infrastructure Limitations
Moving Past Infrastructure LimitationsMoving Past Infrastructure Limitations
Moving Past Infrastructure Limitations
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Enterprise Search: Addressing the First Problem of Big Data & Analytics - Sta...
Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecture
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
You're the New CDO, Now What?
You're the New CDO, Now What?You're the New CDO, Now What?
You're the New CDO, Now What?
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike FergusonMapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
Developing a Strategy for Data Lake Governance
Developing a Strategy for Data Lake GovernanceDeveloping a Strategy for Data Lake Governance
Developing a Strategy for Data Lake Governance
Mastering Customer Data on Apache Spark
Mastering Customer Data on Apache SparkMastering Customer Data on Apache Spark
Mastering Customer Data on Apache Spark
2012 10 bigdata_overview
2012 10 bigdata_overview2012 10 bigdata_overview
2012 10 bigdata_overview
The Emerging Role of the Data Lake
The Emerging Role of the Data LakeThe Emerging Role of the Data Lake
The Emerging Role of the Data Lake
Data Lake, Virtual Database, or Data Hub - How to Choose?
Data Lake, Virtual Database, or Data Hub - How to Choose?Data Lake, Virtual Database, or Data Hub - How to Choose?
Data Lake, Virtual Database, or Data Hub - How to Choose?
Taming Big Data With Modern Software Architecture
Taming Big Data  With Modern Software ArchitectureTaming Big Data  With Modern Software Architecture
Taming Big Data With Modern Software Architecture

Similar to The Emerging Data Lake IT Strategy

Govern and Protect Your End User Information
Govern and Protect Your End User InformationGovern and Protect Your End User Information
Govern and Protect Your End User Information
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Nathan Bijnens
Increasing Agility Through Data Virtualization
Increasing Agility Through Data VirtualizationIncreasing Agility Through Data Virtualization
Increasing Agility Through Data Virtualization
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the CloudFoundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
Ensuring Data Quality and Lineage in Cloud Migration - Dan Power
Ensuring Data Quality and Lineage in Cloud Migration - Dan PowerEnsuring Data Quality and Lineage in Cloud Migration - Dan Power
Ensuring Data Quality and Lineage in Cloud Migration - Dan Power
Molly Alexander
DAS Slides: Metadata Management From Technical Architecture & Business Techni...
DAS Slides: Metadata Management From Technical Architecture & Business Techni...DAS Slides: Metadata Management From Technical Architecture & Business Techni...
DAS Slides: Metadata Management From Technical Architecture & Business Techni...
Modernizing Integration with Data Virtualization
Modernizing Integration with Data VirtualizationModernizing Integration with Data Virtualization
Modernizing Integration with Data Virtualization
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Cloudera, Inc.
Accelerate Digital Transformation with Data Virtualization in Banking, Financ...
Accelerate Digital Transformation with Data Virtualization in Banking, Financ...Accelerate Digital Transformation with Data Virtualization in Banking, Financ...
Accelerate Digital Transformation with Data Virtualization in Banking, Financ...
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the CloudFoundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
Big data
Big dataBig data
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
MapR Technologies
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Data Virtualization for Compliance – Creating a Controlled Data Environment
Data Virtualization for Compliance – Creating a Controlled Data EnvironmentData Virtualization for Compliance – Creating a Controlled Data Environment
Data Virtualization for Compliance – Creating a Controlled Data Environment
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, Sydney
Sai Paravastu
SQL Server 2019 Data Virtualization
SQL Server 2019 Data VirtualizationSQL Server 2019 Data Virtualization
SQL Server 2019 Data Virtualization
Matthew W. Bowers
ADV Slides: Data Pipelines in the Enterprise and Comparison
ADV Slides: Data Pipelines in the Enterprise and ComparisonADV Slides: Data Pipelines in the Enterprise and Comparison
ADV Slides: Data Pipelines in the Enterprise and Comparison
Ask bigger questions
Ask bigger questionsAsk bigger questions
Ask bigger questions
South West Data Meetup
Data Analytics.pptx
Data Analytics.pptxData Analytics.pptx
Data Analytics.pptx
Rapyder Cloud Solutions
Reinvent Your Data Management Strategy for Successful Digital Transformation
Reinvent Your Data Management Strategy for Successful Digital TransformationReinvent Your Data Management Strategy for Successful Digital Transformation
Reinvent Your Data Management Strategy for Successful Digital Transformation

Similar to The Emerging Data Lake IT Strategy (20)

Govern and Protect Your End User Information
Govern and Protect Your End User InformationGovern and Protect Your End User Information
Govern and Protect Your End User Information
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Increasing Agility Through Data Virtualization
Increasing Agility Through Data VirtualizationIncreasing Agility Through Data Virtualization
Increasing Agility Through Data Virtualization
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the CloudFoundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
Ensuring Data Quality and Lineage in Cloud Migration - Dan Power
Ensuring Data Quality and Lineage in Cloud Migration - Dan PowerEnsuring Data Quality and Lineage in Cloud Migration - Dan Power
Ensuring Data Quality and Lineage in Cloud Migration - Dan Power
DAS Slides: Metadata Management From Technical Architecture & Business Techni...
DAS Slides: Metadata Management From Technical Architecture & Business Techni...DAS Slides: Metadata Management From Technical Architecture & Business Techni...
DAS Slides: Metadata Management From Technical Architecture & Business Techni...
Modernizing Integration with Data Virtualization
Modernizing Integration with Data VirtualizationModernizing Integration with Data Virtualization
Modernizing Integration with Data Virtualization
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Accelerate Digital Transformation with Data Virtualization in Banking, Financ...
Accelerate Digital Transformation with Data Virtualization in Banking, Financ...Accelerate Digital Transformation with Data Virtualization in Banking, Financ...
Accelerate Digital Transformation with Data Virtualization in Banking, Financ...
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the CloudFoundational Strategies for Trusted Data: Getting Your Data to the Cloud
Foundational Strategies for Trusted Data: Getting Your Data to the Cloud
Big data
Big dataBig data
Big data
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Data Virtualization for Compliance – Creating a Controlled Data Environment
Data Virtualization for Compliance – Creating a Controlled Data EnvironmentData Virtualization for Compliance – Creating a Controlled Data Environment
Data Virtualization for Compliance – Creating a Controlled Data Environment
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, Sydney
SQL Server 2019 Data Virtualization
SQL Server 2019 Data VirtualizationSQL Server 2019 Data Virtualization
SQL Server 2019 Data Virtualization
ADV Slides: Data Pipelines in the Enterprise and Comparison
ADV Slides: Data Pipelines in the Enterprise and ComparisonADV Slides: Data Pipelines in the Enterprise and Comparison
ADV Slides: Data Pipelines in the Enterprise and Comparison
Ask bigger questions
Ask bigger questionsAsk bigger questions
Ask bigger questions
Data Analytics.pptx
Data Analytics.pptxData Analytics.pptx
Data Analytics.pptx
Reinvent Your Data Management Strategy for Successful Digital Transformation
Reinvent Your Data Management Strategy for Successful Digital TransformationReinvent Your Data Management Strategy for Successful Digital Transformation
Reinvent Your Data Management Strategy for Successful Digital Transformation

More from Thomas Kelly, PMP

Semantic Analytics
Semantic AnalyticsSemantic Analytics
Semantic Analytics
Thomas Kelly, PMP
Semantic 'Radar' Steers Users to Insights in the Data Lake
Semantic 'Radar' Steers Users to Insights in the Data LakeSemantic 'Radar' Steers Users to Insights in the Data Lake
Semantic 'Radar' Steers Users to Insights in the Data Lake
Thomas Kelly, PMP
Enterprise Semantic Technology
Enterprise Semantic TechnologyEnterprise Semantic Technology
Enterprise Semantic Technology
Thomas Kelly, PMP
Mobile semantic technology
Mobile semantic technologyMobile semantic technology
Mobile semantic technology
Thomas Kelly, PMP
Rapid data integration and curation
Rapid data integration and curationRapid data integration and curation
Rapid data integration and curation
Thomas Kelly, PMP
Transforming Big Data into Big Value
Transforming Big Data into Big ValueTransforming Big Data into Big Value
Transforming Big Data into Big Value
Thomas Kelly, PMP
Semantic Technology for the Data Warehousing Practitioner
Semantic Technology for the Data Warehousing PractitionerSemantic Technology for the Data Warehousing Practitioner
Semantic Technology for the Data Warehousing Practitioner
Thomas Kelly, PMP
Semantic Technology for Provider-Payer-Pharma Data Collaboration
Semantic Technology for Provider-Payer-Pharma Data CollaborationSemantic Technology for Provider-Payer-Pharma Data Collaboration
Semantic Technology for Provider-Payer-Pharma Data Collaboration
Thomas Kelly, PMP

More from Thomas Kelly, PMP (8)

Semantic Analytics
Semantic AnalyticsSemantic Analytics
Semantic Analytics
Semantic 'Radar' Steers Users to Insights in the Data Lake
Semantic 'Radar' Steers Users to Insights in the Data LakeSemantic 'Radar' Steers Users to Insights in the Data Lake
Semantic 'Radar' Steers Users to Insights in the Data Lake
Enterprise Semantic Technology
Enterprise Semantic TechnologyEnterprise Semantic Technology
Enterprise Semantic Technology
Mobile semantic technology
Mobile semantic technologyMobile semantic technology
Mobile semantic technology
Rapid data integration and curation
Rapid data integration and curationRapid data integration and curation
Rapid data integration and curation
Transforming Big Data into Big Value
Transforming Big Data into Big ValueTransforming Big Data into Big Value
Transforming Big Data into Big Value
Semantic Technology for the Data Warehousing Practitioner
Semantic Technology for the Data Warehousing PractitionerSemantic Technology for the Data Warehousing Practitioner
Semantic Technology for the Data Warehousing Practitioner
Semantic Technology for Provider-Payer-Pharma Data Collaboration
Semantic Technology for Provider-Payer-Pharma Data CollaborationSemantic Technology for Provider-Payer-Pharma Data Collaboration
Semantic Technology for Provider-Payer-Pharma Data Collaboration

The Emerging Data Lake IT Strategy

  • 1. © 2014 The Emerging Data Lake IT Strategy: An Evolving Approach for Dealing with Big Data & Changing Environments. SPEAKERS: Thomas Kelly, Practice Director, Cognizant Technology Solutions; Sean Martin, Founder and CTO, Cambridge Semantics. bit.ly/DataLake
  • 2. We’re living in an amazing world of information sharing, connecting with family, neighbors, vendors, and customers all over the world
  • 3. Telling the world about what we like and don’t like (#HIMYMfinale, @MLB, … is now following Cognizant Technology Solutions and Cambridge Semantics)
  • 4. What we’re doing and how we’re succeeding
  • 5. We’re deciding what advertising we want to see… and what we don’t (Unsubscribe). Influencing how business and customers engage
  • 6. Many businesses have emerged that embrace this model of customer engagement, and we’ve said goodbye to businesses that didn’t
      • 10 million stays in 2013, without owning a hotel
      • Grew to nearly $75B in annual retail revenue in 2013, without opening a storefront
      • Shares over 40 million photos each day
  • 7. Retail: Engaging in a more personalized shopping experience, retailers are building a stronger relationship with each customer
  • 8. Customer Service: Delivering a positive and successful experience for each customer
  • 9. Life Sciences and Healthcare: Combining health, genetic, clinical, and public sciences data to bring effective therapies to patients sooner
  • 10. Financial Services: Delivering innovative products and services, based on a 360° view of the Customer, across all business lines, engaging all available data assets, internal and external
  • 11. The Challenges That We're Addressing
      Onboarding and Integrating Data is Slow and Expensive
      • Transforming data from a growing variety of technologies
      • Custom coded ETL
      • Existing ETL processes are not reusable
      • Optimization for analytics is time-consuming and costly
      • Often wait until there is a defined need for a set of data, delaying benefits realization while waiting to onboard the data
      Data Provenance is Often Poorly Recorded
      • Data meaning is “lost in translation”
      • Data transformations tracked in spreadsheets
      • Post-onboarding, maintenance and analysis cost for onboarded data is high
      • Recreating data lineage is manual, time-consuming, and error-prone
  • 12. The Challenges That We're Addressing
      Target Data is Difficult to Consume
      • Optimization favors known analytics, but is not well suited to new requirements
      • A one-size-fits-all canonical view is used rather than fit-for-purpose views
      • Or, the target lacks a conceptual model that makes the data easy to consume
      • Difficult to identify what data is available, how to get access, and how to integrate the data to answer a question
      Industrializing the Big Data Environment is Difficult to Manage
      • Proliferation of data silos leads to inconsistency/syncing issues
      • Conflicting objectives of opening access to data assets while managing security and privacy requirements
      • Velocity of business change rapidly invalidates data organization and analytics optimizations
      • Managing the integration/interaction with the multiple data management technologies that make up the Big Data environment
  • 13. The Data Lake is made up of four key components: Data Ingestion, Data Lake Management, Data Management, and Query Management
      Delivering
      • Low Cost, High Performance Storage
      • Flexible, Easy-to-Use Data Organization
      • Performance-Optimized Analytics
      • Automation of most manual Development and Query Activities
      • Self-Service End-User Features
      • Intelligent Processing
  • 14. Data Ingestion (diagram)
      Components: Data Ingestion; Data Lake Management; Data Management; Query Management
      Data Sources: Linked Data; Internet of Things (IoT); Desktop and Mobile; Operational Systems; Social Media and Cloud
      Data Ingestion: On-Demand Query; Streaming; Semantic Tagging; Scheduled Batch Load; Model-Driven; Self-Service
  • 15. Data Management (diagram)
      Components: Data Ingestion; Data Lake Management; Data Management; Query Management
      Data Management: Provenance; Data Movement; Semantic Graph; Columnar; In Memory; NoSQL; Map Reduce; HDFS Storage; Structured and Unstructured Data
      Data Sources: Linked Data; Internet of Things (IoT); Desktop and Mobile; Operational Systems; Social Media and Cloud
      Data Ingestion: On-Demand Query; Streaming; Semantic Tagging; Scheduled Batch Load; Model-Driven; Self-Service
  • 16. Data Lake Management (diagram)
      Components: Data Ingestion; Data Lake Management; Data Management; Query Management
      Data Lake Management: Data Assets Catalog; Workflow; Models; Access Management; Monitoring (Monitor and Manage Data Lake Operations)
      Data Mappings: Source-to-Target; Transformations
      Callouts, in slide order: Internal and External Data Assets; Defined Data Orgs (ontologies, taxonomies, thesauri); Authorization and Access Rules; Rule-based Security; Group, Role, and User Level Authorization; Auditable Access; Processes; Schedules; Provenance Capture
      Business-Focused: Business Unit Data Organization and Terms; Optimized to Assist Analytics
      Data Governance: Focus on Shared Data; Standard Models; Controlled Vocabulary; Common Definitions; Standards-based Data Views (FIBO, CDISC/RDF)
      Data Management: Provenance; Data Movement; Semantic Graph; Columnar; In Memory; NoSQL; Map Reduce; HDFS Storage; Structured and Unstructured Data
      Data Sources: Linked Data; Internet of Things (IoT); Desktop and Mobile; Operational Systems; Social Media and Cloud
      Data Ingestion: On-Demand Query; Streaming; Semantic Tagging; Scheduled Batch Load; Model-Driven; Self-Service
  • 17. Query Management (diagram)
      Components: Data Ingestion; Data Lake Management; Data Management; Query Management
      Query Management: Query Data, Metadata, and Provenance; Capture and Share Analytics Expertise; Semantic Search; Analytics Directed to the Best Query Engine; Data Discovery
      Data Management: Provenance; Data Movement; Semantic Graph; Columnar; In Memory; NoSQL; Map Reduce; HDFS Storage; Structured and Unstructured Data
      Data Sources: Linked Data; Internet of Things (IoT); Desktop and Mobile; Operational Systems; Social Media and Cloud
      Data Ingestion: On-Demand Query; Streaming; Semantic Tagging; Scheduled Batch Load; Model-Driven; Self-Service
  • 18. Semantic Technology Delivers “Smart” Data
      • Integrates a network of internal and external data assets, insulating end users from the details of the underlying technologies
      • Captures expertise (logic, inferencing) and integrates it with the data, delivering “smart” data to non-expert users
      • Manages a comprehensive inventory of the data assets
      • Secures access to the right data assets by the right users
  • 19. Key W3C Standards in Semantic Technology
      • Resource Description Framework (RDF): Framework for storing and integrating data and data definitions in the form of subject-predicate-object expressions, or “triples”. Relationships are organized in a logical graph model. Benefit: Reduced development time and cost; faster time-to-business value.
      • Web Ontology Language (OWL): An ontology is a comprehensive model of data definitions and relationships that is human- and machine-readable. Ontologies are inheritable and extensible. Benefit: Improved application quality, flexible iterative / investigative approach, easily adapts to business change.
      • SPARQL Query Language: SQL-like query language for semantic data that can leverage the ontological relationships and constructs to execute smarter queries. Access multiple internal and external databases simultaneously in a single query. Benefit: Access and integrate data across business silos.
      • Inference: Reasoning over data through business rules. Expertise is captured and embedded in the ontology model, accessible through user queries. This is the “smart” in Smart Data. Benefit: Easier end user access to expertise; intelligent systems capabilities.
      • Linked Data: Connects data contained in different databases, allowing queries to find, share and combine data so insights can be identified across the Web. Benefit: Connect disparate databases to navigate and integrate data regardless of location or technology platform.
      • RDB to RDF Mapping Language (R2RML): Preserving current investments in relational technology, R2RML maps relational data to an ontology. SPARQL can query RDF and relational databases simultaneously. Benefit: Low cost of entry to use Semantic Technology to deliver high-value solutions.
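To make the triple and inference ideas on slide 19 concrete, here is a minimal sketch in plain Python. It is not RDF, OWL, or SPARQL tooling: the `ex:` identifiers, the `match` and `infer_types` helpers, and the sample facts are all hypothetical, and only a single subclass rule is applied. It shows facts stored as subject-predicate-object tuples, a wildcard pattern match standing in for a query, and one rule that derives new facts.

```python
from typing import Iterable, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical sample facts; the ex: names are illustrative only.
TRIPLES: Set[Triple] = {
    ("ex:PremiumCustomer", "rdfs:subClassOf", "ex:Customer"),
    ("ex:alice", "rdf:type", "ex:PremiumCustomer"),
    ("ex:bob", "rdf:type", "ex:Customer"),
    ("ex:alice", "ex:placedOrder", "ex:order42"),
}


def match(triples: Iterable[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) \
                and (o is None or t[2] == o):
            yield t


def infer_types(triples: Iterable[Triple]) -> Set[Triple]:
    """Apply one rule until nothing changes:
    if x is of type C and C is a subclass of D, then x is of type D."""
    known = set(triples)
    changed = True
    while changed:
        changed = False
        for x, _, c in list(match(known, p="rdf:type")):
            for _, _, d in list(match(known, s=c, p="rdfs:subClassOf")):
                fact = (x, "rdf:type", d)
                if fact not in known:
                    known.add(fact)
                    changed = True
    return known


def customers(triples: Iterable[Triple]) -> List[str]:
    """Everything typed as ex:Customer, sorted for stable output."""
    return sorted(s for s, _, _ in match(triples, p="rdf:type", o="ex:Customer"))


before = customers(TRIPLES)               # only the explicitly typed customer
after = customers(infer_types(TRIPLES))   # also includes the inferred one
print(before, after)
```

In this sketch the expertise ("a premium customer is a customer") lives in the data as a triple, so the query itself does not need to know about it.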
  • 20. The Common Model is the “Data Glue”
      Source Systems: Lead (SFA system); Quote (Quote system); Order (OMS system); Contract (CMS system)
      Common Model (“Data Glue”)
      • Different business entities in physical systems actually share many of the same concepts, meanings, and relationships
      • Semantic data science exposes common business concepts and connects them with their physical expression in production systems
      • Data is “glued” together by its business meaning, rather than physical structures dictated by the underlying technologies
      The conceptual model can be directly used by both business and IT users to operationalize data services, understand the data landscape, track data lineage, and conduct downstream analytics.
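The “data glue” idea on slide 20 can be sketched as a mapping from each source system's physical field names onto shared business concepts. This is an illustration under assumptions, not the presenters' implementation: the field names, the `FIELD_MAPPINGS` table, and the `to_common` and `customer_trail` helpers are hypothetical; only the four source systems (SFA, Quote, OMS, CMS) come from the slide.

```python
from typing import Dict, List, Tuple

# Hypothetical mappings: physical field name -> common business concept.
FIELD_MAPPINGS: Dict[str, Dict[str, str]] = {
    "SFA":   {"lead_id": "record_id", "account": "customer", "est_value": "amount"},
    "Quote": {"quote_no": "record_id", "cust_name": "customer", "quoted_price": "amount"},
    "OMS":   {"order_id": "record_id", "buyer": "customer", "order_total": "amount"},
    "CMS":   {"contract_ref": "record_id", "party": "customer", "contract_value": "amount"},
}


def to_common(system: str, record: Dict[str, object]) -> Dict[str, object]:
    """Re-express a source record in terms of the common model."""
    mapping = FIELD_MAPPINGS[system]
    common = {concept: record[field]
              for field, concept in mapping.items() if field in record}
    common["source_system"] = system  # keep minimal lineage with each record
    return common


def customer_trail(records: List[Dict[str, object]], customer: str) -> List[Dict[str, object]]:
    """All records about one customer, regardless of originating system."""
    return [r for r in records if r.get("customer") == customer]


# Hypothetical sample records, one per source system.
raw: List[Tuple[str, Dict[str, object]]] = [
    ("SFA",   {"lead_id": "L-1", "account": "Acme", "est_value": 1000}),
    ("Quote", {"quote_no": "Q-7", "cust_name": "Acme", "quoted_price": 950}),
    ("OMS",   {"order_id": "O-3", "buyer": "Acme", "order_total": 950}),
    ("CMS",   {"contract_ref": "C-9", "party": "Globex", "contract_value": 400}),
]

common_records = [to_common(system, rec) for system, rec in raw]
acme = customer_trail(common_records, "Acme")
print([r["source_system"] for r in acme])
```

Because every record carries the same concept names, a lead, a quote, and an order for the same customer can be retrieved together without knowing how each system names its columns.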
  • 21. Semantic Models Relate Data by Business Meaning (diagram). Labels: Customer; Life Events; Life Style; Preferences; Interests; Music; Purchasing; Personal Network; Entertainment; Profession
  • 22. Implications to the Existing IT Architecture and Practices
      • User Tools to Discover and Optimize Data Relationships
      • Structured and Unstructured Data, Voice, and Video
      • Data Analysis Automation
      • Extends Existing Investments in IT Architecture
      • Manages Secure Access
      • Builds Out Enterprise Data Models, with Integration Hub Capabilities
      • Self-Service Data Feeds and Analytics
      • Infrastructure Capacity Elasticity
      • Reduction of Data Mart Silos
      • Easier Access to External Data
  • 23. Data Lake Approach to Meeting Business Needs
      Business Need: Onboard New Data
      Traditional Technologies and Practices
      • Comprehensive analysis creates rigid structure that is difficult to change, or
      • Minimal definition of data organization requires detailed understanding of data contents
      Data Lake Technologies and Practices
      • Flexible data model can be revised or extended without redesign of the database
      • Agile, evolutionary refinement of the data organization, leveraging new insights as users work with the data
      Business Need: Connect External Data
      Traditional Technologies and Practices
      • External data is collected and loaded into the analytics repository.
      • Data is streamed, or is refreshed on a scheduled frequency.
      Data Lake Technologies and Practices
      • External data can be sourced from databases, spreadsheets, Web pages, news feeds, and more; data is queried through common methods, without regard to location, with real-time values delivered at query time.
      Business Need: Integrate Data between Business Units or Business Partners
      Traditional Technologies and Practices
      • Governance activities establish common vocabulary and data definitions
      • Shared data is copied to an integrated database.
      • Organization-specific definitions may require duplicating certain data in marts
      Data Lake Technologies and Practices
      • And, systems of record publish existing data specifications or ontology model; each organization defines data in a manner that is best suited for its business.
      • Federation and virtualization features provide choices in which data to copy and which data to retain in the system(s) of record
      • All models can be supported through a single copy of the data, maintained in the data lake or system of record.
      Business Need: Capture and Embed Expertise
      Traditional Technologies and Practices
      • Expertise often captured in the reporting and analytics; change management challenge when updates required.
      Data Lake Technologies and Practices
      • Expertise captured in the data definitions; single, shared definition minimizes change management efforts
  • 24. Lessons learned from early adopters
      • Prioritize: Prioritize data onboarding by the data’s ability to contribute to customer engagement
      • Onboard: Onboard data assets as they become available
      • Connect: Connect to available internal and external data assets
      • Load: Load the data unfiltered/untransformed
      • Organize: Use models to provide organization to the data
      • Customize: Create models that are tailored to the needs of the business groups
      • Search: Make it easy to find data
      • Secure: Manage security and privacy, but make it easy to authorize access to data that users need
  • 25. Addressing Challenges
      - Privacy vs Personal Value
      - Granularity of customer understanding
      - Delivering strategic objectives when projects tend to have a technical focus
      - Opening access to data
      - Need for executive sponsorship
      - Access to external data
      - Establishing firewalls
      - Persistent, pervasive data quality issues
  • 26. Clues to better customer engagement will be found in the ever-growing volume of data that we’re creating
  • 27. A Data Lake Strategy helps you to create a personalized, engaging experience with each customer: Visibility; Self-Service; Smart; Provenance; Open, yet Secure; Internet Scale; Agile; Adaptable; Universal Data Access