尊敬的 微信汇率:1円 ≈ 0.046166 元 支付宝汇率:1円 ≈ 0.046257元 [退出登录]
SlideShare a Scribd company logo
1© Cloudera, Inc. All rights reserved.
From Insight to Action - Using
Data Science to Transform
Your Organization
Rob Morrow, Chief Technologist US Government
2© Cloudera, Inc. All rights reserved.
Deploy on any cloud infrastructure
Cloudera Director: Management for IaaS-related and CDH cluster operations
Easy Administration
• Dynamic cluster lifecycle management
• ICD-503 Support
• Single pane of glass: multi-cluster view
• Consumption based billing and metering
Enterprise-grade
• Integration across Cloudera Enterprise
• Management of CDH deployments at
scale
Flexible Deployments
• No cloud vendor lock-in: open plugin
framework for IaaS platforms
• Scaling of provisioned clusters
• Spot instance provisioning
Cloudera Director
3© Cloudera, Inc. All rights reserved.
Enterprise Data Science Topics
It took Todd Lipcon 3 years to
create Kudu;10 years of work
before that learning and gaining
trust among OS Community as a
committer.
Government of the future:
value created through
interesting methods.
If your organization is already
good at the 5,000 Open Source
Algorithms (Regression etc), you
now need a Data Science Cadre.
Open Source: Help Wanted. Methods, not raw DataMost problems are not really
Data Science “Challenges”
4© Cloudera, Inc. All rights reserved.
Data Engineering and Data Science Workloads
Data Ingestion
(Kafka, Navigator,
Search)
Cloudera enables users to build real-time, end-
to-end data pipelines in order to power their
business. Leadership in Apache Spark and
Kafka have made Cloudera a trusted resource
for users who want to capture real-time,
streaming, and time series data without being
presented with gaps in security.
Data Processing
(Spark, Hive)
Cloudera is helping users accelerate
their data pipelines with leadership in
technologies like Apache
Spark. Data processing in Cloudera
Enterprise can help take processing
windows from hours to minutes and
enables faster access to data for a
variety of users and skillsets.
Data Science (Spark
MLlib)
Cloudera is bringing the most popular data
science languages/libraries to our platform
for easier collaboration, self-service
exploration, and implementation at
scale. Cloudera is advancing the state of
distributed machine learning at scale.
Cloudera enables exploratory data science
and the ability to deliver robust data
products.
5© Cloudera, Inc. All rights reserved.
Closing Gaps in Critical Skills Areas in the Govt
Data Science
High Value, Low Frequency
• Only a small set of problems require
direct Data Science expertise (~5%)
• Domain-general, algorithm-specific
• Very high expertise
Characterized by
• Spark/Python Expertise
• Advanced Algorithms
• Hypothesis-testing
Automation/Workload
• Per-task/Algorithm automation
Data Analysis
High Frequency, Self-Service
• The “other” 95% of Problems
• More domain-specific
Characterized by
• Tools with UI’s (Data Robot)
• “Exploratory” data investigation
Automation/Workload
• Easily automated
Data Science “Unicorns” are even more valuable in the Govt.
So how to you scale them out?
6© Cloudera, Inc. All rights reserved.
Two Data Science Use Cases
Improving decisions vs. improving products
Decision Science
(improving business decisions)
Data Products
(improving products for customers)
• User: Data scientists and analysts
• Data: New and changing; often sampled
• Environment: Local machine, sandbox cluster
• Tools: R, Python, SAS/SPSS, SQL; notebooks; data
wrangling/discovery tools, …
• Goal: Understand data, develop and improve models,
share results
• Production: Hosted/scheduled reports or dashboards
• User: Data engineers, developers, SREs
• Data: Known data; full scale
• Environment: Production clusters
• Tools: Java/Scala, C++; IDEs; continuous
integration, source control, …
• Goal: Build and maintain applications, improve
model performance, manage models in production
• Production: Online applications
7© Cloudera, Inc. All rights reserved.
Ingest
The Foundation of Hadoop’s Potential
Data can come from a variety of “siloed” sources
▪ Existing databases
▪ Sensor data
▪ Server logs
▪ Chat transcripts
Value of data is multiplied when combined and
correlated with other data
▪ “40% value improvement from combining data from
multiple IoT sources” McKinsey Global Institute
8© Cloudera, Inc. All rights reserved.
Data Processing
Leverage the right processing for your job
Data may require unique processing characteristics
▪ Batch
▪ Streaming
▪ Real-time
Hadoop arose to address one and now the ecosystem
has evolved to answer the rest.
▪ “We’re doubling down on Spark. We invested earliest,
and we’ve invested most, in making Hadoop
enterprise-grade” Mike Olson
9© Cloudera, Inc. All rights reserved.
Data Science
A Unified Platform to Accelerate Data Science from Exploration to Production.
Data Scientists need to use data to…
▪ Explore
▪ Model
▪ Test
The field of data science blends math and statistics
knowledge with advanced computer knowledge.
▪ “Data Scientist: Person who is better at statistics than
any software engineer and better at software
engineering than any statistician” Josh Wills
10© Cloudera, Inc. All rights reserved.
MLlib
Collection of mainstream machine learning algorithms built on Spark
Including:
•Classifiers: logistic regression, boosted trees, random forests, etc
•Clustering: k-means, Latent Dirichlet Allocation (LDA)
•Recommender Systems: Alternating Least Squares
•Dimensionality Reduction: Principal Component Analysis (PCA) and Singular Value
Decomposition (SVD)
•Feature Engineering & Selection: TF-IDF, Word2Vec, Normalizer, etc
•Statistical Functions: Chi-Squared Test, Pearson Correlation, etc
11© Cloudera, Inc. All rights reserved.
Data Science Track Info
Data Science Location: Severn
Matrix Decomposition at Scale
Juliet Hougland, Data Scientist, Cloudera
Large-scale Agent-Based Modeling and Simulation on High-Performance Computers
Dr. Robert Axtell, George Mason University
Random Decision Forests at Scale
Todd Boetticher, Solutions Consultant, Cloudera
12© Cloudera, Inc. All rights reserved.
1
Recommended Training for Data Engineering
Learn how to identify which tool
is the right one to use in a given
situation, and gain hands-on
experience using those tools
Cloudera University’s three-day
course helps participants
understand what data scientists
do, the problems they solve,
and the tools and techniques
they use
Learn how to increase the ROI
from big data investments, by
delivering faster time to insight
for your organization.
Apache Spark and Hadoop Data Science on Hadoop Cloudera Search
13© Cloudera, Inc. All rights reserved.
Thank you

More Related Content

What's hot

Becoming Data-Driven Through Cultural Change
Becoming Data-Driven Through Cultural ChangeBecoming Data-Driven Through Cultural Change
Becoming Data-Driven Through Cultural Change
Cloudera, Inc.
 
Engaging with Cloudera & Morning Wrap Up
Engaging with Cloudera & Morning Wrap UpEngaging with Cloudera & Morning Wrap Up
Engaging with Cloudera & Morning Wrap Up
Cloudera, Inc.
 
Random Decision Forests at Scale
Random Decision Forests at ScaleRandom Decision Forests at Scale
Random Decision Forests at Scale
Cloudera, Inc.
 
Protecting health and life science organizations from breaches and ransomware
Protecting health and life science organizations from breaches and ransomwareProtecting health and life science organizations from breaches and ransomware
Protecting health and life science organizations from breaches and ransomware
Cloudera, Inc.
 
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets

Cloudera, Inc.
 
Preparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity RenaissancePreparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity Renaissance
Cloudera, Inc.
 
Moving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache KuduMoving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache Kudu
Cloudera, Inc.
 
Advanced Analytics for Investment Firms and Machine Learning
Advanced Analytics for Investment Firms and Machine LearningAdvanced Analytics for Investment Firms and Machine Learning
Advanced Analytics for Investment Firms and Machine Learning
Cloudera, Inc.
 
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence

Cloudera, Inc.
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big Data
Cloudera, Inc.
 
Data Drive Applications_Webinar
Data Drive Applications_WebinarData Drive Applications_Webinar
Data Drive Applications_Webinar
Sean Spediacci
 
The Five Markers on Your Big Data Journey
The Five Markers on Your Big Data JourneyThe Five Markers on Your Big Data Journey
The Five Markers on Your Big Data Journey
Cloudera, Inc.
 
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache Impala
Cloudera, Inc.
 
Big Data Fundamentals
Big Data FundamentalsBig Data Fundamentals
Big Data Fundamentals
Cloudera, Inc.
 
Optimizing Regulatory Compliance with Big Data
Optimizing Regulatory Compliance with Big DataOptimizing Regulatory Compliance with Big Data
Optimizing Regulatory Compliance with Big Data
Cloudera, Inc.
 
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Cloudera, Inc.
 
The Transformation of your Data in modern IT (Presented by DellEMC)
The Transformation of your Data in modern IT (Presented by DellEMC)The Transformation of your Data in modern IT (Presented by DellEMC)
The Transformation of your Data in modern IT (Presented by DellEMC)
Cloudera, Inc.
 
Transforming Insurance Analytics with Big Data and Automated Machine Learning

Transforming Insurance Analytics with Big Data and Automated Machine Learning
Transforming Insurance Analytics with Big Data and Automated Machine Learning

Transforming Insurance Analytics with Big Data and Automated Machine Learning

Cloudera, Inc.
 
Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...
Cloudera, Inc.
 
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Cloudera, Inc.
 

What's hot (20)

Becoming Data-Driven Through Cultural Change
Becoming Data-Driven Through Cultural ChangeBecoming Data-Driven Through Cultural Change
Becoming Data-Driven Through Cultural Change
 
Engaging with Cloudera & Morning Wrap Up
Engaging with Cloudera & Morning Wrap UpEngaging with Cloudera & Morning Wrap Up
Engaging with Cloudera & Morning Wrap Up
 
Random Decision Forests at Scale
Random Decision Forests at ScaleRandom Decision Forests at Scale
Random Decision Forests at Scale
 
Protecting health and life science organizations from breaches and ransomware
Protecting health and life science organizations from breaches and ransomwareProtecting health and life science organizations from breaches and ransomware
Protecting health and life science organizations from breaches and ransomware
 
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets

 
Preparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity RenaissancePreparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity Renaissance
 
Moving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache KuduMoving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache Kudu
 
Advanced Analytics for Investment Firms and Machine Learning
Advanced Analytics for Investment Firms and Machine LearningAdvanced Analytics for Investment Firms and Machine Learning
Advanced Analytics for Investment Firms and Machine Learning
 
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence

 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big Data
 
Data Drive Applications_Webinar
Data Drive Applications_WebinarData Drive Applications_Webinar
Data Drive Applications_Webinar
 
The Five Markers on Your Big Data Journey
The Five Markers on Your Big Data JourneyThe Five Markers on Your Big Data Journey
The Five Markers on Your Big Data Journey
 
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache Impala
 
Big Data Fundamentals
Big Data FundamentalsBig Data Fundamentals
Big Data Fundamentals
 
Optimizing Regulatory Compliance with Big Data
Optimizing Regulatory Compliance with Big DataOptimizing Regulatory Compliance with Big Data
Optimizing Regulatory Compliance with Big Data
 
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
 
The Transformation of your Data in modern IT (Presented by DellEMC)
The Transformation of your Data in modern IT (Presented by DellEMC)The Transformation of your Data in modern IT (Presented by DellEMC)
The Transformation of your Data in modern IT (Presented by DellEMC)
 
Transforming Insurance Analytics with Big Data and Automated Machine Learning

Transforming Insurance Analytics with Big Data and Automated Machine Learning
Transforming Insurance Analytics with Big Data and Automated Machine Learning

Transforming Insurance Analytics with Big Data and Automated Machine Learning

 
Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...
 
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
 

Viewers also liked

Complex Models for Big Data
Complex Models for Big DataComplex Models for Big Data
Complex Models for Big Data
Data Science Research Center
 
Building new business models through big data dec 06 2012
Building new business models through big data   dec 06 2012Building new business models through big data   dec 06 2012
Building new business models through big data dec 06 2012
Aki Balogh
 
Data Science Highlights
Data Science Highlights Data Science Highlights
Data Science Highlights
Joe Lamantia
 
Engineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platformsEngineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platforms
Hisham Arafat
 
Data-Driven Innovation: 3 Ways to Create a New Level of Performance in Your O...
Data-Driven Innovation: 3 Ways to Create a New Level of Performance in Your O...Data-Driven Innovation: 3 Ways to Create a New Level of Performance in Your O...
Data-Driven Innovation: 3 Ways to Create a New Level of Performance in Your O...
Travis Barker
 
From insight to action - data analysis that makes a difference! - Heena Jethwa
From insight to action - data analysis that makes a difference! - Heena JethwaFrom insight to action - data analysis that makes a difference! - Heena Jethwa
From insight to action - data analysis that makes a difference! - Heena Jethwa
IBM SPSS Denmark
 
Becoming a Data Driven Organisation
Becoming a Data Driven OrganisationBecoming a Data Driven Organisation
Becoming a Data Driven Organisation
Wizdee
 
Automated Regulatory Compliance Management
Automated Regulatory Compliance ManagementAutomated Regulatory Compliance Management
Automated Regulatory Compliance Management
Adeel159
 
5 Essential Practices of the Data Driven Organization
5 Essential Practices of the Data Driven Organization5 Essential Practices of the Data Driven Organization
5 Essential Practices of the Data Driven Organization
Vivastream
 
Data-Driven Organisation
Data-Driven OrganisationData-Driven Organisation
Data-Driven Organisation
Jaakko Särelä
 
[Ai in finance] AI in regulatory compliance, risk management, and auditing
[Ai in finance] AI in regulatory compliance, risk management, and auditing[Ai in finance] AI in regulatory compliance, risk management, and auditing
[Ai in finance] AI in regulatory compliance, risk management, and auditing
Natalino Busa
 
A Tour of the Data Science Process, a Case Study Using Movie Industry Data
A Tour of the Data Science Process, a Case Study Using Movie Industry DataA Tour of the Data Science Process, a Case Study Using Movie Industry Data
A Tour of the Data Science Process, a Case Study Using Movie Industry Data
Domino Data Lab
 
Cloudera for Internet of Things
Cloudera for Internet of ThingsCloudera for Internet of Things
Cloudera for Internet of Things
Cloudera, Inc.
 
SAP’s Utilities Roadmap Overview, The Evolution of Regulatory Compliance and ...
SAP’s Utilities Roadmap Overview, The Evolution of Regulatory Compliance and ...SAP’s Utilities Roadmap Overview, The Evolution of Regulatory Compliance and ...
SAP’s Utilities Roadmap Overview, The Evolution of Regulatory Compliance and ...
EnergySec
 
Cloudera - Enabling the IoT Revolution Driving Insights in a Connected World
Cloudera - Enabling the IoT Revolution Driving Insights in a Connected WorldCloudera - Enabling the IoT Revolution Driving Insights in a Connected World
Cloudera - Enabling the IoT Revolution Driving Insights in a Connected World
andreas kuncoro
 
How to create new business models with Big Data and Analytics
How to create new business models with Big Data and AnalyticsHow to create new business models with Big Data and Analytics
How to create new business models with Big Data and Analytics
Aki Balogh
 
Data science apps: beyond notebooks
Data science apps: beyond notebooksData science apps: beyond notebooks
Data science apps: beyond notebooks
Natalino Busa
 
Accenture Regulatory Compliance Platform
Accenture Regulatory Compliance PlatformAccenture Regulatory Compliance Platform
Accenture Regulatory Compliance Platform
accenture
 
Using Kafka and Kudu for fast, low-latency SQL analytics on streaming data
Using Kafka and Kudu for fast, low-latency SQL analytics on streaming dataUsing Kafka and Kudu for fast, low-latency SQL analytics on streaming data
Using Kafka and Kudu for fast, low-latency SQL analytics on streaming data
Mike Percy
 

Viewers also liked (19)

Complex Models for Big Data
Complex Models for Big DataComplex Models for Big Data
Complex Models for Big Data
 
Building new business models through big data dec 06 2012
Building new business models through big data   dec 06 2012Building new business models through big data   dec 06 2012
Building new business models through big data dec 06 2012
 
Data Science Highlights
Data Science Highlights Data Science Highlights
Data Science Highlights
 
Engineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platformsEngineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platforms
 
Data-Driven Innovation: 3 Ways to Create a New Level of Performance in Your O...
Data-Driven Innovation: 3 Ways to Create a New Level of Performance in Your O...Data-Driven Innovation: 3 Ways to Create a New Level of Performance in Your O...
Data-Driven Innovation: 3 Ways to Create a New Level of Performance in Your O...
 
From insight to action - data analysis that makes a difference! - Heena Jethwa
From insight to action - data analysis that makes a difference! - Heena JethwaFrom insight to action - data analysis that makes a difference! - Heena Jethwa
From insight to action - data analysis that makes a difference! - Heena Jethwa
 
Becoming a Data Driven Organisation
Becoming a Data Driven OrganisationBecoming a Data Driven Organisation
Becoming a Data Driven Organisation
 
Automated Regulatory Compliance Management
Automated Regulatory Compliance ManagementAutomated Regulatory Compliance Management
Automated Regulatory Compliance Management
 
5 Essential Practices of the Data Driven Organization
5 Essential Practices of the Data Driven Organization5 Essential Practices of the Data Driven Organization
5 Essential Practices of the Data Driven Organization
 
Data-Driven Organisation
Data-Driven OrganisationData-Driven Organisation
Data-Driven Organisation
 
[Ai in finance] AI in regulatory compliance, risk management, and auditing
[Ai in finance] AI in regulatory compliance, risk management, and auditing[Ai in finance] AI in regulatory compliance, risk management, and auditing
[Ai in finance] AI in regulatory compliance, risk management, and auditing
 
A Tour of the Data Science Process, a Case Study Using Movie Industry Data
A Tour of the Data Science Process, a Case Study Using Movie Industry DataA Tour of the Data Science Process, a Case Study Using Movie Industry Data
A Tour of the Data Science Process, a Case Study Using Movie Industry Data
 
Cloudera for Internet of Things
Cloudera for Internet of ThingsCloudera for Internet of Things
Cloudera for Internet of Things
 
SAP’s Utilities Roadmap Overview, The Evolution of Regulatory Compliance and ...
SAP’s Utilities Roadmap Overview, The Evolution of Regulatory Compliance and ...SAP’s Utilities Roadmap Overview, The Evolution of Regulatory Compliance and ...
SAP’s Utilities Roadmap Overview, The Evolution of Regulatory Compliance and ...
 
Cloudera - Enabling the IoT Revolution Driving Insights in a Connected World
Cloudera - Enabling the IoT Revolution Driving Insights in a Connected WorldCloudera - Enabling the IoT Revolution Driving Insights in a Connected World
Cloudera - Enabling the IoT Revolution Driving Insights in a Connected World
 
How to create new business models with Big Data and Analytics
How to create new business models with Big Data and AnalyticsHow to create new business models with Big Data and Analytics
How to create new business models with Big Data and Analytics
 
Data science apps: beyond notebooks
Data science apps: beyond notebooksData science apps: beyond notebooks
Data science apps: beyond notebooks
 
Accenture Regulatory Compliance Platform
Accenture Regulatory Compliance PlatformAccenture Regulatory Compliance Platform
Accenture Regulatory Compliance Platform
 
Using Kafka and Kudu for fast, low-latency SQL analytics on streaming data
Using Kafka and Kudu for fast, low-latency SQL analytics on streaming dataUsing Kafka and Kudu for fast, low-latency SQL analytics on streaming data
Using Kafka and Kudu for fast, low-latency SQL analytics on streaming data
 

Similar to From Insight to Action: Using Data Science to Transform Your Organization

Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
The Hive
 
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
DataWorks Summit
 
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice Hotels
Cloudera, Inc.
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
Cloudera, Inc.
 
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera

Cloudera, Inc.
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse Optimization
Cloudera, Inc.
 
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersR+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
Revolution Analytics
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
DATAVERSITY
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
Cloudera, Inc.
 
IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data
IBM
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
DATAVERSITY
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera, Inc.
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop
Dr. Wilfred Lin (Ph.D.)
 
Enterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, ClouderaEnterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, Cloudera
Neo4j
 
Options for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current MarketOptions for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current Market
Dremio Corporation
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
Denodo
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
Nathan Bijnens
 
Big Data: Myths and Realities
Big Data: Myths and RealitiesBig Data: Myths and Realities
Big Data: Myths and Realities
Toronto-Oracle-Users-Group
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science Workbench
Cloudera, Inc.
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
DATAVERSITY
 

Similar to From Insight to Action: Using Data Science to Transform Your Organization (20)

Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
 
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
 
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice Hotels
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
 
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera

 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse Optimization
 
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersR+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
 
IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made Easy
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop
 
Enterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, ClouderaEnterprise Metadata Integration, Cloudera
Enterprise Metadata Integration, Cloudera
 
Options for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current MarketOptions for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current Market
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
Big Data: Myths and Realities
Big Data: Myths and RealitiesBig Data: Myths and Realities
Big Data: Myths and Realities
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science Workbench
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
Cloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
Cloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
Cloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
Cloudera, Inc.
 

More from Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Recently uploaded

Hi-Fi Call Girls In Hyderabad 💯Call Us 🔝 7426014248 🔝Independent Hyderabad Es...
Hi-Fi Call Girls In Hyderabad 💯Call Us 🔝 7426014248 🔝Independent Hyderabad Es...Hi-Fi Call Girls In Hyderabad 💯Call Us 🔝 7426014248 🔝Independent Hyderabad Es...
Hi-Fi Call Girls In Hyderabad 💯Call Us 🔝 7426014248 🔝Independent Hyderabad Es...
sapnasaifi408
 
Secure-by-Design Using Hardware and Software Protection for FDA Compliance
Secure-by-Design Using Hardware and Software Protection for FDA ComplianceSecure-by-Design Using Hardware and Software Protection for FDA Compliance
Secure-by-Design Using Hardware and Software Protection for FDA Compliance
ICS
 
Going AOT: Everything you need to know about GraalVM for Java applications
Going AOT: Everything you need to know about GraalVM for Java applicationsGoing AOT: Everything you need to know about GraalVM for Java applications
Going AOT: Everything you need to know about GraalVM for Java applications
Alina Yurenko
 
Trailhead Talks_ Journey of an All-Star Ranger .pptx
Trailhead Talks_ Journey of an All-Star Ranger .pptxTrailhead Talks_ Journey of an All-Star Ranger .pptx
Trailhead Talks_ Journey of an All-Star Ranger .pptx
ImtiazBinMohiuddin
 
Stork Product Overview: An AI-Powered Autonomous Delivery Fleet
Stork Product Overview: An AI-Powered Autonomous Delivery FleetStork Product Overview: An AI-Powered Autonomous Delivery Fleet
Stork Product Overview: An AI-Powered Autonomous Delivery Fleet
Vince Scalabrino
 
Refactoring legacy systems using events commands and bubble contexts
Refactoring legacy systems using events commands and bubble contextsRefactoring legacy systems using events commands and bubble contexts
Refactoring legacy systems using events commands and bubble contexts
Michał Kurzeja
 
Streamlining End-to-End Testing Automation
Streamlining End-to-End Testing AutomationStreamlining End-to-End Testing Automation
Streamlining End-to-End Testing Automation
Anand Bagmar
 
1 Million Orange Stickies later - Devoxx Poland 2024
1 Million Orange Stickies later - Devoxx Poland 20241 Million Orange Stickies later - Devoxx Poland 2024
1 Million Orange Stickies later - Devoxx Poland 2024
Alberto Brandolini
 
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
manji sharman06
 
NLJUG speaker academy 2024 - session 1, June 2024
NLJUG speaker academy 2024 - session 1, June 2024NLJUG speaker academy 2024 - session 1, June 2024
NLJUG speaker academy 2024 - session 1, June 2024
Bert Jan Schrijver
 
European Standard S1000D, an Unnecessary Expense to OEM.pptx
European Standard S1000D, an Unnecessary Expense to OEM.pptxEuropean Standard S1000D, an Unnecessary Expense to OEM.pptx
European Standard S1000D, an Unnecessary Expense to OEM.pptx
Digital Teacher
 
The Ultimate Guide to Top 36 DevOps Testing Tools for 2024.pdf
The Ultimate Guide to Top 36 DevOps Testing Tools for 2024.pdfThe Ultimate Guide to Top 36 DevOps Testing Tools for 2024.pdf
The Ultimate Guide to Top 36 DevOps Testing Tools for 2024.pdf
kalichargn70th171
 
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations
Ensuring Efficiency and Speed with Practical Solutions for Clinical OperationsEnsuring Efficiency and Speed with Practical Solutions for Clinical Operations
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations
OnePlan Solutions
 
Folding Cheat Sheet #6 - sixth in a series
Folding Cheat Sheet #6 - sixth in a seriesFolding Cheat Sheet #6 - sixth in a series
Folding Cheat Sheet #6 - sixth in a series
Philip Schwarz
 
🔥 Kolkata Call Girls  👉 9079923931 👫 High Profile Call Girls Whatsapp Number ...
🔥 Kolkata Call Girls  👉 9079923931 👫 High Profile Call Girls Whatsapp Number ...🔥 Kolkata Call Girls  👉 9079923931 👫 High Profile Call Girls Whatsapp Number ...
🔥 Kolkata Call Girls  👉 9079923931 👫 High Profile Call Girls Whatsapp Number ...
tinakumariji156
 
What’s new in VictoriaMetrics - Q2 2024 Update
What’s new in VictoriaMetrics - Q2 2024 UpdateWhat’s new in VictoriaMetrics - Q2 2024 Update
What’s new in VictoriaMetrics - Q2 2024 Update
VictoriaMetrics
 
Hands-on with Apache Druid: Installation & Data Ingestion Steps
Hands-on with Apache Druid: Installation & Data Ingestion StepsHands-on with Apache Druid: Installation & Data Ingestion Steps
Hands-on with Apache Druid: Installation & Data Ingestion Steps
servicesNitor
 
Call Girls in Varanasi || 7426014248 || Quick Booking at Affordable Price
Call Girls in Varanasi || 7426014248 || Quick Booking at Affordable PriceCall Girls in Varanasi || 7426014248 || Quick Booking at Affordable Price
Call Girls in Varanasi || 7426014248 || Quick Booking at Affordable Price
vickythakur209464
 
What’s New in VictoriaLogs - Q2 2024 Update
What’s New in VictoriaLogs - Q2 2024 UpdateWhat’s New in VictoriaLogs - Q2 2024 Update
What’s New in VictoriaLogs - Q2 2024 Update
VictoriaMetrics
 
Erotic Call Girls Bangalore🫱9079923931🫲 High Quality Call Girl Service Right ...
Erotic Call Girls Bangalore🫱9079923931🫲 High Quality Call Girl Service Right ...Erotic Call Girls Bangalore🫱9079923931🫲 High Quality Call Girl Service Right ...
Erotic Call Girls Bangalore🫱9079923931🫲 High Quality Call Girl Service Right ...
meenusingh4354543
 

Recently uploaded (20)

Hi-Fi Call Girls In Hyderabad 💯Call Us 🔝 7426014248 🔝Independent Hyderabad Es...
Hi-Fi Call Girls In Hyderabad 💯Call Us 🔝 7426014248 🔝Independent Hyderabad Es...Hi-Fi Call Girls In Hyderabad 💯Call Us 🔝 7426014248 🔝Independent Hyderabad Es...
Hi-Fi Call Girls In Hyderabad 💯Call Us 🔝 7426014248 🔝Independent Hyderabad Es...
 
Secure-by-Design Using Hardware and Software Protection for FDA Compliance
Secure-by-Design Using Hardware and Software Protection for FDA ComplianceSecure-by-Design Using Hardware and Software Protection for FDA Compliance
Secure-by-Design Using Hardware and Software Protection for FDA Compliance
 
Going AOT: Everything you need to know about GraalVM for Java applications
Going AOT: Everything you need to know about GraalVM for Java applicationsGoing AOT: Everything you need to know about GraalVM for Java applications
Going AOT: Everything you need to know about GraalVM for Java applications
 
Trailhead Talks_ Journey of an All-Star Ranger .pptx
Trailhead Talks_ Journey of an All-Star Ranger .pptxTrailhead Talks_ Journey of an All-Star Ranger .pptx
Trailhead Talks_ Journey of an All-Star Ranger .pptx
 
Stork Product Overview: An AI-Powered Autonomous Delivery Fleet
Stork Product Overview: An AI-Powered Autonomous Delivery FleetStork Product Overview: An AI-Powered Autonomous Delivery Fleet
Stork Product Overview: An AI-Powered Autonomous Delivery Fleet
 
Refactoring legacy systems using events commands and bubble contexts
Refactoring legacy systems using events commands and bubble contextsRefactoring legacy systems using events commands and bubble contexts
Refactoring legacy systems using events commands and bubble contexts
 
Streamlining End-to-End Testing Automation
Streamlining End-to-End Testing AutomationStreamlining End-to-End Testing Automation
Streamlining End-to-End Testing Automation
 
1 Million Orange Stickies later - Devoxx Poland 2024
1 Million Orange Stickies later - Devoxx Poland 20241 Million Orange Stickies later - Devoxx Poland 2024
1 Million Orange Stickies later - Devoxx Poland 2024
 
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
 
NLJUG speaker academy 2024 - session 1, June 2024
NLJUG speaker academy 2024 - session 1, June 2024NLJUG speaker academy 2024 - session 1, June 2024
NLJUG speaker academy 2024 - session 1, June 2024
 
European Standard S1000D, an Unnecessary Expense to OEM.pptx
European Standard S1000D, an Unnecessary Expense to OEM.pptxEuropean Standard S1000D, an Unnecessary Expense to OEM.pptx
European Standard S1000D, an Unnecessary Expense to OEM.pptx
 
The Ultimate Guide to Top 36 DevOps Testing Tools for 2024.pdf
The Ultimate Guide to Top 36 DevOps Testing Tools for 2024.pdfThe Ultimate Guide to Top 36 DevOps Testing Tools for 2024.pdf
The Ultimate Guide to Top 36 DevOps Testing Tools for 2024.pdf
 
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations
Ensuring Efficiency and Speed with Practical Solutions for Clinical OperationsEnsuring Efficiency and Speed with Practical Solutions for Clinical Operations
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations
 
Folding Cheat Sheet #6 - sixth in a series
Folding Cheat Sheet #6 - sixth in a seriesFolding Cheat Sheet #6 - sixth in a series
Folding Cheat Sheet #6 - sixth in a series
 
🔥 Kolkata Call Girls  👉 9079923931 👫 High Profile Call Girls Whatsapp Number ...
🔥 Kolkata Call Girls  👉 9079923931 👫 High Profile Call Girls Whatsapp Number ...🔥 Kolkata Call Girls  👉 9079923931 👫 High Profile Call Girls Whatsapp Number ...
🔥 Kolkata Call Girls  👉 9079923931 👫 High Profile Call Girls Whatsapp Number ...
 
What’s new in VictoriaMetrics - Q2 2024 Update
What’s new in VictoriaMetrics - Q2 2024 UpdateWhat’s new in VictoriaMetrics - Q2 2024 Update
What’s new in VictoriaMetrics - Q2 2024 Update
 
Hands-on with Apache Druid: Installation & Data Ingestion Steps
Hands-on with Apache Druid: Installation & Data Ingestion StepsHands-on with Apache Druid: Installation & Data Ingestion Steps
Hands-on with Apache Druid: Installation & Data Ingestion Steps
 
Call Girls in Varanasi || 7426014248 || Quick Booking at Affordable Price
Call Girls in Varanasi || 7426014248 || Quick Booking at Affordable PriceCall Girls in Varanasi || 7426014248 || Quick Booking at Affordable Price
Call Girls in Varanasi || 7426014248 || Quick Booking at Affordable Price
 
What’s New in VictoriaLogs - Q2 2024 Update
What’s New in VictoriaLogs - Q2 2024 UpdateWhat’s New in VictoriaLogs - Q2 2024 Update
What’s New in VictoriaLogs - Q2 2024 Update
 
Erotic Call Girls Bangalore🫱9079923931🫲 High Quality Call Girl Service Right ...
Erotic Call Girls Bangalore🫱9079923931🫲 High Quality Call Girl Service Right ...Erotic Call Girls Bangalore🫱9079923931🫲 High Quality Call Girl Service Right ...
Erotic Call Girls Bangalore🫱9079923931🫲 High Quality Call Girl Service Right ...
 

From Insight to Action: Using Data Science to Transform Your Organization

  • 1. 1© Cloudera, Inc. All rights reserved. From Insight to Action - Using Data Science to Transform Your Organization Rob Morrow, Chief Technologist US Government
  • 2. 2© Cloudera, Inc. All rights reserved. Deploy on any cloud infrastructure Cloudera Director: Management for IaaS-related and CDH cluster operations Easy Administration • Dynamic cluster lifecycle management • ICD-503 Support • Single pane of glass: multi-cluster view • Consumption based billing and metering Enterprise-grade • Integration across Cloudera Enterprise • Management of CDH deployments at scale Flexible Deployments • No cloud vendor lock-in: open plugin framework for IaaS platforms • Scaling of provisioned clusters • Spot instance provisioning Cloudera Director
  • 3. 3© Cloudera, Inc. All rights reserved. Enterprise Data Science Topics It took Todd Lipcon 3 years to create Kudu;10 years of work before that learning and gaining trust among OS Community as a committer. Government of the future: value created through interesting methods. If your organization is already good at the 5,000 Open Source Algorithms (Regression etc), you now need a Data Science Cadre. Open Source: Help Wanted. Methods, not raw DataMost problems are not really Data Science “Challenges”
  • 4. 4© Cloudera, Inc. All rights reserved. Data Engineering and Data Science Workloads Data Ingestion (Kafka, Navigator, Search) Cloudera enables users to build real-time, end- to-end data pipelines in order to power their business. Leadership in Apache Spark and Kafka have made Cloudera a trusted resource for users who want to capture real-time, streaming, and time series data without being presented with gaps in security. Data Processing (Spark, Hive) Cloudera is helping users accelerate their data pipelines with leadership in technologies like Apache Spark. Data processing in Cloudera Enterprise can help take processing windows from hours to minutes and enables faster access to data for a variety of users and skillsets. Data Science (Spark MLlib) Cloudera is bringing the most popular data science languages/libraries to our platform for easier collaboration, self-service exploration, and implementation at scale. Cloudera is advancing the state of distributed machine learning at scale. Cloudera enables exploratory data science and the ability to deliver robust data products.
  • 5. 5© Cloudera, Inc. All rights reserved. Closing Gaps in Critical Skills Areas in the Govt Data Science High Value, Low Frequency • Only a small set of problems require direct Data Science expertise (~5%) • Domain-general, algorithm-specific • Very high expertise Characterized by • Spark/Python Expertise • Advanced Algorithms • Hypothesis-testing Automation/Workload • Per-task/Algorithm automation Data Analysis High Frequency, Self-Service • The “other” 95% of Problems • More domain-specific Characterized by • Tools with UI’s (Data Robot) • “Exploratory” data investigation Automation/Workload • Easily automated Data Science “Unicorns” are even more valuable in the Govt. So how to you scale them out?
  • 6. 6© Cloudera, Inc. All rights reserved. Two Data Science Use Cases Improving decisions vs. improving products Decision Science (improving business decisions) Data Products (improving products for customers) • User: Data scientists and analysts • Data: New and changing; often sampled • Environment: Local machine, sandbox cluster • Tools: R, Python, SAS/SPSS, SQL; notebooks; data wrangling/discovery tools, … • Goal: Understand data, develop and improve models, share results • Production: Hosted/scheduled reports or dashboards • User: Data engineers, developers, SREs • Data: Known data; full scale • Environment: Production clusters • Tools: Java/Scala, C++; IDEs; continuous integration, source control, … • Goal: Build and maintain applications, improve model performance, manage models in production • Production: Online applications
  • 7. 7© Cloudera, Inc. All rights reserved. Ingest The Foundation of Hadoop’s Potential Data can come from a variety of “siloed” sources ▪ Existing databases ▪ Sensor data ▪ Server logs ▪ Chat transcripts Value of data is multiplied when combined and correlated with other data ▪ “40% value improvement from combining data from multiple IoT sources” McKinsey Global Institute
  • 8. 8© Cloudera, Inc. All rights reserved. Data Processing Leverage the right processing for your job Data may require unique processing characteristics ▪ Batch ▪ Streaming ▪ Real-time Hadoop arose to address one and now the ecosystem has evolved to answer the rest. ▪ “We’re doubling down on Spark. We invested earliest, and we’ve invested most, in making Hadoop enterprise-grade” Mike Olson
  • 9. 9© Cloudera, Inc. All rights reserved. Data Science A Unified Platform to Accelerate Data Science from Exploration to Production. Data Scientists need to use data to… ▪ Explore ▪ Model ▪ Test The field of data science blends math and statistics knowledge with advanced computer knowledge. ▪ “Data Scientist: Person who is better at statistics than any software engineer and better at software engineering than any statistician” Josh Wills
  • 10. 10© Cloudera, Inc. All rights reserved. MLlib Collection of mainstream machine learning algorithms built on Spark Including: •Classifiers: logistic regression, boosted trees, random forests, etc •Clustering: k-means, Latent Dirichlet Allocation (LDA) •Recommender Systems: Alternating Least Squares •Dimensionality Reduction: Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) •Feature Engineering & Selection: TF-IDF, Word2Vec, Normalizer, etc •Statistical Functions: Chi-Squared Test, Pearson Correlation, etc
  • 11. 11© Cloudera, Inc. All rights reserved. Data Science Track Info Data Science Location: Severn Matrix Decomposition at Scale Juliet Hougland, Data Scientist, Cloudera Large-scale Agent-Based Modeling and Simulation on High-Performance Computers Dr. Robert Axtell, George Mason University Random Decision Forests at Scale Todd Boetticher, Solutions Consultant, Cloudera
  • 12. 12© Cloudera, Inc. All rights reserved. 1 Recommended Training for Data Engineering Learn how to identify which tool is the right one to use in a given situation, and gain hands-on experience using those tools Cloudera University’s three-day course helps participants understand what data scientists do, the problems they solve, and the tools and techniques they use Learn how to increase the ROI from big data investments, by delivering faster time to insight for your organization. Apache Spark and Hadoop Data Science on Hadoop Cloudera Search
  • 13. 13© Cloudera, Inc. All rights reserved. Thank you
  翻译: