DataOps at TripActions
27 Oct 2020
Agenda
01 - Intro: TA goals, customers, and team
02 - Data at TripActions: Data team objectives and architectural objectives
03 - Infrastructure: Architecture, platforms, and tooling
04 - Dev/Test Flow: How our team builds and tests changes
05 - Deployments: How code moves into production and is monitored
06 - The Future: Future objectives in tooling and process
TripActions Overview
OUR MISSION
To Move People, Ideas & Businesses Forward
63% feel they have to handle everything on their own when something goes wrong
83% spend over an hour booking a trip
Built for the traveler by the traveler
97% traveler adoption
34% hotel cost savings
1.5M hotel rooms
By the numbers
Managing travel for >4000 companies
Partners range from small businesses to Fortune 100 companies in a variety of industries
Supporting more than a million travellers
TripActions provides booking and support services for all forms of business and personal travel
800 employees around the globe
Headquartered in California, TripActions has offices around the globe including Amsterdam
Data at TripActions
Who is the Data Team?
BI Palo Alto (BI-PA) - 3 CA, 1 IL
● Product BI
● Liquid (credit card product) reporting and analysis
BI Amsterdam (BI-AMS) - 6 AMS
● Operational BI (Customer Service, Success, Supply)
● Finance reporting and analysis
Data Science (DS) - 7 AMS, 1 Israel
● Insights and analytics
● Predictive modelling
● Production ML services
Data Engineering (DE) - 4 AMS, 1 CA
● Data integration
● Data warehousing
● Infrastructure
● Tooling
BI @ TripActions
Business Intelligence Pillars
● Standardized Reporting: >50% of company uses standard reporting daily, >1000 daily report views
● Training and Development: >65% of company has attended BI training
● Ad Hoc Reporting and Analytics: ~100 weekly self-service ad hoc reports
Data Science
Personalizing User Experience
Empowering Decision Making
Architecture and Infrastructure
Overall BI/Data Engineering Architecture
Additional Services
Data Flows
Pipelinewise
What is it?
● Extensible, “any source to any target”, singer.io wrapper
● Provides Stitch-like experience for job management via yml definition files
● TripActions maintains a custom fork that extends logging, metrics, and functionality
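For context, singer.io taps and targets exchange JSON messages, one per line. The sketch below builds the three core message types (SCHEMA, RECORD, STATE) for a hypothetical `bookings` stream. The stream name, fields, and bookmark value are illustrative assumptions and are not taken from the TripActions fork.

```python
import json


def schema_message(stream, properties, key_properties):
    """Build a Singer SCHEMA message describing a stream's columns."""
    return {
        "type": "SCHEMA",
        "stream": stream,
        "schema": {"type": "object", "properties": properties},
        "key_properties": key_properties,
    }


def record_message(stream, record):
    """Build a Singer RECORD message carrying one row."""
    return {"type": "RECORD", "stream": stream, "record": record}


def state_message(value):
    """Build a Singer STATE message holding the tap's bookmark."""
    return {"type": "STATE", "value": value}


def emit(messages):
    """Serialize messages as newline-delimited JSON, as a tap writes them."""
    return "\n".join(json.dumps(m, sort_keys=True) for m in messages)


# Hypothetical stream and row, for illustration only.
messages = [
    schema_message(
        "bookings",
        {"id": {"type": "integer"}, "city": {"type": "string"}},
        ["id"],
    ),
    record_message("bookings", {"id": 1, "city": "Amsterdam"}),
    state_message({"bookmarks": {"bookings": {"id": 1}}}),
]
output = emit(messages)
print(output)
```

A target reads these lines from standard input, which is what lets any tap be paired with any target.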
dbt: Puts the T in ELT
Code Architecture - dbt
Data Warehouse
Core integration of all data for concepts around users, activity, finance, etc.
● Basis for all reporting and data science
● Provides rich, integrated data
● Updated every 30 minutes
Event Models
“Big data” models to transform raw events from logs and event tracking into usable data
● Integrates ~15TB of data from three event sources
● Enriches and normalizes to a common data model
Reporting Marts
Denormalized reporting views for BI reporting and self-service
● Underlies every Tableau dashboard and >1400 self-service reports
Data Science
Data transformations to feed into our ML analytics and services
● Used to power every site interaction via personalized experiences
● Drives target setting and operations planning
How We Develop and Test
Development approach
Work close to the truth
Let analysts use real data and directly test against the prod DWH to measure impact
Make it easy to validate, hard to fail
Tooling should make it hard to make mistakes and easy to commit with confidence
ALWAYS test and document
No change should be deployed without documentation and tests in place first
Rapid, high-quality code changes
Combining tooling, process, and education allows anyone to continuously, confidently make changes to core data models
Analyst/developer workflow
1. Begin with a Jira issue
Most changes begin with Jira tickets to track the development and manage stakeholder communication
2. Analyst builds change in dbt
All analysts and collaborators are proficient in dbt and 100% of transformations are built using it. Tooling makes it easy
3. Automated quality review - local
All analysts use an automated suite (dbt validator) which verifies transformations and repeatability, runs tests, and adds documentation and new tests
4. Automated quality review - remote
Automated tests on the PR check for general code quality, formatting, dependencies, etc.
5. Guided PR review and merge
PR processes allow minimum waiting for review and minimum distraction for others
dbt Development
Every user has their own dev database
Prior to starting, analysts can either clone tables or create views onto production for project dependencies
All raw data can be modelled and tested based on actual prod data
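The two options for bringing production dependencies into a dev database can be sketched as generated SQL. This is a minimal sketch, assuming Snowflake (named later in the deck as the deployment target) and its zero-copy `CREATE TABLE ... CLONE` statement. The function name, database names, and table list are hypothetical, and the target schemas are assumed to exist already.

```python
def dev_setup_sql(prod_db, dev_db, tables, mode="clone"):
    """Return Snowflake statements exposing prod tables in a dev database.

    mode="clone" uses zero-copy CREATE TABLE ... CLONE;
    mode="view" creates a pass-through view over the prod table.
    """
    if mode not in ("clone", "view"):
        raise ValueError(f"unknown mode: {mode}")
    statements = []
    for qualified in tables:  # each entry is "schema.table"
        source = f"{prod_db}.{qualified}"
        target = f"{dev_db}.{qualified}"
        if mode == "clone":
            statements.append(
                f"CREATE OR REPLACE TABLE {target} CLONE {source};"
            )
        else:
            statements.append(
                f"CREATE OR REPLACE VIEW {target} AS SELECT * FROM {source};"
            )
    return statements


# Hypothetical database and table names, for illustration only.
for statement in dev_setup_sql("PROD_DWH", "DEV_ALICE", ["core.users"]):
    print(statement)
```

A clone gives the analyst a writable copy that no longer follows production, while a view always reflects current production data but cannot be modified in dev.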
Local quality review
Quality review is intended to check the following areas:
1. Code runnability
2. Existing tests
3. Data quality
4. Documentation
5. New tests
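The five areas above could be wired together roughly as follows. This is a hedged sketch, not the actual dbt validator: the check callables, the `CheckResult` type, and the report card layout are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# The five review areas, in the order listed on the slide.
CHECK_ORDER = [
    "Code runnability",
    "Existing tests",
    "Data quality",
    "Documentation",
    "New tests",
]


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str = ""


def run_review(checks: Dict[str, Callable[[], Tuple[bool, str]]]) -> List[CheckResult]:
    """Run every review area in order; a missing check counts as a failure."""
    results = []
    for name in CHECK_ORDER:
        check = checks.get(name)
        if check is None:
            results.append(CheckResult(name, False, "check not implemented"))
            continue
        passed, detail = check()
        results.append(CheckResult(name, passed, detail))
    return results


def report_card(results: List[CheckResult]) -> str:
    """Render results as plain text, one line per area plus a verdict."""
    lines = []
    for r in results:
        status = "PASS" if r.passed else "FAIL"
        suffix = f": {r.detail}" if r.detail else ""
        lines.append(f"[{status}] {r.name}{suffix}")
    verdict = "READY FOR PR" if all(r.passed for r in results) else "NOT READY"
    lines.append(f"Result: {verdict}")
    return "\n".join(lines)


# Stub checks standing in for real dbt runs and tests (hypothetical).
stub_checks = {name: (lambda: (True, "")) for name in CHECK_ORDER}
stub_checks["Documentation"] = lambda: (False, "2 columns undocumented")
print(report_card(run_review(stub_checks)))
```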
Code quality testing and automated tests
● Code quality checks run in the following ways
○ Changed table and all dependent tables in project
○ If models are incrementally loaded, incremental refreshes
● Tests run on the changed model and all dependents
● Other projects are then checked for potential dependencies
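Selecting "the changed model and all dependents" amounts to a downstream traversal of the model graph. The sketch below does this with a breadth-first search over a hand-written adjacency map; the model names are hypothetical. In dbt itself, the graph operator `model_name+` selects a model together with its descendants.

```python
from collections import deque


def downstream(children, changed):
    """Return the changed model followed by every model that depends on it,
    directly or transitively, in breadth-first order."""
    order = [changed]
    visited = {changed}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in visited:
                visited.add(child)
                order.append(child)
                queue.append(child)
    return order


# Hypothetical model graph: each key lists the models built from it.
graph = {
    "stg_bookings": ["fct_bookings"],
    "fct_bookings": ["mart_finance", "mart_ops"],
    "mart_finance": [],
}
print(downstream(graph, "fct_bookings"))
```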
Data Validation - Manual
Documentation and new tests
Net result: Low work, high confidence in changes
PR Review
● PRs follow a standard structure and labelling
○ Local testing report card becomes the body of the PR
● Slack automation coordinates the review
○ Notifies the reviewers of the new PR
○ Informs dev of change requests
○ Tracks and labels when the PR is approved and then merged
Deployments and Monitoring
Deploying into Snowflake
● Changes in dbt models are detected when a PR is merged
● Deploy processes kick off automatically, running
○ The changed model
○ Dependent models (based on model type and name)
● Global data dictionaries are updated on the server and in Google Sheets with new information
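A merge-triggered deploy of this kind might start from the list of files changed in the PR. The sketch below maps changed paths to dbt model names and builds a `dbt run` invocation. It is an assumption-laden illustration: it substitutes dbt's graph operator (`model+`) for the deck's selection "based on model type and name", and it uses the `--models` flag from dbt versions contemporary with the talk (newer versions use `--select`). The function names and file paths are hypothetical.

```python
from pathlib import PurePosixPath


def changed_models(changed_files, models_dir="models"):
    """Map changed file paths (for example from `git diff --name-only`)
    to dbt model names, keeping only .sql files under the models directory."""
    names = []
    for path in changed_files:
        p = PurePosixPath(path)
        if p.suffix == ".sql" and p.parts and p.parts[0] == models_dir:
            if p.stem not in names:
                names.append(p.stem)
    return names


def deploy_command(models):
    """Build a dbt invocation that runs each changed model and its descendants."""
    if not models:
        return []  # nothing to deploy
    return ["dbt", "run", "--models"] + [f"{m}+" for m in models]


# Hypothetical list of files touched by a merged PR.
files = ["models/core/fct_bookings.sql", "README.md", "models/core/schema.yml"]
print(deploy_command(changed_models(files)))
```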
In depth: deployment evaluation process
What if it goes wrong?
Monitoring via automated testing
● All data tested every six hours
● Any failing tests posted to channel
● SQL added to a pastebin for easy troubleshooting
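The alerting step can be sketched as a function that turns test results into a channel message. The result dictionaries, their keys, and the paste URL are hypothetical; the deck does not describe the real format.

```python
def failure_alert(results, max_items=10):
    """Build a chat message listing failing tests; return None if all passed."""
    failures = [r for r in results if r["status"] != "pass"]
    if not failures:
        return None
    lines = [f"{len(failures)} of {len(results)} data tests failing:"]
    for r in failures[:max_items]:
        # Each failure links to the stored SQL so it can be rerun by hand.
        lines.append(f"- {r['name']} ({r['status']}): {r['sql_url']}")
    if len(failures) > max_items:
        lines.append(f"... and {len(failures) - max_items} more")
    return "\n".join(lines)


# Hypothetical results from one scheduled test run.
run = [
    {"name": "unique_users_id", "status": "pass", "sql_url": ""},
    {
        "name": "not_null_bookings_city",
        "status": "fail",
        "sql_url": "https://paste.example.com/abc",
    },
]
print(failure_alert(run))
```

Returning `None` when everything passes keeps the channel quiet between failures.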
Looking to the Future
great_expectations: What is it?
● Standardized data profiling and testing
● Alerting on changes in data quality or structure
Planned integration at TripActions
● Directly generate test profiles and configurations via pipelinewise
● Integration of great_expectations tests and data directly into tadoc / dbt docs
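To illustrate what "alerting on changes in data quality or structure" can mean in practice, the sketch below compares two column profiles and reports drift. It is plain Python and does not use the great_expectations API; the profile shape, the tolerance, and the column names are assumptions.

```python
def profile_drift(baseline, current, null_tolerance=0.05):
    """Compare two column profiles and list structural or quality changes.

    Each profile maps column name -> {"dtype": str, "null_rate": float}.
    """
    alerts = []
    for col in sorted(set(baseline) | set(current)):
        if col not in current:
            alerts.append(f"column removed: {col}")
        elif col not in baseline:
            alerts.append(f"column added: {col}")
        else:
            old, new = baseline[col], current[col]
            if old["dtype"] != new["dtype"]:
                alerts.append(
                    f"type changed: {col} {old['dtype']} -> {new['dtype']}"
                )
            if new["null_rate"] - old["null_rate"] > null_tolerance:
                alerts.append(
                    f"null rate rose: {col} "
                    f"{old['null_rate']:.2f} -> {new['null_rate']:.2f}"
                )
    return alerts


# Hypothetical profiles of the same table on two different days.
baseline = {
    "id": {"dtype": "int", "null_rate": 0.0},
    "city": {"dtype": "str", "null_rate": 0.01},
}
current = {
    "id": {"dtype": "str", "null_rate": 0.0},
    "city": {"dtype": "str", "null_rate": 0.20},
    "country": {"dtype": "str", "null_rate": 0.0},
}
for alert in profile_drift(baseline, current):
    print(alert)
```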
Pipelinewise 2.0
● Extend to “anywhere to anywhere” functionality with standardized JSON API importer functionality
● Source data discovery and reporting to show analysts/DS new data objects
dbt Validator 2.0
● Smart, dynamic re-cloning of objects into dev databases for faster testing
○ Cleanup functionality to prevent testing on stale objects
○ Fast clone based on dbt DAG to accelerate development
● Extended test capabilities including custom tests and data validation -> automated tests
● Automated reporting of BI dependencies on marts and tables
Rob Winters | Director, Data | rwinters@tripactions.com
Thank you!

Call Girls Hyderabad  (india) ☎️ +91-7426014248 Hyderabad  Call GirlCall Girls Hyderabad  (india) ☎️ +91-7426014248 Hyderabad  Call Girl
Call Girls Hyderabad (india) ☎️ +91-7426014248 Hyderabad Call Girl
 
❻❸❼⓿❽❻❷⓿⓿❼KALYAN MATKA CHART FINAL OPEN JODI PANNA FIXXX DPBOSS MATKA RESULT ...
❻❸❼⓿❽❻❷⓿⓿❼KALYAN MATKA CHART FINAL OPEN JODI PANNA FIXXX DPBOSS MATKA RESULT ...❻❸❼⓿❽❻❷⓿⓿❼KALYAN MATKA CHART FINAL OPEN JODI PANNA FIXXX DPBOSS MATKA RESULT ...
❻❸❼⓿❽❻❷⓿⓿❼KALYAN MATKA CHART FINAL OPEN JODI PANNA FIXXX DPBOSS MATKA RESULT ...
 
IBM watsonx.data - Seller Enablement Deck.PPTX
IBM watsonx.data - Seller Enablement Deck.PPTXIBM watsonx.data - Seller Enablement Deck.PPTX
IBM watsonx.data - Seller Enablement Deck.PPTX
 
Call Girls Goa👉9024918724👉Low Rate Escorts in Goa 💃 Available 24/7
Call Girls Goa👉9024918724👉Low Rate Escorts in Goa 💃 Available 24/7Call Girls Goa👉9024918724👉Low Rate Escorts in Goa 💃 Available 24/7
Call Girls Goa👉9024918724👉Low Rate Escorts in Goa 💃 Available 24/7
 
🔥Call Girl Price Pune 💯Call Us 🔝 7014168258 🔝💃Independent Pune Escorts Servic...
🔥Call Girl Price Pune 💯Call Us 🔝 7014168258 🔝💃Independent Pune Escorts Servic...🔥Call Girl Price Pune 💯Call Us 🔝 7014168258 🔝💃Independent Pune Escorts Servic...
🔥Call Girl Price Pune 💯Call Us 🔝 7014168258 🔝💃Independent Pune Escorts Servic...
 
Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ...
Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ...Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ...
Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ...
 
9711199012⎷❤✨ Call Girls RK Puram Special Price with a special young
9711199012⎷❤✨ Call Girls RK Puram Special Price with a special young9711199012⎷❤✨ Call Girls RK Puram Special Price with a special young
9711199012⎷❤✨ Call Girls RK Puram Special Price with a special young
 
satta matka Dpboss guessing Kalyan matka Today Kalyan Panel Chart Kalyan Jodi...
satta matka Dpboss guessing Kalyan matka Today Kalyan Panel Chart Kalyan Jodi...satta matka Dpboss guessing Kalyan matka Today Kalyan Panel Chart Kalyan Jodi...
satta matka Dpboss guessing Kalyan matka Today Kalyan Panel Chart Kalyan Jodi...
 

Data Ops at TripActions

  • 2. Agenda: 01 - Intro TripActions: TA goals, customers, and team; 02 - Data at TripActions: Data team objectives and architectural objectives; 03 - Infrastructure: Architecture, Platforms, and tooling; 04 - Dev/Test Flow: How our team builds and tests changes; 05 - Deployments: How code moves into production and is monitored; 06 - The Future: Future objectives in tooling and process
  • 4. OUR MISSION: To Move People, Ideas & Businesses Forward
  • 5. 63% feel they have to handle everything on their own when something goes wrong; 83% spend over an hour booking a trip
  • 6. Built for the traveler by the traveler, and many more... 97% Traveler adoption; 34% Hotel cost savings; 1.5M Hotel rooms
  • 7. By the numbers Managing travel for >4000 companies Partners range from small businesses to Fortune 100 companies in a variety of industries Supporting more than a million travellers TripActions provides booking and support services for all forms of business and personal travel 800 employees around the globe Headquartered in California, TripActions has offices around the globe including Amsterdam
  • 9. Who is the Data Team? BI Palo Alto - 3 CA, 1 IL BI-PA ● Product BI ● Liquid (credit card product) reporting and analysis BI Amsterdam - 6 AMS BI-AMS ● Operational BI (Customer Service, Success, Supply) ● Finance reporting and analysis Data Science - 7 AMS, 1 Israel DS ● Insights and analytics ● Predictive modelling ● Production ML services Data Engineering - 4 AMS, 1 CA DE ● Data integration ● Data warehousing ● Infrastructure ● Tooling
  • 10. BI @ TripActions Business Intelligence Pillars Standardized Reporting Training and Development Ad Hoc Reporting and Analytics ● >50% of company uses standard reporting daily, >1000 daily report views ● >65% of company has attended BI training ● ~100 weekly self-service ad hoc reports
  • 11. Data Science Personalizing User Experience Empowering Decision Making
  • 13. Overall BI/Data Engineering Architecture Additional Services
  • 15. Pipelinewise What is it? ● Extensible, “any source to any target”, singer.io wrapper ● Provides stitch-like experience for job management via yml definition files ● TripActions maintains a custom fork that extends logging, metrics, and functionality
  • 16. dbt: Puts the T in ELT
  • 17. Code Architecture - dbt Data Warehouse Core integration of all data for concepts around users, activity, finance, etc ● Basis for all reporting and data science ● Provides rich, integrated data ● Updated every 30 minutes Event Models “Big data” models to transform raw events from logs and event tracking into usable data ● Integrates ~15TB of data from three event sources ● Enriches and normalizes to a common data model Reporting Marts Denormalized reporting views for BI reporting and self-service ● Underlies every Tableau dashboard and >1400 self-service reports Data Science Data transformations to feed into our ML analytics and services ● Used to power every site interaction via personalized experiences ● Drives target setting and operations planning
  • 18. How We Develop and Test
  • 19. Development approach Work close to the truth Let analysts use real data and directly test against prod DWH to measure impact Make it easy to validate, hard to fail Tooling should make it hard to make mistakes and easy to commit with confidence ALWAYS test and document No change should be deployed without documentation and tests in place first Rapid, high quality code changes Combining tooling, process, and education allows anyone to continuously, confidently make changes to core data models
  • 20. Analyst/developer workflow Begin with a Jira issue Most changes begin with Jira tickets to track the development and manage stakeholder communication Analyst builds change in dbt All analysts and collaborators are proficient in dbt and 100% of transformations are built using it. Tooling makes it easy Automated quality review - local All analysts use an automated suite which verifies transformations and repeatability, runs tests, and adds documentation and new tests - dbt validator Automated quality review - remote Automated tests on the PR check for general code quality, formatting, dependencies, etc Guided PR review and merge PR processes allow minimum waiting for review and minimum distraction for others
  • 22. dbt Development Every user has their own dev database Prior to starting, analysts can either clone tables or create views to production for project dependencies All raw data can be modelled and tested based on actual prod data
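The clone-or-view setup on this slide can be sketched as a small SQL generator. This is a minimal illustration, not the team's tooling: the function name `dev_setup_statements`, the database names, and the `schema.table` naming are all hypothetical. It relies only on Snowflake's `CREATE TABLE ... CLONE` and `CREATE VIEW` statements.

```python
from typing import List


def dev_setup_statements(
    tables: List[str], prod_db: str, dev_db: str, clone: bool = True
) -> List[str]:
    """Build one SQL statement per dependency.

    Each entry in `tables` is a hypothetical "schema.table" name.
    clone=True emits a Snowflake zero-copy clone; clone=False emits a
    pass-through view that reads directly from production.
    """
    statements = []
    for table in tables:
        if clone:
            statements.append(
                f"CREATE OR REPLACE TABLE {dev_db}.{table} CLONE {prod_db}.{table};"
            )
        else:
            statements.append(
                f"CREATE OR REPLACE VIEW {dev_db}.{table} "
                f"AS SELECT * FROM {prod_db}.{table};"
            )
    return statements


# Hypothetical usage: prepare one analyst's dev database.
for stmt in dev_setup_statements(["core.users", "core.bookings"], "PROD", "DEV_ALICE"):
    print(stmt)
```

A clone gives the analyst a writable snapshot, while a view always reflects current production data; which one fits depends on whether the dependency itself is being changed.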
  • 23. Local quality review Quality review is intended to check the following areas: 1. Code runnability 2. Existing tests 3. Data quality 4. Documentation 5. New tests
  • 24. Code quality testing and automated tests ● Code quality checks run in the following ways ○ Changed table and all dependent tables in project ○ If models are incrementally loaded, incremental refreshes ● Tests run on the changed model and all dependents ● Other projects are then checked for potential dependencies
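Running checks on "the changed model and all dependents" amounts to a downstream traversal of the model graph. The sketch below assumes the graph is available as a plain parent-to-children mapping; the model names and the function `downstream_models` are hypothetical, and the slides do not say how dbt validator actually resolves dependents.

```python
from collections import deque
from typing import Dict, List, Set


def downstream_models(children: Dict[str, List[str]], changed: str) -> Set[str]:
    """Return the changed model plus every model that depends on it,
    directly or transitively, using a breadth-first walk."""
    seen = {changed}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical model graph: each key lists the models built from it.
graph = {
    "stg_bookings": ["dwh_bookings"],
    "dwh_bookings": ["mart_finance", "ds_features"],
    "stg_users": ["dwh_users"],
}
print(sorted(downstream_models(graph, "dwh_bookings")))
```

The returned set is what would be re-run and re-tested; models outside it (here `stg_users` and `dwh_users`) are untouched by the change.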
  • 27. Net result: Low work, high confidence in changes
  • 28. PR Review ● PRs follow a standard structure and labelling ○ Local testing report card becomes the body of the PR ● Slack automation coordinates the review ○ Notifies the reviewers of the new PR ○ Informs dev of change requests ○ Tracks and labels when the PR is approved and then merged
  • 30. Deploying into Snowflake ● Changes in dbt models are detected when a PR is merged ● Deploy processes kick off automatically, running ○ The changed model ○ Dependent models (based on model type and name) ● Global data dictionaries are updated on server and google sheets with new information
  • 31. In depth: deployment evaluation process
  • 32. What if it goes wrong?
  • 33. Monitoring via automated testing ● All data tested every six hours ● Any failing tests posted to channel ● SQL added to a pastebin for easy troubleshooting
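The alerting step on this slide can be sketched as a function that turns a batch of test results into a channel message. Everything here is an assumption for illustration: the `CheckResult` type, the message layout, and the test names are hypothetical, and the actual posting to a channel and pastebin is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CheckResult:
    name: str     # test identifier (hypothetical naming)
    passed: bool  # outcome of the scheduled run
    sql: str      # query behind the test, included for troubleshooting


def failure_alert(results: List[CheckResult]) -> Optional[str]:
    """Return a message listing failing tests with their SQL,
    or None when every test passed (nothing to post)."""
    failures = [r for r in results if not r.passed]
    if not failures:
        return None
    lines = [f"{len(failures)} data test(s) failing:"]
    for r in failures:
        lines.append(f"- {r.name}")
        lines.append(f"  {r.sql}")
    return "\n".join(lines)


# Hypothetical results from one scheduled run.
run = [
    CheckResult("not_null_users_id", True, "select * from users where id is null"),
    CheckResult("unique_bookings_id", False,
                "select id from bookings group by id having count(*) > 1"),
]
print(failure_alert(run))
```

Returning `None` for a clean run keeps the channel quiet unless something needs attention.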
  • 34. Looking to the Future
  • 35. What is it? ● Standardized data profiling and testing ● Alerting on changes in data quality or structure Planned integration at TripActions ● Directly generate test profiles and configurations via pipelinewise ● Integration of great_expectations tests and data directly into tadoc / dbt docs
  • 36. Pipelinewise 2.0 ● Extend to “anywhere to anywhere” functionality with standardized JSON API importer functionality ● Source data discovery and reporting to show analysts/DS new data objects
  • 37. dbt Validator 2.0 ● Smart, dynamic re-cloning of objects into dev databases for faster testing ○ Cleanup functionality to prevent testing on stale objects ○ Fast clone based on dbt DAG to accelerate development ● Extended test capabilities including custom tests and data validation -> automated tests ● Automated reporting of BI dependencies on marts and tables
  • 38. Rob Winters | Director, Data | rwinters@tripactions.com Thank you!