尊敬的 微信汇率:1円 ≈ 0.046166 元 支付宝汇率:1円 ≈ 0.046257元 [退出登录]
SlideShare a Scribd company logo
SKILLWISE-ADVANCED INTEGRATION
SERVICES ON MICROSOFT SSIS
Advanced Integration
Services on Microsoft
SSIS
What is the Problem?
• Resource Contention
• Unreliable access to historical data
• Inconsistent application of
business rules
• Data structure results in slower,
more complex queries
The Data Warehouse Solution
• Initial Load
• Ongoing Load
Dimensional Modeling
Fact Table Characteristics
Dimension Table
Characteristics
Primary key Surrogate Key or
YYYYMMDD for Date Dimension
Business or Natural Key
Dimension columns: Labels, grouping,
filtering, sorting, drilling All data types
allowed
Data Profiling
ETL Design Patterns
Create snapshot of
database to simplify
error recovery
Master Extract Package
Master Transform-Load Package
Extract Package
Truncate staging before extraction
Add record to audit table
Extract data from source and load into staging table
Count records in staging table
Update audit record
Extract Data Flow
Connect to source
Store extracted row count in variable
Load extracted records into staging table
Store error row count in variable
Save error records
Load Patterns
Extract only new records
and load into transactional
fact table
Extract all records
Exists? Update if changes (maybe)
Fact Loads Dimension Loads
Fact Extract for Ongoing Load
@[User::ExtractSQL]
=@[$Project::initialLoad] ?
@[User::ExtractSQL] :
@[User::ExtractSQL] + " where
OrderDate > '" + (DT_WSTR, 50)
(DT_DBTIMESTAMP)
@[User::MaxDateTime] + "' "
Create SQL statement for
extract based on initialLoad
value
Look up max date for related fact table,
store in MaxDateTime variable
Slowly Changing Dimensions
• Type 0
• Type 1
• Type 2
Add new record
Expire old record
After LoadBefore Load
Dimension Load Pattern: Type
0
Connect to staging table
Store row count from staging in variable
Add AuditKey into pipeline
Transformation – varies by dimension
Test for dimension record existence with Lookup
Store row count for records to insert in variable
Load record into dimension table
Store error row count in variable
Save error records
Dimension Load Pattern: Type
1 or Type 2
Standard error handling
Update onlyExpire old record
Transformations here
Slowly Changing
Dimension
Transformation
Combine new & Type 2Add Start Date Insert new & Type 2
Connect to staging
Capture row count
Add audit key
Other Slowly Changing
Dimension Patterns
• Merge
• Change Data Capture (CD)
• Custom Components
Dimension Load Pattern
Extract data from staging and load into
dimension table
Count records in dimension table
Update audit record
Add record to audit table
Count records in dimension table
Fact Table Load Pattern
Capture and store last date processed
Fact load & error handling
Surrogate key lookups
Connect to staging
Add audit key
Analysis Services Objects
T-SQL MERGE Statement
• Datawarehouse Dimension Table
– Type 1 SCD
MERGE dw.DimProductAS target
USING (
SELECT Name, . . .
FROM tmp.scdProduct) AS source
ON target.ProductAlternateKey= source.ProductNumber
AND target.Status= 'Current'
WHEN MATCHED AND NOT (Source.Name=
ISNULL(target.EnglishProductName, ''))
THEN
UPDATE
SET target.EnglishProductName= source.Name
WHEN NOT MATCHED BY target AND source.SellEndDateIS NULL
THEN
INSERT (ProductAlternateKey, ProductSubcategoryKey, . . .)
VALUES (source.ProductNumber, source.ProductSubcategoryKey,
. . .)
OUTPUT $action as MergeAction, source.*
Type 1 SCD
MERGE dw.DimProductAS target
USING (
SELECT Name, . . .
FROM tmp.scdProduct) AS source
ON target.ProductAlternateKey= source.ProductNumber
AND target.Status= 'Current'
WHEN MATCHED AND NOT (Source.Name=
ISNULL(target.EnglishProductName, ''))
THEN
UPDATE
SET target.EnglishProductName= source.Name
WHEN NOT MATCHED BY target AND source.SellEndDateIS NULL
THEN
INSERT (ProductAlternateKey, ProductSubcategoryKey, . . .)
VALUES (source.ProductNumber,
source.ProductSubcategoryKey, . . .)
OUTPUT $action as MergeAction, source.*
Define join between tables:
Business key
Status indicator or date range
• Type 2 SCD
MERGE dw.DimProductAS target
USING (
SELECT Name, . . .
FROM tmp.scdProduct) AS source
ON target.ProductAlternateKey=
source.ProductNumber
AND target.Status= 'Current'
WHEN MATCHED AND NOT (Source.ListPrice=
ISNULL(target.ListPrice, ''))
THEN
UPDATE
SET target.Status= NULL, target.EndDate=
GETDATE()
WHEN NOT MATCHED BY target AND
source.SellEndDateIS NULL
THEN
INSERT (ProductAlternateKey,
ProductSubcategoryKey, . . .)
VALUES (source.ProductNumber,
source.ProductSubcategoryKey, . . .)
OUTPUT $action as MergeAction, source.*
Auditing and Type 2 Inserts
CREATE TABLE #DimProduct(MergeAction
NVARCHAR(10), Name NVARCHAR(50), . . .)
INSERT INTO #DimProduct
SELECT * FROM (
MERGE dw.DimProductAS target . . . ) mergeOutput
INSERT INTO dw.DimProduct
SELECT Name, . . . FROM #DimProduct
WHERE MergeAction= 'UPDATE'
SELECT
SUM(CASE WHEN MergeAction= 'INSERT'THEN 1 ELSE 0
END)
AS RowCountInsert,
SUM(CASE WHEN MergeAction= 'UPDATE'THEN 1 ELSE 0
END)
AS RowCountUpdate
FROM #DimProduct
SELECT
SUM(CASE WHEN MergeActionIN ('INSERT','UPDATE')
THEN 1 ELSE 0 END)
AS RowCountInsert,
SUM(CASE WHEN MergeAction= 'UPDATE'THEN 1 ELSE 0
END)
AS RowCountUpdate
FROM #DimProduct
Type 2 Inserts
Type
1
Audit
ing
Type 2
Auditi
ng
Error Handling
BEGIN TRY
BEGIN TRANSACTION
CREATE TABLE #DimProduct(MergeActionNVARCHAR(10),
Name NVARCHAR(50), . . .)
INSERT INTO #DimProduct
SELECT * FROM (
MERGE dw.DimProductAS target . . . ) mergeOutput
SELECT
SUM(CASE WHEN MergeAction= 'INSERT' THEN 1 ELSE 0
END)
AS RowCountInsert,
SUM(CASE WHEN MergeAction= 'UPDATE'THEN 1 ELSE 0 END)
AS RowCountUpdate
FROM #DimProduct
COMMIT
END TRY
BEGIN CATCH
. . . <Error Handling code> . . .
END CATCH
Dimension Load Pattern with MERGE
Change Data Capture (CDC)
• In Enterprise Edition
DATA CLEANING
Types of Dirty Data
• Column Problems
• Record Problems
• Business Rule Problems
SQL Query or Integration
Services?
But…. What if the source isn’t
relational? What if the
resulting query is very
complex?
But…. Packages can be
tedious to build However,
logging or data viewer
helpful
Column Problems: Missing
Data Default
• Problem: Missing Data
• Solution: Add Default
Column Problems: Derive
Missing Data
• Problem: Missing Data
• Solution: Use Lookup Query to
Derive Data
Column Problems:
Translation
• Problem: Code to Translate
• Solution: Lookup (optionally with
Cache Transform)
Columns Problems: Data Type
• Problem: Incompatible Data Types
• Solution: Data Conversion
Column Problems: Truncation
• Problem: Input Column Size > Destination
Column Size
• Solution 1: Derived Column with Left() or
Substring()
– But…Consider the impact on downstream
operations
• Solution 2: Flag Record for Manual
Intervention
Record Problems: Missing Dimension
Data
• Problem: Lookup Failures
– Fail
– Flag
– Fix
• Solution: Inferred Members
– Option 1: Insert unmatched records in advance of fact
load
– Option 2: Use script component to insert record and
return key
– Option 3: Use partial cache lookup task
Record Problems: Lookup Failures
• Problem: Inconsistent Data Across Sources
• Solution: Fuzzy Lookup Transformation
Record Problems: Duplicate Data
• Problem: Similar Data in Same Data Set
• Solution: Fuzzy Grouping Transformation
– Enterprise Edition Only
Business Rule Problems: Out of Range
Values
• Problem: Invalid Values
• Solution: Conditional Split to Flag
Data Quality Services
• DQS Connection Manager
• DQS Cleansing Transformation
Advanced integration services on microsoft ssis 1

More Related Content

What's hot

Getting the most out of your Oracle 12.2 Optimizer (i.e. The Brain)
Getting the most out of your Oracle 12.2 Optimizer (i.e. The Brain)Getting the most out of your Oracle 12.2 Optimizer (i.e. The Brain)
Getting the most out of your Oracle 12.2 Optimizer (i.e. The Brain)
SolarWinds
 
ETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testingETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testing
Vibrant Event
 
SQL Server 2016 novelties
SQL Server 2016 noveltiesSQL Server 2016 novelties
SQL Server 2016 novelties
MSDEVMTL
 
Stored procedure tuning and optimization t sql
Stored procedure tuning and optimization t sqlStored procedure tuning and optimization t sql
Stored procedure tuning and optimization t sql
nishantdavid9
 
Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]
Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]
Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]
ITCamp
 
Uncovering SQL Server query problems with execution plans - Tony Davis
Uncovering SQL Server query problems with execution plans - Tony DavisUncovering SQL Server query problems with execution plans - Tony Davis
Uncovering SQL Server query problems with execution plans - Tony Davis
Red Gate Software
 
Maximize Dynamics AX System Performance with a Health Check
Maximize Dynamics AX System Performance with a Health CheckMaximize Dynamics AX System Performance with a Health Check
Maximize Dynamics AX System Performance with a Health Check
Stoneridge Software
 
World2016_T5_S7_TeradataFunctionalOverview
World2016_T5_S7_TeradataFunctionalOverviewWorld2016_T5_S7_TeradataFunctionalOverview
World2016_T5_S7_TeradataFunctionalOverview
Farah Omer
 
Evolutionary database design
Evolutionary database designEvolutionary database design
Evolutionary database design
Salehein Syed
 
ETL Testing Overview
ETL Testing OverviewETL Testing Overview
ETL Testing Overview
Chetan Gadodia
 
World2016_T5_S5_SQLServerFunctionalOverview
World2016_T5_S5_SQLServerFunctionalOverviewWorld2016_T5_S5_SQLServerFunctionalOverview
World2016_T5_S5_SQLServerFunctionalOverview
Farah Omer
 
OLAP
OLAPOLAP
OLAP
Ashir Ali
 
Top 10 tips for Oracle performance (Updated April 2015)
Top 10 tips for Oracle performance (Updated April 2015)Top 10 tips for Oracle performance (Updated April 2015)
Top 10 tips for Oracle performance (Updated April 2015)
Guy Harrison
 
How we evolved data pipeline at Celtra and what we learned along the way
How we evolved data pipeline at Celtra and what we learned along the wayHow we evolved data pipeline at Celtra and what we learned along the way
How we evolved data pipeline at Celtra and what we learned along the way
Grega Kespret
 
The olap tutorial 2012
The olap tutorial 2012The olap tutorial 2012
The olap tutorial 2012
Amin Jalali
 
Designing high performance datawarehouse
Designing high performance datawarehouseDesigning high performance datawarehouse
Designing high performance datawarehouse
Uday Kothari
 
Optimization SQL Server for Dynamics AX 2012 R3
Optimization SQL Server for Dynamics AX 2012 R3Optimization SQL Server for Dynamics AX 2012 R3
Optimization SQL Server for Dynamics AX 2012 R3
Juan Fabian
 
Teradata 13.10
Teradata 13.10Teradata 13.10
Teradata 13.10
Teradata
 
SQL Server 2016 Editions
SQL Server 2016 Editions SQL Server 2016 Editions
SQL Server 2016 Editions
Onomi
 
Practical examples of using extended events
Practical examples of using extended eventsPractical examples of using extended events
Practical examples of using extended events
Dean Richards
 

What's hot (20)

Getting the most out of your Oracle 12.2 Optimizer (i.e. The Brain)
Getting the most out of your Oracle 12.2 Optimizer (i.e. The Brain)Getting the most out of your Oracle 12.2 Optimizer (i.e. The Brain)
Getting the most out of your Oracle 12.2 Optimizer (i.e. The Brain)
 
ETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testingETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testing
 
SQL Server 2016 novelties
SQL Server 2016 noveltiesSQL Server 2016 novelties
SQL Server 2016 novelties
 
Stored procedure tuning and optimization t sql
Stored procedure tuning and optimization t sqlStored procedure tuning and optimization t sql
Stored procedure tuning and optimization t sql
 
Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]
Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]
Real Time Operational Analytics with Microsoft Sql Server 2016 [Liviu Ieran]
 
Uncovering SQL Server query problems with execution plans - Tony Davis
Uncovering SQL Server query problems with execution plans - Tony DavisUncovering SQL Server query problems with execution plans - Tony Davis
Uncovering SQL Server query problems with execution plans - Tony Davis
 
Maximize Dynamics AX System Performance with a Health Check
Maximize Dynamics AX System Performance with a Health CheckMaximize Dynamics AX System Performance with a Health Check
Maximize Dynamics AX System Performance with a Health Check
 
World2016_T5_S7_TeradataFunctionalOverview
World2016_T5_S7_TeradataFunctionalOverviewWorld2016_T5_S7_TeradataFunctionalOverview
World2016_T5_S7_TeradataFunctionalOverview
 
Evolutionary database design
Evolutionary database designEvolutionary database design
Evolutionary database design
 
ETL Testing Overview
ETL Testing OverviewETL Testing Overview
ETL Testing Overview
 
World2016_T5_S5_SQLServerFunctionalOverview
World2016_T5_S5_SQLServerFunctionalOverviewWorld2016_T5_S5_SQLServerFunctionalOverview
World2016_T5_S5_SQLServerFunctionalOverview
 
OLAP
OLAPOLAP
OLAP
 
Top 10 tips for Oracle performance (Updated April 2015)
Top 10 tips for Oracle performance (Updated April 2015)Top 10 tips for Oracle performance (Updated April 2015)
Top 10 tips for Oracle performance (Updated April 2015)
 
How we evolved data pipeline at Celtra and what we learned along the way
How we evolved data pipeline at Celtra and what we learned along the wayHow we evolved data pipeline at Celtra and what we learned along the way
How we evolved data pipeline at Celtra and what we learned along the way
 
The olap tutorial 2012
The olap tutorial 2012The olap tutorial 2012
The olap tutorial 2012
 
Designing high performance datawarehouse
Designing high performance datawarehouseDesigning high performance datawarehouse
Designing high performance datawarehouse
 
Optimization SQL Server for Dynamics AX 2012 R3
Optimization SQL Server for Dynamics AX 2012 R3Optimization SQL Server for Dynamics AX 2012 R3
Optimization SQL Server for Dynamics AX 2012 R3
 
Teradata 13.10
Teradata 13.10Teradata 13.10
Teradata 13.10
 
SQL Server 2016 Editions
SQL Server 2016 Editions SQL Server 2016 Editions
SQL Server 2016 Editions
 
Practical examples of using extended events
Practical examples of using extended eventsPractical examples of using extended events
Practical examples of using extended events
 

Similar to Advanced integration services on microsoft ssis 1

Business Intelligence Portfolio
Business Intelligence PortfolioBusiness Intelligence Portfolio
Business Intelligence Portfolio
skymusic
 
Analysis Services en SQL Server 2008
Analysis Services en SQL Server 2008Analysis Services en SQL Server 2008
Analysis Services en SQL Server 2008
Eduardo Castro
 
Ssis event handler
Ssis event handlerSsis event handler
Ssis event handler
Kiki Noviandi
 
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and DatabricksSelf-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Grega Kespret
 
MMYERS Portfolio
MMYERS PortfolioMMYERS Portfolio
MMYERS Portfolio
Mike Myers
 
Kevin Bengtson Portfolio
Kevin Bengtson PortfolioKevin Bengtson Portfolio
Kevin Bengtson Portfolio
Kbengt521
 
Rodney Matejek Portfolio
Rodney Matejek PortfolioRodney Matejek Portfolio
Rodney Matejek Portfolio
rmatejek
 
Business Intelligence Portfolio
Business Intelligence PortfolioBusiness Intelligence Portfolio
Business Intelligence Portfolio
eileensauer
 
Business Intelligence Portfolio
Business Intelligence PortfolioBusiness Intelligence Portfolio
Business Intelligence Portfolio
eileensauer
 
Custom Star Creation for Ellucain's Enterprise Data Warehouse
Custom Star Creation for Ellucain's Enterprise Data WarehouseCustom Star Creation for Ellucain's Enterprise Data Warehouse
Custom Star Creation for Ellucain's Enterprise Data Warehouse
Bryan L. Mack
 
Sql server T-sql basics ppt-3
Sql server T-sql basics  ppt-3Sql server T-sql basics  ppt-3
Sql server T-sql basics ppt-3
Vibrant Technologies & Computers
 
Avoiding cursors with sql server 2005 tech republic
Avoiding cursors with sql server 2005   tech republicAvoiding cursors with sql server 2005   tech republic
Avoiding cursors with sql server 2005 tech republic
Kaing Menglieng
 
Introduction to SQLite in Adobe AIR
Introduction to SQLite in Adobe AIRIntroduction to SQLite in Adobe AIR
Introduction to SQLite in Adobe AIR
Peter Elst
 
2° Ciclo Microsoft CRUI 3° Sessione: l'evoluzione delle piattaforme tecnologi...
2° Ciclo Microsoft CRUI 3° Sessione: l'evoluzione delle piattaforme tecnologi...2° Ciclo Microsoft CRUI 3° Sessione: l'evoluzione delle piattaforme tecnologi...
2° Ciclo Microsoft CRUI 3° Sessione: l'evoluzione delle piattaforme tecnologi...
Jürgen Ambrosi
 
How Clean is your Database? Data Scrubbing for all Skill Sets
How Clean is your Database? Data Scrubbing for all Skill SetsHow Clean is your Database? Data Scrubbing for all Skill Sets
How Clean is your Database? Data Scrubbing for all Skill Sets
Chad Petrovay
 
Running Databases on AWS
Running Databases on AWSRunning Databases on AWS
Running Databases on AWS
Amazon Web Services
 
Database Refactoring Sreeni Ananthakrishna 2006 Nov
Database Refactoring Sreeni Ananthakrishna 2006 NovDatabase Refactoring Sreeni Ananthakrishna 2006 Nov
Database Refactoring Sreeni Ananthakrishna 2006 Nov
melbournepatterns
 
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
RTTS
 
ETL
ETL ETL
ETL
butest
 
Best Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache SparkBest Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache Spark
Databricks
 

Similar to Advanced integration services on microsoft ssis 1 (20)

Business Intelligence Portfolio
Business Intelligence PortfolioBusiness Intelligence Portfolio
Business Intelligence Portfolio
 
Analysis Services en SQL Server 2008
Analysis Services en SQL Server 2008Analysis Services en SQL Server 2008
Analysis Services en SQL Server 2008
 
Ssis event handler
Ssis event handlerSsis event handler
Ssis event handler
 
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and DatabricksSelf-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
 
MMYERS Portfolio
MMYERS PortfolioMMYERS Portfolio
MMYERS Portfolio
 
Kevin Bengtson Portfolio
Kevin Bengtson PortfolioKevin Bengtson Portfolio
Kevin Bengtson Portfolio
 
Rodney Matejek Portfolio
Rodney Matejek PortfolioRodney Matejek Portfolio
Rodney Matejek Portfolio
 
Business Intelligence Portfolio
Business Intelligence PortfolioBusiness Intelligence Portfolio
Business Intelligence Portfolio
 
Business Intelligence Portfolio
Business Intelligence PortfolioBusiness Intelligence Portfolio
Business Intelligence Portfolio
 
Custom Star Creation for Ellucain's Enterprise Data Warehouse
Custom Star Creation for Ellucain's Enterprise Data WarehouseCustom Star Creation for Ellucain's Enterprise Data Warehouse
Custom Star Creation for Ellucain's Enterprise Data Warehouse
 
Sql server T-sql basics ppt-3
Sql server T-sql basics  ppt-3Sql server T-sql basics  ppt-3
Sql server T-sql basics ppt-3
 
Avoiding cursors with sql server 2005 tech republic
Avoiding cursors with sql server 2005   tech republicAvoiding cursors with sql server 2005   tech republic
Avoiding cursors with sql server 2005 tech republic
 
Introduction to SQLite in Adobe AIR
Introduction to SQLite in Adobe AIRIntroduction to SQLite in Adobe AIR
Introduction to SQLite in Adobe AIR
 
2° Ciclo Microsoft CRUI 3° Sessione: l'evoluzione delle piattaforme tecnologi...
2° Ciclo Microsoft CRUI 3° Sessione: l'evoluzione delle piattaforme tecnologi...2° Ciclo Microsoft CRUI 3° Sessione: l'evoluzione delle piattaforme tecnologi...
2° Ciclo Microsoft CRUI 3° Sessione: l'evoluzione delle piattaforme tecnologi...
 
How Clean is your Database? Data Scrubbing for all Skill Sets
How Clean is your Database? Data Scrubbing for all Skill SetsHow Clean is your Database? Data Scrubbing for all Skill Sets
How Clean is your Database? Data Scrubbing for all Skill Sets
 
Running Databases on AWS
Running Databases on AWSRunning Databases on AWS
Running Databases on AWS
 
Database Refactoring Sreeni Ananthakrishna 2006 Nov
Database Refactoring Sreeni Ananthakrishna 2006 NovDatabase Refactoring Sreeni Ananthakrishna 2006 Nov
Database Refactoring Sreeni Ananthakrishna 2006 Nov
 
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
 
ETL
ETL ETL
ETL
 
Best Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache SparkBest Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache Spark
 

More from Skillwise Group

Skillwise Consulting New updated
Skillwise Consulting New updatedSkillwise Consulting New updated
Skillwise Consulting New updated
Skillwise Group
 
Email Etiquette
Email Etiquette Email Etiquette
Email Etiquette
Skillwise Group
 
Healthcare profile
Healthcare profileHealthcare profile
Healthcare profile
Skillwise Group
 
Manufacturing courses
Manufacturing coursesManufacturing courses
Manufacturing courses
Skillwise Group
 
Retailing & logistics profile
Retailing & logistics profileRetailing & logistics profile
Retailing & logistics profile
Skillwise Group
 
Skillwise orientation
Skillwise orientationSkillwise orientation
Skillwise orientation
Skillwise Group
 
Overview- Skillwise Consulting
Overview- Skillwise Consulting Overview- Skillwise Consulting
Overview- Skillwise Consulting
Skillwise Group
 
Skillwise corporate presentation
Skillwise corporate presentationSkillwise corporate presentation
Skillwise corporate presentation
Skillwise Group
 
Skillwise Profile
Skillwise ProfileSkillwise Profile
Skillwise Profile
Skillwise Group
 
Skillwise Softskill Training Workshop
Skillwise Softskill Training WorkshopSkillwise Softskill Training Workshop
Skillwise Softskill Training Workshop
Skillwise Group
 
Skillwise Insurance profile
Skillwise Insurance profileSkillwise Insurance profile
Skillwise Insurance profile
Skillwise Group
 
Skillwise Train and Hire Services
Skillwise Train and Hire ServicesSkillwise Train and Hire Services
Skillwise Train and Hire Services
Skillwise Group
 
Skillwise Digital Technology
Skillwise Digital Technology Skillwise Digital Technology
Skillwise Digital Technology
Skillwise Group
 
Skillwise Boot Camp Training
Skillwise Boot Camp TrainingSkillwise Boot Camp Training
Skillwise Boot Camp Training
Skillwise Group
 
Skillwise Academy Profile
Skillwise Academy ProfileSkillwise Academy Profile
Skillwise Academy Profile
Skillwise Group
 
Skillwise Overview
Skillwise OverviewSkillwise Overview
Skillwise Overview
Skillwise Group
 
SKILLWISE - OOPS CONCEPT
SKILLWISE - OOPS CONCEPTSKILLWISE - OOPS CONCEPT
SKILLWISE - OOPS CONCEPT
Skillwise Group
 
Skillwise - Business writing
Skillwise - Business writing Skillwise - Business writing
Skillwise - Business writing
Skillwise Group
 
Imc.ppt
Imc.pptImc.ppt
Skillwise cics part 1
Skillwise cics part 1Skillwise cics part 1
Skillwise cics part 1
Skillwise Group
 

More from Skillwise Group (20)

Skillwise Consulting New updated
Skillwise Consulting New updatedSkillwise Consulting New updated
Skillwise Consulting New updated
 
Email Etiquette
Email Etiquette Email Etiquette
Email Etiquette
 
Healthcare profile
Healthcare profileHealthcare profile
Healthcare profile
 
Manufacturing courses
Manufacturing coursesManufacturing courses
Manufacturing courses
 
Retailing & logistics profile
Retailing & logistics profileRetailing & logistics profile
Retailing & logistics profile
 
Skillwise orientation
Skillwise orientationSkillwise orientation
Skillwise orientation
 
Overview- Skillwise Consulting
Overview- Skillwise Consulting Overview- Skillwise Consulting
Overview- Skillwise Consulting
 
Skillwise corporate presentation
Skillwise corporate presentationSkillwise corporate presentation
Skillwise corporate presentation
 
Skillwise Profile
Skillwise ProfileSkillwise Profile
Skillwise Profile
 
Skillwise Softskill Training Workshop
Skillwise Softskill Training WorkshopSkillwise Softskill Training Workshop
Skillwise Softskill Training Workshop
 
Skillwise Insurance profile
Skillwise Insurance profileSkillwise Insurance profile
Skillwise Insurance profile
 
Skillwise Train and Hire Services
Skillwise Train and Hire ServicesSkillwise Train and Hire Services
Skillwise Train and Hire Services
 
Skillwise Digital Technology
Skillwise Digital Technology Skillwise Digital Technology
Skillwise Digital Technology
 
Skillwise Boot Camp Training
Skillwise Boot Camp TrainingSkillwise Boot Camp Training
Skillwise Boot Camp Training
 
Skillwise Academy Profile
Skillwise Academy ProfileSkillwise Academy Profile
Skillwise Academy Profile
 
Skillwise Overview
Skillwise OverviewSkillwise Overview
Skillwise Overview
 
SKILLWISE - OOPS CONCEPT
SKILLWISE - OOPS CONCEPTSKILLWISE - OOPS CONCEPT
SKILLWISE - OOPS CONCEPT
 
Skillwise - Business writing
Skillwise - Business writing Skillwise - Business writing
Skillwise - Business writing
 
Imc.ppt
Imc.pptImc.ppt
Imc.ppt
 
Skillwise cics part 1
Skillwise cics part 1Skillwise cics part 1
Skillwise cics part 1
 

Recently uploaded

ThousandEyes New Product Features and Release Highlights: June 2024
ThousandEyes New Product Features and Release Highlights: June 2024ThousandEyes New Product Features and Release Highlights: June 2024
ThousandEyes New Product Features and Release Highlights: June 2024
ThousandEyes
 
Multivendor cloud production with VSF TR-11 - there and back again
Multivendor cloud production with VSF TR-11 - there and back againMultivendor cloud production with VSF TR-11 - there and back again
Multivendor cloud production with VSF TR-11 - there and back again
Kieran Kunhya
 
Day 4 - Excel Automation and Data Manipulation
Day 4 - Excel Automation and Data ManipulationDay 4 - Excel Automation and Data Manipulation
Day 4 - Excel Automation and Data Manipulation
UiPathCommunity
 
ScyllaDB Tablets: Rethinking Replication
ScyllaDB Tablets: Rethinking ReplicationScyllaDB Tablets: Rethinking Replication
ScyllaDB Tablets: Rethinking Replication
ScyllaDB
 
QR Secure: A Hybrid Approach Using Machine Learning and Security Validation F...
QR Secure: A Hybrid Approach Using Machine Learning and Security Validation F...QR Secure: A Hybrid Approach Using Machine Learning and Security Validation F...
QR Secure: A Hybrid Approach Using Machine Learning and Security Validation F...
AlexanderRichford
 
Introduction to ThousandEyes AMER Webinar
Introduction  to ThousandEyes AMER WebinarIntroduction  to ThousandEyes AMER Webinar
Introduction to ThousandEyes AMER Webinar
ThousandEyes
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
Pablo Gómez Abajo
 
CTO Insights: Steering a High-Stakes Database Migration
CTO Insights: Steering a High-Stakes Database MigrationCTO Insights: Steering a High-Stakes Database Migration
CTO Insights: Steering a High-Stakes Database Migration
ScyllaDB
 
Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...
Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...
Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...
anilsa9823
 
TrustArc Webinar - Your Guide for Smooth Cross-Border Data Transfers and Glob...
TrustArc Webinar - Your Guide for Smooth Cross-Border Data Transfers and Glob...TrustArc Webinar - Your Guide for Smooth Cross-Border Data Transfers and Glob...
TrustArc Webinar - Your Guide for Smooth Cross-Border Data Transfers and Glob...
TrustArc
 
Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!
Ortus Solutions, Corp
 
ScyllaDB Real-Time Event Processing with CDC
ScyllaDB Real-Time Event Processing with CDCScyllaDB Real-Time Event Processing with CDC
ScyllaDB Real-Time Event Processing with CDC
ScyllaDB
 
Demystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through StorytellingDemystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through Storytelling
Enterprise Knowledge
 
APJC Introduction to ThousandEyes Webinar
APJC Introduction to ThousandEyes WebinarAPJC Introduction to ThousandEyes Webinar
APJC Introduction to ThousandEyes Webinar
ThousandEyes
 
Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...
Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...
Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...
manji sharman06
 
intra-mart Accel series 2024 Spring updates_En
intra-mart Accel series 2024 Spring updates_Enintra-mart Accel series 2024 Spring updates_En
intra-mart Accel series 2024 Spring updates_En
NTTDATA INTRAMART
 
Day 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio FundamentalsDay 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio Fundamentals
UiPathCommunity
 
Automation Student Developers Session 3: Introduction to UI Automation
Automation Student Developers Session 3: Introduction to UI AutomationAutomation Student Developers Session 3: Introduction to UI Automation
Automation Student Developers Session 3: Introduction to UI Automation
UiPathCommunity
 
An All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS MarketAn All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS Market
ScyllaDB
 
So You've Lost Quorum: Lessons From Accidental Downtime
So You've Lost Quorum: Lessons From Accidental DowntimeSo You've Lost Quorum: Lessons From Accidental Downtime
So You've Lost Quorum: Lessons From Accidental Downtime
ScyllaDB
 

Recently uploaded (20)

ThousandEyes New Product Features and Release Highlights: June 2024
ThousandEyes New Product Features and Release Highlights: June 2024ThousandEyes New Product Features and Release Highlights: June 2024
ThousandEyes New Product Features and Release Highlights: June 2024
 
Multivendor cloud production with VSF TR-11 - there and back again
Multivendor cloud production with VSF TR-11 - there and back againMultivendor cloud production with VSF TR-11 - there and back again
Multivendor cloud production with VSF TR-11 - there and back again
 
Day 4 - Excel Automation and Data Manipulation
Day 4 - Excel Automation and Data ManipulationDay 4 - Excel Automation and Data Manipulation
Day 4 - Excel Automation and Data Manipulation
 
ScyllaDB Tablets: Rethinking Replication
ScyllaDB Tablets: Rethinking ReplicationScyllaDB Tablets: Rethinking Replication
ScyllaDB Tablets: Rethinking Replication
 
QR Secure: A Hybrid Approach Using Machine Learning and Security Validation F...
QR Secure: A Hybrid Approach Using Machine Learning and Security Validation F...QR Secure: A Hybrid Approach Using Machine Learning and Security Validation F...
QR Secure: A Hybrid Approach Using Machine Learning and Security Validation F...
 
Introduction to ThousandEyes AMER Webinar
Introduction  to ThousandEyes AMER WebinarIntroduction  to ThousandEyes AMER Webinar
Introduction to ThousandEyes AMER Webinar
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
 
CTO Insights: Steering a High-Stakes Database Migration
CTO Insights: Steering a High-Stakes Database MigrationCTO Insights: Steering a High-Stakes Database Migration
CTO Insights: Steering a High-Stakes Database Migration
 
Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...
Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...
Call Girls Chennai ☎️ +91-7426014248 😍 Chennai Call Girl Beauty Girls Chennai...
 
TrustArc Webinar - Your Guide for Smooth Cross-Border Data Transfers and Glob...
TrustArc Webinar - Your Guide for Smooth Cross-Border Data Transfers and Glob...TrustArc Webinar - Your Guide for Smooth Cross-Border Data Transfers and Glob...
TrustArc Webinar - Your Guide for Smooth Cross-Border Data Transfers and Glob...
 
Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!
 
ScyllaDB Real-Time Event Processing with CDC
ScyllaDB Real-Time Event Processing with CDCScyllaDB Real-Time Event Processing with CDC
ScyllaDB Real-Time Event Processing with CDC
 
Demystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through StorytellingDemystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through Storytelling
 
APJC Introduction to ThousandEyes Webinar
APJC Introduction to ThousandEyes WebinarAPJC Introduction to ThousandEyes Webinar
APJC Introduction to ThousandEyes Webinar
 
Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...
Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...
Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...
 
intra-mart Accel series 2024 Spring updates_En
intra-mart Accel series 2024 Spring updates_Enintra-mart Accel series 2024 Spring updates_En
intra-mart Accel series 2024 Spring updates_En
 
Day 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio FundamentalsDay 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio Fundamentals
 
Automation Student Developers Session 3: Introduction to UI Automation
Automation Student Developers Session 3: Introduction to UI AutomationAutomation Student Developers Session 3: Introduction to UI Automation
Automation Student Developers Session 3: Introduction to UI Automation
 
An All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS MarketAn All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS Market
 
So You've Lost Quorum: Lessons From Accidental Downtime
So You've Lost Quorum: Lessons From Accidental DowntimeSo You've Lost Quorum: Lessons From Accidental Downtime
So You've Lost Quorum: Lessons From Accidental Downtime
 

Advanced integration services on microsoft ssis 1

  • 3. What is the Problem? • Resource Contention • Unreliable access to historical data • Inconsistent application of business rules • Data structure results in slower, more complex queries
  • 4. The Data Warehouse Solution • Initial Load • Ongoing Load
  • 7. Dimension Table Characteristics Primary key Surrogate Key or YYYYMMDD for Date Dimension Business or Natural Key Dimension columns: Labels, grouping, filtering, sorting, drilling All data types allowed
  • 9. ETL Design Patterns Create snapshot of database to simplify error recovery Master Extract Package Master Transform-Load Package
  • 10. Extract Package Truncate staging before extraction Add record to audit table Extract data from source and load into staging table Count records in staging table Update audit record
  • 11. Extract Data Flow Connect to source Store extracted row count in variable Load extracted records into staging table Store error row count in variable Save error records
  • 12. Load Patterns Extract only new records and load into transactional fact table Extract all records Exists? Update if changes (maybe) Fact Loads Dimension Loads
  • 13. Fact Extract for Ongoing Load @[User::ExtractSQL] =@[$Project::initialLoad] ? @[User::ExtractSQL] : @[User::ExtractSQL] + " where OrderDate > '" + (DT_WSTR, 50) (DT_DBTIMESTAMP) @[User::MaxDateTime] + "' " Create SQL statement for extract based on initialLoad value Look up max date for related fact table, store in MaxDateTime variable
  • 14. Slowly Changing Dimensions • Type 0 • Type 1 • Type 2 Add new record Expire old record After LoadBefore Load
  • 15. Dimension Load Pattern: Type 0 Connect to staging table Store row count from staging in variable Add AuditKey into pipeline Transformation – varies by dimension Test for dimension record existence with Lookup Store row count for records to insert in variable Load record into dimension table Store error row count in variable Save error records
  • 16. Dimension Load Pattern: Type 1 or Type 2 Standard error handling Update onlyExpire old record Transformations here Slowly Changing Dimension Transformation Combine new & Type 2Add Start Date Insert new & Type 2 Connect to staging Capture row count Add audit key
  • 17. Other Slowly Changing Dimension Patterns • Merge • Change Data Capture (CD) • Custom Components
  • 18. Dimension Load Pattern Extract data from staging and load into dimension table Count records in dimension table Update audit record Add record to audit table Count records in dimension table
  • 19. Fact Table Load Pattern Capture and store last date processed Fact load & error handling Surrogate key lookups Connect to staging Add audit key
  • 21. T-SQL MERGE Statement • Datawarehouse Dimension Table – Type 1 SCD MERGE dw.DimProductAS target USING ( SELECT Name, . . . FROM tmp.scdProduct) AS source ON target.ProductAlternateKey= source.ProductNumber AND target.Status= 'Current' WHEN MATCHED AND NOT (Source.Name= ISNULL(target.EnglishProductName, '')) THEN UPDATE SET target.EnglishProductName= source.Name WHEN NOT MATCHED BY target AND source.SellEndDateIS NULL THEN INSERT (ProductAlternateKey, ProductSubcategoryKey, . . .) VALUES (source.ProductNumber, source.ProductSubcategoryKey, . . .) OUTPUT $action as MergeAction, source.*
  • 22. Type 1 SCD MERGE dw.DimProductAS target USING ( SELECT Name, . . . FROM tmp.scdProduct) AS source ON target.ProductAlternateKey= source.ProductNumber AND target.Status= 'Current' WHEN MATCHED AND NOT (Source.Name= ISNULL(target.EnglishProductName, '')) THEN UPDATE SET target.EnglishProductName= source.Name WHEN NOT MATCHED BY target AND source.SellEndDateIS NULL THEN INSERT (ProductAlternateKey, ProductSubcategoryKey, . . .) VALUES (source.ProductNumber, source.ProductSubcategoryKey, . . .) OUTPUT $action as MergeAction, source.* Define join between tables: Business key Status indicator or date range
  • 23. • Type 2 SCD MERGE dw.DimProductAS target USING ( SELECT Name, . . . FROM tmp.scdProduct) AS source ON target.ProductAlternateKey= source.ProductNumber AND target.Status= 'Current' WHEN MATCHED AND NOT (Source.ListPrice= ISNULL(target.ListPrice, '')) THEN UPDATE SET target.Status= NULL, target.EndDate= GETDATE() WHEN NOT MATCHED BY target AND source.SellEndDateIS NULL THEN INSERT (ProductAlternateKey, ProductSubcategoryKey, . . .) VALUES (source.ProductNumber, source.ProductSubcategoryKey, . . .) OUTPUT $action as MergeAction, source.*
  • 24. Auditing and Type 2 Inserts CREATE TABLE #DimProduct(MergeAction NVARCHAR(10), Name NVARCHAR(50), . . .) INSERT INTO #DimProduct SELECT * FROM ( MERGE dw.DimProductAS target . . . ) mergeOutput INSERT INTO dw.DimProduct SELECT Name, . . . FROM #DimProduct WHERE MergeAction= 'UPDATE' SELECT SUM(CASE WHEN MergeAction= 'INSERT'THEN 1 ELSE 0 END) AS RowCountInsert, SUM(CASE WHEN MergeAction= 'UPDATE'THEN 1 ELSE 0 END) AS RowCountUpdate FROM #DimProduct SELECT SUM(CASE WHEN MergeActionIN ('INSERT','UPDATE') THEN 1 ELSE 0 END) AS RowCountInsert, SUM(CASE WHEN MergeAction= 'UPDATE'THEN 1 ELSE 0 END) AS RowCountUpdate FROM #DimProduct Type 2 Inserts Type 1 Audit ing Type 2 Auditi ng
  • 25. Error Handling BEGIN TRY BEGIN TRANSACTION CREATE TABLE #DimProduct(MergeActionNVARCHAR(10), Name NVARCHAR(50), . . .) INSERT INTO #DimProduct SELECT * FROM ( MERGE dw.DimProductAS target . . . ) mergeOutput SELECT SUM(CASE WHEN MergeAction= 'INSERT' THEN 1 ELSE 0 END) AS RowCountInsert, SUM(CASE WHEN MergeAction= 'UPDATE'THEN 1 ELSE 0 END) AS RowCountUpdate FROM #DimProduct COMMIT END TRY BEGIN CATCH . . . <Error Handling code> . . . END CATCH
  • 27. Change Data Capture (CDC) • In Enterprise Edition
  • 29. Types of Dirty Data • Column Problems • Record Problems • Business Rule Problems
  • 30. SQL Query or Integration Services? But…. What if the source isn’t relational? What if the resulting query is very complex? But…. Packages can be tedious to build However, logging or data viewer helpful
  • 31. Column Problems: Missing Data Default • Problem: Missing Data • Solution: Add Default
  • 32. Column Problems: Derive Missing Data • Problem: Missing Data • Solution: Use Lookup Query to Derive Data
  • 33. Column Problems: Translation • Problem: Code to Translate • Solution: Lookup (optionally with Cache Transform)
  • 34. Columns Problems: Data Type • Problem: Incompatible Data Types • Solution: Data Conversion
  • 35. Column Problems: Truncation • Problem: Input Column Size > Destination Column Size • Solution 1: Derived Column with Left() or Substring() – But…Consider the impact on downstream operations • Solution 2: Flag Record for Manual Intervention
  • 36. Record Problems: Missing Dimension Data • Problem: Lookup Failures – Fail – Flag – Fix • Solution: Inferred Members – Option 1: Insert unmatched records in advance of fact load – Option 2: Use script component to insert record and return key – Option 3: Use partial cache lookup task
  • 37. Record Problems: Lookup Failures • Problem: Inconsistent Data Across Sources • Solution: Fuzzy Lookup Transformation
  • 38. Record Problems: Duplicate Data • Problem: Similar Data in Same Data Set • Solution: Fuzzy Grouping Transformation – Enterprise Edition Only
  • 39. Business Rule Problems: Out of Range Values • Problem: Invalid Values • Solution: Conditional Split to Flag
  • 40. Data Quality Services • DQS Connection Manager • DQS Cleansing Transformation
  翻译: