DQ Project: DQ Tool Usage Highlights
DV
E) dvsingh@aninfosys.com M)+31629014650
These features form the basis for the success of the overall project:
– DQ Methodology
– Desired Special Product Features
– GxP Compliant Operational Model
– Business Friendly Reporting Model
Vision for DQ Project:
- Initial Phase: Manual update of Source Systems by DMM (Data Stewards)
- Future: Automatic update of Source Systems on a case-by-case basis
Process steps shown in the diagram:
1. Profile & Discover
2. Define & Design Metrics and Rules
3. Design & Implement Rules and Routines
4. Manage Exceptions
5. Monitor DQ Metrics
Roles shown in the diagram: Analyst; Business Users, Analyst; Developer & Architect; DMM; BPM / SuperUser; Managers (Project Dashboard)
Role-Based Data Quality Process
DMM
- Bad records / exceptions management
- Duplicates management
Business Analysts
- Profiling
- Business Rules
- Mapping specification
- Reference Tables
- DQ scorecards
- Business Glossary
Work Aspect Matrix
• (4) Exception/ Bad Records Handling/Creation
• (22) Business/Human Tasks Creation for Bad Records
• (22) Business/Human Tasks Workflow Management
• (22) Working on Bad Records/ Task items using Web UI
• (11) Security & Authorization around Tasks / Records viewing
• The DMM gets an easy view of all open task items related to data errors, can work through them one by one, and closes each one after applying the correction in the Source System.
• Optionally, multi-level approvals (a "third eye" check) can be introduced before a correction is marked as completed.
Manage Data Exceptions/Errors
Business process flow supported:
• The user closes a task in the Analyst tool after fixing the rule error in the source system.
• Tasks are created and closed at data item level plus rule level.
Technical process flow supported:
• Covers checking the data and creating DQ tasks.
• Tasks are created on the basis of Business Rules within a data domain and are handled by the users allocated to a particular business area. Each data item is already associated with a particular business area.
Process flow: Source the Product Data → Run Batch Validations → Identify Exceptions → Notify Responsible → Correct Data in Source Systems → Re-Validate
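The flow above, with tasks created and closed at data item level plus rule level, can be sketched roughly as follows. This is a minimal illustration in plain Python, not the DQ tool's own mechanism; the `Task` class, the rule names and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Task:
    """One task per (data item, failed rule) pair; names are hypothetical."""
    item_id: str
    rule_id: str
    business_area: str


def run_batch_validation(items: List[dict],
                         rules: Dict[str, Callable[[dict], bool]]) -> List[Task]:
    """Run every rule against every item and open a task for each failure."""
    tasks = []
    for item in items:
        for rule_id, passes in rules.items():
            if not passes(item):
                tasks.append(Task(item["id"], rule_id, item["business_area"]))
    return tasks


# Hypothetical business rules for a material record.
RULES = {
    "R1_DESCRIPTION_NOT_EMPTY": lambda item: bool(item["description"].strip()),
    "R2_WEIGHT_POSITIVE": lambda item: item["weight"] > 0,
}

items = [
    {"id": "M-001", "business_area": "Plant-A", "description": "Pump", "weight": 12.5},
    {"id": "M-002", "business_area": "Plant-B", "description": "", "weight": -1.0},
]

# First run: M-002 fails both rules, so two tasks are opened for it.
open_tasks = run_batch_validation(items, RULES)

# The steward corrects the record in the source system, then we re-validate.
items[1].update(description="Valve", weight=3.2)
remaining_tasks = run_batch_validation(items, RULES)
```

Because a task is keyed on both the item and the rule, fixing one field closes only the task for that rule; the other task for the same record stays open until its own rule passes on re-validation.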
(4) Exception / Bad Records Creation: this can be achieved using a special Exception Control task.
The Exception transformation identifies the following types of records based on each record's score:
Good records
• Records with scores greater than or equal to the upper threshold. Good records are valid and do not require review. For example, if you configure the upper threshold as 90, any record with a score of 90 or higher does not need review.
Bad records
• Records with scores less than the upper threshold and greater than or equal to the lower threshold. Bad records are the exceptions that you need to review in the Analyst tool. For example, when the lower threshold is 40, any record with a score from 40 up to, but not including, 90 needs manual review.
Rejected records
• Records with scores less than the lower threshold. Rejected records are not valid. By default, the Exception transformation drops rejected records from the data flow. In this example, any record with a score below 40 is a rejected record.
On the right, the slide shows:
a) A set of mappings that perform multiple levels of checks with increasing complexity.
b) An Exception task that fills the BAD records table.
c) A Human Task that manages data governance. For example, records are distributed to users based on Country code (they can also be distributed by values from a reference table).
• Once the BAD records table is filled, we need to decide who corrects each line item and how that work is allocated.
• This can be done in IDQ using a special component called «Human Task».
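As an illustration of distributing bad records by a data value, the sketch below assigns each record to a user through a lookup on Country code. It stands in for what the Human Task is configured to do and is not IDQ code; the reference table contents, user names and the fallback assignee are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical reference table: Country code -> responsible data steward.
COUNTRY_TO_STEWARD = {"NL": "steward_nl", "DE": "steward_de"}

# Hypothetical fallback for codes missing from the reference table.
DEFAULT_ASSIGNEE = "dq_superuser"


def distribute(bad_records: List[dict],
               reference: Dict[str, str] = COUNTRY_TO_STEWARD,
               default: str = DEFAULT_ASSIGNEE) -> Dict[str, List[str]]:
    """Group bad record ids into one work list per assignee."""
    worklists = defaultdict(list)
    for record in bad_records:
        assignee = reference.get(record["country_code"], default)
        worklists[assignee].append(record["id"])
    return dict(worklists)


bad_records = [
    {"id": "V-10", "country_code": "NL"},
    {"id": "V-11", "country_code": "DE"},
    {"id": "V-12", "country_code": "NL"},
    {"id": "V-13", "country_code": "FR"},  # not in the reference table
]
worklists = distribute(bad_records)
```

Keeping the mapping in a table rather than in code means allocations can change without touching the workflow definition.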
Scenario (a): Source System is manually updated
Lanes: DQ System; DMM; BPM / Superuser
Steps: Execute DQ mappings; Review / open assigned tasks (DMM); Correct records in Source System; Review / open assigned tasks (BPM / Superuser); Approve / Reject changes; Next Steps
Record sets: Good records; Bad records; Rejected records; Approved records
Legend: Automated; Manual
(22) Business/Human Tasks Workflow Management
Scenario (b): Source System is auto-updated from the DQ Platform
Lanes: DQ System; DMM; BPM / Superuser
Steps: Execute DQ mappings; Review / open assigned tasks (DMM); Correct records in DQ UI; Review / open assigned tasks (BPM / Superuser); Approve / Reject changes; Collect & Write to target
Record sets: Good records; Bad records; Rejected records; Approved records
Legend: Automated; Manual
(22) Working on Bad Records / Task items using Web UI
• A user sees all data correction tasks assigned to them
• They can access the bad records with those errors
• They can visualize all errors at record level and update them
• They can see all audits related to a data record
• The page shows an overview of all tasks associated with you
• You see all your DQ tasks on the main login page of the Analyst tool
• This is a summarized view; click on it to see the full list of records and errors
• It includes working tasks (DMM) and review tasks (Superuser)
My Task Page
• Here you can see the full set of records associated with one task, each having one or more errors
• Error fields are marked in red
• Initial Phase: Perform the correction in the Source System for these fields
• The view allows correcting records, adding notes, and viewing field errors
Detailed Task View: Record Listing
• This screen also lets you edit the errors directly
• (useful only for the automated Source System update scenario)
• Updates made here can be reviewed by the BPM / Superuser, so it is useful to record them here too, in addition to updating the Source System manually
Detailed Task View: Record Editing
• Here you can view the change history for a data record
• Useful for sensitive data
• Changed fields are shown with a green tick
Tasks: Data Auditing Tab
Activities shown in the diagram: Manage Exceptions; Manage Review (Third Eye principle)
Roles shown in the diagram: Business Users, Analyst; Developer; BPM / SuperUser / Manager; DMM & Analyst & Data Steward
Integrate People with Automated DQ Lifecycle Processes
• Around Tasks / Records viewing
• Around Tasks Distribution
• What it means and how we can do it in the DQ Tool
Security & Authorization
A client implementation can include:
100+ roles for Material Master
400+ roles for Vendor
800+ roles for Customers
• Need to support a complex data hierarchy within each subject area
• A vast number of DMMs exist
• Need to restrict each DMM's data visibility to their own data only
• Support for a large number of data roles
• Superusers need more flexible data access
• Support for access to multiple subject areas / data domains
• Support for exporting data to Excel with restricted data
• GUI capability to navigate through error data and read the associated errors seamlessly
• Data is highly sensitive and requires restrictions at various levels
• Employees have a complex organizational setup and manage multiple data sets
Data & Security Challenge
Level 1 restriction on data can be achieved by distributing data based on data values. It is done in the Workflow tab of DQ Developer using a special component called «Human Task».
On the right, the slide shows:
a) Human Task: manages data distribution governance. For example, records are distributed to users based on Material Type code (they can also be distributed by values from a reference table).
b) The Task Performer can even be a Group in LDAP, and it will be restricted to specific data only.
Task Creation : Data Restriction
Level 2: Configuration of the Human Task to restrict data to a limited set of DMMs
We need to further configure the Human Task step to define the full process around tasks and review:
• With an Exception Step (DMM corrects data), restrict who can act as DMM
• With a Review Step (BPM / Superuser manager approves changes), restrict the list of reviewers
Task Creation : DMM Restriction
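The two restricted steps can be pictured as a small state machine: only a listed DMM may correct, and only a listed reviewer may approve or send back. The sketch below is a plain-Python illustration of that idea, not the Human Task configuration itself; the class, status names and user names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass
class HumanTask:
    """Hypothetical two-step task: Exception Step, then Review Step."""
    record_id: str
    exception_performers: FrozenSet[str]   # who may act as DMM
    reviewers: FrozenSet[str]              # who may approve or send back
    status: str = "OPEN"
    history: List[Tuple[str, str]] = field(default_factory=list)

    def correct(self, user: str) -> None:
        """Exception Step: a permitted DMM submits a correction for review."""
        if user not in self.exception_performers:
            raise PermissionError(f"{user} may not act as DMM on this task")
        if self.status not in ("OPEN", "SENT_BACK"):
            raise ValueError(f"cannot correct a task in status {self.status}")
        self.status = "IN_REVIEW"
        self.history.append((user, "corrected"))

    def review(self, user: str, approve: bool) -> None:
        """Review Step: a permitted reviewer approves or sends the task back."""
        if user not in self.reviewers:
            raise PermissionError(f"{user} is not a reviewer for this task")
        if self.status != "IN_REVIEW":
            raise ValueError(f"cannot review a task in status {self.status}")
        self.status = "APPROVED" if approve else "SENT_BACK"
        self.history.append((user, "approved" if approve else "sent back"))


task = HumanTask("M-002", frozenset({"dmm_anna"}), frozenset({"bpm_raj"}))
task.correct("dmm_anna")
task.review("bpm_raj", approve=True)
```

Keeping the performer and reviewer sets disjoint is one way to enforce the third-eye principle, since nobody can then approve their own correction.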
• Groups and roles are supported to organize security around data visibility and the actions that can be executed on data
• This is supported seamlessly across the Analyst and Developer tools
DQ-Security Model Schematic
a) Users: Create normal users in the Analyst tool or in LDAP.
b) Groups: Create groups based on tool access and functionality; this is required for the tool access aspect. Also create groups based on Subject Area ROLES as defined within corporate functions; this is needed for the data security aspect. Roles can also be created in LDAP.
c) Human Task creation: Base it on the data values of the Subject Area Roles and assign it to the tool's Data Roles derived from the tool configuration, with a 1-to-1 mapping. Manage all allocations in a table for dynamic usage.
Approach to Configuring Roles, Groups and Task Access
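One way to picture the allocation table is as a list of rows that map each Subject Area Role to exactly one tool Data Role and to the data slice it may see. The sketch below is an assumption about how such a table could be used for visibility filtering, not the tool's security implementation; every role name, column name and material type value is hypothetical.

```python
from typing import Iterable, List

# Hypothetical allocation table: Subject Area Role <-> tool Data Role (1 to 1),
# plus the data slice (here a Material Type code) that the role may see.
ALLOCATIONS = [
    {"subject_area_role": "MM_FINISHED_GOODS", "tool_data_role": "DQ_MM_FERT", "material_type": "FERT"},
    {"subject_area_role": "MM_RAW_MATERIALS", "tool_data_role": "DQ_MM_ROH", "material_type": "ROH"},
]


def is_one_to_one(allocations: List[dict]) -> bool:
    """Check that each Subject Area Role maps to exactly one Data Role and back."""
    pairs = {(a["subject_area_role"], a["tool_data_role"]) for a in allocations}
    subject_roles = {subject for subject, _ in pairs}
    tool_roles = {tool for _, tool in pairs}
    return len(pairs) == len(subject_roles) == len(tool_roles)


def visible_records(user_roles: Iterable[str], records: List[dict],
                    allocations: List[dict]) -> List[dict]:
    """Return only the records whose data slice is allocated to the user's roles."""
    roles = set(user_roles)
    allowed = {a["material_type"] for a in allocations if a["tool_data_role"] in roles}
    return [r for r in records if r["material_type"] in allowed]


records = [
    {"id": "M-001", "material_type": "FERT"},
    {"id": "M-002", "material_type": "ROH"},
    {"id": "M-003", "material_type": "FERT"},
]
dmm_view = visible_records(["DQ_MM_FERT"], records, ALLOCATIONS)
superuser_view = visible_records(["DQ_MM_FERT", "DQ_MM_ROH"], records, ALLOCATIONS)
```

A superuser is modeled here simply as a user holding several Data Roles, which gives the more flexible access without a separate code path.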
• Management wants full visibility of periodic progress
• Upper management values the ability to drill down into various aspects in real time
• Access to reports over a Web UI is critical
• The DQ Project and Tools need to provide out of the box, or at least support, an Operational Model for aspects like GxP compliance
• The DQ Project and Tools need to provide out of the box, or at least support, a Reporting Model
Reporting and GxP Compliance
Architecture diagram (labels recovered from the slide):
Layers: Landing; Staging; Base Object
Tools: DQ / ETL Tool; DQ HUB; Analyst / DQ Tool; Analytics Tool; Reporting Tool; Microsoft Excel
Data stores: Source (two sources); Source Replica tables; Landing tables; Staging tables; DQ Source tables; Errors & Data tables
Processing: Extractor (no transform); Aggregator & Joiners (incl. Filtering); Define Validation Rules (Mapplets); Define & Execute Validation Rules (Mappings & Workflows)
Access: UI Access; DQ Errors; Data Stewards (Role/Group); Data Stewards (fix manually)
Reporting Model
• A reporting model that supports full-scale corporate reporting does not come out of the box with the product, so we need to sketch the process and data model around it
• A data model for GxP, process audit, etc. does not come out of the box with the product either, so we need to sketch the process and data model as part of the DQ rule-checking framework
Internal and Confidential
Application Schema Analyst / IDQ for DQ Configuration
Table | Description
DQC_BUS_ERROR_DTL | Enhanced error messages, based on business input, for rule failures.
DQC_ERROR_MST | System-generated error message for each rule.
DQC_DATA_STEW_MST | List of data stewards along with their managers.
DQC_PRM_THRESHOLD | Threshold values used to route records, based on score, into good records, bad records and rejected records.
DQC_ROLE_DATA_STEW_ASSGN | Association between Role and Data Steward for each Site and Source System.
DQC_ROLE_MST | All roles defined for the DQ solution across sites and source systems.
DQC_ROLE_RULE_ASSIGN | Rules assigned to the various Roles across sites and source systems, along with an Active Flag.
DQC_RULE_BOOK | Definition of all rules configured, along with Activation Flag and Date.
DQC_RULE_OBJECT_ASSIGN | Rule assigned to a data object or a slice of a data object.
DQC_RULE_OBJVALUE_ON_OFF | Rule switch on / off at data object / data object value level.
DQC_SAP_FLD_NAMES | Technical field names along with field descriptions for reporting.
DQC_SITE_MST | Sites configured for the DQ system.
DQC_SRC_SYSTEM | Source Systems configured for the DQ system.
DQC_SUB_SUBJ_AREA | Category within a Subject Area; for example, within MM we have Plant, S-Org, BOM, Storage Loc etc.
DQC_SUBJ_AREA_MST | Subject area master, such as MM, Vendor, Customer.
DQC_ROLE_OBJVAL_ASSIGN_MM_PLNT (*) | Assignment of a data slice to a Role (a slice would be plant, site, material type etc.).
DQO_MM_GENL (*) | Operation table containing MM General Data.
DQO_MM_GENL_PLNT (*) | Operation table containing MM Plant data, with score and check type details added.
DQO_MM_GENL_PLNT_BAD (*) | System generated. Operation table containing MM Plant data for records categorised as bad records.
DQO_MM_GENL_PLNT_BAD_ISSUE (*) | System generated. Operation table containing MM Plant records with issue / error details for records categorised as bad records.
DQO_MM_GENL_PLNT_SRC (*) | Operation table containing MM Plant data in raw format.
DQO_MM_GENL_SRC (*) | Operation table containing MM general data in raw format.
DQO_MM_GENL_WHLST (*) | Whitelist of records for general data. No DQ checks are performed for these records.
DQO_RUN_DTL | DQ results for each run at field level, for all rules attached to the field.
DQR_DIM_DATE | Reporting Time Dimension (Draft).
DQR_DIM_ERROR_MST | Reporting Error Dimension (Draft).
DQR_DIM_ORG | Reporting Organisation Dimension: Role / Data Steward / Site / Source System (Draft).
DQR_DIM_RULE | Reporting Rule Dimension (Draft).
DQR_FCT_DATA_QUALITY | Reporting Data Quality Fact table (Draft).
(*) These tables are object-specific and will be duplicated for each data object.
Rules, Roles, Exception, Data owners, Data Segmentation
Diagram labels: Source System 1; Source System 2; Reference Tables; Data object Definition; Configuration Tables; Check Results; Distribute Results / Tasks; Material Type; Material Category; List of Plants; Material Group
Data Quality Operational / DQO
• These are the operational tables: the tables on which data quality checks are performed and the tables that store the outcomes.
• Examples of Material source data are the plant data table, PIR data, or general data.
• Each source data object table is complemented by system-generated tables (such as the bad table and the issue table) and by application tables (such as the Whitelist table and the Source table). The ER diagram on the right shows the MM Plant data object tables, which use MM_GENL_ in their names.
• (*) These tables are object-specific and will be duplicated
• DQO_RUN_DTL stores all DQ check results at data attribute level.
• On purpose, no relationships are built at database level for these tables.
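To make the roles of the whitelist table and the run detail table concrete, the sketch below skips whitelisted records and stores one result row per record, attribute and rule. It is a plain-Python illustration under assumed column names; the real layouts of DQO_MM_GENL_WHLST and DQO_RUN_DTL are not given in the slides, and the rule ids and sample data are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical rules, keyed by attribute: attribute -> [(rule id, check)].
FieldRules = Dict[str, List[Tuple[str, Callable[[object], bool]]]]


def run_checks(records: List[dict], whitelist: Set[str],
               rules: FieldRules, run_id: int) -> List[dict]:
    """Return run detail rows at attribute level, skipping whitelisted records."""
    run_details = []
    for record in records:
        if record["id"] in whitelist:
            continue  # whitelisted records get no DQ checks at all
        for attribute, attribute_rules in rules.items():
            for rule_id, check in attribute_rules:
                run_details.append({
                    "run_id": run_id,
                    "record_id": record["id"],
                    "attribute": attribute,
                    "rule_id": rule_id,
                    "passed": check(record[attribute]),
                })
    return run_details


rules: FieldRules = {
    "plant": [("R10_PLANT_FILLED", lambda value: bool(value))],
    "weight": [("R20_WEIGHT_POSITIVE", lambda value: value > 0)],
}
records = [
    {"id": "M-001", "plant": "NL01", "weight": 5.0},
    {"id": "M-002", "plant": "", "weight": 2.0},
    {"id": "M-003", "plant": "", "weight": -1.0},  # whitelisted below
]
run_details = run_checks(records, whitelist={"M-003"}, rules=rules, run_id=1)
failures = [row for row in run_details if not row["passed"]]
```

Storing a row for passes as well as failures keeps enough detail to compute pass rates later without re-running the checks.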
DQ – Reporting Dimensional Data Model (Extension)
Dimension tables shown: D_L_RULE_BOOK (RULE_NUMBER, RULE_NAME, RULE_DESCRIPTION, RULE_PURPOSE, ....); D_L_ROLE_ASSIGNMENT; D_L_SUBJECT_AREA; D_DQ_STEWARD; D_L_RULE_CATEGORY; D_L_SOURCE_SYS; D_L_SEVERITY; D_L_DATE
Fact tables shown: F_H_DQ_ANALYSIS_AGGR; F_L_DQ_RUN; F_L_RULE_RUN; F_L_DQ_ERROR
Entities / Tables: D_L_ROLE_ASSIGNMENT, D_L_RULE_BOOK, D_L_RULE_CATEGORY, D_L_SEVERITY, D_L_SOURCE_SYS, D_L_SUBJECT_AREA
Description: Dimension tables that contain:
- All DQ rules
- Rule severity
- Data hierarchy
- Data Steward role
- Assignment of Data Steward access to the various data hierarchies
- The various subject areas
Entities / Tables: F_H_DQ_ANALYSIS_AGGR, F_L_DQ_ERROR, F_L_DQ_RUN, F_L_RULE_RUN
Description: Fact tables that capture all statistics for reporting, both at the lower record level and at the aggregated level:
- Generic error data
- Generic analysis data
- All run details
- Details of all rules applied during each run
Reporting Data Model Description
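The relationship between a record-level fact table and an aggregated one can be illustrated with a simple roll-up of rule results per run. This is a sketch under assumed column names (`run_id`, `rule_number`, `passed`); the actual columns of F_L_RULE_RUN and F_H_DQ_ANALYSIS_AGGR are not listed in the slides.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def aggregate_rule_runs(rule_run_rows: List[dict]) -> Dict[Tuple[int, str], dict]:
    """Roll record-level rule results up to one row per (run, rule)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [passed, checked]
    for row in rule_run_rows:
        key = (row["run_id"], row["rule_number"])
        counts[key][0] += 1 if row["passed"] else 0
        counts[key][1] += 1
    return {
        key: {"checked": checked, "passed": passed, "pass_rate": passed / checked}
        for key, (passed, checked) in sorted(counts.items())
    }


# Hypothetical record-level rows, one per record and rule in a run.
rule_run_rows = [
    {"run_id": 1, "rule_number": "R10", "passed": True},
    {"run_id": 1, "rule_number": "R10", "passed": False},
    {"run_id": 1, "rule_number": "R20", "passed": True},
    {"run_id": 2, "rule_number": "R10", "passed": True},
]
aggregated = aggregate_rule_runs(rule_run_rows)
```

Joining such aggregated rows to dimensions like rule, severity or subject area is what would let a dashboard drill down from a summary to the underlying errors.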
Feature | Business Benefit
Proven framework | Quick ROI and a low learning curve for implementing a DQ project.
Platform- and technology-independent DQ framework | Eliminates the need for any new HW and SW purchases.
Any kind of DQ / ETL tool, UI, or analysis tool can be used | Zero investment in new products. Reuse of in-house technology and applications.
Can be used for any kind of data, be it Material Master, Vendor, Customer, or non-standard data such as Vehicle or Crane | 100% flexibility to extend to any new data area.
Quick and easy to use and build | Short implementation time. Regular, low-cost developers can build it.
New reporting dimensions can be added easily | Flexibility to extend the framework to accommodate project-based dimensions without depending on the product supplier or expert consultants.
Flexible data security model that supports various complex data hierarchies and vast combinations of data stewards | Helps eliminate data security and data sensitivity discussions and prevents unnecessary delays in DQ project planning.
Benefits of Framework
DQ Product Usage Methodology Highlights_v6_ltd

  • 1. DQ Project: DQ Tool Usage Highlights DV E) dvsingh@aninfosys.com M)+31629014650
  • 2. These features form the basis for the success of overall Project : – DQ Methodology – Desired Special Product Features – GxP Compliant Operational Model – Business Friendly Reporting Model
  • 3. Vision for DQ Project: - Initial Phase: Manual update of Source Systems by DMM (Data Stewards) - Future: Auto Update of Source Systems on case basis
  • 4. Analyst Business Users, Analyst Developer & Architect 1. Profile & Discover 2. Define & Design Metrics and Rules 3. Design & Implement Rules and Routines 4. Manage Exceptions 5. Monitor DQ Metrics DMM BPM / SuperUser Role-Based Data Quality Process Managers Project Dash board
  • 5. DMM - Bad records / exceptions mgmt - Duplicates mgmt Business Analysts - Profiling - Business Rules - Mapping specification - Reference Tables - DQ scorecards - Business Glossary Work Aspect Matrix
  • 6. • (4) Exception/ Bad Records Handling/Creation • (22) Business/Human Tasks Creation for Bad Records • (22) Business/Human Tasks Workflow Management • (22) Working on Bad Records/ Task items using Web UI • (11) Security & Authorization around Tasks / Records viewing
  • 7.  DMM can get easy view of all open task items related to data errors and can then follow them one by one and close them after updating the correction in Source System.  There is option to introduce multi level approvals / third eye check mechanism too before a correction can be marked as correctly completed 7 Manage Data Exceptions/Errors
  • 8. Business process flow supported: The user closes a task in the Analyst tool after fixing the rule error in the data source system. Tasks are created and closed at data item level + rule level. Technical process flow supported (for checking data and creating DQ tasks): Tasks are created on the basis of Business Rules within a data domain and must be handled by the users allocated to a particular business area; the data items are already associated with a particular business area. Flow: Source the Product Data → Run Batch Validations → Identify Exceptions → Notify Responsible → Correct Data in Source Systems → Re-Validate.
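The "Run Batch Validations" and "Identify Exceptions" steps, with one task per data item and rule, can be sketched as follows. This is only an illustration under stated assumptions: the function `run_batch_validation`, the two example rules, and the record fields (`id`, `plant`, `material_type`) are hypothetical and are not part of the DQ tool.

```python
# Hypothetical business rules: rule name -> predicate that returns True when
# the record passes. Real rules would be defined in the DQ tool.
EXAMPLE_RULES = {
    "plant_present": lambda record: bool(record.get("plant")),
    "material_type_present": lambda record: bool(record.get("material_type")),
}


def run_batch_validation(records, rules):
    """Return one task per failing (data item, rule) pair.

    Each task is a tuple (record id, rule name), mirroring the slide's
    statement that tasks exist at data item level + rule level.
    """
    tasks = []
    for record in records:
        for rule_name, passes in rules.items():
            if not passes(record):
                tasks.append((record["id"], rule_name))
    return tasks


if __name__ == "__main__":
    sample = [
        {"id": 1, "plant": "P100", "material_type": "A"},
        {"id": 2, "plant": "", "material_type": ""},
    ]
    for task in run_batch_validation(sample, EXAMPLE_RULES):
        print(task)
```

Re-validation is then simply a second call on the corrected data: a data item whose errors were fixed in the source system no longer produces tasks.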
  • 9. (4) Exception / Bad Records Creation: this can be achieved using a special Exception Control task. The Exception transformation identifies the following types of records based on each record's score. Good records: records with scores greater than or equal to the upper threshold. Good records are valid and do not require review. For example, if you configure the upper threshold as 90, any record with a score of 90 or higher does not need review. Bad records: records with scores less than the upper threshold and greater than or equal to the lower threshold. Bad records are the exceptions that you need to review in the Analyst tool. For example, when the lower threshold is 40, any record with a score from 40 up to, but not including, 90 needs manual review. Rejected records: records with scores less than the lower threshold. Rejected records are not valid. By default, the Exception transformation drops rejected records from the data flow. In this example, any record with a score below 40 is a rejected record.
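The score-based routing can be sketched in a few lines. This is a minimal illustration that follows the three definitions on the slide (good: at or above the upper threshold; bad: at or above the lower threshold but below the upper; rejected: below the lower threshold). It is not the Exception transformation itself: the function name `classify_record` is an assumption, and the defaults 90 and 40 are taken from the slide's example.

```python
def classify_record(score, upper=90, lower=40):
    """Route a record to 'good', 'bad' or 'rejected' based on its score."""
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if score >= upper:
        return "good"      # valid, no review needed
    if score >= lower:
        return "bad"       # exception, needs manual review in the Analyst tool
    return "rejected"      # not valid, dropped from the data flow by default


if __name__ == "__main__":
    for score in (95, 90, 89, 40, 39):
        print(score, classify_record(score))
```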
  • 10. Once the BAD records table is filled, we need to decide who corrects each line item and how the work is allocated. This can be done using a special component in IDQ called «Human Task». The screenshot on the right shows: (a) a set of mappings that perform multiple levels of checks with increasing complexity; (b) an Exception task that fills the BAD table; (c) a Human Task that manages data governance, for example distributing records to users by country code (records can also be distributed by values from a reference table).
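Distribution of bad records by country code can be pictured as a lookup against a reference table. The sketch below is an assumption-laden illustration, not the Human Task component: the mapping `COUNTRY_TO_ASSIGNEE`, the fallback assignee, and the `country_code` field are all hypothetical names.

```python
from collections import defaultdict

# Hypothetical reference table: country code -> responsible user or group.
COUNTRY_TO_ASSIGNEE = {"NL": "dmm_nl", "DE": "dmm_de"}

# Assumption: records with an unmapped country code go to a fallback work list.
DEFAULT_ASSIGNEE = "dmm_unassigned"


def distribute_bad_records(bad_records, mapping=COUNTRY_TO_ASSIGNEE):
    """Group bad records into one work list per assignee, keyed by country code."""
    work_lists = defaultdict(list)
    for record in bad_records:
        assignee = mapping.get(record.get("country_code"), DEFAULT_ASSIGNEE)
        work_lists[assignee].append(record)
    return dict(work_lists)
```

Keeping the mapping in a table rather than in code means the allocation can change without touching the workflow, which is the point of distributing by reference-table values.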
  • 11. (22) Business/Human Tasks Workflow Management. Scenario (a): Source System is manually updated. Diagram (legend: automated vs. manual steps): the DQ System executes DQ mappings, producing good, bad and rejected records; the DMM reviews/opens assigned tasks and corrects records in the Source System; the BPM / Superuser reviews/opens assigned tasks and approves/rejects changes, giving approved records; then next steps.
  • 12. Scenario (b): Source System is auto-updated from the DQ Platform. Diagram (legend: automated vs. manual steps): the DQ System executes DQ mappings, producing good, bad and rejected records; the DMM reviews/opens assigned tasks and corrects records in the DQ UI; the BPM / Superuser reviews/opens assigned tasks and approves/rejects changes, giving approved records; then Collect & Write to target.
  • 13. (22) Working on Bad Records / Task items using Web UI: A user sees all data correction tasks assigned to them, can access the bad records with those errors, can view all errors at record level and update them, and can see all audits related to a data record.
  • 14. My Task Page: It shows an overview of all tasks associated with you. You see all your DQ tasks on the main login page of the Analyst tool. This is a summarized view; click on it to see the full list of records and errors. It includes working tasks (DMM) and review tasks (Superuser).
  • 15. Detailed Task View: Record listing. Here you can see the full set of records associated with one task, each having one or more errors. Error fields are marked in red. Initial phase: perform the correction in the Source System for these fields. The view allows correction, adding notes, and viewing field errors.
  • 16. Detailed Task View: Record editing. This screen also allows you to edit the errors directly (useful for the automated Source System update scenario only). Updates made here can be reviewed by the BPM / Superuser, so it is useful to make the update here as well as in the Source System directly.
  • 17. Tasks: Data Auditing tab. Here you can view the change history for a data record. Useful for sensitive data. Changed fields show up with a green tick.
  • 18. Integrate People with Automated DQ Lifecycle Processes. Manage Exceptions. Manage: DMM & Analyst & Data Steward. Review (Third Eye principle): BPM / SuperUser / Manager. Also shown: Business Users, Analyst; Developer.
  • 19. Security & Authorization: around tasks / records viewing; around task distribution; what it means and how we can do it in the DQ tool.
  • 20. Data & Security Challenge. A client implementation can include: 100+ roles for Material Master; 400+ roles for Vendor; 800+ roles for Customers. • Need to support a complex data hierarchy within each subject area • A vast number of DMMs exist • Need to restrict each DMM's data visibility to their own data only • Support for a large number of data roles • Superusers need more flexible data access • Support for multiple subject area / data domain access • Support the capability to export data to Excel with restricted data • GUI capability to navigate through error data and read the associated errors seamlessly • Data is highly sensitive and requires restrictions at various levels • Employees have a complex organizational setup and manage multiple data sets
  • 21. Task Creation: Data Restriction. Level 1 restriction on data can be achieved by distributing data based on data values. This is done in the Workflow tab of DQ Developer using a special component called «Human Task». The screenshot on the right shows: a) a Human Task that manages data distribution governance, for example distributing records to users by Material Type code (records can also be distributed by values from a reference table); b) the Task Performer, which can even be a group in LDAP and is restricted to specific data only.
  • 22. Task Creation: DMM Restriction. Level 2: configuring the Human Task to restrict data to a limited set of DMMs. We need to further configure the Human Task step to define the full process around tasks and review: with an Exception Step (the DMM corrects data), restrict who may act as DMM; with a Review Step (the BPM / Superuser manager approves changes), restrict the list of reviewers.
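The two-step restriction (who may correct, who may review) can be modelled as two allow-lists attached to a task. The sketch below is a hypothetical stand-in to make the idea concrete; `HumanTaskConfig`, `can_act`, and the step names are illustrative assumptions and do not reflect the tool's actual configuration objects.

```python
from dataclasses import dataclass, field


@dataclass
class HumanTaskConfig:
    """Hypothetical model of a Human Task with an exception and a review step."""
    exception_performers: set = field(default_factory=set)  # may correct data (DMM)
    reviewers: set = field(default_factory=set)              # may approve changes


def can_act(config, user, step):
    """Return True if `user` may work on the given step ('exception' or 'review')."""
    if step == "exception":
        return user in config.exception_performers
    if step == "review":
        return user in config.reviewers
    raise ValueError(f"unknown step: {step!r}")


if __name__ == "__main__":
    cfg = HumanTaskConfig(exception_performers={"dmm_a"}, reviewers={"superuser_x"})
    print(can_act(cfg, "dmm_a", "exception"))   # the DMM may correct data
    print(can_act(cfg, "dmm_a", "review"))      # but is not a reviewer
```

In practice the members of each list could be LDAP groups rather than individual users, as the previous slide notes for the Task Performer.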
  • 23. DQ-Security Model Schematic: Groups and roles are supported to organize security around data visibility and the actions that can be executed on data. This is supported seamlessly across Analyst and Developer.
  • 24. Approach to Configuring Roles, Groups and Task Access. a) Users: create normal users in the Analyst tool or in LDAP. b) Groups: create groups based on tool access and functionality (required for tool access), and also create groups based on Subject Area roles as defined within corporate functions (needed for data security). Roles can also be created in LDAP. c) Human Task creation: base it on the data values of Subject Area roles and assign it to the tool data roles derived from the tool configuration, which map 1 to 1. Manage all allocations in a table for dynamic usage.
  • 25. Reporting and GxP Compliance: Management wants full visibility of periodic progress. The ability to drill down into various aspects in real time matters to upper management. Access to reports over a Web UI is critical. The DQ project and tools need to provide out of the box, or support, an operational model for aspects like GxP compliance. The DQ project and tools need to provide out of the box, or support, a reporting model.
  • 28. Reporting Model. Architecture components shown: Source systems; Extractor (no transform); DQ HUB with Landing tables, Staging tables, Source Replica tables, DQ Source tables, and Errors & Data tables (layers: Landing, Staging, Base Object); Aggregator & Joiners (incl. filtering); DQ / ETL Tool: Define Validation Rules (Mapplets) and Define & Execute Validation Rules (Mappings & Workflows); Analyst / DQ Tool: UI access to DQ errors; Microsoft Excel; Analytics Tool; Reporting Tool; Data Stewards (Role/Group), who fix errors manually. • An out-of-the-box reporting model to support full-scale corporate reporting does not come with the product, so we need to sketch the process and data model around it. • An out-of-the-box data model for GxP, process audit, etc. does not come with the product, so we need to sketch the process and data model as part of the DQ rule-checking framework.
  • 29. Internal and Confidential Application Schema Analyst / IDQ for DQ Configuration
  • 30. Application schema tables (Table: Description)
      - DQC_BUS_ERROR_DTL: Error message enhancement based on business input against rule failures.
      - DQC_ERROR_MST: System-generated error message for each rule.
      - DQC_DATA_STEW_MST: List of data stewards along with their managers.
      - DQC_PRM_THRESHOLD: Threshold values used to route records, based on score, into good, bad and rejected records.
      - DQC_ROLE_DATA_STEW_ASSGN: Association between Role and Data Steward for Site and Source System.
      - DQC_ROLE_MST: All roles defined for the DQ solution across site and source system.
      - DQC_ROLE_RULE_ASSIGN: Relationship of rules assigned to the various roles across site and source system, along with an Active flag.
      - DQC_RULE_BOOK: Definition of all rules configured, along with Activation flag and date.
      - DQC_RULE_OBJECT_ASSIGN: Role assigned to a data object or a slice of a data object.
      - DQC_RULE_OBJVALUE_ON_OFF: Rule switch on / off at data object / data object value level.
      - DQC_SAP_FLD_NAMES: Technical field names along with field descriptions for reporting.
      - DQC_SITE_MST: Sites configured for the DQ system.
      - DQC_SRC_SYSTEM: Source systems configured for the DQ system.
      - DQC_SUB_SUBJ_AREA: Category within a subject area; for example, within MM we have Plant, S-Org, BOM, Storage Loc, etc.
      - DQC_SUBJ_AREA_MST: Subject area master, such as MM, Vendor, Customer.
      - DQC_ROLE_OBJVAL_ASSIGN_MM_PLNT (*): Assignment of a data slice to a Role (the slice could be plant, site, material type, etc.).
      - DQO_MM_GENL (*): Operational table containing MM General data.
      - DQO_MM_GENL_PLNT (*): Operational table containing MM Plant data, with score and check type details added.
      - DQO_MM_GENL_PLNT_BAD (*): System generated. Operational table containing MM Plant data for records categorised as bad records.
      - DQO_MM_GENL_PLNT_BAD_ISSUE (*): System generated. Operational table containing MM Plant records with issue / error details for records categorised as bad records.
      - DQO_MM_GENL_PLNT_SRC (*): Operational table containing MM Plant data in raw format.
      - DQO_MM_GENL_SRC (*): Operational table containing MM General data in raw format.
      - DQO_MM_GENL_WHLST (*): Record whitelist for general data. No DQ checks are performed for these records.
      - DQO_RUN_DTL: DQ results for each run at field level, for all rules attached to the field.
      - DQR_DIM_DATE: Reporting Time Dimension (Draft).
      - DQR_DIM_ERROR_MST: Reporting Error Dimension (Draft).
      - DQR_DIM_ORG: Reporting Organisation Dimension: Role / Data Steward / Site / Source System (Draft).
      - DQR_DIM_RULE: Reporting Rule Dimension (Draft).
      - DQR_FCT_DATA_QUALITY: Reporting Data Quality Fact table (Draft).
      (*) These tables are object specific and will be duplicated.
  • 31. Rules; Roles; Exception; Data owners; Data Segmentation
  • 32. Source System 1; Source System 2; Reference Tables (Material Type, Material Category, List of Plants, Material Group); Data object Definition; Configuration Tables; Check Results; Distribute Results / Tasks
  • 33. Data Quality Operational / DQO • These are operational tables: the tables on which data quality checks are performed and the tables that store the outcomes. • Examples of Material source data are the plant data table, PIR data, or general data. • Each source data object table is complemented by system-generated tables (bad table, issue table) and application tables (Whitelist table, Source table). The ER diagram on the right shows the MM Plant data object tables, which use MM_GENL_ in their names. • (*) These tables are object specific and will be duplicated. • DQO_RUN_DTL stores all DQ check results at data attribute level. • On purpose, no relationships are built at database level for these tables.
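Because the operational tables are duplicated per data object, their names can be derived from the object name. The sketch below assumes the `DQO_` prefix and the `_BAD`, `_BAD_ISSUE`, `_SRC` and `_WHLST` suffixes seen in the MM table names on the schema slide; the helper `companion_tables` is hypothetical, and whether every data object really gets all four companion tables is an assumption.

```python
def companion_tables(data_object):
    """Derive the operational table names for one data object.

    `data_object` is the object part of the name, e.g. "MM_GENL_PLNT".
    """
    base = f"DQO_{data_object.upper()}"
    return {
        "operational": base,           # data plus score and check details
        "bad": f"{base}_BAD",          # system generated: bad records
        "issue": f"{base}_BAD_ISSUE",  # system generated: issue / error details
        "source": f"{base}_SRC",       # raw source data
        "whitelist": f"{base}_WHLST",  # records exempt from DQ checks
    }


if __name__ == "__main__":
    for role, name in companion_tables("mm_genl_plnt").items():
        print(role, name)
```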
  • 34. DQ – Reporting Dimensional Data Model (Extension). Dimension tables: D_L_RULE_BOOK (RULE_NUMBER, RULE_NAME, RULE_DESCRIPTION, RULE_PURPOSE, ....), D_L_ROLE_ASSIGNMENT, D_L_SUBJECT_AREA, D_DQ_STEWARD, D_L_RULE_CATEGORY, D_L_SOURCE_SYS, D_L_SEVERITY, D_L_DATE. Fact tables: F_H_DQ_ANALYSIS_AGGR, F_L_DQ_RUN, F_L_RULE_RUN, F_L_DQ_ERROR.
  • 35. Reporting Data Model Description (Entities / Tables: Description)
      - D_L_ROLE_ASSIGNMENT, D_L_RULE_BOOK, D_L_RULE_CATEGORY, D_L_SEVERITY, D_L_SOURCE_SYS, D_L_SUBJECT_AREA: Dimension tables that contain all DQ rules; rule severity; data hierarchy; data steward role; assignment of data steward access to the various data hierarchies; the various subject areas.
      - F_H_DQ_ANALYSIS_AGGR, F_L_DQ_ERROR, F_L_DQ_RUN, F_L_RULE_RUN: Fact tables that capture all statistics for reporting, at record level and at aggregated level: generic error data; generic analysis data; all run details; details of all rules applied during each run.
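A report over this dimensional model typically joins a fact table to a dimension and aggregates. The sketch below uses Python's built-in `sqlite3` to show the shape of such a query. Only the table names and the `RULE_NUMBER` / `RULE_NAME` columns come from the slides; the fact columns `RUN_ID`, `RECORDS_CHECKED` and `RECORDS_PASSED`, and the sample rows, are assumptions made for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a tiny, made-up slice of the model."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE D_L_RULE_BOOK (RULE_NUMBER INTEGER PRIMARY KEY, RULE_NAME TEXT);
        -- Hypothetical fact columns; the slides do not list them.
        CREATE TABLE F_L_RULE_RUN (
            RUN_ID INTEGER, RULE_NUMBER INTEGER,
            RECORDS_CHECKED INTEGER, RECORDS_PASSED INTEGER
        );
        INSERT INTO D_L_RULE_BOOK VALUES (1, 'Plant is mandatory'), (2, 'Valid material type');
        INSERT INTO F_L_RULE_RUN VALUES (1, 1, 100, 90), (2, 1, 100, 70), (1, 2, 50, 50);
        """
    )
    return conn


def rule_pass_rates(conn):
    """Return {rule name: pass rate across all runs} via a fact-to-dimension join."""
    rows = conn.execute(
        """
        SELECT d.RULE_NAME,
               1.0 * SUM(f.RECORDS_PASSED) / SUM(f.RECORDS_CHECKED)
        FROM F_L_RULE_RUN AS f
        JOIN D_L_RULE_BOOK AS d ON d.RULE_NUMBER = f.RULE_NUMBER
        GROUP BY d.RULE_NAME
        ORDER BY d.RULE_NAME
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(rule_pass_rates(build_demo_db()))
```

Adding a new reporting dimension then amounts to adding a key column to the fact table and one more join.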
  • 36. Benefits of Framework (Feature: Business Benefit)
      - Proven framework: Quick ROI and a low learning curve to implement a DQ project.
      - Platform- and technology-independent DQ framework: Eliminates the need for any new HW and SW purchases.
      - Any kind of DQ / ETL tool, UI interface, or analysis tool can be used: Zero investment in new products. Reuse of in-house technology and applications.
      - Can be used for any kind of data, be it Material Master, Vendor, Customer, or any other non-standard data like Vehicle or Crane: 100% flexibility to extend to any new data area.
      - Quick and easy to use and build: Short implementation time. Normal low-cost developers can be used to build it.
      - New reporting dimensions can be added easily: Flexibility to extend the framework to accommodate project-based dimensions without dependency on the product supplier or expert consultants.
      - Flexible data security model that supports various complex data hierarchies and vast combinations of data stewards: Helps eliminate data security and data sensitivity discussions and prevents unnecessary delay to DQ project planning.