Statistical Databases
By Samir Rana
A database used for
statistical analysis.
● Researchers can query aggregate statistics, but not the individual records
behind them.
● Access control and a restricted query interface keep sensitive data away from
prying eyes.
● Basic operations are limited to aggregates: count, sum, mean, standard deviation, etc.
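The aggregate-only access described above can be sketched roughly as follows. This is a minimal illustration, not a real SDBMS interface: the class name `StatisticalView`, its method names, and the sample salary rows are all assumptions made up for the example.

```python
import statistics


class StatisticalView:
    """Hypothetical wrapper: exposes aggregates over one numeric column,
    but offers no method that returns the underlying rows."""

    def __init__(self, records, column):
        # Keep a private copy of the column values only.
        self._values = [row[column] for row in records]

    def count(self):
        return len(self._values)

    def total(self):
        return sum(self._values)

    def mean(self):
        return statistics.mean(self._values)

    def std_dev(self):
        # Population standard deviation of the column.
        return statistics.pstdev(self._values)


# Illustrative data (invented for this sketch).
salaries = [
    {"name": "A", "salary": 40},
    {"name": "B", "salary": 50},
    {"name": "C", "salary": 60},
]
view = StatisticalView(salaries, "salary")
print(view.count(), view.total(), view.mean(), view.std_dev())
```

Hiding the rows behind such an interface is not sufficient on its own, since narrow queries can still reveal individual values.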
Requirements
Statistical Databases
● Data Model
● Query Language
● Integrity Constraints
● Recovery
● Physical DB organization
● Data analysis requirements
Data Model
1. Because statistical databases (SD) are multidimensional,
the relational model is not a good fit.
2. New data structures and operations
are needed, such as the data cube
operator and aggregate data
structures.
Several data models have been proposed:
SUBJECT,
Semantic Association Models,
GRASS,
Conceptual Statistical Model,
Statistical Object Representation Model
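To make the data cube operator mentioned above more concrete, here is a small sketch that sums a measure for every combination of grouped and rolled-up dimensions. The function name `data_cube`, the `ALL` placeholder, and the sample rows are assumptions for illustration; real systems compute the cube far more efficiently than this brute-force loop.

```python
from collections import defaultdict
from itertools import combinations

ALL = "ALL"  # placeholder value for a dimension that has been rolled up


def data_cube(rows, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (brute force)."""
    cube = defaultdict(int)
    for size in range(len(dimensions) + 1):
        for kept in combinations(dimensions, size):
            for row in rows:
                # Dimensions not in `kept` are collapsed to ALL.
                key = tuple(row[d] if d in kept else ALL for d in dimensions)
                cube[key] += row[measure]
    return dict(cube)


# Illustrative data (invented for this sketch).
rows = [
    {"region": "north", "year": 2020, "sales": 10},
    {"region": "north", "year": 2021, "sales": 20},
    {"region": "south", "year": 2020, "sales": 5},
]
cube = data_cube(rows, ["region", "year"], "sales")
print(cube[("north", ALL)], cube[(ALL, 2020)], cube[(ALL, ALL)])
```

Each key in `cube` is one cell: a fully specified cell such as `("south", 2020)`, a partial roll-up such as `("north", ALL)`, or the grand total `(ALL, ALL)`.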
Query Language
● Powerful, easy-to-use query
languages are needed to define and manipulate
statistical data.
● Evaluation criteria for statistical
query languages:
○ data and metadata definition,
○ data manipulation,
○ interface to statistical packages,
○ the expressive power of the
language.
Query Languages
● SDBMS built on top of CDBMS:
○ GRAFSTAT on DB2 (SQL/DS),
○ STRAND on INGRES
● Generalized interface systems that link
together available CDBMS, statistical
packages, and graphics software:
○ SIBYL, GPI, and PEPIN-SICLA
● Separately developed SDBMS:
○ RAPID, CAS SDB, ABE, SIR/SQL,
GENISYS, CANTOR.
○ SIR/DBMS, TPL, TPLDCS, BROWSE.
● SDBMS with graphical user interfaces:
○ SUBJECT, GUIDE, ABE, STBE,
SEEDS online code book.
● Formal extensions of the relational model:
○ SSDL
● Natural-language-based user interface:
○ LIDS 86.
● Query languages that calculate
aggregates from temporal data:
○ QUEL, HQUEL, TBE.
Tree-Based Statistics Access Method (TBSAM)
● Calculates a set of aggregates over all data items that satisfy a Boolean qualification.
● Based on the B+ tree, so it keeps all the benefits of the B+ tree's dynamic nature.
● The aim is the efficient retrieval of a tuple, given the value of its index attribute.
● It is a dynamic index, so it supports insertion, deletion, and modification of tuples in
the relation.
● It can support several types of statistical queries:
○ descriptive statistics,
○ order statistics,
○ statistical sampling queries.
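The idea of answering statistical queries from an ordered index can be illustrated with a much simpler stand-in. The sketch below is not TBSAM and not a B+ tree: it is a static sorted array with prefix sums, so it does not support the dynamic updates listed above. The class name `AggregateIndex` and its methods are assumptions for the example.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


class AggregateIndex:
    """Static sorted index with prefix sums (simplified stand-in, not TBSAM)."""

    def __init__(self, values):
        self._keys = sorted(values)
        # _prefix[i] holds the sum of the first i keys.
        self._prefix = [0] + list(accumulate(self._keys))

    def _bounds(self, low, high):
        # Positions delimiting keys k with low <= k <= high.
        return bisect_left(self._keys, low), bisect_right(self._keys, high)

    def count(self, low, high):
        lo, hi = self._bounds(low, high)
        return hi - lo

    def total(self, low, high):
        lo, hi = self._bounds(low, high)
        return self._prefix[hi] - self._prefix[lo]

    def mean(self, low, high):
        n = self.count(low, high)
        return self.total(low, high) / n if n else None

    def kth_smallest(self, k):
        """Order statistic: the k-th smallest key (1-based)."""
        return self._keys[k - 1]


index = AggregateIndex([7, 3, 9, 1, 5])
print(index.count(3, 7), index.total(3, 7), index.mean(3, 7), index.kth_smallest(3))
```

Range counts and sums cost two binary searches here; a tree-based method aims for similar lookups while also staying efficient under inserts and deletes.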
Processing and Optimization
● A large portion of statistical data is either
spatial or temporal.
● Plain relational tables cannot store or
retrieve such data efficiently.
● Optimization algorithms reorder the operations to be
performed, build an optimal or
suboptimal query processing tree, and then,
depending on the physical data storage
structures, choose the best possible
strategy to query the data.
Operations on Temporal data
● Temporal theta join: the conjunction of two
sets of predicates, the time join predicate
and the non-time join predicate.
● TE-join: two tuples (or rows) in two join
relations (tables) are joined if their time
intervals intersect.
● T-join: causes the concatenation of tuples
from the operand relations only if their time
intervals intersect.
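The interval-intersection condition shared by these joins can be sketched as a nested-loop join. This is an illustration only: the half-open interval convention `[start, end)`, the field names, and the choice to emit the overlapping period are assumptions, and the sketch does not distinguish the exact output schemas of the T-join and TE-join.

```python
def intervals_intersect(a_start, a_end, b_start, b_end):
    # Assumes half-open intervals [start, end): touching endpoints do not overlap.
    return a_start < b_end and b_start < a_end


def time_intersection_join(left, right):
    """Concatenate tuples from both relations when their time intervals intersect."""
    joined = []
    for l in left:
        for r in right:
            if intervals_intersect(l["start"], l["end"], r["start"], r["end"]):
                joined.append({
                    "left": l["value"],
                    "right": r["value"],
                    # Period during which both tuples are valid.
                    "start": max(l["start"], r["start"]),
                    "end": min(l["end"], r["end"]),
                })
    return joined


# Illustrative relations (invented for this sketch).
departments = [
    {"value": "dept A", "start": 1, "end": 5},
    {"value": "dept B", "start": 5, "end": 9},
]
salaries = [
    {"value": "grade X", "start": 3, "end": 7},
    {"value": "grade Y", "start": 9, "end": 12},
]
result = time_intersection_join(departments, salaries)
for row in result:
    print(row)
```

A temporal theta join would add a non-time predicate to the `if` condition alongside the interval test.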
Security
● The contents of specific
records can often be inferred from statistical data.
● The conflict between providing statistics and
protecting individual records gives rise to
inference control.
● Types of inference control:
○ Query-set restriction
○ Data perturbation
○ Output perturbation
○ Conceptual approach
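As an illustration of the first technique in the list, query-set restriction, the sketch below refuses any query whose matching set is too small or too large. The size rule `k <= |query set| <= n - k` is the classic query-set-size control; the class name `RestrictedStatDB`, the threshold, and the sample records are assumptions for the example.

```python
class RestrictedStatDB:
    """Hypothetical statistical database with query-set-size control."""

    def __init__(self, records, min_size):
        self._records = list(records)
        self._min_size = min_size

    def _query_set(self, predicate):
        matched = [r for r in self._records if predicate(r)]
        n, k = len(self._records), self._min_size
        # Refuse sets that are too small, or so large that their
        # complement would be too small.
        if not (k <= len(matched) <= n - k):
            raise PermissionError("query set size outside the allowed range")
        return matched

    def count(self, predicate):
        return len(self._query_set(predicate))

    def total(self, predicate, column):
        return sum(r[column] for r in self._query_set(predicate))


# Illustrative data (invented for this sketch).
staff = [
    {"dept": "eng", "salary": 70},
    {"dept": "eng", "salary": 80},
    {"dept": "eng", "salary": 90},
    {"dept": "ops", "salary": 50},
    {"dept": "ops", "salary": 60},
    {"dept": "legal", "salary": 100},
]
db = RestrictedStatDB(staff, min_size=2)
eng_total = db.total(lambda r: r["dept"] == "eng", "salary")
print(eng_total)  # allowed: the query set has 3 of 6 records
```

A query on the single `legal` record would raise `PermissionError`, because its answer would disclose one person's salary. Size restriction alone can still be circumvented by combining several permitted queries, which is one reason the other control types exist.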
Criteria for evaluating the effectiveness of inference control:
● Security
● Robustness
● Bias
● Precision
● Consistency
● Cost
Other Applications
● Data Visualization:
○ A point in multidimensional space.
○ Can be used as a basis to build an interactive data visualization system.
○ User can browse in the multidimensional space.
● Statistical expert systems:
○ A program that can act in the role of an expert statistical consultant.
○ Gives expert advice on how to design a study, what data to collect to answer the research
questions, and how to analyze the data collected.
Conclusion
● Applications that collect vast amounts of data, and require interactive real-time
analysis capabilities on it, are on the rise.
● The standard approach to statistical analysis, loading part of the data from a file
or database into a statistical package and then analyzing it there, will
not work for efficiency reasons.
● The overall goal of research in statistical database management has been to
make this analysis an integral part of the data management system itself.
● The focus of the research community has been on developing techniques to
make this happen.
Thank you
