This presentation covers several topics from the RDBMS and DBMS subjects, including distributed database design, the architecture of distributed database processing systems, data communication concepts, concurrency control, and recovery. All topics are briefly described according to the syllabus of the BCA II and BCA III year subjects.
DISTRIBUTED DATABASE WITH RECOVERY TECHNIQUES
AAKANKSHA JAIN
A distributed database design consists of multiple logically related database systems that are physically distributed over several sites connected by a computer network, usually under the control of a centralized site.
Distributed database design refers to the following problem:
Given a database and its workload, how should the database be split and allocated to sites so as to optimize a certain objective function?
There are two issues:
(i) Data fragmentation which determines how the data should be fragmented.
(ii) Data allocation which determines how the fragments should be allocated.
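As a rough illustration of the allocation issue, the sketch below places each fragment at the site that accesses it most often and then evaluates a deliberately simple objective function: the number of remote accesses. The fragment names, site names, and access counts are hypothetical, and a real design would also weigh storage, update, and communication costs.

```python
# Hypothetical workload: access_counts[fragment][site] = accesses per day.
access_counts = {
    "F1": {"site_A": 120, "site_B": 30, "site_C": 10},
    "F2": {"site_A": 5, "site_B": 80, "site_C": 40},
    "F3": {"site_A": 20, "site_B": 20, "site_C": 90},
}

def allocate(access_counts):
    """Assign each fragment to the single site that accesses it most often."""
    return {frag: max(sites, key=sites.get) for frag, sites in access_counts.items()}

def remote_accesses(access_counts, allocation):
    """Objective function: accesses issued from a site that does not host the fragment."""
    return sum(
        count
        for frag, sites in access_counts.items()
        for site, count in sites.items()
        if site != allocation[frag]
    )

allocation = allocate(access_counts)
cost = remote_accesses(access_counts, allocation)
print(allocation, cost)
```

This greedy rule only covers the non-replicated case; once fragments can be replicated, read and update costs pull in opposite directions and the choice is no longer this simple.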
A distributed database is a collection of logically interrelated databases distributed over a computer network. A distributed database management system (DDBMS) manages the distributed database and makes the distribution transparent to users. There are two main types of DDBMS - homogeneous and heterogeneous. Key characteristics of distributed databases include replication of fragments, shared logically related data across sites, and each site being controlled by a DBMS. Challenges include complex management, security, and increased storage requirements due to data replication.
This document discusses distributed databases and distributed database management systems (DDBMS). It defines a distributed database as a logically interrelated collection of shared data physically distributed over a computer network. A DDBMS is software that manages the distributed database and makes the distribution transparent to users. The document outlines key concepts of distributed databases including data fragmentation, allocation, and replication across multiple database sites connected by a network. It also discusses reference architectures, components, design considerations, and types of transparency provided by DDBMS.
The document summarizes key concepts in distributed database systems including:
1) Distributed database architectures have external, conceptual, and internal views of data. Common architectures include client-server and peer-to-peer.
2) Distributed databases can be designed top-down using a global schema or bottom-up without a global schema.
3) Fragmentation and allocation distribute data across sites for performance and availability. Correct fragmentation follows completeness, reconstruction, and disjointness rules.
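The three correctness rules named in point 3 can be checked mechanically for a horizontal fragmentation. The sketch below is a minimal illustration that treats a relation as a set of tuples; the employee data and the city-based fragments are hypothetical.

```python
# Hypothetical relation: each tuple is (emp_id, name, city).
employees = {
    (1, "Asha", "Delhi"),
    (2, "Ravi", "Mumbai"),
    (3, "Meena", "Delhi"),
}

# Horizontal fragments defined by a selection predicate on city.
frag_delhi = {t for t in employees if t[2] == "Delhi"}
frag_other = {t for t in employees if t[2] != "Delhi"}
fragments = [frag_delhi, frag_other]

def is_complete(relation, fragments):
    """Completeness: every tuple of the relation appears in some fragment."""
    return all(any(t in f for f in fragments) for t in relation)

def is_reconstructible(relation, fragments):
    """Reconstruction: the union of the fragments gives back the relation."""
    return set().union(*fragments) == relation

def is_disjoint(fragments):
    """Disjointness: no tuple appears in more than one fragment."""
    return sum(len(f) for f in fragments) == len(set().union(*fragments))

print(is_complete(employees, fragments),
      is_reconstructible(employees, fragments),
      is_disjoint(fragments))
```

For vertical fragments the same rules apply to attributes instead of tuples, with the exception that the key is repeated in every fragment so the relation can be rebuilt by a join.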
This document discusses distributed database management systems (DDBMS). It outlines the evolution of DDBMS from centralized systems to today's distributed systems over the internet. It describes the advantages and disadvantages of DDBMS, components of DDBMS including transaction processors and data processors, and levels of data and process distribution including single-site, multiple-site, and fully distributed systems. It also discusses concepts like distribution transparency, transaction transparency, and distributed concurrency control in DDBMS.
This document provides an introduction to distributed databases. It defines a distributed database as a collection of logically related databases distributed over a computer network. It describes distributed computing and how distributed databases partition data across multiple computers. The document outlines different types of distributed database systems including homogeneous and heterogeneous. It also discusses distributed data storage techniques like replication, fragmentation, and allocation. Finally, it lists several advantages and objectives of distributed databases as well as some disadvantages.
This document discusses distributed databases and client-server architectures. It begins by outlining distributed database concepts like fragmentation, replication and allocation of data across multiple sites. It then describes different types of distributed database systems including homogeneous, heterogeneous, federated and multidatabase systems. Query processing techniques like query decomposition and optimization strategies for distributed queries are also covered. Finally, the document discusses client-server architecture and its various components for managing distributed databases.
This document discusses distributed data processing (DDP) as an alternative to centralized data processing. Some key points:
1) DDP involves dispersing computers and processing throughout an organization to allow for greater flexibility and redundancy compared to centralized systems.
2) Factors driving the increase of DDP include dramatically reduced workstation costs, improved desktop interfaces and power, and the ability to share data across servers.
3) While DDP provides benefits like increased responsiveness, availability, and user involvement, it also presents drawbacks such as more points of failure, incompatibility issues, and complex management compared to centralized systems.
The document discusses different distribution design alternatives for tables in a distributed database management system (DDBMS), including non-replicated and non-fragmented, fully replicated, partially replicated, fragmented, and mixed. It describes each alternative and discusses when each would be most suitable. The document also covers data replication, advantages and disadvantages of replication, and different replication techniques. Finally, it discusses fragmentation, the different types of fragmentation, and advantages and disadvantages of fragmentation.
DDBMS, characteristics, Centralized vs. Distributed Database, Homogeneous DDBMS, Heterogeneous DDBMS, Advantages, Disadvantages, What is parallel database, Data fragmentation, Replication, Distribution Transaction
Distributed Database Architecture
Database Links
Distributed Database Administration
Transaction Processing in a Distributed System
Distributed Database Application Development
Character Set Support for Distributed Environments
Query Processing: Query Processing Problem; Layers of Query Processing; Query Processing in Centralized Systems: Parsing & Translation, Optimization, Code Generation, Example; Query Processing in Distributed Systems: Mapping global query to local, Optimization,
A distributed database system is a database in which portions of the database are stored on multiple computers within a network. This provides advantages such as reliability if one site crashes, and speed, since information is distributed rather than centralized. However, proper hardware and software are needed to connect the distributed sites, and connection errors may affect users.
Distributed shared memory (DSM) is a memory architecture where physically separate memories can be addressed as a single logical address space. In a DSM system, data moves between nodes' main and secondary memories when a process accesses shared data. Each node has a memory mapping manager that maps the shared virtual memory to local physical memory. DSM provides advantages like shielding programmers from message passing, lower cost than multiprocessors, and large virtual address spaces, but disadvantages include potential performance penalties from remote data access and lack of programmer control over messaging.
Distribution transparency and Distributed transaction
shraddha mane
Covers distribution transparency, distributed transactions and their types, deadlock detection, and threads and processes and the differences between them.
The document discusses distributed query processing and optimization in distributed database systems. It covers topics like query decomposition, distributed query optimization techniques including cost models, statistics collection and use, and algorithms for query optimization. Specifically, it describes the process of optimizing queries distributed across multiple database fragments or sites including generating the search space of possible query execution plans, using cost functions and statistics to pick the best plan, and examples of deterministic and randomized search strategies used.
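The idea of generating a search space of plans and picking the cheapest one with a cost function can be sketched as follows. This is a minimal illustration of a deterministic (exhaustive) strategy, not the method of the summarized document: the relation names, cardinalities, and join selectivities are hypothetical statistics, and the cost function simply adds up the estimated sizes of the intermediate results as a stand-in for the data that would have to be shipped between sites.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical statistics: number of tuples per relation.
cardinality = {"EMP": 1000, "ASG": 5000, "PROJ": 100}

# Hypothetical join selectivities; pairs without a join predicate default to 1
# (a Cartesian product).
selectivity = {
    frozenset({"EMP", "ASG"}): Fraction(1, 1000),
    frozenset({"ASG", "PROJ"}): Fraction(1, 500),
}

def plan_cost(order):
    """Cost of a left-deep join order = sum of estimated intermediate result sizes."""
    joined = {order[0]}
    size = Fraction(cardinality[order[0]])
    cost = Fraction(0)
    for rel in order[1:]:
        sel = Fraction(1)
        for other in joined:
            sel *= selectivity.get(frozenset({rel, other}), Fraction(1))
        size = size * cardinality[rel] * sel
        cost += size
        joined.add(rel)
    return cost

# Deterministic search: enumerate the whole search space and keep the cheapest plan.
search_space = list(permutations(sorted(cardinality)))
best_order = min(search_space, key=plan_cost)
best_cost = plan_cost(best_order)
print(best_order, best_cost)
```

A randomized strategy would sample or locally improve plans instead of enumerating all of them, which matters once the number of relations makes the search space too large to enumerate.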
The document summarizes some of the key potential problems with distributed database management systems (DDBMS), including:
1) Distributed database design issues around how to partition and replicate the database across sites.
2) Distributed directory management challenges in maintaining consistency across global or local directories.
3) Distributed query processing difficulties in determining optimal strategies for executing queries across network locations.
4) Distributed concurrency control complications in synchronizing access to multiple copies of the database across sites while maintaining consistency.
The document discusses data warehouses and their advantages. It describes the different views of a data warehouse including the top-down view, data source view, data warehouse view, and business query view. It also discusses approaches to building a data warehouse, including top-down and bottom-up, and steps involved including planning, requirements, design, integration, and deployment. Finally, it discusses technologies used to populate and refresh data warehouses like extraction, cleaning, transformation, load, and refresh tools.
2. Distributed Systems Hardware & Software concepts
Prajakta Rane
This document discusses distributed system software and middleware. It describes three types of operating systems used in distributed systems - distributed operating systems, network operating systems, and middleware operating systems. Middleware operating systems provide a common set of services for local applications and independent services for remote applications. Common middleware models include remote procedure call, remote method invocation, CORBA, and message-oriented middleware. Middleware offers services like naming, persistence, messaging, querying, concurrency control, and security.
The document discusses techniques used by a database management system (DBMS) to process, optimize, and execute high-level queries. It describes the phases of query processing which include syntax checking, translating the SQL query into an algebraic expression, optimization to choose an efficient execution plan, and running the optimized plan. Query optimization aims to minimize resources like disk I/O and CPU time by selecting the best execution strategy. Techniques for optimization include heuristic rules, cost-based methods, and semantic query optimization using constraints.
This document discusses distributed database design and fragmentation techniques. It begins with an outline of topics covered, then describes the design problem of placing data and applications across computer network sites. Primary horizontal fragmentation is explained as fragmenting a relation based on minterm predicates derived from a complete and minimal set of simple predicates describing the relation and application access patterns. An algorithm is provided to determine this fragmentation through several steps including finding the simple predicates, deriving minterm predicates, and eliminating contradictions to form the fragments. An example demonstrates applying this process to fragment relations based on salary and project budget attributes.
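The steps described above (simple predicates, minterm predicates, elimination of contradictions, fragments) can be sketched in a few lines. This is an illustrative sketch only: the salary predicates and the PAY tuples are hypothetical, and the contradiction check is approximated by testing each minterm against a sample domain of salary values rather than by reasoning about the predicates symbolically.

```python
from itertools import product

# Hypothetical simple predicates on a salary attribute.
simple_predicates = [
    ("sal <= 30000", lambda sal: sal <= 30000),
    ("sal > 30000", lambda sal: sal > 30000),
]

# Hypothetical PAY relation: (title, sal).
pay = [
    ("Engineer", 40000),
    ("Analyst", 34000),
    ("Technician", 27000),
    ("Programmer", 24000),
]

def minterms(predicates):
    """Yield every conjunction of the simple predicates in plain or negated form."""
    for signs in product([True, False], repeat=len(predicates)):
        label = " AND ".join(
            name if keep else f"NOT({name})"
            for (name, _), keep in zip(predicates, signs)
        )

        def test(value, signs=signs):
            return all(fn(value) == keep for (_, fn), keep in zip(predicates, signs))

        yield label, test

# Assumption: a minterm counts as contradictory if no value in this sample
# domain satisfies it.
domain = range(0, 100001, 1000)
satisfiable = [
    (label, test)
    for label, test in minterms(simple_predicates)
    if any(test(v) for v in domain)
]

# Each remaining minterm defines one horizontal fragment.
fragments = {label: [t for t in pay if test(t[1])] for label, test in satisfiable}
for label, rows in fragments.items():
    print(label, rows)
```

Of the four minterms generated from the two predicates, two are contradictory (a salary cannot be both at most 30000 and above 30000, nor neither), so two fragments remain.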
Transaction concept, ACID property, Objectives of transaction management, Types of transactions, Objectives of Distributed Concurrency Control, Concurrency Control anomalies, Methods of concurrency control, Serializability and recoverability, Distributed Serializability, Enhanced lock based and timestamp based protocols, Multiple granularity, Multi version schemes, Optimistic Concurrency Control techniques
This document discusses database fragmentation in distributed database management systems (DDBMS). Database fragmentation allows a single database object to be broken into multiple segments that can be stored across different sites on a network. This can improve efficiency, security, parallelism, availability, reliability and performance. There are three main types of fragmentation: horizontal, vertical, and mixed. Horizontal fragmentation splits a table by rows, for example by the value of an attribute such as location; vertical fragmentation splits it by columns (groups of attributes), for example those used by different departments; and mixed fragmentation uses both. While fragmentation provides advantages, it also increases complexity and cost, and makes security and integrity control more difficult.
This document discusses different distributed computing system (DCS) models:
1. The minicomputer model consists of a few minicomputers with remote access allowing resource sharing.
2. The workstation model consists of independent workstations scattered throughout a building where users log onto their home workstation.
3. The workstation-server model includes minicomputers, diskless and diskful workstations, and centralized services like databases and printing.
It provides an overview of the key characteristics and advantages of different DCS models.
Distributed databases allow data to be shared across a computer network while being stored on multiple machines. A distributed database management system (DDBMS) allows for the management of distributed databases and makes the distribution transparent to users. Key concepts in distributed DBMS design include fragmentation, allocation, and replication of data across multiple sites. Transparency, performance, and handling failures and concurrency are important considerations for DDBMS.
Distributed Database Introduction
TYPES OF DD:
1. HOMOGENEOUS DISTRIBUTED DATABASE
2. HETEROGENEOUS DISTRIBUTED DATABASE
Distributed DBMS Architectures
Architectural Models
Some of the common architectural models are −
● Client-Server Architecture for DDBMS
● Peer-to-Peer Architecture for DDBMS
● Multi-DBMS Architecture
Design issues of distributed system –
1. Complex nature:
A distributed database is a network of many computers at different locations that together provide a high level of performance, availability, and reliability. As a result, a Distributed DBMS is considerably more complex than a centralized DBMS, and it requires complex software. It must also manage the replication of data across sites, which adds even more complexity.
2. Overall Cost:
Various costs, such as maintenance, procurement, hardware, network/communication, and labor costs, add up to the overall cost and make a Distributed DBMS costlier than a centralized DBMS.
3. Security issues:
In a Distributed Database, along with controlling data redundancy, the security of the data as well as of the network is a prime concern. A network can be attacked for data theft and misuse.
4. Integrity Control:
In a large Distributed Database system, maintaining data consistency is important. All changes made to data at one site must be reflected at every site that holds a copy of that data. The communication and processing cost of enforcing data integrity is therefore high in a Distributed DBMS.
5. Lacking Standards:
Although a Distributed DBMS provides effective communication and data sharing, there are still no standard rules and protocols for converting a centralized DBMS into a large Distributed DBMS. This lack of standards limits the potential of Distributed DBMS.
6. Lack of Professional Support:
Due to a lack of adequate communication standards, it is not possible to link different equipment produced by different vendors into a smoothly functioning network. Thus, several good resources may not be available to the users of the network.
7. Data design complex:
Fragmentation
A distributed database is a collection of logically related databases distributed across a computer network. It is managed by a distributed database management system (D-DBMS) that makes the distribution transparent to users. There are two main types - homogeneous, where all sites have identical software and cooperate, and heterogeneous, where sites may differ. Key design issues are data fragmentation, allocation, and replication. Data can be fragmented horizontally by row or vertically by column and allocated centrally, in partitions, or with full or selective replication for availability and performance.
• One of the most important decisions a distributed database designer has to make is data placement. Proper data placement is a crucial factor in determining the success of a distributed database system.
• There are four basic alternatives: namely,
– centralized,
– replicated,
– partitioned, and
– hybrid.
A distributed database is a set of interconnected databases spread over a computer network. It manages these distributed databases and makes them transparent to users. Data is logically interrelated but physically stored across multiple sites connected by a network. Distributed databases provide advantages like fast processing, reliability, lower costs and easier expansion but are more complex to manage with security and concurrency issues.
This document discusses distributed databases and their design. It defines a distributed database as a collection of logically related data distributed over a computer network and managed by a distributed database management system (D-DBMS). The document outlines distributed database types including homogeneous and heterogeneous, and covers key aspects of distributed database design such as data fragmentation, allocation, and replication.
Horizontal fragmentation divides a table into fragments based on the values of one or more fields, such that each fragment contains all columns but only tuples matching the fragment criteria. Vertical fragmentation divides a table into fragments where each fragment contains all tuples but only some columns. Hybrid fragmentation combines horizontal and vertical techniques, generating fragments with minimal extraneous information but at the cost of more expensive reconstruction.
The document discusses distributed databases and client-server architectures. It covers key concepts like distributed database systems, parallel vs distributed technologies, advantages of distributed databases, and functions like data fragmentation, replication, and allocation. It also discusses types of distributed database systems, query processing, and concurrency control and recovery in distributed databases. Finally, it provides an overview of the 3-tier client-server architecture and distributed databases in Oracle.
Distributed database consists of multiple databases that are connected with each other and are spread across different physical locations. The data that is stored on various physical locations can thus be managed independently of other physical locations. The communication between databases at different physical locations is thus done by a computer network.
A distributed database is a database that is not limited to one computer system.
It is like a database that consists of two or more files located in different computers or sites either on the same network or on an entirely different network.
Instead of storing all of the data in one database, data is divided and stored at different locations or sites which do not share any physical component.
The data can be easily accessed, managed, modified, updated, controlled, and organized in a database.
This document discusses distributed database systems. It defines centralized, distributed, and decentralized database systems. The key topics covered include distributed database management systems (DDBMS), advantages and disadvantages of DDBMS, distributed database design involving data fragmentation, replication and allocation, functions of a DDBMS, types of DDBMS including homogeneous and heterogeneous, and database transparency and gateways. The document is presented by a group with members Zupash, Sana, Marhaba and a group leader Hira Anwar.
This document discusses distributed databases and distributed database management systems (DDBMS). It defines a distributed database as a collection of interconnected databases spread across multiple physical locations. A DDBMS manages these distributed databases, making the distribution transparent to users and ensuring data integrity and consistency across sites. The document outlines different features, types, storage methods, advantages, and disadvantages of distributed databases and DDBMS.
Adbms 27 parallel database distribution architectureVaibhav Khanna
Parallel database architectures allow multiple processors to control multiple disk units containing partitions of a database. There are several types of architectures including shared memory, shared disk, and shared nothing. In shared memory, all processors access the same memory and disks. In shared disk, each processor has exclusive memory access but shared disks. In shared nothing, each processor has exclusive memory and disk control but can communicate. Careful data partitioning across disks is important to allow parallel query processing.
This document provides an overview of key concepts in parallel and distributed database systems. It discusses motivations for parallelism and distribution such as improved performance and reliability. It describes types of parallelism and benefits and drawbacks of distributed database management systems (DDBMS). It outlines Date's 12 rules for distributed database design, including concepts like local autonomy, transparency, and independence. It also discusses query processing, distributed transactions, hardware/DBMS independence, and the history of parallel and distributed databases.
Distributed databases allow data to be stored across multiple computers or sites connected through a network. The data is logically interrelated but physically distributed. A distributed database management system (DDBMS) makes the distribution transparent to users and allows sites to operate autonomously while participating in global applications. Key aspects of DDBMS include distributed transactions, concurrency control, data fragmentation and replication, distributed query processing, and ensuring transparency of the distribution.
The document discusses database system concepts including architecture, schema, and instances. It describes 1-tier, 2-tier, and 3-tier architectures. A 1-tier architecture has the client, server, and database on one machine. A 2-tier architecture separates the presentation and data layers across two machines. A 3-tier architecture introduces an application layer between the presentation and data layers. The schema defines the database structure while instances represent the stored data at a point in time.
A Review On Fragmentation Techniques In Distributed DatabaseJose Katab
This document discusses fragmentation techniques in distributed database systems. It begins with an introduction to data fragmentation and how it allows databases to be broken into segments that can be stored across network sites. The document then covers three main types of fragmentation - horizontal, vertical, and hybrid. Horizontal fragmentation divides rows, vertical fragmentation divides columns, and hybrid uses both. The document provides examples of each type and discusses how fragmentation aims to improve reliability, performance, storage usage, and more. It concludes that choosing an appropriate fragmentation method is important for utilizing resources in distributed database systems.
This document discusses database system architectures and distributed database systems. It covers transaction server systems, distributed database definitions, promises of distributed databases, complications introduced, and design issues. It also provides examples of horizontal and vertical data fragmentation and discusses parallel database architectures, components, and data partitioning techniques.
Overview, Database System vs File System, Database System Concept and
Architecture, Data Model Schema and Instances, Data Independence and Database Language and
Interfaces, Data Definitions Language, DML, Overall Database Structure. Data Modeling Using the
Entity Relationship Model: ER Model Concepts, Notation for ER Diagram, Mapping Constraints,
Keys, Concepts of Super Key, Candidate Key, Primary Key, Generalization, Aggregation,
Reduction of an ER Diagrams to Tables, Extended ER Model, Relationship of Higher Degree.
The document discusses key concepts related to database management systems (DBMS), including:
1. A DBMS allows for the creation, organization, and management of structured data in a centralized database that can be easily accessed and shared.
2. The three-level architecture of a DBMS separates the database into an internal, conceptual, and external schema to abstract the physical storage from the logical design and user view.
3. Key components of a DBMS include hardware for storage and input/output, software for managing the database, and users who design, implement and query the database system.
A distributed database (DDB) is a collection of logically related databases distributed across a computer network. It allows data and processing to occur at multiple sites. Key characteristics include data fragmentation across sites, replication of fragments for availability and performance, and distributed transaction management to ensure consistency. The main types are homogeneous DDBMS, where all sites use identical software, and heterogeneous DDBMS where different sites may use different systems. Challenges include complex management, security, and maintaining consistency across sites.
Similar to Distributed Database Management System (20)
3. Distributed database design
Distributed Database Designs are nothing but multiple, logically related
Database systems, physically distributed over several sites, using a
Computer Network, which is usually under a centralized site control.
Distributed database design refers to the following problem:
Given a database and its workload, how should the database be split
and allocated to sites so as to optimize a certain objective function?
There are two issues:
(i) Data fragmentation which determines how the data should be
fragmented.
(ii) Data allocation which determines how the fragments should be
allocated.
4. Architecture of Distributed Processing system
Distributed Processing architectures are generally developed depending
on three parameters −
Distribution − It states the physical distribution of data across the
different sites.
Autonomy − It indicates the distribution of control of the database
system and the degree to which each constituent DBMS can operate
independently.
Heterogeneity − It refers to the uniformity or dissimilarity of the data
models, system components and databases.
6. Client - Server Architecture for DDBMS
This is a two-level architecture where the functionality is divided into
servers and clients. The server functions primarily encompass data
management, query processing, optimization and transaction
management. Client functions mainly include the user interface. However,
clients also have some functions like consistency checking and transaction
management.
The two different client - server architectures are −
1. Single Server Multiple Client
2. Multiple Server Multiple Client
7. Peer- to-Peer Architecture for DDBMS
In these systems, each peer acts both as a client and a server for
providing database services. The peers share their resources with other
peers and coordinate their activities.
This architecture generally has four levels of schemas −
Global Conceptual Schema − Depicts the global logical view of data.
Local Conceptual Schema − Depicts logical data organization at each
site.
Local Internal Schema − Depicts physical data organization at each
site.
External Schema − Depicts user view of data.
9. Multi - DBMS Architectures
This is an integrated database system formed by a collection of two or
more autonomous database systems.
Multi-DBMS can be expressed through six levels of schemas −
1. Multi-database View Level − Depicts multiple user views comprising
subsets of the integrated distributed database.
2. Multi-database Conceptual Level − Depicts the integrated multi-database,
which comprises global logical multi-database structure definitions.
3. Multi-database Internal Level − Depicts the data distribution across
different sites and multi-database to local data mapping.
4. Local database View Level − Depicts public view of local data.
5. Local database Conceptual Level − Depicts local data organization
at each site.
6. Local database Internal Level − Depicts physical data organization
at each site.
10. Design Alternatives
The distribution design alternatives for the tables in a DDBMS are as
follows −
• Non-replicated and non-fragmented
• Fully replicated
• Partially replicated
• Fragmented
• Mixed
12. Non-replicated & Non-fragmented
In this design alternative, different tables are placed at different sites.
Data is placed so that it is in close proximity to the site where it is used
most. It is most suitable for database systems where the percentage of
queries needed to join information in tables placed at different sites is
low. If an appropriate distribution strategy is adopted, then this design
alternative helps to reduce the communication cost during data
processing.
13. Fully Replicated
In this design alternative, one copy of all the database tables is stored
at each site. Since each site has its own copy of the entire database,
queries are very fast and require negligible communication cost. On the
other hand, the massive redundancy in data makes update operations
expensive, because every copy must be updated. Hence, this is suitable for
systems where a large number of queries must be handled whereas the
number of database updates is low.
14. Partially Replicated
Copies of tables or portions of tables are stored at different sites. The
distribution of the tables is done in accordance with the frequency of
access. This takes into consideration the fact that the frequency of
accessing the tables varies considerably from site to site. The number of
copies of the tables (or portions) depends on how frequently the
access queries execute and on the sites that generate the access queries.
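As a rough illustration of access-driven replication, the sketch below places a copy at every site whose access count reaches a threshold. The function name choose_replica_sites, the site names, the counts and the threshold are all hypothetical; a real allocator would also weigh update and storage costs.

```python
def choose_replica_sites(access_counts, threshold):
    """Return the sites that should hold a copy of a table (or fragment).

    access_counts maps a site name to how often that site queries the table.
    Both the mapping and the threshold rule are illustrative assumptions.
    """
    sites = sorted(site for site, count in access_counts.items() if count >= threshold)
    if not sites:
        # Always keep at least one copy, at the site that uses the table most.
        sites = [max(access_counts, key=access_counts.get)]
    return sites


# Hypothetical access frequencies observed per site.
access = {"Pune": 120, "Baroda": 15, "Delhi": 80}
replicas = choose_replica_sites(access, threshold=50)
```

With these made-up numbers the table is copied to the two busy sites, while the rarely querying site reads it remotely.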
15. Fragmented
In this design, a table is divided into two or more pieces referred to as
fragments or partitions, and each fragment can be stored at different
sites. This considers the fact that it seldom happens that all data stored
in a table is required at a given site. Moreover, fragmentation increases
parallelism and provides better disaster recovery. Here, there is only one
copy of each fragment in the system, i.e. no redundant data.
The three fragmentation techniques are −
• Vertical fragmentation
• Horizontal fragmentation
• Hybrid fragmentation
17. Mixed Distribution
This is a combination of fragmentation and partial replication. Here, the
tables are initially fragmented in any form (horizontal or vertical), and
then these fragments are partially replicated across the different sites
according to the frequency of accessing the fragments.
18. Fragmentation
Fragmentation is the task of dividing a table into a set of smaller tables.
The subsets of the table are called fragments. Fragmentation can be of
three types: horizontal, vertical, and hybrid (combination of horizontal
and vertical). Horizontal fragmentation can further be classified into two
techniques: primary horizontal fragmentation and derived horizontal
fragmentation.
Fragmentation should be done in such a way that the original table can be
reconstructed from the fragments whenever required. This requirement is
called “constructiveness” (also known as reconstructiveness).
19. Advantages of Fragmentation
• Since data is stored close to the site of usage, efficiency of the
database system is increased.
• Local query optimization techniques are sufficient for most queries
since data is locally available.
• Since irrelevant data is not available at the sites, security and privacy
of the database system can be maintained.
Disadvantages of Fragmentation
• When data from different fragments are required, access may be
very slow.
• In case of recursive fragmentations, the job of reconstruction will
need expensive techniques.
• Lack of back-up copies of data in different sites may render the
database ineffective in case of failure of a site.
20. Vertical Fragmentation
In vertical fragmentation, the fields or columns of a table are grouped
into fragments. In order to maintain constructiveness, each fragment
should contain the primary key field(s) of the table. Vertical fragmentation
can be used to enforce privacy of data.
For example, let us consider that a University database keeps records of
all registered students in a Student table having the following schema.
Regd_No Name Course Address Semester Fees Marks
Now, the fees details are maintained in the accounts section. In this case,
the designer will fragment the database as follows −
21. Vertical Fragmentation
CREATE TABLE STD_FEES AS
SELECT Regd_No, Fees
FROM STUDENT;
Reconstruction of a vertically fragmented table is performed by joining
the fragments on the primary key (a Full Outer Join operation on the
fragments).
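The vertical split and its reconstruction can be tried out with a minimal sketch like the one below, which runs the SQL through Python's built-in sqlite3 module. The sample rows and the second fragment STD_INFO are assumptions added for illustration; only STUDENT and STD_FEES come from the slides. Because every fragment holds every Regd_No, a plain join on the key returns the same rows a full outer join would.

```python
import sqlite3

# Hypothetical in-memory STUDENT table using the schema from the slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (Regd_No INTEGER PRIMARY KEY, Name TEXT, Course TEXT, "
    "Address TEXT, Semester INTEGER, Fees INTEGER, Marks INTEGER)"
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "Asha", "Computer Science", "Pune", 3, 40000, 81),
        (2, "Ravi", "Physics", "Baroda", 1, 35000, 74),
    ],
)

# Vertical fragments: each one keeps the primary key Regd_No.
conn.execute("CREATE TABLE STD_FEES AS SELECT Regd_No, Fees FROM STUDENT")
conn.execute(
    "CREATE TABLE STD_INFO AS "
    "SELECT Regd_No, Name, Course, Address, Semester, Marks FROM STUDENT"
)

# Reconstruction: join the fragments on the shared key.
rebuilt = conn.execute(
    "SELECT i.Regd_No, i.Name, i.Course, i.Address, i.Semester, f.Fees, i.Marks "
    "FROM STD_INFO AS i JOIN STD_FEES AS f ON i.Regd_No = f.Regd_No "
    "ORDER BY i.Regd_No"
).fetchall()
original = conn.execute("SELECT * FROM STUDENT ORDER BY Regd_No").fetchall()
```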
22. Horizontal Fragmentation
Horizontal fragmentation groups the tuples of a table in accordance with the values of
one or more fields. Horizontal fragmentation should also conform to the rule of
constructiveness. Each horizontal fragment must have all columns of the original
base table.
For example, in the student schema, if the details of all students of the Computer
Science course need to be maintained at the School of Computer Science,
then the designer will horizontally fragment the database as follows −
CREATE TABLE COMP_STD AS
SELECT * FROM STUDENT
WHERE Course = 'Computer Science';
Reconstruction of horizontal fragmentation can be performed using UNION
operation on fragments.
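A minimal sketch of the horizontal case, again using Python's built-in sqlite3 module, is shown below. The shortened STUDENT schema, the sample rows and the complementary fragment OTHER_STD are assumptions made for illustration; the slides define only COMP_STD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Shortened, hypothetical version of the STUDENT schema.
conn.execute("CREATE TABLE STUDENT (Regd_No INTEGER PRIMARY KEY, Name TEXT, Course TEXT)")
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [
        (1, "Asha", "Computer Science"),
        (2, "Ravi", "Physics"),
        (3, "Meena", "Computer Science"),
    ],
)

# Horizontal fragments: every fragment keeps all columns, but only some rows.
conn.execute(
    "CREATE TABLE COMP_STD AS SELECT * FROM STUDENT WHERE Course = 'Computer Science'"
)
conn.execute(
    "CREATE TABLE OTHER_STD AS SELECT * FROM STUDENT WHERE Course <> 'Computer Science'"
)

# Reconstruction: UNION of the fragments.
rebuilt = conn.execute(
    "SELECT * FROM COMP_STD UNION SELECT * FROM OTHER_STD ORDER BY Regd_No"
).fetchall()
original = conn.execute("SELECT * FROM STUDENT ORDER BY Regd_No").fetchall()
comp_count = conn.execute("SELECT COUNT(*) FROM COMP_STD").fetchone()[0]
```

The two predicates must cover every row exactly once for the UNION to return the original table; a row with a missing Course value would fall into neither fragment here.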
23. Hybrid Fragmentation
In hybrid fragmentation, a combination of horizontal and vertical
fragmentation techniques is used. This is the most flexible
fragmentation technique since it generates fragments with minimal
extraneous information. However, reconstruction of the original table is
often an expensive task.
Hybrid fragmentation can be done in two alternative ways −
• At first, generate a set of horizontal fragments; then generate vertical
fragments from one or more of the horizontal fragments.
• At first, generate a set of vertical fragments; then generate horizontal
fragments from one or more of the vertical fragments.
25. Distribution Transparency
Distribution transparency is the property of distributed databases by
virtue of which the internal details of the distribution are hidden from the
users. The DDBMS designer may choose to fragment tables, replicate
the fragments and store them at different sites. However, since users are
oblivious of these details, they find the distributed database easy to use
like any centralized database.
The three dimensions of distribution transparency are −
• Location transparency
• Fragmentation transparency
• Replication transparency
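One way to picture the three kinds of transparency is a catalog lookup that hides fragments, copies and sites behind a single logical table name. The sketch below is only an illustration: CATALOG, SITE_DATA, read_table and all site and fragment names are hypothetical, and a real DDBMS would do this inside its query processor.

```python
# Hypothetical catalog: logical table -> fragments -> sites holding a copy.
CATALOG = {
    "STUDENT": {
        "COMP_STD": ["site_cs", "site_main"],
        "OTHER_STD": ["site_main"],
    }
}

# Hypothetical contents of each stored fragment copy.
SITE_DATA = {
    ("site_cs", "COMP_STD"): [(1, "Asha")],
    ("site_main", "COMP_STD"): [(1, "Asha")],
    ("site_main", "OTHER_STD"): [(2, "Ravi")],
}


def read_table(name, local_site):
    """Return all rows of a logical table, whatever site the user is on."""
    rows = []
    for fragment, sites in CATALOG[name].items():
        # Replication transparency: use the local copy if there is one.
        site = local_site if local_site in sites else sites[0]
        # Location transparency: the caller never names the site.
        rows.extend(SITE_DATA[(site, fragment)])
    # Fragmentation transparency: fragments are merged into one result.
    return sorted(rows)


rows_at_cs = read_table("STUDENT", "site_cs")
rows_at_main = read_table("STUDENT", "site_main")
```

The caller asks for STUDENT in both cases and gets the same rows, without knowing how the table is split or where the copies are stored.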
26. Hybrid Fragmentation
• Hybrid fragmentation can be achieved by performing horizontal and vertical
partitioning together.
• Mixed fragmentation is a group of rows and columns in a relation.
Example: Consider the following table which consists of employee information.
Emp_ID  Emp_Name  Emp_Address  Emp_Age  Emp_Salary
101     Surendra  Baroda       25       15000
102     Jaya      Pune         37       12000
103     Jayesh    Pune         47       10000
27. Hybrid Fragmentation
Assuming the table above is named Employee:
Fragmentation1:
SELECT Emp_ID, Emp_Name FROM Employee WHERE Emp_Age < 40
Fragmentation2:
SELECT Emp_ID FROM Employee WHERE Emp_Address = 'Pune' AND Emp_Salary < 14000
Reconstruction of Hybrid Fragmentation:
The original relation in hybrid fragmentation is reconstructed by performing
UNION and FULL OUTER JOIN.
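The sketch below shows one possible hybrid split of the employee table and its reconstruction, using Python's built-in sqlite3 module. The table name EMPLOYEE and the fragment names YOUNG_INFO, YOUNG_PAY and SENIOR are assumptions; the split is horizontal first (by age), then vertical on one of the horizontal fragments. A plain join stands in for the full outer join, since both vertical fragments hold the same keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (Emp_ID INTEGER PRIMARY KEY, Emp_Name TEXT, "
    "Emp_Address TEXT, Emp_Age INTEGER, Emp_Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
    [
        (101, "Surendra", "Baroda", 25, 15000),
        (102, "Jaya", "Pune", 37, 12000),
        (103, "Jayesh", "Pune", 47, 10000),
    ],
)

# Step 1 (horizontal): employees aged 40 or more stay in one whole fragment.
conn.execute("CREATE TABLE SENIOR AS SELECT * FROM EMPLOYEE WHERE Emp_Age >= 40")
# Step 2 (vertical): the under-40 rows are split by column, keeping the key.
conn.execute(
    "CREATE TABLE YOUNG_INFO AS "
    "SELECT Emp_ID, Emp_Name, Emp_Address FROM EMPLOYEE WHERE Emp_Age < 40"
)
conn.execute(
    "CREATE TABLE YOUNG_PAY AS "
    "SELECT Emp_ID, Emp_Age, Emp_Salary FROM EMPLOYEE WHERE Emp_Age < 40"
)

# Reconstruction: join the vertical fragments, then UNION with the rest.
rebuilt = conn.execute(
    "SELECT a.Emp_ID, a.Emp_Name, a.Emp_Address, b.Emp_Age, b.Emp_Salary "
    "FROM YOUNG_INFO AS a JOIN YOUNG_PAY AS b ON a.Emp_ID = b.Emp_ID "
    "UNION SELECT * FROM SENIOR ORDER BY 1"
).fetchall()
original = conn.execute("SELECT * FROM EMPLOYEE ORDER BY Emp_ID").fetchall()
```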
28. Data communication concepts
Data communication refers to the exchange of data between a source and
a receiver via some form of transmission medium, such as a wire cable.
Data communication is said to be local if communicating devices are in the
same building or a similarly restricted geographical area.
A data communication system may collect data from remote locations
through data transmission circuits, and then output processed results to
remote locations. The different data communication techniques which are
presently in widespread use evolved gradually either to improve the data
communication techniques already existing or to replace the same with
better options and features.
30. Components of data communication system
A communication system has the following components:
1. Message: It is the information or data to be communicated. It can consist
of text, numbers, pictures, sound or video or any combination of these.
2. Sender: It is the device/computer that generates and sends the
message.
3. Receiver: It is the device or computer that receives the message. The
location of the receiver computer is generally different from that of the
sender computer. The distance between sender and receiver depends upon
the type of network used in between.
4. Medium: It is the channel or physical path through which the message is
carried from the sender to the receiver. The medium can be wired, like
twisted pair wire, coaxial cable or fiber-optic cable, or wireless, like laser,
radio waves and microwaves.
31. Concurrency Control and Recovery
Concurrency control (CC) is a process to ensure that data is updated
correctly and appropriately when multiple transactions are concurrently
executed in DBMS (Connolly & Begg, 2015).
Distributed Databases encounter a number of concurrency control and
recovery problems which are not present in centralized databases.
Some of them are listed below:
• Dealing with multiple copies of data items
• Failure of individual sites
• Communication link failure
• Distributed commit
• Distributed deadlock
33. Concurrency Control
1. Dealing with multiple copies of data items:
Concurrency control must maintain global consistency across all copies. Likewise, the
recovery mechanism must recover all copies and maintain consistency after recovery.
2. Failure of individual sites:
Database availability must not be affected by the failure of one or two sites,
and the recovery scheme must recover them before they are available for use.
3. Communication link failure:
This failure may create a network partition, which would affect database availability
even though all database sites may be running.
4. Distributed commit:
A transaction may be fragmented, and its parts may be executed by a number of sites.
This requires a two-phase or three-phase commit approach for transaction commit.
34. Concurrency Control
5. Distributed deadlock:
Since transactions are processed at multiple sites, two or more sites may get
involved in deadlock. This must be resolved in a distributed manner.
Concurrency control protocols can be broadly divided into two categories −
• Lock based protocols
• Time stamp based protocols
35. Concurrency Control Protocol
1. Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which
a transaction cannot read or write data until it acquires an appropriate lock on it.
Locks are of two kinds −
• Binary Locks − A lock on a data item can be in one of two states: locked
or unlocked.
• Shared/exclusive − This type of locking mechanism differentiates locks
based on their use. If a lock is acquired on a data item to perform a write
operation, it is an exclusive lock, because allowing more than one transaction
to write on the same data item would lead the database into an inconsistent
state. Read locks are shared because no data value is being changed.
36. Continue..
1. Binary Locks:
A lock is a mechanism that ensures that the integrity of data is maintained.
A binary lock can have two states or values: locked and unlocked (or 1 and 0, for
simplicity). A distinct lock is associated with each database item X.
If the value of the lock on X is 1, item X cannot be accessed by a database
operation that requests the item. If the value of the lock on X is 0, the item can be
accessed when requested. We refer to the current value (or state) of the lock
associated with item X as LOCK(X).
There are two operations in binary locking:
(i) Lock_item(X)
(ii) Unlock_item(X)
37. Continue..
1. Lock_item(X):
A transaction requests access to an item X by first issuing a lock_item(X)
operation. If LOCK(X) = 1, the transaction is forced to wait. If LOCK(X) = 0,
it is set to 1 (the transaction locks the item) and the transaction is allowed to
access item X.
2. Unlock_item(X):
When the transaction is through using the item, it issues an unlock_item(X)
operation, which sets LOCK(X) to 0 (unlocks the item) so that X may be accessed
by other transactions. Hence, a binary lock enforces mutual exclusion on the data
item; i.e., only one transaction can hold the lock on an item at a time.
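The lock_item(X) and unlock_item(X) operations can be illustrated with a minimal Python sketch. The class name BinaryLockTable and the waiting queue are assumptions made for illustration; a real DBMS lock manager blocks and wakes transactions rather than returning True or False.

```python
class BinaryLockTable:
    """Toy lock table: LOCK(X) is 1 (locked) or 0 (unlocked). Hypothetical helper."""

    def __init__(self):
        self.lock = {}      # item -> 0 or 1, i.e. LOCK(X)
        self.waiting = {}   # item -> transactions waiting for the item

    def lock_item(self, txn, x):
        """Return True if txn now holds the lock, False if it is forced to wait."""
        if self.lock.get(x, 0) == 1:
            self.waiting.setdefault(x, []).append(txn)
            return False
        self.lock[x] = 1
        return True

    def unlock_item(self, x):
        """Set LOCK(X) to 0 and return the transactions that may now retry."""
        self.lock[x] = 0
        return self.waiting.pop(x, [])


table = BinaryLockTable()
print(table.lock_item("T1", "X"))   # True: LOCK(X) was 0, T1 locks X
print(table.lock_item("T2", "X"))   # False: LOCK(X) is 1, T2 waits
print(table.unlock_item("X"))       # ['T2']: X is unlocked, T2 may retry
print(table.lock_item("T2", "X"))   # True: T2 now locks X
```

Only one transaction holds the lock on X at any moment, which is the mutual exclusion property described above.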
38. Continue..
2. Shared / Exclusive Locking :
Shared lock :
A shared lock is placed when a transaction reads the data. Multiple shared locks
can be placed on the same data item simultaneously, but while a shared lock is
held no exclusive lock can be placed. These locks are referred to as read locks
and are denoted by 'S'.
If a transaction T has obtained a shared lock on data item X, then T can read X
but cannot write X.
For example, when two transactions are reading Steve’s account balance, both are
allowed to read by placing shared locks. If another transaction wants to update
Steve’s account balance at the same time by placing an exclusive lock, it is not
allowed to do so until the reading is finished.
39. Continue..
Exclusive lock :
An exclusive lock is placed when a transaction wants to both read and write the
data. Once this lock is placed on a data item, no other lock (shared or
exclusive) can be placed on it until the exclusive lock is released.
For example, when a transaction wants to update Steve’s account balance, it is
allowed to do so by placing an X lock on it. If a second transaction wants to read
the data (S lock), it is not allowed, and if another transaction wants to write
the data (X lock), that is not allowed either.
These locks are referred to as write locks and are denoted by 'X'.
If a transaction T has obtained an exclusive lock on data item X, then T can
read as well as write X. Only one exclusive lock can be placed on a data item at
a time, so multiple transactions cannot modify the same data simultaneously.
40. Continue..
Lock Compatibility Matrix
+-----+-------+-------+
|     |   S   |   X   |
+-----+-------+-------+
|  S  | True  | False |
+-----+-------+-------+
|  X  | False | False |
+-----+-------+-------+
How to read this matrix:
Each row is the lock already held on the item and each column is the lock being
requested. The first row says that when an S lock is held, another S lock can be
acquired, so it is marked True, but an exclusive lock cannot, so it is marked
False.
The second row says that when an X lock is held, neither an S nor an X lock can
be acquired, so both entries are marked False.
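The compatibility matrix can be encoded directly as a lookup table. The sketch below is illustrative only; the names COMPATIBLE and can_grant are assumptions, and it treats the row as the lock already held and the column as the lock requested.

```python
# COMPATIBLE[held][requested] mirrors the lock compatibility matrix.
COMPATIBLE = {
    "S": {"S": True, "X": False},
    "X": {"S": False, "X": False},
}


def can_grant(requested, held_modes):
    """Grant a request only if it is compatible with every lock already held
    on the item by other transactions (hypothetical helper)."""
    return all(COMPATIBLE[held][requested] for held in held_modes)


print(can_grant("S", ["S", "S"]))  # True: readers can share the item
print(can_grant("X", ["S"]))       # False: a writer must wait for readers
print(can_grant("S", ["X"]))       # False: a reader must wait for the writer
print(can_grant("X", []))          # True: no lock is held on the item
```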
41. TIME STAMP BASED PROTOCOL
A timestamp is used to associate a time with an event, or more specifically with
a transaction. To ensure serializability, each transaction is assigned a
timestamp. In simple words, transactions are ordered by their time of arrival,
and because no transaction ever waits for a lock, there is no deadlock.
For each data item, two timestamps are maintained:
Read timestamp – timestamp of the youngest transaction that has performed a
read operation on the data item.
Write timestamp – timestamp of the youngest transaction that has performed a
write operation on the data item.
Let transaction T’s timestamp be denoted by TS(T), the read timestamp of data
item X be denoted by R-timestamp(X), and the write timestamp of data item X be
denoted by W-timestamp(X).
42. TIMESTAMP BASED PROTOCOL
The protocol works as follows:
• If transaction T issues a read operation on X:
If TS(T) < W-timestamp(X) then
the read request is rejected and T is rolled back,
else the read is executed and R-timestamp(X) is set to the larger of
R-timestamp(X) and TS(T).
• If transaction T issues a write operation on X:
If TS(T) < R-timestamp(X) or TS(T) < W-timestamp(X) then
the write request is rejected and T is rolled back,
else the write is executed and W-timestamp(X) is set to TS(T).
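The read and write rules can be sketched in Python as follows. This is a minimal illustration under stated assumptions: timestamps are positive integers, both timestamps of an item start at 0, and the names Item, read and write are hypothetical. A rejected request would in practice cause the transaction to be rolled back and restarted.

```python
class Item:
    """A data item X with its two timestamps (hypothetical structure)."""

    def __init__(self):
        self.r_ts = 0      # R-timestamp(X)
        self.w_ts = 0      # W-timestamp(X)
        self.value = None


def read(ts, item):
    """Apply the read rule for a transaction with timestamp ts."""
    if ts < item.w_ts:
        return "rejected"              # X was written by a younger transaction
    item.r_ts = max(item.r_ts, ts)     # remember the youngest reader
    return "ok"


def write(ts, item, value):
    """Apply the write rule for a transaction with timestamp ts."""
    if ts < item.r_ts or ts < item.w_ts:
        return "rejected"              # a younger transaction already used X
    item.value = value
    item.w_ts = ts
    return "ok"


x = Item()
print(write(5, x, "a"))   # ok: W-timestamp(X) becomes 5
print(read(3, x))         # rejected: 3 < W-timestamp(X)
print(read(7, x))         # ok: R-timestamp(X) becomes 7
print(write(6, x, "b"))   # rejected: 6 < R-timestamp(X)
print(write(8, x, "c"))   # ok: W-timestamp(X) becomes 8
```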
43. TIMESTAMP BASED PROTOCOL
Thomas' Write Rule
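Under the basic time-stamp ordering rules, if TS(Ti) < W-timestamp(X), the
write operation is rejected and Ti is rolled back.
Thomas' Write Rule modifies the rules for this case: instead of rolling Ti
back, the outdated 'write' operation itself is ignored and Ti continues.
A write with TS(Ti) < R-timestamp(X) is still rejected.
With this modification the resulting schedules are view serializable.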
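The difference that Thomas' Write Rule makes to a write request can be shown in a short Python sketch. The function name thomas_write and its return values are assumptions chosen for illustration, and timestamps are assumed to be integers.

```python
def thomas_write(ts, r_ts, w_ts):
    """Decide what happens to a write on X by a transaction with timestamp ts.

    r_ts and w_ts are R-timestamp(X) and W-timestamp(X).
    Returns (action, new W-timestamp(X)).
    """
    if ts < r_ts:
        return "rollback", w_ts   # a younger transaction has already read X
    if ts < w_ts:
        return "ignore", w_ts     # outdated write: skipped, no rollback
    return "write", ts            # normal case: perform the write


print(thomas_write(4, r_ts=6, w_ts=2))  # ('rollback', 2)
print(thomas_write(4, r_ts=3, w_ts=9))  # ('ignore', 9)
print(thomas_write(4, r_ts=3, w_ts=2))  # ('write', 4)
```

Only the middle case differs from the basic protocol, which would roll the transaction back instead of ignoring the write.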
44. Need of Recovery
• Media failure, e.g. disk-head crash:
Part of the persistent store is lost and needs to be restored.
Transactions in progress may be using this area, so uncommitted transactions
must be aborted.
• System failure, e.g. a crash in which main memory is lost:
The persistent store is not lost but may have been changed by uncommitted
transactions.
Also, committed transactions’ effects may not yet have reached persistent objects.
• Transaction abort:
Any changes made by the aborted transaction need to be undone.
45. Need of Recovery
When a DBMS recovers from a crash, it should do the following −
• It should check the states of all the transactions that were being executed.
• A transaction may be in the middle of some operation; the DBMS must ensure
the atomicity of the transaction in this case.
• It should check whether the transaction can be completed now or needs to
be rolled back.
• No transaction should be allowed to leave the DBMS in an inconsistent state.
46. Recovery with Concurrent Transactions
Checkpoint
Keeping and maintaining logs in real time in a real environment may use up all
the memory space available in the system. As time passes, the log file may grow
too big to be handled at all. Checkpoint is a mechanism where all the previous
logs are removed from the system and stored permanently on a storage disk.
A checkpoint declares a point before which the DBMS was in a consistent state and
all the transactions were committed.
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in
the following manner −
• The recovery system reads the logs backwards from the end to the last
checkpoint.
• It maintains two lists, an undo-list and a redo-list.
47. Recovery with Concurrent Transactions
• If the recovery system sees a log with <Tn, Start> and <Tn, Commit> or just
<Tn, Commit>, it puts the transaction in the redo-list.
• If the recovery system sees a log with <Tn, Start> but no commit or abort log,
it puts the transaction in the undo-list.
All the transactions in the undo-list are then undone and their logs are removed.
For all the transactions in the redo-list, their previous logs are removed and
they are then redone before their logs are saved.
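The construction of the undo-list and redo-list can be sketched in Python. This is a simplified illustration, not a full recovery algorithm: the log format (tuples such as ("T1", "start") and ("checkpoint",)) and the function name build_lists are assumptions, update records are left out, and transactions that were already active at the checkpoint but wrote no later record are not tracked.

```python
def build_lists(log):
    """Scan the log backwards from the end to the last checkpoint and
    return (redo_list, undo_list), each sorted by transaction name."""
    # Locate the first record after the last checkpoint.
    start = 0
    for i, record in enumerate(log):
        if record == ("checkpoint",):
            start = i + 1

    redo, undo, finished = set(), set(), set()
    for txn, action in reversed(log[start:]):
        if action == "commit":
            redo.add(txn)          # <Tn, Commit> seen: redo
            finished.add(txn)
        elif action == "abort":
            finished.add(txn)      # already aborted: neither list
        elif action == "start" and txn not in finished:
            undo.add(txn)          # <Tn, Start> with no commit or abort: undo
    return sorted(redo), sorted(undo)


log = [
    ("T1", "start"), ("T2", "start"), ("T1", "commit"),
    ("checkpoint",),
    ("T3", "start"), ("T2", "commit"), ("T4", "start"),
    ("T3", "commit"), ("T5", "start"), ("T5", "abort"),
]
redo_list, undo_list = build_lists(log)
print(redo_list)   # ['T2', 'T3']
print(undo_list)   # ['T4']
```

T1 committed before the checkpoint and needs no work, T2 and T3 committed after it and are redone, and T4 started but never finished and is undone.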