Review for Paper: 12-Chapter 16

Review 1

Transaction management is employed by transactional database systems to ensure that the ACID properties are preserved, meaning basically that any database will always provide a consistent representation of a state of the model it instantiates (when all transactions have committed or aborted). More specifically, transaction management ensures that as long as the programmer makes each transaction map from consistent inputs to consistent outputs, then no inconsistencies will be introduced by the database management system, as a result of problems in concurrency or durability. Several anomalies can occur if transaction management does not ensure serializability, and database researchers must understand these anomalies.

The book chapter “Overview of Transaction Management” describes the goals of transaction management, such as the ACID properties and serializability; the types of conflict that can occur if a DBMS does not ensure serializability, such as lost updates and the phantom problem; and methods for guaranteeing various isolation levels, using shared and exclusive locks. The chapter describes the locking mechanisms producing the four main isolation levels, from serializable (the most conservative) to read uncommitted (the most permissive), along with explanations for when each level is appropriate to use. Of particular importance is the discussion of the phantom problem, a subtle issue that can prevent even strict two-phase locking from providing serializability, if rows are locked instead of whole tables.

The chapter thoroughly describes the categories of problems that can result from insufficient concurrency control. These include dirty reads, unrepeatable reads, lost updates, and the phantom problem. There is similarly complete coverage of isolation levels. The authors make particularly clear how the locking mechanism of each of the four isolation levels produces the theoretical guarantees of its level. For example, repeatable read systems can be implemented with strict two-phase locking, without locking index ranges or whole tables, while serializable systems can use strict 2PL and also lock index ranges or whole tables. In a read committed system, read locks are released immediately, while in a read uncommitted system, no read locks are acquired at all.
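
To make the phantom problem concrete, here is a schematic two-session timeline. It is a sketch of my own, not from the chapter, assuming a Sailors(sid, sname, rating, age) table in the spirit of the book's running example and a lock-based implementation of REPEATABLE READ that takes only row-level locks (snapshot-based systems behave differently):

    -- Session 1 (T1): locks the rows that currently satisfy the predicate
    SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SELECT MAX(age) FROM Sailors WHERE rating = 8;   -- returns, say, 55

    -- Session 2 (T2): inserts a new row matching T1's predicate and commits;
    -- no row lock held by T1 covers this new row
    INSERT INTO Sailors VALUES (96, 'Frodo', 8, 63);
    COMMIT;

    -- Session 1 (T1): rereads the same predicate and sees a "phantom" row
    SELECT MAX(age) FROM Sailors WHERE rating = 8;   -- now returns 63
    COMMIT;

Locking an index range on rating, or the whole table, as a serializable implementation must, would instead block T2's insert until T1 completes.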

One shortcoming of the chapter is that it states some of its most important points without proof. For example, the authors say “it can be shown that the Strict 2PL algorithm allows only serializable schedules.” Such an important claim should be justified better, considering that a proof sketch could be quite short in this case.


Review 2

The chapter introduces readers to the concept of transactions and to how those transactions are affected when they run concurrently. It starts off by highlighting fundamentals such as the ACID (Atomicity, Consistency, Isolation, Durability) properties. Once all the terminology is dealt with, transactions and schedules are introduced. A transaction is a series of actions performed by one program, whereas a schedule is a list of actions from a set of transactions (actions being read, write, abort or commit).

Concurrent executions of transactions are needed to improve the performance of a DBMS. We are faced with three anomalies in the process: reading uncommitted data (reading data that has not yet been committed by the writing transaction), unrepeatable reads (reading different values of the same object on different accesses), and overwriting uncommitted data (overwriting data that another transaction has written but not committed, and consequently losing an update). Schedules involving aborted transactions must also be rolled back to undo the effects of the interrupted transactions. The concept of locks is then discussed, showcasing Strict Two-Phase Locking, which holds all of a transaction's locks until it completes, along with how locking can hamper query performance.
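
To make the overwriting anomaly concrete, here is a minimal sketch of my own (not from the chapter), assuming a hypothetical Employees(name, salary) table and no concurrency control at all; T1 intends to set both salaries to 1000 and T2 intends to set both to 2000:

    -- Interleaved execution with no locking (a WW conflict):
    UPDATE Employees SET salary = 1000 WHERE name = 'Harry';  -- T1's write
    UPDATE Employees SET salary = 2000 WHERE name = 'Harry';  -- T2 overwrites T1
    UPDATE Employees SET salary = 2000 WHERE name = 'Larry';  -- T2's write
    UPDATE Employees SET salary = 1000 WHERE name = 'Larry';  -- T1 overwrites T2
    -- Final state: Harry = 2000 but Larry = 1000, which matches no serial
    -- order of T1 and T2; one update from each transaction has been lost.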

The chapter is successful in providing readers with the basics of transactions and locks in simple language. These are supported by figures which help in understanding the flow of control between two parallel transactions.

The reading had some typos here and there. Apart from that, the section on transaction support in SQL is not detailed with real-life examples, giving only a basic understanding of the concept.



Review 3

To improve performance, a database management system interleaves the actions of several transactions. On the other hand, the interleaving should be done carefully to ensure that the result of a concurrent execution of transactions is equivalent to some serial execution of the same set of transactions. Therefore, how the DBMS handles concurrent executions is an important aspect of transaction management and the subject of concurrency control. Moreover, the DBMS should handle partial transactions, which is closely related to interleaving and recovery. The DBMS ensures that each transaction is atomic and that partial transactions are not seen by other transactions. This chapter provides an introduction to concurrency control and crash recovery in a DBMS.
Section 16.1 discusses the four fundamental properties of database transactions (ACID): atomicity, consistency, isolation, and durability.
Section 16.2 presents the schedule, an actual or potential execution sequence which provides a convenient way to describe interleaved executions of transactions.
Section 16.3 discusses the various problems that can arise due to interleaved execution and which interleavings, or schedules, a DBMS should allow: schedules should be serializable and should remain safe in the presence of aborted transactions. The chapter introduces lock-based concurrency control (Strict Two-Phase Locking), the most widely used approach, in Section 16.4. The performance issues associated with lock-based concurrency control are considered in Section 16.5, which provides three guidelines: lock the smallest objects possible; reduce the time that transactions hold locks; and reduce hot spots. Section 16.6 considers locking and transaction properties in the context of SQL. Finally, the chapter presents an overview of how a database system recovers from crashes and what steps are taken during normal execution to support crash recovery.



Review 4

This paper (more precisely a chapter from the book Database Management Systems by Ramakrishnan and Gehrke) provides an overview of transaction management in a DBMS.
In short, transactions are the foundation for concurrent execution and system recovery in a DBMS, which makes them very important. This paper first discusses the four key properties of transactions (ACID, which stands for Atomicity, Consistency, Isolation and Durability). Then it describes schedules (interleaved executions of several transactions) and lock-based concurrency control, including Strict 2PL and the concept of deadlocks. Finally, it discusses the performance of locking, as well as transaction support in SQL.

The general problem here is that a DBMS runs multiple transactions concurrently in order to get better throughput, and a lot of issues surface as a result. For example, if a transaction T1 reads an object A that has been modified by another transaction T2, but T2 has not yet committed, then T1's read is a dirty read: it may return a value that is never actually committed. Therefore, techniques need to be developed to handle all kinds of situations like dirty reads.
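
The dirty read can be sketched as a two-session SQL timeline. This is an illustration under assumed names (a hypothetical Accounts(owner, balance) table); note also that some systems, PostgreSQL for example, never expose dirty reads even when READ UNCOMMITTED is requested:

    -- Session 2 (T2): modifies A but does not commit yet
    BEGIN;
    UPDATE Accounts SET balance = balance - 100 WHERE owner = 'A';

    -- Session 1 (T1): at READ UNCOMMITTED it may see T2's in-flight write
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    SELECT balance FROM Accounts WHERE owner = 'A';   -- dirty read
    COMMIT;

    -- Session 2 (T2): aborts, so the value T1 read was never committed
    ROLLBACK;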

The major contribution of the paper is that it provides a detailed summary of transactions and gives a lot of examples when describing new terms and concepts. Here we summarize the key components below:

1. Serializability

2. Anomalies due to interleaved execution
a. Reading uncommitted data (WR conflicts)
b. Unrepeatable reads (RW conflicts)
c. Overwriting uncommitted data (WW conflicts)

3. Lock-based concurrency control
a. Strict 2PL: request shared/exclusive lock before read/write; release all locks when completed
b. Deadlocks: a deadlock happens when T1 holds an X lock on A and T2 holds an X lock on B, while T1 requests a lock on B and T2 requests a lock on A. They wait for each other in a cycle (see the sketch after this list).
c. Phantom Problem: a transaction retrieves a collection of objects twice (using the same predicate) and sees different results.

4. Isolation Level (Read uncommitted, read committed, repeatable read, serializable)
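
To make the deadlock in item 3b concrete, here is a hedged two-session sketch of my own, assuming a hypothetical Accounts(owner, balance) table and row-level exclusive locks:

    -- Session 1 (T1)
    BEGIN;
    UPDATE Accounts SET balance = balance - 10 WHERE owner = 'A';  -- T1 gets X(A)

    -- Session 2 (T2)
    BEGIN;
    UPDATE Accounts SET balance = balance - 10 WHERE owner = 'B';  -- T2 gets X(B)

    -- Session 1 (T1)
    UPDATE Accounts SET balance = balance + 10 WHERE owner = 'B';  -- blocks on T2

    -- Session 2 (T2)
    UPDATE Accounts SET balance = balance + 10 WHERE owner = 'A';  -- blocks on T1

    -- T1 waits for T2 and T2 waits for T1: a cycle. The DBMS must detect it
    -- (or use a timeout) and abort one of the two transactions as the victim.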

One interesting observation: I like the way the author introduces new concepts or terms. It is very clear and accurate, with detailed examples, and I think it is a good example of a textbook. One problem I am very interested in is the phantom problem. This chapter only mentions it briefly and does not present a detailed solution to it.


Review 5

This chapter covers an aspect of DBMS that students find very boring, but businesses find very important - transactions. The primary goal of transactions is to maintain compliance with ACID - atomicity, consistency, isolation, and durability. Atomicity refers to the idea that each transaction must finish in its entirety, or none of its effects should be felt. Consistency refers to the idea that each transaction must leave the database in a consistent state - this is up to the user issuing the queries to maintain. Isolation refers to the idea that each transaction runs independently of any other transactions - even if the transactions are interleaved by the DBMS, the effects should be the same as some serial order of the events. Durability refers to the idea that a transaction, once committed, should persist even if the system crashes. This is generally accomplished with logging and recovery protocols such as ARIES.

For performance, a DBMS must interleave transactions - executing them in a serial fashion would hurt response time in a multi-user system. However, the system must check to make sure that the concurrent execution of transactions does not lead to any conflicts. Some systems only use schedules that are serializable - that is, the concurrent plan results in the same outcome as some serial plan. Generally, database systems use some kind of lock-based concurrency control - whether it be strict two-phase locking, or grabbing exclusive locks first and then downgrading. Both of these methods have trade-offs when it comes to throughput and deadlocks. There is also the question of what to lock - the whole table, just the rows containing relevant tuples, or the index? All of these options offer different trade-offs between performance and isolation. Depending on what level of isolation the DBA allows, the database can increase performance by further interleaving transactions, even to the point that transactions can read from uncommitted transactions.
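
The what-to-lock question can be made concrete with explicit lock requests. This is a sketch using PostgreSQL-style syntax (LOCK TABLE and SELECT ... FOR UPDATE are not in every dialect) and a hypothetical Sailors table:

    -- Coarse granularity: one lock covers everything; cheap to manage,
    -- but no other transaction can touch the table until we commit.
    BEGIN;
    LOCK TABLE Sailors IN EXCLUSIVE MODE;
    UPDATE Sailors SET rating = rating + 1 WHERE rating = 8;
    COMMIT;

    -- Fine granularity: lock only the qualifying rows; other rows stay
    -- available, but inserts of new matching rows (phantoms) are not blocked.
    BEGIN;
    SELECT * FROM Sailors WHERE rating = 8 FOR UPDATE;
    UPDATE Sailors SET rating = rating + 1 WHERE rating = 8;
    COMMIT;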

This chapter does a great job of explaining how transactions are implemented, and why they are used. Its only drawback is that it's on a subject that is generally seen as boring. Transactions aren't sexy - they're not a cool new feature. However, this paper does do a good job of explaining the complexity of the problem, as well as why transactions are necessary. While they're not "cool", they are highly desired by industry, so there's a lot of money in getting them fast and getting them right.


Review 6

The paper is a chapter from a database-related textbook. The topic matters because the concurrency of transactions is an important part of the database. We like to implement concurrency because I/O and CPU activities can proceed in parallel, which allows two transactions to be executed concurrently. Also, under serial execution, transactions may get stuck behind a long-running transaction. The paper tries to convey the idea of concurrent transactions by discussing the definition, principles, problems and solutions of concurrency.

At the beginning of the chapter, it presents the concept of the transaction, which is the foundation for concurrent execution and recovery from system failure. A transaction is any one execution of a user program in a DBMS. The requirement on the interleaving of transactions is that the result should be the same as some serial execution. To accomplish this requirement, the system must meet the ACID principles: atomicity (all actions of a transaction are grouped together and are carried out all or none, so the database should remove the effects of actions of incomplete transactions), consistency (the users' responsibility), isolation (making sure that concurrent transactions have the same effect as serial ones) and durability (crash recovery). The database keeps a log to record the actions. It is used to remove the effects of an incomplete transaction for atomicity and to recover from a crash for durability. Transactions are assumed to communicate with each other only through database reads and writes, since the DBMS cannot account for any other channel of communication.

Then it talks about the idea of a schedule. The database uses the schedule to manage the actions of multiple transactions. A complete schedule contains all the actions, including either an abort or a commit for each transaction. A serial schedule executes transactions in serial order, while a serializable schedule executes transactions concurrently and has the same effect as some serial schedule. The problems in interleaved execution are write-read conflicts, read-write conflicts, write-write conflicts and aborted transactions. They can result in a non-serializable schedule. Regarding transaction aborts, the database should abort all transactions that read data written by the aborted transaction. If a transaction that read such data has already committed, the schedule becomes unrecoverable. To make sure a schedule is recoverable, a transaction commits only after all the transactions from which it read data have committed. A database is required to be consistent only when a transaction has completed; it need not be consistent while the transaction is being executed.

The solution to these problems is that the database uses locks to ensure a serializable and recoverable schedule. The most commonly used locking protocol is strict two-phase locking. It is used to serialize the actions of different transactions on the same object. The drawback of the locking protocol is that it can produce deadlocks. A deadlock happens when a set of transactions' lock requests form a cycle and every one is waiting for a lock to be released. A simple way to detect and resolve deadlocks is to watch how long transactions have been waiting for locks and to time out, and abort, a transaction that has been waiting suspiciously long.

Save points are provided in SQL to support long-running transactions. A save point is like a checkpoint in a file system: it allows the transaction to roll back to that point instead of starting over from the beginning. After rolling back to a save point, the save point should be re-established if it is still needed in the future.
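
In SQL this looks roughly as follows (a sketch with a hypothetical Orders table; SAVEPOINT and ROLLBACK TO SAVEPOINT are standard SQL):

    BEGIN;
    INSERT INTO Orders VALUES (1, 'widget', 10);
    SAVEPOINT first_batch;                  -- mark a point mid-transaction
    INSERT INTO Orders VALUES (2, 'gadget', 999999);   -- a mistake
    ROLLBACK TO SAVEPOINT first_batch;      -- undo only the work after the mark
    INSERT INTO Orders VALUES (2, 'gadget', 99);       -- retry
    COMMIT;   -- the first insert was never lost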

One strength of the paper is that it has a clear structure for all those concepts. The explanation proceeds from easy to hard, which is very accessible to readers. The other strength is that it provides a series of examples in the same form, using T1, T2, R(A), R(B) consistently in different parts, so that readers do not need to take time to get familiar with every new example.

The drawback is that the performance evaluation is too limited. It just has a few passages with a single figure. It would be beneficial if the authors could provide more measurement figures and analyze the performance in more detail, for example treating write locks and read locks individually.

In total, it provides a very detailed and accessible overview of concurrent transactions. The wording is clear and not difficult, which is very helpful for a new learner. From this paper, I can capture all the basic ideas about concurrency in databases.


Review 7

This paper introduces the concept of a transaction in a DBMS, the different scheduling and concurrency issues that are involved in transactions, and how transactions are implemented. The paper does not present any new idea or algorithm, but it provides a clear introduction to the topic. Supporting concurrent execution of transactions in a DBMS is important, because it helps speed up performance and scale better with increasing parallelism to mask I/O delay.

A transaction, in the context of a DBMS, is the basis for concurrent execution and system recovery. Transactions are actions or executions that are performed by a user program in a DBMS. The four properties of transactions guaranteed by the DBMS are:
- Atomicity: all actions of a single transaction complete as one unit, or none of them do.
- Consistency: each transaction, run by itself, keeps the database in a consistent state.
- Isolation: transactions are protected from the effects of other concurrently executing transactions; the net outcome is as if they ran in some serial order.
- Durability: if a transaction has completed, its effects should persist even if the system crashes.
Transactions are executed as a series of actions in a DBMS, and the only means of interaction between the database and the transactions is through read and write operations.

The scheduling of transactions involves how the different actions (read, write, abort, commit) are interleaved and executed in order by the DBMS. In scheduling, there are several anomalies, such as WR conflicts, RW conflicts, and WW conflicts, that need to be avoided. These are often handled through concurrency techniques such as locking, at a cost in performance. The granularity of locks and blocking is an area of optimization.
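
An RW conflict (unrepeatable read), for instance, can be sketched as the following two-session timeline, assuming a hypothetical Accounts(owner, balance) table and the READ COMMITTED level, where read locks are released early:

    -- Session 1 (T1)
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
    SELECT balance FROM Accounts WHERE owner = 'A';   -- reads 500

    -- Session 2 (T2): changes A between T1's two reads
    UPDATE Accounts SET balance = 0 WHERE owner = 'A';
    COMMIT;

    -- Session 1 (T1): the same read now gives a different answer
    SELECT balance FROM Accounts WHERE owner = 'A';   -- reads 0
    COMMIT;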

The paper does a good job presenting the material in a clear manner, and gives a good introduction to transactions. However, it is a bit light on the details of how transactions are implemented and on advanced techniques for concurrency management. It touches on the basic solutions, but not in great detail, and does not explore the interesting design decisions that come with them.


Review 8

This chapter introduces the concept of the transaction and illustrates several issues concerning transaction schedules. In addition, it introduces the topics of crash recovery and SQL support for transactions. The following is a summary of this chapter:

1. A transaction is an execution of a user program in a DBMS and should be guaranteed the ACID properties. Atomicity and durability are provided by logging, and isolation is guaranteed by careful scheduling of interleaved events (serializable schedules), while consistency must be maintained by the user programs, since the DBMS cannot detect inconsistencies due to user program logic.

2. Transactions can be interleaved for performance (throughput or average latency) reasons. But interleaved transactions can cause three kinds of conflicts: WR, RW, and WW conflicts, which might violate the ACID properties and make the database inconsistent. Once commits and aborts are considered, interleaving may also cause unrecoverable schedules.

3. To solve the problems above, one solution is strict two-phase locking (strict 2PL), since it guarantees serializable schedules. However, there are some general problems with strict 2PL. First, deadlock can occur under strict 2PL; a common approach is to detect and resolve deadlocks. Second, performance suffers when the system thrashes: too many transactions are blocked. There are some possible ways to address this: 1. adjust the granularity of locking, 2. reduce the time a transaction holds locks, 3. reduce hot spots (reduce contention on certain resources).

4. SQL provides support for transactions. It supports different isolation levels, which are, from low to high: read uncommitted, read committed, repeatable read, and serializable. Lower isolation levels allow more concurrency, while higher levels are safer. A read uncommitted transaction is required to be read-only. SQL also supports savepoints and rollback, which let a long transaction selectively undo part of its work instead of being restarted from scratch.

5. Due to limited memory and I/O cost concerns, a steal, no-force approach has to be implemented in databases. But this can result in an inconsistent state when the system crashes. To restore consistency, a write-ahead log, which records each mutation of data before it is materialized to disk, is needed in recovery. Checkpoints are also needed, since they save much I/O in recovery. ARIES is a recovery algorithm for the steal, no-force approach, with three phases: analysis, redo and undo. It ensures the atomicity and durability of the DBMS even after a crash.


Review 9

Problem/Overview:
This textbook chapter gives an overview of the transactional properties that DBMSs generally want to provide, and how they do it. The chapter first describes the ideal properties of transactions using the ACID acronym: Atomicity, Consistency, Isolation, and Durability. Because maintaining Consistency is the responsibility of the user, the chapter focuses on how Atomicity, Isolation and Durability apply in the applications of Concurrency and Recovery.

Databases want to support concurrent transactions, but also want to provide Isolation for their transactions- the effects of a transaction should not depend on the way in which the transaction is interleaved with other transactions (since this is not under the user’s control). This means that the effect of a set of interleaved transactions should always be the same as if the transactions were executed consecutively and not interleaved. Atomicity is also a consideration in concurrent transactions, because if a transaction is aborted, it must be able to be undone. The DB should not leave a transaction halfway finished. Both Atomicity and Isolation can be ensured by only allowing certain interleavings of transactions, and can be implemented through a protocol called Two-Phase Locking. This protocol can decrease performance, so the DBMS may give the user an option to run transactions at lower isolation levels to increase performance.

Databases also want to have Recovery protocols, in order to keep the database consistent in the event of a crash. If the database crashes, it wants to ensure that transactions are durable, meaning that committed transactions survive the crash. Atomicity in this context means that transactions should not be half-done even if a crash occurs. These properties are provided by logging. The database will write a log of changes to some stable storage. When a transaction commits, a “commit” change is written to the log, and this is used after a crash to determine if the particular transaction finished or not. If a “commit” change is seen in the log, then the DB knows that the transaction committed and can replay any changes not applied to the database yet. If a “commit” change is not seen, then the DB can undo any uncommitted modifications that the transaction made before the crash.

Strengths:
This is a chapter from a textbook so it is very readable and provides good examples and diagrams.

Weaknesses:
There are a lot of concepts covered, so it feels less focused than a paper.



Review 10

The paper is a chapter from a database textbook. The chapter is titled “Overview of Transaction Management”. As the chapter title suggests, the paper discusses the relevant topics for a DBMS executing concurrent transactions. The paper serves as good review material for refreshing the fundamentals of transaction management in a DBMS.

The paper begins by explaining the well-known ACID properties of a DBMS: Atomicity, Consistency, Isolation and Durability. These properties must be guaranteed by a DBMS in order to maintain data correctly in the face of concurrent access and system failures. The DBMS needs to interleave the actions of different transactions for concurrent execution, which is essential for the performance of the DBMS. After motivating readers for concurrent execution, the paper delves into the different issues that can be raised by concurrent/interleaved execution of multiple transactions.

Without any concurrency control, there could be many anomalies caused by interleaved execution of transactions. Two transactions can conflict with each other in a write-read (WR) conflict, read-write (RW) conflict or write-write (WW) conflict. These conflicts can leave a database in an inconsistent state or cause a transaction to terminate unexpectedly with an error. Also, aborted transactions can affect other transactions, leaving the schedule unrecoverable. We need a concurrency control technique to handle these issues, and the paper mainly discusses lock-based concurrency control.

Strict Two-Phase Locking (Strict 2PL) is the most widely used locking protocol. With this protocol, a transaction requires a shared lock on an object it wants to read and similarly requires an exclusive lock on an object it wants to write. This ensures that transactions execute in a serializable schedule, while sacrificing some performance due to blocking and aborting. Most modern DBMSs thus support a number of different lock granularities. The DBMS can lock an entire table, a portion of a table or a single row. While row-level locking is better in terms of concurrency, it may admit the phantom problem. This makes the choice of lock granularity a complicated problem.
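
The blocking cost is easy to see in a two-session sketch (my own illustration, using PostgreSQL-style explicit row locks via FOR SHARE and FOR UPDATE, on a hypothetical Sailors table):

    -- Session 1 (T1): takes a shared lock on one row and holds it to commit
    BEGIN;
    SELECT * FROM Sailors WHERE sid = 22 FOR SHARE;

    -- Session 2 (T2): needs an exclusive lock on the same row, so it blocks
    BEGIN;
    UPDATE Sailors SET rating = 9 WHERE sid = 22;   -- waits for T1 ...

    -- Session 1 (T1): commits, releasing the shared lock,
    COMMIT;
    -- ... at which point T2's update proceeds and T2 can commit.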

In SQL, the isolation level can also be specified. This lets a user control the degree to which a given transaction is exposed to the actions of other transactions executing at the same time. Similar to lock granularity, different isolation levels provide a trade-off between concurrency (i.e., performance) and the possibility of inconsistent results due to dirty reads, unrepeatable reads and the phantom problem.

To sum it up, the paper provides a good review of the fundamentals of a DBMS in transaction management. One downside is that the textbook is a bit outdated and fails to include discussions of the topic from today’s perspective.



Review 11

Chapter 16 of Database Management Systems discusses transaction management and the need for concurrency control mechanisms. Although preserving the ACID properties listed at the start of the chapter would be simple with serial execution, serial execution is not feasible when supporting multiple users issuing many transactions. Furthermore, it would waste computing resources such as CPU cycles, as discussed in the text. However, using transaction interleaving introduces several problems: lost updates, dirty reads, unrepeatable reads, and phantom reads. One method to solve these problems is to use locking. The chapter talks about strict two-phase locking as a method to enforce schedules that are serializable and recoverable. The chapter also brings up the ideas of locking granularity and isolation levels and how they can affect database performance.

This chapter does a decent job of introducing the motivation for concurrency as well as a method to enforce strict scheduling to preserve the ACID properties of a database. There are several applicable examples which are easy to understand. However, I feel like the chapter gives the impression that locking is the only method used to enforce concurrency control in modern database systems. I think section 16.6.1 could have been scrapped and replaced with a small paragraph about other possibilities for concurrency control. The chapter already introduces new ideas that it defers to Chapter 17, so I don't think it would lose any of its content by introducing another Chapter 17 idea.


Review 12

This chapter discusses transactions. Transactions are a series of read and write operations followed by a commit (or abort) message to the DBMS. Since most DBMSs interleave actions from different transactions to improve efficiency, the "schedule" of actions should follow certain properties. The DBMS has a recovery manager to help ensure the ACID properties:

1. Atomicity: When any unexpected event happens, such as a system crash, an incomplete transaction should be undone using the log maintained by the DBMS. As a result, each transaction is done atomically.

2. Consistency: Users are responsible for consistency; that is, there should not be logic errors in users' programs.

3. Isolation: This property ensures that the final result of several interleaving transactions should be the same as the result that these transactions execute one by one.

4. Durability: The log used for ensuring atomicity is also used for durability. After a system crash, the DBMS replays the write log to bring committed writes to disk, so that no committed work is lost.

To achieve the ACID properties, the schedule of actions should be chosen very carefully, because inappropriate ordering of actions might lead to RW, WR, or WW conflicts which can turn a consistent database into an inconsistent one. The DBMS should only allow serializable and recoverable schedules, which can be achieved by the use of locks. Two-phase locking, involving "shared locks" and "exclusive locks", is one of the common protocols. However, database thrashing can occur because of lock contention, which needs to be prevented by further policies.
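
Contention only arises when transactions touch the same objects, which is what makes hot spots so damaging. A hedged sketch of my own (row-level locking and a hypothetical schema assumed):

    -- These two transactions lock disjoint rows, so they can run in parallel:
    -- Session 1 (T1)
    BEGIN;
    UPDATE Accounts SET balance = balance - 10 WHERE owner = 'A';
    COMMIT;
    -- Session 2 (T2)
    BEGIN;
    UPDATE Accounts SET balance = balance - 10 WHERE owner = 'B';
    COMMIT;

    -- A hot spot: every transaction updates the same counter row, so all of
    -- them queue behind one exclusive lock, and throughput can thrash.
    UPDATE Stats SET total = total + 1 WHERE id = 1;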

I think this chapter is very straightforward and well-written, since it classifies anomalies into categories and shows a lot of examples that are easy to understand. It would be more interesting if there were a performance analysis showing how much improvement is achieved by interleaving transactions while ensuring the ACID properties.



Review 13

This chapter of Database Management Systems covers the main aspects of the way database management systems handle transactions. The motivation for this discussion is the great impact that the way a DBMS handles concurrency has on performance and reliability. The chapter approaches this topic by describing the guarantees of transactions in a DBMS, the interleaving of transactions, and crash recovery.

The main strengths of this chapter are that it covers and explains the concepts vital to transaction management in a concise and easy-to-understand manner. Each DBMS must ensure atomicity (the completion of transactions is all-or-nothing), consistency (each transaction leaves the database holding consistent data), isolation (the protection of transactions from the effects of concurrently scheduled transactions), and durability (the persistence of transactions even if the system crashes before the changes are reflected on disk). Each transaction is either Committed, indicating that the transaction was successful, or Aborted, indicating that all the actions carried out so far must be undone. The interleaving of transactions is made possible with locks, which ensure the serializability (the execution of the schedule is equivalent to executing the transactions in some serial order) and recoverability (transactions commit only after all transactions they read from commit) of the transactions. The most used locking protocol is Strict Two-Phase Locking, in which a transaction must request a shared lock to read an object and an exclusive lock to write one, and releases the locks it holds when it is complete. For atomicity and durability, a recovery manager is deployed. All modifications to the database are saved in a log on stable storage. The ARIES recovery algorithm, which uses steal (a dirty page may be written to disk even if its associated transaction is active) and no-force (pages modified by a transaction in the buffer pool are not forced to disk upon commit), analyzes the changes that haven't been written to disk, restores the database to where it was before the crash, and undoes the actions of transactions that did not commit.

Some of the limitations of this chapter are that the overview of the ARIES recovery algorithm is very brief. I would've liked to see it provide an example of how the ARIES algorithm is applied. I would've also liked to see an example of deadlock in this chapter's discussion in section 16.4.2.



Review 14

This is the 16th chapter of Ramakrishnan and Gehrke's database textbook. This chapter is about transaction management, which includes a broad introduction to concurrency control, the fundamental properties of database transactions and how these properties are maintained, how a schedule works to interleave the actions of several transactions, and the problems and performance of these systems.

Databases are accessed by many users, which is a large part of why they are useful in the first place. To support concurrency, the database system has to provide the properties of atomicity, consistency, isolation, and durability. Here is what each of these means:

Atomicity: Users should have entire transactions run or have them not run at all. Partially completed transactions because of crashes or other errors can cause a mess.

Consistency: For a given application there may be consistency constraints. It is the users responsibility to write transactions that maintain this consistency.

Isolation: A transaction should be able to run without having to worry about other transactions that are currently happening.

Durability: The database must keep changes if it tells the user it has completed the transaction.

A database maintains atomicity and durability by keeping a log file of transactions and undoing the actions of transactions that don't complete. A DBMS interleaves transactions to increase throughput by scheduling I/O and CPU actions and to decrease the average wait time for query completion, but this introduces other constraints. Three inconsistencies can occur, and these are called RW, WR, and WW anomalies. Aborted transactions can cause problems too. These can be fixed by locks. A DBMS should only let serializable, recoverable schedules run.

The advantages of this paper include a clear, thorough and intuitive explanation of transaction management in database management systems. It makes a lot of sense if you have background in operating systems. Concurrency control is an issue that operating systems solve and they deal with threads and deadlocks the same way that the DBMS uses strict 2-phase locking and deadlock detection. The paper is also very clear in its motivation and explanations, especially of RW, WR, and WW anomalies.

This paper led me down a few paths where I didn't think I'd get an answer to my question. The transition from 16.3.3 to 16.3.4 was abrupt, and I was expecting something to tell me 16.4 was coming, but instead I was left wondering "how do these things work then!?". There are some misspellings which I assume come from applying an ineffective OCR technique to a scanned PDF, so this is not the authors' fault. There are some figures that are unclear. Is 16.9 supposed to have a line? How am I supposed to understand the scale here? Also, 16.10 looked like it was missing a column header, but I managed to figure out it had to do with phantoms. This chapter mentioned that consistency is the responsibility of the user. I feel like I'd never run something that wasn't SERIALIZABLE. I am wondering to what extent one really has to worry about this?


Review 15

Part 1: Overview

This chapter provides a high-level overview of transactions, which are widely used in database systems. Starting from the four properties of transactions, the so-called ACID properties of atomicity, consistency, isolation, and durability, the book talks about the importance of concurrency control and crash recovery. Every transaction views the database system as a consistent instance.

Transactions can be scheduled as a list of actions including reads and writes. The final action should be either a commit or an abort. Since transactions are designed to run in parallel for performance on multi-CPU hosts, there arises the serializability question: which reorderings of the transactions' actions retain the same results. WR, RW, as well as WW conflicts must be avoided in scheduling, so as to keep the final results correct and the schedule easy to recover.

Locks are used in order to control concurrency. Strict 2PL requires a transaction to hold a shared lock on an object it reads and an exclusive lock on an object it modifies, and to hold them until it completes. Database systems often choose to detect and recover from deadlocks, as there are so many exclusive locks on different objects. The overhead caused by the locking scheme can be reduced by locking smaller objects, reducing the time that a transaction holds locks, or scattering the hot spots. Transactions are supported by SQL: as the user executes a statement such as SELECT, UPDATE or CREATE TABLE, a transaction automatically starts and does not stop before being rolled back or committed. SQL allows programmers to decide the access mode, the diagnostics size, and the isolation level of transactions. Crash recovery is another major concern for database administrators. The atomic-writes assumption is introduced as the foundation of the crash recovery problem, and the ARIES recovery algorithm is introduced as a steal, no-force approach.
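
The transaction characteristics mentioned here can be set roughly as follows (standard SQL syntax; DIAGNOSTICS SIZE is part of the standard but rarely supported in practice, so it is omitted; the Sailors table is hypothetical):

    SET TRANSACTION READ ONLY,                  -- access mode
        ISOLATION LEVEL REPEATABLE READ;        -- isolation level
    SELECT COUNT(*) FROM Sailors WHERE rating = 8;
    COMMIT;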

Part 2: Contributions

The book clearly shows the development of transactions in the literature. Most major concepts are mentioned along the way while introducing the four ACID properties of transactions, which gives readers a great overview of what transactions are like. Concurrency control, atomicity guarantees, performance improvement and the crash recovery mechanism are discussed in a clear order, which gives a perfect summary of transaction management.

Part 3: Possible drawbacks

The atomic disk write assumption may not hold in practice, and we may need some more complex mechanism to verify recently written pages. Also, approaching the end of the chapter, the paragraphs become short and arguably vague. Many concepts appear with “referring to Chap 18.6” etc. This may result in a cycle of learning difficulty. If the reader does not know ARIES and the log mechanism, he or she may be confused by the concept of rollback.



Review 16

Chapter 16 of the “Database Management Systems by Ramakrishnan and Gehrke” book provides an overview of transaction management. Transaction management is important because database applications involve access to shared database objects, which might lead to inconsistency if not properly controlled. The chapter discusses the ACID properties of a transaction, serializability of the execution of multiple transactions, the usage of locking, and the performance penalty of locks and other related issues. ACID refers to four properties of a transaction: atomicity, consistency, isolation and durability. Each of these properties must be ensured by a DBMS for it to work correctly in the face of concurrent access and system failures. A DBMS interleaves actions of different transactions to improve performance, but some interleavings might lead to a violation of isolation. In order to ensure isolation, a concurrent execution of transactions should have a serializable schedule, i.e., a schedule whose effect on any consistent database instance is guaranteed to be identical to that of some complete serial schedule. If serializability is not maintained, execution of concurrent transactions might lead to an inconsistent database. Interleavings which might lead to inconsistency include: reading uncommitted data, unrepeatable reads and overwriting uncommitted data.

DBMSs usually use a locking protocol to ensure that only serializable and recoverable schedules are allowed. The most widely used protocol is called Strict Two-Phase Locking (Strict 2PL). This protocol has two rules. The first rule requires a request for a shared (exclusive) lock for a read (write) action on an object. The second rule requires that all locks held by a transaction be released only when the transaction is completed. The protocol only allows safe interleavings. For example, if two transactions access completely independent objects of a database, they can concurrently obtain locks and proceed freely. However, any scheduling which leads to anomalies is forbidden. One other concept discussed in the chapter is deadlock. A deadlock can be created when multiple transactions hold locks on different objects which are in turn requested by each of them to make forward progress. Such deadlocks can be identified using a timeout mechanism and resolved by aborting a transaction and hence releasing its objects. One thing to note is that locking doesn't come for free. It adds delay due to blocking, as some transactions have to wait until the locks are released by the current holder.

In addition, the chapter discusses transaction support in SQL. A SQL transaction is automatically started when executing a SQL statement such as a SELECT, UPDATE or CREATE statement. Any additional statements can be executed as part of this transaction until a COMMIT or ROLLBACK statement is executed. One important feature of SQL is the savepoint, which allows the programmer to identify a point in a transaction and selectively roll back operations carried out after this point. This allows rolling back flexibly over several savepoints, as opposed to transaction-based rollback, which only allows undoing the entire current transaction. Another specific to consider in SQL is what the DBMS treats as an object for setting a lock. The DBMS could set a lock on the entire table or on a set of rows. The latter approach is common, as it provides better concurrent access to different rows in the same table. Furthermore, the chapter discusses different transactional characteristics specific to SQL, including access mode and isolation level. For example, a “READ ONLY” access mode allows multiple transactions to concurrently read an object as long as they don't modify it. And the SERIALIZABLE isolation level is the safest and recommended setting. However, for a statistical query which tolerates a few incorrect values, the “READ COMMITTED” level could provide better system performance.

The main strength of the book chapter is that it explains the basic overview of transaction management in a simple and detailed ways. It uses abstract examples to clarify concepts involved in transaction system and control before getting into specific support provided by SQL.

The main drawback of the chapter is that most of the discussion is based on abstract transaction examples. It could have been better if more SQL query examples were added and the discussion were balanced. The authors give a detailed explanation of transaction management using abstract examples and devote a smaller proportion to discussing the specific implementation in SQL. In addition, they cover only the Strict 2PL protocol and fail to discuss other kinds of protocols. For example, there are various flavours of the 2PL protocol, like simple 2PL and strong strict 2PL. It could have been better if they had added a discussion section to identify which protocol is more appropriate for SQL transaction control systems.



Review 17

In this chapter, an overview of transaction management is provided. The concept of a transaction is critical to concurrent execution and system recovery in a database management system. During concurrent transactions, the state of the database is required to be valid all the time, where valid means that the state is reachable through some serial execution of the actions. The properties of database transactions are summarized as ACID: atomicity, consistency, isolation and durability.

It is surprising, but the chapter says that the users are responsible for ensuring transaction consistency. The DBMS is not able to detect inconsistencies due to logic errors in users' programs. For example, transactions are assumed not to interact outside the database, but the database cannot make sure of that; users must. Database consistency follows from transaction atomicity, isolation, and transaction consistency.

Then the paper covers the topics of the schedule, the list of actions, and locks, which are topics raised by concurrent execution. Locks are used to protect shared resources. There are three kinds of anomalies to be solved with different kinds of locks: write-read, read-write, and write-write conflicts. Strict two-phase locking is then introduced. It is safe, but at a cost in performance. Also, by using locks, we need to face the problem of deadlocks. We must decide what to lock and what kind of lock (and isolation level) to choose, depending on the transactions we are performing.

As you might have noticed, it is a fairly old textbook. A lot of problems to be solved here are shared in operating system and parallel computing. The chapter is easy to understand and has a good flow with plenty of examples and explanations.


Review 18

This chapter provides an overview of the transaction management component in modern database systems.
For performance reasons, DB management systems have to interleave several transactions. This can cause consistency problems, so transactions must guarantee four important properties: A - atomic, C - consistent, I - isolation and D - durable. To keep these four properties while allowing concurrency at the same time, the transaction management component must face different types of anomalies and be able to recover from them. Anomalies happen because reads and writes touch the same data, and include the cases of 1) reading uncommitted data, 2) unrepeatable reads, and 3) overwriting uncommitted data. The chapter then shows that concurrency and ACID can be achieved using locks. Strict Two-Phase Locking is the most important locking protocol, and it has two rules: 1) request a lock before access, and 2) release locks only once the transaction finishes. Lock usage in SQL is also discussed. Finally, crash recovery is introduced to ensure atomicity and durability. Two notions called stealing frames and forcing pages are discussed, and then a recovery algorithm called ARIES is mentioned.

Good:
This chapter of the book introduces many useful techniques and concepts that are needed to make transaction management possible.

My favorite part is transaction support in SQL; this part gives a glimpse of the actual usage of transaction management. And it shows how the transaction manager connects to other parts of the database.

Weakness:
There is one thing I wish to know more about: deadlock. Deadlock also happens in operating systems, and there, there are other ways to solve deadlocks, such as resource ordering: each resource is assigned a number, and an application must request resources with smaller numbers before it can request resources with larger numbers. I wish to know if the same approach applies to DBMSs, but the paper doesn't talk about this.



Review 19

This chapter introduces the fundamental ideas of transaction management and also gives an introductory glance at crash recovery.

The four properties that a database management system must guarantee are ACID, namely:
1. Atomicity – Either all actions are executed or none, but there are no incomplete transactions
2. Consistency – Each transaction running is responsible for maintaining consistency
3. Isolation – Users should be able to understand a transaction on its own, even if the DBMS is internally interleaving the steps of several transactions
4. Durability – Once a transaction is completed, the effects must persist even if the system crashes.

The isolation property is guaranteed by ensuring the net effect of interleaved transactions is exactly the same as it would have been if they were run in a serial manner. A DBMS ensures atomicity by undoing actions of incomplete transactions. It also maintains a record, called the log, which in the event of a system crash is used to restore the changes that were not written to disk before the crash.

The chapter also introduces an important concept, the serializable schedule, whose effect is guaranteed to be identical to that of some complete serial schedule. But for that, the following problems must be averted: reading uncommitted data, resulting in WR conflicts; unrepeatable reads, resulting in RW conflicts; and overwriting uncommitted data, a write-write conflict which can result in a lost update.

The DBMS uses strict 2-phase locking in order to avoid these problems. This involves the transaction requesting a shared/exclusive lock on an object that it wants to read/write. While one transaction holds an exclusive lock, no other transaction can hold a shared or exclusive lock on that object. Also, all locks that are held by a transaction are released when the transaction is completed. In the example, the authors have interleaved the read-write for A and B by T1, followed by the read-write for A by T2; under plain (non-strict) 2PL, T2 could get an exclusive lock on A the moment T1 moves on to B, but under strict 2PL it must wait until T1 completes.

In order to avoid inconsistency and performance issues from locking the whole table, the DBMS can change the granularity at which objects are locked. The DBMS gives the programmer control over the following values: access mode, diagnostics size and isolation level.
1. Access mode – If the access mode is read only, the transaction is not allowed to modify the database; this increases concurrency
2. Isolation level – Controls the extent to which the current transaction is exposed to other transactions. The levels are read uncommitted, read committed, repeatable read and serializable, in increasing order of isolation. The phantom problem can occur at all but the serializable level of isolation.

Even though the authors have mentioned that read uncommitted can be used in the case of a statistical query with less overhead, I believe that if there happen to be multiple transactions running and modifying different values, even a statistical query will not return accurate results at the read uncommitted level.

One of the last important parts of this chapter is the ARIES algorithm. It involves a three-phase crash recovery: the analysis phase, which identifies the dirty pages in the buffer and the active transactions at the time of the crash; the redo phase, which redoes the actions identified in the previous step using the log; and the undo phase, which undoes the actions of uncommitted transactions.

A few real-world examples of the different concepts would have helped make this chapter a lot more engaging. Overall, this chapter provides a lucid beginner-level explanation of transaction and crash recovery theory.





Review 20

This chapter serves as brief introductory material on the transaction management issues in a DBMS. It covers the topics of ACID, transactions and schedules, concurrent execution using locks, and transaction support in SQL.

Firstly, the concept of ACID (atomicity, consistency, isolation and durability) is introduced. ACID is mainly a conceptual requirement for a DBMS, and the four properties are not unrelated to each other. Together, they require that the DBMS treat each transaction atomically and independently, and that the whole system present a consistent and stable view to the user. That is to say, a transaction can either be committed or aborted, not half done. And the system should keep a consistent view after a transaction aborts or the system crashes.

Then, it reveals the fact that each transaction is executed as a series of reads and writes on database objects, and that a schedule is a list of actions from a set of transactions. In order to improve database efficiency, transactions are usually executed concurrently. Some of the concurrent executions are serializable; others may have WR, RW and WW conflicts.

The lock-based strict 2PL can help prevent the above anomalies. Basically, strict 2PL requires:
If a transaction wants to read (write) an object, it first requests a shared (exclusive) lock on the object.
All locks are held until the transaction is completed.
In this mechanism, deadlocks are common under high system concurrency; they can be handled by deadlock detection (and resolution) or by a timeout method. The system bottleneck is mostly caused by blocking; if the DBA wants to avoid thrashing, he can lock smaller objects, reduce the time transactions hold locks, and reduce hot spots.

In terms of transaction support in SQL, there are savepoints and chained transactions to support long-running and multiple transactions. This chapter further talks about the phantom problem, associated with the question of what object to lock. Transaction characteristics like access mode, diagnostics size and isolation level are also briefly mentioned. In the end, crash recovery, related to the stealing and no-forcing rules, is discussed.
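
A chained transaction, one of the SQL features mentioned here, can be sketched as follows (COMMIT AND CHAIN is standard SQL, though support varies across systems; the Accounts table is hypothetical):

    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    UPDATE Accounts SET balance = balance - 10 WHERE owner = 'A';
    COMMIT AND CHAIN;   -- commit, then immediately start a new transaction
                        -- that inherits the same characteristics
    UPDATE Accounts SET balance = balance + 10 WHERE owner = 'B';
    COMMIT;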



Review 21

In this chapter, the book talks about transactions in the database and the techniques for maintaining the properties of transactions.
A transaction is a series of read and write actions and should have atomicity, consistency, isolation and durability. The book then talks about the serializable schedule, which is an interleaving of actions from a set of transactions that has the same final state as executing the transactions serially. The DBMS wants to interleave the actions of different transactions to provide better performance (throughput and response time), but an interleaved schedule of transactions can lead to anomalies with unserializable results - WW, WR, and RW conflicts. Another problem introduced by interleaving transactions is schedules involving aborted transactions. The DBMS should only allow recoverable schedules, that is, if a transaction reads an uncommitted object, it should not commit before the transaction that wrote it commits.
Then the book talks about strict 2PL. (1) If a transaction wants to read/write an object, it first requests a shared/exclusive lock on the object. (2) All locks held by a transaction are released when the transaction is completed. This strategy provides serializability and avoids cascading aborts (ACA). It also discusses the deadlock problem.
Then the book talks about the performance of transactions and transaction support in SQL. SQL uses COMMIT, plus ROLLBACK and SAVEPOINT. It also talks about the trade-off in the size of the objects being locked.
Lastly, the book gives an overview of recovery in a DBMS and introduces some concepts like stealing, forcing and ARIES.

This book is the textbook for 484. One reason I like this book is that it introduces all the important knowledge in an easy-to-understand way without going too deep. This is a good book for people first learning about DBMSs.
The drawback of this chapter is that its structure is a little disorganized, which made me confused when I first read this chapter in 484. Another drawback of this PDF is the many typos (although I think they do not have too much influence on reading).


Review 22

The chapter provides background on the ACID guarantees for transactions and mentions disclaimers about what may actually happen within the database. For instance, transactions are isolated from each other, but they may still occur concurrently; however, from the user's perspective it should not matter that they do. The author discusses transaction schedules and concurrent execution, justifying the need for concurrent execution by arguing for its performance benefits as well as its ability to relieve head-of-line blocking through interleaved execution. The author also discusses serializability, conflict types, and locking. Much of the locking discussion here overlaps with the reading of Chapter 17. The author ends with a discussion of locking as it relates to SQL.

The author clearly presented issues related to transaction management in databases. However, I would have liked to see justification for the ACID guarantees and whether other sets of guarantees have been considered in the past and why most databases provide ACID guarantees today.


Review 23

This paper introduces transactions and how transactions work in relational databases. Transactions are important because certain functions need multiple actions to complete, leaving the database in an inconsistent state in between. The problems with this are apparent when multiple users are connected and executing the same query: these users will ultimately get inconsistent results. To solve this problem, transactions were introduced.

The paper starts off by introducing the properties that all database management systems should preserve: atomicity, consistency, isolation and durability. It then introduces the concept of serializability and how certain schedules will produce inconsistent values. For example, if a transaction reads a value and then another transaction changes that value, the value that the first transaction read is now outdated. To maintain concurrency and consistency, most databases use a lock-based control system called strict two-phase locking. It has these two main rules:

1. A shared lock must be acquired for all reads. An exclusive lock must be acquired for all writes.
2. All locks must be released when the transaction is completed.

The paper then explains the granularity at which databases should lock and what problems might arise, such as the phantom problem. It then finishes with how transactions should deal with crashes and how the system should recover.

Overall, the paper does a great job of describing the ins and outs of transactions and why we need transactions. It also considers most of the problems developers might run into when using transactions. However, the paper does have some limitations:

1. When explaining thrashing, the paper gives three possible solutions to prevent this problem. It does not explain the pros and cons of the solutions though. I would have liked to see how modern database management systems deal with this problem.
2. The paper introduces only one type of locking: strict two-phase locking. Are there other solutions to the concurrency problem and why is strict two-phase locking the solution that most databases ended up choosing?



Review 24

This chapter of the textbook covers the basics of transaction management in a DBMS. Database managers typically want many programs to be able to interact with their database concurrently in order to improve performance. This can cause a host of problems if not managed properly. Transactions are intended to ensure that many programs can run concurrently while keeping the database in a consistent state that is equivalent to some state that could be achieved by a serial ordering of all transactions.

In order to accomplish this, transactions implement the ACID properties. These properties guarantee that transactions:

1. Effectively execute either all or none of their actions
2. Preserve the consistency of the database in regards to user-defined constraints
3. Interact with other transactions only by reading and writing shared objects
4. Have persistent effects

DBMSs implement lock-based concurrency control in order to prevent problems that can arise from RW, WR, and WW conflicts. In strict two-phase locking, transactions obtain shared locks on objects before reading them, and exclusive locks on objects before writing them. All locks held by a transaction are released when the transaction commits. This scheme prevents inconsistencies that might arise when transactions read changes made by other uncommitted transactions.

While two-phase locking guarantees serializability, system performance can suffer if transactions block while waiting to obtain a lock on an object that is currently being modified by another transaction. To mitigate this, DBMSs offer several levels of transaction isolation, ranging from fully serializable down to allowing transactions to read the changes of uncommitted transactions (albeit only in read-only mode). This allows a DBA to optimize performance in cases where strict accuracy is unnecessary.
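
As a hedged illustration of picking a level per transaction, the sketch below assumes a PostgreSQL database reachable through the psycopg2 driver; the connection string and the Sailors table are invented, echoing the chapter's average-age example:

    # Hypothetical sketch: selecting an isolation level per transaction.
    # Assumes PostgreSQL and psycopg2; "dbname=demo" and the Sailors
    # table are invented for illustration.
    import psycopg2

    conn = psycopg2.connect("dbname=demo")
    cur = conn.cursor()

    # A weaker level is fine for an approximate, read-only statistic.
    cur.execute("SET TRANSACTION ISOLATION LEVEL READ COMMITTED")
    cur.execute("SELECT AVG(age) FROM Sailors")
    print(cur.fetchone())
    conn.commit()

    # The strongest level is warranted when exact results matter.
    cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
    cur.execute("UPDATE Sailors SET rating = rating + 1 WHERE age < 30")
    conn.commit()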

I wish the chapter had spent a little more time giving examples of when the different levels of transaction isolation would be used. It provides the example of finding the average age of sailors as a case in which a few inaccuracies will not greatly affect the results of the query, but offers no further explanation of when one might want to run at the READ COMMITTED or READ UNCOMMITTED levels.



Review 25

The article talked about transaction management, an important database topic for concurrency control and recovery from system failure in a DBMS. For performance reasons, a DBMS has to interleave the actions of several different transactions. In order to handle concurrent executions, concurrency control and recovery become essential topics in database systems design. Thus, this article introduced some properties and approaches, including the ACID properties and lock-based concurrency control.

First, the most important properties of transactions are the ACID properties, which stand for atomicity, consistency, isolation, and durability, as follows.
1. Atomicity: Either all actions in a transaction are carried out or none are.
2. Consistency: If the transaction runs against a consistent database, it will leave the database in a consistent state. Consistency here means that user-specified properties hold in the database.
3. Isolation: Users should be able to understand a transaction without considering the effects of other transactions.
4. Durability: If a transaction is completed, its effects should persist even if the system crashes.
As a note, maintaining consistency is the responsibility of users. All concurrency control and recovery algorithms exist to preserve the ACID properties of transactions.

Second, the article talked about lock-based concurrency control, including strict two-phase locking (strict 2PL) and two-phase locking (2PL). Before presenting these two methods, the article introduced some anomalies due to interleaved execution, namely WR, RW, and WW conflicts. In the strict 2PL approach, if a transaction wants to read (or write) an object, it must first request a shared (or exclusive) lock on the object, and all locks held by a transaction are released only when the transaction completes. The strict 2PL approach allows only serializable schedules.

To sum up, the article provides a good description of transaction properties and lock-based concurrency control, which are important topics when dealing with interleaved transactions. The article has many examples to help illustrate these ideas, making them clearer for readers to understand.



Review 26

This “paper” is an overview of transaction management: what is needed and how it is accomplished. The high-level overview of what is needed for transactions is covered by the acronym ACID, which stands for Atomicity, Consistency, Isolation, and Durability. Atomicity means that either all actions are carried out or none are; the transaction can be considered one step. Consistency means that each transaction, run by itself on a consistent database, leaves the database consistent. Isolation means transactions are protected from the effects of other transactions (that may even be happening concurrently). Durability means that even if the system crashes before all changes are on disk, the effects still persist (and on restart they will be restored).

There are different levels of isolation that transaction management systems can offer: read uncommitted, read committed, repeatable read, and serializable. Serializable is the strictest level of isolation (so it is the safest but also the hardest to ensure). A schedule of two transactions is serializable if interleaving T1 and T2 has the same effect as T1 completing and then T2 starting and completing (or vice versa).

There are certain conflicts that challenge making two transactions serializable: write-read conflicts, read-write conflicts, and write-write conflicts. Write-read conflicts happen when one transaction writes to a variable and another reads from that variable before its new value is committed; in the event of a crash this can lead to inconsistency, depending on the order of events. Read-write conflicts occur when a variable is read by one transaction, a separate transaction then writes to that variable, and the original transaction re-reads the variable, which now has a different value; this is also known as an unrepeatable read. Lastly, write-write conflicts, as you would imagine, occur when two transactions both write to a variable and the final value depends on the order of the writes.
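
The lost-update flavor of these conflicts is easy to reproduce outside a DBMS; the toy Python sketch below (invented, not from the paper) shows two interleaved "transactions" that both read the same balance before either writes:

    # Toy sketch of a lost update caused by interleaving; all names invented.
    balance = 100

    t1_read = balance              # T1 reads 100
    t2_read = balance              # T2 reads 100 before T1 writes
    balance = t1_read + 50         # T1 writes 150
    balance = t2_read - 30         # T2 writes 70: T1's update is lost
    print(balance)                 # 70, not the serial result of 120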

To prevent some of these problems, locking is done, and one popular protocol is “Strict Two-Phase Locking” (Strict 2PL). Strict 2PL requires that if a transaction wants to read a variable it must hold a shared lock on it, and if it wants to write a variable it must hold an exclusive lock on it. All of these locks are released upon commit, ensuring that the whole transaction saw valid and consistent data. The downside is that it can produce deadlocks: imagine a scenario where T1 is going to write A and read B, while T2 is going to write B and read A. If T1 locks A while T2 locks B, they will wait forever for each other, as neither can finish because it cannot obtain the lock it needs.
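
A DBMS can break such a cycle by detecting it; this scenario maps onto a cycle in a waits-for graph, which the hypothetical sketch below (invented names, not from the paper) finds with a depth-first search:

    # Hypothetical waits-for graph cycle detection; a cycle means deadlock.
    def has_deadlock(waits_for):
        visited, on_stack = set(), set()

        def visit(txn):
            visited.add(txn)
            on_stack.add(txn)
            for nxt in waits_for.get(txn, ()):
                if nxt in on_stack or (nxt not in visited and visit(nxt)):
                    return True
            on_stack.discard(txn)
            return False

        return any(visit(t) for t in waits_for if t not in visited)

    # T1 holds A and waits for B; T2 holds B and waits for A.
    print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))  # True -> abort one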

The paper then goes more in depth on SQL transaction support, which, in summary, confirms that SQL supports transactions as you would expect.

Lastly, the paper touches on the crash recovery process. There is a log file that is very important for crashes, and it must be guaranteed that changes are recorded in the log file before they are made on disk and committed. This allows reading the log file after a crash and essentially replaying the events so that this time they will complete and can be committed. Every so often all the changes are pushed to disk and the log file is cleared. The paper also touches on the two main policies for writing pages, stealing and forcing. Stealing means that a transaction's change can be written to disk before it commits (when a dirty page from the buffer pool is evicted by another transaction), and forcing means that when a transaction commits, all its changes in the buffer pool must be forced to disk.
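
A minimal sketch of the write-ahead rule described here, with invented names: the log record must reach stable storage before the changed page does, which is what makes stealing safe:

    # Hypothetical write-ahead logging sketch; names are invented.
    log = []            # stands in for the log file on stable storage
    disk_pages = {}     # stands in for data pages on disk

    def steal_page(txn, page_id, old_value, new_value):
        log.append((txn, page_id, old_value, new_value))  # log record first
        disk_pages[page_id] = new_value                   # only then the page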

I can’t really think of a downside to this paper, as it is just meant to be an overview of and introduction to transactions… I suppose it doesn’t touch much on current issues in transaction management or go very deep into solutions, but since it’s an overview and introduction I don’t blame it.

One clear strength of this paper to me is the usage of examples. It did a great job of breaking up the text blocks to give examples, which were very helpful when it came to scheduling and conflict descriptions.



Review 27

This “paper” gave a broad introductory description of transaction management and a variety of the mechanisms involved, with SQL used as an example for many of the topics discussed. The chapter goes over very basic concepts, such as transactions maintaining ACID properties or how transactions can be represented as finite state machines, as well as more involved scenarios such as thrashing, crash recovery, or lock-based protocols.

Specifically, the variety of concurrency control protocols, broadly divided into lock-based and timestamp protocols, was fascinating to read about, as the two families address concurrency differently (lock-based protocols manage the order between pairs of transactions, while timestamp protocols generally allow for immediate transaction execution). One of the more interesting parts of this chapter that I had not learned about before was the “phantom” problem, where a transaction can retrieve different sets of results for the same query even though none of the tuples it originally read were modified, because other transactions inserted new qualifying tuples.

One thing that would make the reading more interesting would be to mention some of the more advanced problems in transaction control and concurrency management, as well as current approaches to solving these problems (e.g. next-key locking for addressing the phantom problem). In addition, it would be interesting to see a description of the cost/benefit tradeoffs of certain transaction management schemes over others, especially when considered in real-world industry contexts.


Review 28

This chapter covered the basics of DBMS transaction management. Much of the chapter discussed material in relation to the famous ACID properties. One particularly interesting part of this chapter was the explanation of locking performance.

At a high level, locks are just used to resolve conflicts between two entities trying to access the same piece of data. However, coming up with efficient lock-based schemes is a more involved problem. The book states that fewer than 1% of all transactions are subject to a deadlock, but if there are many concurrent transactions executing on the same database objects, there is much opportunity for blocking delays. Thus, it is imperative to employ an efficient locking system. There were three key points to optimizing locking schemes. First, putting locks on the smallest possible objects can reduce the number of transactions that actually contend for the same lock. Second, reducing the amount of time a transaction holds a lock obviously leads to less latency. Finally, alleviating hot spots can greatly aid performance, because if a piece of data is frequently accessed, many transactions will be waiting on it.

Though it was not described in depth, the ARIES recovery algorithm was another interesting technique presented in this chapter. ARIES is a three-step recovery scheme that can help prevent data loss. The first phase (dubbed the Analysis phase) identifies pages in the buffer pool that had not been written to stable storage, alongside transactions that were active at the time of the crash. Next, it repeats all the identified actions from a point chosen in the log and restores the database to its state at the time of the crash. Finally, it undoes the actions of transactions that were not committed, to keep consistency with the actions of committed transactions.
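
To make the three phases concrete, here is a toy, hypothetical walk-through over an in-memory log; real ARIES uses LSNs, checkpoints, and compensation log records, all omitted here:

    # Toy sketch of the three ARIES phases; the log format is invented.
    log = [
        ("update", "T1", "A", 5, 6),   # (kind, txn, object, old, new)
        ("update", "T2", "B", 1, 2),
        ("commit", "T1"),
        # crash happens here: T2 never committed
    ]

    # 1. Analysis: transactions with no commit record were active at the crash.
    active = {r[1] for r in log if r[0] == "update"}
    active -= {r[1] for r in log if r[0] == "commit"}

    # 2. Redo: repeat all logged updates to restore the pre-crash state.
    db = {}
    for r in log:
        if r[0] == "update":
            db[r[2]] = r[4]

    # 3. Undo: roll back, in reverse order, updates of uncommitted transactions.
    for r in reversed(log):
        if r[0] == "update" and r[1] in active:
            db[r[2]] = r[3]

    print(db)  # {'A': 6, 'B': 1} -> T1's update kept, T2's undone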


Review 29

This chapter elaborates on transaction management in database systems. It discusses basic aspects such as the ACID properties (Atomicity, Consistency, Isolation, Durability) of transactions and transaction scheduling. Then, it touches on concurrent transaction execution through interleaving (a strategy to improve performance by better utilizing the CPU and I/O) while maintaining schedule serializability, as well as the risks of interleaved execution (WR, RW, and WW conflicts). This concurrency control leads to the need to ensure serializability and to make sure that no actions of committed transactions are lost while undoing aborted transactions. In this case, it is important to recognize which portions of a transaction can safely be interleaved with other transactions. Implementing lock-based concurrency control assures the DBMS that, even though the actions of several transactions might be interleaved, the net effect is identical to executing all transactions in some serial order.

This chapter in particular discusses Strict Two-Phase Locking (Strict 2PL) as the most widely used locking protocol. In Strict 2PL, only interleavings of “safe” transactions are allowed: if two transactions access the same object and one of them wants to modify it, their actions are effectively ordered serially. The downside of locking is deadlock, which rarely happens in general but is still an issue, and the chapter briefly discusses how to detect deadlocks. Another downside is the overhead caused by blocking, which happens more often as more transactions access the same object. Fortunately, this chapter also gives tips on how to reduce blocking (by locking the smallest objects possible and by reducing the time that transactions hold locks). Last, it talks about transaction support in SQL (using SAVEPOINT and ROLLBACK), how to choose which objects to lock (a choice that gives rise to the phantom problem), and transaction characteristics in SQL (READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE).
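
For a concrete taste of the SAVEPOINT/ROLLBACK support mentioned above, here is a small sketch using Python's standard sqlite3 module; the table, values, and savepoint name are invented:

    # Hypothetical SAVEPOINT / ROLLBACK TO SAVEPOINT sketch (sqlite3).
    import sqlite3

    conn = sqlite3.connect(":memory:", isolation_level=None)  # manual txns
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO accounts VALUES (1, 100)")

    conn.execute("BEGIN")
    conn.execute("UPDATE accounts SET balance = balance - 40 WHERE id = 1")
    conn.execute("SAVEPOINT before_fee")
    conn.execute("UPDATE accounts SET balance = balance - 10 WHERE id = 1")
    conn.execute("ROLLBACK TO SAVEPOINT before_fee")  # undo only the fee
    conn.execute("COMMIT")

    print(conn.execute("SELECT balance FROM accounts").fetchone())  # (60,)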

This chapter is important because transaction management is one of the ways to optimize performance while still maintaining the ACID properties of transactions in the face of concurrent access and system failure. In a DBMS, performance optimization is one of the main goals. To achieve this, it is inevitable that the DBMS has to interleave the actions of several transactions. Yet the DBMS also has to make sure that the database is always in a consistent state, with consistent data, for every user. Thus, knowledge about the nature of transactions, the risks of interleaved execution, and lock control is needed to manage transactions.

While this chapter is great at covering the basic aspects and some general issues of transaction management, it does not consider how the architecture of the OS and/or the database itself (e.g., a distributed database) influences transaction management. For example, how does it communicate with the I/O and OS schedulers? Also, in a distributed database architecture in which the data is physically distributed but logically united, is there any particular technique for transaction management?



Review 30

The purpose of this chapter in the textbook was to give the reader an overview of transactions in a database management system (DBMS). The chapter briefly covers the basics of what transactions are, but focuses on how a DBMS handles transactions: the properties that the DBMS must check and the properties that are left to the user to check (in the latter case, for example, the user is responsible for asserting that a transaction is consistent, i.e., that it will always leave the database in a consistent state upon its completion and subsequent commit).

This chapter doesn’t really have any technical contributions to the field; rather, it is a survey and expositional work on transactions and how they interact with a database system. The paper gives an overview of locking and how the DBMS issues locks to various transactions. It also spends a good deal of space discussing examples of the various scenarios in which transaction management can lead to errors and unresolvable sequences of transactions. It breaks down these problem areas into enumerated lists and subsections, and walks through specific examples so that the reader can understand various kinds of transactions, how a set of transactions could be formed into a serializable schedule, and the cases where this is not possible.

I do think a strength of this chapter is that it first presents a series of concepts and then presents a section discussing how those concepts are manifested in a real-world database system. It is helpful for a reader to contextualize the concepts by seeing how they might appear in a commercial database system supporting a standard such as SQL:1999 that the reader might one day use.

It is hard to ascribe weaknesses to a chapter in a textbook, as there are no technical merits to critique or analyze; thus I find I am forced to nitpick the formatting of the chapter. While I think that it does work through the sections methodically and covers a wide range of topics regarding transactions, I feel that parts of the chapter are quite redundant. This might just be an artifact of the textbook, but I do wish it were a more dynamic and engaging read.



Review 31

Paper Title: Chapter 16: Overview of Transaction Management
Reviewer: Ye Liu

Chapter Summary:
This chapter focuses on the discussion of transactions. Transactions are the key to concurrent execution and recovery from system failure in a DBMS. A transaction is any execution of a user program in a DBMS, and it differs from an execution of a program outside the DBMS.

The chapter starts by introducing the ACID properties, along with the principles and responsibilities on both the users' and the DBMS's sides, including consistency and isolation: users are responsible for keeping each transaction consistent, while the DBMS ensures that the result of interleaved execution is independent of the order in which transactions are scheduled.

Next, the chapter reviews the concept of transactions and covers the concept of schedules. A schedule is a list of actions from a set of transactions. For a schedule to be complete, it must include a commit or abort action for every transaction in it, so that it can be reasoned about for recovery in case of a system failure.

The following part discusses action conflicts. The concepts are similar to those introduced in the operating systems literature, where reads and writes may conflict when the system attempts to parallelize the execution queue.

The chapter then discusses the lock mechanism for concurrency control. It explains some implementations and their potential issues, such as deadlock, in detail with examples.

In the end, a substantial part of the chapter focuses on transaction support in SQL. It provides a detailed discussion of how transactions can be used in SQL, while referring back to the previously mentioned issues and examples.



Review 32

Summary:
In this chapter, the book gives an overview of transaction management. It covers the basic concepts of transactions as well as the implementations and techniques behind them, such as crash recovery and lock-based concurrency control. The concepts covered in this chapter can be summarized as follows:
1. Properties of transactions: ACID
1) Atomicity
a) A user can think of a transaction as executing all of its actions or none of them.
2) Consistency
a) A transaction moves the database from one consistent state to another consistent state
b) Users who submit transactions are responsible for transaction consistency
c) Database consistency follows if every transaction preserves consistency
3) Isolation
a) Transactions are protected from the effects of other concurrently executing transactions
4) Durability
a) Once the DBMS informs the user that a transaction is completed, its effects persist even if the system crashes before all changes are written to disk.
2. Transactions and schedules
1) For consistency maintenance, transactions interact with each other only via database reads and writes; no messages are exchanged between them.
2) A transaction ends by either committing or aborting
3) A schedule is a list of actions (read, write, commit, and abort) from a set of transactions that preserves the order between any two actions of the same transaction
3. Concurrent execution of transactions
1) Motivation for concurrent execution
a) Improve throughput. For example, disk I/O and CPU activity can proceed in parallel to improve performance.
b) Reduce latency for short transactions. Interleaving avoids the possibility of a short transaction waiting behind a very long one.
2) Serializability
a) A schedule over a set of transactions S is serializable if its effect on the DBMS is the same as that of some complete serial schedule over S.
b) If multiple transactions are submitted concurrently, different serial orders may produce different results.
3) Conflicts due to interleaved executions
a) Two actions on the same object conflict if at least one of them is a write
b) Although a transaction must leave the database consistent when it ends, the database may be temporarily inconsistent while the transaction is running; interleaving during this window is what can expose inconsistent state.
c) WR conflict: aka dirty read; reading data that was written by another uncommitted transaction.
d) RW conflict: aka unrepeatable read; writing data that was previously read by another uncommitted transaction, so the schedule cannot be serialized.
e) WW conflict: aka lost update; writing data that was previously written by another uncommitted transaction, so one of the updates is lost.
4) Schedules Involving Aborted Transactions
a) Recoverable: transactions commit only after all transactions whose changes they read have committed.
b) Cascading aborts are avoided if transactions read only the changes of committed transactions.
4. Lock-based concurrency control
1) Only serializable, recoverable schedules should be allowed, and no actions of committed transactions should be lost while undoing aborted transactions. This can be achieved using locks.
2) Strict Two-Phase Locking (Strict 2PL) is the most commonly used locking protocol in transaction management; it assures serializability and recoverability. It has the following rules:
a) If a transaction T wants to read (respectively, write) an object, it first requests a shared (respectively, exclusive) lock on the object.
b) All locks held by a transaction are released when the transaction completes.
3) Deadlocks
a) According to the book, a deadlock can happen when a cyclic wait arises between two transactions.
b) One way to identify deadlocks is a timeout mechanism.
c) When a deadlock happens, the DBMS aborts one of the transactions and restarts it later.
5. The book introduces several ways to reduce blocking and boost performance:
1) By locking the smallest objects possible
2) By reducing the time that transactions hold locks
3) By reducing hot spots among database objects
6. Transaction support in SQL
1) The DBMS allows users to create and terminate transactions, with support for:
a) SAVEPOINT
b) ROLLBACK TO SAVEPOINT
c) Chained Transaction
2) To avoid the phantom problem, the DBMS needs to lock the whole table or, via an index, all rows of the table that could satisfy the query.
7. Crash Recovery in transactions:
1) A recovery manager is responsible for the atomicity and durability of the system.
a) It ensures atomicity by undoing the actions of transactions that have not yet committed. This is achieved by rollback functionality.
b) It ensures durability by maintaining a log so that all committed changes survive a system crash or media failure.
2) Stealing frames and forcing pages
a) Stealing frames: if a frame in the buffer pool can be written to disk before the transaction that modified it commits, the steal approach is used.
b) Forcing pages: if, when a transaction commits, all of its changes in the buffer pool are forced to disk, the force approach is used.
c) For performance reasons, most DBMSs use a steal, no-force approach.
3) ARIES: a recovery algorithm with 3 phases:
a) Analysis phase: identifies dirty pages in the buffer pool and transactions active at the time of the crash
b) Redo phase: repeats all actions, starting from a checkpoint, and restores the database state to what it was at the time of the crash
c) Undo phase: undoes the actions of transactions that did not commit

Strengths:
1. This book gives a comprehensive and comprehensible review of transactions. It covers most of the concepts and algorithms related to transactions.
2. The book uses great examples to illustrate the concepts, helping its readers understand the material much more easily.

Weakness:
1. It would be more interesting if the book showed some details of how real-world DBMSs implement transactions.





Review 33

This article talks about the concept and implementation of database transactions. A transaction is a set of operations that is seen as atomic from the DBMS's perspective. To ensure the ACID properties of transactions, lock-based concurrency control is used. The DBMS also implements the ARIES algorithm for crash recovery.

An important job of the DBMS is to ensure Atomicity, Consistency, Isolation and Durability (ACID). Users are responsible for consistency. Atomicity and isolation are guaranteed by concurrency control. The DBMS also provides crash recovery to ensure durability even if the system crashes.

Transactions can be represented as a set of read/write actions in a schedule. The DBMS has a global view of the sequence in which actions happen. Executing those actions without any protection could cause severe problems. For example, suppose transaction A writes to data item X, transaction B then reads X, and transaction A subsequently aborts. In this case transaction B has read inconsistent data: the value of X that B should have seen is the one before the write. By applying a locking mechanism, the DBMS prevents interleaved actions from different transactions from affecting each other. The most widely used locking protocol is Strict Two-Phase Locking, which uses read/write locks to protect data and releases those locks only when the transaction completes. The performance of locking is largely determined by delays due to blocking; to prevent the system from thrashing, the DBMS limits the number of active transactions at any time.
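
The thrashing guard at the end of this paragraph amounts to admission control; a minimal hypothetical sketch (invented names and threshold) is:

    # Hypothetical admission-control sketch: cap active transactions.
    import threading

    MAX_ACTIVE = 20                       # assumed tuning knob
    admission = threading.Semaphore(MAX_ACTIVE)

    def run_transaction(work):
        admission.acquire()               # wait if too many are active
        try:
            work()                        # do the reads/writes under locks
        finally:
            admission.release()           # admit a queued transaction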

To provide crash recovery, the ARIES recovery algorithm is introduced. It consists of three steps. When the system starts after a crash, it first analyzes the log to find the dirty pages and the transactions active at the time of the crash. It then redoes all the actions from a checkpoint in the log, recovering the database to its state at the time of the crash. Lastly, it undoes the actions of transactions that had not committed.