Review for Paper: 9-Serializable Snapshot Isolation in PostgreSQL

Review 1

The authors implement a new serializable isolation level in PostgreSQL, whose strongest isolation level was previously snapshot isolation. The new level uses the Serializable Snapshot Isolation (SSI) technique to retain the benefits of snapshot isolation while guaranteeing true serializability.

The paper first introduces snapshot isolation and points out that it can allow unexpected anomalies that lead to database inconsistencies.
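A small sketch may help show the kind of anomaly meant here. The toy model below is not PostgreSQL code; the class names, the on-call doctors scenario, and the first-committer-wins rule are assumptions chosen only to illustrate write skew under snapshot isolation. Each transaction reads from its own snapshot, the two write sets are disjoint, so both commit, and the invariant "at least one doctor on call" is broken.

```python
# Toy model of snapshot isolation (hypothetical names, not PostgreSQL code).
# A transaction reads from a private snapshot and commits unless a
# concurrent transaction already committed a write to one of its keys.

class ToyDatabase:
    def __init__(self, data):
        self.data = dict(data)
        self.commit_log = []  # list of (commit_time, set of written keys)
        self.clock = 0

    def begin(self):
        # The snapshot is a copy of the committed state at start time.
        return ToyTransaction(self, dict(self.data), self.clock)


class ToyTransaction:
    def __init__(self, db, snapshot, start_time):
        self.db = db
        self.snapshot = snapshot
        self.start_time = start_time
        self.writes = {}

    def read(self, key):
        # A transaction sees its own writes, otherwise its snapshot.
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First-committer-wins: only write-write conflicts cause an abort.
        for commit_time, keys in self.db.commit_log:
            if commit_time > self.start_time and keys & set(self.writes):
                return False
        self.db.clock += 1
        self.db.data.update(self.writes)
        self.db.commit_log.append((self.db.clock, set(self.writes)))
        return True


def go_off_call(txn, doctor):
    # Application rule: only leave if another doctor stays on call.
    on_call = [d for d in ("alice", "bob") if txn.read(d)]
    if len(on_call) >= 2:
        txn.write(doctor, False)


db = ToyDatabase({"alice": True, "bob": True})
t1, t2 = db.begin(), db.begin()
go_off_call(t1, "alice")
go_off_call(t2, "bob")
results = (t1.commit(), t2.commit())
print(results, db.data)
```

Both commits succeed in this model even though no serial order of the two transactions would leave both doctors off call.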

SSI ensures serializability by running transactions with snapshot isolation and adding checks to determine whether anomalies are possible. The authors chose it for its high performance and because it resolves conflicts by aborting transactions rather than by blocking.

The authors add optimizations for read-only transactions to their SSI implementation, since many workloads contain a significant fraction of read-only queries. The optimizations are:
1. a read-only snapshot ordering optimization, enabled by new theory.
2. safe snapshots, on which read-only transactions can run without SSI overhead, and deferrable transactions, whose execution is delayed so that they run on safe snapshots.

The authors also describe how they handle the problem of potentially unbounded memory usage:
1. safe snapshots and deferrable transactions (reduce the impact of long-running transactions)
2. granularity promotion (combine locks to reduce space used)
3. aggressive cleanup (remove unneeded transaction state)
4. summarization of committed transactions (more compact storage)
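Granularity promotion (item 2) can be sketched as follows. This is only an illustration of the idea of combining fine-grained read locks into a coarser one; the class name, the two lock levels, and the threshold value are assumptions, not details of PostgreSQL's lock manager.

```python
from collections import defaultdict

# Hypothetical threshold: more tuple locks than this on one page
# are replaced by a single page-level lock.
PROMOTION_THRESHOLD = 3


class ToyReadLockTable:
    def __init__(self):
        self.tuple_locks = defaultdict(set)  # page -> set of tuple ids
        self.page_locks = set()

    def lock_tuple(self, page, tuple_id):
        if page in self.page_locks:
            return  # already covered by the coarser lock
        self.tuple_locks[page].add(tuple_id)
        if len(self.tuple_locks[page]) > PROMOTION_THRESHOLD:
            # Promote: drop the fine-grained entries, keep one page lock.
            del self.tuple_locks[page]
            self.page_locks.add(page)

    def covers(self, page, tuple_id):
        # A page lock covers every tuple on the page, including ones
        # that were never read, so promotion can add false positives.
        return page in self.page_locks or tuple_id in self.tuple_locks.get(page, ())

    def entry_count(self):
        return len(self.page_locks) + sum(len(t) for t in self.tuple_locks.values())


table = ToyReadLockTable()
for tuple_id in range(5):
    table.lock_tuple(page=7, tuple_id=tuple_id)
print(table.entry_count(), table.covers(7, 99))
```

The trade-off in this sketch is space against precision: five tuple reads end up as one entry, but a write to any tuple on that page would now look like a conflict.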

The cost of serializability is evaluated in PostgreSQL 9.1 by comparing the proposed SSI implementation with the existing snapshot isolation level. The results show that SSI performs comparably to SI and much better than an S2PL implementation of serializable isolation.

This paper is well-structured and clear on technical details.

Review 2

Serializable isolation allows developers to write transactions as if they will execute sequentially, ignoring interactions with other transactions executing simultaneously. However, previous PostgreSQL versions did not provide serializable isolation because of the high cost of the standard two-phase locking mechanism. This paper describes the latest PostgreSQL release, which implements an extension of the SSI technique. The extension improves performance for read-only transactions. The paper first explains the differences between serializability and snapshot isolation. It then reviews previous work and presents how the new optimizations improve the performance of read-only transactions. Next, it discusses implementation details, feature interactions, and memory usage optimizations. Finally, it compares the implementation to a traditional lock-based implementation of serializability to illustrate the improvement.

Serializability simplifies the difficult problems that arise when handling concurrency. Developers can assume that their transactions run in isolation, without interference from other concurrent transactions. SSI performs better than an S2PL implementation and doesn’t require additional blocking, because it checks for anomalies and aborts transactions that might violate serializability.

The new serializable isolation level solves the performance problem of serializability and makes its performance similar to that of snapshot isolation, which makes it practical for real-world workloads. The contribution of this paper is significant because it is the first attempt to implement SSI in PostgreSQL.

Review 3

This paper used the Serializable Snapshot Isolation (SSI) technique to implement PostgreSQL’s new serializable isolation level. Previously, PostgreSQL offered snapshot isolation (SI) as its strongest isolation level, which allows anomalies that can cause unexpected behavior and leave the database inconsistent. Such concurrency issues are notoriously difficult to deal with: analyzing SI anomalies is difficult and sometimes infeasible, and static analysis to identify anomalies may not be possible when the workload includes ad hoc queries. Therefore, providing serializability in the database is an important simplification for application developers.

Other databases provided a serializable isolation level using strict two-phase locking (S2PL), which is expensive and performs poorly. S2PL would also have added blocking to PostgreSQL's MVCC system, which would be inconvenient for users. Hence, this paper used the SSI technique to implement a serializable isolation level in PostgreSQL, which offered higher performance and did not require additional blocking.

SSI runs transactions using snapshot isolation, but checks at runtime for conflicts between concurrent transactions, and aborts transactions when anomalies are possible. In order to implement it in PostgreSQL, this paper built a new SSI lock manager, along with existing multiversion concurrency control data, to detect conflicts between concurrent transactions. This paper also introduced a safe retry rule, which resolves conflicts by aborting transactions in such a way that an immediately retried transaction does not fail in the same way. This implementation addressed interactions with PostgreSQL’s support for replication systems, two-phase commit, and subtransactions. It also addressed memory usage limitations by a transaction summarization technique that ensured that the SSI implementation used a bounded amount of RAM without limiting the number of concurrent transactions.

In addition, this paper optimized read-only transactions to improve performance. It enabled a read-only snapshot ordering optimization to reduce the false-positive abort rate, and it identified certain safe snapshots on which read-only transactions can execute without any SSI overhead or abort risk. It also introduced deferrable transactions, which delay their execution to ensure they run on safe snapshots. These techniques allow certain read-only transactions to avoid the overhead of SSI by identifying cases where snapshot isolation anomalies cannot occur.

The main contributions of this paper are: 1) it is the first implementation of SSI in a production database release; 2) it is the first implementation in a database that did not previously have a lock-based serializable isolation level. The strength of this paper is that it not only came up with ideas for implementing a serializable isolation level, but actually implemented it in an existing database, with explanations of the additional issues that arose and how to solve them.

The main weakness of this paper is that the authors do not compare their PostgreSQL implementation to previous implementations in other databases such as MySQL; such a comparison would make their results more convincing.

Review 4

Problem & Motivation:
Serializable isolation lets application developers write transactions without worrying about other concurrent transactions. However, the traditional approach, two-phase locking, significantly hurts database performance. Since performance is an important metric for evaluating a database, finding a better-performing solution is essential. Therefore, the authors use the Serializable Snapshot Isolation (SSI) technique to implement a new serializable isolation level that offers better performance while still guaranteeing true serializability.

Achievement:
The paper presents the first implementation of SSI for a purely snapshot-based DBMS and shares valuable experience and techniques from the implementation (such as transaction summarization). By running transactions on snapshots and checking for conflicts between them at runtime, the database achieves better performance.

Drawback:
The overview section contains too much content. I got lost among the many new terms it introduces, and only found my footing again after reading the next section. It might help to move some of the overview content into the next section.

Review 5

Serializable isolation is an important property of database transactions, since it allows the developer to write transactions with the illusion that they will be executed sequentially, without having to worry about other concurrent transactions and the effects they might have due to race conditions. Despite these advantages, serializable isolation was not a highly sought-after feature in PostgreSQL for quite some time, mainly due to its cost. This paper describes the first time serializable snapshot isolation (SSI) was introduced into a PostgreSQL production release, and details the process involved in implementing this feature.

The authors begin by making a distinction between serializability and snapshot isolation. Snapshot isolation is a weaker isolation level than serializability, and ensures that the user sees a consistent picture of the database, that of a “snapshot” taken when the first read is done. Updates committed by other transactions after that point are not visible, and a transaction cannot update data that a concurrent transaction has updated in the meantime. This still allows anomalies such as write skew. Serializability, on the other hand, requires that the effects of the transactions be equivalent to executing them serially. SSI achieves this by running standard snapshot isolation and adding checks for possible anomalies, based on rw-conflicts between transactions. If serializability might be violated, a transaction is aborted. From there, it is up to the user to deal with aborted transactions using common methods that are outside the scope of the paper. In addition, the authors discuss some read-only optimizations put into the system, as well as the theory behind them. From there, they delve into the actual implementation challenges with PostgreSQL, including detecting, tracking, and resolving conflicts, optimizing memory usage, and handling unexpected interactions between serializable snapshot isolation and other PostgreSQL features.

The primary strength of this paper is that it demonstrates a working implementation of serializable snapshot isolation in PostgreSQL. As this had never been done for PostgreSQL, the authors had to implement various new features in order to integrate SSI, and their description of the challenges they faced was interesting to read. For a long time, PostgreSQL users did not express a desire for serializable isolation, in part due to the cost of such a feature, so getting this done was an important step toward getting the community to adopt a potentially useful feature.

The only potential weakness I can raise is the applicability of the microbenchmarks the authors ran and how representative they are of real workloads. Also, while their SSI implementation is clearly superior to the locking-based S2PL, it might be worth asking whether the feature justifies the additional 10 to 20 percent overhead.

Review 6

This paper presents Serializable Snapshot Isolation (SSI), a concurrency control technique that guarantees serializability. The paper starts by distinguishing snapshot isolation from serializability, outlining features of the latter such as its ability to avoid race conditions, since transactions behave as if they were executed in series. The SSI implementation is similar to optimistic concurrency control methods in that it does not restrict reads, but it does place restrictions on writes. Under SSI, however, a transaction reads from a snapshot of the database taken at an earlier time.

The paper introduces concepts like snapshot isolation anomalies. These are concurrency issues that show up as “dangerous structures” in the serialization graph, in which a transaction has both an incoming and an outgoing rw-antidependency. When such a structure is detected, one of the transactions involved is aborted. The paper describes the lock manager that the authors implemented and how it is used to detect read/write conflicts. They also propose a method of aborting in such a way that the transaction can be safely retried without hitting the same serialization failure again.
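On the application side, handling such an abort usually comes down to a retry loop. The sketch below is a minimal illustration, not code from the paper: `SerializationFailure`, `run_with_retry`, and `flaky_transfer` are hypothetical names, and a real client would map its driver's serialization-failure error (SQLSTATE 40001 in PostgreSQL) onto the exception used here.

```python
class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver's serialization-failure error."""


def run_with_retry(transaction_body, max_attempts=5):
    # Re-run the whole transaction from the start on each failure;
    # give up and re-raise after max_attempts tries.
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction_body()
        except SerializationFailure:
            if attempt == max_attempts:
                raise


attempts = []


def flaky_transfer():
    # Simulated transaction that is aborted twice, then succeeds.
    attempts.append(1)
    if len(attempts) < 3:
        raise SerializationFailure("could not serialize access")
    return "committed"


outcome = run_with_retry(flaky_transfer)
print(outcome, len(attempts))
```

The safe retry rule matters for a loop like this one: if the retried transaction could fail again for the same reason, immediate retries would waste work.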

I did not like how late the prior work was presented in this paper. I may just be used to seeing it in the first or second section, but I felt that it came pretty late compared to other papers we have read. However, I did like the placement of the section on PostgreSQL's current concurrency control methods, because it provided immediate context for the SSI implementation, which runs in PostgreSQL; we were able to directly compare the new method with the existing ones.

I like how this paper did not assume the reader was already familiar with the material. They eased into the material and introduced all the new terminology with a brief explanation. They also outlined the flow of the paper in the first section, and there was a generally logical flow of the information in the paper. The use of theorems and proofs was also helpful in clearly laying out the proposed ideas and convincing the reader of the ideas.

Review 7

This paper describes the implementation of the serializable snapshot isolation technique in PostgreSQL, which did not previously have a lock-based serializable isolation level. As one of its contributions, the paper also proposes an extension to SSI that improves performance for read-only transactions. The paper also achieves two firsts: 1) the first implementation of SSI in a production database release; 2) the first implementation of SSI for a purely snapshot-based DBMS.

PostgreSQL uses its weakest isolation level as the default for the sake of performance. Snapshot isolation is a weak isolation level that can be implemented efficiently using MVCC, without read locks. It has two properties: 1) all reads within a transaction see a consistent view of the database; 2) dirty reads, non-repeatable reads, and phantom reads are not allowed. However, snapshot isolation does not guarantee serializable behavior, which is a problem when data integrity is required. Though it allows several other anomalies, it is still widely used, for three reasons: 1) many workloads do not exhibit anomalies; 2) explicit locking can be used to avoid conflicts; 3) conflicts can be materialized, forcing the transactions involved to update a common item. The paper also takes the view that providing serializability in the database is an important simplification for application developers.

Serializable snapshot isolation differs from earlier approaches such as S2PL. It runs transactions under snapshot isolation and adds checks to determine whether anomalies are possible. It shows higher performance than S2PL. Another advantage of SSI is that it does not require additional blocking. The paper also introduces the theory behind SSI and some of its variants.

PostgreSQL has three isolation levels after adding SSI. All queries in PostgreSQL are executed against a snapshot. There are three distinct lock mechanisms in PostgreSQL: lightweight locks, heavyweight locks, and tuple locks. The SSI implementation detects conflicts using MVCC data and a new lock manager, and it uses safe retry to resolve conflicts.

On memory usage, PostgreSQL uses several techniques to limit the memory used by the SSI lock manager: 1) safe snapshots and deferrable transactions; 2) granularity promotion; 3) aggressive cleanup of committed transactions; 4) summarization of committed transactions. This implementation also handles features that previous SSI implementations did not address: 1) it supports two-phase commit; 2) it supports subtransactions, which PostgreSQL uses to implement savepoints.

I think the drawback of this paper lies in its discussion of these features. For one of the four features, it only describes the previous problems and future plans; a more comprehensive description of how the feature is implemented in this release of PostgreSQL would be better.

Review 8

PostgreSQL 9.1 is the first production-grade database to provide a serializable isolation level using Serializable Snapshot Isolation (SSI). The paper mainly summarizes the work done to apply SSI to PostgreSQL and the challenges the authors met. The paper also provides a detailed performance analysis of the new SSI implementation.

Before 9.1, PostgreSQL did not provide serializable isolation because strict 2PL has too large an overhead. The tradeoff is clear: a serializable isolation level is desirable because it lets users ignore concurrency, but implementing and using it can be expensive. PostgreSQL instead provided a weaker level, snapshot isolation, which does not guarantee serializability.

In this paper, the authors first compare snapshot isolation and serializability. Then the authors explain how SSI works in terms of three kinds of dependencies between transactions: wr, rw, and ww. The implementation optimizes SSI with two techniques, safe snapshots and deferrable transactions. A safe snapshot is one for which no concurrent transaction can commit with a dangerous rw-antidependency, so read-only transactions can run on it without SSI overhead. A deferrable transaction is a read-only transaction whose start is delayed until a safe snapshot is available. The paper also introduces concurrency control in PostgreSQL. In the experiments, the paper shows that SSI outperforms strict 2PL in several situations.

The main contributions of this paper are the following: 1. It is the first SSI implementation in an industry product. 2. It fills a gap in PostgreSQL, which previously provided no serializable isolation. Pointing out the challenges in the implementation makes this paper quite strong.
I think the paper is a little weaker for not comparing the performance of SSI to other implementations of serializable isolation. Such a comparison would give readers a better overview of how SSI relates to other methods.

Review 9

This paper covers the implementation of serializable snapshot isolation (SSI) in PostgreSQL. PostgreSQL had previously only supported snapshot isolation (although they called it SERIALIZABLE), which allowed certain anomalies that were a problem for customers requiring true serializability. The example provided was the Wisconsin Court System. These anomalies could cause certain invariants to be violated, given certain interleavings. The authors chose to use SSI as their serializable implementation because it gave better performance than two-phase locking.

SSI runs each individual transaction using snapshot isolation, but stores some state information to ensure that no anomalies can occur. If they would occur, the transaction is aborted. The implementation builds on the idea that if there is both an incoming and outgoing rw-antidependency, there may be an anomaly; this is a heuristic, and can cause false positives. PSSI is a system that builds the full serialization graph and is therefore able to get rid of false positives, but uses much more memory, so the authors decided not to use it. As I will describe in more detail, the authors also explain modifications they made to the traditional SSI method for Postgres.

This paper is a major contribution as it discusses the first implementation of SSI in a database with multiversion concurrency control and without an existing S2PL implementation. The authors are able to discuss areas that had not been considered initially when SSI was implemented in other databases, either because they already had S2PL or because they were not customer-facing databases with additional requirements. They discussed the lock manager interface that was necessary to implement SSI, as well as various optimizations for read-only transactions (safe snapshots, deferrable transactions), memory optimizations, and interactions with existing features like 2PC, replication, and savepoints. This information would be very useful to someone trying to implement SSI in a production database release. In addition to providing critical implementation details, the authors provided data from running various benchmarks.

One complaint that I had about the paper was the ways in which application users were and weren’t trusted. The paper clearly states on page 1851 that READ COMMITTED, the weakest provided isolation level, is the default. This implies a certain amount of trust in the application programmers (or DBAs), who would have to recognize the need to set a stronger isolation level. In my opinion, a lot of database technology revolves around “how can we protect the users from themselves.” The authors even mention on page 1856 that they are concerned about wr-dependencies that take place outside of the database, which shows that they don’t even expect application users to write clean transactions. This implies to me that the strongest isolation level should always be chosen as the default, and I was surprised that this wasn’t addressed directly in the paper. It is clear that many users choose a weaker isolation level for performance reasons, but this seems like it should be a conscious decision, especially when a true serializable mode is provided.

Review 10

This paper summarizes the experience of implementing PostgreSQL’s serializable isolation level based on the recently developed serializable snapshot isolation (SSI) technique. The experience is noteworthy since it is not only the first implementation of SSI in a production database system but also the first implementation of SSI in a purely snapshot-based database system. The authors provide solutions for a set of problems they encountered during the implementation, which might be useful for later implementations of SSI in other systems.

The entire paper can be divided into two parts. The first part concerns the theory behind the SSI technique. The core theorem here is that every cycle in the serialization history graph (a cycle means no serial order is possible) contains a so-called “dangerous structure” in the form of two adjacent rw-antidependencies. Based on that, SSI tries to identify such structures and abort transactions as necessary. To optimize (long-running) read-only transactions, more theorems have been developed to enable safe snapshots and deferrable transactions. These optimizations greatly reduce overhead (no need to obtain SIREAD locks) and memory consumption (they allow cleanup of other transactions’ SIREAD locks).
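The dangerous-structure check can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function name and the edge representation are hypothetical, and the sketch ignores the commit-ordering and concurrency conditions that a real implementation would also consider. It only flags "pivot" transactions that have both an incoming and an outgoing rw-antidependency.

```python
def find_pivots(rw_edges):
    """Return transactions with both an incoming and an outgoing
    rw-antidependency. Each edge is a (reader, writer) pair, meaning
    the reader did not see a concurrent write by the writer."""
    has_outgoing = {reader for reader, _ in rw_edges}
    has_incoming = {writer for _, writer in rw_edges}
    return has_outgoing & has_incoming


# T1 -rw-> T2 -rw-> T3: T2 is the pivot of a dangerous structure.
chain = [("T1", "T2"), ("T2", "T3")]
print(find_pivots(chain))

# Write skew: T1 -rw-> T2 and T2 -rw-> T1, so both are pivots.
write_skew = [("T1", "T2"), ("T2", "T1")]
print(sorted(find_pivots(write_skew)))

# A single antidependency is not a dangerous structure.
print(find_pivots([("T1", "T2")]))
```

Because the check looks only at two adjacent edges rather than a full cycle, it can flag transactions that would not actually cause an anomaly, which is the source of false-positive aborts.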

The second part mainly focuses on the difficulties the PostgreSQL team met during the implementation of SSI. For example, they needed to implement a new lock manager since their system was previously purely snapshot based. They also discussed how much information should be maintained for each transaction in order to track dependencies, the rules for picking the transaction that needs to be aborted and retried, and the aggressive cleanup and transaction summarization techniques for avoiding unbounded memory usage.

The authors concluded the paper with a performance comparison between their SSI implementation and a strict two-phase locking implementation of the serializable isolation level. The results showed that their implementation performs significantly better and comes close to the snapshot isolation level. One concern I have about this paper is that, if strict two-phase locking is widely used in production database systems, it is unclear how many companies would be willing to give up their existing solution, switch to the new snapshot-based technique, and implement it from scratch.

Review 11

In the paper "Serializable Snapshot Isolation in PostgreSQL", Dan Ports and Kevin Grittner describe their experiences implementing PostgreSQL's new serializable isolation level. These include the struggles of implementing a new lock manager, discovering a new technique to ensure memory is bounded, and integrating it with other PostgreSQL features. Previously, PostgreSQL offered no isolation level that guaranteed serializability. Thus, a technique called Serializable Snapshot Isolation was used on top of the existing snapshot isolation model so that non-serializable results are no longer possible. This was deemed necessary because two-phase locking was too expensive. Having serializable results is important because it allows developers to ignore other concurrent transactions while ensuring their transactions don't suffer from inconsistencies. Thus, solving this problem is a major concern for those who see the current snapshot isolation as a temporary solution.

The paper starts by comparing Snapshot Isolation (SI) and Serializable Snapshot Isolation (SSI). We can see that SI is completely inferior to the latter model since there are many cases where it breaks down. The unserializable cases can lead to difficult-to-reproduce errors caused by inconsistent data. Additionally, under SI all the reads within a transaction see a consistent view of the database, but this does not guarantee consistent data. SSI addresses these issues. On top of running SI, it performs additional checks that determine whether an anomaly is possible. This way, most of the performance is kept, but it takes a safer approach. If an execution could be non-serializable, a transaction is aborted and rolled back, which eliminates the possibility of the anomaly occurring. Furthermore, based on the performance tests, SSI does well with read operations: nearly as well as SI and better than two-phase locking.

This paper does a good job of taking a real-world problem and attempting to "patch" it with a layered solution. Furthermore, the claims are backed up with a plethora of quantitative evidence, which makes it plausible to apply this in industry. Of course, along with strengths, there are also drawbacks to this paper. I would have liked to see more discussion evaluating the optimizations on SSI. Being able to observe their impact, with graphs, might influence one to incorporate them. Another weakness of the paper was its discussion of the possibility of false positives. Providing more details on how often false positives occur would be helpful because there might be situations in which a false positive is costly.

Review 12

This paper shows a way to provide serializability in Postgres in a more efficient way than normally might happen. A standard way to ensure serializable transactions is through using strict 2-phase locking, but this can be inefficient, as a transaction keeping locks until it commits can noticeably restrict other transactions.

On the other hand, snapshot isolation can be used. In this isolation mode, each transaction takes a snapshot of the database when it begins, and it operates only on that snapshot. Concurrent transactions also can’t modify the same rows of the same tables, which prevents most issues. However, there are still obscure phenomena that can occur and prevent serializability in this isolation mode. As such, this paper aims to design a version of snapshot isolation that also guarantees serializability.
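One of those obscure phenomena is write skew, which the following toy model tries to reproduce. This is only a sketch under my own assumptions: the classes `SnapshotDB` and `Txn`, the on-call data, and the first-committer-wins rule are simplifications I chose for illustration, not how PostgreSQL is implemented.

```python
class SnapshotDB:
    """Toy in-memory store with snapshot isolation (illustrative only)."""

    def __init__(self, data):
        self.data = dict(data)
        self.clock = 0                       # number of commits so far
        self.last_write = {k: 0 for k in data}

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, db):
        self.db = db
        self.snapshot = dict(db.data)        # private view taken at start
        self.start = db.clock
        self.writes = {}

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Simplified first-committer-wins: fail if a concurrent
        # transaction already committed a write to the same key.
        for key in self.writes:
            if self.db.last_write.get(key, 0) > self.start:
                raise RuntimeError("write-write conflict on " + key)
        self.db.clock += 1
        for key, value in self.writes.items():
            self.db.data[key] = value
            self.db.last_write[key] = self.db.clock


def run_write_skew():
    """Two doctors each go off call after checking the other is on call."""
    db = SnapshotDB({"alice": True, "bob": True})
    t1, t2 = db.begin(), db.begin()
    if t1.read("alice") and t1.read("bob"):
        t1.write("alice", False)
    if t2.read("alice") and t2.read("bob"):
        t2.write("bob", False)
    t1.commit()
    t2.commit()                              # disjoint write sets: both succeed
    return db.data
```

Both transactions commit because they write different rows, yet the result leaves nobody on call, an outcome that no serial order of the two transactions could produce.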

The idea is to use standard snapshot isolation, but to monitor the dependency conflicts between transactions and abort certain transactions if a nonserializable cycle could appear between them. Every cycle that causes an issue has the property that two consecutive read-write conflicts appear, and that the transaction at the end of those conflicts is the first to commit. As such, if this structure appears under snapshot isolation, some of the transactions involved can be aborted to preserve serializability. This will produce some false positives. These can be found by keeping track of all conflicts, not just read-write ones, but that trades off against the overhead of tracking all conflicts.
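The commit-order condition can be sketched as a filter over pairs of consecutive read-write conflicts. This is my own simplified reading, not the paper's algorithm verbatim: the function name `unsafe_structures` and the `commit_time` dictionary (transaction to commit timestamp, with uncommitted transactions absent) are hypothetical.

```python
def unsafe_structures(rw_edges, commit_time):
    """Return (t1, t2, t3) where t1 -rw-> t2 -rw-> t3 and t3 committed first.

    rw_edges: iterable of (reader, writer) pairs.
    commit_time: dict mapping committed transactions to a timestamp;
    transactions still running are simply missing from the dict.
    """
    out_edges = {}
    for reader, writer in rw_edges:
        out_edges.setdefault(reader, set()).add(writer)

    flagged = []
    for t1 in sorted(out_edges):
        for t2 in sorted(out_edges[t1]):
            for t3 in sorted(out_edges.get(t2, ())):
                c3 = commit_time.get(t3)
                if c3 is None:
                    continue                 # t3 has not committed yet
                others = [commit_time.get(t) for t in (t1, t2) if t != t3]
                # t3 must precede every other commit in the structure
                if all(c is None or c3 < c for c in others):
                    flagged.append((t1, t2, t3))
    return flagged


edges = [("T1", "T2"), ("T2", "T3")]
```

With these edges, the structure is flagged when T3 commits before T1 and T2, but not when T1 commits before T3, which is the kind of false positive the commit-order check is meant to avoid.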

This solution allows serializability in a much more permissive setup than previous versions. Lightening locking requirements allows for greater concurrency among transactions. This method can also be optimized for read-only transactions, by noting that a read-only transaction could only be at the beginning of a read-write conflict chain. If that possibility is ruled out, less tracking needs to be done for read-only transactions. However, while the overall idea is generalizable, the focus on Postgres makes it harder to transfer the serializable snapshot isolation to a different kind of DBMS.

Review 13

The paper mainly focuses on realizing serializability in PostgreSQL by implementing serializable snapshot isolation. This is crucial because serializable transactions simplify development by letting developers ignore concurrency issues, which has become popular in today’s highly concurrent systems, but PostgreSQL historically did not provide serializability.
The basic idea is to modify snapshot isolation, which was the highest isolation level offered by PostgreSQL but still does not guarantee serializability. The approach is to detect snapshot isolation anomalies and abort transactions when they are detected. However, this raises several challenges, like implementing a new predicate lock manager, dealing with PostgreSQL’s different features, and mitigating memory usage. The authors take several approaches to address them.
Read-only optimization is adopted by enabling a read-only snapshot ordering optimization, identifying certain safe snapshots for read-only transactions, and introducing deferrable transactions to ensure safety. They tackle the memory problem with granularity promotion, aggressive cleanup, and summarization of committed transactions.
The solution does not seem perfect, but the performance is greatly improved. The strategies for mitigating memory usage require great effort, especially for transactions that are not read-only, which add considerable work and carry some potential risks.

Review 14

This paper describes the experience of developing PostgreSQL's new serializable isolation level based on Serializable Snapshot Isolation (SSI). The paper starts off by explaining how snapshot isolation differs from serializability. Essentially, serializability is the strongest isolation level, under which transactions behave as if executed in some serial order. On the other hand, snapshot isolation is a weaker isolation level that uses multiversion concurrency control without read locking. Although SI has some potential anomalies, like the two examples provided by the authors, it is still widely used. This is because concurrency issues are hard to deal with, so providing SI simplifies the work of developers. The authors describe the SSI technique and review previous work. To sum up, SSI not only runs transactions using snapshot isolation, but also adds checks to determine whether anomalies are possible. Specifically, it detects two adjacent rw-antidependency edges and aborts one of the transactions involved. The authors also introduce two variants of SSI: one uses the commit ordering optimization, and the other builds the full serialization history graph to test for cycles and dangerous structures. The following section describes two ways of optimizing read-only transactions: enabling a read-only snapshot ordering optimization, and identifying safe snapshots and deferrable transactions. The authors then provide an overview of the implementation of SSI, discuss techniques for reducing memory usage, and examine how SSI interacts with other features of PostgreSQL. The last section compares the performance of the SSI implementation with lock-based serializability. The authors performed experiments on three workloads, and SSI outperforms lock-based serializability under most scenarios.

The advantage of this paper is that it covers a lot of technical details, and the structure is clear. However, when explaining the scenarios, using more graphs might make them clearer. Tracking the notation through sentences is sometimes hard to follow.

Review 15

In “Serializable Snapshot Isolation in PostgreSQL”, Dan R. K. Ports and Kevin Grittner present the PostgreSQL 9.1 release, which implements Serializable Snapshot Isolation (SSI) and is the first production database implementation to do so. As a result of implementing SSI, PostgreSQL 9.1 supports true serializability but still outperforms two-phase locking for read-intensive workloads. The paper provides an overview of SSI, discusses the read-only optimizations implemented in PostgreSQL 9.1’s SSI, and describes the pieces of the PostgreSQL 9.1 implementation, including the infrastructure the authors had to redesign in order to use SSI rather than two-phase locking.

Review 16

This paper gives the implementation details of serializable snapshot isolation in PostgreSQL.

The main approach is based on the "recently" developed SSI technique. PostgreSQL previously did not provide a serializable isolation level because of the high cost of the two-phase locking mechanism, which is commonly used to implement serializable isolation.

SSI adds additional checks to determine whether anomalies are possible, since snapshot isolation alone is too weak. The paper explains the technique in detail with theory and examples.

Also, the paper gives implementation details about handling conflicts and the memory usage problem. These are important for a practical implementation.

This paper evaluates its proposed method on three workloads: the SIBENCH microbenchmark, DBT-2++, and application performance on RUBiS. The results show that SSI outperforms S2PL on every metric and slightly underperforms snapshot isolation.

The main drawback for me is that it is difficult to understand serializability as a whole and the theory behind the technique. It may be beyond my ability to judge whether the technique is good or not.

Review 17

This paper presents practical experience in implementing PostgreSQL’s new serializable isolation level using the Serializable Snapshot Isolation (SSI) technique. In the previous implementation of PostgreSQL, there was no lock-based serializable isolation level, which makes it harder to implement SSI in PostgreSQL. Besides, Postgres offered a snapshot isolation level in its design which, although it provides good performance, suffers from undesirable anomalies. Even though there are some techniques for avoiding the anomalies, they are very complicated and thus beyond the reach of most users, because these problems are tied to complex concurrency issues. These problems are important for DBMS designers because a good trade-off between performance and consistency is significant to a modern DBMS, and a serializable isolation level with good performance is always a preferable choice. In order to solve these problems in PostgreSQL, the authors implement SSI in PostgreSQL by adding a new lock manager. In their design, a transaction summarization technique is used to bound the usage of RAM, and they integrate SSI with other PostgreSQL features. They optimize SSI for read-intensive workloads, and the experiments show that their method outperforms others. Next, I will summarize the key points in this paper with my understanding.

First of all, the authors make a comparison between snapshot isolation and serializability, which can be regarded as a trade-off between performance and consistency. The authors give two examples of anomalies that happen under snapshot isolation and then show that detecting anomalies is not an easy job for a human, which motivates their solution. First, they illustrate why an S2PL approach cannot be introduced to PostgreSQL (for system and performance reasons); SSI is promising because it builds on snapshot isolation with better performance than S2PL. They use three kinds of dependencies (wr, ww, and rw-antidependencies) to show anomalies and give a theory of serializability. The key idea of SSI is detecting potential anomalies at runtime and aborting transactions when necessary. Rather than testing the graph for cycles, it checks for a “dangerous structure” of two adjacent rw-antidependency edges. This makes SSI much more efficient. Besides, SSI yields better concurrency compared to S2PL and OCC because it is more permissive. In their implementation, they make efforts to optimize read-only transactions in two ways: 1. enabling a read-only snapshot ordering optimization to reduce the false-positive abort rate; 2. identifying certain safe snapshots on which read-only transactions can execute safely without any SSI overhead or abort risk, together with the use of deferrable transactions. Based on this idea, they incorporate SSI into PostgreSQL. In the rest of the paper, they provide implementation details including how to detect, track, and resolve conflicts, how the SSI lock manager reduces RAM usage, and how SSI interacts with other PostgreSQL features.

The technical contribution of this paper is that it is the first implementation of SSI in a production database release. Previously, PostgreSQL had no lock-based serializable isolation level, but the authors made it work by designing a new lock manager for tracking SIREAD locks. By using SSI, PostgreSQL enjoys the performance benefits of snapshot isolation and also reaches true serializability. From an engineering perspective, they provide good compatibility with existing features, such as the new lock manager working along with the existing MVCC. The paper provides rich background information about both PostgreSQL and the SSI technique, so even people with little experience in this field can understand it easily.

The drawbacks of this paper are minor. First, the algorithm still has a false alarm rate, although some mechanisms are built to reduce it. Achieving better performance by allowing some anomalies may not fit every situation. I think they should adopt new algorithms for reducing the false positive cases. Second, this paper only provides a detailed implementation for PostgreSQL; it is not general enough for all DBMSs, since different systems have different concurrency control mechanisms. Third, the optimizations assume that read-only transactions make up the majority of the workload; for other use cases that do not have a heavy read workload, the overall system performance may be worse.

Review 18

This paper aims to improve the isolation level guarantees of PostgreSQL by implementing a new isolation level that guarantees serializability; originally PostgreSQL had only supported one level below this due to using snapshot-based isolation. The reason serializability had not been a feature before was its relatively poor performance; the authors addressed this problem by providing serializability in a way that has only slightly worse performance than the SI isolation level. In particular, the authors decided to improve the SI isolation level to prevent the anomalies that are possible under SI, which thus makes it equivalent to serializability.
The authors did this by utilizing SSI (Serializable Snapshot Isolation), which tracks the history of a DB with a directed graph that represents transactions as nodes. By doing this, it is possible to identify the anomalies present under SI by identifying cycles as well as various types of “dangerous structures” that represent these anomalies. Thus, SSI is able to function very similarly to regular SI, with the main difference being the additional checks for these anomalies in the graph.
The paper’s main contribution is the actual implementation of SSI in PostgreSQL. The authors also provided several optimizations (safe snapshots, deferrable transactions, analysis on interactions between SSI and 2PC, etc) which improved the performance / addressed edge cases for a real-world implementation of a new isolation level. The advantages are obvious—it provides a higher degree of consistency while retaining most of the performance that regular SI has to offer.
On the flip side, one obvious weakness is that although performance losses were not as great as previous implementations of serializability isolation levels, the performance was still worse, and so for people who can tolerate the kinds of anomalies that SI doesn’t protect against, there is not much of a point to use this. The additional performance overhead is only worth it if it is absolutely necessary to have strict serializability in an application. Another thing to note (not really fair to call it a weakness) is that there are several variations that could be attempted with this, such as varying the index type, which is probably worth trying to improve performance even more.