Review for Paper: 41-CryptDB: Processing Queries on an Encrypted Database

Review 1

This paper introduces CryptDB, the first practical system to execute a wide range of SQL queries over encrypted data. The solution satisfies two goals: protecting data confidentiality and enabling the server to run computations on encrypted data.

CryptDB addresses two threats; protection against the first is the main focus of this paper.
1. An adversary who gains access to the DBMS server and tries to learn private data.
2. An adversary who gains complete control of the application and the DBMS server. In this case, CryptDB protects the confidentiality of data belonging only to users who are logged out of the application.

Besides protecting data, executing queries on encrypted data requires balancing two needs: minimizing the amount of confidential information revealed to the DBMS server, and efficiently executing a variety of queries. To solve this problem, CryptDB incorporates two techniques: SQL-aware encryption and adjustable query-based encryption.

In general, CryptDB has the following properties:
1. Sensitive data is never available in plain text at the DBMS server.
2. Sensitive fields are not exposed to applications unless necessary.
3. For equality checks, the proxy reveals which items repeat in a column, but not the actual values.
4. For order checks, the proxy reveals the order of the elements in the column.
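
To make properties 3 and 4 concrete, here is a minimal Python sketch of what the server can and cannot learn. It is only an illustration, not CryptDB's actual construction: `det_token` and `build_ope_table` are hypothetical helpers, the HMAC tag stands in for deterministic encryption, and the lookup table is a toy order-preserving map with no real security.

```python
import hashlib
import hmac
import random


def det_token(key, value):
    """Toy stand-in for deterministic encryption: equal inputs give equal tokens."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]


def build_ope_table(key, domain_size):
    """Toy order-preserving map for the integers 0..domain_size-1.

    Each entry is the previous entry plus a keyed pseudo-random positive gap,
    so the table is strictly increasing but hides the exact distances.
    """
    rng = random.Random(key)  # seeded from the key, so the map is reproducible
    table, current = [], 0
    for _ in range(domain_size):
        current += rng.randint(1, 1000)
        table.append(current)
    return table


key = b"hypothetical-column-key"

# Equality: the server sees that rows 0 and 2 repeat, but not the names.
names = ["alice", "bob", "alice"]
name_tokens = [det_token(key, n) for n in names]
repeats = name_tokens[0] == name_tokens[2]

# Order: the server can sort rows by encrypted value without seeing salaries.
ope = build_ope_table(key, 100)
salaries = [30, 10, 20]
enc_salaries = [ope[s] for s in salaries]
row_order = sorted(range(len(salaries)), key=lambda i: enc_salaries[i])
```

Sorting the encrypted values gives the same row order as sorting the plaintext salaries, which is exactly the information an order check reveals.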

One important design element of CryptDB is its adjustable query-based encryption. The goal is to use the most secure encryption schemes that still allow the requested queries to run. The problem is that the query set is not always known in advance, so an adaptive scheme that dynamically adjusts the encryption strategy is needed. The solution is to encrypt each data item in one or more onions, where each value is wrapped in layers of increasingly stronger encryption, and to determine at run time which layer to use.
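
The onion idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not the paper's implementation: the SHA-256 keystream cipher below is not secure, and the names `wrap_onion`, `peel_rnd`, `K_DET`, and `K_RND` are hypothetical. The point is only the layering: a random outer layer hides everything, and removing it exposes a deterministic inner layer that supports equality checks.

```python
import hashlib
import os


def _keystream(key, nonce, length):
    """Derive `length` pseudo-random bytes from key and nonce (toy construction)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def _xor(data, stream):
    return bytes(a ^ b for a, b in zip(data, stream))


def det_layer(key, data):
    """Deterministic layer: fixed nonce, so equal inputs give equal outputs.
    XOR is its own inverse, so the same call also removes the layer."""
    return _xor(data, _keystream(key, b"", len(data)))


def rnd_encrypt(key, data):
    """Randomized layer: a fresh nonce makes equal inputs look different."""
    nonce = os.urandom(16)
    return nonce + _xor(data, _keystream(key, nonce, len(data)))


def rnd_decrypt(key, blob):
    nonce, body = blob[:16], blob[16:]
    return _xor(body, _keystream(key, nonce, len(body)))


K_DET, K_RND = b"hypothetical-det-key", b"hypothetical-rnd-key"


def wrap_onion(value):
    """Store a value as RND(DET(value)): strongest layer on the outside."""
    return rnd_encrypt(K_RND, det_layer(K_DET, value))


def peel_rnd(blob):
    """What the server does once it is handed K_RND: strip only the outer layer."""
    return rnd_decrypt(K_RND, blob)


stored_a = wrap_onion(b"alice")
stored_b = wrap_onion(b"alice")
peeled_a, peeled_b = peel_rnd(stored_a), peel_rnd(stored_b)
```

Before peeling, the two stored copies of the same value differ; after peeling, they match, so the server can evaluate equality without ever holding `K_DET` or the plaintext.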

I like the idea of "onion" encryption, since it neatly addresses the problem of not knowing which encryption scheme to use before the queries are run. By dynamically selecting the encryption layer, the system can be both efficient when running queries and secure. It's also interesting to learn that different encryption schemes can be "layered".

I have two concerns about the experimental results. First, the authors say that a 26% loss in performance is "modest considering the gains in confidentiality". I wonder whether there is a formal requirement on the performance-confidentiality tradeoff; without one, "modest" is not that persuasive. Second, I would like to know whether the onion-style multi-layered encryption leads to high storage overhead and, if so, how large that overhead is. But in general, I like this paper.

Review 2

Problems & Motivations
Nowadays, the theft of private information is a significant problem for online applications. Sensitive data can leak from online data repositories when an adversary gains access to the server. The conventional wisdom is to encrypt all the data on the server; for a DBMS, however, running queries on the encrypted data then becomes another big challenge. Fully homomorphic encryption has already been proposed and can compute arbitrary functions over encrypted data, but it sacrifices performance. Therefore, the authors propose CryptDB, which strikes a balance between security and flexibility on one side and performance on the other. It also does not require changes to the existing DBMS.

Main Achievements:
The paper first identifies two threat models. The first is weaker: it assumes only that the DBMS has been compromised, which means the adversary can read any data stored in the DBMS and any intermediate results of its computation. The second assumes that any component, including the application, proxy, and DBMS server, may have been compromised. The paper focuses on the first model and assumes the adversary is passive, meaning it does not modify any data. The key ideas behind the paper are SQL-aware encryption and adjustable query-based encryption. For SQL-aware encryption, the paper brings together several existing encryption techniques and also proposes a new cryptographic scheme for JOIN. For adjustable query-based encryption, the key insight is that the database should learn only the information needed to run the query. For example, if a query needs an ORDER BY operation, the database needs to know only the relative order of the key values, not their content. Therefore, CryptDB uses onions of encryption, which compactly nest different levels of encryption, and decrypts only down to the level that the query requires.

Drawbacks:
Although the paper only briefly discusses the situation under threat 2, according to other research it is unsafe to use CryptDB in that situation.
Another disadvantage is storage efficiency. Because of the onion structure, even though the layers are stored compactly, storing the data still incurs overhead, and maintaining it adds a large cost, as the results section shows.

Review 3

Security in recent years has become a growing area of concern, especially with a number of high profile data breaches where millions of people’s records were compromised. The types of information being revealed can range from personally identifying information to credit card information to health records. In many cases, such leaks occur due to unauthorized access due to software vulnerabilities, physical access to hardware, or malevolent internal users. Some methods have been proposed and used to combat this threat, one of which is to encrypt the sensitive information on the database, but this presents new issues. For example, it imposes significant performance and network bandwidth overhead, and in some schemes, forces the client to download and decrypt data themselves, which forces clients to maintain significant processing power locally rather than being able to outsource it to cloud providers. This paper attempts to address the issue by proposing the first practical system of its time capable of executing a wide range of SQL queries directly on encrypted data in the database.

To start, the authors identify two major types of threats in their threat model. The first one is if an external adversary gains full read access to a DBMS server and seeks to learn about the contents stored inside. To deal with this, CryptDB intercepts SQL queries and redirects them to a trusted proxy that contains a master secret key used to rewrite queries to execute on encrypted data, and encrypts and decrypts all data. It manages to implement a variety of desirable characteristics from the standpoint of confidentiality. The DBMS never sees any sensitive data in plaintext form. Also, if the application does not request relational predicate filtering over a column, no information about the sensitive data other than its size is revealed. Only with stronger requests such as equality checks and ordering does additional information become revealed, but again only pieces of information that are relevant to the request. SQL-aware encryption performs encryption for various simple operators as well as privacy-preserving encryption schemes for joins.

Additionally, adjustable query-aware encryption allows CryptDB to selectively adjust the level of SQL-aware encryption to support different operations on the data, some of which require weaker levels of encryption. Onions of encryption compactly store ciphertext within each other without revealing weaker levels of encryption. To deal with the possibility of the entire infrastructure being compromised, CryptDB uses the previous methods but without a single master key that can be used to decrypt the entire database. Instead, they are given on a per-user basis, which protects the data for logged-out users in the event of arbitrary server-side compromises.
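
One way to picture the per-user key idea is a data key that can only be recovered from a user's password. The sketch below is a hedged illustration, not the paper's actual key-chaining scheme: the XOR wrapping, the function names, and the example passwords are all assumptions made for this example.

```python
import hashlib
import os


def derive_user_key(password, salt):
    """Derive a 32-byte key from the user's password (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def xor_bytes(a, b):
    """Toy key wrapping: XOR the data key with the password-derived key."""
    return bytes(x ^ y for x, y in zip(a, b))


# The key that protects one user's data (hypothetical setup).
data_key = os.urandom(32)
salt = os.urandom(16)

# Only the salt and the wrapped key are stored on the server side.
wrapped_key = xor_bytes(data_key, derive_user_key("alice-password", salt))

# While Alice is logged in, her password is available and the key can be recovered.
recovered = xor_bytes(wrapped_key, derive_user_key("alice-password", salt))

# An attacker who compromises the servers while Alice is logged out has no
# password, so guessing yields a useless key.
guessed = xor_bytes(wrapped_key, derive_user_key("wrong-guess", salt))
```

Because nothing stored on the servers suffices to unwrap the key, data belonging to logged-out users stays protected in this sketch even if every server is compromised.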

The primary strength of this paper is that it introduces a working and practical system capable of executing various SQL queries on encrypted data. Additionally, it seems to be reasonably flexible in terms of the number of columns it can run operators on; in the experimental data, it was able to support up to 99.5% of all columns on some publicly available datasets. It incurs up to about 26% performance overhead (in terms of decreased query throughput) compared to unmodified MySQL when run on the TPC-C benchmark, which is quite reasonable, especially when compared to schemes such as downloading data and decrypting/encrypting it client-side.

One weakness of this paper is that it seems to be focused solely on providing strong confidentiality guarantees. For example, the worst case situation that it tests against is the event where the entire system is controlled by an adversary. In the vast majority of use cases, however, such situations would probably be considered extreme, and as such, an easy way to trade off security guarantees in exchange for lower performance overhead would be desirable. This is not specified by the authors, besides their adjustable query-aware encryption scheme, and it is not clear whether the system can fundamentally accommodate such flexibility since it was designed from the ground up for confidentiality.

Review 4

This paper introduces CryptDB, which is the first practical system that can “execute a wide range of SQL queries over encrypted data”. The paper talks about both security concerns in databases and how computations typically need to be run on the data. The authors propose that an ideal solution is to enable servers to compute over encrypted data, so that the computational burden is taken off the client and the data remains confidential. CryptDB protects against two vulnerabilities. In the first, an attacker tries to get user information by snooping on the server, but CryptDB protects private data. In the second, an adversary has complete access to the application, but CryptDB can still protect the information of users who are logged out during the attack.

The paper describes the threat model that CryptDB addresses. The first type of threat is server compromise where an attacker has read access to the database and does not try to modify data on the server. The second type of threat is arbitrary threats where the proxy, DBMS server infrastructure, or the application server may be compromised. CryptDB provides properties to defend against these such as ensuring the data is never exposed as plaintext on the server and hiding information like the data values that repeat in a column, and only sharing the number of repeats of the value. Next the paper describes queries over encrypted data as well as other computations.

I liked how, as a security paper, this document seemed structured differently from others. It describes the adversary and the exploits, and then describes how those motivated the contribution and how the system protects against them. I did not like how some of the examples in the paper's figures were hard to follow, as they could get a bit cluttered with information.

Review 5

CryptDB was proposed by MIT's CSAIL at SOSP 2011 and applies homomorphic encryption techniques to databases. CryptDB aims to run operations over encrypted data inside the database system. The effect is that all data in the database is encrypted, yet the database can still execute the user's SQL statements on the encrypted data and return encrypted results to the user. The user can then decrypt the returned results to obtain the plaintext data.

It is based on the observation that fully homomorphic encryption is difficult to implement, but databases need only a few common operations. For example, a WHERE condition in an ordinary SELECT statement needs equality comparison. By supporting just these common operations, encryption can be achieved with only a 20% drop in throughput.

The Cryptdb system can be divided into three parts: Client, MySQL-Proxy, and MySQL-SERVER. The main logic is implemented in MySQL-Proxy, and for MySQL-SERVER, some auxiliary functions are implemented through UDF.

MySQL-Proxy can get the SQL request sent by the user, perform intermediate processing, and then send the processed request to MySQL-SERVER. After the request is executed on the server, the result is processed by MySQL-Proxy and returned to the client.

Order-preserving encryption (OPE) guarantees that the relative order of the data is the same before and after encryption. DET supports checking whether two values are equal, and HOM supports addition. CryptDB also optimizes how data is stored by using the onion encryption model. Initially, data is kept at the strongest encryption level. When certain operations need to be supported, the outer layers are decrypted down to the level that supports those operations.
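
To show how an encryption scheme can support addition, here is a toy-sized Python sketch of the Paillier cryptosystem, which is the scheme the CryptDB paper uses for HOM. The tiny primes, the function names, and the fixed generator g = n + 1 are choices made for this illustration; a real deployment would use large random primes and a vetted library.

```python
import math
import random


def keygen(p, q):
    """Build a Paillier key pair from two primes, using g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse of lam; correct because g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) with fresh randomness r."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


# Toy primes for illustration only; far too small to be secure.
n, private_key = keygen(1789, 1861)
c_sum = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
total = decrypt(n, private_key, c_sum)
```

In this sketch the party that multiplies the ciphertexts never needs `private_key`, which mirrors the server computing a SUM over encrypted values and leaving the final decryption to the proxy.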

The key contribution of this paper is CryptDB, the first system that executes a wide range of SQL queries on encrypted data. The system also achieves fairly good performance on encrypted data.

The weak point that I noticed is that the latency of a single query might be a little bit longer.

Review 6

CryptDB is a system that allows queries to run on encrypted data. There is a lot of value in a system like this - companies want to be able to protect data from an adversary or a nosy DBA, but they also need the powers of a DBMS to manage their data. If full tables are simply encrypted and sent back to the client, the client would essentially need to provide modules like query optimization that live in the DBMS.

CryptDB provides a proxy layer, which stores the encryption keys - all SQL requests are routed through the proxy layer. The first threat discussed (labeled “Threat 1”) is an adversary obtaining access to the DBMS layer. Because this is behind the proxy layer with the keys, the data is safely encrypted. The second threat (“Threat 2”) is more severe - the application, DBMS, and proxy may all be compromised. In this case, there is some risk of users who are currently logged in having their data compromised.

The most interesting and innovative contribution of this paper is SQL-aware encryption. There are increasingly strict levels of encryption that allow for an increasingly small subset of operations to be performed. There are layers that allow joins, searches, and order by to be performed on the encrypted data - of course, each of the less strict encryption levels allows some new information about the data to be revealed. The “onion” strategy allows layers of encryption to come off in order to get to the one that is suitable for the query. Thus, the strategy is both SQL aware and adaptive. It’s noted in the results section that CryptDB is able to process queries over 99.5% of the columns during the experiments, which is a strong result.

I found some of the claims made in the paper to be a bit strong when looking at this method through the eyes of an adversary. As I understand it, the person operating CryptDB can specify a minimum security level for any very sensitive information, like credit card numbers. The authors mention near the end of the paper that their approach “provides a significant improvement in confidentiality over revealing all encryption schemes to the server.” However, because the security scheme is adaptive and can be dynamically adjusted, is that not the same thing as exposing the minimum level of security to any adversary?

Review 7

In the paper "CryptDB: Processing Queries on an Encrypted Database", Raluca Ada Popa and co-authors discuss CryptDB, a practical system that provides confidentiality for applications that use database management systems. In the modern era, theft of private data is very common. Whether the criminal uses software exploits to gain unauthorized access to a server, physical server access to steal data from disk or memory, or administrator privileges at a hosting provider to snoop on private data, the purpose is the same: to compromise the confidentiality of data. The generally accepted solution is to encrypt the data. Yet data on servers is not just stored; it is used for computation as well. A naive solution would be to store encrypted data on a server but move it to a trusted client for decryption and computation. However, this is infeasible because of the burden placed on the client's machine and the potentially large amounts of data that need to be moved around. Ideally, the server should be able to perform computations over encrypted data, which preserves the architecture that most applications are built upon. Progress on this approach has been made in the form of fully homomorphic encryption. The problem, however, lies in its terrible performance: over nine orders of magnitude slower than state-of-the-art plaintext processing. Thus, we have CryptDB, which addresses these pain points and is able to perform computations over encrypted data with far less overhead. With little to no change to the internals of the DBMS, CryptDB protects against adversaries who gain access to the DBMS server and try to learn private data by snooping on the server. It is clear that such a useful system is necessary and important to explore.

This paper is divided into several sections:
1) Threat model and overview: CryptDB targets passive attackers that have full read access to data. It counters this threat by executing SQL queries over encrypted data on a DBMS server. CryptDB intercepts all SQL queries at a trusted proxy. This proxy holds a master secret key and rewrites the queries so that they can run on the encrypted data. The proxy encrypts and decrypts all the data and changes some operators while preserving the semantics of the queries. This is great because the DBMS server never actually receives the decryption keys, so the adversary is unable to see the sensitive data. CryptDB is very careful about the amount of information that it reveals during computation. The main focus is on confidentiality, not data integrity or availability. CryptDB only ensures that the data itself cannot be discerned; the table structure and the amount of data are still fair game.
2) Querying over encrypted data: Two main techniques are applied: SQL-aware encryption and adjustable query-based encryption. The former uses several schemes: RND, DET, OPE, HOM, JOIN, OPE-JOIN, and SEARCH. These schemes give different security levels and allow different computations. The latter technique dynamically adjusts the layer of encryption on the DBMS server, creating an onion. Depending on the computation the user wants, layers of the onion are peeled off in order to perform it. Keep in mind that the data is still encrypted; not all the layers are taken off. As a result, an attacker, no matter how good he or she is at intercepting queries, will never uncover particular information within tuples.
3) Implementation: The CryptDB proxy is built on top of MySQL Proxy and consists of a C++ library and a Lua module. The main contribution is the query rewriter, which produces replacement expressions that perform computations over encrypted data.

Much like other papers, this paper also has some drawbacks. The first drawback that I noticed is within the experimental evaluation. CryptDB does not ensure that the query results obtained from the server are correct. A malicious party could return altered results that mislead clients who are learning trends from the data. Furthermore, an adversary could use external means to get access to a decryption key and compromise the data of the entire database. Another drawback that I noticed is the incompatibility of CryptDB with data cleaning software. Since the information is encrypted and the contents of the tuples are unknown, data cleaning is now orders of magnitude slower because the tuples must be decrypted. It follows that any type of search query also suffers as a result of encryption. Nothing is ever free in the world of computer science; performance is sacrificed for the sake of security. The final drawback I noticed is the vague metrics used to evaluate the security of CryptDB. I felt that the authors use MinEnc in order to get a justifiably high score for their system.

Review 8

This paper describes CryptDB, a remote DBMS that encrypts users’ data to protect against outsiders reading it. CryptDB stores all of the user data and performs calculations on a remote server. However, all of the data is encrypted. An attacker that is able to read everything on the server would still not be able to read any of the user’s private data.

The server keeps all of the user data encrypted and never sees the plaintext. It also does not have access to the master key that can decrypt the data, so an attacker cannot find the plaintext data just by accessing the server. However, the burden of final decryption shouldn’t be on the client, so that ordinary DBMS clients can still interact with CryptDB. As such, the user interacts with a proxy server, which has the master key for decryption. This proxy server interacts with the main server, decrypts the results, and sends them to the client.

The major issue with encrypting data is balancing revealing information to potential attackers with actually computing queries; a fully encrypted dataset reveals no information, but also permits no computation. CryptDB solves this by encrypting data in such a way that it can support standard DBMS operations.

This idea is referred to as SQL aware encryption. For example, if two values are equal, then their encryptions will also be equal, so there is no need to decrypt them. CryptDB also has levels of encryption that allow for ordering and addition of values, but these may reveal more about the underlying data.

In addition, CryptDB can apply adjustable query-based encryption. With this idea, each data item is encrypted in several layers, called onions. The server can then decrypt only as many layers as are necessary to apply the operations in an incoming query. Whenever a query comes in, CryptDB can also encrypt columns and constants in the query to match the levels of encryption already in place, so it doesn’t have to unroll more of the onion. Joins are generally possible only on columns that have been encrypted with the same key. To get around this, CryptDB uses a scheme that allows columns to be re-encrypted with a different key in a manner that does not expose the underlying plaintext.

The large advantage of CryptDB is the ability to protect sensitive information without placing an undue burden on the client. Separating the database and proxy servers allows data to be undecodable using just the database server, and multiple layers of encryption allow the database server to keep doing the majority of computation without compromising the plaintext data.

One downside of the system is that it’s not built to handle an attacker that can modify data. In addition, compromising the proxy server can allow the attacker access to private data, so the proxy server needs stronger privacy guarantees. Onion decryption being entirely on the database server seems to cause issues with security, as any attacker might be able to decrypt the onions as well, negating their security benefits.

Review 9

The paper presented CryptDB, the first practical system that can execute a wide range of SQL queries on encrypted data. Using SQL-aware adjustable encryption with multiple onions, CryptDB provides a strong level of confidentiality in the face of two significant threats confronting database-backed applications: compromises to the DBMS server by a passive adversary, and arbitrary compromises to the application server and the DBMS. CryptDB requires no changes to the internals of the DBMS. The evaluation shows that CryptDB successfully handles a wide range of queries observed in practice, with a modest performance overhead.
CryptDB addresses two threats. The first threat is an adversary who gains access to the DBMS server and tries to learn private data by snooping on the server. This threat might arise when an attacker exploits some vulnerability to get directly to the DB server, when the database is outsourced to an external provider, or when the DBMS is administered by a curious system or database administrator (DBA) who might not be trusted. CryptDB aims to prevent the adversary from learning private data in this case. The second threat is an adversary who gains complete control of the application and the DBMS servers. In this case, CryptDB protects the confidentiality of the data belonging only to users who are logged out of the application during an attack, but cannot provide any guarantees for logged-in users. This paper focuses primarily on the solution to the first threat.
I like this paper since it’s quite straightforward by pointing out what it addresses and convincing by experiments. It also provides source code.

Review 10

Sensitive data can leak from online data repositories in many ways. An adversary can exploit software vulnerabilities to gain unauthorized access to servers, and attackers with physical access to servers can steal data from disk and memory. Therefore, one way to reduce the damage of a data leak is to encrypt all sensitive data stored on the servers. However, many applications require servers to perform computations on the data. The solution is to enable a server to compute over encrypted data without decrypting it to plaintext. To that end, this paper presents CryptDB, a system that can execute a wide range of SQL queries over encrypted data.
Specifically, CryptDB addresses two threats. The first one is an adversary who gains access to the DBMS server and tries to learn private data. The second threat is an adversary who gains complete control of the application and the DBMS servers.
The paper describes how CryptDB executes SQL queries over encrypted data. The CryptDB proxy stores a secret master key MK, the database schema, and the current encryption layer of each column. The DBMS server sees an anonymized schema. Processing a query involves four steps. First, the application issues a query, which the proxy intercepts and rewrites: it anonymizes each table and column name and, using the master key MK, encrypts each constant in the query with the encryption scheme best suited to the desired operation. Second, the proxy checks whether the DBMS server should be given keys to adjust encryption layers before executing the query and, if so, issues an UPDATE query at the DBMS server. Third, the proxy sends the encrypted query to the server. Finally, the server returns the encrypted query result, which the proxy decrypts and returns to the application.
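
The rewriting step can be sketched for a single equality query. This is a minimal Python illustration under stated assumptions, not the proxy's real code: the master key value, the `anonymize`, `encrypt_constant`, and `rewrite_select_eq` helpers, and the naming of anonymized tables and columns are all hypothetical, and an HMAC tag stands in for deterministic encryption of the constant.

```python
import hashlib
import hmac

MASTER_KEY = b"hypothetical-master-key"  # held only by the proxy


def _prf(label, data):
    """Keyed pseudo-random function used to derive names and tokens."""
    message = f"{label}|{data}".encode()
    return hmac.new(MASTER_KEY, message, hashlib.sha256).hexdigest()


def anonymize(kind, name):
    """Replace a table or column name with an opaque identifier."""
    return f"{kind}_{_prf(kind, name)[:8]}"


def encrypt_constant(table, column, value):
    """Toy deterministic token, keyed per column so columns cannot be correlated."""
    return _prf(f"det|{table}|{column}", value)[:32]


def rewrite_select_eq(table, column, value):
    """Rewrite SELECT * FROM table WHERE column = 'value' for the server."""
    anon_table = anonymize("table", table)
    anon_column = anonymize("col", f"{table}.{column}")
    token = encrypt_constant(table, column, value)
    return f"SELECT * FROM {anon_table} WHERE {anon_column} = x'{token}'"


rewritten = rewrite_select_eq("Employees", "Name", "Alice")
```

The rewritten query contains no table name, column name, or constant from the original, yet the server can still match rows because equal constants always map to the same token within a column.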
There are a few encryption methods used in CryptDB. RND provides the maximum security in CryptDB. This scheme is probabilistic, meaning that two equal values are mapped to different ciphertexts with high probability. DET enables the server to learn which encrypted values correspond to the same data value, by deterministically generating the same ciphertext for the same plaintext. OPE allows the server to determine order relations between data items based on their encrypted values, without revealing the data itself. HOM is a probabilistic encryption scheme as secure as RND, but it allows the server to perform computations on encrypted data, with the final result decrypted at the proxy. A separate encryption scheme is needed to allow equality joins between two columns, because different column-specific keys are used for DET to prevent correlations between columns. SEARCH is used to perform searches on encrypted text to support operations such as MySQL’s LIKE operator.
The advantage of this paper is that it explains the motivation of having CryptDB very clearly. However, it would be better to have a more detailed technical explanation for the query on encrypted data.

Review 11

“CryptDB: Processing Queries on an Encrypted Database” presents a new database framework that keeps data encrypted on the database and allows typical SQL queries to run on the encrypted data relatively quickly via a proxy layer. The purpose of such a framework is to protect against nosy database administrators or other unauthorized users who, if they get access to make queries on the database, should not be able to uncover actual data. Prior work has looked into encrypted databases, but has had performance challenges due to having to decrypt large portions of the database in order to perform each query. The way CryptDB works is that, for a given authorized user, it has a proxy layer with an encrypt/decrypt key that processes the user's or application’s query to make it compatible with the encrypted data on the database; in other words, it anonymizes names and encrypts constants in the query (e.g., column names or data values) using the key. CryptDB also makes an effort to encrypt data with the most secure protocol possible. However, certain kinds of query operations (e.g., equality join) cannot be performed on data encrypted by the most secure protocols. Therefore, CryptDB uses the concept of an “onion”, where it encrypts data in multiple layers. Layers closer to the inside are less secure, and the onion is only decrypted to those less secure layers when the incoming SQL query requires a particular level of encryption. An onion exists for each class of computation that could occur (e.g., equality, order, search, addition). For example, the “equality” onion uses the most secure protocol (RND, random) when the query does not require equality checks, deterministic encryption (DET) when the query requires equality checks, and JOIN encryption (the least secure of the three) when the query requires equality joins.
The paper also discusses what kinds of queries and applications CryptDB supports, the levels of security CryptDB provides, and the performance impact of using CryptDB.

As mentioned earlier, CryptDB is great because it protects against unauthorized access to the database server and it is quicker than prior solutions because it evaluates queries on the encrypted data. The paper is also easy to read and understand.

CryptDB has some limitations though: how insecure is the data when onions are decrypted below the RND level? CryptDB also doesn’t protect against overwriting or deleting data. It also assumes that the DBA and other unauthorized users do not have access to the proxy layer or application server.

Review 12

This paper proposes CryptDB, a practical system that explores an intermediate design point to provide confidentiality for applications that use database management systems. The key insight of the paper is that most SQL queries use only a small set of well-defined operators, which makes it possible to support those operators efficiently over encrypted data.

Besides the techniques it proposes, the paper does a good job of explaining why we need to process queries on an encrypted database. I think the background information the paper gives is really necessary. It shows that millions of people's medical records were stolen between 2009 and 2011. Other critical information, such as user profiles and credit card information, was also stolen. We have now reached the big data era. If basic infrastructure such as the database cannot guarantee a considerable level of privacy protection, we are all exposed to "spies". People who obtain this information may know more about us than we do ourselves, which is a situation we don't want. Although it is somewhat late for our generation, which grew up in the early era of big data without the habit of technically protecting personal information, it is never too late to put critical information under protection, considering the next generations.

The paper first discusses two kinds of threats: server compromise and arbitrary threats. Then it presents its main contribution, processing queries over encrypted data. Several encryption schemes are used in CryptDB: Random, Deterministic, Order-preserving encryption, Homomorphic encryption, Join (JOIN and OPE-JOIN), and Word search. The key contribution of the paper is the onion encryption layering, which is what actually allows querying over encrypted data. The paper also gives a detailed explanation of how queries are executed over encrypted data, including read/write execution and join operations.

The strong part of the paper is that it proposes a method to query over encrypted data, which is a strong need not only for private personal data but also for database clients such as banks, governments, and hospitals. It becomes more critical as many organizations start to move their data to cloud vendors. I think it is extremely important for us to consider the privacy problem in the big data era.

The weak part of the paper, to me, is that the experiment section is not complete enough. I think the paper should consider different settings of OLTP and OLAP workloads, since the performance overhead is likely to differ between them.

Review 13

In this paper, the authors focus on the security of DBMSs and propose a system that provides practical and provable confidentiality in the face of attacks. Security is definitely an important issue in DBMSs: it protects users from having their private data exposed and protects companies from losing money. The problem is important because there are several threats to the security of a DBMS. As the authors say in the paper, many online applications are vulnerable to theft of sensitive information because adversaries can exploit software bugs to gain access to private data, and curious or malicious administrators may also capture and leak the data. To handle these threats and keep data safe, CryptDB is introduced. This paper presents CryptDB, a system that explores an intermediate design point to provide confidentiality for applications that use database management systems. Next, I will summarize the crux of CryptDB as I understand it.

CryptDB works by executing SQL queries over encrypted data using a collection of efficient SQL-aware encryption schemes. Before CryptDB, there were several approaches to the security problem in DBMSs. One approach to reducing the damage caused by server compromises is to encrypt sensitive data; however, most applications cannot adopt this method. Another approach is to consider theoretical solutions such as fully homomorphic encryption, which allows servers to compute arbitrary functions over encrypted data while only clients see decrypted data, but which is still far too slow to be practical. CryptDB explores an intermediate design point between these two. It leverages the typical structure of database-backed applications, consisting of a DBMS server and a separate application server. CryptDB’s approach is to execute queries over encrypted data, and the key insight that makes it practical is that SQL uses a well-defined set of operators, each of which can be supported efficiently over encrypted data.

CryptDB focuses on two challenges. The first lies in the tension between minimizing the amount of confidential information revealed to the DBMS server and the ability to efficiently execute a variety of queries. The second is to minimize the amount of data leaked when an adversary compromises the application server in addition to the DBMS server. To address these challenges, CryptDB uses the following ideas. First, it executes SQL queries over encrypted data using a SQL-aware encryption strategy. Second, it uses an adjustable query-based encryption technique. Third, it chains encryption keys to user passwords, so that each data item in the DBMS can be decrypted only through a chain of keys rooted in the password of a user with access to that data. From the experiments, we can see that CryptDB achieves good performance without too much overhead.
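The third idea, chaining keys to user passwords, can be sketched as follows. This is a simplified illustration under my own assumptions: the helper names (`derive_user_key`, `wrap`, `unwrap`) and the XOR-based key wrapping are hypothetical and much weaker than a real key-wrapping scheme; only `hashlib.pbkdf2_hmac` is a real standard-library call.

```python
import hashlib
import os


def derive_user_key(password: str, salt: bytes) -> bytes:
    """Root of the chain: a 32-byte key derived from the user's password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000, dklen=32)


def wrap(key_to_protect: bytes, wrapping_key: bytes) -> bytes:
    """Toy key wrap: XOR two 32-byte keys (illustration only, not secure practice)."""
    return bytes(a ^ b for a, b in zip(key_to_protect, wrapping_key))


def unwrap(wrapped: bytes, wrapping_key: bytes) -> bytes:
    # XOR is its own inverse, so unwrapping is the same operation.
    return wrap(wrapped, wrapping_key)


# Setup: password -> user key -> group key -> data key.
salt = os.urandom(16)
group_key = os.urandom(32)   # hypothetical key shared by a group of users
data_key = os.urandom(32)    # hypothetical key that encrypts one data item

stored = {
    "salt": salt,
    "wrapped_group_key": wrap(group_key, derive_user_key("s3cret", salt)),
    "wrapped_data_key": wrap(data_key, group_key),
}


def recover_data_key(password: str, store: dict) -> bytes:
    """Follow the chain from the password down to the data key."""
    user_key = derive_user_key(password, store["salt"])
    recovered_group_key = unwrap(store["wrapped_group_key"], user_key)
    return unwrap(store["wrapped_data_key"], recovered_group_key)
```

Only the wrapped keys and the salt are stored, so in this sketch an attacker who copies the store while the user is logged out cannot follow the chain without the password.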

Generally speaking, this is a nice paper with great insight. Its main technical contribution is CryptDB, a system that provides a practical and strong level of confidentiality in the face of two significant threats confronting database-backed applications. I think CryptDB has several advantages. First, I think it is the first practical system that is able to process queries in an encrypted DBMS. Second, it uses onions of encryption that compactly store multiple ciphertexts within each other in the database and avoid revealing weaker encryption schemes when they are not needed; this idea is quite pioneering in the field. Third, CryptDB is flexible: it is not a stand-alone system, so people don’t need to migrate their data to it. They can apply CryptDB on top of their existing DBMS and enjoy the features it provides, which makes it programmer-friendly.

However, I think CryptDB has several disadvantages. First, it improves security by sacrificing performance. As the authors say in the paper, it introduces 14.5%-26% extra overhead, and I think this is not a small issue for heavy OLTP workloads. Second, I think CryptDB is not safe enough if the adversary gains complete control of the application and DBMS servers: it cannot provide any guarantees for users who are logged into the application during the attack. I think this is something the authors need to improve in the future. Third, another concern of mine is whether CryptDB can make an impact in industry. Large companies all have security teams that build firewalls to make sure their servers cannot be easily attacked. Imagine that you want to steal data from Google or Amazon: it is impossible, because they have perfect security mechanisms that protect against both internal and external attacks. I think most companies would rather build more infrastructure to improve security than modify their DBMS and lose performance. This solution may work well for small or start-up companies; however, the security threats they face are relatively small.

Review 14

CryptDB is an enhancement for DBMSs that aims to improve security at a slight cost to performance (~26% reduced performance in experiments). It reduces the threat of DBMS servers getting compromised by using a proxy that encrypts and decrypts queries and data, so that everything on the DBMS is encrypted and, even if someone gains access, they cannot learn anything useful. It also provides SQL-aware encryption that increases efficiency (previous encryption-based DBMSs suffered from horrible performance). Adjustable query-based encryption is also provided: each value is wrapped in several layers of encryption, and the proxy adjusts which layer is exposed depending on what the queries require. This allows adjustable confidentiality for content, which can be leveraged for better performance while still providing strong security for the most vital information. Also, to ameliorate the situation where the proxy server is compromised, per-user secret keys are generated and stored instead of a single master secret key. This way, only logged-in users will be compromised, rather than everyone, if the proxy server is compromised.
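The run-time adjustment described above can be sketched as simple bookkeeping in the proxy. This is my own hypothetical illustration, not CryptDB's code: the class name `ColumnState`, the operator names, and the mapping from operators to layers are assumptions, and the layer ordering within each onion (RND outermost) is how I understand the paper's design.

```python
# Layers of each onion, outermost (strongest) first. Assumed ordering.
ONIONS = {
    "Eq": ["RND", "DET", "JOIN"],
    "Ord": ["RND", "OPE", "OPE-JOIN"],
}

# Hypothetical mapping from an operator class to the onion and the
# weakest layer that the server must see in order to evaluate it.
NEEDS = {
    "equality": ("Eq", "DET"),
    "equi_join": ("Eq", "JOIN"),
    "order": ("Ord", "OPE"),
    "range_join": ("Ord", "OPE-JOIN"),
}


class ColumnState:
    """Tracks how far each onion of one column has been peeled."""

    def __init__(self):
        self.depth = {onion: 0 for onion in ONIONS}

    def layer(self, onion: str) -> str:
        """Return the layer currently exposed to the server for this onion."""
        return ONIONS[onion][self.depth[onion]]

    def prepare(self, operator: str) -> list:
        """Peel only as far as the operator needs; return the layers removed.

        Layers are never re-added in this sketch, so a column that was
        lowered once stays lowered.
        """
        onion, needed = NEEDS[operator]
        target = ONIONS[onion].index(needed)
        peeled = ONIONS[onion][self.depth[onion]:target]
        self.depth[onion] = max(self.depth[onion], target)
        return peeled
```

For example, the first equality query on a column peels RND from its Eq onion, later equality queries peel nothing, and the Ord onion of the same column stays at RND until an order comparison is actually requested.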

The main strength of the paper is that it provides a reasonable level of increased security without the ridiculous loss of performance that previous systems had suffered from. There are still some vulnerabilities but they are very limited compared to what a no-encryption DBMS server would face.

One major weakness that I see is that, since everything is encrypted and decrypted by the proxy, a compromise of the proxy server loses the security benefits in the same way that a compromise of the DBMS server would in a regular DBMS. Users who are not logged in won’t be compromised, which is a slight advantage, but the others will be. Because of this, CryptDB seems to be more of a shift in responsibility than a real “solution” to the issue.