Review for Paper: 6-RAID: High-Performance, Reliable Secondary Storage

Review 1

This paper surveys RAID (redundant arrays of inexpensive disks).

The paper first introduces basic background on disks, including physical structure, data paths, technology trends, and performance factors/limitations.

The paper then introduces the motivations for data striping and redundancy: performance and reliability, respectively. Redundant disk array organizations can be distinguished by the granularity of data interleaving and by how the redundant information is computed and distributed across the disk array.
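To make the granularity point concrete, here is a minimal sketch (my own illustration, not from the paper; the names are hypothetical) of the address arithmetic behind striping: the striping unit determines how many consecutive blocks land on one disk before placement moves to the next.

    # Toy sketch of round-robin striping: map a logical block number to a
    # (disk, offset) pair. stripe_unit = consecutive blocks kept on one disk
    # before moving on; n_disks = array width.
    def locate(logical_block: int, n_disks: int, stripe_unit: int = 1):
        stripe = logical_block // stripe_unit        # which striping unit
        within = logical_block % stripe_unit         # offset inside that unit
        disk = stripe % n_disks                      # round-robin placement
        offset = (stripe // n_disks) * stripe_unit + within
        return disk, offset

    # With a fine-grained unit, consecutive blocks land on consecutive disks,
    # so one large request is spread over (and served by) all disks:
    assert [locate(b, 4)[0] for b in range(8)] == [0, 1, 2, 3, 0, 1, 2, 3]

A large striping unit instead keeps small requests on a single disk, trading single-request bandwidth for request-level parallelism.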

Then, different RAID levels are illustrated and compared:
- Level 0: non-redundant; best write performance (though not the best read performance), but a single disk failure results in data loss.
- Level 1: mirrored; uses twice as many disks as level 0; used where data availability and transaction rate matter more than storage efficiency.
- Level 2: memory-style ECC; Hamming codes make recovery cheaper than in level 1, requiring fewer redundant disks.
- Level 3: bit-interleaved parity; a single parity disk tolerates any single disk failure; used where high bandwidth, but not a high I/O rate, is required.
- Level 4: block-interleaved parity; data is interleaved across disks in blocks rather than bits; compared to level 3, small writes use a read-modify-write procedure, so the single parity disk can become a bottleneck since it is accessed on every write.
- Level 5: block-interleaved distributed-parity; eliminates level 4's hot-spot problem by distributing the parity across all disks; has the best small-read, large-read, and large-write performance (see the parity sketch after this list).
- Level 6: P+Q redundancy; can protect against up to two disk failures.
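As a concrete illustration of the parity-based levels (3 through 5), here is a minimal sketch, written by me under the usual single-erasure assumption, of how one parity block protects a stripe: parity is the bytewise XOR of the data blocks, so any one lost block is the XOR of the parity with the survivors.

    from functools import reduce

    # Parity for RAID levels 3-5 (toy): bytewise XOR across equal-sized blocks.
    def xor_blocks(blocks):
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    data = [b"disk0", b"disk1", b"disk2"]
    parity = xor_blocks(data)

    # Simulate losing disk 1, then rebuild it from the parity and survivors.
    rebuilt = xor_blocks([data[0], data[2], parity])
    assert rebuilt == data[1]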

Then, reliability models are discussed. There are three threats beyond a single disk failure: double disk failures, system crashes, and uncorrectable bit errors; designers should address all three to get a more reliable system. The authors also discuss how to avoid stale data on failure, how parity is regenerated after a system crash, how to operate efficiently with a failed disk, and how disk arrays should be connected to be more reliable.
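For flavor, the back-of-envelope reliability argument for a single-failure-tolerant array boils down to: data is lost only if a second disk in the same parity group fails before the first is repaired. A toy calculation in the spirit of the paper's analysis (all parameters below are made up, not the paper's):

    # Toy MTTDL (mean time to data loss) estimate for a single-failure-
    # tolerant array; assumed, illustrative parameters only.
    mttf_disk = 200_000      # hours: per-disk mean time to failure
    n_disks   = 100          # total disks in the array
    group     = 10           # disks per parity group
    mttr      = 1            # hours to repair/rebuild a failed disk

    mttdl = mttf_disk**2 / (n_disks * (group - 1) * mttr)
    print(f"MTTDL ~ {mttdl:,.0f} hours (~{mttdl / 8760:,.0f} years)")

Such astronomical point estimates are exactly why the paper then layers in system crashes, bit errors, and correlated failures, which dominate in practice.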

In general, this paper is well structured and clearly introduces the different RAID levels and design considerations. The idea I like most is that, when comparing RAID levels, the authors consider performance per dollar, which gives extra insight into system building.



Review 2


Semiconductor performance has improved rapidly in recent years, resulting in larger primary memory systems. Disk arrays meet the demand for larger secondary storage systems well by organizing multiple independent disks into a single large, high-performance logical disk.

The paper starts by introducing two classic disk-array techniques: striping across multiple disks to improve performance and redundancy to improve reliability. Seven Redundant Arrays of Inexpensive Disks (RAID) architectures are then presented and compared on performance, cost, and reliability. The paper closes with future research opportunities.

Some strengths of this paper are:
1. Redundant disk arrays protect data far longer than non-redundant ones, addressing the problem that large disk arrays are much more likely to suffer a failure than a single disk.
2. The paper provides a thorough introduction to disk technology, which helps readers with little background follow the later discussions.
3. Section 3.3.1 nicely discusses the comparison metrics before showing the actual results.
4. The discussion of future research topics is interesting and may lead to further optimizations of disk performance.

Some drawbacks of this paper are:
1. Adding redundancy slows down write performance in disk arrays.
2. System crashes cause parity inconsistencies in bit-interleaved and block-interleaved disk arrays.
3. Disk arrays are more vulnerable to correlated disk failures (e.g., from power surges); an initial disk failure may lead to a chain of disk failures.



Review 3

This paper presents a structured overview of disk arrays, describing seven basic disk array organizations along with their advantages and disadvantages and comparing their reliability, performance, and cost.

The wide performance gap between microprocessors and magnetic disks, the need for faster access to large datasets, and the trend toward large, shared, high-performance, network-based storage systems together motivate the development of disk arrays, which use parallelism across multiple disks to improve aggregate I/O performance.

However, since large disk arrays are highly vulnerable to disk failures, a number of different data-striping and redundancy schemes have been developed. To evaluate the options introduced by combinations of these schemes, this paper provides a systematic tutorial and survey of disk arrays and describes seven basic RAID organizations:

1) Nonredundant (RAID Level 0)
Pros: lowest cost, best write performance, high capacity
Cons: not the best read performance; least reliable, since there is no redundancy
Therefore, it is widely used in supercomputing environments where performance and capacity, rather than reliability, are the primary concerns.

2) Mirrored (RAID Level 1)
Pros: It provides high availability and a high transaction rate.
Cons: It has low storage efficiency (twice the disks of level 0).
It is used in database applications where availability and transaction rate matter more than storage efficiency.

3) Memory-Style ECC (RAID Level 2)
Pros: It offers higher storage efficiency than mirroring (the number of redundant disks grows only with the log of the array size) while still detecting component failures and recovering data.

4) Bit-Interleaved Parity (RAID Level 3)
Pros: It achieves failure detection and recovery more cheaply (a single parity disk), offers high bandwidth, and is simpler to implement than levels 4, 5, and 6.
Cons: It offers slightly lower read performance and cannot sustain high I/O rates, since every access involves all disks.

5) Block-Interleaved Parity (RAID Level 4)
Cons: The single parity disk can easily become a bottleneck, since it is involved in every write.

6) Block-Interleaved Distributed-Parity (RAID Level 5)
Pros: It has no parity bottleneck and offers the best small-read, large-read, and large-write performance.
Cons: Small write requests are somewhat inefficient compared with mirrored schemes because of the read-modify-write procedure.

7) P + Q Redundancy (RAID Level 6)
Pros: It has stronger error-correction capabilities, protecting against the simultaneous failure of two disks (see the toy sketch below).
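The paper specifies Reed-Solomon-style codes for P+Q but does not spell out a construction. Below is a toy sketch of one common variant (my illustration, not taken from the paper): P is the plain XOR of the data bytes, Q weights disk i by the field element 2^i in GF(2^8), and losing any two data disks leaves a solvable 2x2 system per byte.

    # Toy P+Q over GF(2^8) (illustrative construction, not from the paper).
    def gmul(a, b):                      # multiply in GF(2^8), polynomial 0x11d
        r = 0
        while b:
            if b & 1:
                r ^= a
            a <<= 1
            if a & 0x100:
                a ^= 0x11d
            b >>= 1
        return r

    def gpow(a, n):
        r = 1
        for _ in range(n):
            r = gmul(r, a)
        return r

    def ginv(a):                         # brute-force inverse; the field is tiny
        return next(x for x in range(1, 256) if gmul(a, x) == 1)

    data = [0x11, 0x22, 0x33, 0x44]      # one byte per data disk
    P = Q = 0
    for i, d in enumerate(data):
        P ^= d
        Q ^= gmul(gpow(2, i), d)

    # Disks 1 and 3 fail; recover both from P, Q, and the survivors.
    i, j = 1, 3
    Pp = data[0] ^ data[2]                                    # surviving XOR
    Qp = gmul(gpow(2, 0), data[0]) ^ gmul(gpow(2, 2), data[2])
    Pxy, Qxy = P ^ Pp, Q ^ Qp            # the lost disks' contributions
    dj = gmul(Qxy ^ gmul(gpow(2, i), Pxy), ginv(gpow(2, i) ^ gpow(2, j)))
    di = Pxy ^ dj
    assert (di, dj) == (data[i], data[j])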

The paper then compares the reliability, performance, and cost of these seven RAID organizations in terms of throughput per dollar at equivalent file capacity. Finally, it suggests future research directions: more published measurement results and experience, interactions among new disk array organizations (including new problems and possible simplifications or optimizations), the task of configuring and tuning arrays, handling the vulnerability caused by the increased software complexity of storage systems, issues introduced by massively parallel disk arrays, and reducing latency for lower-throughput workloads.

This paper summarizes disk arrays by first giving basic background on disks, then examining basic issues in the design and implementation of disk arrays, and finally suggesting future topics in the area. It explains each design and the intuition behind it in a clear and detailed way.

The main disadvantage of this paper is that it could give more real-life examples of how companies have used RAID in their products.


Review 4

The problem & motivations:
Back in the 1990s, microprocessors were improving much faster than storage systems. That improvement expanded the scope of applications, and those applications required high-performance, network-based storage. A structure called the disk array was proposed; however, the many extensions built on the basic disk array were scattered and unorganized, so a summary and comparison of those methods was needed.

Contributions:
The paper summarizes 7 implementations of the disk array and analyzes and compares them both qualitatively and quantitatively. It also sets a standard for how to evaluate these implementations along 3 axes (reliability, performance, cost).

Drawbacks:
Some parts are not clear enough and miss details about the implementations: for example, why the Reed-Solomon code in level 6 requires a minimum of two redundant disks, and why the level 5 scheme is fast on large write operations.




Review 5

The paper “RAID: High-Performance, Reliable Secondary Storage” by Chen et al. discusses the RAID (Redundant Array of Inexpensive Disks) scheme, which is designed to ensure high reliability and recoverability while maintaining or improving disk performance. RAID was originally devised in response to computing trends of the 1980s: at the time, microprocessor speed was increasing at a higher rate than disk access and transfer speeds. The natural solution for increasing performance, disk arrays in which a logical disk spans multiple physical disks, introduces additional challenges, notably a greatly increased chance of drive failure. In essence, RAID is a family of data-striping and redundancy schemes comprising seven levels (Levels 0 to 6). In order, these are the Non-Redundant, Mirrored, Memory-Style ECC, Bit-Interleaved Parity, Block-Interleaved Parity, Block-Interleaved Distributed-Parity, and P+Q Redundancy levels. The paper describes each level based on its defining attributes with respect to read/write performance, redundancy/recovery measures, etc. In general, the lower levels offer higher performance at the cost of little or no redundancy, while the higher levels offer the opposite tradeoff. For example, Level 0 does not use any redundancy and consequently has the best write performance (but not the best read performance, surprisingly enough). Besides covering the basics of these levels, the paper also compares their performance on I/O operations in terms of throughput, as well as their reliability. Additionally, it launches into a discussion of failure types beyond the simple single-disk failure and how they impacted the design considerations for RAID. Finally, there is a section on future research directions in this field, such as how to massively scale this technology for use with parallel computing.

The primary value of this paper is in providing a detailed, textbook-like description of how the RAID system works. Given RAID's ubiquity, knowing these details is valuable for engineers across a wide array of fields. The writing is straightforward and easy to follow, covering both the implementation details and the thinking behind their design. The inclusion of background material on how hard drives work at the beginning of the paper further helps ensure understanding. Overall, the authors achieved their goal of writing an effective survey that explains the details of RAID technology.

For the more well-informed, perhaps this paper is a little too basic and could have delved into lower-level details, but I do not view that as a weakness, since I believe it was not the original intention behind the paper.



Review 6

This article covers RAID (Redundant Arrays of Inexpensive Disks), a storage system in which data is stored in parallel across disks, which speeds up access times for small transactions and also raises transfer rates for large files. RAID systems use data striping, which distributes data over multiple disks for the sake of speed, together with redundancy for reliability. Implementing redundancy does, however, carry some overhead, since the duplicated data must be managed. This is important because it offers a low-cost, fast, and reliable data storage method.

We also looked into the different levels of RAID and analyzed their read/write performance: from level 0, which has the fastest writes (since it manages no redundant data) but not the fastest reads, to level 6, which must carefully manage its redundancy updates, namely via the read-modify-write procedure for operations on the data.

I did not like how the article at times seemed to repeat itself. Some parts are fundamental to RAID systems and understandably worth revisiting; however, the repeated information appeared in separate sections, such as the read/write challenges of redundant systems being discussed throughout the subsections of Section 3.

I liked how this paper offered a section for those unfamiliar with the anatomy of the RAID storage system. The paper also had a good distribution of images to illustrate the subject of the paper. The paper was very generous with the information that it gave to the reader.



Review 7

This paper provides an overall introduction to the disk array structure called RAID (redundant arrays of inexpensive disks). There are two main advantages to using RAID: 1) data striping, which allows multiple disk I/Os to execute in parallel; 2) improved reliability through redundant data, which makes recovery possible in case of failure.

The paper gives a comprehensive overview of the 7 levels of RAID (level 5 offers the best overall performance):
Level 0: non-redundant:
1) offers the best write performance, but not the best read performance.
2) data is striped to allow parallel access.
Level 1: mirrored:
1) always keeps a copy of each disk (allowing recovery in case of failure).
2) writes are performed on both disks.
Level 2: memory-style ECC:
1) uses Hamming codes for error correction.
2) the number of redundant disks is proportional to the log of the total number of disks.
Level 3: bit-interleaved parity:
1) data is interleaved bitwise over the data disks, and a single disk stores the parity information.
2) frequently adopted in applications requiring high bandwidth.
Level 4: block-interleaved parity:
1) data is interleaved across disks in blocks of arbitrary size.
2) the parity disk can become a bottleneck.
3) reads touch only one disk, while writes require four disk I/Os for the read-modify-write procedure (see the sketch after this list).
Level 5: block-interleaved distributed-parity:
1) parity is distributed evenly over all the disks.
2) best small-read, large-read, and large-write performance.
Level 6: P + Q redundancy:
1) protects against up to two disk failures.
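The four disk I/Os mentioned under level 4 come from the small-write parity update, which needs only the changed block, not the whole stripe: new_parity = old_parity XOR old_data XOR new_data. A minimal sketch (my own, using toy block values):

    # Toy read-modify-write for a small write in a block-interleaved array:
    # read old data, read old parity, write new data, write new parity.
    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    stripe = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor(xor(stripe[0], stripe[1]), stripe[2])

    old_data, new_data = stripe[1], b"bbbb"          # small write to one block
    parity = xor(xor(parity, old_data), new_data)    # 2 reads + 2 writes total
    stripe[1] = new_data

    assert parity == xor(xor(stripe[0], stripe[1]), stripe[2])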

This paper clearly introduces the motivations and the principles of different levels of RAID.


Review 8

The paper describes RAID, Redundant Arrays of Inexpensive Disks, acting as a survey of the technology, which was about 5 years old when the paper was written. These disk arrays were used as a cheap way to increase disk performance. Various RAID levels are described; RAID 0, the simplest, only increases capacity and performance, while the other levels also offer increased reliability. The system is made more reliable by introducing redundancy in the form of mirroring and parity. Each RAID level has its drawbacks, whether complexity, reliability, or speed; therefore, the application area must be considered when choosing a system.

Most importantly, RAID may be one of the first “scaling out” approaches to storage that has been explored, which has defined the direction of database systems in recent years.

I thought that the paper overall was a strong survey that clearly covered various aspects of the RAID architecture. The detailed descriptions of each RAID level give the reader a clear idea of the different approaches used in disk array implementations. Two pieces I thought were very well done were sections 3.4 and 3.5 on reliability and implementation considerations. When I read on page 147 that disk arrays with redundancy could last much longer than a single disk (23 years), I wondered a) how much longer and b) whether it was really true. Then on page 159, an equation presented a time to failure of 38 million years, which made me pretty skeptical. However, the following sections on system crashes, bit errors, and correlated disk failures clearly stated the potential problems with the system and the tradeoffs needed to improve reliability. This greatly increased my confidence in the system.

It was mentioned briefly in section 6.1, but I found myself pretty disappointed that no empirical data was presented regarding RAID performance and reliability. For a 5-year-old technology that seemed reasonably well adopted, the assertion that few papers had been published with empirical results (p. 179) was concerning. When putting important data in a new system of some complexity, I would want to know what had and had not worked well for others in the past; this paper did not present that information. If that information was not available, I wonder whether that signifies larger problems. The lack of this information was in stark contrast to the GFS paper, which I read directly afterward and which presents multiple pages of empirical data.


Review 9

This paper provides an overview of disk arrays, whose development was driven by the need for larger, higher-performance secondary storage systems. RAID is necessary to overcome the slow improvement of disk access times and to meet the needs of image-intensive applications, applications requiring faster access to larger datasets, and the like. The goal of the paper is to present a systematic tutorial and survey of disk arrays.

The paper first introduces two orthogonal concepts adopted by all disk array designs: data striping for improved performance and redundancy for improved reliability. It then concisely describes seven different RAID organizations, from the simplest, non-redundant level 0 to level 6, which can protect against up to two disk failures. The key differences between organizations are the granularity of data interleaving and how the redundant information is computed and distributed. For example, RAID level 3 is bit-interleaved with parity stored on a single disk, while RAID level 5 is block-interleaved and distributes the parity over all disks.

Then the paper conducts a comparison among all the organizations based on performance, cost, and reliability. For performance and cost, the authors carefully chose the metric (performance normalized by cost) and what to hold equivalent (file capacity) when comparing systems. A table summarizes the performance per dollar for each RAID level and I/O type, along with graphs showing the relationship between performance and parity group size.
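A stripped-down version of that normalization (my toy model with assumed costs, far cruder than the paper's tables): if each disk sustains a fixed number of small I/Os per second and a scheme spends k disk I/Os per logical small write, then every disk contributes, and throughput per dollar scales as 1/k regardless of array size.

    # Toy per-dollar comparison for small writes (assumed costs; the paper's
    # tables are more careful). Disks are the unit of cost.
    iops = 100                           # per-disk small I/Os per second (assumed)
    ios_per_small_write = {
        "level 0 (non-redundant)": 1,
        "level 1 (mirrored)":      2,    # write both copies
        "level 5 (distributed)":   4,    # read-modify-write: 2 reads + 2 writes
    }
    for scheme, k in ios_per_small_write.items():
        print(f"{scheme}: {iops / k:.0f} small writes/sec per disk-dollar")

This reproduces the rough 1 : 1/2 : 1/4 relationship for small writes per dollar.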

In terms of reliability, the authors mainly focus on the effects of system crashes, uncorrectable bit errors, and correlated disk failures. Their calculations show that any combination of these problems greatly shortens the MTTDL (mean time to data loss) and increases the chance of losing data.
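To see why uncorrectable bit errors matter, consider that rebuilding a failed disk must read every surviving disk in its group, and a single unreadable sector anywhere loses data. Rough arithmetic with assumed, era-appropriate numbers (mine, not the paper's):

    # Rough chance of hitting an uncorrectable bit error during one rebuild.
    ber        = 1e-14       # uncorrectable errors per bit read (assumed)
    disk_bytes = 2 * 10**9   # ~2 GB disks (assumed, roughly 1994-era)
    group      = 10          # disks per parity group

    bits_read = (group - 1) * disk_bytes * 8
    p_error = 1 - (1 - ber) ** bits_read
    print(f"P(bit error during reconstruction) ~ {p_error:.2%}")   # ~0.14%

The probability grows linearly with disk capacity and group size, which is why this failure mode gets treated on par with double disk failures.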

In the rest of the paper, the authors introduce implementation considerations, such as how to avoid stale data and how to regenerate parity after a system crash, as well as some examples of future work.

In my opinion, this is a very good paper for anyone who wants to learn RAID. It is very easy to read, as it focuses on high-level design ideas instead of implementation details and provides lots of background knowledge about disks and data paths. The only downside is that this is a very old paper, and I am not sure whether there have been updates to the designs it describes.



Review 10

In the paper "RAID: High-Performance, Reliable Secondary Storage", Peter Chen introduces a new disk array architecture called RAID, redundant arrays of inexpensive disks. He also discusses past work done on disk arrays in order to organize and provide a framework for past and future work. He notes that improvements in microprocessors will only boost computation by a small amount unless secondary storage is also accompanied by an improvement. This discrepancy can be seen with microprocessors improving 50% per year when mechanical systems are only improving 10% per year. Disk arrays employ two tactics in order to combat this slow improvement - data striping and redundancy. Data striping distributes data transparently over multiple disks in order make them seem like one fast and large disk. However, we also have the problem where the more disks that are introduced in the disk array, the better the performance, at the cost of reliability. Thus, it becomes clear that redundancy is needed in order to improve the reliability and tolerate disk failures - this way data is not lost. RAIDS takes these approaches and combines the concepts into multiple approaches.

This paper gives an overview of seven RAID organizations and offers some insight into their performance and some possible improvements. These are the seven RAID organizations:
Non-redundant (Raid Level 0)
This has the lowest cost and does not employ redundancy. It has the best write performance and is usually used in supercomputing scenarios where performance is preferred over reliability.

Mirrored (Raid Level 1)
This uses twice as many disks as the previous approach. Data that is written is also written to a redundant disk for better reliability. When data is read, it can be fetched from whichever copy can serve it faster. This is used in fields where transaction rates are valued more than storage efficiency.

Memory Style ECC (Raid Level 2)
This uses Hamming codes to reduce the redundancy in the system: the number of redundant disks is proportional to the log of the total number of disks. The code identifies which disk has failed, so that its contents can be recovered.
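The log relationship here is just the Hamming bound: with c check disks there are 2^c distinct syndromes, which must be enough to name any single failed disk (data or check) or "no failure." A small sketch of that arithmetic (my illustration, not the paper's exact parameters):

    # Minimum check disks c for d data disks: need 2**c >= d + c + 1.
    def check_disks(d: int) -> int:
        c = 0
        while 2**c < d + c + 1:
            c += 1
        return c

    for d in (4, 10, 25, 100):
        print(f"{d} data disks -> {check_disks(d)} check disks")
    # 4 -> 3, 10 -> 4, 25 -> 5, 100 -> 7: relative overhead shrinks as arrays grow.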

Bit Interleaved Parity (Raid Level 3)
Unlike the previous method, this uses a single parity disk to recover information instead of multiple check disks. Consequently, each read and write accesses all the data disks, so only one request can be serviced at a time. It is therefore suited to applications that require high bandwidth but only low I/O rates.

Block Interleaved Parity (Raid Level 4)
Unlike the previous method, the data is interleaved across disks in blocks of arbitrary size rather than in bits. The block size is called the striping unit, and reads smaller than the striping unit access only a single disk. Writes, on the other hand, require four disk I/Os, which can make the parity disk a bottleneck; as a result, RAID level 5 is generally preferred.

Block Interleaved Distributed Parity (Raid Level 5)
In order to solve the previous issue, the parity is uniformly distributed across all disks. This allows all disks to help with read operations and drastically improves small-read, large-read, and large-write performance. It does fall short, however, on small writes.

P+Q Redundancy (Raid Level 6)
This operates very similarly to the previous level but also uses Reed-Solomon codes to protect against up to two disk failures. Furthermore, small write operations now take a total of six disk operations rather than the previous four.

Although this paper was informative, there were some drawbacks. In particular, I did not see any discussion of scaling these models so that they work in industry. Usually, the entire point of researching or implementing something is that it will some day be used by others on a daily basis, and the lack of data backing the popularity of these schemes is questionable. Second, I found Chen's estimation of reliability quite confusing: since reliability is determined by many factors external to disk arrays, the estimates don't seem useful or relevant.


Review 11

This paper summarizes the implementation and the various advantages of the different levels of RAID, a family of systems for storage redundancy. Large amounts of data are often stored across several disks, which makes it likely that some disk will fail. To correct for this, RAID introduces a measure of redundancy so that data on any one failed disk can be recovered. Data is protected in one of several ways:

RAID 0: No data duplication
RAID 1: Duplicate every disk
RAID 2: Use Hamming codes over the disks to identify a failed disk and recover its bits
RAID 3: Use a single extra disk, which keeps track of bitwise parity on all other disks
RAID 4: Use a single extra disk, which keeps track of blockwise parity on all other disks
RAID 5: Same as RAID 4, but the parity blocks are spread among all disks rather than kept on just one (see the mapping sketch after this list)
RAID 6: P+Q parity using Reed-Solomon codes to protect against up to 2 simultaneous disk failures.
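As a sketch of how RAID 5 spreads the parity (one simple rotation; the paper surveys several placements, and real left-symmetric layouts also rotate the data), each stripe puts its parity on a different disk, so no single disk absorbs every parity update:

    # Toy RAID 5 parity rotation (one of several placements; names are mine).
    def stripe_layout(stripe: int, n_disks: int):
        parity_disk = (n_disks - 1 - stripe) % n_disks
        data_disks = [d for d in range(n_disks) if d != parity_disk]
        return parity_disk, data_disks

    for s in range(5):
        p, ds = stripe_layout(s, 5)
        print(f"stripe {s}: parity on disk {p}, data on disks {ds}")
    # Parity lands on disks 4, 3, 2, 1, 0, ... eliminating RAID 4's hot spot.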

The differences among these levels lie in storage overhead, file read and write times, and cost. Cost and performance can be combined by measuring I/Os per second per dollar while keeping effective storage constant.

Single disk failures aren’t the only problem. Multiple simultaneous disk failures are possible, and only RAID 6 protects against them. When disks are reconstructed, uncorrectable bit errors can arise. In addition, system crashes can make parity inconsistent, which is worse for systems like RAID 6 that maintain more parity.

RAID systems also require some metadata tracking. When a disk fails, its data should be marked invalid before it is accessed; when the disk is reconstructed, it must be marked valid again before any writes that would change its parity. Parity sectors should be marked inconsistent before every write (and consistent again once the write completes), and any sectors still marked inconsistent must be regenerated after a system crash.
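A runnable toy of that ordering (all names are hypothetical; the paper states the invariants, not an API): the consistency mark is set before any disk is touched and cleared only after both data and parity have landed, so after a crash, recovery knows exactly which parity to regenerate.

    from functools import reduce

    def xor_all(blocks):
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    class ToyArray:
        def __init__(self, stripes):
            self.data = stripes                      # stripe_id -> list of blocks
            self.parity = {s: xor_all(bs) for s, bs in stripes.items()}
            self.inconsistent = set()                # would live on stable storage

        def small_write(self, s, i, new):
            self.inconsistent.add(s)                 # 1. mark before touching disks
            old_data, old_parity = self.data[s][i], self.parity[s]
            self.data[s][i] = new                    # 2. write the new data
            self.parity[s] = xor_all([old_parity, old_data, new])  # 3. new parity
            self.inconsistent.discard(s)             # 4. clear the mark last

        def crash_recovery(self):
            for s in list(self.inconsistent):        # regenerate suspect parity
                self.parity[s] = xor_all(self.data[s])
                self.inconsistent.discard(s)

    arr = ToyArray({0: [b"AAAA", b"BBBB", b"CCCC"]})
    arr.small_write(0, 1, b"bbbb")
    assert arr.parity[0] == xor_all(arr.data[0])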

The paper is able to give a summary of the common types of data redundancy and to compare them effectively. It notes which RAID levels are effectively irrelevant because they have been surpassed by other levels, like how RAID 5 is just RAID 4 with the parity blocks spread out for efficiency. It is generally quite good at going over the upsides and downsides of the different levels, with regard to I/O efficiency as well as how they handle disk failures, system crashes, and other issues.

On the negative side, it includes several equations for the average time to data loss in a variety of situations. While these are important for comparing the RAID levels, the justification for each equation isn't given, so the reader can't easily evaluate their correctness.



Review 12

The paper mainly discusses disk arrays, which were developed to adapt to the increasing gap between microprocessors and magnetic disks and to meet the requirement of faster access to large datasets. Disk arrays developed into seven major forms, differing mainly in data striping and redundancy. The core of the paper measures the differences in performance, cost, and reliability among these structures.
RAID level 0 is the original version without redundancy: it writes efficiently but does not offer the best read performance and suffers a great risk of data loss on failure. RAID level 1 takes the naïve strategy of applying redundancy by simply copying the information, requiring double the disks. RAID level 2 reduces the number of disks required for recovery by using Hamming-code check disks. Bit-interleaved parity (level 3) goes further and reduces the check disks to a single parity disk by relying on the disk controller to identify the failed disk. Block-interleaved parity (level 4) adopts coarser data striping, and it develops into block-interleaved distributed-parity (level 5), which typically performs better on small reads, large reads, and large writes. P+Q redundancy (level 6) is much like block-interleaved distributed parity but far more reliable.
The evaluation of disk array performance is normalized by cost. The comparison of the various redundancy schemes shows that as RAID developed, the performance of the newer levels came closer to that of RAID level 0 while offering other benefits such as higher reliability and better storage efficiency. The reliability evaluation considers scenarios such as system crashes, uncorrectable bit errors, and correlated disk failures in block-interleaved redundant disk arrays; P+Q disk arrays perform far better in reliability.



Review 13

The author starts off by explaining that the motivation for Redundant Arrays of Inexpensive Disks is the widening performance gap between microprocessors and magnetic disks. Microprocessors improve at a much higher rate, which makes new applications possible. In this situation, an array of disks seems a natural solution. The author also points out that disk failures are more likely with an array of disks, so redundancy is needed to tolerate failures. The rest of the paper assesses seven data-striping and redundancy schemes by comparing their reliability, performance, and cost. The paper has a section explaining related terminology before going into the assessment, which is very helpful for readers with no background in disks. The following section introduces the seven basic RAID organizations, focusing mainly on how each scheme distributes redundant information and what it costs to recover from failures. When comparing these schemes on reliability, performance, and cost, the author makes the point that metrics are hard to agree upon, since they depend on the specific application. Even though there is no general standard for comparison, the paper still reaches some conclusions: system cost is proportional to the number of disks, and P+Q redundant disk arrays are very effective in protecting against both double disk failures and unrecoverable bit errors but are susceptible to system crashes. Lastly, the paper discusses how to avoid stale data, how to regenerate parity after a system crash, and other implementation considerations.

Overall, this is a very comprehensive paper. The author builds readers a good foundation for understanding the technical details, and the figure showing the differences between the seven models aids understanding. The performance metrics substantiate the author's point that there is no general standard for evaluating the models.

One thing I wish this paper spent more time on is the evaluation of performance: the author covers a lot of technical detail on the reliability parameters but not nearly as much on performance.



Review 14

“RAID: High-Performance, Reliable Secondary Storage” by Peter M. Chen et al. provides an overview of Redundant Arrays of Inexpensive Disks (RAID) levels/approaches and their tradeoffs with regard to performance, cost, and reliability. The motivation for RAID is to store redundant data, via parity or other error correcting codes (ECC), so that if a disk failure occurs, no data is lost. The paper gives an overview of the 7 RAID levels, discussing their architectures (mirroring vs error correcting codes, interleaved vs not interleaved, block-level vs bit-level striping, distributed vs not distributed parity data) and tradeoffs with regard to performance (e.g., for write and read operations, the number/set of disks that need to be accessed and whether operations can happen in parallel), cost (e.g., storage/disk space needed), and reliability (e.g., against single disk failure, double disk failure, system crash, uncorrectable bit errors, and combinations of these). Choice of RAID level varies from application to application, as there are different priorities and requirements, but RAID levels 5 and 6 are commonly looked favorably upon.

As a non-expert on hardware and disk architecture, I found the paper to be generally accessible, and appreciated the layperson explanations of each RAID level. I also felt that the paper did a good job presenting performance, cost, and reliability tradeoffs, discussing both within each RAID level explanation, and then more generally afterward.

Although the paper suggests that RAID levels 5 and 6 are commonly seen as favorable and used in real systems, it took some care to find this information in the paper. There was not a general conclusion in the abstract or conclusion about which RAID levels are most common. Of course for different use cases, different RAID levels are better, but I think there still could be a better summary of which RAID levels I as a reader should most consider. Additionally, perhaps a table presenting advantages and disadvantages for different use case requirements could have been helpful.


Review 15

This paper first introduces disk technology and the reasons disk arrays became popular: performance and reliability. After that, the paper describes RAID levels 0 through 6, including each level's concept, pros and cons, and appropriate scenarios.

The introduction and background sections give readers unfamiliar with the disk/storage area the basic knowledge needed, which is important for understanding what is discussed in Section 3.

Section 3 is the most informative section to me. Data striping and redundancy are introduced first; they are the key concepts behind the performance and reliability of disk arrays. Then RAID levels 0 to 6 are described in detail. The evaluation of the different RAID organizations is also well done, covering the three primary metrics (reliability, performance, and cost). What really attracts me is that the paper gives many different ways to measure each metric, from different perspectives. I think it is very important to define a proper metric when tackling any problem.

I got lost in Section 3.4; the reliability measures were a bit difficult for me to understand. But the descriptions of uncorrectable bit errors and correlated disk failures are really interesting, since in the real world, errors and failures happen for all sorts of reasons. Section 3.5 is also very useful for engineers implementing RAID; it offers practical suggestions that are valuable in the field.

In general, this paper is a great review of RAID technology. It is well written and well structured, and its use of tables and figures really helped me understand.

The only drawback for me is that it is not clear why the reliability measures (Section 3.4) are defined the way they are; that is why I got lost in that section.


Review 16

This is a survey summarizing disk array techniques (RAID) and performance comparisons among them. In the past few decades, CPU speed has grown very fast every year; however, disk performance has not improved as quickly and has thus become a bottleneck of modern computer systems. This paper aims at this problem, using different kinds of redundant arrays of inexpensive disks (RAID) to improve disk performance in several respects. The problem is valuable and important for modern computer systems because a high data rate, a high I/O rate, good reliability, and low latency are all desired properties of a disk system. Moreover, different scenarios require different qualities: for example, scientific computation needs a high data rate, while online transaction and storage workloads may need a high I/O rate and good reliability. RAID is a good solution, since different levels fit different use cases. This paper provides an overview of RAID techniques, including how RAID levels 0-6 work, comparisons among the levels, reliability analysis, and practical implementation advice.

The main contribution of this paper is that it gives a nice introduction to RAID in simple words; I think it is very easy to understand even for people with little experience in disk technologies. It discusses the seven RAID levels (0-6) and compares them on performance and cost metrics, pointing out not only the advantages of each level but also the potential drawbacks. For performance evaluation, they give an objective assessment using metrics like I/Os per second per dollar. More importantly, when considering reliability, instead of using only a basic formula, they discuss additional factors like uncorrectable bit errors and correlated disk failures, which makes the reliability analysis more convincing. They also provide implementation guidance that is very helpful for engineers building such systems.

I think this is a good summary paper for the RAID technologies of 1994. However, the thing I don't like about this paper is that it is old. Disk technology has evolved quickly over the past 20 years, including innovative products like SSDs that greatly improve performance over traditional HDDs, so a more recent, state-of-the-art review might be needed. Besides, the paper gives several formulas and calculations but does not explain in detail how the formulas were derived or how the evaluations were carried out; it would be better if more detail were provided.



Review 17

This paper is a survey of the disk array storage systems called RAID. It gave an overview of data striping and redundancy and discussed how both elements are orthogonal yet necessary for storage systems that must be reliable while still offering good performance. It discussed the 7 different types of RAID storage, levels 0 to 6, and analyzed the strengths and weaknesses of each level, using reliability, performance, and cost as its metrics. Special attention was given to parity-oriented strategies for implementing efficient redundancy, and several variations were mentioned along with their strengths and weaknesses.
As this was purely a survey paper, there were no “new” contributions; in fact, it reads more like a textbook chapter than a research paper. Because of this, it is difficult to say it had any major “weaknesses,” especially as I found it quite easy to read. In my opinion it was well written; it never felt overly dense, and the figures were helpful and in good quantity.