Review for Paper: 3-Operating System Support for Database Management

Review 1

The paper “Operating System Support for Database Management” by Michael Stonebraker discusses services provided by operating systems for application programs, and how special needs of databases require working around limitations of those services. This article gives insight into the features of a DBMS, including how they differ in requirements from traditional operating system services. The paper suggests how UNIX and other operating systems could better support DBMS needs by altering the services they provide to fit database requirements.

More specifically, the paper discusses the needs of a DBMS for buffer pool management, storage access, and scheduling. Operating systems provide facilities for these needs, but these services are not exactly what DBMSs must have for efficient and correct operation. For example, UNIX’s buffer pool manager could interfere with crash recovery during a transaction, by marking a page in memory for lazy write to disk instead of forcing it out to disk immediately, potentially causing a transaction log not to be written as expected. UNIX’s buffer page replacement strategy is least-recently-used, which is inefficient for pages that are accessed cyclically. A DBMS can use its knowledge of query logic to handle buffer pool prefetching and replacement more effectively than the OS could.

The paper is a review and comment piece discussing the author’s observations on shortcomings of operating system support for DBMS needs. The paper contributes detailed observations on (1) the shortcomings of OS services for DBMS use, (2) alternative methods DBMSs use to work around OS service limitations, and (3) ways future OSs could better support DBMSs, which would remove the need for DBMSs to duplicate OS features. For example, the author proposes that future OSs allow a DBMS to offer prefetching advice and page replacement strategy advice to the OS buffer pool manager, which would allow the DBMS to use the OS manager instead of a custom-made substitute.

The article does not make a clear case for one solution to the mismatch between OS services and DBMS needs. The author briefly suggests that DBMSs be run on a real-time operating system, due to the lack of overhead from unused system services. In other places, the author advocates adding complexity to existing OS components such as the scheduler or buffer pool manager, to accommodate DBMS needs. However, either solution might be preferable to the current duplication of OS services by commercial DBMSs.


Review 2

In this three-and-a-half-decade-old research paper, Mr. Michael Stonebraker wittingly explains the problems faced by DBMS designers at that time due to the limitations posed by the operating systems of that era. The examples in the paper are taken primarily from the UNIX operating system and the INGRES RDBMS, which ran on it.

The author starts by addressing the problems caused by buffer pool management. He says that the operating systems at that time had different techniques to deal with it, such as Least Recently Used (LRU) replacement and prefetch. Even though LRU replacement is taken as the best practice for buffer management, it is only marginally good in a DBMS environment. Similarly, OS prefetch isn’t a good option, since predicting which block the DBMS will use next is impractical and sometimes impossible for the OS. He then sheds some light on the crash recovery service offered by the DBMS but suggests that even this issue should be addressed by the OS designers.

A brief introduction to the file system provided by UNIX is given in the paper. It also elaborates on the problems faced when accessing such systems. Physical contiguity is one such problem: the blocks of a file are not necessarily stored close to each other physically, but a DBMS accesses a lot of data sequentially, causing large overheads. Storing information in a tree-structured file system also doesn’t help, since the DBMS has to store its own trees on top of it, which adds large overheads. The author suggests that OS designers should provide DBMS facilities as lower level objects and character arrays as higher level ones.

The paper addresses scheduling problems, which interact with locking, and how performance is affected by using a server DBMS structure (which uses a message passing system for communication) rather than a process-per-user structure (in which each user gets a separate DBMS process). According to the author, both models are flawed since the overheads involved reach undesirable levels.

The author addresses the issue of consistency control, which is provided by lock services and recovery techniques. He insists that such techniques should inherently be a part of the OS, since a DBMS implementation of the same will incur additional costs. Disk management should also be taken care of, and it is suggested that a shared buffer space is the way to go since it avoids redundancy.

Lastly, the Paging system is addressed. The author says that if the files are bound into a user’s paged virtual address space, it will become convenient. But it will have to be bound and unbound for various tasks, which will increase the cost related to its transactions. Large files are associated with large access overheads and the buffering also creates problems in the paged virtual memory context as addressed before.

The author, in the end, says that all these problems should be addressed in the next iterations of the then-famous operating systems, since DBMS implementations of the same services don’t help address performance issues. It is good to know that all the problems were addressed in the current versions of various operating systems, and it is a little fascinating to know that what we take for granted now were just ideas decades ago.


Review 3

This survey presents some incompatibilities between operating systems and database systems, which arise from their different interests. The OS provides a framework for general applications, which trades away performance for specific applications such as database systems. The author points out these conflicts and provides some insight into the tension between operating systems and databases.

The first part covers buffer pool management, which provides a main memory cache for the file system. Least recently used (LRU) replacement is good under locality of reference: if an object is used a lot, it stays in the cache. But the access pattern of a database is not aligned with this assumption, and the author gives four patterns for INGRES databases:

1. sequential access to blocks which will not be rereferenced;
2. sequential access to blocks which will be cyclically rereferenced;
3. random access to blocks which will not be referenced again;
4. random access to blocks for which there is a nonzero probability of rereference.

Only case 4 matches the use case for LRU. In cases 1 and 3, caching only adds overhead. Finally, LRU is worst for case 2 when the cycle is larger than the buffer pool: even though there are lots of rereferences, none of the data referenced is still cached when it is needed again. The author concludes that the buffer manager should accept advice from the application on eviction policy. The same problem applies to the prefetch mechanism.
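
The behavior of LRU under case 2 can be checked with a small simulation. This is only a sketch under assumptions that are mine, not the paper's: the function name simulate, the pool size of four blocks, the five-block cycle, and the use of MRU (evict the most recently used block) as the comparison policy are all illustrative choices.

```python
from collections import OrderedDict

def simulate(policy, capacity, references):
    """Return the number of buffer hits for a reference string.

    policy is "LRU" (evict least recently used) or "MRU" (evict most
    recently used); both are illustrative choices for this sketch.
    """
    pool = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for block in references:
        if block in pool:
            hits += 1
            pool.move_to_end(block)  # mark as most recently used
            continue
        if len(pool) >= capacity:
            # last=False pops the oldest entry (LRU); last=True the newest (MRU).
            pool.popitem(last=(policy == "MRU"))
        pool[block] = True
    return hits

# Case 2: five blocks scanned cyclically ten times through a four-block pool.
cyclic = list(range(5)) * 10
lru_hits = simulate("LRU", 4, cyclic)
mru_hits = simulate("MRU", 4, cyclic)
print("LRU hits:", lru_hits, "MRU hits:", mru_hits)
```

With a cycle one block larger than the pool, LRU always evicts exactly the block that will be needed next, so it never scores a hit, while a policy that evicts the most recently used block keeps most of the cycle resident. This is the kind of knowledge about the access pattern that the application has and the OS does not.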

In the second part, the file system is discussed. The current UNIX file system treats a file as an array of characters. A database, on the other hand, provides an abstraction where users' keys map to records. Constructing a database on top of the OS file system is not always efficient, for the following reasons. First, a file might be scattered over the disk and lose physical contiguity. Second, there are three tree structures: the file control block tree, the directory tree, and a further tree that a DBMS such as INGRES adds for keyed access via a multilevel directory structure.

The third part considers scheduling and process management. The author provides four ways to organize a multiuser database system:

1. one process per user: expensive context switching
2. single process as a server: replication of OS services
3. server pool structure: share a lock table and are slightly different from the process per user model
4. disk server structure: trades messages for task switch.

In this paper, Stonebraker mainly discusses how an operating system can be more friendly to database applications, and claims that operating system design should be more sensitive to database needs. I think it's an inevitable trade-off between generality and specificity.




Review 4

This paper discusses whether several operating system services are appropriate for supporting database management, and especially indicates how the services are not appropriate. The author examines the operating system services, points out the issues or problems, and offers suggestions for improvement where possible. The operating system services include buffer pool management; the file system; scheduling, process management, and interprocess communication; consistency control; and paged virtual memory.

The problem here is that conventional operating systems often provide inappropriate services for database management, and this paper presents many of the problems found. Take buffer pool management for example: LRU (Least Recently Used) replacement is generally considered a good method, but it might not work well for some database access patterns, such as sequential access to blocks which will not be re-referenced. The author suggests that the OS needs a way to accept a replacement strategy from an application program (like a DBMS) to get better performance. There are many other problems found, and therefore the author suggests that operating systems should provide better services for database management and that future OS designers become more sensitive to what a DBMS needs.

The main contribution of this paper is that it alerts OS designers to how OS services are wrong or inappropriate for a DBMS. As mentioned above, these problems obviously stand in the way of the development of DBMSs, and a DBMS prefers an efficient OS with the desired services. For example, with a more suitable OS, we don't need to implement multitasking, scheduling, and a message system (a “mini” OS) entirely in user space in addition to a DBMS.

An interesting observation: as the author mentions at the end of the paper, there are so-called real-time OSs which provide minimal facilities, which is close to what the author suggests. The author hopes that future OSs will provide both sets of services in one environment. This is a good idea, but I wonder whether we need to keep a conventional OS separate from a small, efficient OS that provides the desired services to a DBMS. It would be great if we can achieve both in one environment; however, if that is not possible, what is the disadvantage of developing two different OSs separately?


Review 5

In this paper, Stonebraker looks at services that an operating system should provide for a DBMS and examines some operating systems popular at that time (UNIX-based OS’s) to indicate whether the provided services are appropriate for DBMS functions or not. The services that the author investigates include buffer pool management, the file system, scheduling, process management, interprocess communication and consistency control.

The results of the study for each service are summarized below.

BUFFER POOL MANAGEMENT
No OS provides appropriate buffer management for database needs. So, most DBMS’s do not use the buffer pool management service provided by the OS. Instead, they maintain a buffer pool in user space and manage it with a DBMS-specific algorithm.

THE FILE SYSTEM
In UNIX, the file system supports files which are arrays of characters and have variable size. While this approach is useful for other applications such as language processors and editors, it is not desired by DBMS. The author thinks that operating systems should support DBMS objects in lower levels and character arrays on higher levels.

SCHEDULING, PROCESS MANAGEMENT, INTERPROCESS COMMUNICATION
Two models for the scheduling problem are discussed, neither of which is attractive enough for a DBMS because they have a huge overhead in task switching and messages. In the author’s view, if OS designers do not make these facilities faster for databases, DBMS’s will continue implementing their own versions of them in user space. The author suggests creating a new scheduling class for DBMS processes and treating them differently.

CONSISTENCY CONTROL
It is possible for an OS to provide proper consistency control. However, if a DBMS implements its buffer management in user space, there will be some performance costs.

In conclusion, we can see that at least the multipurpose OS’s of that time (1980s) were not very good at satisfying DBMS needs. Most of the time, a DBMS does not use OS facilities because those facilities are not fast enough for its needs. I think the problem is that UNIX was not built with DBMS functionality and performance as the most important thing in the minds of its creators. A multipurpose operating system should work acceptably well for all types of tasks. In addition to these multipurpose OS’s, we can have DB-specific OS’s for serious DB-related work.


Review 6

This paper surveys the various functionalities that operating systems already provide, and examines how well these functionalities work for traditional database workloads. The first of these is buffer pool management, which does not work well for databases because the OS does not know which blocks to prefetch into memory, while the DBMS does. Therefore, a DBMS buffer manager has to be created to run in user space, to circumvent the OS's version. The next issue is the storage of files on disk. The author notes that the file hierarchy does nothing for a DBMS, and DB developers must create their own indexes over flat character arrays. The author then moves on to scheduling and process management. Because DBMSs manage their own locks to maintain consistency, they must also handle their own scheduling to avoid deadlocks or any other issues. The performance and other costs of replicating these facilities leave quite a bit to be desired, but this is the best current option.

The paper does a great job of showing how much of the functionality of an OS is needed in a DBMS. For each issue addressed in the paper, the author examines the feature and how it performs for a DBMS, and then shows what workarounds DBMS practitioners have had to come up with. The author does a good job of showing how the performance requirements, and other subtle differences in how the system is used, make the facilities provided by the operating system useless. Essentially, the author shows us that general-purpose operating systems (as the name would suggest) are not very good for specialized, data-intensive, high-performance systems such as a modern DBMS.

The author says, especially in the file system section, that perhaps OS designers should contemplate providing DBMS facilities as lower level objects. However, the paper does not provide any further discussion of how this could be done, or what lower level facilities are needed. I had a hard time seeing how OS designers could provide DBMS facilities as a lower level and character arrays as a higher level - a character array seems to be as simple as it can get. I also don't think operating systems developers care much for DBMS developers - DBMS devs have a specific use case, while the OS devs are looking for good average performance for a variety of different uses.



Review 7

The problem addressed by this paper is that operating system services should be improved to work more appropriately with a DBMS. The problem is important because the interaction between the operating system and the DBMS can affect computing performance greatly. The situation can be improved if OS designers think more about supporting DBMSs.

The approach proposed by the paper is that future operating systems should be able to provide different services for different users. That is to say, an OS should remain general-purpose for all users, but also offer a more compact set of services to meet the needs of a DBMS. This is because almost none of the current operating systems can support a DBMS effectively. 1) For buffer pool management, since accessing the OS buffer pool costs too many instructions and the prefetch algorithm does not work well, most DBMSs use a self-managed buffer pool to improve performance. 2) For the file system, the character array is not a good object for a DBMS, and operating systems should be more flexible about the objects used by the DBMS. 3) For scheduling and process management, the processes in the OS are too large for a DBMS to work effectively. The size and number of DBMS processes differ a lot from those of regular ones, which leads a DBMS to run its own private operating system. 4) For consistency control, the OS cannot provide it efficiently if the buffer pool is managed by the DBMS itself, so the designer needs to solve the caching problem first. 5) For paged virtual memory, handling large files is slow, and the buffering problem also remains to be solved.

The strength of the paper is that it discusses several popular operating system services from the DBMS designer’s perspective. For every kind of OS service it mentions, the paper evaluates how the DBMS performs in light of the service's implementation and technical features.

One drawback is that the paper mentions little about the strengths of current operating system services for supporting database management systems. If the paper pointed out more of the advantages of current operating system designs with respect to supporting a DBMS, it would be helpful to designers.


Review 8

This paper goes over several key operations and functionality needed by a DBMS that are either not present or poorly supported by the existing UNIX operating system. The main contribution of this paper is pointing out the limitations of the existing operating system in supporting a DBMS, and what features and changes are needed to help the OS support database management more efficiently.

Some of the key limitations of UNIX operating system highlighted are:
High overhead of accessing the OS buffer pool, which forces the DBMS to manage its buffer at user level.
LRU buffer management isn’t necessarily optimal for DBMS workloads.
No support for selected force-out of pages, limiting recovery functionality.
The file system is not optimal for the access patterns of a DBMS, and creates redundancies.
Lack of message-passing support limits process management to a one-process-per-user style, rather than having a server process.
The OS can support consistency control, but because performance limitations discourage the DBMS from using the OS buffer pool, the OS's ability to provide the service is crippled.

The paper makes it clear that for better DBMS performance and efficiency, operating systems need to be designed in a way that takes into account the characteristics of a DBMS. One concern that isn’t addressed in the paper is the diverse and volatile nature of DBMSs. A control or management scheme that is popular one year doesn’t necessarily have the same level of importance the next, and the flexibility of the operating system to support changing schemes can be crucial, even if it comes at a performance cost.



Review 9

This paper illustrates the role that operating systems play in database systems and focuses on the following aspects in which operating systems may not be suitable for databases:
Buffer pool management: some features (like LRU replacement) provided by the OS do not work well under the common access patterns of a database such as INGRES. Hence, the widely used strategy is to maintain a separate buffer in user space.
The file system: though it sounds like a nice idea to build a record management system inside the OS, this approach rarely gets good performance since the underlying structure in the OS does not suit DBMS requirements.
Scheduling, Process Management, and Interprocess Communication: neither the process-per-user nor the server DBMS structure is a realistic way to build a multiuser database system, due to the high cost of task switching or the unsupported message facility. The current solution is to implement a ‘mini’ OS in user space.
Consistency control: for performance reasons, it is necessary that the DBMS has full control over buffer pool management, concurrency control and crash recovery.
Paged Virtual Memory: though binding files into a user’s paged virtual address space is a common approach, it has some issues concerning large files and buffering.

From this paper, it is interesting to see that though support from the OS seems far from good, the OS is still widely used as an underlying service for the DBMS.



Review 10

The main purpose of this paper is to discuss several ways in which OS features fail to provide useful tools for Database Management Systems (DBMS), and to suggest some solutions. The paper focuses on the following areas of the OS: buffer pool, file system, scheduling/process management, and consistency control.

OS buffer pools are often slow. They also often have a set replacement scheme like LRU, which is not good for some database operations (like reading sequentially and never re-referencing). Operating systems may also try to pre-fetch pages when they detect sequential reading, but this is not as good as what the DBMS would be able to do, since it knows exactly what pages it will read. The buffer pool also may not allow anyone to tell it to force-evict pages to disk, which is a problem for temporal-based crash recovery policies. Because of this, databases often ignore the OS and implement their own buffer pool. If the OS were to allow the DBMS to influence its replacement policy and read/write order, and improved performance, this might eliminate the need for the DBMS to have a duplicate buffer pool function.
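
As an illustration of what a forced write looks like from user space, here is a minimal sketch using Python's os.fsync, which asks the OS to push a file's buffered data out to the device before the call returns. Using fsync is an assumption of this sketch, not something the reviewed paper describes, and the helper name force_log_record and the log record format are hypothetical.

```python
import os
import tempfile

def force_log_record(path, record):
    """Append one log record and force it to disk before returning."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, record)
        # Without this call the OS may keep the page in its buffer pool
        # and write it lazily, which is the behavior recovery cannot rely on.
        os.fsync(fd)
    finally:
        os.close(fd)

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "txn.log")  # hypothetical log file name
    force_log_record(log_path, b"COMMIT txn=1\n")
    with open(log_path, "rb") as f:
        contents = f.read()
print(contents)
```

The point of the sketch is the ordering: the caller only learns that the record was written after the forced write has returned, so a commit is never acknowledged while its log record sits in a lazily written buffer.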

File systems often structure files as character arrays of variable size. Since these arrays can be spread across pages, the DBMS using these files might not get the physical locality it desires for good performance. In addition, the OS will store the file system as a tree of directories, but in order to facilitate fast key-value lookup the DBMS will have to create its own tree of the file system. This involves unneeded overhead. Some OS’s attempt to solve these problems, but their solution may be built on top of this file-as-char-array approach, which hinders performance for the reasons above. The author suggests that OS’s should provide the facilities needed by the DBMS as low level features, and build the character-array file system on top of that.

With regard to process management, two models of DBMS are suggested, both of which have problems. Process-per-user models are slow because process switches are slow in most OS’s. Server models either have no concurrency, or require the DBMS to have its own multitasking code, or depend on OS messages which are also slow. The author suggests that performance for OS process switching and messages must improve, or that DBMS should be allowed access to privileged fast versions of these operations.

With regard to consistency control, the OS provides concurrency control and crash recovery, but as mentioned before the DBMS often uses its own buffer management. Coordinating between the DBMS and OS here would require extra bookkeeping that duplicates OS functions. This could be solved if the OS buffering solution were improved for database systems.



Review 11

The paper examined a number of operating system services at the time (i.e., circa 1981) for their applicability to support necessary database management functions. These operating system services discussed in the paper include buffer pool management, scheduling, process management, inter-process communication, consistency control and paged virtual memory. The main contribution of this paper, even for readers in the present, is that it demonstrated how much database management systems (DBMS) were affected by and dependent on the OS at the time, which is an issue that still exists today. It is important for DBMS designers to have good knowledge about the OS on which their DBMS will operate.

A DBMS operates on top of the OS and must consider the services that the OS provides. However, since the OS and the DBMS are each designed for their own purposes, the OS services are often not well-suited for DBMS functions. For example, LRU, the popular buffer management strategy for the OS, does not work well with a DBMS’s data access patterns. The paper showed, with in-depth explanations, that such a mismatch exists for every operating system service it discusses.

Back in the 1980s when the paper was written, the disparity between OS services and DBMS functions was very significant, as the paper demonstrated. Even though many of the problems mentioned in the paper may not be as significant with today’s computer hardware and new techniques (e.g. light-weight threads), the situation itself has not changed much. Today, DBMSs still operate on top of the OS. Operating system services have improved from the perspective of the DBMS, yet DBMSs themselves have become much more complex at the same time, demanding new types of service from the OS.

To sum it up, the paper teaches readers an important lesson about the relationship between operating systems and database management systems. Operating system services may not be optimal for database management functions. DBMS designers must be aware of this and take it into account when developing or maintaining their DBMS. The paper could have been better if it provided more examples from a wide variety of database systems and operating systems other than INGRES and UNIX, but the paper still delivers its points well.


Review 12

In this paper, several operating system services are closely examined to see if they are appropriate to fulfill the needs of DBMS.

One of the services is buffer management. Operating systems usually have their own page replacement algorithms. However, a DBMS can use other algorithms to further improve the hit ratio, since a DBMS knows more about the sequence of data accesses.

The interesting idea here is that, rather than using the existing algorithms in the OS or letting the DBMS maintain a separate cache in user space, there should be a better option: the OS provides an interface that takes information supplied by the DBMS (or other applications) and integrates it into its replacement strategy. The idea also applies to the other services provided by the OS. If the OS does not provide appropriate features, DBMSs will have to implement all the features on their own in user space.
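
One present-day interface in this spirit is posix_fadvise, which Python exposes as os.posix_fadvise on platforms that support it. The sketch below is only an illustration of the "advice" idea under my own assumptions, not the interface the paper proposes: the helper names advise and scan_once are hypothetical, and the OS is free to ignore the hints.

```python
import os
import tempfile

def advise(fd, advice_name):
    """Pass an access-pattern hint to the OS if the platform supports it."""
    advice = getattr(os, advice_name, None)
    if hasattr(os, "posix_fadvise") and advice is not None:
        try:
            os.posix_fadvise(fd, 0, 0, advice)  # offset 0, length 0 = whole file
        except OSError:
            pass  # advice is only a hint, so a refusal is not an error here

def scan_once(path, chunk=4096):
    """Read a file sequentially exactly once and return the bytes read."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        advise(fd, "POSIX_FADV_SEQUENTIAL")  # read-ahead is worthwhile
        while True:
            data = os.read(fd, chunk)
            if not data:
                break
            total += len(data)
        advise(fd, "POSIX_FADV_DONTNEED")  # pages will not be rereferenced
    finally:
        os.close(fd)
    return total

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 10000)
try:
    bytes_read = scan_once(tmp.name)
finally:
    os.unlink(tmp.name)
print(bytes_read)
```

The application keeps using the OS cache but tells it what the access pattern will be, which is the division of labor the review describes: the OS owns the mechanism and the application supplies the knowledge.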

This is a rather old paper, published more than 30 years ago, so some details might not be the same for modern OSs, but the issues it points out are still very relevant now. I agree that to improve the performance of DBMSs, it is essential to integrate the DBMS and the OS well.


Review 13

This paper discusses operating system services for buffer pool management, file systems, scheduling, process management, interprocess communication, and consistency control. This is important because database management systems provide higher level user support than traditional operating systems. The paper suggests that operating system services are too slow or inappropriate, and that operating systems should move towards being both general purpose and efficient for DBMS needs.

Most OSs provide a main memory cache for the file system. UNIX provides a buffer pool, which handles all file I/O and whose size is fixed when the OS is compiled. The performance overhead of fetching blocks from the buffer pool manager includes the overhead of a system call and a core-to-core move. Many DBMSs reduce overhead by using a DBMS-managed buffer pool in user space. In INGRES, database access is a mixture of sequential access to blocks that will not be rereferenced, sequential access to cyclically rereferenced blocks, random access to blocks that will not be referenced again, and random access to blocks with a nonzero probability of rereference. A correct prefetch strategy is impossible for an OS to implement because the next block that needs to be accessed is not necessarily the next logically sequential one. Crash recovery is handled with an intentions list and a commit flag, which is set after the intentions list is complete. The final step in a transaction is to process the intentions list and make the actual updates.
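
The intentions-list protocol described above can be sketched in a few lines. This is a toy in-memory model under my own assumptions, not the paper's or INGRES's implementation: the class Store, the function names, and the crash flags are hypothetical, and the updates are simple overwrites so that reapplying them is harmless.

```python
class Store:
    """Toy stable storage: pages plus an intentions list and a commit flag."""
    def __init__(self, pages):
        self.pages = dict(pages)
        self.intentions = []
        self.committed = False

def run_transaction(store, updates, crash_before_commit=False,
                    crash_during_apply=False):
    # Step 1: record every intended update without touching the real pages.
    store.intentions = list(updates.items())
    if crash_before_commit:
        return  # simulated crash: commit flag was never set
    # Step 2: set the commit flag only after the intentions list is complete.
    store.committed = True
    # Step 3: process the intentions list and make the actual updates.
    for i, (page, value) in enumerate(store.intentions):
        if crash_during_apply and i == 1:
            return  # simulated crash halfway through the updates
        store.pages[page] = value
    store.intentions, store.committed = [], False

def recover(store):
    # Committed work is redone from the intentions list; uncommitted
    # intentions are simply discarded, leaving the old pages intact.
    if store.committed:
        for page, value in store.intentions:
            store.pages[page] = value
    store.intentions, store.committed = [], False

old = {"p1": "old", "p2": "old"}
new = {"p1": "new", "p2": "new"}

a = Store(old)
run_transaction(a, new, crash_before_commit=True)
recover(a)

b = Store(old)
run_transaction(b, new, crash_during_apply=True)
recover(b)
print(a.pages, b.pages)
```

A crash before the commit flag is set leaves every page at its old value, and a crash after it leaves every page at its new value once recovery has run. This only works if the intentions list and the flag reach disk before the page updates do, which is why the ordering of writes matters to the DBMS.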

One way of supporting file systems is to provide objects, such as files, as character arrays of dynamically varying size. Another way is to provide a record management system inside the OS, which offers structured files. This is good for a DBMS, but not always efficient when built on character array objects. A character array object is commonly expanded one block at a time, scattering the blocks of a file over the disk. Thus, the next logical block in a file is not guaranteed to be physically close to the previous one. A DBMS prefers extent-based file systems, which grow a file an extent at a time, because a whole collection of blocks can be read when sequential access is desired. Tree-structured file systems keep the blocks of a given file in a tree of indirect blocks pointed to by an i-node, which is a file control block. A second tree gives the files in a given mounted file system a user-visible hierarchical structure with directories. A third tree supports keyed access through a multilevel directory structure. One tree that holds all three of these types of information would be more efficient.
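
The difference between growing a file one block at a time and growing it an extent at a time can be seen in a toy allocator. The model below is my own simplification, not the UNIX allocator: it assumes an empty disk with free blocks handed out in increasing order, two files growing in alternation, and a file size that is a multiple of the extent size.

```python
def allocate(files, blocks_per_file, extent_size):
    """Grow several files in round-robin order on an empty disk.

    Each time a file needs space it takes the next `extent_size` free
    blocks. Returns {file name: disk block numbers in logical order}.
    """
    layout = {name: [] for name in files}
    next_free = 0
    while any(len(layout[name]) < blocks_per_file for name in files):
        for name in files:
            if len(layout[name]) < blocks_per_file:
                layout[name].extend(range(next_free, next_free + extent_size))
                next_free += extent_size
    return layout

def gaps(blocks):
    """Count logically adjacent blocks that are not physically adjacent."""
    return sum(1 for a, b in zip(blocks, blocks[1:]) if b != a + 1)

one_block = allocate(["A", "B"], blocks_per_file=8, extent_size=1)
extents = allocate(["A", "B"], blocks_per_file=8, extent_size=4)
print(gaps(one_block["A"]), gaps(extents["A"]))
```

In this model, file A ends up with 7 breaks in physical contiguity when grown a block at a time and only 1 when grown in four-block extents, so a sequential scan of A needs far fewer repositionings of the disk arm in the extent-based layout.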

The simplest model for multiuser DB systems is one process per user. Another option is allocating one run-time database process to act as a server. The first approach is more common because UNIX contains a message system incompatible with the server model. The process-per-user model creates performance issues because each time a run-time DB process issues an I/O request not serviceable by data in the buffer pool, a task switch happens and the DBMS process is suspended until the required data is brought into the buffer. Also, portions of the buffer pool manager become critical sections because the buffer pool is a shared data structure. The server model is viable if the OS provides a message facility that allows n processes to originate messages to a single destination process, but this involves the duplication of OS code.

The ability to lock objects for shared or exclusive access and crash recovery support are necessary for consistency control. A commit point is used to ensure that all appropriate blocks are flushed and a commit is delivered to the OS. However, this means that the buffer manager must keep track of transactions, so OS and DBMS functionality is duplicated. Ordering dependencies can be solved if the OS keeps track of the buffer pool and the intentions list for crash recovery. This again leads to duplicated OS functionality because the buffer pool manager in user space must maintain its own intentions list too.

Binding files into a user’s paged virtual address space is often thought to be the appropriate OS method for DB management support. Large files, however, can cause two page faults per access, one on the data itself and one on the page containing the page table for the data. Buffering in a paged virtual memory system leaves unanswered questions about prefetch, non-LRU management, and selected force out.

I felt that the paper was well written and concise, highlighted the important concepts of database management, and did a good job of supporting the author's point that current OSs do not support DBMSs well. I felt that the paper could have been a little more specific in its discussion of crash recovery, and perhaps covered an example of a crash recovery method, such as the ARIES method.


Review 14

This paper discusses operating system support for database management systems including buffer pool management, the file system, scheduling, process management, interprocess communication and consistency control. This is important because a DBMS runs on an OS and the support the OS has for that DBMS can greatly affect performance of the DBMS.

The buffer pool management has to do with how the OS reads and writes data using a cache. It improves performance for many common functions but can cause problems in database systems. Because DBMS systems often know which parts of memory they will need to access they can more efficiently manage their own buffer pools, which does not take advantage of the OS.

For the purpose of DBMS systems certain file system properties are desirable. A DBMS can perform a lot of sequential access which prefers sequential storage on disk. DBMS systems should be lower level objects in the OS. It would be interesting to use table properties in a DBMS to be able to automatically determine how to allocate and manage memory on disk.

For scheduling, process management, and interprocess communication the paper discusses two models. One is a multi-thread single process model and one is a server model. Both of these models require DBMS systems to reimplement some OS functionality to gain efficiency. Scheduling in particular is a troubling problem of time and resource availability.

Consistency control is the issue of locking objects for exclusive access and crash recovery. OS functions provide this at a higher level but DBMS systems require it at a lower level for pages and records.

This paper covers a wide variety of issues with DBMS and OS integration. The paper describes how OS functionality works and contrasts it with what a DBMS needs. This is an important perspective.

This paper has some issues though. It only talks about the UNIX OS and then says that "other OS are comparable". I feel this is not a thorough analysis. Also, this paper is from 1981, which is an obvious weakness today. It is out of date. There are also a few speculative statements in the paper, rather than references to rigorous study. For instance, at the end of 3.2 the author states "the extra overhead for three separate trees is probably substantial".


Review 15

Part 1: Overview
This paper analyzes advantages and disadvantages of different operating systems for supporting a database management system. It touches on six main aspects: buffer management, file system, CPU scheduling, process management, interprocess communication, and consistency. This paper lists popular operating systems and compares their strategies for scheduling, I/O control, etc. Figures are used to explain data flows.

Part 2: Contributions
This paper contributes to the literature a preliminary, high-level overview of OS support for DBMSs. It got 573 citations and provides a great guideline for further study. Subsequent researchers dug into these six problems and improved DBMS performance.
This paper reveals the current problem that DBMSs can hardly make use of OS support, and offers some possible clues for integrating the DBMS and the OS to improve overall performance.
It brought out ideas like “selected force out”, conflicts between hidden file system structures and the B+ tree of a DBMS, and additional waiting time caused by semaphore locks. These provide new features for OS developers to explore.

Part 3: Possible drawbacks
The lack of simulation results may make the performance comparison less convincing. When discussing the LRU algorithm, it would be better to provide some real data or at least simulation results.
Some popular operating systems are not discussed. This paper mainly talked about UNIX. Readers may want to know more about Windows, Linux and OS (though it’s UNIX based).
This paper is more like an introduction to operating system and did not dive deep. This may be restricted by the page length.



Review 16

The paper discusses various operating system services provided in support of database management system (DBMS) functions. The operating system services discussed include buffer pool management, the file system, process management, interprocess communication, and consistency control. Most of the time these services are either too slow or inappropriate for a database management system. As a result, DBMSs usually provide their own services and make little use of those offered by the operating system.
For example, while the buffer pool management service provided by UNIX offers important functionality, including LRU replacement, page prefetching, and transaction crash recovery, this functionality does not translate into good performance when used by DBMSs. LRU replacement works well for access patterns in which blocks have some probability of being re-accessed. However, for other access patterns, it is better to rely on the DBMS’s knowledge to decide what to do with an accessed block, which might include tossing it right away if it is not going to be accessed again. In addition, since DBMSs know the logical record order, they are better at prefetching than the OS.

In terms of the file system service, the paper argues that a record management system is better than character array files. However, if record management is not supported from the ground up by the operating system, it is not going to be efficient. As a result, the paper recommends that operating systems implement a record-based file system as a lower level object. The other service discussed in the paper is process management. There are two ways to organize processes. One is to have one OS process per concurrent database user. The other is to allocate one dedicated database process which acts as a server, with all concurrent users sending work requests to this server. However, the paper finds both approaches unattractive. The main problem is the overhead of task switching and messaging. The paper suggests ways to make this overhead cheaper. For example, the OS can create a special scheduling class for the DBMS. DBMS processes in this class would never be forcibly descheduled but might relinquish the CPU at appropriate intervals. Otherwise, the paper notes, DBMS designers will continue the practice of implementing their own scheduling and messaging systems entirely in user space.

In addition, the paper addresses the performance problem that arises when consistency control and crash recovery for transactions are provided in the OS while buffer management is provided by the DBMS. This creates code duplication, which in turn causes performance problems and increases human effort. The paper suggests that these services be provided by only one of the two, either the OS or the DBMS.

Generally, the paper identifies important issues in the use of OS services by DBMSs. For the DBMS to rely on OS services, these services should have low overhead. The OS designer can solve this problem either by offering a limited set of lightweight services suited to DBMSs or by providing a special set of services for DBMSs on top of a set of general-purpose services. The paper recommends that future operating systems have a small, efficient set of services rather than those of a general-purpose operating system, citing the success of real-time operating systems as an example.

The main drawback of the paper is that it relies heavily on the UNIX operating system and a particular database system called INGRES. It could have been even more insightful if the author had considered other combinations of OS and DBMS. Its suggestion of an OS with a small set of services might not be easy to realize, since developing a system as complex as an OS takes a lot of human effort and time. As a result, companies would choose to design a general-purpose OS to amortize those costs. The paper fails to address such circumstances and does not suggest a detailed alternative solution that improves database system performance while still keeping a complex general-purpose OS.


Review 17

Operating system support for database management

In this paper, several common operating system services are examined to determine whether they are appropriate for supporting database management functions.

The first service it talks about is the cache, or buffer pool in UNIX. It is a very common and conceptually desirable service, but it raises some problems when used for database management. For example, performance is lower because of the high overhead of fetching data from the system cache, the LRU algorithm might not be appropriate for some data access patterns in a database system, and so on. Many features that a cache needs in order to support database management are missing.

Next, the file system is discussed in the paper. Some overhead also occurs here when a DBMS uses this system service. Separate tree structures are used to manage data: in addition to the system’s tree structures, the DBMS has its own. The character array provided by OS file systems is not useful for a DBMS. This service must be redesigned specifically for DBMSs so that it can be more efficient.

The third sub-topic is process management and interprocess communication. Message overhead and lock management are examples of the key issues in using such system services.

Finally, for consistency control, as we may have already seen in the paper, some of the services should be customized while others can be used with no big problem. This means that we can use some of our own modules in place of some system services and keep others provided by the OS. But to keep the system consistent, we definitely need to wire those original and customized modules together carefully.


Review 18

This paper argued that DBMS will be faster and simpler if some facilities or services can be provided on the OS level instead of re-implemented by DBMS on the user level.
Five OS facilities are discussed: 1) Buffer pool management, 2) File system, 3) Process related facilities, 4) Consistency control, 5) Paged virtual memory.
In each of these areas, some commonly used services, algorithms or tactics are questioned for their relatively bad performance.

Here are the main problems of OS:
1) Buffer pool management
Slow block fetch, LRU tactic may not be suitable, prefetch often fails, lack of good crash recovery.
2) File system
Lack of option for imposing physical contiguity for a file, Lack of additional indexing method for file access.
3) Process related facilities
Slow context switch, Slow messaging.
4) Consistency control
No block flushes, no intentions list.
5) Paged virtual Memory
High overhead, bad buffer tactics.

For each problem above this paper described some current ways of solving it, mostly by implementing wanted features at user level. Buffer pool management problems are solved by maintaining buffer memory at user level. File system problems may be solved by allocating a large file to make the file allocated physically contiguous. Process related issues are solved by writing a threading system at user level.

By discussing all the drawbacks of the OS services currently in use, this paper makes a few good points.
Clearly, current solutions (implementing features at user level) are not good enough for a database system. There is another approach, which is to improve the OS itself. This approach can solve problems that previously could not be solved, such as the large overhead of process context switches, and can improve performance for things that were previously implemented at user level.

But I think this paper is too extreme in some parts. It argues that privilege should be given to database systems, and that database systems should always run first. This might be good for an OS that is designed to support a DBMS, but it would be problematic for an OS designed for general usage.



Review 19

This paper compares how well several popular operating systems support database management. The OS buffer pool manager provides a cache for the file system, but placing a DBMS-managed buffer pool in user space reduces overhead further. The operating system would need to cut its access overhead for its shared buffer pool to be useful to a DBMS. LRU performs well only for some database access patterns, and the prefetch strategy will not always guess correctly. Crash recovery requires an OS buffer manager that supports selected force out: it must write the intentions list and then set the commit flag, in that order. But no existing buffer manager supports this requirement. For the file system, the DBMS expects a record management system inside the OS. The DBMS prefers sequential physical contiguity or a tree structure rather than the character array object that the OS provides by default. Scheduling, process management, and inter-process communication face a dilemma over the cost of task switches and messages. Running a mini OS in user space may work for multitasking. The DBMS expects the OS to manage the buffer pool and an intentions list so that ordering dependencies are respected; otherwise code duplication is unavoidable. Paged virtual memory suffers from slow file opens and from the same buffer management problems.

The OS is not designed for DBMSs, so splitting off a dedicated OS for DBMSs would be one solution for future operating systems. But a dedicated OS will affect performance in less obvious ways. A new OS for DBMSs is a long-term plan for DBMS development. A divide-and-conquer approach is urgently needed.


Review 20

This paper summarizes the requirements of a database system with respect to the operating system it is installed on. It also explores how well the operating systems are able to support DBMS services such as buffer pool management, process scheduling management and consistency control.

Buffer Pool Management
Operating systems such as Unix typically tend to use LRU which would be bad in light of any DBMS transaction unless it is a file that has a non-zero probability of being re-referenced. The author advises that in order to use the OS buffer management effectively, provisions must be made for the OS to accept “advice” from the DBMS regarding replacement strategy.

Prefetch
Though prefetch is a brilliant strategy when it comes to sequential access, the logical disk locations do not necessarily correspond to physical disk locations. One of the strategies that is specified in the paper “Anatomy of a database system” is to allocate a large file on an empty disk so that the offsets would correspond fairly well with respect to physical locations. This might help prefetch perform better in a given system.

Crash Recovery
The paper specifies the method where a DBMS can recover from a crash based on the combination of intentions list and commit flag being set or not set. The author specifies that the OS could provide a service where a selected force out would push both these components to the disk in the proper order.
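The ordering described above can be illustrated with a minimal sketch. This is a toy model written for this review, not the paper's or any real system's code: the "disk" is a plain dictionary, and the names commit, recover, and crash_after are hypothetical.

```python
def commit(disk, updates, crash_after=None):
    """Force out the intentions list first, then the commit flag.

    crash_after=k simulates a crash after k forced writes (assumption:
    each forced write is atomic and reaches disk in the order issued).
    """
    steps = [
        lambda: disk.__setitem__("intentions", list(updates.items())),
        lambda: disk.__setitem__("commit_flag", True),
    ]
    for done, step in enumerate(steps):
        if crash_after is not None and done >= crash_after:
            return  # crash: remaining writes never reach disk
        step()


def recover(disk):
    """Apply the intentions list only if the commit flag reached disk."""
    if disk["commit_flag"]:
        for page, value in disk["intentions"]:
            disk["pages"][page] = value
    # Either way, discard the list and reset the flag.
    disk["intentions"] = []
    disk["commit_flag"] = False
    return disk["pages"]


disk = {"pages": {"A": 1, "B": 2}, "intentions": [], "commit_flag": False}

# Crash after the intentions list is written but before the flag is set:
commit(disk, {"A": 10}, crash_after=1)
after_crash = dict(recover(disk))   # transaction is treated as aborted

# A commit that completes both forced writes:
commit(disk, {"A": 10})
after_commit = dict(recover(disk))  # updates are applied on recovery
```

If the two writes reached disk in the opposite order, a crash between them would leave the flag set with no list to apply, which is why the buffer manager has to honor the order.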

Scheduling and processing
One of the solutions that the author provides to handle task-switch overhead is for the OS to create a special scheduling class for the DBMS, so that processes in the DBMS would never be forcibly de-scheduled, which might help prevent critical section lock issues.

Consistency Control
It seems that buffering, concurrency control and crash recovery would be better if handled by the operating systems but for the performance lag which can be set right if these services are provided by the DBMS. This definitely results in duplication of code so a reasonable compromise must be reached that is able to take care of both these issues.

The paper specifies that the examples are drawn primarily from UNIX and INGRES; however, this constitutes only a subset of the DBMS-OS combinations currently on the market. A reference to other systems would have provided a more holistic view of the role that operating systems play in a DBMS.

The author ends the paper with the suggestion that future operating systems could possibly consider providing DBMS services and being sensitive to needs of a database management system.



Review 21

This paper mainly talks about the difference between the functionality provided by the OS and the functionality required by a DBMS like System R.
In terms of the buffer pool management system, the DBMS needs something that accepts prefetch advice and block management advice and supports selected force out, but this is almost impossible to build. For the file system, the character array object supported by the OS is not very helpful to the DBMS, and the tree structure is not efficient for a DBMS. As for scheduling, existing OSs have trouble reducing the overhead of task switches and messages. Unless there is another scheduling class designed for the DBMS, such as the favored-user and fast-path control-passing solutions, the DBMS’s ‘mini OS’ will still be favored. In terms of consistency control, if the DBMS builds its utilities on top of OS infrastructure, many facilities may be duplicated and overall performance will be lowered. Also, the OS uses a very small page size for virtual pages, so storing the page table may cost the DBMS great overhead and may cause too many page faults. In general, it is like the buffer system again, with lowered performance.
In summary, the paper demonstrates that general-purpose OSs are designed for more advanced tasks and better extensibility, while current DBMSs just need minimal facilities with high performance.

The paper uses a lot of concrete DBMS use cases, like large files and multiple users, to illustrate the high performance needs of current DBMSs, and it provides some possible OS-side solutions that carry large overhead. It explores almost all of the alternatives that seem possible from today’s perspective, and that really helps convince the reader of its point.

But one thing the paper does not mention is that solid state drives can sometimes help speed up OS operations, and that some transaction issues can be solved using non-volatile memory. With those new techniques, I think the range of problems associated with adapting the OS to DBMS use will shrink considerably.



Review 22

In this paper, the author talks about several operating system services and points out the problems with these OS services when serving a DBMS. The author also gives some suggestions for OS researchers to improve the performance of DBMSs.
The first service the author talks about is buffer pool management. The first problem is that the overhead of fetching a block from the buffer pool manager is large, so many DBMSs build their own buffer pool in user space to reduce the overhead. The second problem is that the OS uses LRU to manage the buffer, but this strategy does not serve all database access patterns well. The third problem is that OS prefetching is not suitable for a DBMS because its data layout differs from that of common files in the OS. The fourth is that the OS has no selected force out service to push the intentions list and commit flag to disk in the proper order.
The discussion then moves to the OS file system. UNIX supports objects that are character arrays, with file pages scattered over the disk. Because a DBMS does considerable sequential access, this character array is not useful for a DBMS. The author points out that it is better to provide DBMS facilities as lower level objects and character arrays as higher level ones.
There are two models for a multiuser database system: one process per user and a server structure. The first one is not good for a DBMS because (1) task switches between processes are expensive and (2) there may be a long wait queue when many tasks compete for the same lock. The second one is better, but the UNIX message system is too expensive to support this model. The author’s suggestion is that the OS should offer cheaper task switching and messaging instead of leaving DBMS designers to build their own multitasking, scheduling, and messaging systems.
For concurrency control the author sees two options: (1) have buffering, concurrency control, and recovery services provided by the DBMS in user space without duplicating functionality, or (2) have them provided inside the OS with the same performance as (1).
The problem with paged virtual memory is that the page table is too large to keep in memory, but paging the page table out to disk causes more page faults. One solution is to make the file control block smaller by storing files sequentially. Another is to bind chunks of a file into one’s address space.
This paper points out many problems in different OS services that make them unsuitable for DBMSs, and gives OS researchers some suggestions for improving the OS in order to serve DBMSs. The problem is that the author comments on these issues from the DBMS researcher’s point of view, and since a DBMS’s characteristics differ from those of the other systems an OS tries to serve, it seems hard to modify a common OS only to serve DBMSs better. There should be a balance in serving the different applications on a computer. So having a smaller OS that serves only the DBMS seems a better choice.


Review 23

This paper proposes the idea that DBMS can implement their own version of the operating system’s services to increase performance. Without implementing the DBMS’s own services, it will have to use the standard services provided by the operating system. Because most of these operating systems are general-purpose, DBMS will run into performance issues with caching, transactions, multiple users, and large tables.

Databases do not follow the memory access patterns of a traditional program, so LRU is not always an efficient way to cache. Generally, databases will either have sequential access to blocks that will not be re-referenced or will be re-referenced cyclically, or random access to blocks that will never be accessed again. As such, it would be a better idea for DBMSs to implement caching in user space. However, with all of these problems, Stonebraker’s argument is that DBMSs are essentially re-implementing all of the basic services, so operating systems should accommodate programs such as DBMSs when implementing these basic services.
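The cyclic case is easy to see in a small simulation. This is an illustrative sketch rather than anything from the paper: the buffer size, the trace, and the function name hits are assumptions, and "mru" (evict the most recently used block) stands in for a replacement policy that a DBMS might choose for a looping scan.

```python
from collections import OrderedDict


def hits(trace, capacity, policy):
    """Count buffer hits for a block reference trace.

    policy is "lru" (evict least recently used) or
    "mru" (evict most recently used).
    """
    buf = OrderedDict()  # ordered from least to most recently used
    count = 0
    for block in trace:
        if block in buf:
            count += 1
            buf.move_to_end(block)  # mark as most recently used
            continue
        if len(buf) >= capacity:
            # last=True pops the most recent entry, last=False the oldest
            buf.popitem(last=(policy == "mru"))
        buf[block] = True
    return count


# Five blocks scanned cyclically four times with room for only four.
trace = list(range(5)) * 4
lru_hits = hits(trace, capacity=4, policy="lru")
mru_hits = hits(trace, capacity=4, policy="mru")
print("LRU hits:", lru_hits)
print("MRU hits:", mru_hits)
```

With a loop one block larger than the buffer, LRU always evicts exactly the block that is needed next, so it never hits, while the alternative policy keeps most of the loop resident. Only the DBMS knows in advance that the scan is cyclic, which is the reviewer's point about caching with query knowledge.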

I believe that Stonebraker does a good job of outlining the key points of why current operating systems and current database systems are not efficient together. He also paves the way for how operating systems can evolve to include application specific services so that special programs like DBMS can use those services without having to rewrite them.

However, there are some counter arguments as to why operating systems are designed the way they are today:

1. Operating systems are supposed to be light and general purpose. One will always have to make the tradeoff between flexibility and space used. I believe that it is not possible to cover all possible programs that could be run on an operating system and if there is a program that is special enough, like DBMS, they should have their own specific implementations for these services.

2. Companies that have large databases and want to completely optimize the overlap between the DBMS and the operating system will have the engineers and the money to do so themselves. In other words, the market for such an operating system will be so limited and so small that it will not be successful when marketed to the public.



Review 24

This paper lists several ways in which operating systems could provide better support for database management systems. While much research has gone into optimizing the performance of operating systems, the interactions between DBMS and OS tend to be very different from the interactions of most common applications, and, in many cases, OS code is poorly suited to handle the needs of a DBMS.

I found the discussion of replacement policy and prefetching quite interesting. LRU is generally a good replacement policy, since it takes advantage of a great deal of spatial and temporal locality in memory accesses. However, DBMS memory access patterns are often atypical in comparison to other common applications. Many database queries access blocks that will not be rereferenced in the future, making LRU a very poor replacement policy.

The general theme of this paper is that the OS should be more flexible with its replacement, prefetching, scheduling, etc. There are many cases in which allowing a DBMS (or any application) to select from among several algorithms or policies could have a significant improvement on performance. I have several issues with the authors' approach, however.

While the authors are correct in pointing out that OS code often fails to provide the kind of flexibility that could improve the performance of DBMS applications, they don't have much data to show that introducing this kind of flexibility would be worth the investment. Operating systems are incredibly complex, and are designed to optimize common needs of applications. The amount of work required to extend them in the manner the authors suggest would be enormous. While arguably the cleanest solution would be to provide OS support for DBMS needs, the authors state that most DBMS applications have already implemented solutions to many of these problems in user space. Therefore, even if the OS were to provide such flexibility, each DBMS would have to rewrite large parts of their application that already have workable solutions.

I would have less of a problem with this if the authors were able to offer evidence that such an undertaking would significantly improve performance, but they offer very little in the way of statistics or prior research. In Section 3.2 they suggest UNIX file trees be restructured into a single tree, but the only reason they offer is that "The extra overhead for three separate trees is probably substantial."



Review 25

The paper examines several operating system services that support database management functions, including buffer pool management, the file system, process management, consistency control and paged virtual memory. The importance of understanding these OS services is that the DBMS designer must work with the OS he or she is faced with.

Many DBMSs have a managed buffer pool in user space to reduce overhead when the DBMS fetches a block. The LRU replacement strategy is generally a good tactic, but performs badly for some access patterns. Therefore, many DBMSs use composite strategies to improve performance. The file system supports object files and a record management system inside the OS. UNIX implements these services using tree data structures.

For scheduling, the first way to organize a multiuser database system is to have one OS process per user. The other way is to allocate one run-time database process which acts as a server. However, both approaches face task-switch overhead. One solution to this problem is to create a special scheduling class for the DBMS. Consistency control is also an important service provided by the OS. The issues related to consistency control are the commit point and ordering dependencies. These should be carefully designed for the sake of correctness.

To sum up, it is important for DBMS designers to make use of services offered by OS, and the OS designers become more sensitive to DBMS needs. This paper provides a general summary of the services offered by OS, including buffer pool management, the file system, process management, consistency control and paged virtual memory.



Review 26

This paper talks about feasibility of using standard operating systems (OS) for database management system (DBMS) purposes and discusses what factors the OSs are lacking to make them more readily usable for DBMS. It touches on 5 crucial parts of an OS and analyzes if they would work for DBMS.

1.) Buffer pool management – This is something that a DBMS could potentially do but is not currently set up to do. Performance is one problem, with fetching bytes from the buffer pool taking far too many instructions. Another factor is crash recovery for transactions, which is something DBMSs provide but was lacking in OSs.

2.) File system – Stonebraker argues that OSs should provide DBMSs with lower level objects than character arrays because a character array is not useful for a DBMS.

3.) Scheduling and process management – Performance of scheduling with a DBMS would take too large of a hit if it were to use current OS standards. Task switching is a very costly operation, and it would happen frequently due to the need to wait for critical sections. A proposed way around this is a server model, which allows many processes to send to a single destination process; the problem with this, though, is that it requires messaging the destination process. Messaging is also a costly operation, and the performance of this would be too slow as well. Stonebraker says “There appears to be no way out of the scheduling dilemma”.

4.) Consistency – The main issues here are ordering dependencies being crucial as you would not want a different course of actions to happen depending on the order of events. This is something that can not currently be assured in DBMS. The other major issue was the inability to lock things smaller than a single file, “Such smaller locks are deemed essential in some database environments”. This was the one area though that Stonebraker was optimistic about saying it could be provided by an OS.

5.) Paged virtual memory – A large difference between the needs of a normal OS and a DBMS is that databases generally have much larger files, which also means the metadata needed is larger and can cause multiple page faults. This would yet again slow down performance, but other than that an OS and a DBMS are similar in this aspect.

In conclusion, at the time current OSs were not feasible to be used for DBMS purposes and more adjustments would need to be made moving forward for that to be the case. Stonebraker encourages OS designers to take these things into account so that there can be one more standard OS that works with the everyday machine and databases. The benefits of this would be great, especially now that there is more “big data” than ever.



Review 27

Because each DBMS must be developed to work with a certain set of operating systems, the DBMS must be constrained to work within the bounds of what each OS allows. That is, a DBMS can only do what the OS lets it do. The purpose of this paper is to examine several OS services within the realms of buffer pool management, file management, and process management, and memory management while evaluating their utility in the context of implementing a DBMS.

In each realm of OS services, Stonebraker describes, in depth, the current way operating systems perform specific services, and how these approaches affect the behavior of a DBMS. After describing the advantages/drawbacks of such approaches, he offers a short comment or two on how these can be improved upon. An interesting point that was reiterated several times within the paper was the ability for the DBMS to communicate with the OS via “advice,” since data and memory management, while optimal in the context of the operating system, would not work well with a database (e.g. using LRU cache replacement for sequential access). In particular, it is mentioned that one solution for the scheduling problem’s issue of overhead is giving special privileges to DBMS processes with the intention of properly allocating processing power to database management operations. However, I can see this becoming a problem for OS development, as one would have to consider the additional questions of how many classes of processes should be created, what the best methods of handling each class are, and to what extent each facet of process and memory management should be left to the DBMS program. While this paper gives a very concise and succinct insight into many different branches of OS management that can be used to serve a DBMS program, there is not much emphasis placed on the advantages/disadvantages of the potential solutions offered.

In addition, it is mentioned that current operating system services are “too slow” and that OS designers should be attentive to the needs of DBMSs (perhaps there should be an operating system designed for this very purpose, or maybe there already exists a flavor of Linux that does this already!). There may be tradeoffs, however, in terms of performance to other processes of the operating system, so the needs of every important or relevant task on the computer should be attended to, not just DBMSs.


Review 28

This paper discusses the results of examining how well operating systems support database management functions. The examination was based on the UNIX operating system and the INGRES relational database system. This paper could be seen as a complement to Stonebraker’s other paper on database architecture. Since a DBMS can only run on an OS, the DBMS designer must work within the context of that OS. Therefore, it is important to understand whether an OS is capable of supporting database management. The OS services discussed are: buffer pool management; the file system; scheduling, process management, and interprocess communication; and consistency control. After these examinations, the paper concludes that existing OS services are either too slow or inappropriate, which is why DBMSs provide the services on their own. In addition, a later section discusses the use of paged virtual memory by the OS to support a DBMS.

What I like about this paper is the side-by-side comparison between the OS and the DBMS. For each service, it explains how the OS runs the service and what kind of service the DBMS actually requires. We get to see that what the OS provides does not always answer what the DBMS needs. The paper bluntly points out that certain services and facilities of the OS are not useful or could be better developed. Another plus is that the paper offers alternative solutions to the current problems with the OS. I think this should be helpful for OS designers because the paper provides bullet points on what is needed and what could be improved.

However, I think the paper puts too much blame on the OS designers. I think there are things that could not easily be modified. For example, in the File System section, we know that file blocks are scattered over a disk volume, yet a DBMS inevitably does considerable sequential access, resulting in considerable disk arm movement. The paper offers the alternative of having the OS provide DBMS facilities as lower-level objects and character arrays as higher-level ones. For another problem (scheduling), the DBMS is basically asking for access to the lower-level parts of the OS. I do not think that would be wise.
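
The disk arm movement mentioned above can be illustrated with a rough sketch. The helper name `arm_travel` and the block addresses below are hypothetical values made up for this example; the model simply treats head movement as the distance between consecutive block addresses and ignores rotational delay and any real disk geometry.

```python
def arm_travel(block_addresses):
    """Total head movement, in blocks, to read the addresses in the given order."""
    return sum(abs(b - a) for a, b in zip(block_addresses, block_addresses[1:]))

# Hypothetical layouts of the same six-block file.
contiguous = [100, 101, 102, 103, 104, 105]  # blocks laid out next to each other
scattered = [100, 870, 12, 455, 301, 990]    # blocks spread over the volume

print("contiguous:", arm_travel(contiguous))
print("scattered:", arm_travel(scattered))
```

Even in this simplified model, reading the file sequentially costs far more head movement when its blocks are scattered, which is the cost the paper attributes to the UNIX file layout.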



Review 29

This paper provides an in-depth analysis of existing operating system utilities and services and compares them to those typically implemented as part of a Database Management System (DBMS). The purpose of this paper is to review these OS utilities and how they differ from what is used by INGRES, a typical DBMS of the time. In each section, the author reviews the ways in which these OS utilities would need to be re-tooled, improved, or completely overhauled to suit the needs of a DBMS.

The technical contributions of this paper are to lay out the ways in which OS services and DBMS services differ and overlap. By doing this, Stonebraker is issuing a call-to-action of sorts to OS developers to support these higher-level and more complex features. He is pointing out places where OS design is lacking and is essentially being duplicated in DBMSs with only small changes. This seems to be where he is implying that OS developers need to step up their game to provide services that are more useful to users for a wider variety of applications.

The paper also points out certain aspects of an OS that limit the ability of DBMS designers to implement the structures they desire. Specifically, in the section on process management and interprocess communication, Stonebraker indicates that the server model (also presented in the Anatomy of a Database paper) is a desirable organization. However, as of 1981 it could not be built on Unix systems with the existing pipes/message-passing system unless the user implemented (and by doing so, essentially duplicated) many OS utilities. This is unfortunate because the process-per-user approach has several downsides that could be avoided, though there are tradeoffs to consider in the server model as well. All in all, operating system implementation should not dictate DBMS structures.

Though this paper does discuss important characteristics of operating system utilities that are not adequate or advisable for use in a DBMS, it is important to note that it is written about Unix in its 1981 form. Many of these utilities have greatly improved and evolved in more recent systems, and the reader must keep this in mind. Throughout this paper, Stonebraker asserts that there are problems the OS community needs to address because of the duplication happening in DBMSs, but he never indicates how these changes might be useful in other applications. I think a clearer indication of the impact these OS modifications would have on other application domains would strengthen his argument.


Review 30

Summary:
This paper gives an in-depth analysis of several DBMS services and of the benefits that could be gained if they were supported directly by the operating system. The benefits can be summarized as:
1) To prevent duplicated work of design and code for similar services in DBMS (user space) and OS (kernel space)
2) To provide more efficiency by reducing the overhead for similar services in DBMS and OS

The analysis follows a consistent pattern: it first introduces the technology currently used in the OS and the DBMS, and then points out the downsides of directly replacing the DBMS service with the similar one currently provided by the OS, normally covering several detailed aspects. After analyzing what would not work, each section ends with a summary in which the author either proposes a model that could address the downsides or leaves the problem as an open-ended discussion.

With this pattern, the author analyzes the following services: 1) Buffer Pool Management, 2) File System, 3) Scheduling, Process Management and Inter-process Communication, 4) Consistency Control, 5) Paged Virtual Memory.


Strength:
1) This paper gives an in-depth analysis of a set of important services in DBMSs and OSs, following a consistent pattern, which gives the reader both a comprehensive understanding and a clear view of the benefits that can be gained from using customized OS services in a DBMS.
2) For most of the analyses, this paper proposes better abstractions that could replace the ones currently in use and provide better support for the DBMS.


Weakness:
1) Though the paper provides in-depth analysis of all the services it raises, most of the analysis is conceptual and is based on assumptions and observations rather than carefully designed experiments. It would be more convincing if it included some quantitative analysis based on experiments.
2) This paper's main idea is to push the abstractions currently used in DBMSs into the OS. Though it is probably true that the OS could provide better support and reduce overhead by providing these services directly in the kernel, doing so would also make the OS provide functionality that violates its original purpose of offering general support for all potential applications. Moreover, since DBMSs and OSs are different research fields, keeping the two fields in sync may always be a problem. In most cases, it is hard for an OS kernel to provide DBMS-specific functionality in a timely manner. A DBMS retains more flexibility by maintaining some of these services in its own user space.




Review 31

This paper examines the applicability of several operating system services to databases. Because operating systems at the time this paper was written were mostly general purpose, the special needs of databases were not well met. While pointing out the mismatch between OS services and DBMS needs, the author also provides suggestions for improving OS support for DBMSs. The paper primarily uses the UNIX OS and the INGRES DBMS as examples. In practice, most DBMSs use their own variations of OS-provided services in user space and leave the OS facilities unused.

The author discusses the following OS services in detail:

1) Buffer pool management: The page request pattern of a DBMS is quite different from the one the operating system expects, which leads to poor performance when the OS buffer pool management facilities are used directly. Most DBMSs maintain a separate cache in user space.
2) The file system: UNIX supports a variable-size character array model. Though this model works well for other applications such as text editors, it does not provide high performance for the structured files used in DBMSs.
3) Scheduling and processes: Due to the high overhead of task switches and inter-process messaging, neither the server model nor the process-per-user model achieves the performance goals of a DBMS. The DBMS has to re-implement multitasking, scheduling, and messaging systems in user space. It would be nice if the OS could provide a special scheduling class for DBMSs.
4) Consistency control: The OS doesn’t provide the finer-grained locks on pages or records that DBMSs desire. Though the OS supports cleanup after crashes, its logic differs from that of DBMS transactions. It is possible for the OS to provide both concurrency control and crash recovery for transactions, but the buffer management problem limits the usability of such services.
5) Paged virtual memory: Though it has been claimed to be appropriate, binding files into a user’s paged virtual address space is not a good solution for databases. Large files can require a huge page table that doesn’t fit into memory. This approach also has the same buffering problem discussed under buffer pool management.
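
The page table concern in the last item can be illustrated with some back-of-the-envelope arithmetic. The 100 MB file, 512-byte pages, and 4-byte page table entries below are assumed values chosen for illustration, and the helper `page_table_bytes` is a hypothetical name; the sketch models a flat, single-level page table only.

```python
def page_table_bytes(file_bytes, page_bytes, entry_bytes):
    """Size of a flat page table that maps every page of a file."""
    pages = -(-file_bytes // page_bytes)  # ceiling division
    return pages * entry_bytes

# Assumed sizes: 100 MB file, 512-byte pages, 4-byte page table entries.
file_size = 100 * 1024 * 1024
overhead = page_table_bytes(file_size, page_bytes=512, entry_bytes=4)
print(overhead // 1024, "KiB")  # prints "800 KiB"
```

On a machine of that era with a small main memory, a table of this size per bound file would itself have to be paged, which is the overhead the item above refers to.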

Some of the problems addressed in this paper have been solved in modern operating systems, which provide more advanced facilities such as efficient thread libraries.

