A Response to Seltzer's Response

This article is a response to Margo Seltzer's response to my critique of her 1995 USENIX paper Logging versus Clustering: A Performance Evaluation.

Here are the four summary paragraphs from Seltzer's response (bulleted and italicized), followed by my comments.

This is a surprising comment, given the number and intensity of discussions we have had on this topic. The 1993 paper is riddled with flaws, including errors in the BSD-LFS implementation, errors in the choice of benchmarks, and errors and omissions in the presentation and analysis of the results. In fact, Seltzer personally fixed two major performance problems in BSD-LFS during the preparation of the 1995 paper, and her response web page describes a flaw in the 1993 paper's analysis. A more complete critique of the 1993 paper is available separately.

In the detailed explanation, Seltzer offers new theories about what might account for the insensitivity of LFS cleaner performance to CPU utilization. However, none of these theories is backed up with measurements (the measurements Seltzer gives neither prove nor disprove her theories). In any case, they make it even clearer that the paper does not adequately explain what is going on. I stand by my suggestion that the paper's results (and Seltzer's latest theories) should be taken with a grain of salt until the performance can be tied quantitatively to specific architectural features of LFS.

As I explained in my critique, the optimization need not affect the ability of the file system buffer cache to cache dirty data; the current approach represents a performance bug that can be fixed in a way that improves transaction-processing (TP) performance without adversely affecting other applications.

I'm delighted to see the additional measurements, because they validate my concerns and contradict Seltzer's conclusion above. The new data show that the paper erred by roughly a factor of two in its estimates of performance degradation due to fragmentation. Seltzer's new estimates of degradation are 14% for reads and 24% for writes (these are the averages of her reported numbers), whereas the average degradations reported in the paper were 6% for reads and 14% for writes. In the worst case, fragmentation degrades read performance by 33% and write performance by 47%.
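The factor-of-two claim can be checked directly from the percentages quoted above. This is back-of-the-envelope arithmetic only; the dictionaries below are my own framing of the numbers, not data from either paper:

```python
# Degradation due to fragmentation, as percentages quoted in the text.
paper   = {"read": 6,  "write": 14}   # averages reported in the 1995 paper
revised = {"read": 14, "write": 24}   # Seltzer's revised averages

# Ratio of revised estimate to original estimate for each operation.
ratios = {op: revised[op] / paper[op] for op in paper}
# read ratio ~= 2.33, write ratio ~= 1.71: roughly a factor of two
```

The read estimate was off by more than a factor of two and the write estimate by somewhat less, so "a factor of two" is a fair one-line summary of the error.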

What's most interesting about this is that the cost of fragmentation in FFS is comparable to the cost of cleaning in LFS. In Rosenblum's measurements of production LFS usage in Sprite, cleaning effectively added 20-60% to the cost of writes (but nothing to the cost of reads); in Seltzer's TP benchmark, which represents a pathological case for LFS, overall degradation due to cleaning was about 45%. However, the LFS cleaner runs in the background, so it may be possible to hide cleaning costs by cleaning during idle periods; in FFS the fragmentation is baked into the on-disk file layout, so there is no way to escape the overhead.
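To see why the two overheads are comparable, consider a simple weighted-average sketch. The function, its name, and the 50/50 read/write mix are my own illustrative assumptions, not anything measured in the papers; only the overhead percentages come from the text above:

```python
def combined_overhead(read_frac, read_ov_pct, write_ov_pct):
    """Hypothetical workload-weighted overhead (in %) for a mix that is
    read_frac reads and (1 - read_frac) writes. For intuition only."""
    return read_frac * read_ov_pct + (1.0 - read_frac) * write_ov_pct

# FFS fragmentation (Seltzer's revised averages): 14% on reads, 24% on writes.
ffs = combined_overhead(0.5, 14, 24)        # 19.0% for a 50/50 mix

# LFS cleaning (Rosenblum's Sprite range): 0% on reads, 20-60% on writes.
lfs_low  = combined_overhead(0.5, 0, 20)    # 10.0%
lfs_high = combined_overhead(0.5, 0, 60)    # 30.0%
```

Because cleaning adds nothing to reads, LFS's effective overhead shrinks as the workload becomes more read-heavy, while fragmentation taxes both reads and writes; and since cleaning runs in the background, even the write-side cost may be hidden in idle time.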

The Real Conclusions

If the original paper by Rosenblum and me is combined with all of Seltzer's data, including both the data in the two USENIX papers and the data in her web response, I think it is reasonable to draw the following conclusions:

It's also important to remember some of the other advantages of LFS that weren't addressed in Seltzer's measurements, such as faster crash recovery and the ability to handle striped disks and network servers much more efficiently than FFS. Overall, the available data suggests that LFS is a much better file system than FFS.