Questions about Rio
- Does Rio work for write-back CPU caches?
Yes. The key is getting the data from the CPU cache to memory before the
CPU is reset. This is easy on our current Alpha platform (DEC 3000).
The DEC 3000 provides a physical reset button that jumps to the console.
A console (or boot) routine can flush the cache before rebooting. For
PCs, the Rio port to FreeBSD uses a "safe sync" mechanism that flushes the
data to disk during a crash. Safe sync is triggered from the low-level
keyboard interrupt handler. We are validating and refining this method, but
early measurements indicate this works for over 95% of crashes.
- Does Rio provide reliability or availability?
Rio's focus is reliability (not losing data). In general, availability (being
able to access data continuously) requires hardware replication. Rio can
provide availability in much the same way that disks can. For example, a cheap
serial line from the memory card to another computer would provide availability
while rebooting or during hardware failure. With safe sync, Rio can provide
availability with dual-ported disks (as with any disk-based system). We are
exploring using DEC's Memory Channel to provide high availability and
reliability in the face of hardware failures.
- Does Rio survive hardware failure?
Rio's focus is on surviving software crashes, because prior studies indicate
that software failures outnumber hardware failures 10-to-1. Rio can
be extended to survive hardware failures. A cheap serial line from the memory
card to another computer allows data to be transferred to another machine in
case of hardware failure. Safe sync with dual-ported disks allows Rio to
survive hardware failures that result in machine crashes. We are exploring
using DEC's Memory Channel to provide high availability and reliability in
the face of hardware failures.
Questions about Vista
- What are the target applications for Vista?
Applications that can make the best use of Vista use persistent data
(e.g. files, databases), have working sets that fit in main memory (i.e.
don't thrash), and can tolerate down times of 30 seconds during reboot.
There are certainly some applications for which Vista is not a complete
solution (e.g. banks). These applications require ultra-high reliability
and availability, and hence should use replication. However, we believe
that most desktop applications can take advantage of Vista's fast
transactions. Here are some examples:
- object-oriented databases (e.g. CAD)
- file system operations (e.g. make each open/close a transaction)
- persistent Java (e.g. make each method a transaction)
- persistent processes (storing the process state in Vista enables
a process to survive crashes)
- real-time databases (e.g. manufacturing database storing sensor
data)
- Why doesn't Vista handle concurrency?
Vista strives to provide the minimal building block for transactions.
Many applications are single-threaded and hence need no concurrency control.
Applications that do need concurrency control can layer a locking mechanism
on top of Vista. Vista's fast transactions minimize lock hold times and
hence make this job easier. We just don't want to penalize single-threaded
applications (that don't need to lock data).
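The layering described above can be sketched as follows. This is a minimal
illustration, not Vista's interface: TinyTxStore, LockedTx, and demo are
hypothetical names, and the in-memory undo log only stands in for whatever
transaction primitive sits underneath. The point is that the primitive
itself takes no locks, and a multi-threaded application wraps it in one.

```python
import threading

_MISSING = object()  # marks "key did not exist before the transaction"


class TinyTxStore:
    """Hypothetical single-threaded transaction primitive (not Vista's API)."""

    def __init__(self):
        self.data = {}
        self._undo = None

    def begin(self):
        self._undo = {}

    def set(self, key, value):
        # Record the old value once per key so abort() can restore it.
        if key not in self._undo:
            self._undo[key] = self.data.get(key, _MISSING)
        self.data[key] = value

    def commit(self):
        self._undo = None  # changes are already in place; drop the undo log

    def abort(self):
        for key, old in self._undo.items():
            if old is _MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = old
        self._undo = None


class LockedTx:
    """Locking layered on top: one lock held for the whole transaction."""

    def __init__(self, store):
        self.store = store
        self.lock = threading.Lock()

    def __enter__(self):
        self.lock.acquire()
        self.store.begin()
        return self.store

    def __exit__(self, exc_type, exc, tb):
        try:
            if exc_type is None:
                self.store.commit()
            else:
                self.store.abort()
        finally:
            self.lock.release()
        return False  # let any exception propagate after the abort


def demo(threads=4, deposits=1000):
    store = TinyTxStore()
    tx = LockedTx(store)

    def worker():
        for _ in range(deposits):
            with tx as s:
                s.set("balance", s.data.get("balance", 0) + 1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return store, tx


if __name__ == "__main__":
    store, _ = demo()
    print(store.data["balance"])
```

A single-threaded program would call begin, set, and commit on the store
directly and never pay for the lock. The shorter each transaction is, the
shorter the lock is held in LockedTx.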
- Why isn't group commit good enough?
Group commit is a technique for improving throughput. Vista reduces
latency, which improves throughput in a more general way. Group
commit works only if there are concurrent, independent transactions.
Vista works even for applications with serial or dependent transactions.
Finally, Vista is 100 times faster than a system with group commit.
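The difference between the two approaches can be made concrete with a
deliberately simple model. The function below and the numbers fed to it are
assumptions chosen for illustration, not measurements of Vista or of any
disk: each commit takes a fixed time, and group commit can fold up to
max_group concurrent, independent transactions into one commit.

```python
def throughput(commit_latency_s, concurrent_independent_txns=1, max_group=1):
    """Transactions per second under a simplified model.

    Only transactions that are both concurrent and independent can share
    a commit, so the batch size is capped by both values.
    """
    batch = min(concurrent_independent_txns, max_group)
    return batch / commit_latency_s


if __name__ == "__main__":
    slow = 0.010    # placeholder commit latency in seconds
    fast = 0.0001   # placeholder for a much lower commit latency

    # Serial or dependent transactions: grouping has nothing to batch.
    print(throughput(slow, concurrent_independent_txns=1, max_group=32))
    # Many independent transactions: grouping helps.
    print(throughput(slow, concurrent_independent_txns=32, max_group=32))
    # Lower latency helps even when nothing can be grouped.
    print(throughput(fast, concurrent_independent_txns=1, max_group=1))
```

In this model a chain of dependent transactions gets the same throughput
with or without grouping, while cutting the commit latency raises throughput
in every case.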
- Does Vista work for working sets larger than main memory?
Vista operates correctly with data sets that are larger than main
memory. As with any transaction system or database, Vista performs
poorly when thrashing. While Vista cannot make the disk I/Os disappear
in this case, it can make the disk writes cheaper by delaying them safely
in main memory and scheduling them efficiently.
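One way to picture "delaying and scheduling" is the sketch below. It is an
assumption about the general technique, not a description of Vista's or
Rio's code: DelayedWriter is a hypothetical name, repeated writes to the
same block are coalesced in memory, and a flush issues the surviving writes
in ascending block order. Whether the delay is safe depends on the memory
surviving crashes, which the sketch does not model.

```python
class DelayedWriter:
    """Hypothetical buffer that coalesces and orders delayed disk writes."""

    def __init__(self):
        self.dirty = {}            # block number -> latest contents
        self.writes_requested = 0

    def write(self, block, data):
        # A later write to the same block replaces the earlier one,
        # so only one disk write is needed for that block.
        self.writes_requested += 1
        self.dirty[block] = data

    def flush(self, disk_write):
        # Issue writes in ascending block order to reduce seeking.
        order = sorted(self.dirty)
        for block in order:
            disk_write(block, self.dirty[block])
        self.dirty.clear()
        return order


if __name__ == "__main__":
    disk = {}
    w = DelayedWriter()
    w.write(7, b"a")
    w.write(3, b"b")
    w.write(7, b"c")
    print(w.flush(disk.__setitem__), w.writes_requested)
```

Here three requested writes become two disk writes, issued as block 3 and
then block 7.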