Questions about Rio

Does Rio work with write-back CPU caches? Yes. The key is to get the data from the CPU cache to memory before the CPU is reset. This is easy on our current Alpha platform (DEC 3000). The DEC 3000 provides a physical reset button that jumps to the console. A console (or boot) routine can flush the cache before rebooting. For PCs, the Rio port to FreeBSD uses a "safe sync" mechanism that flushes the data to disk during a crash. Safe sync is triggered from the low-level keyboard interrupt handler. We are validating and refining this method, but early measurements indicate that it works for over 95% of crashes.
Does Rio provide reliability or availability? Rio's focus is reliability (not losing data). In general, availability (being able to access data continuously) requires hardware replication. Rio can provide availability in much the same way that disks can. For example, a cheap serial line from the memory card to another computer would provide availability while rebooting or during hardware failure. With safe sync, Rio can provide availability with dual-ported disks (as with any disk-based system). We are exploring using DEC's Memory Channel to provide high availability and reliability in the face of hardware failures.
Does Rio survive hardware failure? Rio's focus is on surviving software crashes, because prior studies indicate that software failures outnumber hardware failures 10 to 1. Rio can be extended to survive hardware failures. A cheap serial line from the memory card to another computer allows data to be transferred to another machine in case of hardware failure. Safe sync with dual-ported disks allows Rio to survive hardware failures that result in machine crashes. We are exploring using DEC's Memory Channel to provide high availability and reliability in the face of hardware failures.

Questions about Vista

What are the target applications for Vista? The applications that benefit most from Vista use persistent data (e.g. files, databases), have working sets that fit in main memory (i.e. they don't thrash), and can tolerate downtimes of 30 seconds during reboot. There are certainly some applications for which Vista is not a complete solution (e.g. banking). These applications require ultra-high reliability and availability, and hence should use replication. However, we believe that most desktop applications can take advantage of Vista's fast transactions. Here are some examples:
- object-oriented databases (e.g. CAD)
- file system operations (e.g. make each open/close a transaction)
- persistent Java (e.g. make each method a transaction)
- persistent processes (e.g. store the process state in Vista so that a process can survive crashes)
- real-time databases (e.g. manufacturing database storing sensor data)
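As a rough illustration of the "make each operation a transaction" pattern in the examples above, the following Python sketch wraps updates to an in-memory dictionary in a transaction that restores the old values if the body fails. The `ToyStore` class and its `put` and `transaction` methods are hypothetical names invented for this example. They are not Vista's interface, and the sketch does not model how memory is made to survive a crash.

```python
from contextlib import contextmanager

_MISSING = object()  # sentinel: the key did not exist before the transaction


class ToyStore:
    """Hypothetical in-memory store with all-or-nothing updates (not Vista's API)."""

    def __init__(self):
        self.data = {}
        self._undo = None  # undo log; None when no transaction is active

    def put(self, key, value):
        if self._undo is not None and key not in self._undo:
            # Remember the value the key had when the transaction began.
            self._undo[key] = self.data.get(key, _MISSING)
        self.data[key] = value

    @contextmanager
    def transaction(self):
        self._undo = {}
        try:
            yield self
        except Exception:
            # Abort: restore every key touched by the transaction.
            for key, old in self._undo.items():
                if old is _MISSING:
                    self.data.pop(key, None)
                else:
                    self.data[key] = old
            raise
        finally:
            self._undo = None


store = ToyStore()
with store.transaction():
    store.put("sensor_1", 20.5)  # commits when the block exits normally

try:
    with store.transaction():
        store.put("sensor_1", 99.0)
        store.put("sensor_2", 1.0)
        raise RuntimeError("simulated failure mid-transaction")
except RuntimeError:
    pass

print(store.data)  # {'sensor_1': 20.5}
```

The second transaction leaves no trace: the overwritten value is restored and the newly created key is removed.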
Why doesn't Vista handle concurrency? Vista strives to provide the minimal building block for transactions. Many applications are single-threaded and hence need no concurrency control. Applications that do need concurrency control can layer a locking mechanism on top of Vista. Vista's fast transactions minimize lock hold times and hence make this job easier. We simply do not want to penalize single-threaded applications, which have no need to lock data.
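One way to picture locking layered on top of a single-threaded transaction library is sketched below in Python. `ToyTransactions` is a hypothetical stand-in with invented `begin`, `commit`, and `abort` methods, and `LockedTransactions` is an equally hypothetical wrapper. Neither reflects Vista's real interface. The sketch uses one coarse lock, which is the simplest possible policy, and an application could substitute finer-grained locking.

```python
import threading


class ToyTransactions:
    """Hypothetical single-threaded transaction library (stand-in, not Vista)."""

    def __init__(self):
        self.balance = 0
        self._saved = None

    def begin(self):
        self._saved = self.balance  # snapshot used to undo on abort

    def commit(self):
        self._saved = None

    def abort(self):
        self.balance = self._saved
        self._saved = None


class LockedTransactions:
    """Concurrency control added by the application, outside the library."""

    def __init__(self, inner):
        self.inner = inner
        self._lock = threading.Lock()

    def run(self, body):
        # The lock is held only for the duration of one transaction, so
        # shorter transactions directly mean shorter lock hold times.
        with self._lock:
            self.inner.begin()
            try:
                result = body(self.inner)
            except Exception:
                self.inner.abort()
                raise
            self.inner.commit()
            return result


def deposit(txn):
    txn.balance += 1


def worker(shared, count):
    for _ in range(count):
        shared.run(deposit)


shared = LockedTransactions(ToyTransactions())
threads = [threading.Thread(target=worker, args=(shared, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared.inner.balance)  # 4000
```

A single-threaded application would call `ToyTransactions` directly and pay nothing for the lock.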
Why isn't group commit good enough? Group commit is a technique for improving throughput. Vista improves latency, which is a more general way of improving throughput. Group commit works only if there are concurrent, independent transactions. Vista works even for applications with serial or dependent transactions. Finally, Vista is 100 times faster than a system with group commit.
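The difference between improving throughput by batching and improving it by cutting latency can be sketched with a back-of-the-envelope model. The model and every number in it are illustrative assumptions chosen only to show the shape of the argument. They are not measurements of Vista or of any group commit system, and the function names are invented for this example.

```python
def group_commit_throughput(commit_latency_s, group_limit, concurrent_txns):
    """Transactions per second when one commit write is shared by a batch.

    The batch can only contain transactions that are ready at the same
    time, so it is capped by the number of concurrent, independent ones.
    """
    batch = min(group_limit, concurrent_txns)
    return batch / commit_latency_s


def low_latency_throughput(commit_latency_s):
    """Transactions per second for back-to-back (serial) transactions."""
    return 1.0 / commit_latency_s


SLOW_COMMIT_S = 0.010    # assumed cost of one synchronous commit write
FAST_COMMIT_S = 0.00001  # assumed cost of a commit that avoids that write
GROUP_LIMIT = 32         # assumed maximum batch size

for concurrent in (1, 8, 50):
    rate = group_commit_throughput(SLOW_COMMIT_S, GROUP_LIMIT, concurrent)
    print(f"group commit, {concurrent:2d} concurrent txns: {rate:10.0f} txn/s")

print(f"low-latency commit, serial txns:  {low_latency_throughput(FAST_COMMIT_S):10.0f} txn/s")
```

In this model, group commit helps only as concurrency rises, and a chain of dependent transactions (one at a time) gains nothing from it, while a lower commit latency raises throughput in every case.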
Does Vista work for working sets larger than main memory? Vista operates correctly with data sets that are larger than main memory. As with any transaction system or database, Vista performs poorly when thrashing. While Vista cannot make the disk I/Os disappear in this case, it can make the disk writes cheaper by delaying them safely in main memory and scheduling them efficiently.
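The idea of delaying writes in memory and then scheduling them efficiently can be illustrated with a small Python sketch. `DelayedWriteBuffer` is a hypothetical class written for this example. It shows two generic techniques, coalescing repeated writes to the same block and flushing in ascending block order, and is not a description of how Vista or Rio actually schedules disk writes. Delaying writes is only safe if the buffered data survives a crash, which this sketch does not model.

```python
class DelayedWriteBuffer:
    """Hypothetical buffer: coalesces writes per block, flushes in block order."""

    def __init__(self):
        self._pending = {}  # block number -> latest payload for that block

    def write(self, block, payload):
        # A later write to the same block replaces the earlier one, so the
        # disk sees at most one write per block per flush.
        self._pending[block] = payload

    def flush(self, write_block):
        # Ascending block order approximates a single sweep across the disk.
        order = sorted(self._pending)
        for block in order:
            write_block(block, self._pending[block])
        self._pending.clear()
        return order


disk_log = []
buf = DelayedWriteBuffer()
for block, payload in [(7, b"a"), (2, b"b"), (7, b"c"), (5, b"d")]:
    buf.write(block, payload)

buf.flush(lambda block, payload: disk_log.append((block, payload)))
print(disk_log)  # [(2, b'b'), (5, b'd'), (7, b'c')]
```

Four logical writes become three disk writes, issued in block order rather than arrival order.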