Frequently Asked Questions about Ken

What is Ken?: "Ken" refers to both a simple rollback-recovery protocol and its implementation as a C library. Both are described in Yoo et al., "Composable Reliability for Asynchronous Systems," 2012 USENIX Annual Technical Conference. Ken (in both senses of the term) facilitates reliable distributed application development. The name and protocol both come from Waterken, the Java platform that first implemented the Ken protocol. Waterken provides different programming abstractions than the C implementation of Ken does.
How does Ken work?: One way to think about Ken is that it starts with the "actor" or "communicating event loop" distributed programming paradigm and adds a twist: Each iteration of each event loop is an ACID transaction. It turns out that remarkably strong global correctness guarantees follow from transactional event loops.
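The idea of a transactional event-loop iteration can be sketched in a few lines of C. This is an illustrative in-memory model only, not Ken's code or API: the names state_t, outbox_t, run_turn, and add_handler are hypothetical, and durability (which Ken provides by checkpointing to storage) is left out. The sketch shows only the all-or-nothing property: a turn's state changes and outgoing messages either all take effect or all vanish.

```c
#include <assert.h>
#include <stddef.h>

enum { OUTBOX_CAP = 8 };

/* Hypothetical application state; not part of the Ken API. */
typedef struct { long counter; } state_t;

/* Messages produced by turns; released only when a turn commits. */
typedef struct { int count; long payload[OUTBOX_CAP]; } outbox_t;

/* Hypothetical handler type: returns 0 on success, nonzero to model
 * a crash in the middle of a turn. */
typedef int (*handler_fn)(state_t *s, long input, outbox_t *out);

/* Run one turn as an all-or-nothing unit. The handler works on a
 * scratch copy of the state and a private outbox. Only if it finishes
 * are the state change and the messages published together.
 * Returns 1 if the turn committed, 0 if it was discarded. */
int run_turn(state_t *committed, outbox_t *sent, handler_fn h, long input)
{
    state_t scratch = *committed;
    outbox_t pending = { 0, { 0 } };
    int i;

    assert(committed != NULL && sent != NULL && h != NULL);
    if (h(&scratch, input, &pending) != 0)
        return 0;                       /* discard scratch and pending */

    *committed = scratch;               /* publish the new state ...   */
    for (i = 0; i < pending.count && sent->count < OUTBOX_CAP; i++)
        sent->payload[sent->count++] = pending.payload[i]; /* ... and messages */
    return 1;
}

/* Example handler: adds the input to a counter and reports the total.
 * A negative input models a crash after the state was already touched. */
int add_handler(state_t *s, long input, outbox_t *out)
{
    s->counter += input;
    if (input < 0)
        return 1;
    if (out->count < OUTBOX_CAP)
        out->payload[out->count++] = s->counter;
    return 0;
}
```

Calling run_turn with a negative input leaves both the committed state and the sent messages exactly as they were before the turn, which is the property the answer above describes.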
What kinds of guarantees does Ken provide?: Tolerated failures (crash-restart failures such as power outages and OS kernel panics) can neither corrupt nor destroy local process state or messages between Ken processes in a distributed system. Messages are delivered exactly once in FIFO order between each sender-receiver pair. Ken furthermore guarantees "distributed consistency" in the sense that a distributed system can't end up in a causality-violating state wherein one process remembers having received a message but, due to crash-induced amnesia, no other process remembers having sent it. More importantly, Ken masks failures in the sense that an external observer can't infer the occurrence of tolerated failures within a Ken-based distributed system by observing the system's collective outputs. Finally, Ken's global correctness guarantees compose when independently developed Ken-based distributed systems are integrated, even when the integration was neither foreseen nor planned for.
Such strong guarantees must require substantial programmer effort, no?: On the contrary, Ken completely automates reliability. The Ken application programmer takes no explicit steps to ensure reliability other than writing atop the Ken platform. For example, the programmer does not specify the beginning and end of transactions. Apart from programming in Ken fashion, achieving Ken's guarantees is effortless and foolproof.
What is it like to program a Ken application?: You write a handler function that processes incoming messages and inputs. The handler function can allocate and de-allocate data on a persistent heap, and the handler can send messages to other Ken processes and emit outputs. There are a few other facilities provided but Ken is quite simple. If you're familiar with existing event-driven programming paradigms, e.g., GUI programming or AJAX, the learning curve will be gentle.
What kinds of applications is Ken for?: Distributed applications that must prevent crash-restart failures from causing incorrect behavior or corrupting/destroying application state. Specific programs that we have built atop Ken and MaceKen include distributed hash tables, a distributed graph analysis program, and a distributed e-commerce scenario. In our experience, programming in Ken is convenient, and Ken is a versatile platform.
But is there a specific "sweet spot" use case?: Anecdotally, it appears that programmers confronted with the requirement to protect both local process state and inter-process messages from crashes frequently employ message queuing middleware for the latter and a relational database for the former, even when relational DB features aren't fundamentally required. The RDBMS is used simply to keep application state safe, because writing homebrew checkpointing and recovery code atop an ordinary file system is too tedious and error-prone. In the RDBMS-plus-MQ pattern, it is the programmer's responsibility to orchestrate the delicate interplay between two sets of operations: transactions that evolve the database from one consistent state to the next, and operations that ensure reliable messages. The slightest mistake (e.g., failing to record an outbound message in the database) can leave the application vulnerable to crash-induced distributed inconsistency. Furthermore the individual reliability guarantees of independently developed applications written in the RDBMS-plus-MQ pattern are unlikely to compose without additional programmer effort. Ken automates both process state reliability and message reliability, masks failures globally, and preserves global correctness under composition.
What about offbeat/unforeseen uses?: Perhaps the most remarkable unforeseen use of Ken occurred in July 2012, when a small group of developers integrated Ken into a mature, full-featured Scheme interpreter. The group reports that this took them only one day, and they received zero assistance from the Ken team. The result, "SchemeKen," is to the best of their knowledge the first crash-resilient Scheme interpreter. In August 2012 the Vrije Universiteit Brussel released SchemeKen as Open Source software.
Is Ken an alternative to conventional databases?: Sometimes. If your goal is to protect the integrity of application state from crash-restart failures, and if you want to update application state via ACID transactions, then Ken might be a reasonable way to achieve these goals. Ken is especially suitable when you want to store and manipulate your data in arbitrary C/C++ data structures rather than in relational format. A conventional database is a better choice if you require features such as relational algebra, schemas, and SQL.
Is Ken an alternative to reliable-messaging middleware?: Sometimes. Ken provides only a small subset of the functionality of a full-featured message queuing middleware package. Specifically, Ken provides only reliable exactly-once message delivery in FIFO order between each sender and receiver pair. If that's your only message reliability requirement, Ken might be a reasonable way to fulfill it, particularly if Ken's other features address your other requirements.
Are Ken transactions fast?: That depends on the storage medium that provides data durability. On an enterprise-class RAID system backed by 15K RPM spinning disks, Ken transactions take a few milliseconds; ACID transactions in conventional databases take roughly as long atop the same storage medium. Flash-based SSDs would likely be faster, and emerging non-volatile memory (NVRAM) would likely be faster still. Regardless of the storage medium, data durability to preserve data integrity entails a performance overhead. Ken currently strives for reasonable performance in two ways: Ken overlaps execution of the next iteration of the event loop with committing the previous iteration's checkpoint to durable storage; and Ken's checkpoints are incremental.
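To see why incremental checkpoints help, consider a toy model in which the heap is divided into pages and only pages written since the last checkpoint are saved. This sketch is an assumption-laden illustration, not Ken's implementation: heap_t, heap_write, and checkpoint are hypothetical names, dirtiness is tracked by an explicit flag, and an in-memory array stands in for durable storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { PAGE_BYTES = 64, NUM_PAGES = 8 };

typedef struct {
    unsigned char live[NUM_PAGES][PAGE_BYTES];   /* working copy            */
    unsigned char saved[NUM_PAGES][PAGE_BYTES];  /* stand-in for storage    */
    unsigned char dirty[NUM_PAGES];              /* 1 if page was modified  */
} heap_t;

/* Write one byte and remember which page it touched. */
void heap_write(heap_t *h, size_t offset, unsigned char value)
{
    assert(h != NULL && offset < (size_t)NUM_PAGES * PAGE_BYTES);
    h->live[offset / PAGE_BYTES][offset % PAGE_BYTES] = value;
    h->dirty[offset / PAGE_BYTES] = 1;
}

/* Save only the pages modified since the previous checkpoint.
 * Returns the number of pages written. */
int checkpoint(heap_t *h)
{
    int p, written = 0;

    assert(h != NULL);
    for (p = 0; p < NUM_PAGES; p++) {
        if (!h->dirty[p])
            continue;
        memcpy(h->saved[p], h->live[p], PAGE_BYTES);
        h->dirty[p] = 0;
        written++;
    }
    return written;
}
```

The cost of a checkpoint in this model is proportional to the number of pages a turn touched rather than to the total size of the heap.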
Tell me about the alternate "Go-Back-N" transport.: Ken's default UDP-based transport retransmits messages with a simple exponential backoff until an ACK is received. This is fine for client-server interactions and it's also fine if programmers use the ken_ackd() interface to avoid overloading a recipient with too many messages. The MaceKen team at Purdue University has contributed an alternative UDP-based transport that implements the "Go-Back-N" re-transmission policy. The alternative transport is available as a replacement for the default kenext.c file. The Go-Back-N implementation has been tested and used at Purdue but has not been extensively tested in Palo Alto. It may offer superior performance and/or convenience in some situations.
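An exponential backoff schedule of the kind mentioned above can be sketched as follows. The function name backoff_delay_ms and the base and cap parameters are hypothetical. This FAQ does not state the constants Ken's transport uses, nor whether it caps the delay at all.

```c
#include <stdint.h>

/* Delay before retransmission attempt `attempt` (0-based):
 * base_ms doubled once per earlier attempt, never exceeding cap_ms. */
uint64_t backoff_delay_ms(unsigned attempt, uint64_t base_ms, uint64_t cap_ms)
{
    uint64_t delay = base_ms;
    unsigned i;

    for (i = 0; i < attempt; i++) {
        if (delay > cap_ms / 2)     /* doubling would pass the cap */
            return cap_ms;
        delay *= 2;
    }
    return delay < cap_ms ? delay : cap_ms;
}
```

A sender would wait backoff_delay_ms(n, base, cap) before its n-th retransmission and stop as soon as the ACK arrives. The cap is a design choice in this sketch that keeps a long outage from pushing the retry interval out indefinitely.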
Can you make Ken faster by relaxing its guarantees?: Yes, but our intention for the foreseeable future is to maintain Ken's correctness guarantees rather than weaken them for the sake of performance.
What about trading simplicity for speed?: Maybe some day; probably not soon. Complicating the programming model or the implementation wouldn't help reliability, which is a higher priority.
Can I use Ken with C++, particularly STL?: Yes, if you're careful. Ken is implemented in C89, but we have successfully integrated Ken into the Mace distributed systems toolkit from Purdue University, which is written in C++ and uses STL extensively. It can be done. As of August 2012 the distribution includes a sample Ken application that uses C++ STL.
What kinds of OSes can Ken run on?: Ken is intended to be portable across POSIX-compliant systems. It has been tested on HP-UX and on several distributions of Linux.
What about Mac OS X?: Members of the Ken community have reported that Ken can be made to work successfully on Mac OS X, with a bit of effort. Mac OS X appears to have a few POSIX non-compliance issues that complicate compilation, and special compilation may be required to ensure successful recovery. Furthermore, the default file system reportedly doesn't support sparse files, which Ken requires. The workaround suggestions we've heard include: change the code to catch SIGBUS rather than SIGSEGV when memory pages are dirtied; use fcntl() with F_FULLFSYNC instead of fsync() because the latter doesn't provide durability on Mac OS X; and "run Ken on a separate sparse disk image" (not yet tested) or reduce the size of Ken's state blob to circumvent the lack of sparse file support (tested successfully). Compiling with "gcc -Xlinker -no_pie" reportedly fixes an issue with mmap() that might be due to address space layout randomization. These suggestions come from the Ken community and have not been evaluated by the core Ken team because we lack Mac OS X machines. We have received a patch that embodies several of the above workarounds; write to us if you'd like to try it. (A much better solution would be for Mac OS X to support POSIX compliance, at least as an option, if it doesn't already.)
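The fsync() workaround can be expressed portably with a small wrapper. This is a sketch of the general idea, not the community patch mentioned above. The name durable_sync is hypothetical, and the code relies on the Mac OS X full-sync fcntl command, which the system headers spell F_FULLFSYNC.

```c
#include <fcntl.h>
#include <unistd.h>

/* Flush a file's data as far toward stable storage as the platform allows.
 * Returns 0 on success, -1 on error with errno set. */
int durable_sync(int fd)
{
#ifdef F_FULLFSYNC
    /* Mac OS X: also ask the drive to flush its own write cache. */
    if (fcntl(fd, F_FULLFSYNC) == 0)
        return 0;
    /* Some file systems reject F_FULLFSYNC; fall back to fsync(). */
#endif
    return fsync(fd);
}
```

On systems whose headers do not define F_FULLFSYNC, the wrapper compiles down to a plain fsync() call, so the same source builds on Linux and HP-UX unchanged.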
Which Open Source license covers Ken?: BSD.
Can you help me get started with Ken?: The Ken team will try to help those who have tried to help themselves, particularly if you're committed to building something useful atop Ken. We will also try to leverage support from the Ken user and developer community. If you're interested in improving the Ken infrastructure rather than developing applications atop Ken, we can try to work together.