Ken: A Platform for Fault-Tolerant Distributed Computing
- Description
- Ken is a lightweight C implementation of a
rollback-recovery protocol that provides crash-restart
resilience to distributed applications. Ken unifies and
automates reliability for both application data "at
rest" (local process state) and data "in
motion" (messages in a distributed system). Ken
ensures that crash-restart failures (power failures,
kernel panics, process crashes) can't corrupt or destroy
data, and Ken guarantees that messages are reliably
delivered and processed by recipients in
process-pairwise-FIFO order. Ken furthermore provides
strong global correctness guarantees and
prevents crash-restart failures and packet losses from
causing a distributed system to emit incorrect outputs.
Finally, Ken's strong guarantees compose
effortlessly when independently developed Ken-based
distributed systems are integrated.
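To make the process-pairwise-FIFO guarantee concrete, here is a minimal sketch of receiver-side bookkeeping. It is not taken from the Ken source: the names `fifo_receiver`, `fifo_receive`, and `MAX_PEERS` are hypothetical, and it assumes each sender numbers its messages to a given receiver consecutively from 0. A real implementation would also have to persist the counter atomically with the effects of processing the message, which this sketch leaves out.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PEERS 8  /* hypothetical fixed peer count for this sketch */

/* Per-sender receive state: the next sequence number we will accept. */
typedef struct {
    uint64_t next_expected[MAX_PEERS];
} fifo_receiver;

typedef enum { FIFO_DELIVER, FIFO_DUPLICATE, FIFO_OUT_OF_ORDER } fifo_verdict;

/* Decide what to do with message number `seq` arriving from `sender`. */
static fifo_verdict fifo_receive(fifo_receiver *r, int sender, uint64_t seq)
{
    assert(sender >= 0 && sender < MAX_PEERS);
    if (seq < r->next_expected[sender])
        return FIFO_DUPLICATE;     /* already processed: acknowledge again, do not reprocess */
    if (seq > r->next_expected[sender])
        return FIFO_OUT_OF_ORDER;  /* gap: drop and wait for the sender to retransmit */
    r->next_expected[sender]++;    /* in order: hand to the application exactly once */
    return FIFO_DELIVER;
}
```

Because the counters are kept per sender, ordering is enforced only between each pair of processes; messages from different senders may interleave freely.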
- Source distribution
-
- Contributed extensions & enhancements
- 21 November 2012: Alternate "Go-Back-N"
transport contributed by Sunghwan Yoo of Purdue.
See file "contrib_kenext_gbn.c".
Use in place of "kenext.c" when building.
As of 3 April 2013 this is bundled
into the main distribution tarball.
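For readers unfamiliar with the term, the following is a minimal sketch of textbook Go-Back-N sender state. It is not derived from contrib_kenext_gbn.c; every name here (`gbn_sender`, `gbn_send`, `gbn_on_ack`, `gbn_on_timeout`) is hypothetical, and sequence-number wraparound is ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t base;      /* oldest unacknowledged sequence number */
    uint32_t next_seq;  /* next sequence number to assign */
    uint32_t window;    /* N: maximum number of frames in flight */
} gbn_sender;

static void gbn_init(gbn_sender *s, uint32_t window)
{
    assert(window > 0);
    s->base = 0;
    s->next_seq = 0;
    s->window = window;
}

/* Assign the next sequence number if the window has room.
 * The caller is responsible for actually transmitting the frame. */
static bool gbn_send(gbn_sender *s, uint32_t *seq_out)
{
    if (s->next_seq - s->base >= s->window)
        return false;               /* window full: wait for an ack */
    *seq_out = s->next_seq++;
    return true;
}

/* Cumulative ack: every frame with sequence number <= ack has arrived. */
static void gbn_on_ack(gbn_sender *s, uint32_t ack)
{
    if (ack >= s->base && ack < s->next_seq)
        s->base = ack + 1;          /* slide the window; stale acks are ignored */
}

/* On timeout, "go back": the caller retransmits this many frames,
 * starting from s->base. */
static uint32_t gbn_on_timeout(const gbn_sender *s)
{
    return s->next_seq - s->base;
}
```

The receiver side of the textbook protocol accepts only the next in-order frame and discards everything else, which is why the sender must resend the whole outstanding window after a timeout.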
- Follow-on Projects
- MaceKen:
Integrates Ken into the Mace distributed-systems toolkit.
- SchemeKen:
Integrates Ken into a Scheme interpreter.
- V8Ken:
Integrates Ken into the V8 JavaScript engine. The long-term
goal of this project is NodeKen: bringing the benefits of
Ken to the Node.js server-side JavaScript platform.
Senior project leader Tom Van Cutsem has written a nice description of the
V8Ken value proposition.
- HP Indigo printing presses incorporate a Ken-style persistent
heap, which has dramatically reduced recovery times following
crashes due to power outages. A brief tech report
describes this important internal technology transfer.
- famus is a minimalist, stand-alone, clean-slate
re-implementation of the mechanism underlying Ken's persistent heap.
- Publications
- The C implementation of Ken and the integration of Ken
into the Mace distributed systems toolkit are described
in Yoo et al.,
"Composable Reliability for Asynchronous Systems,"
in the proceedings of the
2012 USENIX Annual Technical Conference.
An early abstract description of the Ken rollback-recovery
protocol and a characterization of its properties is
available in
HP Labs Tech Report 2010-155.
See the USENIX ATC paper (or the source code) for a more
up-to-date description of the implementation. Ken is
abstracted from its implementation in Waterken, a Java
platform by Tyler Close. The tech report contains
additional detail on Ken's genealogy. Kernel support for
Ken-style persistent heaps is described in Park et al.,
"Failure-atomic msync()", EuroSys 2013.
- Frequently Asked Questions (FAQs)
- Registration & Support
- Users are not required to register in any way, but may
"opt in" to receive notifications of changes to
Ken. If you have questions about how to use Ken or about
its implementation, please first seek answers in
Ken-related publications, the FAQs, and the source code.
If that fails you may write to the Ken team via
Terence Kelly. The Ken team wants to
encourage the development of an ecosystem of mutually
supportive fault-tolerant distributed applications and
services, and to the extent possible we will try to
support efforts toward that goal.