University of Michigan

**Abstract**:
We analyze routing messages on hypercube computers which have faulty
processors and/or communication links. We consider the basic problem
of routing a single message from an arbitrary source to an arbitrary
destination. Here ``fault'' can be quite general, including,
for example, a link currently unavailable due to congestion.

A framework for the analysis of fault tolerant routing schemes is presented. It includes differing routing schemes, routing information, and fault distribution models. Using random routing, under the one-step local information routing model, we show that the a priori probability of successful message routing is high even for an exceedingly large number of faults. We also analyze the behavior of ``sidetracking'', a routing method which combines the concepts of local information and randomization. The empirical performance of the sidetracking algorithm indicates strongly that, in the limit as the cube dimension grows, for any fixed probability of node failure, the probability of successful routing converges to 1.

**Keywords**:
message routing, fault tolerance, resilience,
hypercube computer, parallel computing, randomized routing, random graph

**Complete paper**.
This appears in
*Proc. 3rd Conf. Hypercube Concurrent Computers and Applications*
(1988), ACM, pp. 318-327. It was digitized by ACM.

- J. Gordon,
*Efficient Schemes for Massively Fault Tolerant Parallel Communication*, PhD thesis, U. Michigan, 1990. - An overview of work of myself and students, and relevant papers, in the area of parallel computing.
- A modest
explanation of parallel computing, a
tutorial, Parallel Computing 101,
and a list of
parallel computing resources.

Copyright © 2006-2018 Quentin F. Stout |