Quentin F. Stout

Computer Science and Engineering, University of Michigan

**Abstract:**
Systems connected in a grid fashion have occurred throughout parallel computing
history, from cellular automata to sensor networks (typically an irregular
grid) to interconnection networks for supercomputers. For such systems
physical location and distance play an important role, as opposed to, say,
PRAMs or serial
computers.
More recently another physical consideration, power consumption, has
taken on importance. It is a factor in individual processors for
cell phones and laptop computers, in sensor networks, and in supercomputers.
The latter is different from the others in that the power is external and does
not diminish over time.

This paper addresses the problem of minimizing the peak power required. This is of importance in supercomputers, and also in systems such as single chips with many small processors. For example, there can be image processing chips which both detect images and do nontrivial processing on them. It is not realistic to assume that the entire chip can have most of its circuits active at the same time, especially since decreasing feature size makes it possible to pack even more processing power onto a chip. Similarly, peak power requirements for larger systems introduce packing constraints and affect the cost of the power supply and air conditioning needed. Packing constraints in turn affect communication time. For example, peak power considerations affected features of the BlueGene/L such as the use of slower, but numerous, processors, each with relatively little memory compared to other supercomputers.

We utilize the basic *mesh-connected computer* model: the system has
n processors arranged as an
√n × √n grid, where each processor can communicate
only with its
immediate neighbors (either the 4 neighbors sharing an edge, or the 8 sharing
an edge or corner). Each processor can store a fixed number of words of
logarithmic length, and all operations on these words, including sending one to
a neighbor, take constant time and energy.

Standard algorithms for mesh-connected computers have all processors on
at all times, i.e., peak energy n.
Since the communication diameter √n is a lower bound for any nontrivial
problem, n^{3/2} is a lower bound on the total
energy these algorithms use.
For problems such as sorting the total data movement required shows that this
energy is required, and hence any reduction in peak power will force an
increase in the time.
However, many other problems require only o(n^{3/2}) data movement,
and hence there is the possibility that the peak power can be reduced without
increasing the time.

We show that this is indeed the case for several algorithms on naturally 2-dimensional data sets, such as images and adjacency matrices of graphs. Power efficient algorithms are given for problems involving labeling connected components, deciding if a maze can be solved, finding minimal spanning trees, etc. We also show that efficient algorithms exist for problems involving graphs given as an unordered set of edges, where the running time is a function of the number of vertices as well as of n. These algorithms have the property that most of the processors are turned off most of the time.

Squirrels are used to help describe the algorithms, in that the squirrels scurry around the processors, carrying information from place to place. The presence of a squirrel indicates that the processor is on, and the information indicates the contents of a message being passed. The number of squirrels indicates the peak power. Rodents are often forced to solve mazes, so it seemed appropriate to use them here to illustrate maze-solving, and other, algorithms. I'm not sure if squirrels normally cooperate, so perhaps these some other rodents such as naked mole rats.

Note that the cellular automata model here can be viewed as representing the limits of classical physics, since it takes a fixed amount of time for light to travel a fixed distance. Thus the time-optimal algorithms are optimal for classical physics in 2-dimensions. Further, the algorithms minimizing various power aspects show the minimum amount of energy needed to solve the problem within classical physics. The author has results for 3-dimensional space as well, but there wasn't room to include them here. In some cases they are similar, but for others 3 dimensions enables new approaches.

**Keywords:** peak power consumption, low power, energy, parallel computing,
mesh-connected computer,
cellular automata, sensor network, supercomputer, squirrel algorithm

**Complete paper**.
A preliminary announcement of these results
appeared in *Proceedings of Symposium on Parallelism in
Algorithms and Architectures (SPAA) * 2006. In that paper I
used rats instead of squirrels since normally mazes are solved by
rats, not squirrels. However, there were many complaints about
the thought of rats running around doing our bidding, and so I
changed to squirrels. Perhaps too many people have seen the
movie *Willard*.

I never got around to publishing the full paper. The paper Low Power Mesh Algorithms for Image Problems is a subset of the full paper.

- Space, Time, Power: Evolving Concerns for Parallel Algorithms, a talk about the evolution of parallel computers and the goals/concerns that people had. It includes mention of the algorithms in this paper.
- My papers in parallel computing, and an overview of my research
- A whimsical explanation of parallel computing, a tutorial, Parallel Computing 101, and a list of parallel computing resources

Copyright © 2006-2020 Quentin F. Stout. |