What is Parallel Computing?
A Not Too Serious Explanation.

Before I explain parallel computing, it's important to understand that

You can run, but you can't hide.
I'll come back to this later.


Suppose you have a lot of work to be done, and want to get it done much faster, so you hire 100 workers. If the work is 100 separate jobs that don't depend on each other, and they all take the same amount of time and can be easily parceled out to the workers, then you'll get it done about 100 times faster. This is so easy that it is called embarrassingly parallel. Just because it is embarrassing doesn't mean you shouldn't do it, and in fact it is probably exactly what you should do. A different option would be to parallelize each job, as discussed below, and then run these parallelizations one after another. However, as will be shown, this is probably a less efficient way of doing the work. Occasionally this isn't true because on computers, doing all of the job on one processor may require storing many results on disk, while the parallel job may spread the intermediate results in the RAM of the different processors. RAM is much faster than disks. However, if the program isn't spending a lot of time using the disk then embarrassingly parallel is the smart way to go. Assume this is what you should do unless you analyze the situation and determine that it isn't. Embarrassingly parallel is simple, and if you can get the workers do it for free then it is the cheapest solution as well.

What if the jobs take widely different amounts of time, but still have no interaction? If you have enough of them, just start handing them out until every worker has one, and then when workers are done they come back and request another. This will work fairly well, especially if you can put longer jobs first and shorter ones later. This form of parallelization is still pretty simple (unless it is teenagers that are doing the jobs - see "Do Teenagers Deserve Negative Wages?"). Many of the departmental computers that have more than one processor are run in this fashion, servicing jobs as they arrive. This is also how many airlines run their check-in queue: in queuing theory this is known as a "single queue multiple server" system, which is more efficient than the "multiple queue multiple server" systems common in checkout lines at grocery stores.

Suppose instead that this work you have is only a single job, but it takes a very long time. Now you have to do something to reorganize the job, somehow breaking it into pieces that can be done concurrently. For example, if the job is to build a house, it can be broken up into plumbing, electrical, etc. However, while many jobs can be done at the same time, some have specific orderings, such as putting in the foundation before the walls can go up. If all of the workers are there all of the time, then there will be periods when most of them are just waiting around for some task (such as the foundation) to be finished. Not very cost-effective, and you are not getting the job done 100 times faster. Such is the life of a parallel programmer.

Definition: Parallel computing is the use of two or more processors (cores, computers) in combination to solve a single problem.

The programmer has to figure out how to break the problem into pieces, and has to figure out how the pieces relate to each other. For example, a parallel program to play chess might look at all the possible first moves it could make. Each different first move could be explored by a different processor, to see how the game would continue from that point. At the end, these results have to be combined to figure out which is the best first move. Actually, the situation is even more complicated, because if the program is looking ahead several moves, then different starts can end up at the same board position. To be efficient, the program would have to keep track of this, so that if one processor had already evaluated that position, then others would not waste time duplicating the effort. This is how must parallel chess-playing systems work, including the famous IBM Deep Blue machine that beat Kasparov.

Is Parallel Computing Easier or Harder than Serial Computing?

(Standard computing is also known as "serial computing", so you've been doing serial computing for years!)


Most of the universe is inherently parallel, with numerous things happening simultaneously. You may put your pants on one leg at a time, but millions of people are putting their pants on simultaneously. Many problems have an abundance of natural parallelism that a programmer can exploit. Sometimes people are forced to rethink the problem when they try to develop a parallel program, and they change their entire approach in order to directly utilize the inherent parallelism.


Why can't you hide?

In the past, when someone needed to solve bigger problems they could often wait a bit, and a faster computer would come out. This was mainly due to Moore's law, which people interpreted to mean that the speed of computers would double about every two years. However, this isn't true any longer: if you look at the GHz speed of the processors in a desktop computer 2 years ago, versus the speed now, it has barely, if any, increased. Basically this happened because they can't make reliable low-cost processors that run significantly faster.

What has doubled, however, is the number of processors on a single chip. Some smartphones now have 8 cores (processors), you can buy CPU chips with 12 cores, and this number will soon increase. Even the Apple watch now has 2 cores. These are called "multi-core" or "many-core" chips. There are also graphics processing units (GPUs) with over 1000 highly specialized processors. This is closer to what Moore was talking about, because what he really said was that the number of transistors would keep doubling. The rate of improvement is slowing down, but significant increases in the number of transistors should continue for at least a decade.

If there are only a few cores in your computer then it is pretty easy to find tasks that can be done simultaneously, such as waiting for keystrokes and running a browser, i.e., embarrassingly parallel. However, as the number of processors keeps increasing, the parallelization problems mentioned above become increasingly serious. There are now parallel computers that have > 1,000,000 cores, and people are planning for ones that will have > 100,000,000.

Overall, this means that there is a massive need to make use of the parallelism in multi-core chips for almost any problem, and to use many of these combined together for large problems. Job security for me! (and you, if you learn parallel computing)

Some Resources on Parallel Computing

If you want to learn more about parallel computing, there are some books available, though I don't like most of them. Many colleges and universities teach classes in this subject, and there are some tutorials available. For example, the author teaches a parallel computing class and a tutorial on parallel computing. These are aimed at larger problems, not ones that a single multi-core chip is sufficient for. However, my university (and many others) is developing courses addressing multi-core parallelism. Some of my colleagues have even developed a few teaching modules used in high school and elementary school classes, where the students become the parallel processors. If you attend my tutorial or class you too will become a processor in a parallel computer, and one of your tasks will be to flip a coin. You'll have to attend to see why. Don't forget to bring a coin.

There are also resources available via the web - here are some pointers to parallel computing resources such as manuals, software, parallel computers, etc.


Some addresses for me:

Quentin F. Stout
Computer Science and Engineering
University of Michigan
Ann Arbor, MI 48109-2121

734-763-1518 (office, messages)      734-763-8094 (fax)
qstout @ umich · edu      www.eecs.umich.edu/~qstout/

If you are close enough, "Hi" is an effective address.

Quentin Copyright © 2000-2018 Quentin F. Stout