EECS 587, Parallel Computing
I'll admit as many people as possible, up to the room capacity.
Highest priority goes to CSE students, followed by candidates/precandidates from other departments.
You Can't Hide
It is almost impossible to buy a computer that isn't parallel.
Even an iPhone has 2 cores. Some graphics processing chips (GPUs) have 128 specialized compute cores,
and some supercomputers incorporate these chips (though the chips
are very difficult to use efficiently).
The Tianhe-2 supercomputer has > 3,000,000 cores and a theoretical peak
performance of > 50 Petaflops, i.e., 50,000,000,000,000,000 floating
point operations per second.
The number of cores/chip will continually increase, and hence
parallel computing is needed to make use of whatever system you buy.
This is especially true for compute-intensive tasks such as simulations,
analyzing large amounts of data, or optimizing complicated systems.
Parallel computers are easy to build - it's the software that takes work.
Typically about half the class is from Computer Science and Engineering,
and half is from a wide range of other areas throughout the sciences,
engineering, and medicine.
Some CSE students want to become researchers in
parallel computing or cloud computing,
while students outside CSE typically
intend to apply parallel computing to their discipline, working on
problems too large to be solved via serial codes written for
Most students are graduate students, but there are also a
few undergraduates and postdocs.
Satisfying Degree Requirements
This course can be used to satisfy requirements in a variety of
- CSE Graduate Students: the software distribution requirement
and the general 500-level requirement for the MA and PhD.
- CSE Undergraduates:
"computer oriented technical elective" requirements for
the CE and CS degrees. However, you must first get permission from the academic advisors.
- Graduate Students outside of CSE: cognate requirements.
- Computational Discovery and Engineering students: it is a core methodology course. Most students in this program should take this course.
- Scientific Computing students:
computer science distributional requirements.
Most SC students take (and need) this class.
- CSCS students: an approved course "related to
- Various other programs: general computer-related
distributional requirements.
- All students: you can take it because you want to.
The course covers a wide range of aspects of parallel computing, with the
emphasis on developing efficient software for commonly available
systems, such as the clusters here on campus.
We will emphasize using standard software, including
MPI (Message Passing Interface) and OpenMP. Use of standard software helps
make the code more portable, preserving the time invested in it.
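As a rough illustration of what standard, portable parallel code can look like, here is a minimal OpenMP sketch in C. It is not taken from the course materials; the function name `dot` and the choice of a dot product are assumptions made only for illustration. The same source compiles with or without OpenMP support: when the compiler's OpenMP option is off, the pragma is ignored and the loop simply runs serially.

```c
#include <stddef.h>

/* Illustrative sketch (hypothetical helper, not from the course):
 * dot product of two length-n vectors.
 * With OpenMP enabled, the loop iterations are divided among threads and
 * the per-thread partial sums are combined by the reduction clause.
 * Without OpenMP, the pragma is ignored and the loop runs serially. */
double dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

With GCC or Clang, OpenMP is typically enabled by the `-fopenmp` flag. Because floating-point addition is not associative, the parallel result can differ from the serial one in the last few bits.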
Because there are several parallel computing models, you also have to
learn something about various parallel architectures since there is significant
This includes aspects such as shared
vs. distributed memory, fine-grain vs. medium-grain, MIMD/SIMD,
cache structure, etc.
I'm working on arrangements so that we will also have an
assignment involving next-generation GPUs, programmed with CUDA or OpenCL.
We examine many reasons for poor parallel performance, and
a wide range of techniques for improving it. Concepts covered include
domain decomposition for geometric and graph-based problems;
deterministic, probabilistic and adaptive load balancing; and
You'll learn why modest parallelization is relatively easy to achieve,
and why efficient massive parallelization is quite difficult - in particular,
you will continually encounter Amdahl's Law.
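To make Amdahl's Law concrete: if a fraction f of a program's serial run time can be parallelized perfectly and the rest stays serial, the speedup on p processors is at most 1 / ((1 - f) + f/p). The small C function below is a sketch of that formula only; the name `amdahl_speedup` and its parameters are illustrative assumptions, and real programs also pay communication and load-imbalance costs that the formula ignores.

```c
/* Illustrative sketch (hypothetical helper name):
 * Amdahl's Law upper bound on speedup with p processors, where
 * f (0 <= f <= 1) is the fraction of serial run time that parallelizes
 * perfectly and 1 - f is the fraction that remains serial. */
double amdahl_speedup(double f, int p)
{
    return 1.0 / ((1.0 - f) + f / (double)p);
}
```

For instance, with f = 0.95 the bound approaches 1 / (1 - f) = 20 as p grows, so even a machine with millions of cores cannot give more than a 20-fold speedup on such a program.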
Examples and programs will be numeric, such as matrix multiplication,
and nonnumeric, such as sorting or discrete event simulation,
but they do not assume any deep knowledge
of any field, and you'll be given all the information
you need to understand the problem.
The focus of the class is on techniques and analyses
that cut across applications.
You'll also learn something about high performance computing in general.
It makes no sense to spend a lot of money on parallel computers if
you could get just as much performance on a serial computer had you
only tuned your code. For example, intelligent use of cache is critical and
can give an order of magnitude improvement in some cases.
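One common example of cache-aware tuning, sketched here in C, is reordering the loops of a matrix multiplication. This is an illustration rather than course code, and it assumes square matrices stored in row-major order; the function name `matmul_ikj` is hypothetical. In the textbook i-j-k order the innermost loop strides down a column of B, touching a new cache line on almost every step for large n, whereas the i-k-j order below walks rows of B and C contiguously.

```c
#include <stddef.h>

/* Illustrative sketch (hypothetical helper name):
 * C = A * B for n-by-n matrices stored in row-major order.
 * The i-k-j loop order keeps the innermost loop on contiguous memory
 * in both B and C. */
void matmul_ikj(const double *A, const double *B, double *C, size_t n)
{
    /* Clear the output before accumulating into it. */
    for (size_t i = 0; i < n * n; i++) {
        C[i] = 0.0;
    }
    for (size_t i = 0; i < n; i++) {
        for (size_t k = 0; k < n; k++) {
            const double a = A[i * n + k];  /* reused across the whole j loop */
            for (size_t j = 0; j < n; j++) {
                C[i * n + j] += a * B[k * n + j];
            }
        }
    }
}
```

How much this helps depends on the matrix size, the cache hierarchy, and the compiler, so it is worth timing both loop orders on the machine you actually use. Blocking (tiling) the loops is a further refinement along the same lines.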
Here is a somewhat whimsical overview of
The grade is based on computer programming projects, a few written
homeworks, and a final project of your choosing (though I must
approve it). The final project is about 1/2 of the grade.
Often students can integrate this project with
their other work, so, for example, it may be part of their thesis.
Many students have used this project to start a new research area.
Several of the projects have led to publications, and a few have resulted in
awards (Best Thesis, Gordon Bell, finalist for Best Paper at the
Supercomputing conference, etc.).
For example, students in nuclear engineering might parallelize
a serial code that simulates
a reactor; students involved in data analysis may develop new
algorithms for complex pattern matching;
students interested in operating systems might develop a technique for
measuring and improving performance;
students interested in theoretical computer science might develop
a new algorithm for a graph problem, etc.
Almost all projects involve developing a parallel program, but in some situations
a project may be more theoretical, such as developing an algorithm for an
abstract model of parallelism.
If you don't have any ideas for
a project I'll help you find some.
Programming ability in C, C++, or Fortran, plus
willingness to rethink how problems should be solved.
You do not need any prior experience with
parallel computing or supercomputing.
As noted above, it is "application neutral", so you don't need a strong background in all of computer science,
nor do you need to know numerical analysis.
If you don't have much programming experience then you should consider
taking EECS 402, Computer Programming for Scientists and Engineers,
offered every Winter semester.
None, but we'll use some on-line
computer manuals and tutorials, as well as some papers and small
sections of texts.
(You could always buy my
book, but it is
not very relevant to the course since it is focused on abstract
models of parallelism and algorithms. None of the algorithms in it are
appropriate for the computers we'll be using.)
Some parallel computing
available on the Web.
- Start thinking in parallel.
Copyright © 2000-2015 Quentin F. Stout