Algorithms

An algorithm is a step-by-step procedure for solving a problem. A flow chart is a visual representation of an algorithm and a computer program is an implementation of an algorithm. Algorithms are the main focus of the software part of computer science and engineering. People get their PhD's for discovering new algorithms - clever, fast, efficient new ways of doing things. For example, there are lots of ways to sort lists of thing - lots of sorting algorithms. Some are easy to write, some are really fast most of the time, some are guarenteed to be good enough all the time, etc. A program designer chooses the algorithm that best fits the design requirements. Most people get their PhD in Computer Science for inventing an algorithm and proving that it is better than other ones already invented. Even for things as simple as sorting, there are lots of algorithms with widely varying perfomance. Check out the relative performance of some sorting algorithms here.

Analysis of Algorithms

For most programs you will ever have to write, there are going to be many different ways of doing them. Even with these small, simple programs, many of your questions are answered by responding "that is a design decision". In otherwords, different people would do it different ways. Some design decisions don't make much difference in how a program runs or how much of the computer's resources it uses. For example, whether you call a variable xvar or fred doesn't matter at all. Or whether you write a prompt that says "Please enter an integer: " or "Gimme some value, stupid!" won't affect the time it takes for the program to execute its statements or how much space it takes up in memory, though it may effect how people feel about using it! Here we are going to talk about how your code effects computer resources, especially CPU time.

How fast a program runs depends on lots of things. First of all, it depends on what computer you are running it on. If you are executing a program on a Turing Machine, you might have to wait years for something that would execute in a few seconds on a Cray. If you run several different programs that do the same thing on the same machine (e.g., if we ran everyone's code for program 4 on the same computer), they will take different amounts of time to execute. If one program is written to take full advantage of the fastest opcodes for a given CPU, it will run faster than a program that isn't. This is called system efficiency. System efficiency can be effected by which complier you use to compile your code and what kinds of optimization they do.

Another factor used to evaluate algorithms is how much computer memory they use. This is called space efficiency. If a person writes code that uses a lot of variables, it will take up more room than one that needs to use fewer. (Remember, every variable and parameter you declare takes up a slot in the activation record.) A program that manipulates images or sound files will take up a of memory. Image processing programs often have to store several sets of temporary image sized data. If one algorithm needs only 1 or 2, then it is more space efficient than one that needs 5 or 6.

The most common measure used to compare algorithms is computational efficiency. This measure has been designed to be independant of the size of the computer memory, the speed of the CPU, the efficiency of the compiler, or the optimization level used. It bases the evaluation of the algorithm on how hard it makes the computer work - how many comparisons it does, how many loops it has and how often the loops go around, and the number of assignment statements. Essentially it measures the amount of handling each item of data requires. Algorithms are evaluated on how they would perform in the best case situation, the worst case situation, and the average case situation.

Think of the Sorted Linked List ADT from the last lecture note. If the user typed the numbers in order with the largest one first, the program would be really fast, since it would just compare the input with the first value on the list and know it needed to insert it at the top. If the user typed in the numbers from 100 to 1, it would take as long to insert the last one as it did to insert the first one! This is the best case for the algorithm implemented by the Sorted Linked List ADT. On the other hand, if the user entered the small numbers first, the program would have to search every item on the list till it got to the end, before it could insert it. The 1 would go in fast, but it would take a long time to find the place to insert the 100, since it would have to check all the other 99 values, instead of just 1! This is the worst case. On average, you should only have to search halfway through the list to find the place to insert. This idea of computational efficiency is formalized into what is known as Big-O Notation.

Big-O Notation

The time it takes for a computer program to execute depends on how much data it has to input and how many comparisons it does with the data. Your program 4 runs fast on small files with few () and {} and takes a long tome on large files with lots of () and {}. Computational efficiency estimates how the time it takes to run the program varies with the size of the input. Essentially it is an estimate of the rate of increase in the time it takes to run a program as its input gets bigger. It is based on the structure of the algorithm (the program) itself and is independant of what machine it is run on. For example, some programs take about twice as long to run if you give them twice as much data. Other programs that are less efficient may take about 4 times as long to run if they are given twice as much data.

Computer scientists needed a notation so they could compare the computational efficiencies of the algorithms they used without having to write a whole paragraph. So they chose the function notation from mathematics. If the value of a function depends on a parameter x, you write that as f(x). The formal definition of computational complexity is:


A function f(n) is O(g(n)) if there is a constant K and a count N
such that f(n) <= K * g(n), for n >= N.

n is the number of pieces of data the program inputs and processes. You count up the number of operations on each data item in the program and make the function f(n) out of the count. For example, in program 4 you read in each character, compare it to 4 different characters (is it a '('? is it a ')'? etc.), maybe push it on the stack, maybe pop something and compare it, and maybe increment a counter. So the number of comparisons at the beginning is 4n (each of the n input characters gets compared 4 times, roughly) and maybe 1/10 of them gets pushed and maybe another 1/10 gets popped and compared. So, for program 4 f(n) = 4n (for the 4 comparisons of each input) + 1/10n (for the ones that get pushed) + 1/10n (for the popping) + 1/10n (for the comparing after it is popped) + (maybe) 1/100n (for incrementing the error counter).

f(n) = 4n + 1/10n + 1/10n + 1/10n + 1/100n
     = 4n + 3/10n + 1/100n

(Notice that a lot of this is made up - I don't know for sure if 1 out of every 10 characters is a '(' or a '{', so I guessed! If the file was a text file instead of a C program, I would be way off! big-O notation is a rough approximation in most cases, although there are some algorithms for which exact computations can be made for worst case or best case analysis.)

Well, this equation for f(n) is kind of complicated and this is a really simple program! (Imagine computing f(n) for MicroSoft Word!) Computer scientists are only interested in things like this for really big values of n. Computers are so fast these days that it really doesn't matter for small amounts of input. So lets think about a really big number of inputs - N. If there are millions of characters being entered, 1/100 N is not going to add much to the time it takes, and neither will 3/10n. So we can forget them. So now we can think of f(n) of being about equal to 4n.

This is just an approximation, but it gives us a way of talking in simple terms about complicated algorithms. Here are some simple examples using big O.

Consider a program that prints out your name. It takes the same amount of time to execute, no matter what. If a program takes a constant number of inputs and has no loops its computational complexity is constant. In big-O notation, you write O(1).

Next consider a program that reads in characters from a file and counts them. It will take it twice as long to count the characters in a file that was twice as big. If a program's runtime grows linearly as the amount of input data increases, it is O(n), where n is the number of data items.