How to make sure that microchips
work as intended

Valeria Bertacco
Stanford University, 2002

Ph.D. Thesis - Elevator talk

Microchips nowadays have to take care of many complex things: running your car, switching the lights downtown, navigating communication satellites and maybe even pacing your heart. Unfortunately, they rarely work right the first time they are designed. My work is to develop new software that helps engineers automatically verify that complex microchips are designed correctly the first time around and work as intended. Verification -- the activity of checking that designs are correct -- involves making sure that there are no "hidden bugs", which could have rare but catastrophic consequences such as, for instance, having two trains occasionally run into each other.

My research ensures that what engineers mean to design corresponds to the chip's actual behavior. Because of the complexity of these designs, microchips often have additional unexpected behavior that could be dangerous, or they behave properly most of the time but not always. One famous and also one of the first examples of this type of situation occurred in 1994, when it was found that the Intel Pentium microprocessor could make errors in performing divisions. Even though this error was very rare, the business and perceived implications forced Intel to replace all the Pentium processors in the market and suffer huge financial costs.

The microelectronic industry has been using, up until now, logic simulation to perform verification. This technique tries to simulate the behavior of the design for as many different conditions as time and resources permit. A logic simulator uses a description of the microchip behavior, which is usually already available as part of the design process, takes some inputs provided by the engineers, and computes the output that the microchip would generate with those specific inputs. This output is then compared to the expected result. If the simulated and expected results match, the circuit is operating correctly at least in that specific case; otherwise there is a possible bug in the circuit. For instance, if the microchip was for a hand calculator, one possible logic simulation would be to provide the input 3 times 5 to the logic simulator and check that the output is actually 15. However, because of the complexity of these designs, too many different inputs values need to be verified, and it often becomes impossible to make sure that everything works correctly, even using "verification farms", thousands of computers running logic simulations day and night for entire weeks.

Because of this, in recent years, researchers have come up with alternative types of simulators that can check many inputs -- in fact all possible inputs -- at the same time. For the simple hand calculator example, these new methods can check that the multiplication of two values a and b always produces the value a times b, no matter the specific values, all in a single simulation. This new family of simulators that can check many inputs at the same time is called symbolic simulators. Searching for a design error using a symbolic simulator would be like searching for a shoe that you lost under your bed; only that, instead of just blindly moving your hand around under the bed, like a logic simulator would do, now you can move away the bed altogether, so that you can see the whole area under it with a single glance. In technical terms, this is called breadth-first search.

These new methods however have some limitations. The problem is that they can only work with simple designs. In fact, any symbolic simulator needs to manipulate a lot of algebraic expressions, much more complex than the "a times b" of the hand calculator example, and this often requires more resources than are available. My research has developed new techniques to improve on the capabilities of these symbolic simulators so that they need less computing power and in particular less memory. My key contribution relies on a compact representation of the algebraic expression so that we can exploit the specific structure of the design we are simulating. The result is that these new techniques can now verify complex designs as logic simulators do while at the same time maintaining the breadth-first approach of the new symbolic methods.

The increased power of symbolic simulation has the potential of becoming a valid and robust alternative to the current verification practices of the microelectronic industry. In a broader context, this will lead to the ability of developing better and faster digital circuits in less time and with lower costs. So, next time you are stuck in traffic at rush hour, be confident that the lights will eventually let you get out of there.

This statement has been developed as part of the I-RITE Stanford Research Communication Program in Fall 2002.