# EECS 583 – Winter 2023 – Midterm Exam

Wednesday, March 15, 2023 Time constraint: 1 hr 20 min Open book, open notes

| Name:                                                           |                                                                                                                       |
|-----------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------|
| Please sign indicating that you have<br>University of Michigan. | e upheld the Engineering Honor Code at the                                                                            |
| "I have neither given nor received aid on this                  | s examination."                                                                                                       |
| Signature:                                                      |                                                                                                                       |
| *                                                               | sections. The point value for each question is show your work unless the answer is obvious. If you f the exam sheets. |
| Part I: Short Answer 7 questions, 40 pts total                  | Score:                                                                                                                |
| Part II: Long Answer 4 questions, 60 pts total                  | Score:                                                                                                                |
| Total (100 possible):                                           |                                                                                                                       |

# Part I. Short Answer (Questions 1-7) (40 pts)

1) In register allocation, live ranges are constructed by taking the intersection of 2 dataflow analyses. Name the 2 dataflow analyses used. (5 pts)

2) In liveness analysis, how would the analysis results be changed if the meet function was modified from using union of live variables (IN sets) of the successor blocks to *intersection* over the same sets of variables? Briefly explain your answer. (5 pts)

3) Calculate the *liveness IN set* for the following basic block, where the OUT set is calculated as {r3,r4}. (5 pts)

$$OUT = \{r3, r4\}$$

4) We know from class that common subexpression analysis (CSE) creates opportunities to apply copy propagation because it introduces moves. Is the reverse true, where copy propagation creates opportunities for CSE? If yes, show an example with at most 3 instructions. If not, explain why. (5 pts).

5) Using profile data is generally viewed as a way to improve the effectiveness of compilers. Are there any *potential pitfalls* (e.g., making application performance worse) using profile data? Briefly explain why or why not. (5 pts)

6) Calculate the *critical path length* of the following basic block. Assume the following instruction latencies: load = 2, add = 1, mpy = 3. (5 pts)

```
1. r1 = load(r6)
```

2. 
$$r2 = add(r7,3)$$

3. 
$$r3 = mpy(r1,r2)$$

5. 
$$r5 = add(r3,r4)$$

<sup>4.</sup> r4 = load(r8)

7) Given the following *scheduled* superblock with associated liveness information for each exit point, determine the basic blocks that instructions 1, 2, and 3 could have originated from. Note that instructions may have originated from multiple basic blocks. You should make **no assumptions** about the addresses used by the load and store instructions. Stores are formatted as store (address, data). (10 pts)



Circle the basic blocks that the instruction could have originated from:

| Instruction 1: | BB1 | BB2 | BB3 |
|----------------|-----|-----|-----|
| Instruction 2: | BB1 | BB2 | BB3 |
| Instruction 3: | BB1 | BB2 | BB3 |

## Part II. Longer Problems (Questions 8-11) (60 pts)

- **8)** For the following code:
  - a. Draw the control flow graph (CFG) (6 points)
  - b. Compute the CD (Control Dependence) sets for each Basic Block (BB) (6 points).
  - c. Using the CDs, determine the minimum number of predicates required to if-convert the code. (3 points)

Hint: The CFG should contain 7 basic blocks, not including pseudo entry/exit basic blocks. Some basic blocks may be empty.

```
do {
    if (a > 10) {
        if (b < 10) {
            c += 10;
        } else {
            b -= 1;
        }
        a -= 1;
    }
    else {
        b += 1;
        a -= 2;
    }
} while (a > 0);
```

## **Control Flow Graph**

| ВВ | CD set |
|----|--------|
| 1  |        |
| 2  |        |
| 3  |        |
| 4  |        |
| 5  |        |
| 6  |        |
| 7  |        |

The minimum number of predicates required to if-convert the code is \_\_\_\_\_\_.

- 9) Satisfy static single assignment (SSA) form by filling in the blanks in the code segment below. (15 points)
  - The result and arguments of a phi node must be different instances of the same variable (e.g. y1 = phi(y2, y3)).
  - Some phi nodes may be unnecessary and should be left empty.
  - Choose operands between a0 and a8, and b0 and b8. Repetition is allowed.



- **10)** The following profile augmented CFG shows the code after optimizing it using LICM. Four instructions (I1 to I4) were hoisted, **one each** from BB2, BB3, BB4, and BB5. You can assume that none of the instructions cause exceptions. The number nearest the branch arrow indicates the number of times that branch was taken.
  - a. Find a mapping of instructions to basic blocks such that each instruction (I1, I2, I3, I4) is mapped to a **different** basic block (BB2, BB3, BB4, BB5) such that the instruction could have been hoisted from that basic block. (10 points)
  - b. Calculate the savings, in terms of the number of **dynamic** instructions reduced, by this optimization. (5 points)



| Instruction          | I1 | 12 | 13 | 14 |
|----------------------|----|----|----|----|
| Original Basic Block |    |    |    |    |

- 11) Given below is a loop dependence graph and a processor model. (M), (A), and (B) refer to memory, ALU, and branch instructions respectively. The memory instructions use the memory units and the ALU and branch instructions use the ALU units.
  - a. Determine the MII. Show your working. (5 points)
  - b. Generate the rolled and unrolled schedules. Lower instruction numbers will have a higher priority, i.e. instruction 1 has the highest priority. (10 points)

| Processor Model |
|-----------------|
|-----------------|

- 4 fully pipelined function units
- 2 ALU units
- 2 MEM units

### Unrolled Schedule (may contain extra rows)

|   | ALU1 | ALU2 | MEM1 | MEM2 |
|---|------|------|------|------|
| 0 |      |      |      |      |
| 1 |      |      |      |      |
| 2 |      |      |      |      |
| 3 |      |      |      |      |
| 4 |      |      |      |      |
| 5 |      |      |      |      |
| 6 |      |      |      |      |
| 7 |      |      |      |      |
| 8 |      |      |      |      |

| (2(M)) (4(M))                                         |
|-------------------------------------------------------|
| (2,0)<br>(3(A) (2,0)                                  |
| (1,0)                                                 |
| (1,1) $(1,0)$ $(1,1)$ $(1,0)$ $(1,1)$ $(1,0)$ $(1,1)$ |
| 9(B)                                                  |

#### Rolled Schedule (may contain extra rows)

|   | ALU1 | ALU2 | MEM1 | MEM2 |
|---|------|------|------|------|
| 0 |      |      |      |      |
| 1 |      |      |      |      |
| 2 |      |      |      |      |
| 3 |      |      |      |      |