HW3 out
PA1 due Friday!

Review
We have seen how to express statements precisely, how to prove them,
and even how to analyze algorithms and prove interesting facts about
them. We are now done with the first unit of the course and turn our
attention to modular arithmetic and its applications.

Overview
What do DVDs and clocks have in common?
- They're both round.
- They both use modular arithmetic!

Suppose I want to talk to someone in Cairo, which is nine hours
ahead of Pacific time. It would probably be a good idea to figure
out what time it is in Cairo first to make sure that he isn't
asleep. If it's 5pm right now, what time is it in Cairo? We add 9 to
5 to get 14, and then subtract 12 to get 2. So it's 2am there, and I
should probably wait until tomorrow to make the call.

We use mod 12 arithmetic to calculate time. We will see how such
arithmetic works in general and look at many applications.

Standard Arithmetic
Suppose we restrict ourselves to natural numbers. What operations
are possible on the naturals?

Given two natural numbers, we can add them and end up with a
natural.

We can also multiply them and end up with a natural.

What about subtraction? If we subtract 5 from 4, we end up with -1,
which is unnatural. So in order to allow subtraction, we need to
expand our set of elements to all integers.

OK, so now we can subtract. What about division? Well, 1 / 2 is
clearly not an integer, so we need to expand our set to the
rationals.

We can continue adding operations, such as taking roots and limits,
until we end up with the reals.

On a computer, however, all of these sets are inconvenient, since
they can't be represented using a finite number of bits. We try to
get around this using floats/doubles, but even they have maximum
values, and we have to worry about roundoff, etc. So rather than
trying to represent infinite sets, let's restrict ourselves to
finite sets like we do when talking about the time.

Modular Arithmetic
To represent the time, we use mod 12 arithmetic. Once we get past
12, we wrap back around to 1. For our purposes, however, since we
are computer scientists, we like to start at 0. So instead, we will
start with 0 and wrap back around to it once we pass 11. If we have
a number much bigger than 11, then we just repeatedly subtract 12
until we get between 0 and 11.
Ex: What is mod(22, 12)? 10.
What is mod(146, 12)? 2.

For non-negative x, m, we define
mod(x, m) = x - floor(x/m) * m.
This tells us how many multiples of m to subtract, namely
floor(x/m).

What if we have numbers less than 0? Then we just add a multiple of
12 until we get between 0 and 11.
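As a sketch, the definition translates directly into Python, where // is floor division, so the same formula handles negative x as well:

```python
def mod(x, m):
    """mod(x, m) = x - floor(x/m) * m, for m > 0."""
    # Python's // is floor division, so this matches the
    # definition above even when x is negative.
    return x - (x // m) * m

print(mod(22, 12))   # -> 10
print(mod(146, 12))  # -> 2
print(mod(-5, 12))   # -> 7
```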

We can now define equivalence classes that consist of all integers
that are the same mod 12.
Ex: -12, 0, 12, 24, 36 are equivalent
-11, 1, 13, 25, 37 are equivalent

We use x ≡ y (mod m) to denote the fact that x and y are in the same
equivalence class mod m.
Ex: 2 ≡ 14 (mod 12)
3 ≡ 147 (mod 12)
What does this actually mean? It means that 2 and 14 differ by a
multiple of 12. More formally, (2 - 14) = 12k for some integer k.
In general, if a ≡ b (mod m), then (a - b) = mk for some integer k,
or a = b + mk.
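This definition is easy to check mechanically; here is a Python sketch (`congruent` is just an illustrative helper name):

```python
def congruent(a, b, m):
    """a ≡ b (mod m) exactly when a - b is a multiple of m."""
    return (a - b) % m == 0

print(congruent(2, 14, 12))   # -> True
print(congruent(3, 147, 12))  # -> True
print(congruent(2, 15, 12))   # -> False
```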

We can add, multiply, and subtract mod m: we do so as if we were
dealing with integers, and then convert to the right equivalence
class mod m.
Ex: 2 * 32 ≡ 64 ≡ 4 (mod 12)

Let's write out a few addition and multiplication tables.

+ 0 1 2   * 0 1 2   + 0 1 2 3   * 0 1 2 3
0 0 1 2   0 0 0 0   0 0 1 2 3   0 0 0 0 0
1 1 2 0   1 0 1 2   1 1 2 3 0   1 0 1 2 3
2 2 0 1   2 0 2 1   2 2 3 0 1   2 0 2 0 2
                    3 3 0 1 2   3 0 3 2 1

They're just what you would expect, with the numbers reduced mod m.
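The tables can be generated mechanically; here is a Python sketch (the function names are just for illustration):

```python
def add_table(m):
    """The m-by-m addition table mod m."""
    return [[(a + b) % m for b in range(m)] for a in range(m)]

def mul_table(m):
    """The m-by-m multiplication table mod m."""
    return [[(a * b) % m for b in range(m)] for a in range(m)]

# the row for a = 2 in the mod-4 multiplication table:
print(mul_table(4)[2])  # -> [0, 2, 0, 2]
```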

For standard arithmetic, we have a number of useful identities. For
example, we know that if a = c and b = d, then a + b = c + d. Does
the same identity hold in modular arithmetic?
If a ≡ c (mod m), then (a-c) = km for some k. Similarly, (b-d) =
lm for some l. Then (a+b) - (c+d) = (a-c) + (b-d) = (k+l)m, so
(a+b) ≡ (c+d) (mod m). The identity still holds.
Similarly, we have if a ≡ c (mod m) and b ≡ d (mod m), then ab ≡ cd
(mod m).

This makes computing modular arithmetic expressions much easier. We
can reduce operands before performing an operation so that we work
with smaller numbers.
Ex: (13+11) * 18
≡ (6+4) * 4 (mod 7)
≡ 10 * 4 (mod 7)
≡ 3 * 4 (mod 7)
≡ 12 (mod 7)
≡ 5 (mod 7)
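The same computation in Python, reducing either only at the end or operand by operand as we go, gives matching answers:

```python
m = 7
# reduce only at the end
full = ((13 + 11) * 18) % m
# reduce each operand before every operation
reduced = (((13 % m) + (11 % m)) % m) * (18 % m) % m
print(full, reduced)  # -> 5 5
```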

Now that we have addition, multiplication, and subtraction (which is
just addition of the additive inverse), what about division? In
standard arithmetic, we had to introduce rationals to be able to
express the result of 1/5, but we'd like to stick to our nice,
finite set this time.

We can actually reduce division to multiplying by a reciprocal, or
multiplicative inverse. Thus, to divide by 5, we instead multiply by
its inverse 1/5. Any number x, when multiplied by its inverse 1/x,
results in 1. Thus, when it comes to modular arithmetic, the inverse
of a number x (mod m) should give us 1 (mod m) when multiplied by x.

Let's look at our mod 3 multiplication table. Does every number have
an inverse? We see that 1 * 1 ≡ 1 (mod 3), so 1 is its own inverse.
Similarly with 2. What about 0?

What about mod 4? We see that 1 and 3 have inverses, but 0 and 2
don't.
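A brute-force search makes it easy to tabulate which numbers have inverses for small moduli (a sketch; `inverse` is a hypothetical helper):

```python
def inverse(x, m):
    """Search for an a with a*x ≡ 1 (mod m); None if there isn't one."""
    for a in range(m):
        if a * x % m == 1:
            return a
    return None

# mod 4: 1 and 3 are their own inverses; 0 and 2 have none
print([inverse(x, 4) for x in range(4)])  # -> [None, 1, None, 3]
```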

So some numbers have an inverse mod m, and some don't. Can we come
up with a general rule for when an inverse exists?

Does 3 have an inverse mod 12? We would require a value a such that
3a ≡ 1 (mod 12)
3a = 12k + 1 for some integer k
But this is impossible. No matter what k is, 12k will be a multiple
of 3, so 12k+1 cannot be a multiple of 3, while 3a always is.

In general, if we want the inverse of x mod m, we need
ax ≡ 1 (mod m)
ax = km + 1 for some integer k
If x and m share any prime factor p, then ax and km are both
multiples of p, so we have
(multiple of p) = (multiple of p) + 1,
which is impossible! Thus, if gcd(x, m) ≠ 1, then there is no
inverse of x mod m.

What if we are working mod some prime, say 5? Does every non-zero
number now have an inverse?
We don't know yet! The statement we proved is equivalent to
x has inverse mod m => gcd(x, m) = 1 (contrapositive)
To conclude that
gcd(x, m) = 1 => x has inverse mod m
is a converse error! It turns out that it is true, but we have
to prove it.

It seems hard to prove. So once again, we are desperate, and what
do we do when we are desperate? Prove something harder!

Claim: If gcd(x, m) = 1, then the values a*x where a = 0, 1, ...,
m-1 are all distinct modulo m.
Note that if this is true, then the m values a*x land in m distinct
equivalence classes, so they hit every class mod m. Thus for any b
there is some a such that a*x ≡ b (mod m). In particular, there must
be some a such that a*x ≡ 1 (mod m), and that a is the inverse of x
mod m.
Proof: Suppose for the purpose of a contradiction that there are
a1, a2 such that a1 ≢ a2 (mod m) and a1 x ≡ a2 x (mod m). Then
we have (a1 - a2) x = km for some integer k. The RHS is a
multiple of m, so the LHS must be a multiple of m. Since gcd(x,
m) = 1, it must be that (a1-a2) is a multiple of m, i.e. a1 - a2
= lm for some integer l. This implies that a1 ≡ a2 (mod m),
a contradiction.
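We can sanity-check the claim for small moduli with a quick Python loop:

```python
from math import gcd

# For every x with gcd(x, m) = 1, the values a*x mod m for
# a = 0, ..., m-1 should cover every equivalence class exactly once.
for m in range(1, 30):
    for x in range(1, m):
        if gcd(x, m) == 1:
            assert {a * x % m for a in range(m)} == set(range(m))
print("claim holds for all m < 30")
```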

So we see that gcd(x, m) = 1 <=> x has inverse mod m. Thus, GCD is
important, and we now turn our attention to computing it.

GCD
How can we compute gcd(x, y)? The simplest way is to just try every
number 1, 2, ..., min(x,y) and find the largest one that divides
both x and y. But this is really slow. As we will see in RSA, we
need to compute GCDs of large numbers, say 128 bit. So if x and y
are around 2^128, we would need to test around 2^128 possible
divisors. If we do that, the answer won't matter anymore, because
we'll all be dead before we finish. So we want something much
faster, say on the order of 128 or 128^3 or something reasonable
like that.

Euclid knew that he wouldn't live forever, so he came up with an
algorithm that wouldn't take forever. It relies on the following fact:
gcd(x, y) = gcd(x - y, y) [assume x >= y >= 0]
Proof: If d divides x, y, then x = kd, y = ld, so d divides x - y
= (k-l)d. If d divides (x-y) and y, then x-y = kd, y = ld, and x
= (k+l)d, so d divides x as well.

So we know that gcd(568, 132) = gcd(436, 132), since 568-132=436.

In fact, if we apply the above fact many times, we can show that
gcd(x, y) = gcd(mod(x, y), y) [x >= y >= 0].

So using the above fact, here is Euclid's algorithm:
gcd(x, y):
    if y = 0 then:
        return x
    else:
        return gcd(y, mod(x, y))
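Translated into Python (a direct sketch of the pseudocode, using % for mod):

```python
def gcd(x, y):
    """Euclid's algorithm; assumes x >= y >= 0."""
    if y == 0:
        return x
    return gcd(y, x % y)

print(gcd(568, 132))  # -> 4
```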

Let's try to prove that the algorithm is correct for all x,y >= 0.
We have two variables here, and we've only seen induction over a
single variable n to prove ∀n∈N . P(n). So what variable
do we do induction over?

In general, determining the induction variable can be tricky when
we have multiple variables. If we make the wrong choice, it makes
our job a lot harder in the proof.

Here, let's do induction over y. So we define
P(y) = ∀x∈N . x >= y => algorithm computes gcd(x, y),
and we want to prove ∀y∈N . P(y).

Proof by strong induction:
Base case: y = 0
Algorithm returns x, which is correct since x|x and x|0.
IH: Assume P(k) for all 0 <= k <= y.
IS: We need to prove P(y+1). We know by our lemma above that
gcd(x, y+1) = gcd(mod(x, y+1), y+1), which is the same as
gcd(y+1, mod(x, y+1)). This is what the algorithm returns, and
since mod(x, y+1) < y+1, by the IH, it computes it correctly.

Here's an example of running the algorithm on 568, 132
gcd(568, 132)
gcd(132, 40)
gcd(40, 12)
gcd(12, 4)
gcd(4, 0)
4

Notice that the numbers get quite a bit smaller in each
iteration. In fact, after two iterations, we can prove that the
first argument x goes down by at least a factor of 2. Thus, the
number of iterations is logarithmic in x, i.e. linear in the number
of bits in x. The total running time is actually O(n^3), where n is
the number of bits in x, since each iteration actually takes O(n^2)
time.
Proof by cases:
Case 1: x/2 >= y
Then the first argument in the next iteration is y <= x/2.
Case 2: x >= y > x/2
Then the arguments in the next iteration are (y, mod(x, y)), and
then in the iteration after that (mod(x, y), mod(y, mod(x, y))).
So the first argument is mod(x, y) after two iterations. But
mod(x, y) <= x - y < x - x/2 = x/2 since y > x/2.
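We can watch the first argument shrink with a quick Python sketch (`euclid_trace` is just an illustrative helper):

```python
def euclid_trace(x, y):
    """The sequence of first arguments Euclid's algorithm sees."""
    firsts = [x]
    while y != 0:
        x, y = y, x % y
        firsts.append(x)
    return firsts

t = euclid_trace(568, 132)
print(t)  # -> [568, 132, 40, 12, 4]
# the first argument at least halves every two iterations:
print(all(t[i + 2] <= t[i] // 2 for i in range(len(t) - 2)))  # -> True
```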

So now, we can compute gcd(x, y), so we can tell if x has an inverse
mod y. But how do we determine what the inverse actually is? We will
see next time.