HW8 due Wednesday
Final exam Thursday 5-8pm in 10 Evans
No regrades for HW8 (not enough time) or final exam (UCB policy)
Review session tomorrow 3-5pm in 306 Soda

Countability
We have now completed the fourth unit of this course, continuous
probability, and now move on to the last unit, diagonalization and
self-reference.

We have seen in probability the difference between discrete and
continuous random variables. The former can only take on values in a
discrete, but possibly infinite, subset of R, while the latter can
take on values in a continuous subset of R. In the discrete case, we
could talk about the event that a random variable takes on a
particular value, and we could use summations to compute
expectations and variances. In the continuous case, we had to resort
to looking at intervals of values that the random variable takes on,
and we had to use integrals to compute expectations and variances.
Why were these two cases different?

In both cases, a random variable could take on infinitely many
different values. However, they appear to be different types of
infinity, and we will now see exactly how they are different.

We follow the reasoning of Cantor, starting with a principle that we
have seen in counting:
(4) Isomorphism Principle
If two sets S1 and S2 can be put into a bijective
correspondence, then |S1| = |S2|.
Here, |S| denotes the "cardinality" of S, the term we use for the
size of a (possibly infinite) set. To understand what the above
principle means, let's first define a bijection.

Let f: A -> B be a function from set A to set B, i.e. with domain A
and codomain B. Then f is "one-to-one" or "injective" if f(x) is
different for different values of x. More formally, either of the
following equivalent statements holds:
∀x,y∈A . x ≠ y => f(x) ≠ f(y)
∀x,y∈A . f(x) = f(y) => x = y.
(The above two statements are contrapositives of each other.)
And f is "onto" or "surjective" if f takes on all possible values
in B:
∀y∈B ∃x∈A . f(x) = y.
Finally, f is "bijective" if it is one-to-one and onto.

Then two sets S1 and S2 have a bijective correspondence if there is
a bijective function f : S1 -> S2.
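
For finite sets, these definitions can be checked directly. The
following is a minimal Python sketch, not part of the original notes;
the function names are made up for illustration, and it assumes the
domain is given as a list of distinct elements.

```python
def is_injective(f, domain):
    """One-to-one: distinct inputs give distinct outputs."""
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)


def is_surjective(f, domain, codomain):
    """Onto: every element of the codomain is hit by some input."""
    return set(codomain) <= {f(x) for x in domain}


def is_bijective(f, domain, codomain):
    """Bijective: both one-to-one and onto."""
    return is_injective(f, domain) and is_surjective(f, domain, codomain)


A = [0, 1, 2]
B = ["a", "b", "c"]
print(is_bijective(lambda x: "abc"[x], A, B))  # True
print(is_bijective(lambda x: "aab"[x], A, B))  # False: "a" is hit twice, "c" never
```

This kind of exhaustive check only works for finite sets; for the
infinite sets below, we have to argue about the function instead.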

As an example, we saw that the power set P(S) (set of all subsets)
of a set S = {a1, a2, ..., an} has a bijective correspondence with
the set of n-bit strings. The following function provided this
bijection:
f(x) = (g(x,a1), g(x,a2), ..., g(x,an)),
where
g(x,a) = { 1  if a∈x
         { 0  otherwise.
As an example, let S = {1, 2, 3, 4, 5}, x = {1, 3, 4}. Then
f(x) = (1, 0, 1, 1, 0).
Since P(S) has a bijective correspondence with the set of n-bit
strings, |P(S)| = |{0,1}^n| = 2^n.
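
As a sanity check, the function f above can be sketched in Python.
The names subset_to_bits and all_subsets are illustrative only, and
S is assumed to be given as a list so that its elements have a fixed
order a1, ..., an.

```python
from itertools import combinations


def subset_to_bits(x, elements):
    """f(x): one bit per element of S, 1 if that element is in x."""
    return tuple(1 if a in x else 0 for a in elements)


def all_subsets(elements):
    """Generate every subset of the given list of elements."""
    for k in range(len(elements) + 1):
        for combo in combinations(elements, k):
            yield frozenset(combo)


S = [1, 2, 3, 4, 5]
print(subset_to_bits({1, 3, 4}, S))  # (1, 0, 1, 1, 0)

# Every subset maps to a distinct 5-bit tuple, and all 2^5 tuples occur.
images = {subset_to_bits(x, S) for x in all_subsets(S)}
print(len(images))  # 32
```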

The principle we use, then, is that two sets S and T have the same
cardinality if there is a bijective function between S and T.

We say that a set S is "countable" if there is a bijection between S
and some subset of N. This implies that either S has a finite
cardinality (if the subset of N is finite) or that S is "countably
infinite." In the latter case, we will see that S and N have the
same cardinality.

As an example, consider the set of positive integers Z^+. It seems
that Z^+ and N should not have the same cardinality, since Z^+ is a
strict subset of N. However, the following function demonstrates a
bijection between N and Z^+:
f:N->Z^+  f(n) = n + 1.
This function is one-to-one, since f(n) is different for different
n. It is onto, since every m∈Z^+ is the output for the input
n = m - 1, which is in N.
Thus, it is a bijection, so N and Z^+ have the same cardinality and
Z^+ is countable. Strange, but true, and this is what happens when
we deal with infinities.

What about the set of even natural numbers E = {0, 2, 4, ...}? The
following is a bijection between N and E:
f:N->E  f(n) = 2n.
So E is also countable, and N and E have the same cardinality, even
though we'd expect the cardinality of N to be twice that of E!

What about the integers Z? We can define the following bijection
between N and Z:
f:N->Z  f(n) = { n/2       if n even
               { -(n+1)/2  if n odd.
So we get
n    f(n)
0      0
1     -1
2      1
3     -2
4      2
...    ...,
and we see that this is indeed a bijection. Thus, Z is countable.
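
To see concretely why this is a bijection, we can write down f
together with an inverse. This is a small Python sketch; the names
nat_to_int and int_to_nat are ours, and the inverse is not given in
the notes but follows from the two cases of f.

```python
def nat_to_int(n):
    """f: N -> Z from above: evens go to 0, 1, 2, ...; odds to -1, -2, ..."""
    if n % 2 == 0:
        return n // 2
    return -((n + 1) // 2)


def int_to_nat(z):
    """Inverse of f: nonnegative z came from 2z, negative z from -2z - 1."""
    if z >= 0:
        return 2 * z
    return -2 * z - 1


print([nat_to_int(n) for n in range(5)])  # [0, -1, 1, -2, 2]
```

Having an inverse in both directions is another way of saying that f
is both one-to-one and onto.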

Enumerating the elements of S in a list implicitly defines a
bijection between S and a subset of N. If the list is finite, then
it defines a bijection between S and {0, 1, ..., |S|-1}. If the list
is infinite, then it defines a bijection between S and N. For S = Z,
we get
elements in Z  (<-->  elements in N)
0        (<-->        0)
-1        (<-->        1)
1        (<-->        2)
-2        (<-->        3)
2        (<-->        4)
...       (<-->       ...)
Thus, showing that S can be enumerated in a (possibly infinite) list
is enough to show that S is countable, since we can use the list to
demonstrate a bijection.

In enumerating a set S, though, we have to make sure that our
enumeration covers all elements in S. In other words, given any
element s∈S, it must appear at some concrete position in the
list of all elements in S. This is necessary for the implicit
bijective function we are defining to be onto.

Using the above technique, we can show that any subset T of a
countable set S is also countable. We know that S can be enumerated
in a list, so removing the elements not in T from the list for S
results in a list for T. Thus, T is countable.

This implies that any subset of N is countable, and any infinite
subset U of N (e.g. Z^+) is countably infinite, since we can
enumerate U in a list. This defines an implicit bijection between U
and N, so they have the same cardinality. This further implies that
all countably infinite sets have the same cardinality, since by
definition, a set S is countably infinite if it has a bijection with
an infinite subset of N.

As another example of enumeration, consider the set BS of all finite
binary strings, i.e. BS = {0,1}^*. We can enumerate BS in a list,
ordering elements first by length and then by lexicographic order:
BS = {ε, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, ...},
where ε is the empty string. Since BS can be enumerated, it
is countable.
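
The enumeration above can be sketched as a Python generator. The
names binary_strings and position are illustrative; the formula in
position is our own addition, and uses the fact that there are
2^k - 1 strings of length less than k.

```python
from itertools import count, islice, product


def binary_strings():
    """Yield all finite binary strings, by length, then lexicographically."""
    for length in count(0):
        for bits in product("01", repeat=length):
            yield "".join(bits)


def position(s):
    """Index of s in the list: skip the shorter strings, then the
    strings of the same length that come before s."""
    shorter = 2 ** len(s) - 1
    return shorter + (int(s, 2) if s else 0)


print(list(islice(binary_strings(), 11)))
# ['', '0', '1', '00', '01', '10', '11', '000', '001', '010', '011']
print(position("011"))  # 10
```

Since position returns a concrete index for every finite string,
every element of BS appears somewhere in the list.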

A bad enumeration of BS is to enumerate its elements strictly in
lexicographic order:
BS = {ε, 0, 00, 000, 0000, ...}
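This enumeration never gets past the strings of all 0's, so an
element such as 1 does not appear at any concrete position in the
list. It doesn't cover all elements in BS, so it is not a valid
enumeration of BS.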

What about the set ES of all finite English sentences? Every English
sentence can be encoded as a finite binary string (using ASCII or
Unicode), so there is a bijection between ES and a subset of BS.
Since BS is countable, any subset of BS is countable, implying that
ES is countable.

What about pairs of natural numbers, N x N? For Cartesian products
of finite sets, the product rule tells us that
|S1 x S2| = |S1| x |S2|.
So it seems like the cardinality of N x N should follow a similar
pattern, that we get some sort of ∞^2, whatever that means.
However, we can enumerate the elements in N x N, first by sum and
then by lexicographic order:
N x N = {(0,0), (0,1), (1,0), (0,2), (1,1), (2,0), (0,3), (1,2),
(2,1), (3,0), ...}.
So N x N is countable and has the same cardinality as N! And in
fact, we can use the same technique to show that the Cartesian
product N^m is countably infinite for any m∈Z^+.
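
Here is a minimal Python sketch of this enumeration of N x N. The
names pairs and position are ours; the index formula is an addition
to the notes, using the fact that there are k + 1 pairs with sum k.

```python
from itertools import count, islice


def pairs():
    """Yield all pairs in N x N, by sum, then lexicographically."""
    for total in count(0):
        for a in range(total + 1):
            yield (a, total - a)


def position(a, b):
    """Index of (a, b): pairs with smaller sum come first
    (1 + 2 + ... + total of them), then a pairs with the same sum."""
    total = a + b
    return total * (total + 1) // 2 + a


print(list(islice(pairs(), 10)))
# [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0), (0, 3), (1, 2), (2, 1), (3, 0)]
print(position(3, 0))  # 9
```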

What about Q, the set of rationals? We can write
Q = {a/b : a∈Z ∧ b∈Z^+ ∧ gcd(a,b) = 1},
where we represent a rational as a reduced fraction with a positive
integer denominator. (By our definition, 0 has the unique
representation 0/1; for 0/b with any other b, gcd(0, b) = b ≠ 1.)
Then there is a bijection between Q and the following subset of
Z x Z^+:
S = {(a, b) : a∈Z ∧ b∈Z^+ ∧ gcd(a,b) = 1}.
Since Z and Z^+ each have a bijection with N, applying those
bijections to the two coordinates gives a bijection between Z x Z^+
and N x N, so Z x Z^+ is countable. Since a subset of a countable
set is countable, S is countable, and so is Q.
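
Putting the pieces together gives one possible enumeration of Q. This
is a sketch under our own choices, not the notes' construction: walk
through N x N by sum, map the first coordinate to Z with the
bijection seen earlier and the second to Z^+ by adding 1, and keep
only reduced fractions. The name rationals is illustrative.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def rationals():
    """Yield every rational exactly once, as a reduced fraction a/b."""
    for total in count(0):
        for n in range(total + 1):
            # N -> Z: evens to 0, 1, 2, ...; odds to -1, -2, ...
            a = n // 2 if n % 2 == 0 else -((n + 1) // 2)
            # N -> Z^+: shift by one so the denominator is positive
            b = (total - n) + 1
            if gcd(a, b) == 1:  # skip non-reduced representations
                yield Fraction(a, b)


print([str(q) for q in islice(rationals(), 7)])
# ['0', '-1', '-1/2', '1', '-1/3', '1/2', '-2']
```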

Every set we've looked at so far, even the complicated ones, has
been countable. Are there sets that are not countable?

Diagonalization
Let us now consider the set of real numbers in the interval [0, 1].
Is this set countable? Let's try enumerating all the elements in a
list. All real numbers have decimal representations, though those
representations may be infinite (e.g. 1/3 = 0.3333...). In fact, we
will use only infinite representations in our list, by filling with
0's if necessary (e.g. 1/4 = 0.250000...). Finally, if a number has
more than one valid decimal representation (e.g. 1 = 0.9999...),
then we pick one arbitrarily (though we pick 0.9999... for 1 so that
every number in the list is written with a 0 before the decimal
point).

So let's assume that [0, 1] is countable and proceed to enumerate
its elements by writing out their decimal representations in a list.
(Since the list is infinite anyway, we don't mind writing down
infinite decimal representations in the list.) We list them in some
arbitrary order:
i     r∈[0, 1]
0    0.250000...
1    0.333333...
2    0.999999...
3    0.326244...
4    0.624601...
5    0.648756...
...       ...
In the above list, consider the diagonal: for the ith element in the
list, take its ith digit after the decimal (counting from digit 0).
Reading these digits off in order results in a new element in
[0, 1]:
s = 0.239206...
Now consider a new number t, where each digit in t is the corresponding
digit in s, plus 2, modulo 10. So we get
t = 0.451428...
Clearly t∈[0, 1]. Does t appear in our enumeration above? Well,
t is not equal to element 0 in the list, since it differs from
0.250000... by 2 in digit 0 after the decimal. (Note that two
different decimal representations of the same number, such as
0.1999... and 0.2000..., can only differ by 1 or 9 in any digit
where they disagree. So two representations that differ by 2, modulo
10, in some digit cannot represent the same number.) And t differs
from 0.333333... by 2 in digit 1, so it is not the same as element
1. Similarly, t is not equal to any element i in the list, since it
differs from element i by 2, modulo 10, in its ith digit. Thus, t is
not in the list.

This is a contradiction, since the list contains all real numbers in
[0, 1], but t is in [0, 1] and is not in the list. Thus, our
original assumption that we could enumerate [0, 1] is false, and [0,
1] is not countable.
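
The construction of t from the list can be sketched in Python on the
finite prefix of the list shown above. The name diagonal_number is
ours, and each number is assumed to be given as the string of its
digits after the decimal point.

```python
def diagonal_number(rows):
    """Digit i of the result is digit i of rows[i], plus 2, modulo 10."""
    return "".join(str((int(row[i]) + 2) % 10) for i, row in enumerate(rows))


rows = ["250000", "333333", "999999", "326244", "624601", "648756"]
t = diagonal_number(rows)
print("0." + t + "...")  # 0.451428...

# t disagrees with row i in digit i, for every row in the list.
print(all(t[i] != row[i] for i, row in enumerate(rows)))  # True
```

The same rule applies to any list, however it is ordered, which is
why no enumeration of [0, 1] can be complete.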

This technique was invented by Cantor and is called
"diagonalization." It involves the following steps:
(1) Assume that a set S can be enumerated.
(2) Consider an arbitrary list of all the elements of S.
(3) Use the diagonal from the list to construct a new element t.
(4) Show that t is in S but is different from all elements in the
    list and so is not in the list. Contradiction.
This shows that the original assumption that S is countable was
false. Therefore, S is "uncountable" or "uncountably infinite," and
its cardinality is strictly larger than that of N.

Since [0, 1] is uncountable, by the contrapositive of our statement
above that any subset of a countable set is countable, the set R of
all real numbers is also uncountable.

Let's look at another example of diagonalization. Consider the set
IBS of all infinite binary strings. We assume that IBS is countable,
so we can list all elements of IBS:
i      s∈IBS
0    010101...
1    111001...
2    000000...
3    010011...
4    111101...
5    000111...
...      ...
We flip the bits in the diagonal (0 becomes 1 and 1 becomes 0) to
construct a new element t that is not on the list:
t = 101110...
Since t∈IBS but is not in the list of all elements in IBS, this
is a contradiction, and the assumption that IBS is countable is
false.
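
Since the strings here are infinite, one way to sketch this in Python
is to represent an infinite binary string as a function from a bit
position to a bit, and a list as a function from an index i to the
ith string. The names table_listing and flip_diagonal are ours, and
table_listing is a hypothetical listing: it reproduces the table
above and assumes every bit not shown there is 0.

```python
def table_listing(i):
    """Hypothetical listing: return row i as a function j -> bit j."""
    rows = ["010101", "111001", "000000", "010011", "111101", "000111"]

    def row(j):
        # Assumption: anything beyond what the table shows is 0.
        if i < len(rows) and j < len(rows[i]):
            return int(rows[i][j])
        return 0

    return row


def flip_diagonal(listing):
    """Return t, where bit i of t is the flip of bit i of string i."""
    return lambda i: 1 - listing(i)(i)


t = flip_diagonal(table_listing)
print("".join(str(t(i)) for i in range(6)))  # 101110
print(all(t(i) != table_listing(i)(i) for i in range(1000)))  # True
```

Note that flip_diagonal never looks at how the listing was built, so
the same argument defeats any proposed listing of IBS.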

What happens if we try to use diagonalization on a countable set? If
we try to proceed as above for the set BS of finite binary strings,
the resulting t will have infinite length. (We have to change the
construction of t slightly. Since the ith string in the list of
elements of BS may have length at most i, it may not have an ith bit
for us to flip to get the ith bit of t. In that case, we just choose
the ith bit of t arbitrarily; t is still different from that ith
string, since t has an ith bit and the string does not.) Since t has
infinite length, it is not in BS and should not appear on the list
anyway. So step (4) of the diagonalization process fails, and no
contradiction is generated.