Administrative info
PA2 due Monday
HW5 due Tuesday

Review
Recall that a probability space consists of the following:
(1) A random experiment.
(2) The sample space (set of possible outcomes).
(3) The probability of each possible outcome of the experiment.
Further recall that the probabilities must satisfy the following:
(1) ∀ω∈Ω . 0 <= Pr[ω] <= 1
(2) ∑_{ω∈Ω} Pr[ω] = 1
An event is a subset of the sample space, i.e. a set of outcomes
from the sample space. The probability of an event E is the sum of
the probabilities of the outcomes in E:
Pr[E] = ∑_{ω∈E} Pr[ω].
In the case of a uniform distribution, this simplifies to
Pr[E] = |E|/|Ω|.
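
Both formulas can be evaluated by enumerating a small sample space. A minimal Python sketch, using two flips of a fair coin as the experiment:

```python
from itertools import product

# Sample space for two flips of a fair coin; all 4 outcomes equally likely.
omega = {''.join(flips) for flips in product('HT', repeat=2)}

# Event E: at least one flip is heads.
E = {w for w in omega if 'H' in w}

# Uniform distribution: Pr[E] = |E| / |Omega|.
pr_E = len(E) / len(omega)
print(pr_E)  # 0.75
```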

Probability Identities
Before we move on, let's note some facts about probability that can
make it easier to compute probabilities.

We defined the complement of an event E as Ē = Ω\E. Then,
Pr[Ē] = 1 - Pr[E]. Proof:
∑_{ω∈Ω} Pr[ω]
= ∑_{ω∈E} Pr[ω] +
∑_{ω∈Ω\E} Pr[ω]
= Pr[E] + Pr[Ē] = 1,
so Pr[Ē] = 1 - Pr[E].

Let A and B be events in Ω. Then Pr[A ∪ B] = Pr[A] + Pr[B]
- Pr[A ∩ B]. Writing this out in terms of sums, we get
∑_{ω∈A ∪ B} Pr[ω]
= ∑_{ω∈A} Pr[ω] +
∑_{ω∈B} Pr[ω] -
∑_{ω∈A ∩ B} Pr[ω].
As in inclusion/exclusion for sets, the first two terms double count
the probabilities of those outcomes in A ∩ B, so we have to
subtract the probability of A ∩ B.

EX: What is the probability that a random integer n between 1 and
100 is divisible by 5 or 7?
ANS: Let A be the event that n is divisible by 5, B be the event
that it is divisible by 7. Pr[A] = 20/100 = 1/5, Pr[B] = 14/100
= 7/50, and Pr[A ∩ B], the probability that n is divisible by
35, is 2/100 = 1/50. So Pr[A ∪ B] = 1/5 + 7/50 - 1/50 =
16/50 = 8/25.
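
The arithmetic can be double-checked by brute force. A Python sketch:

```python
# Brute-force check: n uniform over 1..100.
omega = range(1, 101)
A = {n for n in omega if n % 5 == 0}   # divisible by 5
B = {n for n in omega if n % 7 == 0}   # divisible by 7

# The union operator handles inclusion/exclusion for us.
pr = len(A | B) / len(omega)
print(pr)  # 0.32, i.e. 8/25
```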

Let A_1, ..., A_n be n mutually disjoint events in Ω. Then
Pr[A_1 ∪ ... ∪ A_n] = Pr[A_1] + ... + Pr[A_n]. This follows
from the above, generalized to n events using induction, and then
removing the intersection terms which are all 0.

EX: Suppose I roll a red and a blue die. What is the probability that
the red die is less than 4?
ANS: Let A_i be the event that the red die is i. Then Pr[A_i] = 1/6
for 1 <= i <= 6, and the A_i are mutually disjoint. Thus,
Pr[A_1 ∪ A_2 ∪ A_3] = Pr[A_1] + Pr[A_2] + Pr[A_3] =
1/2.
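
The disjoint-union rule from this example can be verified by enumeration. A Python sketch:

```python
from itertools import product

# Two dice: 36 equally likely (red, blue) outcomes.
omega = set(product(range(1, 7), repeat=2))

# A_i: the red die shows i; distinct A_i are mutually disjoint.
A = {i: {w for w in omega if w[0] == i} for i in range(1, 7)}

# Pr[A_1 ∪ A_2 ∪ A_3] equals the sum of the individual probabilities.
pr_union = len(A[1] | A[2] | A[3]) / len(omega)
pr_sum = sum(len(A[i]) for i in (1, 2, 3)) / len(omega)
print(pr_union, pr_sum)  # both 0.5
```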

Conditional Probability
A pharmaceutical company is marketing a new test for HIV that it
claims is 99% effective, meaning that it will report positive for
99% of people who have HIV and negative for 99% of those who don't
have HIV. Suppose a random person takes the test and gets a positive
test result. What is the probability that the person has HIV?

This is an example of conditional probability. Given some
information about a particular event in the sample space, we want to
compute new probabilities for other events.

Let's start off with simpler examples before coming back to the
above.

EX: Suppose I flip a fair coin twice. The result of the first flip
is heads. What is the probability that I got two heads?
ANS: Let's start by drawing the sample space Ω. There are 4
equally likely outcomes HH, HT, TH, and TT. We are now told
that event A = "the first flip is H" has occurred. Which
outcomes are now possible? There are only 2 outcomes in A, HH
and HT, each of which is equally likely. So we have a new
sample space Ω' that consists of just the outcomes HH and
HT, each with probability 1/2. Let event B = "both flips are
heads." What is the probability of B in this new sample space?
Only one of the two outcomes in Ω' is in B, so Pr[B] =
1/2 in the new sample space. We write this as Pr[B|A], "the
probability of B given A," which is the probability of B
occurring in a new sample space consisting of just those
outcomes in A.

Generalizing the above procedure, suppose we are told an event A
occurs. Then what is the new conditional probability of each outcome
ω, i.e. Pr[ω|A]? For ω ∉ A, this is clearly
0. For ω ∈ A, the relative likelihood of any two outcomes
in A should remain the same, but we need to renormalize so that we
satisfy the requirement that all probabilities add to 1. By
definition, we had ∑_{ω ∈ A} Pr[ω] = Pr[A], so
if we normalize by dividing by Pr[A], i.e. Pr[ω|A] =
Pr[ω]/Pr[A], we get ∑_{ω ∈ A} Pr[ω|A] =
∑_{ω ∈ A} Pr[ω]/Pr[A] = Pr[A]/Pr[A] = 1.

Now suppose we have another event B. What is Pr[B|A]? The outcomes
in B that are not in A contribute nothing, since their new
conditional probabilities are 0. So only the outcomes in both B and
A contribute any probability, and we get Pr[B|A] = ∑_{ω
∈ B ∩ A} Pr[ω|A] = ∑_{ω ∈ B ∩ A}
Pr[ω]/Pr[A] = Pr[B ∩ A]/Pr[A].

To summarize, when conditioning on an event A, we cross out any
possibilities that are incompatible with A and then renormalize by
1/Pr[A] so that the probabilities of the remaining outcomes add to
1. We can compute the probabilities of events directly in this new
sample space or use the identities above to get the same result.
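
The cross-out-and-renormalize recipe can be written as a short Python sketch (the helper names pr and pr_given are made up here), using the two-coin-flip example from above:

```python
from itertools import product

def pr(event, omega):
    # Probability of an event under a uniform distribution on omega.
    return len(event & omega) / len(omega)

def pr_given(B, A, omega):
    # Pr[B|A] = Pr[B ∩ A] / Pr[A]: keep only outcomes in A, renormalize.
    return pr(B & A, omega) / pr(A, omega)

# Two flips of a fair coin, as in the earlier example.
omega = {''.join(f) for f in product('HT', repeat=2)}
A = {w for w in omega if w[0] == 'H'}  # first flip is heads
B = {'HH'}                             # both flips are heads
result = pr_given(B, A, omega)
print(result)  # 0.5
```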

EX: Suppose I toss a red and a blue die, and I tell you that the
resulting sum is 4. What is the probability that the red die is
1?
ANS: Let A be the event that the sum is 4, B be the event that the
red die is 1. The outcomes (1, 3), (2, 2), and (3, 1) are in A,
so Pr[A] = 1/12. What is Pr[B ∩ A]? Only the outcome (1, 3)
is in B ∩ A, so Pr[B ∩ A] = 1/36. Then Pr[B|A] = Pr[B
∩ A]/Pr[A] = 1/3.
We could also have redefined the sample space to come up with
the same result. Given A, we have a new sample space Ω'
consisting of the outcomes (1, 3), (2, 2), and (3, 1), each
with probability 1/3. Then B has probability 1/3 in this new
sample space. So Pr[B|A] = 1/3.

EX: Suppose I toss a red and a blue die, and I tell you that the
resulting sum is 7. What is the probability that the red die is
1?
ANS: Let A be the event that the sum is 7, B be the event that the
red die is 1. The six outcomes (1, 6), (2, 5), ..., (6, 1) are
in A, so Pr[A] = 6/36 = 1/6. What is Pr[B ∩ A]? Only the
outcome (1, 6) is in B ∩ A, so Pr[B ∩ A] = 1/36. Then
Pr[B|A] = Pr[B ∩ A]/Pr[A] = 1/6.

EX: Suppose I toss 3 balls into 3 bins (with replacement). Let
A = "1st bin empty," B = "2nd bin empty." What is Pr[A|B]?
ANS: Pr[B] = 2^3/3^3 = 8/27, Pr[A ∩ B] = 1/3^3 = 1/27, so
Pr[A|B] = (1/27)/(8/27) = 1/8.
Thus, the fact that the 2nd bin is empty makes it much less
likely that the 1st one is as well.
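
This can be checked by enumerating all 27 assignments. A Python sketch mirroring the event definitions above:

```python
from itertools import product

# Toss 3 balls into bins 1..3; all 3^3 = 27 assignments equally likely.
omega = set(product(range(1, 4), repeat=3))

A = {w for w in omega if 1 not in w}  # 1st bin empty
B = {w for w in omega if 2 not in w}  # 2nd bin empty

pr_A = len(A) / len(omega)            # 8/27, unconditioned
pr_A_given_B = len(A & B) / len(B)    # Pr[A|B] = (1/27)/(8/27) = 1/8
print(pr_A_given_B, pr_A)             # 0.125 vs roughly 0.296
```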

EX: Suppose I flip a fair coin 51 times. If the first 50 flips are
heads, what is the probability that the 51st is heads?
ANS: Let A be the event that the first 50 flips are heads, B be the
event that the 51st is heads. There are only 2 outcomes in A
out of 2^51, so Pr[A] = 1/2^50. There are 2^50 outcomes in B,
so Pr[B] = 1/2. Only one outcome is in both A and B, so Pr[A
∩ B] = 1/2^51. Then Pr[B|A] = (1/2^51)/(1/2^50) = 1/2.
So the first 50 flips tell us nothing about the 51st; the
probability of heads is still 1/2.

We have seen multiple examples where Pr[B|A] = Pr[B]. We say that A
and B are "independent" if this is the case. Intuitively, two events
A and B are independent if knowing that one happens does not change
the likelihood of the other happening. So the 51st flip of a fair
coin is independent of what came up before.

If A and B are independent, we get Pr[B|A] = Pr[B ∩ A]/Pr[A] =
Pr[B], so Pr[B ∩ A] = Pr[A] Pr[B]. This is a very useful
identity.
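
The identity can be checked exactly with two dice. A Python sketch, using fractions to avoid rounding; the two events chosen here are one illustrative pair:

```python
from fractions import Fraction
from itertools import product

# Two fair dice; "red die is 1" and "blue die is 6" are independent.
omega = set(product(range(1, 7), repeat=2))
A = {w for w in omega if w[0] == 1}
B = {w for w in omega if w[1] == 6}

lhs = Fraction(len(A & B), len(omega))                        # Pr[A ∩ B]
rhs = Fraction(len(A), len(omega)) * Fraction(len(B), len(omega))
print(lhs, lhs == rhs)  # 1/36 True
```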

EX: Suppose I flip a coin with probability p of heads n times. What
is the probability of a particular outcome with k heads?
ANS: Each flip is independent, with probability p of heads. Each of
the k heads contributes a factor of p, and each of the n-k tails
a factor of 1-p, so an outcome with k heads has probability
p^k (1-p)^(n-k).
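
The formula translates directly into code (the function name here is made up):

```python
def outcome_probability(p, n, k):
    # Probability of one fixed sequence of n independent flips
    # that contains exactly k heads, heads probability p each.
    return p**k * (1 - p)**(n - k)

print(outcome_probability(0.5, 3, 2))  # 0.125, e.g. the sequence HHT
```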

EX: Suppose a casino advertises the following game. You pick a
number from 1 to 6. The casino rolls three dice, and if your
number comes up, you win. What is your probability of winning?
ANS: It's not 1/2! Let A_i be the event that your number comes up
on the ith die. We want to know
Pr[A_1 ∪ A_2 ∪ A_3]
= 1 - Pr[Ā_1 ∩ Ā_2 ∩ Ā_3]
= 1 - Pr[Ā_1] Pr[Ā_2] Pr[Ā_3]
= 1 - (5/6)^3 ≈ 1 - 0.58 = 0.42.
In the third line above, we used the fact that the results of
the three dice are mutually independent. We will come back to the
concept of mutual independence later.
So your probability of winning is less than 1/2.
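
The game can also be checked exhaustively over all 216 rolls. A Python sketch:

```python
from itertools import product

# All 6^3 = 216 equally likely rolls of three dice; suppose you pick 1.
omega = list(product(range(1, 7), repeat=3))
wins = [w for w in omega if 1 in w]

p_win = len(wins) / len(omega)  # 91/216
print(p_win)                    # about 0.421, matching 1 - (5/6)^3
```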

Suppose you are flying to Las Vegas (in order to play the game
above). Your friend, fearing for your safety, gives you the
following advice: "You know, you should always carry a bomb on an
airplane. The chance of there being one bomb on the plane is pretty
small, but the chance of two bombs is minuscule. So by carrying a
bomb on the airplane, your chances of being blown up are
astronomically reduced." What do you think of his advice?

Let A be the event that you carry a bomb on board, B be the event
that someone else carries a bomb on board. How are A and B related?
They are independent, so Pr[B|A] = Pr[B], and the likelihood that
someone else has a bomb doesn't change one bit if you bring one
aboard.
