9 Chapter 2 Exercise Solutions

9.1 Solution to Exercise 2.2

There are 3 possible prizes in each of 3 boxes, so there are \(3^3 = 27\) possible outcomes.
See Table 9.1 (ignore \(X\) and \(Y\) columns for now); there are 27 rows, each row for a different possible outcome.

Table 9.1: Sample space for Exercise 2.2

box1	box2	box3	X	Y
1	1	1	1	3
2	1	1	2	2
3	1	1	2	2
1	2	1	2	2
2	2	1	2	1
3	2	1	3	1
1	3	1	2	2
2	3	1	3	1
3	3	1	2	1
1	1	2	2	2
2	1	2	2	1
3	1	2	3	1
1	2	2	2	1
2	2	2	1	0
3	2	2	2	0
1	3	2	3	1
2	3	2	2	0
3	3	2	2	0
1	1	3	2	2
2	1	3	3	1
3	1	3	2	1
1	2	3	3	1
2	2	3	2	0
3	2	3	2	0
1	3	3	2	1
2	3	3	2	0
3	3	3	1	0
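As a cross-check, Table 9.1 can be regenerated by brute-force enumeration. The following is a minimal Python sketch, not the book's own code; the names `outcomes`, `X`, and `Y` are chosen here for illustration. It takes \(X\) to be the number of distinct prizes and \(Y\) to be the number of boxes containing prize 1, which is consistent with the columns of the table.

```python
from itertools import product

# All (box1, box2, box3) outcomes; each box holds prize 1, 2, or 3.
outcomes = list(product([1, 2, 3], repeat=3))

def X(outcome):
    """Number of distinct prizes obtained."""
    return len(set(outcome))

def Y(outcome):
    """Number of boxes that contain prize 1."""
    return outcome.count(1)

print(len(outcomes), "outcomes")
for outcome in outcomes:
    print(*outcome, X(outcome), Y(outcome))
```

The rows print in a different order than Table 9.1 (here box 3 varies fastest), but the set of rows is the same.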

9.2 Solution to Exercise 2.3

See the sample space \(\Omega\) of 27 possible outcomes in Table 9.1.

\(A_1^c = \{222, 223, 232, 322, 233, 323, 332, 333\}\) is the event that none of the boxes contain prize 1, so \(A_1\) consists of the \(27-8 = 19\) other outcomes.
\(B_1 = \{111\}\)
\(A_1 \cap A_2 \cap A_3 = \{123, 132, 213, 231, 312, 321\}\) is the event that at least one of each prize is obtained (that is, a complete set of prizes)
\(A_1 \cup A_2 \cup A_3 = \Omega\), the set of all possible outcomes; you have to get at least 1 of one of the prizes.
\(B_1 \cap B_2 \cap B_3=\emptyset\); you can’t get only prize 1 and only prize 2
\(B_1 \cup B_2 \cup B_3 = \{111, 222, 333\}\) is the event you only obtain one of the prizes (in every box)

9.3 Solution to Exercise 2.4

See Figure 9.1

Figure 9.1 (a): \(A\), Katniss’s dart lands within 1 inch of the center of the dartboard.
Figure 9.1 (b): \(B\), Katniss’s dart lands more than 1 inch but less than 2 inches away from the center of the dartboard.
Figure 9.1 (c): \(E\), Katniss’s dart lands within 1 inch of the outside edge of the dartboard.

9.4 Solution to Exercise 2.6

See Table 9.1
The possible values of \(X\) are \(\{1, 2, 3\}\)
The possible values of \(Y\) are \(\{0, 1, 2, 3\}\)
The possible \((X, Y)\) pairs are \(\{(1, 0), (1, 3), (2, 0), (2, 1), (2, 2), (3, 1)\}\). In particular, the following pairs are NOT possible: \((1, 1), (1, 2), (2, 3), (3, 0), (3, 2), (3, 3)\).
\(\{X = 1\} = \{111, 222, 333\}\) is the event that only one distinct prize is obtained (that is, you get the same prize in every box)
\(\{X=2\}\) is the event that you get exactly two distinct prizes, which consists of 18 outcomes. It’s easier to write the complement: \(\{X = 2\}^c = \{111, 222, 333, 123, 132, 213, 231, 312, 321\}\).
\(\{X = 3\} = \{123, 132, 213, 231, 312, 321\}\) is the event that you get all 3 prizes; that is, the event that you get the complete set
\(\{Y = 0\}=\{222, 223, 232, 322, 233, 323, 332, 333\}\) is the event that none of the boxes contain prize 1
\(\{Y = 1\}=\{122, 123, 132, 133, 212, 213, 312, 313, 221, 231, 321, 331\}\) is the event that exactly one of the boxes contains prize 1
\(\{Y = 2\}=\{112, 113, 121, 131, 211, 311\}\) is the event that exactly two of the boxes contain prize 1.
\(\{Y = 3\}=\{111\}\) is the event that all three of the boxes contain prize 1.
\(\{X = 2, Y = 1\} = \{122, 133, 212, 313, 221, 331\}\) is the event that one box contains prize one and the other two boxes contain either both prize 2 or both prize 3.
\(\{X = Y\} = \{112, 113, 121, 131, 211, 311\}\) is the event that the number of boxes that contain prize 1 is equal to the number of distinct prizes obtained (in this case it only happens if both \(X\) and \(Y\) equal 2)
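Let \(I_1\) be the indicator random variable that prize 1 is obtained (in at least one of the three packages). Identify and interpret \(\{I_1 = 0\}\): \(\{I_1 = 0\} = A_1^c = \{222, 223, 232, 322, 233, 323, 332, 333\}\) is the event that none of the boxes contain prize 1, which is the same event as \(\{Y = 0\}\).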
\(X = I_1+ I_2+ I_3\)
Label the boxes instead of the prizes. Let \(J_1\) be the indicator random variable that box 1 contains prize 1, \(J_2\) that box 2 contains prize 1, and \(J_3\) that box 3 contains prize 1. Then \(Y = J_1+ J_2+ J_3\).
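The events in this solution can be listed mechanically by filtering the sample space. The sketch below is illustrative Python (the helper `event` and the variable names are assumptions made here, not part of the exercise); outcomes are written as strings such as `"323"` to match the notation above.

```python
from itertools import product

outcomes = ["".join(p) for p in product("123", repeat=3)]

def X(w):
    """Number of distinct prizes in outcome w."""
    return len(set(w))

def Y(w):
    """Number of boxes in outcome w that contain prize 1."""
    return w.count("1")

def event(condition):
    """Set of outcomes for which the condition holds."""
    return {w for w in outcomes if condition(w)}

possible_pairs = sorted({(X(w), Y(w)) for w in outcomes})
x_is_2 = event(lambda w: X(w) == 2)
x_is_2_and_y_is_1 = event(lambda w: X(w) == 2 and Y(w) == 1)
x_equals_y = event(lambda w: X(w) == Y(w))

# Indicator decompositions: X sums over prizes, Y sums over boxes.
x_from_indicators = {w: sum(int(k in w) for k in "123") for w in outcomes}
y_from_indicators = {w: sum(int(box == "1") for box in w) for w in outcomes}

print(possible_pairs)
print(len(x_is_2), sorted(x_is_2_and_y_is_1), sorted(x_equals_y))
```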

9.5 Solution to Exercise 2.7

Figure 9.1 (a): \(\{X \le 1\}\), Katniss’s dart lands within 1 inch of the center of the dartboard.
Figure 9.1 (b): \(\{1 < X < 2\}\), Katniss’s dart lands more than 1 inch but less than 2 inches away from the center of the dartboard.
Figure 9.1 (c): \(\{X > 11\}\), Katniss’s dart lands within 1 inch of the outside edge of the dartboard.
\(\{X = 0\}\), the event that the dart hits exactly in the center, is just the single point in the center
\(\{X = 1\}\), the event that the dart hits exactly 1 inch from the center, is the circle with radius 1 inch (the outside edge of the shaded region in Figure 9.1 (a))

9.6 Solution to Exercise 2.10

The latest series of collectible Lego Minifigures contains 3 different Minifigure prizes (labeled 1, 2, 3). Each package contains a single unknown prize. Suppose we only buy 3 packages and we consider as our sample space outcome the results of just these 3 packages (prize in package 1, prize in package 2, prize in package 3). For example, 323 (or (3, 2, 3)) represents prize 3 in the first package, prize 2 in the second package, prize 3 in the third package. Suppose that each package is equally likely to contain any of the 3 prizes, regardless of the contents of other packages, so that there are 27 equally likely outcomes, and let \(\textrm{P}\) be the corresponding probability measure.

Let \(A_1\) be the event that prize 1 is obtained—that is, at least one of the packages contains prize 1—and define \(A_2, A_3\) similarly for prize 2, 3.
Let \(B_1\) be the event that only prize 1 is obtained—that is, all three packages contain prize 1—and define \(B_2, B_3\) similarly for prize 2, 3.

Compute \(\textrm{P}(A_1)\)
Compute \(\textrm{P}(B_1)\)
Interpret the values from parts 1 and 2 as long run relative frequencies.
Interpret the values from parts 1 and 2 as relative likelihoods.
Compute \(\textrm{P}(A_1 \cap A_2 \cap A_3)\)
Compute \(\textrm{P}(A_1 \cup A_2 \cup A_3)\)
Compute \(\textrm{P}(B_1 \cap B_2 \cap B_3)\)
Compute \(\textrm{P}(B_1 \cup B_2 \cup B_3)\)
\(A_1^c = \{222, 223, 232, 322, 233, 323, 332, 333\}\) is the event that none of the boxes contain prize 1, so \(A_1\) consists of the \(27-8 = 19\) other outcomes. Since the outcomes are equally likely, \(\text{P}(A_1) = 19/27=0.704\).
There is only one outcome that satisfies \(B_1\) so \(\text{P}(B_1) = 1/27 = 0.037\).
Over many sets of 3 boxes, about 70.4% of sets of 3 boxes will contain at least one prize 1, and about 3.7% of sets of 3 boxes will contain only prize 1.
It is 19 times more likely to obtain at least one prize 1 than it is to obtain only prize 1. Also, it is about 2.4 times more likely to obtain at least one prize 1 than to not obtain it (19/8), and it is 26 times less likely to obtain only prize 1 than it is to obtain any other collection of prizes.
\(A_1 \cap A_2 \cap A_3 = \{123, 132, 213, 231, 312, 321\}\) is the event that at least one of each prize is obtained (that is, a complete set of prizes) so \(\textrm{P}(A_1 \cap A_2 \cap A_3) = 6/27 = 0.222\)
\(A_1 \cup A_2 \cup A_3 = \Omega\), the set of all possible outcomes; you have to get at least 1 of one of the prizes so \(\textrm{P}(A_1 \cup A_2 \cup A_3) = 1\)
\(B_1 \cap B_2 \cap B_3=\emptyset\); you can’t get only prize 1 and only prize 2, so \(\textrm{P}(B_1 \cap B_2 \cap B_3) = 0\)
\(B_1 \cup B_2 \cup B_3 = \{111, 222, 333\}\) is the event you only obtain one of the prizes (in every box), so \(\textrm{P}(B_1 \cup B_2 \cup B_3) = 3/27 = 0.111\)
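Because the 27 outcomes are equally likely, each probability above is just a count divided by 27. The following Python sketch (an illustration under that equally-likely assumption; the dictionaries `A` and `B` and the helper `prob` are names introduced here) recomputes the counts with exact fractions.

```python
from fractions import Fraction
from itertools import product

outcomes = {"".join(p) for p in product("123", repeat=3)}

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(outcomes))

# A[k]: prize k appears in at least one box; B[k]: every box holds prize k.
A = {k: {w for w in outcomes if k in w} for k in "123"}
B = {k: {k * 3} for k in "123"}

results = {
    "P(A1)": prob(A["1"]),
    "P(B1)": prob(B["1"]),
    "P(A1 and A2 and A3)": prob(A["1"] & A["2"] & A["3"]),
    "P(A1 or A2 or A3)": prob(A["1"] | A["2"] | A["3"]),
    "P(B1 and B2 and B3)": prob(B["1"] & B["2"] & B["3"]),
    "P(B1 or B2 or B3)": prob(B["1"] | B["2"] | B["3"]),
}
for name, value in results.items():
    print(name, value, round(float(value), 3))
```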

9.7 Solution to Exercise 2.13

Since the dart lands uniformly at random anywhere on the dartboard, probabilities are computed as the ratio of the area corresponding to the event of interest to the total area of the dartboard (\(12^2\pi\)).

See Figure 9.1. Find the area of the shaded pieces by subtracting areas of circles.

\(\text{P}(X \le 1) = \frac{1^2\pi}{12^2\pi} = 1/144 = 0.00694\)
\(\text{P}(1 < X < 2) = \frac{2^2\pi - 1^2\pi}{12^2\pi} = 3/144 = 0.021\)
\(\text{P}(X > 11) = 1 - \frac{11^2\pi}{12^2\pi} = 1 - (11/12)^2 = 0.160\)
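The three area ratios can be checked numerically. This is a small Python sketch assuming a dartboard of radius 12 inches, as in the solution; the function name `prob_within` is introduced here for illustration.

```python
import math

RADIUS = 12  # dartboard radius in inches

def prob_within(r):
    """P(X <= r): area of a disc of radius r divided by the board area."""
    return (math.pi * r**2) / (math.pi * RADIUS**2)

p_a = prob_within(1)                   # within 1 inch of the center
p_b = prob_within(2) - prob_within(1)  # between 1 and 2 inches from the center
p_c = 1 - prob_within(11)              # within 1 inch of the outside edge

print(round(p_a, 5), round(p_b, 3), round(p_c, 3))
```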

9.8 Solution to Exercise 2.14

Find the area of the events of interest by finding areas of corresponding circles and subtracting as needed.

\(\textrm{P}(X \le 0.1) = \frac{0.1^2\pi}{12^2\pi} = (0.1/12)^2 = 0.0000694\)
\(\textrm{P}(X \le 0.01) = \frac{0.01^2\pi}{12^2\pi} = (0.01/12)^2 = 0.000000694\)
\(\textrm{P}(X = 0) = 0\), the single point has no area
\(\textrm{P}(X \ge 11.9) = 1 - \frac{11.9^2\pi}{12^2\pi} = 1 - (11.9/12)^2 = 0.0166\)
\(\textrm{P}(X \ge 11.99) = 1 - \frac{11.99^2\pi}{12^2\pi} = 1 - (11.99/12)^2 = 0.00167\)
\(\textrm{P}(X = 12) = 0\), the circle representing the outside edge has no area
Well, both of these events (the dart lands exactly in the center and the dart lands exactly on the edge) have 0 probability. However, for practical purposes we would never be interested in probabilities like \(\textrm{P}(X = 0.000000000\ldots)\) or \(\textrm{P}(X = 12.000000000\ldots)\) with infinite precision.
However we define “close to”—within 1 inch or within 0.1 inch or within 0.01 inch, etc—the dart is more likely to land close to the edge than close to the center.
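The comparison between "close to the center" and "close to the edge" can be tabulated for several tolerances. The Python sketch below is illustrative only; the function names are hypothetical and the board radius of 12 inches is taken from the solution.

```python
RADIUS = 12.0  # dartboard radius in inches

def prob_near_center(eps):
    """P(X <= eps): disc of radius eps relative to the whole board."""
    return (eps / RADIUS) ** 2

def prob_near_edge(eps):
    """P(X >= RADIUS - eps): outer ring of width eps relative to the whole board."""
    return 1 - ((RADIUS - eps) / RADIUS) ** 2

for eps in [1, 0.1, 0.01]:
    print(eps, prob_near_center(eps), prob_near_edge(eps))
```

For every tolerance the ring near the edge has more area than the disc near the center, which is why the edge probability is larger.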

9.9 Solution to Exercise 2.16

Construct a two-way table representing the joint distribution of \(X\) and \(Y\).
Sketch a plot representing the joint distribution of \(X\) and \(Y\).
Identify the marginal distribution of \(X\), and sketch a plot of it.
Identify the marginal distribution of \(Y\), and sketch a plot of it.
Compute and interpret \(\text{E}(X)\).
Compute and interpret \(\text{E}(Y)\).
One dimension will have possible values of \(x\), the other possible values of \(y\). The interior cells should contain the probability of each \((x, y)\) pair.

	\(y\)
\(x\)	0	1	2	3	Total
1	2/27	0	0	1/27	3/27
2	6/27	6/27	6/27	0	18/27
3	0	6/27	0	0	6/27
Total	8/27	12/27	6/27	1/27	1

Make a tile plot with color or shading representing probability; see Figure 9.2
The possible values of \(X\) are in the leftmost column (1, 2, 3) and the probabilities are in the Total column. See Figure 9.3.
The possible values of \(Y\) are in the top row (0, 1, 2, 3) and the probabilities are in the Total row. See Figure 9.4.
\(\textrm{E}(X) = 1\times (3/27) + 2 \times (18/27) + 3 \times (6/ 27) = 2.11\). Over many sets of 3 boxes, we expect 2.11 distinct prizes on average.
\(\textrm{E}(Y) = 0\times (8/27) + 1\times (12/27) + 2 \times (6/27) + 3 \times (1/ 27) = 1\). Over many sets of 3 boxes, we expect 1 box containing prize 1 on average.
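The joint distribution, both marginal distributions, and the two expected values can all be derived from the 27 equally likely outcomes. The following Python sketch is one way to do this; the variable names are assumptions introduced here, and exact fractions are used to avoid rounding.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

outcomes = list(product([1, 2, 3], repeat=3))
p_outcome = Fraction(1, len(outcomes))

# Joint distribution of X (distinct prizes) and Y (boxes containing prize 1).
joint = defaultdict(Fraction)
for w in outcomes:
    joint[(len(set(w)), w.count(1))] += p_outcome

# Marginal distributions: sum the joint probabilities over the other variable.
marginal_x = defaultdict(Fraction)
marginal_y = defaultdict(Fraction)
for (x, y), p in joint.items():
    marginal_x[x] += p
    marginal_y[y] += p

expected_x = sum(x * p for x, p in marginal_x.items())
expected_y = sum(y * p for y, p in marginal_y.items())

print(dict(joint))
print(expected_x, float(expected_x), expected_y)
```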

Figure 9.2: Joint distribution of \(X\) and \(Y\) in Exercise 2.16

Figure 9.3: Marginal distribution of \(X\) in Exercise 2.16

Figure 9.4: Marginal distribution of \(Y\) in Exercise 2.16

9.10 Solution to Exercise 2.22

Each row of the table below represents a different conditional distribution of \(Y\) given \(X=x\). For example, the row for \(x=1\) represents the conditional distribution of \(Y\) given \(X=1\): Given \(X=1\), \(Y\) is 0 with probability 2/3 and 3 with probability 1/3.

	\(y\)
	0	1	2	3	Total
\(x\)
1	2/3	0	0	1/3	1
2	1/3	1/3	1/3	0	1
3	0	1	0	0	1

Take expected values according to each conditional distribution. In general, \(\text{E}(Y|X=x)\) depends on \(x\), but in this case \(\text{E}(Y|X=x) = 1\) for all values of \(x\); regardless of how many distinct prizes people obtain in their 3 boxes, the average number of prize 1s obtained is 1.

\[\begin{align*} x & \quad \text{E}(Y|X=x)\\ 1 & \quad 0(2/3) + 3(1/3) = 1\\ 2 & \quad 0(1/3) + 1(1/3) + 2(1/3) = 1\\ 3 & \quad 1(1) = 1 \end{align*}\]
Each column of the table below represents a different conditional distribution of \(X\) given \(Y=y\). For example, the column for \(y=0\) represents the conditional distribution of \(X\) given \(Y=0\): Given \(Y=0\), \(X\) is 1 with probability 1/4 and 2 with probability 3/4.

	\(y\)
	0	1	2	3
\(x\)
1	1/4	0	0	1
2	3/4	1/2	1	0
3	0	1/2	0	0
Total	1	1	1	1

Take expected values according to each conditional distribution. For example, \(\text{E}(X|Y = 0) = 1.75\); among the people who never get prize 1 in their 3 boxes, the average number of distinct prizes they obtain is 1.75.

\[\begin{align*} y & \quad \text{E}(X|Y=y)\\ 0 & \quad 1(1/4) + 2(3/4) = 1.75\\ 1 & \quad 2(1/2) + 3(1/2) = 2.5\\ 2 & \quad 2(1) = 2\\ 3 & \quad 1(1) = 1 \end{align*}\]
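Conditioning amounts to slicing the joint distribution and renormalizing the slice. The Python sketch below illustrates this with the joint probabilities of \(X\) and \(Y\) from the prize example entered by hand; the helper names `conditional` and `expected` are hypothetical.

```python
from fractions import Fraction as F

# Joint distribution P(X = x, Y = y); pairs not listed have probability 0.
joint = {
    (1, 0): F(2, 27), (1, 3): F(1, 27),
    (2, 0): F(6, 27), (2, 1): F(6, 27), (2, 2): F(6, 27),
    (3, 1): F(6, 27),
}

def conditional(given_index, given_value):
    """Conditional distribution of the other variable given X (index 0) or Y (index 1)."""
    other = 1 - given_index
    slice_ = {k[other]: p for k, p in joint.items() if k[given_index] == given_value}
    total = sum(slice_.values())  # marginal probability of the conditioning event
    return {value: p / total for value, p in slice_.items()}

def expected(dist):
    return sum(value * p for value, p in dist.items())

e_y_given_x = {x: expected(conditional(0, x)) for x in (1, 2, 3)}
e_x_given_y = {y: expected(conditional(1, y)) for y in (0, 1, 2, 3)}

print(e_y_given_x)
print(e_x_given_y)
```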

9.11 Solution to Exercise 2.17

Suppose there are 1000 questions on the test. (That’s a long test! But remember, 1000 is just a convenient round number.) We can classify each question by its type (know, eliminate, guess) and whether we answer it correctly or not. The probability that we answer a question correctly is 1 given that we know it, 0.5 given that we can eliminate two choices, or 0.25 given that we guess randomly.

	Know	Eliminate	Guess	Total
Correct	700	100	25	825
Incorrect	0	100	75	175
Total	700	200	100	1000

The probability that we answer a randomly selected question correctly is 825/1000 = 0.825.
The overall probability of answering a question correctly is closer to 1 than to 0.5 or 0.25. To construct the table and obtain the value 0.825, we basically did the following calculation:

\[ 0.825 = (1)(0.7) + (0.5)(0.2) + (0.25)(0.1) \]

We see that the overall probability, 0.825, is a weighted average of the case-by-case probabilities 1, 0.5, and 0.25, where 1 gets the most weight in the average because there is a higher percentage of questions that we know.
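The weighted average can be written out as a short calculation. In this Python sketch the proportions 0.7, 0.2, and 0.1 and the conditional probabilities 1, 0.5, and 0.25 come from the solution; the dictionary layout and names are assumptions made here for illustration.

```python
# question type -> (proportion of questions, P(correct | type))
question_types = {
    "know": (0.70, 1.00),
    "eliminate": (0.20, 0.50),
    "guess": (0.10, 0.25),
}

N = 1000  # hypothetical number of questions, a convenient round number

# Expected counts of correct answers per type, as in the two-way table.
correct_counts = {
    name: round(N * share * p_correct_given_type)
    for name, (share, p_correct_given_type) in question_types.items()
}

# Law of total probability: weight each conditional probability by its share.
p_correct = sum(share * p for share, p in question_types.values())

print(correct_counts, round(p_correct, 3))
```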

9.12 Solution to Exercise 2.18

Most people produce a sequence that has 30 G’s and 10 R’s, or close to those proportions, because they are trying to generate a sequence for which each outcome has a 75% chance for G and a 25% chance for R. That is, they use a strategy in which they predict G with probability 0.75, and R with probability 0.25.
There are two cases: the true flash is either green (with probability 0.75) or red (with probability 0.25). Given that the flash is green, your probability of correctly predicting it is 0.75 (because your probability of guessing “G” is 0.75). Given that the flash is red, your probability of correctly predicting it is 0.25 (because your probability of guessing “R” is 0.25). Use the law of total probability to find the probability that your prediction is correct: \((0.75)(0.75) + (0.25)(0.25) = 0.625\).
Just pick G every time! Picking green every time has a 0.75 probability of correctly predicting any flash. When events are independent, trying to guess the pattern doesn’t help.
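The two strategies can be compared with one formula. The Python sketch below assumes, as in the solution, that each flash is green with probability 0.75 and that the prediction is made independently of the flash; the function name `prob_correct` is introduced here.

```python
P_GREEN = 0.75  # probability that any given flash is green

def prob_correct(q):
    """P(correct prediction) when predicting G with probability q.

    Law of total probability over the true color of the flash.
    """
    return P_GREEN * q + (1 - P_GREEN) * (1 - q)

matching = prob_correct(0.75)      # predict G 75% of the time
always_green = prob_correct(1.0)   # predict G every time

print(matching, always_green)
```

Since `prob_correct(q)` increases with `q` whenever green is the more likely color, always predicting G maximizes the probability of being correct.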

9.13 Solution to Exercise 2.19

We don’t know what you guessed, but from experience many people guess 80-100%. After all, the test is correct for most people who carry HIV, and also correct for most people who don’t carry HIV, so it seems like the test is correct most of the time. But this argument ignores one important piece of information that has a huge impact on the results: most people do not carry HIV.
Let \(H\) denote the event that the person carries HIV (hypothesis), and let \(E\) denote the event that the test is positive (evidence). Therefore, \(H^c\) is the event that the person does not carry HIV, another hypothesis. We are given

prior probability: \(P(H) = 0.005\)
likelihood of testing positive, if the person carries HIV: \(P(E|H) = 0.977\)
probability of not testing positive, if the person does not carry HIV: \(P(E^c|H^c) = 0.926\)
likelihood of testing positive, if the person does not carry HIV: \(P(E|H^c) = 1-P(E^c|H^c) = 1-0.926 = 0.074\)
We want to find the posterior probability \(P(H|E)\).

Assuming 1000000 Americans

	Tests positive	Does not test positive	Total
Carries HIV	4885	115	5000
Does not carry HIV	73630	921370	995000
Total	78515	921485	1000000

Among the 78515 who test positive, 4885 carry HIV, so the probability that an American who tests positive actually carries HIV is 4885/78515 = 0.062.
See Table 9.2.
The result says that only 6.2% of Americans who test positive actually carry HIV. It is true that the test is correct for most Americans with HIV (4885 out of 5000) and incorrect only for a small proportion of Americans who do not carry HIV (73630 out of 995000). But since so few Americans carry HIV, the sheer number of false positives (73630) swamps the number of true positives (4885).
Prior to observing the test result, the prior probability that an American carries HIV is \(P(H) = 0.005\). The posterior probability that an American carries HIV given a positive test result is \(P(H|E)=0.062\). \[ \frac{P(H|E)}{P(H)} = \frac{0.0622}{0.005} = 12.44 \] An American who tests positive is about 12.4 times more likely to carry HIV than an American for whom the test result is not known.
So while 0.062 is still small in absolute terms, the posterior probability is much larger relative to the prior probability.
In this risk group, a person is 19 times more likely to not have HIV than to have it (\(0.95/0.05 = 19\)).
A positive test is 13.2 times less likely when the person does not have HIV than when they have it (\(0.074/0.977 = 1/13.2\)).
The product of these ratios is \(19(1/13.2) = 1.44\).

Since posterior is proportional to the product of prior and likelihood, a person in this risk group who tests positive is 1.44 times more likely to not have HIV than to have HIV.

The posterior probabilities of not having HIV and having HIV are in a 1.44 to 1 ratio, so the posterior probability of not having HIV is \(1.44/(1+1.44) = 0.59\) and the posterior probability of having HIV is 0.41.
Yes, the posterior probability is influenced by the prior probability. Even though the prior probability of 5% is still relatively small in absolute terms, the posterior probability of having HIV given a positive test (0.41) is now fairly close to 50/50.

Table 9.2: Bayes table for Exercise 2.19

hypothesis	prior	likelihood	product	posterior
Carries HIV	0.005	0.977	0.0049	0.0622
Does not carry HIV	0.995	0.074	0.0736	0.9378
sum	1.000	NA	0.0785	1.0000
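The Bayes table calculation (multiply prior by likelihood, then divide by the sum of the products) can be sketched in a few lines of Python. The function name `bayes_posterior` is hypothetical; the priors 0.005 and 0.05 and the likelihoods 0.977 and 0.074 are the values used in this solution.

```python
def bayes_posterior(priors, likelihoods):
    """Posterior probabilities: prior * likelihood, renormalized to sum to 1."""
    products = [prior * like for prior, like in zip(priors, likelihoods)]
    total = sum(products)  # overall probability of the evidence
    return [prod / total for prod in products]

# Likelihood of a positive test for (carries HIV, does not carry HIV).
likelihoods = [0.977, 1 - 0.926]

general_population = bayes_posterior([0.005, 0.995], likelihoods)
risk_group = bayes_posterior([0.05, 0.95], likelihoods)

print([round(p, 4) for p in general_population])
print([round(p, 2) for p in risk_group])
```

Only the prior changes between the two calls, which isolates how strongly the posterior depends on the prior.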

9.14 Solution to Exercise 2.20

Hypotheses are which player is best (A, B, C). Evidence is that A beats B. The likelihood is the probability that A beats B given each of the best players.

If A is best, probability A beats B is 2/3.
If B is best, probability A beats B is 1/3.
If C is best, probability A beats B is 1/2.

Compute the posterior probabilities as in the following Bayes table.

best_player	prior	likelihood_A_beats_B	product	posterior
A	0.50	0.6667	0.3333	0.6349
B	0.35	0.3333	0.1167	0.2222
C	0.15	0.5000	0.0750	0.1429
Total	1.00	1.5000	0.5250	1.0000

A’s probability of being the best increased, which makes sense because A won the match. B’s probability of being the best decreased considerably, which makes sense because B lost the match. C’s probability of being the best decreased slightly, despite C not being involved in the match. (We have now observed some actual evidence in A’s favor while we don’t have any observations about C yet, and this information asymmetry results in a decrease in C’s posterior probability.)
Hypotheses are which player is best (A, B, C). Evidence is that B beats A. The likelihood is the probability that B beats A given each of the best players.

If A is best, probability B beats A is 1/3.
If B is best, probability B beats A is 2/3.
If C is best, probability B beats A is 1/2.

best_player	prior	likelihood_B_beats_A	product	posterior
A	0.50	0.3333	0.1667	0.3509
B	0.35	0.6667	0.2333	0.4912
C	0.15	0.5000	0.0750	0.1579
Total	1.00	1.5000	0.4750	1.0000

A’s probability of being the best decreased, which makes sense because A lost the match. B’s probability of being the best increased, which makes sense because B won the match. C’s probability of being the best increased slightly, despite C not being involved in the match.
The prior is the posterior from the first part. Evidence is that A beats C. The likelihood is the probability that A beats C given each of the best players.

If A is best, probability A beats C is 2/3.
If B is best, probability A beats C is 1/2.
If C is best, probability A beats C is 1/3.

best_player	prior	likelihood_A_beats_C	product	posterior
A	0.6349	0.6667	0.4233	0.7273
B	0.2222	0.5000	0.1111	0.1909
C	0.1429	0.3333	0.0476	0.0818
Total	1.0000	1.5000	0.5820	1.0000

By winning both matches, A’s probability of being the best has increased considerably. By losing their only matches, B’s and C’s probabilities of being the best have decreased considerably.
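The three Bayes tables in this solution follow the same recipe, so they can be reproduced with one helper. This Python sketch uses the likelihood values 2/3, 1/3, and 1/2 stated above; the function names `likelihood` and `update` are introduced here for illustration, and exact fractions are used so the results can be compared with the rounded table entries.

```python
from fractions import Fraction as F

PRIOR = {"A": F(50, 100), "B": F(35, 100), "C": F(15, 100)}

def likelihood(winner, loser, best):
    """P(winner beats loser | best player), using the values from the solution."""
    if best == winner:
        return F(2, 3)
    if best == loser:
        return F(1, 3)
    return F(1, 2)  # the best player is not involved in this match

def update(prior, winner, loser):
    """One Bayes table: product of prior and likelihood, then normalize."""
    products = {b: prior[b] * likelihood(winner, loser, b) for b in prior}
    total = sum(products.values())
    return {b: products[b] / total for b in prior}

after_a_beats_b = update(PRIOR, "A", "B")
after_b_beats_a = update(PRIOR, "B", "A")
after_a_beats_b_then_c = update(after_a_beats_b, "A", "C")

for table in (after_a_beats_b, after_b_beats_a, after_a_beats_b_then_c):
    print({b: round(float(p), 4) for b, p in table.items()})
```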

9.15 Solution to Exercise 2.21

You’d pick A and your subjective probability of being correct is 0.5.
Use the law of total probability, conditioning on who is the best player (A, B, C) \[\begin{align*} \text{P}(\text{A beats B}) & = \text{P}(\text{A beats B}|A)\text{P}(A) + \text{P}(\text{A beats B}|B)\text{P}(B) + \text{P}(\text{A beats B}|C)\text{P}(C)\\ & = (2/3)(0.5) + (1/3)(0.35) + (1/2)(0.15) = 0.525 \end{align*}\]
Given that A beats B, we would predict A to be the best player, and we would have probability 0.6349 of being correct.
Given that B beats A, we would predict B to be the best player, and we would have probability 0.4912 of being correct.
Use the law of total probability, conditioning on the result of the first match (A beats B or B beats A). \[\begin{align*} \text{P}(\text{correct}) & = \text{P}(\text{correct} | \text{A beats B})\text{P}(\text{A beats B}) + \text{P}(\text{correct} | \text{B beats A}) \text{P}(\text{B beats A})\\ & = (0.6349)(0.5250) + (0.4912)(0.4750) = 0.5666 \end{align*}\] The information gained from observing the first match increases our probability of being correct from 0.5 to 0.5666.
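The same numbers can be obtained with exact fractions. In the Python sketch below (variable names are assumptions made here), the probability of being correct after seeing the match is computed as the sum, over the two possible results, of the largest joint probability; this equals the largest posterior times the probability of that result. The exact value is 17/30, about 0.5667, which differs from 0.5666 only because the calculation above uses rounded inputs.

```python
from fractions import Fraction as F

PRIOR = {"A": F(50, 100), "B": F(35, 100), "C": F(15, 100)}

# P(A beats B | best player), from the solution to Exercise 2.20.
P_A_BEATS_B = {"A": F(2, 3), "B": F(1, 3), "C": F(1, 2)}

# Joint probabilities P(best = b, match result).
joint_a_wins = {b: PRIOR[b] * P_A_BEATS_B[b] for b in PRIOR}
joint_b_wins = {b: PRIOR[b] * (1 - P_A_BEATS_B[b]) for b in PRIOR}

p_a_beats_b = sum(joint_a_wins.values())

# Without evidence we pick the player with the largest prior.
p_correct_before = max(PRIOR.values())
# With evidence we pick the player with the largest posterior for each result.
p_correct_after = max(joint_a_wins.values()) + max(joint_b_wins.values())

print(float(p_a_beats_b), float(p_correct_before), float(p_correct_after))
```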

9.15.1 Solution to Exercise 2.24

\(X\) can take values 1, 2, 3, \(\ldots\). Even though it is unlikely that \(X\) is very large, there is no fixed upper bound. Even though \(X\) can take infinitely many values, \(X\) is a discrete random variable because it takes countably many possible values.
\(X= 1\) only if she makes her first attempt, so \(\text{P}(X = 1) = 0.4\). If Maya does this every practice, then in about 40% of practices she will make her first three pointer on her first attempt.
\(X= 2\) only if she misses her first attempt and makes her second attempt, so since the attempts are independent \(\text{P}(X = 2) = (0.6)(0.4)=0.24\). If Maya does this every practice, then in about 24% of practices she will make her first three pointer on her second attempt.
In order for \(X\) to be 3, Maya must miss her first two attempts and make her third. Since the attempts are independent \(\text{P}(X=3)=(1-0.4)^2(0.4)=0.144\). If Maya does this every practice, then in about 14.4% of practices she will make her first three pointer on her third attempt.
The key is to realize that Maya requires more than 3 attempts to obtain her first success if and only if the first 3 attempts are failures. Therefore, \[ \text{P}(X > 3) = (1-0.4)^3 = 0.216 \]
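These probabilities follow one pattern: miss the first \(k-1\) attempts, then make attempt \(k\). The Python sketch below assumes independent attempts with success probability 0.4, as in the solution; the function names are hypothetical.

```python
P_MAKE = 0.4  # probability Maya makes any single attempt

def prob_first_make_on(k):
    """P(X = k): miss the first k - 1 attempts, then make attempt k."""
    return (1 - P_MAKE) ** (k - 1) * P_MAKE

def prob_more_than(k):
    """P(X > k): the first k attempts are all misses."""
    return (1 - P_MAKE) ** k

for k in (1, 2, 3):
    print(k, round(prob_first_make_on(k), 3))
print("P(X > 3) =", round(prob_more_than(3), 3))
```

As a consistency check, `prob_more_than(3)` equals one minus the sum of the first three probabilities.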

9.16 Solution to Exercise 2.25

Let \(E\) be the event that the population eventually goes extinct, and let \(D\) be the event that the original microorganism dies after the first minute; \(\textrm{P}(D) = 1/4\). Condition on the first “step” and use the law of total probability \[ p = \textrm{P}(E) = \textrm{P}(E|D)\textrm{P}(D) + \textrm{P}(E|D^c)\textrm{P}(D^c) = (1)(1/4) + \textrm{P}(E|D^c)(3/4) \] \(\textrm{P}(E|D) = 1\) since if the first microorganism dies the population goes extinct immediately.

The key is to find an expression for \(\textrm{P}(E|D^c)\) in terms of \(p\). If the first microorganism does not die (\(D^c\)) there are 2 microorganisms at the start of the second minute; let’s call them Marge and Homer. In order for the population to go extinct, we need Marge and all her descendants to go extinct, and the same for Homer. But Marge is just a single microorganism, so the probability that her line eventually goes extinct is \(p\); similarly the probability that Homer’s line goes extinct is \(p\). Since all microorganisms behave independently, the probability that both Marge and Homer’s lines eventually go extinct is \((p)(p)=p^2\). That is, \(\textrm{P}(E | D^c) = p^2\).

Plugging into the equation above yields \[ p = (1)(1/4) + p^2(3/4) \]

Solve (quadratic formula) this equation to get¹ \(p= 1/3\). The probability that the population eventually goes extinct is 1/3. This microorganism population is 2 times more likely to survive forever than to go extinct!
The process is the same as the above, with 3/4 replaced by \(s\) \[ p = (1)(1-s) + p^2s \] Solving gives two solutions, 1 and \(1/s - 1\). However, if \(s<1/2\) then \(1/s - 1 > 1\), which is not a valid probability. Therefore the probability of eventual extinction is 1 if \(s \le 1/2\), and \(1/s - 1<1\) if \(s > 1/2\).
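One way to see which root is the extinction probability is to iterate the same conditioning argument: the probability of extinction within \(n\) minutes satisfies \(p_n = (1-s) + s\,p_{n-1}^2\) with \(p_0 = 0\). The Python sketch below (the function name is hypothetical) assumes, as in the solution, that each microorganism independently either dies with probability \(1-s\) or splits into two with probability \(s\) each minute.

```python
def extinct_within(s, minutes):
    """P(population started by one organism is extinct within `minutes` minutes)."""
    p = 0.0  # the population cannot be extinct after zero minutes
    for _ in range(minutes):
        # Either the organism dies now, or it splits and both lines
        # go extinct within the remaining time.
        p = (1 - s) + s * p**2
    return p

for s in (0.4, 0.75, 0.8):
    print(s, round(extinct_within(s, 500), 6))
```

For \(s = 3/4\) the iteration settles at 1/3, and for \(s = 0.4\) it approaches 1, in line with the two cases above. Near \(s = 1/2\) the iteration converges slowly, so many more steps would be needed there.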

9.17 Solution to Exercise 2.27

It is helpful to construct a two-way table to answer the following questions.

	x
	3	4	5	Total
A wins	0.55³ = 0.166	3(0.55³)(0.45) = 0.225	6(0.55³)(0.45)² = 0.202	0.593
A does not win	0.45³ = 0.091	3(0.45³)(0.55) = 0.150	6(0.45³)(0.55)² = 0.165	0.407
Total	0.258	0.375	0.368	1

There is only one outcome for which A wins in 3 games: AAA. There are three outcomes in which A wins in 4 games: AABA, ABAA, BAAA. (Not AAAB because then the series would be over in 3 games.) Since the games are independent, an outcome like AABA has probability \((0.55^3)(0.45)\), so the probability that A wins in 4 games is \(3(0.55^3)(0.45)\). There are six outcomes in which A wins in 5 games: AABBA, ABABA, BAABA, ABBAA, BABAA, BBAAA. (Not outcomes like AAABB or AABAB because then the series would be over in 3 or 4 games.) Since the games are independent, an outcome like AABBA has probability \((0.55^3)(0.45)^2\), so the probability that A wins in 5 games is \(6(0.55^3)(0.45)^2\). You can fill in the rest of the table similarly.

The probability that team A wins the series in 3 games is \(\text{P}(X=3, A) = 0.55^3=0.166\).

Either the stronger team wins 3 in a row or loses 3 in a row. The probability is \(0.55^3+(1-0.55)^3=0.2575\).
The probability that team A wins the series is the sum of first row: \(\text{P}(A) = 0.593\).
No. \(\text{P}(X=3, A) = 0.55^3=0.166 \neq 0.153 = (0.2575)(0.593) = \text{P}(X=3)\text{P}(A)\). Alternatively, \(\text{P}(X = 3 | A) = 0.166/0.593 = 0.2799 \neq 0.2575 = \text{P}(X = 3)\). Given that A wins the series, the series is more likely to end in 3 games than when B wins the series.
Total row above. \(X\) can take values 3, 4, or 5. Consider first the ways in which the stronger team wins in 4 games: AABA, ABAA, BAAA. (For example, AABA means the stronger team wins games 1, 2, and 4, and the weaker team wins game 3.) Each of these outcomes has probability \(0.55^3(0.45)\) so the probability that team A wins in 4 games is \(3(0.55)^3(0.45)\). Similarly, the probability that team B wins in 4 games is \(3(0.45)^3(0.55)\). So \[ \text{P}(X = 4) = 3(0.55)^3(0.45) + 3(0.45)^3(0.55) = 0.3750 \] You could find \(\text{P}(X=5)\) in a similar way, or just use the fact that the probabilities have to add up to 1. The distribution of \(X\) is given by the following table. \[\begin{align*} x & \qquad & & \qquad \text{P}(X = x)\\ 3 & \qquad & & \qquad 0.2575\\ 4 & \qquad & & \qquad 0.3750\\ 5 & \qquad & & \qquad 0.3675\\ \end{align*}\]
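The two-way table can also be built by enumeration. The Python sketch below is an illustration, not the book's code: it assumes independent games that A wins with probability 0.55 and a series that stops as soon as one team has 3 wins. It loops over all 32 hypothetical five-game sequences and credits each one to the point where the series would actually have ended; the function name `series_distribution` is introduced here.

```python
from itertools import product

def series_distribution(p_a=0.55, wins_needed=3):
    """Return {(games_played, winner): probability} for a first-to-3 series."""
    dist = {}
    max_games = 2 * wins_needed - 1
    for games in product("AB", repeat=max_games):
        a_wins = b_wins = 0
        for n_played, game in enumerate(games, start=1):
            if game == "A":
                a_wins += 1
            else:
                b_wins += 1
            if a_wins == wins_needed or b_wins == wins_needed:
                break  # the series ends here; later games are never played
        winner = "A" if a_wins == wins_needed else "B"
        # Probability of the full sequence; summing over the unplayed games
        # recovers the probability of the truncated series.
        prob = p_a ** games.count("A") * (1 - p_a) ** games.count("B")
        key = (n_played, winner)
        dist[key] = dist.get(key, 0.0) + prob
    return dist

dist = series_distribution()
p_x = {x: dist[(x, "A")] + dist[(x, "B")] for x in (3, 4, 5)}
p_a_wins = sum(p for (x, winner), p in dist.items() if winner == "A")

print({k: round(v, 3) for k, v in sorted(dist.items())})
print({x: round(p, 4) for x, p in p_x.items()}, round(p_a_wins, 3))
```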

9.18 Solution to Exercise 2.26

The prior is the posterior from the first part. Evidence is that A beats C. The likelihood is the probability that A beats C given each of the best players.

If A is best, probability A beats C is 2/3.
If B is best, probability A beats C is 1/2.
If C is best, probability A beats C is 1/3.

best_player	prior	likelihood_A_beats_C	product	posterior
A	0.6349	0.6667	0.4233	0.7273
B	0.2222	0.5000	0.1111	0.1909
C	0.1429	0.3333	0.0476	0.0818
Total	1.0000	1.5000	0.5820	1.0000

By winning both matches, A’s probability of being the best has increased considerably. By losing their only matches, B’s and C’s probabilities of being the best have decreased considerably.

The prior is the posterior from the previous part. Evidence is that B beats C. The likelihood is the probability that B beats C given each of the best players.

If A is best, probability B beats C is 1/2.
If B is best, probability B beats C is 2/3.
If C is best, probability B beats C is 1/3.

best_player	prior	likelihood_B_beats_C	product	posterior
A	0.7273	0.5000	0.3636	0.7018
B	0.1909	0.6667	0.1273	0.2456
C	0.0818	0.3333	0.0273	0.0526
Total	1.0000	1.5000	0.5182	1.0000

By winning both matches, A’s probability of being the best has increased considerably. By losing one match and winning one, B’s probability of being the best decreased somewhat. By losing both matches, C’s probability of being the best has decreased considerably.

The prior is the original prior. Evidence is that A beats B and A beats C and B beats C in three conditionally independent matches. The likelihood is the probability of these match results given each of the best players.

If A is best, likelihood is (2/3)(2/3)(1/2).
If B is best, likelihood is (1/3)(1/2)(2/3).
If C is best, likelihood is (1/2)(1/3)(1/3).

best_player	prior	likelihood	product	posterior
A	0.50	0.2222	0.1111	0.7018
B	0.35	0.1111	0.0389	0.2456
C	0.15	0.0556	0.0083	0.0526
Total	1.00	0.3889	0.1583	1.0000

The posterior is the same. It doesn’t matter if the posterior is updated after each match, or at once after all three matches.
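The equivalence of updating match by match and updating all at once can be confirmed with exact arithmetic. The Python sketch below uses the original prior and the 2/3, 1/3, 1/2 likelihoods from the solution; the helper names `likelihood` and `normalize` are assumptions introduced here, and the matches are treated as conditionally independent given the best player.

```python
from fractions import Fraction as F

PRIOR = {"A": F(50, 100), "B": F(35, 100), "C": F(15, 100)}
MATCHES = [("A", "B"), ("A", "C"), ("B", "C")]  # (winner, loser) pairs

def likelihood(winner, loser, best):
    """P(winner beats loser | best player)."""
    if best == winner:
        return F(2, 3)
    if best == loser:
        return F(1, 3)
    return F(1, 2)

def normalize(products):
    total = sum(products.values())
    return {b: p / total for b, p in products.items()}

# Sequential: the posterior after each match becomes the next prior.
sequential = dict(PRIOR)
for winner, loser in MATCHES:
    sequential = normalize(
        {b: sequential[b] * likelihood(winner, loser, b) for b in sequential}
    )

# Batch: multiply the three likelihoods together, then update once.
batch_products = {}
for b in PRIOR:
    joint_likelihood = F(1)
    for winner, loser in MATCHES:
        joint_likelihood *= likelihood(winner, loser, b)
    batch_products[b] = PRIOR[b] * joint_likelihood
batch = normalize(batch_products)

print({b: round(float(p), 4) for b, p in batch.items()})
print(sequential == batch)
```

The normalizing constants from the intermediate steps cancel, which is why the two routes agree.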

Technically, there are two solutions, 1 and \(1/3\). There are some technical justifications that can be made to show that the extinction probability is the smaller of the two solutions, but this is beyond our scope.↩︎
