Unit 8 Probability And Sampling — Unit Plan

Lesson 7
Simulating Multi-step Experiments

The more complex a situation is, the harder it can be to estimate the probability of a particular event happening. Well-designed simulations are a way to estimate a probability in a complex situation, especially when it would be difficult or impossible to determine the probability from reasoning alone.

To design a good simulation, we need to know something about the situation. For example, if we want to estimate the probability that it will rain every day for the next three days, we could look up the weather forecast for the next three days. Here is a table showing a weather forecast:

                     today (Tuesday)   Wednesday   Thursday   Friday
probability of rain  0.2               0.4         0.5        0.9

We can set up a simulation of the next three days (Wednesday, Thursday, and Friday) with three bags, one bag for each day.

  • In the first bag, we put 4 slips of paper that say “rain” and 6 that say “no rain.”
  • In the second bag, we put 5 slips of paper that say “rain” and 5 that say “no rain.”
  • In the third bag, we put 9 slips of paper that say “rain” and 1 that says “no rain.”

Then we can select 1 slip of paper from each bag and record whether or not the simulation predicts that there will be rain on all three days. If we repeat this experiment many times, we can estimate the probability that there will be rain on all three days by dividing the number of times all three slips say “rain” by the total number of times we perform the simulation.
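
The bag experiment above can also be run on a computer. Here is a minimal sketch, assuming each bag holds 10 slips and that the draws from the three bags do not affect one another. The names `RAIN_SLIPS`, `draw_three_days`, and `estimate_rain_all_three_days` are made up for this illustration.

```python
import random

# Number of "rain" slips (out of 10) in the bags for Wednesday, Thursday, Friday.
RAIN_SLIPS = [4, 5, 9]
SLIPS_PER_BAG = 10


def draw_three_days(rng):
    """One trial: draw one slip from each bag; True means the slip says "rain"."""
    return [rng.randrange(SLIPS_PER_BAG) < rain for rain in RAIN_SLIPS]


def estimate_rain_all_three_days(trials, seed=0):
    """Fraction of trials in which all three slips say "rain"."""
    rng = random.Random(seed)
    all_rain = sum(all(draw_three_days(rng)) for _ in range(trials))
    return all_rain / trials


rain_estimate = estimate_rain_all_three_days(100_000)
print(f"Estimated probability of rain on all three days: {rain_estimate:.3f}")
```

Under the same assumption of independent draws, the estimate should settle near 0.4 × 0.5 × 0.9 = 0.18 as the number of trials grows.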

Battery Life (1 problem)

The probability of a certain brand of battery going dead within 15 hours is $\frac{1}{3}$. Noah has a toy that requires 4 of these batteries. He wants to estimate the probability that at least one battery will die before 15 hours are up.

  1. Noah will simulate the situation by putting marbles in a bag. Drawing one marble from the bag will represent the outcome of one of the batteries in the toy after 15 hours. Red marbles represent a battery that dies before 15 hours are up, and green marbles represent a battery that lasts longer.

    How many marbles of each color should he put in the bag? Explain your reasoning.

  2. After doing the simulation 5 times, Noah has these results. What should he use as an estimate of the probability that at least one battery will die within 15 hours?
    trial result
    1 GGRG
    2 GRGR
    3 GGGG
    4 RGGG
    5 GGGR
Solution
  1. 1 red marble and 2 green marbles (or some multiple of these). Sample reasoning: Based on the probability of each battery dying, $\frac{1}{3}$ of the marbles should be red.
  2. $\frac{4}{5}$ or equivalent, since 4 of the 5 trials include at least one red marble.
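
As a rough check on this solution, here is a sketch that tallies Noah's five recorded trials and then repeats the marble experiment many more times. It assumes a bag of 1 red and 2 green marbles, with each marble returned to the bag before the next draw. The function names are invented for the example.

```python
import random
from fractions import Fraction

# Noah's recorded trials (R = battery dies within 15 hours, G = battery lasts).
recorded_trials = ["GGRG", "GRGR", "GGGG", "RGGG", "GGGR"]
trials_with_failure = sum("R" in trial for trial in recorded_trials)
noah_estimate = Fraction(trials_with_failure, len(recorded_trials))
print(noah_estimate)  # 4/5


def simulate_toy(rng):
    """Draw 4 marbles with replacement from a bag of 1 red and 2 green."""
    bag = ["R", "G", "G"]
    return "".join(rng.choice(bag) for _ in range(4))


def estimate_at_least_one_dead(trials, seed=0):
    """Fraction of simulated toys in which at least one battery dies."""
    rng = random.Random(seed)
    return sum("R" in simulate_toy(rng) for _ in range(trials)) / trials


long_run_estimate = estimate_at_least_one_dead(100_000)
print(f"Estimate from 100,000 trials: {long_run_estimate:.3f}")
```

Five trials give only a coarse estimate. With many more trials, and assuming the batteries fail independently, the estimate should approach 1 − (2/3)⁴ = 65/81, or about 0.80.
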
Lesson 8
Keeping Track of All Possible Outcomes

Sometimes we need a systematic way to count the number of outcomes that are possible in a given situation. For example, suppose there are 3 people (A, B, and C) who want to run for the president of a club and 4 different people (1, 2, 3, and 4) who want to run for vice president of the club. We can use a tree, a table, or an ordered list to count how many different combinations are possible for a president to be paired with a vice president.

With a tree, we can start with a branch for each of the people who want to be president. Then for each possible president, we add a branch for each possible vice president, for a total of $3 \cdot 4 = 12$ possible pairs. We can also start by counting vice presidents first and then adding a branch for each possible president, for a total of $4 \cdot 3 = 12$ possible pairs.

Tree diagram with three branches for the first choice, labeled “A,” “B”, and “C.” Choices “A”, “B”, and “C” each have four branches labeled with a different number from 1 through 4.

Tree diagram with four branches for the first choice, labeled 1, 2, 3, and 4. Choices 1, 2, 3, and 4 each have three branches, labeled with a different letter “A,” “B,” or “C.”

A table can show the same result:

     1    2    3    4
A    A1   A2   A3   A4
B    B1   B2   B3   B4
C    C1   C2   C3   C4

So does this ordered list:

A1, A2, A3, A4, B1, B2, B3, B4, C1, C2, C3, C4
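
The same ordered list can be generated by a program. This sketch uses Python's `itertools.product`, which pairs every item in the first list with every item in the second, much like following each branch of the tree. The variable names are chosen for this example.

```python
from itertools import product

presidents = ["A", "B", "C"]
vice_presidents = ["1", "2", "3", "4"]

# Pair each possible president with each possible vice president.
pairs = [p + v for p, v in product(presidents, vice_presidents)]

print(", ".join(pairs))
print(len(pairs))  # 12
```

Swapping the order of the two lists corresponds to starting the tree with vice presidents. The number of pairs stays the same.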

Shirt Options (1 problem)

A school club is selling shirts for a fundraiser. Shirts come with one of each of these options:

  • size: small, medium, large, extra large
  • color: black, white
  • design: logo on front, no logo on front

  1. How many different shirts are available with these options? Explain or show your reasoning.
  2. Diego wants a medium-sized shirt. How many shirts can he choose from with that requirement?
Solution
  1. 16. Sample reasoning: $4 \cdot 2 \cdot 2 = 16$
  2. 4
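
One way to check both answers is to list every shirt. This is a small sketch with illustrative variable names, again using `itertools.product`.

```python
from itertools import product

sizes = ["small", "medium", "large", "extra large"]
colors = ["black", "white"]
designs = ["logo on front", "no logo on front"]

# Every shirt is one (size, color, design) combination.
shirts = list(product(sizes, colors, designs))
medium_shirts = [shirt for shirt in shirts if shirt[0] == "medium"]

print(len(shirts))         # 16
print(len(medium_shirts))  # 4
for shirt in medium_shirts:
    print(shirt)
```
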
Lesson 9
Multi-step Experiments

Suppose we have two bags. One contains 1 star block and 4 moon blocks. The other contains 3 star blocks and 1 moon block.

If we select 1 block at random from each, what is the probability that we will get 2 star blocks or 2 moon blocks?

Two bags of blocks. The bag on the left contains 5 blocks: 1 star block and 4 moon blocks. The bag on the right contains 4 blocks: 3 star blocks and 1 moon block.

To answer this question, we can draw a tree diagram to see all of the possible outcomes.

A tree diagram. The first choice has 5 branches, representing the 5 blocks in the bag: one branch is labeled “star,” the other 4 are labeled “moon.” Each of these branches has 4 branches, representing the 4 blocks in the second bag. 3 branches are labeled “star” and one is labeled “moon.” The word “star” in the first choice, and the 3 “star” choices branching from it are highlighted gold. From the first choice, the word “moon” is highlighted blue on each of the four remaining branches. From each of those branches, the one choice of “moon” for each is also highlighted blue.

There are $5 \cdot 4 = 20$ possible outcomes. Of these, 3 of them are both stars, and 4 are both moons. So the probability of getting 2 star blocks or 2 moon blocks is $\frac{7}{20}$.

In general, if all outcomes in an experiment are equally likely, then the probability of an event is the fraction of outcomes in the sample space for which the event occurs.
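
The tree diagram for the two bags can be mirrored in code by listing the sample space and counting the outcomes where the event occurs. This is a sketch that treats every block in a bag as equally likely to be picked. The names `bag_one`, `bag_two`, and `matching` are just for illustration.

```python
from fractions import Fraction
from itertools import product

bag_one = ["star"] + ["moon"] * 4   # 1 star block and 4 moon blocks
bag_two = ["star"] * 3 + ["moon"]   # 3 star blocks and 1 moon block

# Sample space: one block from each bag, all pairs equally likely.
outcomes = list(product(bag_one, bag_two))

# Event: both blocks are stars or both blocks are moons.
matching = [pair for pair in outcomes if pair[0] == pair[1]]

probability = Fraction(len(matching), len(outcomes))
print(len(outcomes), len(matching), probability)  # 20 7 7/20
```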

A Number Cube and 10 Cards (1 problem)

Lin plays a game that involves a standard number cube and a deck of ten cards numbered 1 through 10. If both the cube and card have the same number, Lin gets another turn. Otherwise, play continues with the next player.

What is the probability that Lin gets another turn?

Solution

$\frac{6}{60}$ (or equivalent), since there are 6 outcomes for which the numbers match and 60 equally likely outcomes in the sample space ($6 \cdot 10 = 60$)
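
The count in this solution can be confirmed by listing all cube and card pairs. The sketch below assumes a standard cube numbered 1 through 6. `Fraction` reduces the answer automatically, so it prints the equivalent form 1/10.

```python
from fractions import Fraction
from itertools import product

cube_faces = range(1, 7)    # standard number cube: 1 through 6
card_values = range(1, 11)  # cards numbered 1 through 10

sample_space = list(product(cube_faces, card_values))
matches = [(cube, card) for cube, card in sample_space if cube == card]

another_turn = Fraction(len(matches), len(sample_space))
print(len(matches), len(sample_space), another_turn)  # 6 60 1/10
```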

Lesson 10
Designing Simulations

Many real-world situations are difficult to repeat enough times to get an estimate for a probability. If we can find probabilities for parts of the situation, we may be able to simulate the situation using a process that is easier to repeat.

For example, if we know that each egg of a fish in a science experiment has a 13% chance of having a mutation, how many eggs do we need to collect to make sure we have 10 mutated eggs? If getting these eggs is difficult or expensive, it might be helpful to have an idea about how many eggs we need before trying to collect them.

Photograph of fish eggs.

We could simulate this situation by having a computer select random whole numbers from 1 to 100. If the number is from 1 to 13, it counts as a mutated egg. Any other number would represent a normal egg. This matches the 13% chance of each fish egg having a mutation.

We could continue asking the computer for random numbers until we get 10 numbers that are from 1 to 13. The number of times we asked the computer for a random number would give us an estimate of the number of fish eggs we would need to collect.

To improve the estimate, this entire process should be repeated many times. Because computers can perform simulations quickly, we could simulate the situation 1,000 times or more.
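
Here is one way the process described above might look as a program. It is a sketch rather than a prescribed method. The function names, the fixed seed, and the choice of 1,000 repetitions are assumptions made for the example.

```python
import random


def eggs_needed(rng, target=10, chance_percent=13):
    """Ask for random integers from 1 to 100 until `target` of them are
    at most `chance_percent`; return how many numbers were requested."""
    eggs = 0
    mutated = 0
    while mutated < target:
        eggs += 1
        if rng.randint(1, 100) <= chance_percent:  # 1..13 means a mutated egg
            mutated += 1
    return eggs


def repeat_simulation(repetitions=1000, seed=0):
    """Run the whole process many times and collect the egg counts."""
    rng = random.Random(seed)
    return [eggs_needed(rng) for _ in range(repetitions)]


egg_counts = repeat_simulation()
average_eggs = sum(egg_counts) / len(egg_counts)
print(f"Average eggs needed: {average_eggs:.1f}")
print(f"Fewest: {min(egg_counts)}, most: {max(egg_counts)}")
```

The average gives a typical number of eggs, which should be near 10 ÷ 0.13, or about 77. The spread between the fewest and the most shows that a single collection could need noticeably more than that.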

The Best Power-Up (1 problem)

Elena is programming a video game. She needs to simulate the power-up that the player gets when they reach a certain level. The computer can run a program to return a random integer between 1 and 100. Elena wants the best power-up to be given 15% of the time.

Explain how Elena could use the computer to simulate the player getting the best power-up at least 2 out of 3 times.

Solution

Sample response: Elena could have the computer generate 3 random integers between 1 and 100. If at least 2 of the numbers are between 1 and 15, then the player gets the best power-up at least twice. She could repeat this process many times and estimate the probability as the proportion of trials for which at least 2 of the numbers are between 1 and 15.
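
The sample response can be turned into a short program. This is a sketch under the same assumptions as the response (random integers from 1 to 100, with 1 through 15 meaning the best power-up). The function names are invented for illustration.

```python
import random


def best_power_up_at_least_twice(rng):
    """One trial: generate 3 random integers from 1 to 100 and check
    whether at least 2 of them are from 1 to 15."""
    draws = [rng.randint(1, 100) for _ in range(3)]
    return sum(draw <= 15 for draw in draws) >= 2


def estimate_at_least_twice(trials, seed=0):
    """Proportion of trials with the best power-up at least 2 out of 3 times."""
    rng = random.Random(seed)
    return sum(best_power_up_at_least_twice(rng) for _ in range(trials)) / trials


power_up_estimate = estimate_at_least_twice(100_000)
print(f"Estimated probability: {power_up_estimate:.3f}")
```

If the three draws are independent, the estimate should come out near 3 × 0.15² × 0.85 + 0.15³ ≈ 0.061.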

Section B Check
Section B Checkpoint
Lesson 15
Estimating Population Measures of Center

Some populations have greater variability than others. For example, we would expect greater variability in the weights of dogs at a dog park than at a beagle meetup.

Dog park:

A picture of 2 small dogs, 2 medium sized dogs, and 3 large dogs.

Mean weight: 12.8 kg       MAD: 2.3 kg

Beagle meetup:

A picture of 7 similar sized beagle dogs.

Mean weight: 10.1 kg       MAD: 0.8 kg

The lower MAD indicates that there is less variability in the weights of the beagles. We would expect that the mean weight from a sample that is randomly selected from a group of beagles will provide a more accurate estimate of the mean weight of all the beagles than a sample of the same size from the dogs at the dog park.

In general, if samples from a population have similar sizes, a sample with less variability is more likely to have a mean that is close to the population mean.

More Accurate Estimate (1 problem)

Here are dot plots that represent samples from two different populations.

Sample 1:

A dot plot labeled "Sample 1" with the numbers 15 through 75, in increments of 5, indicated. The data are as follows: 15, 1 dot. 17, 1 dot. 19, 1 dot. 20, 10 dots. 21, 7 dots. 22, 8 dots. 23, 6 dots. 24, 8 dots. 25, 7 dots. 26, 14 dots. 27, 11 dots. 28, 8 dots. 29, 7 dots. 30, 5 dots. 32, 2 dots. 34, 3 dots. 36, 1 dot.

Sample 2:

A dot plot titled "Sample 2" with the numbers 15 through 75, in increments of 5, indicated. The data are as follows: 29, 1 dot. 31, 2 dots. 35, 2 dots. 36, 1 dot. 38, 4 dots. 39, 2 dots. 41, 2 dots. 42, 2 dots. 43, 1 dot. 44, 5 dots. 45, 3 dots. 46, 4 dots. 47, 3 dots. 48, 9 dots. 49, 7 dots. 50, 4 dots. 51, 6 dots. 52, 4 dots. 53, 4 dots. 54, 8 dots. 55, 5 dots. 56, 5 dots. 57, 1 dot. 58, 2 dots. 59, 4 dots. 60, 3 dots. 63, 4 dots. 64, 1 dot. 73, 1 dot.

  1. Estimate the mean of each population using these samples.
  2. Based on the dot plots, which estimate is more likely to be accurate? Explain your reasoning.
Solution
  1. Correct responses should be close to 25 and 50 respectively.
  2. The estimate for Sample 1 is probably more accurate since there is much less variability in the data.
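
For readers who want to go beyond eyeballing the dot plots, here is a sketch that computes each sample's mean and mean absolute deviation (MAD) from the dot counts listed in the plot descriptions. The helper names `expand`, `mean`, and `mean_absolute_deviation` are made up for this example.

```python
# Dot counts copied from the plot descriptions: value -> number of dots.
sample_1_counts = {15: 1, 17: 1, 19: 1, 20: 10, 21: 7, 22: 8, 23: 6, 24: 8,
                   25: 7, 26: 14, 27: 11, 28: 8, 29: 7, 30: 5, 32: 2, 34: 3,
                   36: 1}
sample_2_counts = {29: 1, 31: 2, 35: 2, 36: 1, 38: 4, 39: 2, 41: 2, 42: 2,
                   43: 1, 44: 5, 45: 3, 46: 4, 47: 3, 48: 9, 49: 7, 50: 4,
                   51: 6, 52: 4, 53: 4, 54: 8, 55: 5, 56: 5, 57: 1, 58: 2,
                   59: 4, 60: 3, 63: 4, 64: 1, 73: 1}


def expand(counts):
    """Turn {value: dots} into a flat list with one entry per dot."""
    return [value for value, dots in counts.items() for _ in range(dots)]


def mean(values):
    return sum(values) / len(values)


def mean_absolute_deviation(values):
    """Average distance of the values from their mean."""
    center = mean(values)
    return sum(abs(value - center) for value in values) / len(values)


for name, counts in [("Sample 1", sample_1_counts), ("Sample 2", sample_2_counts)]:
    data = expand(counts)
    print(f"{name}: n = {len(data)}, mean = {mean(data):.2f}, "
          f"MAD = {mean_absolute_deviation(data):.2f}")
```

The means should land close to 25 and 50, and the smaller MAD for Sample 1 is the numerical counterpart of "much less variability."
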
Lesson 16
Estimating Population Proportions

Sometimes a data set consists of information that fits into specific categories. For example, we could survey students about whether they have a pet cat or dog. The categories for these data would be {neither, dog only, cat only, both}. Suppose we surveyed 10 students. Here is a table showing possible results:

option number of responses
neither dog nor cat 2
dog only 4
cat only 1
both dog and cat 3

In this sample, 3 of the students say they have both a dog and a cat. We can say that the proportion of these students who have both a dog and a cat is $\frac{3}{10}$ or 0.3. If this sample is representative of all 720 students at the school, we can predict that about $\frac{3}{10}$ of 720, or about 216 students at the school, have both a dog and a cat.

In general, a proportion is a number from 0 to 1 that represents the fraction of the data that belongs to a given category.
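
The arithmetic in the pet survey example can be sketched as follows. The dictionary and variable names are chosen for illustration, and the prediction relies on the same assumption as the text: that the sample is representative of the school.

```python
from fractions import Fraction

# Survey results from the table: category -> number of responses.
responses = {
    "neither dog nor cat": 2,
    "dog only": 4,
    "cat only": 1,
    "both dog and cat": 3,
}
sample_size = sum(responses.values())

# Proportion of the sample in each category (a number from 0 to 1).
proportions = {option: Fraction(count, sample_size)
               for option, count in responses.items()}

school_size = 720
predicted_both = proportions["both dog and cat"] * school_size

print(proportions["both dog and cat"])  # 3/10
print(predicted_both)                   # 216
```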

More than 48 Grams (1 problem)

A chemical engineer is trying to increase the amount of the useful product in a reaction. She performs the reaction with her new equipment 10 times and gets the following amounts of the useful product in grams:

47.1, 48.2, 48.3, 47.5, 48.5, 48.1, 47.2, 48.2, 48.4, 48.3

  1. What proportion of the reactions produced more than 48 grams of the useful product?
  2. Other chemists typically get 65% of their reactions to produce more than 48 grams. Should the engineer say that she is able to increase the useful product when compared to the other chemists?
Solution
  1. 0.7, since 7 of the 10 reactions have more than 48 grams of the useful product
  2. Sample response: She could be optimistic, but her proportion does not seem far from what others have done. She should run more reactions to be more sure of the improvement. With only 10 values in her data set, 0.6 and 0.7 are the closest proportions to 0.65 that she could get.
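
A short sketch of the calculation behind this solution. The variable names are invented for the example. The last lines list the proportions that are possible with 10 reactions and pick out the ones nearest to 0.65.

```python
from fractions import Fraction

# Amounts of useful product, in grams, from the engineer's 10 reactions.
amounts = [47.1, 48.2, 48.3, 47.5, 48.5, 48.1, 47.2, 48.2, 48.4, 48.3]

above_threshold = sum(amount > 48 for amount in amounts)
proportion_above = Fraction(above_threshold, len(amounts))
print(above_threshold, proportion_above)  # 7 7/10

# With 10 reactions, the only possible proportions are 0/10, 1/10, ..., 10/10.
possible = [Fraction(k, 10) for k in range(11)]
typical = Fraction(65, 100)
smallest_gap = min(abs(p - typical) for p in possible)
closest = [p for p in possible if abs(p - typical) == smallest_gap]
print(closest)
```

Because 0.65 falls exactly halfway between 0.6 and 0.7, a sample of 10 cannot distinguish "typical" from "slightly better," which is why more reactions are needed.
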
Lesson 18
Comparing Populations Using Samples

Sometimes we want to compare two different populations. For example, is there a meaningful difference between the weights of pugs and beagles? Here are histograms showing the weights for a sample of dogs from each of these breeds:

A histogram for two different populations: On the horizontal axis, the numbers 6 through 11, in increments of zero point 5, are indicated. The label “pug weights in kilograms” is indicated for the numbers 6 through 8 and “beagle weights in kilograms” is indicated for the numbers 9 through 11. On the vertical axis, the numbers 0 through 8 are indicated. The data represented by the bars are as follows: Pug weights in kilograms: Weight from 6 up to 6 point 5, 5. Weight from 6 point 5 up to 7, 5. Weight from 7 up to 7 point 5, 7. Weight from 7 point 5 up to 8, 3. A triangle is located at 6 point 9 kilograms. Beagle weights in kilograms: Weight from 9 up to 9 point 5, 3. Weight from 9 point 5 up to 10, 3. Weight from 10 up to 10 point 5, 8. Weight from 10 point 5 up to 11, 6. A triangle is located at 10 point 1.

The red triangles show the mean weight of each sample, 6.9 kg for the pugs and 10.1 kg for the beagles. The red lines show the weights that are within 1 MAD of the mean. We can think of these as typical weights for the breed. These typical weights do not overlap. In fact, the distance between the means is $10.1 - 6.9$, or 3.2 kg, over 6 times the larger MAD! So we can say there is a meaningful difference between the weights of pugs and beagles.

Is there a meaningful difference between the weights of male pugs and female pugs? Here are box plots showing the weights for a sample of male pugs and a sample of female pugs:

Two box plots labeled “male pug weights in kilograms” and “female pug weights in kilograms” are indicated. The numbers 4 through 8 point 5, in increments of zero point 5, are indicated. The five-number summaries for the box plots are as follows: Male pug weights in kilograms: Minimum value, 6 point 4. Maximum value, 8 point 3. Q1, 7 point 2. Q2, 7 point 6. Q3, 7 point 9. Female pug weights in kilograms: Minimum value, 6 point 2. Maximum value, 8. Q1, 6 point 4. Q2, 6 point 9. Q3, 7 point 3.

We can see that the medians are different, but the weights between the first and third quartiles overlap. Based on these samples, we would say there is not a meaningful difference between the weights of male pugs and female pugs.

In general, if the measures of center for two samples are at least two measures of variability apart, we say the difference in the measures of center is meaningful. Visually, this means the ranges of typical values do not overlap. If they are closer, then we don't consider the difference to be meaningful.

Teachers Watching Movies (1 problem)

Noah is interested in comparing the number of movies watched by students and teachers over the winter break. He takes a random sample of 10 students and 10 teachers and makes a dot plot of their responses.

students
A dot plot for “movies watched over break” titled "Students." The numbers 0 through 8 are indicated. The data are as follows: 4 movies, 1 dot. 5 movies, 3 dots. 6 movies, 4 dots. 7 movies, 2 dots.

teachers
A dot plot for “movies watched over break” titled "Teachers." The numbers 0 through 8 are indicated. The data are as follows: 1 movie, 1 dot. 2 movies, 4 dots. 3 movies, 3 dots. 4 movies, 1 dot. 5 movies, 1 dot.

Noah then computes the measures of center and variability for each group:

  • Students: mean: 5.7 movies, MAD: 0.76 movies
  • Teachers: mean: 2.7 movies, MAD: 0.9 movies

Should Noah conclude that there is a meaningful difference in the mean number of movies watched over winter break between the two groups? Explain your reasoning.

Solution

Yes. Sample reasoning: Because the difference in the means is greater than 2 MADs, there is a meaningful difference in the mean number of movies watched. $(5.7 - 2.7) \div 0.9 \approx 3.33$
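
Noah's numbers can be reproduced from the dot plots. The sketch below recomputes each group's mean and MAD and then divides the difference in means by the larger MAD. The helper names are invented for illustration, and the threshold of 2 MADs is the rule of thumb stated in the lesson summary.

```python
# Dot counts from the plot descriptions: movies watched -> number of people.
student_counts = {4: 1, 5: 3, 6: 4, 7: 2}
teacher_counts = {1: 1, 2: 4, 3: 3, 4: 1, 5: 1}


def expand(counts):
    """Turn {value: dots} into a flat list with one entry per dot."""
    return [value for value, dots in counts.items() for _ in range(dots)]


def mean(values):
    return sum(values) / len(values)


def mean_absolute_deviation(values):
    """Average distance of the values from their mean."""
    center = mean(values)
    return sum(abs(value - center) for value in values) / len(values)


students = expand(student_counts)
teachers = expand(teacher_counts)

difference = mean(students) - mean(teachers)
larger_mad = max(mean_absolute_deviation(students),
                 mean_absolute_deviation(teachers))
mads_apart = difference / larger_mad

print(f"Students: mean {mean(students):.1f}, MAD {mean_absolute_deviation(students):.2f}")
print(f"Teachers: mean {mean(teachers):.1f}, MAD {mean_absolute_deviation(teachers):.2f}")
print(f"Means are {mads_apart:.2f} MADs apart; meaningful: {mads_apart >= 2}")
```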

Section D Check
Section D Checkpoint
Unit 8 Assessment
End-of-Unit Assessment