The Empirical Rule

Section 10.2 The Empirical Rule

Objectives

Use the empirical rule to find probabilities

If the price per pound of USDA Choice Beef is normally distributed with a mean of $4.85/lb and a standard deviation of $0.35/lb, what is the estimated probability that a randomly chosen sample (from a randomly chosen market) will be between $5.20 and $5.55 per pound?

This lesson on the Empirical Rule extends the previous lesson, Section 10.1. There, the goal was to develop an intuition for how probability decreases as distance from the mean increases. In this lesson, we will practice applying the Empirical Rule to estimate the probability that a value falls within a given range, where the range is measured in standard deviations.

The graphics below describe the important features of normal distributions. Note how the graph resembles a bell? Now you know why the normal distribution is also called a “bell curve”.

50% of the data is above, and 50% below, the mean of the data

Figure 10.2.1. Image Credit: RRCC
Approximately 68% of the data occurs within 1 standard deviation (SD) of the mean

Figure 10.2.2. Image Credit: RRCC
Approximately 95% of the data occurs within 2 SDs of the mean

Figure 10.2.3. Image Credit: RRCC
Approximately 99.7% of the data occurs within 3 SDs of the mean

Figure 10.2.4. Image Credit: RRCC

The Empirical Rule is also known as the “68-95-99.7 rule” because of the probabilities associated with 1, 2, and 3 SDs. The graphic from Section 10.1 is a concise summary of the key percentages of a normal distribution.

Figure 10.1.1
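
The percentages in the 68-95-99.7 rule are rounded values. As a minimal sketch, the exact fractions can be checked with Python's standard library, using the fact that for a normal distribution the fraction within $k$ SDs of the mean is $\operatorname{erf}(k/\sqrt{2})$. The function name `fraction_within` is hypothetical and chosen only for this illustration.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal distribution lying within k SDs of its mean."""
    # For a standard normal variable Z, P(-k < Z < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

The rule rounds these to 68%, 95%, and 99.7%, which is why answers found with it are estimates.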

Example 10.2.5.

If the diameter of a basketball is normally distributed, with a mean ($\mu$) of 9", and a standard deviation ($\sigma$) of 0.5", what is the probability that a randomly chosen basketball will have a diameter between 9.5" and 10.5"?

Solution.

Since $\sigma = 0.5$ in and $\mu = 9$ in, we are evaluating the probability that a randomly chosen ball will have a diameter between 1 and 3 standard deviations above the mean. The graphic below shows the portion of the normal distribution included between 1 and 3 SDs:

A bell curve has the region from 9.5 inch to 10 inch labeled as 13.5% and the region from 10 inch to 10.5 inch labeled as 2.35% — Figure 10.2.6. Image Credit: RRCC

The percentage of the data spanning the 2nd and 3rd SDs above the mean is 13.5% + 2.35% = 15.85%.

The probability that a randomly chosen basketball will have a diameter between 9.5 and 10.5 inches is 0.1585, which means there is a 15.85% chance the basketball will have a diameter between 9.5 and 10.5 inches.
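
The band-adding approach in this example can be sketched in Python. This is only an illustration under stated assumptions: the names `BAND_PERCENT`, `whole_sds_from_mean`, and `empirical_percent` are hypothetical, the band values are the rounded percentages shown in the figures, and the sketch only handles endpoints that fall on whole-SD boundaries within 3 SDs of the mean.

```python
# Percent of data in each one-SD-wide band, keyed by the band's lower
# edge measured in SDs from the mean (rounded Empirical Rule values).
BAND_PERCENT = {-3: 2.35, -2: 13.5, -1: 34.0, 0: 34.0, 1: 13.5, 2: 2.35}

def whole_sds_from_mean(x, mu, sigma):
    """Return how many whole SDs x lies from mu; reject in-between values."""
    z = (x - mu) / sigma
    if abs(z - round(z)) > 1e-9:
        raise ValueError("endpoint is not a whole number of SDs from the mean")
    return round(z)

def empirical_percent(low, high, mu, sigma):
    """Estimate the percent of data between low and high."""
    z_low = whole_sds_from_mean(low, mu, sigma)
    z_high = whole_sds_from_mean(high, mu, sigma)
    if z_low < -3 or z_high > 3 or z_low > z_high:
        raise ValueError("range must lie within 3 SDs of the mean")
    # Add up every one-SD band between the two endpoints.
    return sum(BAND_PERCENT[z] for z in range(z_low, z_high))

# Basketball diameters: mean 9 in, SD 0.5 in, range 9.5 in to 10.5 in.
print(round(empirical_percent(9.5, 10.5, mu=9, sigma=0.5), 2))  # 15.85
```

Converting each endpoint to a count of SDs first is the same step the solution performs by hand before reading percentages off the graphic.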

Example 10.2.7.

If the depth of the snow in my yard is normally distributed, with $\mu = 2.5$ inches and $\sigma = 0.25$ inches, what is the probability that a randomly chosen location will have a snow depth between 2.25 and 2.75 inches?

Solution.

2.25 inches is $\mu - 1\sigma\text{,}$ and 2.75 inches is $\mu + 1\sigma\text{,}$ so the area encompassed approximately represents 34% + 34% = 68%.

The probability that a randomly chosen location will have a depth between 2.25 and 2.75 inches is 0.68.

Example 10.2.8.

If the height of women in the U.S. is normally distributed with a mean of 5'8" and a standard deviation of 1.5", what is the probability that a randomly chosen woman in the U.S. is shorter than 5'5"?

Solution.

This one is slightly different, since we aren’t looking for the probability of a bounded range of values. We want the probability of a value occurring anywhere below 5'5". Because a normal distribution extends indefinitely in both directions, that region has no lower endpoint, so we can’t simply add up the bands between two endpoints. Instead, we add up the percentages that we do know and subtract the total from 100% to get the remainder.

Here is that normal distribution graphic again, with the height data inserted:

A bell curve has the region from 5 ft 5 inch to 5 ft 6.5 inch labeled as 13.5% and the region from 5 ft 6.5 inch to 5 ft 8 inch labeled as 34%. The region above 5 ft 8 inch is labeled as 50%. — Figure 10.2.9. Image Credit: RRCC

Recall that a normal distribution always has 50% of the data on each side of the mean. That indicates that 50% of U.S. females are taller than 5'8", and gives us a solid starting point to calculate from. Since the standard deviation is 1.5", 5'6.5" is 1 SD below the mean and 5'5" is 2 SDs below the mean. There is another 34% between 5'6.5" and 5'8" and a final 13.5% between 5'5" and 5'6.5". Ultimately that totals: $50\%+34\%+13.5\%=97.5\%\text{.}$

Since 97.5% of U.S. females are 5'5" or taller, that leaves 2.5% that are less than 5'5" tall, a probability of 0.025.
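
The subtract-from-100% strategy used in this solution can be sketched in a few lines of Python. The helper `to_inches` and the variable names are hypothetical, introduced only to convert the heights to a single unit before counting SDs.

```python
def to_inches(feet, inches):
    """Convert a height given in feet and inches to inches."""
    return 12 * feet + inches

mu = to_inches(5, 8)       # mean height: 68 inches
sigma = 1.5                # standard deviation in inches
cutoff = to_inches(5, 5)   # 65 inches

z = (cutoff - mu) / sigma  # number of SDs the cutoff lies from the mean

# Percentages of the regions at or above the cutoff, from the graphic.
above_mean = 50.0
first_sd_below = 34.0
second_sd_below = 13.5
percent_at_or_above = above_mean + first_sd_below + second_sd_below

# Whatever is not at or above the cutoff must be below it.
percent_below = 100.0 - percent_at_or_above
print(z, percent_below)  # -2.0 2.5
```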

Returning to the problem we started with:

If the price per pound of USDA Choice Beef is normally distributed with a mean of $4.85/lb and a standard deviation of $0.35/lb, what is the estimated probability that a randomly chosen sample (from a randomly chosen market) will be between $5.20 and $5.55 per pound?

$5.20 is $\mu + 1 \sigma\text{,}$ and $5.55 is $\mu + 2\sigma\text{,}$ so the probability of a value occurring in that range is approximately 0.135. There is a 13.5% chance that a value will occur in that range.

Problem 10.2.10. Try It Now.

A normally distributed data set has $\mu = 10$ and $\sigma = 2.5\text{.}$ What is the probability of randomly selecting a value greater than 17.5 from the set?

Answer.

$17.5 = \mu + 3\sigma\text{.}$ Since we are looking for all data above that point, we first compute $100\% - 99.7\% = 0.3\%$ to find the percent of data that lies more than 3 standard deviations from the mean on either side. We only want the data more than 3 standard deviations above the mean, so by symmetry $0.3\% \div 2 = 0.15\%\text{.}$ There is a 0.15% chance the value will be greater than 17.5, or a probability of 0.0015.
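
The halving step in this answer relies on the symmetry of the normal distribution, and the same reasoning works for 1 or 2 SDs. A minimal sketch, assuming the rounded 68-95-99.7 values and using the hypothetical names `WITHIN_PERCENT` and `one_tail_percent`:

```python
# Rounded Empirical Rule values: percent of data within k SDs of the mean.
WITHIN_PERCENT = {1: 68.0, 2: 95.0, 3: 99.7}

def one_tail_percent(k):
    """Percent of data more than k SDs above the mean (k = 1, 2, or 3)."""
    outside_both_tails = 100.0 - WITHIN_PERCENT[k]
    # The two tails are mirror images, so each holds half of what is outside.
    return outside_both_tails / 2

mu, sigma, cutoff = 10, 2.5, 17.5
k = round((cutoff - mu) / sigma)  # 17.5 is 3 SDs above the mean
print(f"{one_tail_percent(k):.2f}%")  # 0.15%
```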

Problem 10.2.11. Try It Now.

A normally distributed data set has $\mu = 0.05$ and $\sigma = 0.01\text{.}$ What is the probability of randomly choosing a value between 0.05 and 0.07 from the set?

Answer.

0.05 is the mean, and 0.07 is 2 standard deviations above the mean, so the chance of a value in that range is 34% + 13.5% = 47.5%. The probability of a value in that range would be 0.475.

Problem 10.2.12. Try It Now.

A normally distributed data set has a mean of 514 and an unknown standard deviation. What is the probability that a randomly selected value will be less than 514?

Answer.

514 is the mean, so the probability of a value less than that is 0.5.