Chapter 8 - Statistical Inference

Statistical Inference: Working with Confidence Intervals

Contents

Terms / Definitions

Objectives

Confidence Interval

Statistical Estimates

Example of Confidence Interval

Terms / Definitions

TERMS	DEFINITION
Confidence interval	An interval or range computed from the sample data by a method that has a known probability (e.g. 95%) of containing the unknown population parameter
Parameter	A number that describes the population, e.g. a mean or percent
Sample proportion	The proportion, p-hat, of the members of a sample that have a certain characteristic
Statistic	A number that describes a sample, an estimate of the population parameter
sqrt	square root

Chapter Objectives:

This chapter answers the question: how sure are we about a statistic calculated from a sample of size n?

With some percent assurance or confidence, say 95%, we make a statement like: "We are 95% sure that the true value we seek from the population is within a certain range of the statistic we have calculated." For example, if 60% of the people in our sample say they will vote for a certain candidate for national leader, we may say that the true percent among all voters is between 50 and 70%, and that we are almost certain of this: we are 95% sure.

Confidence Intervals:

The confidence interval (CI) is an interval or range computed from the sample study. A 95% confidence interval comes from a method that captures the true population value for about 95% of all possible samples.

CI = estimate or statistic +/- margin of error

or, for about 95% confidence, CI = statistic (such as the mean) +/- 2 x standard deviation of that statistic

Example: suppose the sample mean is 45% and its standard deviation is 5%, from a sample of size 500.

Then CI = 45 +/- 2x5 = 45 +/- 10 or 35 to 55 %

What does this mean? We are 95% confident that the true mean is between 35% and 55%.
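The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name confidence_interval and its parameters are made up for this sketch, and the multiplier of 2 is the chapter's rounded value for about 95% confidence.

```python
def confidence_interval(estimate, std_dev, multiplier=2):
    """Return (low, high) for estimate +/- multiplier * std_dev.

    std_dev is the standard deviation of the statistic itself;
    multiplier=2 corresponds to roughly 95% confidence.
    """
    margin = multiplier * std_dev  # the margin of error
    return estimate - margin, estimate + margin


# The example from the text: mean 45%, standard deviation 5%.
low, high = confidence_interval(45, 5)
print(f"CI = {low} to {high} %")  # prints: CI = 35 to 55 %
```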

Estimates

The population parameters, that is, the numbers that describe a population, are typically unknown. Rather than studying the entire population, we take a sample to save on COST or TIME.

Depending on how we take our sample, we will get a different estimate of the "true" population value; therefore, the goodness of our estimate depends on the following factors:

1. Sample size, n: How large is the sample? The larger the sample, the better our estimate.

2. How random is our sample? The more random the sample, the better we feel about it representing the population.

3. What is the question we are trying to answer? How we ask our question determines how we calculate our statistic.

4. What is the error or standard deviation of our sampling? Together with our sample size, n, this gives us our confidence interval.
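To see how the sample size affects the estimate, here is a small Python sketch. It borrows the formula sqrt(p(100-p)/n) from this chapter's table of comparisons for p-hat; the function name margin_of_error, the value p = 40% and the sample sizes are chosen only for illustration.

```python
import math


def margin_of_error(p_percent, n, multiplier=2):
    """Margin of error, in percent, for a sample proportion given in percent."""
    sigma_p = math.sqrt(p_percent * (100 - p_percent) / n)
    return multiplier * sigma_p


# Illustrative values only: p-hat = 40% at several sample sizes.
for n in (100, 400, 1600, 6400):
    print(f"n = {n:5d}   margin = +/- {margin_of_error(40, n):.2f} %")
```

Because n sits under a square root, making the sample four times larger only cuts the margin of error in half.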

Table of comparisons for x-bar

Sample statistics	Population parameters
x-bar	mu (μ)
s, the standard deviation	sigma; the standard deviation of x-bar is sigma/sqrt(n), i.e. the population standard deviation divided by the square root of the sample size. In this class we will give you s.
The distribution of x-bar is approximately a normal distribution when the sample size is large.
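The sigma/sqrt(n) entry in the table can be checked with a short simulation. The sketch below is not part of the course material: it assumes a normal population with a made-up mean of 40,000 and standard deviation of 5,000, draws many samples of size 100, and compares the spread of the resulting sample means with sigma/sqrt(n).

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 40_000, 5_000  # hypothetical population mean and standard deviation
N = 100                    # hypothetical sample size
TRIALS = 2_000             # number of repeated samples

# Draw TRIALS samples of size N and record each sample mean (x-bar).
sample_means = [
    statistics.fmean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(TRIALS)
]

observed = statistics.stdev(sample_means)  # spread of the x-bar values
expected = SIGMA / math.sqrt(N)            # sigma / sqrt(n) = 500

print(f"observed standard deviation of x-bar: {observed:.0f}")
print(f"sigma / sqrt(n):                      {expected:.0f}")
```

The two numbers should come out close to each other, which is why x-bar from a large sample is a much steadier estimate than a single observation.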

Table of comparisons for proportion, p-hat

Sample statistics	Population parameters
p-hat	p
sigmap	sqrt(p(100-p)/n), with p in percent
The sampling distribution of p-hat is approximately normal and gets closer to a normal distribution as the sample size, n, gets larger.

Steps to calculate sigmap (see the examples below):

1. Convert the proportion to percent, e.g. p = .12 = 12%

2. Subtract p from 100

3. Multiply the result of step 2 by p in percent

4. Divide the result of step 3 by the sample size n

5. Take the square root of the result of step 4

6. Show the answer as a percent
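The six steps for sigmap can be written out one by one in Python. This is a minimal sketch; the function name sigma_p_percent is made up here, and it assumes the proportion is passed in as a decimal between 0 and 1.

```python
import math


def sigma_p_percent(p, n):
    """Standard deviation of p-hat, in percent.

    p is the proportion as a decimal (e.g. 0.4) and n is the sample size.
    """
    p_percent = p * 100               # step 1: convert the proportion to percent
    complement = 100 - p_percent      # step 2: subtract p from 100
    product = complement * p_percent  # step 3: multiply step 2 by p in percent
    ratio = product / n               # step 4: divide by the sample size n
    return math.sqrt(ratio)           # steps 5 and 6: square root, already in percent


# p = 0.4 with n = 1500, as in Example 2 of this chapter.
print(f"{sigma_p_percent(0.4, 1500):.3f} %")  # prints: 1.265 %
```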

Examples of Confidence Interval calculations:

Example 1: From a random study of 5,000 households we found that the average income is $40,000. If the standard deviation was calculated to be $5,000, what can we say about the average income in the community that was studied?

Give an interval that shows the average income of the community.

Answer: x-bar or the mean income is $40,000 and s or standard deviation is $5,000

CI = 40,000 +/- 2(5,000) = 40,000 +/- 10,000 or $30,000 to $50,000

Taking the $5,000 given here as the standard deviation of x-bar, we can say we are 95% confident that the average household income of this community is between $30,000 and $50,000. (If $5,000 were instead the spread of individual household incomes, this interval would describe where most households fall, and the interval for the average would use 5,000/sqrt(5,000), about $71, in place of $5,000.)

Example 2: From a random study of 1,500 adults, 600 said that they fear going out at night.

Give an interval that shows the percent of adults who fear going out at night.

Answer: p-hat or proportion of adults who fear going out at night = 600 / 1500 = 0.4

or 40% (p-hat converted to percent)

sigmap is equal to sqrt(p(100-p)/n)

or sqrt((40 x 60)/1500) = sqrt(1.6) = 1.265%

CI = 40 +/- 2(1.265) = 40 +/- 2.53 or 37.47 to 42.53%

So we are 95% certain that between 37.47% and 42.53% of adults fear going out at night.
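As a check on the arithmetic, the whole calculation for a proportion can be done from the raw counts. The Python sketch below is only an illustration (the function name proportion_ci is made up); it uses the chapter's multiplier of 2, so the upper end of the interval is 40 + 2.53 = 42.53%.

```python
import math


def proportion_ci(count, n, multiplier=2):
    """Return (p_hat, low, high), all in percent, for count out of n."""
    p_hat = 100 * count / n                         # sample proportion in percent
    sigma_p = math.sqrt(p_hat * (100 - p_hat) / n)  # sqrt(p(100-p)/n)
    margin = multiplier * sigma_p                   # margin of error
    return p_hat, p_hat - margin, p_hat + margin


# Example 2: 600 of 1500 adults fear going out at night.
p_hat, low, high = proportion_ci(600, 1500)
print(f"p-hat = {p_hat:.1f} %, CI = {low:.2f} to {high:.2f} %")
# prints: p-hat = 40.0 %, CI = 37.47 to 42.53 %
```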

Example 3: From a random study of 200 rolls of a die, it is observed that 105 of the rolls give an even number. What can we say about the chance or probability of getting an even number from this study?

Answer: proportion of even numbers = 105 / 200 = 0.525

or, as a percent, p-hat is 52.5%

sigmap is equal to sqrt(p(100-p)/n) = sqrt((52.5 x 47.5)/200) = 3.53%

CI = 52.5 +/- 2(3.53) = 52.5 +/- 7.06 or 45.44 to 59.56%

So we are 95% confident that the chance of getting an even number with a roll of this die is between 45.44% and 59.56%.
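What "95% confident" means can be illustrated by repeating the die study many times on a computer. The sketch below is a simulation under stated assumptions, not part of the original study: it assumes a fair die, so the true chance of an even number is 50%, repeats the 200-roll study 2,000 times, and counts how often the interval p-hat +/- 2 x sigmap contains 50%.

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable

TRUE_P = 50.0    # assumption: a fair die gives an even number 50% of the time
ROLLS = 200      # rolls per study, as in Example 3
STUDIES = 2_000  # number of simulated repeat studies


def study_interval(rolls):
    """Simulate one study and return its (low, high) interval in percent."""
    evens = sum(1 for _ in range(rolls) if random.randint(1, 6) % 2 == 0)
    p_hat = 100 * evens / rolls
    sigma_p = math.sqrt(p_hat * (100 - p_hat) / rolls)
    return p_hat - 2 * sigma_p, p_hat + 2 * sigma_p


hits = 0
for _ in range(STUDIES):
    low, high = study_interval(ROLLS)
    if low <= TRUE_P <= high:
        hits += 1

coverage = hits / STUDIES
print(f"{100 * coverage:.1f} % of the intervals contain the true value")
```

Roughly 95 out of every 100 simulated studies should produce an interval that contains the true 50%; any single study, like the one in Example 3, may be one of the few that misses.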