Chapter 7
Sampling Distribution
Recall that the population mean μ represents the average of all individuals or things under study. Typically, however, not all individuals can be measured. Rather, we have only a small subset of all individuals available to us, and the average response based on this sample, \(\bar{x}\), is used to estimate the population mean, μ. An issue of fundamental importance is how well the sample mean, \(\bar{x}\), estimates the population mean, μ. If the sample mean is \(\bar{x} = 23\), we estimate that the population mean is 23, but in general this estimate will not equal μ exactly. What is needed is some method that can be used to assess the precision of this estimate. A key component in addressing this problem is the notion of a sampling distribution.
7.1
Population and Sampling Distribution
The population distribution is the probability distribution of the population data.
Suppose there are only five students in an advanced statistics class and their midterm scores are

70  78  80  80  95

Let X denote the score of a student. The frequency distribution and the probability distribution of the scores are

x           70    78    80    95
f            1     1     2     1
P(X = x)   0.2   0.2   0.4   0.2
The probability distribution of a sample statistic is called its sampling distribution.
Sampling distribution of \(\bar{X}\): The probability distribution of \(\bar{X}\) is called its sampling distribution. It lists the various values that \(\bar{X}\) can assume and the probability of each value of \(\bar{X}\).
Example 7.1: For the five midterm scores above, list all possible samples of three scores that can be selected without replacement. Calculate the sample mean \(\bar{X}\) for each sample and construct the sampling distribution of \(\bar{X}\).
Solution: Suppose we assign the letters A, B, C, D and E to the scores of the five students so that A = 70, B = 78, C = 80, D = 80, E = 95.
All possible samples and their means when the sample size is 3

Sample   Scores in the sample   \(\bar{X}\)
ABC      70, 78, 80             76.00
ABD      70, 78, 80             76.00
ABE      70, 78, 95             81.00
ACD      70, 80, 80             76.67
ACE      70, 80, 95             81.67
ADE      70, 80, 95             81.67
BCD      78, 80, 80             79.33
BCE      78, 80, 95             84.33
BDE      78, 80, 95             84.33
CDE      80, 80, 95             85.00
Sampling distribution of \(\bar{X}\) when the sample size is 3

\(\bar{X}\)    f    Relative frequency
76.00    2    2/10 = 0.2
76.67    1    1/10 = 0.1
79.33    1    1/10 = 0.1
81.00    1    1/10 = 0.1
81.67    2    2/10 = 0.2
84.33    2    2/10 = 0.2
85.00    1    1/10 = 0.1
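The enumeration in this example can be reproduced mechanically. The following is a minimal Python sketch, not part of the original notes; the variable names (`scores`, `samples`, `sampling_distribution`) are illustrative choices. Exact fractions are used so that equal sample means are tallied together without rounding issues.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Population from the example: the five students' midterm scores.
scores = {"A": 70, "B": 78, "C": 80, "D": 80, "E": 95}
n = 3  # sample size

# Every sample of n students drawn without replacement, with its exact mean.
samples = {
    "".join(names): Fraction(sum(scores[s] for s in names), n)
    for names in combinations(scores, n)
}

# Tally how often each sample mean occurs to obtain the sampling distribution.
counts = Counter(samples.values())
total = len(samples)
sampling_distribution = {
    mean: Fraction(f, total) for mean, f in sorted(counts.items())
}

for label, mean in samples.items():
    print(f"{label}: {float(mean):.2f}")
for mean, prob in sampling_distribution.items():
    print(f"P(mean = {float(mean):.2f}) = {float(prob):.1f}")
```

The relative frequencies add up to 1 because each of the 10 possible samples is counted exactly once.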
7.2
Sampling and nonsampling errors
Sampling error is the difference between the value of the sample statistic and the value of the corresponding population parameter. In the case of the mean,
sampling error = \(\bar{x} - \mu\),
assuming that the sample is random and no nonsampling error has been made.
Nonsampling error is the error that occurs in the selection, recording and tabulation of data.
Example 7.2: Reconsider the data in Example 7.1. Suppose we take a random sample of three scores from this population and that this sample includes the scores 70, 80 and 95. Calculate the sampling error.
Solution:
\[ \mu = \frac{70 + 78 + 80 + 80 + 95}{5} = 80.60 \]
\[ \bar{x} = \frac{70 + 80 + 95}{3} = 81.67 \]
Sampling error = \(\bar{x} - \mu = 81.67 - 80.60 = 1.07\)
Now suppose that, when we select the above-mentioned sample, we mistakenly record the second score as 82 instead of 80. Calculate the nonsampling error.
\[ \bar{x} = \frac{70 + 82 + 95}{3} = 82.33 \]
Nonsampling error = incorrect \(\bar{x}\) - correct \(\bar{x}\) = 82.33 - 81.67 = 0.66
The sampling error is still 1.07, so the recorded mean differs from μ by 82.33 - 80.60 = 1.73 = 1.07 + 0.66.
7.3
Mean and Standard Deviation of \(\bar{X}\)
The mean of the sampling distribution of \(\bar{X}\) is always equal to the mean of the population. Thus, \(\mu_{\bar{X}} = \mu\).
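For a sample of size n, if the sampling is done from a finite population (of size N), the standard deviation of \(\bar{X}\) is given by
\[ \sigma_{\bar{X}} = \begin{cases} \dfrac{\sigma}{\sqrt{n}} & \text{if } \dfrac{n}{N} \le 0.05 \text{ or sampling is done with replacement} \\[2ex] \dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}} & \text{if } \dfrac{n}{N} > 0.05 \text{ and sampling is done without replacement} \end{cases} \]
and if the sampling is done from an infinite population, we have
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]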
Remark
1. \(\sqrt{\frac{N - n}{N - 1}}\) is called the finite population correction factor, and \(\sqrt{\frac{N - n}{N - 1}} \approx 1\) when N is large and \(\frac{n}{N} \le 0.05\).
2. The value of \(\sigma_{\bar{X}}\) decreases as n increases.
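The two results of this section can be checked by brute force on the five midterm scores of Section 7.1. The sketch below is an illustration rather than part of the original notes: it enumerates every sample of size 3 drawn without replacement and compares the mean and standard deviation of the resulting sample means with μ and with the formula that includes the finite population correction factor (here n/N = 0.6 > 0.05).

```python
from itertools import combinations
from math import sqrt
from statistics import fmean, pstdev

population = [70, 78, 80, 80, 95]  # midterm scores from Section 7.1
N, n = len(population), 3

# Brute force: the mean of every possible sample of size n (without replacement).
sample_means = [fmean(sample) for sample in combinations(population, n)]

# Formula: sigma / sqrt(n) times the finite population correction factor.
sigma = pstdev(population)
fpc = sqrt((N - n) / (N - 1))
sigma_xbar_formula = sigma / sqrt(n) * fpc

mu = fmean(population)
mu_xbar = fmean(sample_means)
sigma_xbar_enumerated = pstdev(sample_means)

print(f"mu = {mu:.4f}, mean of sample means = {mu_xbar:.4f}")
print(f"sd of sample means: enumerated = {sigma_xbar_enumerated:.4f}, "
      f"formula = {sigma_xbar_formula:.4f}")
```

The population standard deviation (`pstdev`) is used on purpose: both the five scores and the ten sample means are treated as complete populations, not as samples.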
Example 7.3: The mean wage per hour for all 5000 employees working at a large company is RM27.50 and the standard deviation is RM3.70. Let \(\bar{X}\) be the mean wage per hour for a random sample of certain employees selected from this company. Find the mean and standard deviation of \(\bar{X}\) for a sample size of
(a) 30
(b) 75
(c) 300
Solution: \(N = 5000\), \(\mu = 27.50\), \(\sigma = 3.70\)
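One way to carry the calculation through is sketched below in Python. The helper `sd_of_sample_mean` is a hypothetical name introduced here, and the sketch assumes the employees are selected without replacement, which the example does not state explicitly. It applies the rule of Section 7.3: the correction factor is used only when n/N > 0.05.

```python
from math import sqrt


def sd_of_sample_mean(sigma, n, N=None, with_replacement=False):
    """Standard deviation of the sample mean, following Section 7.3.

    N=None stands for an infinite population. The finite population
    correction factor is applied only when sampling is done without
    replacement and n/N > 0.05.
    """
    sd = sigma / sqrt(n)
    if N is not None and not with_replacement and n / N > 0.05:
        sd *= sqrt((N - n) / (N - 1))
    return sd


N, mu, sigma = 5000, 27.50, 3.70

# Assumption: the sample is drawn without replacement.
results = {n: sd_of_sample_mean(sigma, n, N) for n in (30, 75, 300)}

for n, sd in results.items():
    # The mean of the sampling distribution equals mu for every sample size.
    print(f"n = {n:3d}: n/N = {n / N:.3f}, mean = {mu:.2f}, sd = {sd:.4f}")
```

Under these assumptions the mean of \(\bar{X}\) is RM27.50 in all three cases, and the standard deviations come out to roughly RM0.676, RM0.427 and RM0.207. Only case (c), with n/N = 0.06, uses the correction factor.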
7.4
Shape of the sampling distribution of \(\bar{X}\)
The shape of the sampling distribution of \(\bar{X}\) relates to the following two cases.
1. The population from which samples are drawn has a normal distribution.
2. The population from which samples are drawn does not have a normal distribution.
7.4.1
Sampling from a normally distributed population
If the population from which the samples are drawn is normally distributed with mean μ and standard deviation σ, then the sampling distribution of the sample mean, \(\bar{X}\), will also be normally distributed with the following mean and standard deviation, irrespective of the sample size:
\[ \mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}. \]
That means, if \(X \sim N(\mu, \sigma^2)\), then \(\bar{X} \sim N(\mu_{\bar{X}}, \sigma_{\bar{X}}^2)\), where \(\mu_{\bar{X}} = \mu\) and \(\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}\).
Example 7.4: In a recent STAT test, the mean score for all examinees was 1016. Assume that the distribution of STAT scores of all examinees is normal with a mean of 1016 and a standard deviation of 153. Let \(\bar{X}\) be the mean STAT score of a random sample of certain examinees. Calculate the mean and standard deviation of \(\bar{X}\) and describe the shape of its sampling distribution when the sample size is
(a) 16
(b) 50
(c) 1000
Solution: Let μ be the mean and σ be the standard deviation of the STAT scores of all examinees, so that \(\mu = 1016\) and \(\sigma = 153\). Because the population is normally distributed, the sampling distribution of \(\bar{X}\) is normal for every sample size.
a) The mean and standard deviation of \(\bar{X}\) are \(\mu_{\bar{X}} = \mu = 1016\) and \(\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{153}{\sqrt{16}} = 38.250\).
b) The mean and standard deviation of \(\bar{X}\) are \(\mu_{\bar{X}} = \mu = 1016\) and \(\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{153}{\sqrt{50}} = 21.637\).
c) The mean and standard deviation of \(\bar{X}\) are \(\mu_{\bar{X}} = \mu = 1016\) and \(\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{153}{\sqrt{1000}} = 4.838\).
7.4.2
Sampling from a population that is NOT normally distributed
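Central Limit Theorem
For a relatively large sample size, the sampling distribution of \(\bar{X}\) is approximately normal, regardless of the distribution of the population under consideration. A sample size of \(n \ge 30\) is usually treated as large. The mean and standard deviation of the sampling distribution of \(\bar{X}\) are
\[ \mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}. \]
That means, for any distribution of X, if n is large, \(\bar{X}\) is approximately \(N\!\left(\mu, \frac{\sigma^2}{n}\right)\).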
Example 7.5: The mean rent paid by all tenants in a large city is RM1550 with a standard deviation of RM225. However, the population distribution of rents for all tenants in this city is skewed to the right. Calculate the mean and standard deviation of \(\bar{X}\) and describe the shape of its sampling distribution when the sample size is
(a) 30
(b) 100
Solution: Although the population distribution of rents paid by all tenants is not normal, in each case the sample size is large (\(n \ge 30\)). Hence, the central limit theorem can be applied to infer the shape of the sampling distribution of \(\bar{X}\), which is approximately normal in both cases.
a) Let \(\bar{X}\) be the mean rent paid by a sample of 30 tenants. Then the mean and standard deviation of \(\bar{X}\) are \(\mu_{\bar{X}} = \mu = 1550\) and \(\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{225}{\sqrt{30}} = 41.079\).
b) Let \(\bar{X}\) be the mean rent paid by a sample of 100 tenants. Then the mean and standard deviation of \(\bar{X}\) are \(\mu_{\bar{X}} = \mu = 1550\) and \(\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{225}{\sqrt{100}} = 22.5\).
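The central limit theorem can also be illustrated by simulation. The sketch below is not part of the original notes and rests on a loud assumption: the example only says the rent distribution is right-skewed, so a gamma distribution matched to the stated mean and standard deviation is used here purely as a stand-in. The function name `simulate_sample_means` and the number of repetitions are likewise illustrative.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

mu, sigma = 1550, 225  # population mean and standard deviation from the example

# Assumption: a gamma population, chosen only because it is right-skewed and can
# be matched to mu and sigma (mean = shape * scale, variance = shape * scale**2).
scale = sigma**2 / mu
shape = mu / scale


def simulate_sample_means(n, repetitions=2000, seed=0):
    """Draw `repetitions` samples of size n and return their sample means."""
    rng = random.Random(seed)
    return [
        fmean(rng.gammavariate(shape, scale) for _ in range(n))
        for _ in range(repetitions)
    ]


for n in (30, 100):
    means = simulate_sample_means(n)
    print(f"n = {n}: mean of sample means = {fmean(means):.1f}, "
          f"sd = {pstdev(means):.2f}, sigma/sqrt(n) = {sigma / sqrt(n):.2f}")
```

With enough repetitions, the simulated mean and standard deviation of the sample means should land close to μ and σ/√n, and a histogram of `means` would look roughly bell-shaped even though the population is skewed.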
7.5
Application of the sampling distribution of \(\bar{X}\)
Example 7.6: Assume that the weights of all packages of a certain brand of cookies are normally distributed with a mean of 32 ounces and a standard deviation of 0.3 ounce. Find the probability that the mean weight, \(\bar{X}\), of a random sample of 20 packages of this brand of cookies will be between 31.8 and 31.9 ounces.
Solution:
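One way to carry out this calculation is sketched below in Python, using the standard library's `statistics.NormalDist`. Because the population is normal, Section 7.4.1 gives a normal sampling distribution with mean μ = 32 and standard deviation σ/√n. The variable names are illustrative, and the sketch assumes the population is large enough that no finite population correction is needed.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 32, 0.3, 20  # population mean, population sd (ounces), sample size

# The population is normal, so the sample mean is normal with
# mean mu and standard deviation sigma / sqrt(n).
xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# P(31.8 < X-bar < 31.9) as a difference of two cumulative probabilities.
probability = xbar_dist.cdf(31.9) - xbar_dist.cdf(31.8)

print(f"sd of sample mean = {xbar_dist.stdev:.4f}")
print(f"P(31.8 < mean < 31.9) = {probability:.4f}")
```

The result is about 0.067. A table lookup with z rounded to two decimals (z = -2.98 and z = -1.49) gives essentially the same value.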
Example 7.7: According to CardWeb, consumers in the United States owed an average of $7868 on their credit cards in 2004. Suppose the shape of the probability distribution of the current credit card debts of all consumers in the United States is unknown but its mean is $7868 and the standard deviation is $2160. Let \(\bar{x}\) be the mean credit card debt of a random sample of 81 US consumers.
a) What is the probability that the mean of the current credit card debts for this sample is within $440 of the population mean?
b) What is the probability that the mean of the current credit card debts for this sample is lower than the population mean by $320 or more?
Solution:
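A worked version of the calculation is sketched below. It assumes that n = 81 is large enough for the central limit theorem to apply, so that \(\bar{x}\) is approximately normal, and it uses standard normal table values with z rounded to two decimals.

```latex
\begin{align*}
\mu_{\bar{x}} &= \mu = 7868, \qquad
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2160}{\sqrt{81}} = 240 \\[1ex]
\text{(a)}\quad P(7428 \le \bar{x} \le 8308)
  &= P\!\left(\frac{-440}{240} \le Z \le \frac{440}{240}\right)
   \approx P(-1.83 \le Z \le 1.83) \\
  &= 0.9664 - 0.0336 = 0.9328 \\[1ex]
\text{(b)}\quad P(\bar{x} \le 7548)
  &= P\!\left(Z \le \frac{-320}{240}\right)
   \approx P(Z \le -1.33) = 0.0918
\end{align*}
```

In part (a), "within $440 of the population mean" translates to the interval from 7868 - 440 = 7428 to 7868 + 440 = 8308. In part (b), "lower than the population mean by $320 or more" translates to values at or below 7868 - 320 = 7548.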
7.6
Population and Sample Proportions
The population and sample proportions, denoted by p and \(\hat{p}\), respectively, are calculated as
\[ p = \frac{X}{N} \quad \text{and} \quad \hat{p} = \frac{x}{n}, \]
where
N = total number of elements in the population
n = total number of elements in the sample
X = number of elements in the population that possess a specific characteristic
x = number of elements in the sample that possess a specific characteristic.
Example 7.8: Suppose a total of 789,654 families live in a city and 563,282 of them own homes. A sample of 240 families is selected from this city, and 158 of them own homes. Find the proportion of families who own homes in the population and in the sample.
Solution:
N = population size = 789,654
X = families in the population who own homes = 563,282
The proportion of all families in this city who own homes is
\[ p = \frac{X}{N} = \frac{563282}{789654} = 0.71 \]
n = sample size = 240
x = families in the sample who own homes = 158
The proportion of families in the sample who own homes is
\[ \hat{p} = \frac{x}{n} = \frac{158}{240} = 0.66 \]
7.7
Mean, Standard Deviation and Shape of the sampling distribution of \(\hat{p}\)
Sampling distribution of the sample proportion, \(\hat{p}\): The probability distribution of \(\hat{p}\) is called its sampling distribution. It gives the various values that \(\hat{p}\) can assume and their probabilities.
Example 7.9: Boe Consultant Associates has five employees. The following table gives the names of these five employees and information concerning their knowledge of statistics.

Name        Knows Statistics
Ally, A     yes
John, B     no
Susan, C    no
Lee, D      yes
Tom, E      yes

Solution: If we define the population proportion, p, as the proportion of employees who know statistics, then p = 3/5 = 0.60.
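The mean of the sampling distribution of \(\hat{p}\) is always equal to the population proportion. Thus \(\mu_{\hat{p}} = p\).
The standard deviation of \(\hat{p}\) is given by
\[ \sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}, \quad \text{if } \frac{n}{N} \le 0.05 \]
and
\[ \sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \sqrt{\frac{N - n}{N - 1}}, \quad \text{if } \frac{n}{N} > 0.05 \]
where \(q = 1 - p\).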
Central Limit Theorem for Sample Proportion:
According to the central limit theorem, the sampling distribution of \(\hat{p}\) is approximately normal for a sufficiently large sample size. In the case of proportion, the sample size is considered to be large if np and nq are both greater than 5, that is, if np > 5 and nq > 5.
That means, if np > 5 and nq > 5, \(\hat{p}\) is approximately \(N(\mu_{\hat{p}}, \sigma_{\hat{p}}^2)\), where \(\mu_{\hat{p}} = p\) and \(\sigma_{\hat{p}}^2 = \frac{pq}{n}\).
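The mean and standard deviation formulas for \(\hat{p}\) can be checked by brute force on the five employees of Boe Consultant Associates. The sketch below is an illustration, not part of the original notes. The sample size n = 3 is an assumption made here only for the demonstration, and since n/N = 0.6 > 0.05 the formula with the finite population correction factor is the one that applies.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean, pstdev

# Population from the Boe Consultant Associates example: 1 = knows statistics.
knows_statistics = {"Ally": 1, "John": 0, "Susan": 0, "Lee": 1, "Tom": 1}
N = len(knows_statistics)
n = 3  # assumed sample size, chosen only for illustration

p = sum(knows_statistics.values()) / N
q = 1 - p

# Sample proportion for every possible sample of n employees (without replacement).
p_hats = [
    sum(sample) / n for sample in combinations(knows_statistics.values(), n)
]

# n/N = 0.6 > 0.05, so the formula includes the finite population correction factor.
sd_formula = sqrt(p * q / n) * sqrt((N - n) / (N - 1))

print(f"p = {p:.2f}, mean of p-hat = {fmean(p_hats):.2f}")
print(f"sd of p-hat: enumerated = {pstdev(p_hats):.4f}, formula = {sd_formula:.4f}")
```

The two standard deviations agree, but the normal approximation is not appropriate here: with n = 3 and p = 0.6, np = 1.8 is well below 5.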
7.8 Applications of the Sampling Distribution of \(\hat{p}\)
When we conduct a study, we usually take only one sample and make all decisions or inferences on the basis of the results of that one sample. We use the concepts of the mean, standard deviation, and shape of the sampling distribution of \(\hat{p}\) to determine the probability that the value of \(\hat{p}\) computed from one sample falls within a given interval.
Example 7.11: According to an Associated Press poll, circumstances such as income, education, and marital status affect whether or not Americans feel satisfied with their lives. In this poll, conducted during August 16-18, 2004, 38% of adult Americans said that they were very satisfied with the way things were going in their lives at that time. Suppose this result is true for the current population of adult Americans. Let \(\hat{p}\) be the proportion in a random sample of 1000 adult Americans who will say that they are very satisfied with the way things are going in their lives at this time. Find the probability that the value of \(\hat{p}\) is between 0.40 and 0.42.
Solution:
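A sketch of the calculation in Python follows, using the standard library's `statistics.NormalDist`. It assumes that the sample of 1000 is a negligible fraction of the population of adult Americans (n/N ≤ 0.05), so the standard deviation of \(\hat{p}\) is taken as \(\sqrt{pq/n}\) without the finite population correction factor. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.38, 1000
q = 1 - p

# Large-sample check from Section 7.7: np > 5 and nq > 5.
assert n * p > 5 and n * q > 5

# Assumption: n/N <= 0.05, so no finite population correction is applied.
p_hat_dist = NormalDist(mu=p, sigma=sqrt(p * q / n))

# P(0.40 < p-hat < 0.42) as a difference of two cumulative probabilities.
probability = p_hat_dist.cdf(0.42) - p_hat_dist.cdf(0.40)

print(f"sd of p-hat = {p_hat_dist.stdev:.5f}")
print(f"P(0.40 < p-hat < 0.42) = {probability:.4f}")
```

The probability is roughly 0.09. A table-based calculation with z rounded to two decimals (z = 1.30 and z = 2.61) differs only slightly, in the third decimal place.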