Columbia College
STAT-177 Practice Final Exam
Question 1
A website specializing in computer employment opportunities did a salary survey of 10,000 computer professionals. These were the results:
| Class | Frequency | Relative Frequency | Cumulative Relative Frequency |
| Under $50,000 | 215 | 2.15% | 2.15% |
| $50,000 to under $75,000 | 1398 | 13.98% | 16.13% |
| $75,000 to under $100,000 | 3392 | 33.92% | 50.05% |
| $100,000 to under $150,000 | 4778 | 47.78% | 97.83% |
| $150,000 or more | 217 | 2.17% | 100.00% |
a) What percentage of the respondents earn less than $100,000 per year?
b) What percentage earn at least $50,000 per year?
c) What percentage earn at least $50,000 per year but less than $100,000 per year?
d) In which class would you find the median annual salary?
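The relative and cumulative columns in the Question 1 table can be rebuilt from the raw frequencies, which is a useful check before answering parts a) to d). The following is a minimal sketch in Python; the variable names are illustrative and not part of the exam.

```python
# Class frequencies copied from the Question 1 table (10,000 respondents).
classes = [
    ("Under $50,000", 215),
    ("$50,000 to under $75,000", 1398),
    ("$75,000 to under $100,000", 3392),
    ("$100,000 to under $150,000", 4778),
    ("$150,000 or more", 217),
]

total = sum(freq for _, freq in classes)

# Build (label, relative %, cumulative %) rows from running counts,
# so rounding error does not accumulate across classes.
rows = []
running = 0
for label, freq in classes:
    running += freq
    rows.append((label, 100 * freq / total, 100 * running / total))

for label, rel, cum in rows:
    print(f"{label:<28} {rel:6.2f}% {cum:7.2f}%")

# "At least $50,000" is the complement of the first class.
at_least_50k = 100 - rows[0][1]

# The median class is the first one whose cumulative percentage reaches 50%.
median_class = next(label for label, _, cum in rows if cum >= 50)
```

Working from running counts rather than summing rounded percentages keeps the cumulative column exact.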
Question 2
The salaries were further broken down by years of experience. These were the results:
| | Under 3 years | 3 years to under 5 years | 5 years to under 10 years | 10 years or more | Total |
| Under $50,000 | 58 | 87 | 62 | 8 | 215 |
| $50,000 to under $75,000 | 278 | 664 | 324 | 132 | 1398 |
| $75,000 to under $100,000 | 197 | 493 | 2135 | 567 | 3392 |
| $100,000 to under $150,000 | 78 | 376 | 3852 | 472 | 4778 |
| $150,000 or more | 2 | 24 | 95 | 96 | 217 |
| Total | 613 | 1644 | 6468 | 1275 | 10000 |
a) What percentage of those who earn less than $50,000 per year have less than 5 years of experience?
b) What percentage of those who have at least 5 years of experience earn at least $100,000 per year?
c) What percentage of those who earn at least $75,000 per year but less than $150,000 per year have at least 3 years but less than 10 years of experience?
d) Does a person’s salary depend on the amount of experience the person has? Test the hypothesis at 5%. State the p-value and conclusion.
e) To what degree does salary depend on experience?
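Parts d) and e) of Question 2 are commonly approached with a chi-square test of independence and a strength-of-association measure. The sketch below uses only the Python standard library and makes two assumptions that are not stated in the exam: the p-value helper relies on a closed form that holds only for even degrees of freedom (true here, with 12), and Cramér's V is used as the measure of degree, although the course may intend a different coefficient.

```python
import math

# Observed counts from the Question 2 table (rows: salary class,
# columns: experience band), with the Total row and column excluded.
observed = [
    [58, 87, 62, 8],
    [278, 664, 324, 132],
    [197, 493, 2135, 567],
    [78, 376, 3852, 472],
    [2, 24, 95, 96],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square statistic: sum of (observed - expected)^2 / expected,
# where expected = row total * column total / n under independence.
chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_sq += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# For a very large statistic this underflows to 0.0, i.e. effectively zero.
p_value = chi2_sf_even_df(chi_sq, df)

# Cramér's V (an assumed choice) as a measure of strength of association.
k = min(len(observed), len(observed[0]))
cramers_v = math.sqrt(chi_sq / (n * (k - 1)))

print(f"chi-square = {chi_sq:.2f}, df = {df}, p-value = {p_value:.4g}")
print(f"Cramér's V = {cramers_v:.3f}")
```

The conditional percentages in parts a) to c) come from the same table by dividing the relevant cell counts by the appropriate row or column total.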
Question 3
Here are 6 scenarios. For each one, state what type of test to conduct. Be specific. For example, don’t just say Z-test; say Z-test for one mean or Z-test for two means.
a) A store pulls samples of 100 daily sales from 4 of its locations. The days were independently chosen for each location. The data for each location is normally distributed.
b) A TV manufacturer receives picture tubes from 2 different suppliers. A sample of 500 tubes was drawn from each supplier to see if there was any significant difference in the percentage of defective tubes. For the first supplier, 16 tubes were defective while for the second supplier, 27 were defective.
c) At a focus group with 10 participants, each was asked to rate a new laptop computer for ease of use on a scale from 1 to 10. The results were examined by gender to see if there was any significant difference between the 2 genders in the ratings.
d) A store pulled a sample of 60 sales after running an ad campaign to see if the average sale per customer increased.
e) In a survey of 400 respondents, each respondent was asked to rate on a scale from 1 to 5 how much they liked 3 popular TV programs.
f) A company conducted an internal survey of its salespeople. One question asked how much time per week the salesperson spent meeting with clients. The next question asked what would be the ideal amount of time. The results were normally distributed.
Other basic hypothesis testing questions:
g) Suppose you tested a hypothesis at a 5% level of significance and the p-value was 0.0109. Would you reject or accept the null hypothesis?
h) What type of error could have been committed in the hypothesis test in part g)?
Question 4
In a computer software help desk department, 96% of help calls from customers are successfully resolved within 1 hour on average.
a) If the help desk receives 12 calls in an hour, what is the probability that at least 11 of them will be successfully resolved within 1 hour?
b) Suppose the help desk processes 200 help desk calls in the course of an 8-hour shift. What is the probability that fewer than 95% of these 200 calls will be successfully resolved within 1 hour?
c) On average, 1 in 50,000 help desk calls cannot be successfully resolved at all. If the help desk receives 180,000 calls in the course of a month, how many should they expect will not be successfully resolved?
d) Quality control measures were taken to see if the percentage of calls successfully resolved within 1 hour could be significantly increased beyond 96%. After the measures were instituted, 500 calls were sampled; 494 were successfully resolved within 1 hour. Were the quality control measures successful? Test at 5%. State the p-value and conclusion.
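Parts a) and c) of Question 4 reduce to a binomial tail sum and an expected count. The sketch below is one way to check the arithmetic, assuming the calls are independent and each is resolved within 1 hour with probability 0.96; the helper name `binom_pmf` is illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Part a): 12 independent calls, each resolved within 1 hour with p = 0.96.
n_calls, p_resolved = 12, 0.96
p_at_least_11 = sum(binom_pmf(k, n_calls, p_resolved) for k in (11, 12))

# Part c): expected count = number of calls * per-call probability.
expected_unresolved = 180_000 * (1 / 50_000)

print(f"P(at least 11 of 12 resolved) = {p_at_least_11:.4f}")
print(f"Expected unresolved calls per month = {expected_unresolved:.1f}")
```

Part b) can be handled the same way with n = 200, or with a normal approximation to the sample proportion if the course prefers that method.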
Question 5
One of the other measures taken in the help desk department was to re-organize the help desk materials in an attempt to reduce the amount of time required for each call. In the past, the average amount of time per call was 18 minutes. After the measures were implemented, 12 calls were sampled from a shift. These were the results:
| 14 | 18 | 7 | 19 | 14 | 12 |
| 12 | 17 | 14 | 23 | 13 | 19 |
a) Based on this sample, were the measures successful? Test at a 5% level of significance. State the p-value and conclusion.
b) Suppose the level of significance were 1% instead of 5%. Why would the opposite conclusion be reached?
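For Question 5, a one-sample, lower-tailed t-test against the historical mean of 18 minutes is a natural fit. The sketch below computes the test statistic with the Python standard library. Since the standard library has no t distribution, the two critical values are hard-coded from a standard t table for 11 degrees of freedom; treat them as an assumption to verify against the course tables.

```python
from math import sqrt
from statistics import mean, stdev

# Call times in minutes from the Question 5 table.
times = [14, 18, 7, 19, 14, 12, 12, 17, 14, 23, 13, 19]
mu_0 = 18  # historical average time per call

n = len(times)
x_bar = mean(times)
s = stdev(times)  # sample standard deviation (n - 1 denominator)

# One-sample t statistic for H0: mu = 18 versus H1: mu < 18.
t_stat = (x_bar - mu_0) / (s / sqrt(n))

# One-tailed critical values for df = 11, taken from a standard t table.
t_crit_5 = 1.796
t_crit_1 = 2.718

reject_at_5 = t_stat < -t_crit_5
reject_at_1 = t_stat < -t_crit_1

print(f"mean = {x_bar:.3f}, s = {s:.3f}, t = {t_stat:.3f}")
print(f"reject at 5%: {reject_at_5}, reject at 1%: {reject_at_1}")
```

Comparing the statistic against both critical values shows directly how the conclusion can change with the significance level, which is the point of part b).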
Question 6
The software manufacturer sold its products through 3 different outlets: large franchises, small independent chains and its website. It pulled 6 samples, one from each of the months January through June, to see if there was any significant difference in the average amount spent by a customer at the 3 types of outlets. These were the results:
| | Large franchise | Small chain | Website |
| January | $28.26 | $17.82 | $36.73 |
| February | $30.01 | $25.49 | $37.14 |
| March | $30.33 | $26.17 | $40.23 |
| April | $32.77 | $28.68 | $41.17 |
| May | $42.28 | $31.39 | $43.15 |
| June | $47.63 | $34.02 | $45.49 |
a) Is there any significant difference in the average sale at the 3 types of outlets? Test at 5%. State the p-value and conclusion.
b) Which 2 types of outlets show the largest difference on average?
c) Based on the 95% confidence intervals, if a customer spends $30 on average at a small chain store, in what range would be the average sale for a large franchise with 95% confidence? Round the limits to the nearest cent.
Question 7
The software manufacturer wanted to see if there was a relationship between the amount people spend on software and annual household income. A sample of 10 warranty cards returned by customers was selected in which the annual household income was stated. These were the results:
| Software purchase price | Annual household income (in thousands of dollars) |
| 46.44 | 84.5 |
| 21.25 | 42.5 |
| 38.60 | 71.8 |
| 38.33 | 85.7 |
| 46.62 | 87.8 |
| 47.43 | 86.5 |
| 20.50 | 49.9 |
| 41.77 | 81.0 |
| 22.89 | 55.0 |
| 27.15 | 44.5 |
a) Construct a model in which the purchase price depends on annual household income. State the model.
b) What percentage of the variability in software purchase price depends on annual household income?
c) Is the model significant? Test at 5%. State the p-value and conclusion.
d) If a household earns $150,000 per year, what would be the average price of software this household purchases? Round to the nearest dollar.
e) For an annual household income of $150,000, what is the range of the average price for software for 95% of the time? Round to the nearest dollar.
Question 8
A land developer wanted to see to what degree land sale prices depended on land value, improvement value and area. Twenty properties were randomly selected. These were the results:
| Sale Price (in thousands of dollars) | Improvement Value (in thousands of dollars) | Land Value (in thousands of dollars) | Area (in hundreds of sq. ft.) |
| 68 | 6 | 45 | 19 |
| 45 | 9 | 28 | 29 |
| 55 | 10 | 31 | 11 |
| 62 | 10 | 40 | 13 |
| 116 | 18 | 73 | 22 |
| 45 | 9 | 27 | 9 |
| 38 | 8 | 30 | 12 |
| 83 | 23 | 48 | 18 |
| 59 | 8 | 39 | 12 |
| 47 | 9 | 29 | 17 |
| 40 | 7 | 40 | 11 |
| 41 | 8 | 32 | 15 |
| 97 | 20 | 59 | 27 |
| 45 | 8 | 23 | 12 |
| 41 | 8 | 21 | 12 |
| 80 | 11 | 56 | 20 |
| 56 | 4 | 21 | 17 |
| 37 | 5 | 23 | 10 |
| 50 | 3 | 36 | 11 |
| 22 | 2 | 6 | 15 |
a) Construct an initial model using all the variables. What percentage of the variability in sale price is explained by the model?
b) In this model, which variables are significant? Test at 5%.
c) State why there are no collinearity problems with this model.
d) Construct a new model using just the significant variable(s) from the first model. Which model is better, the first or the second?
Question 9
At a movie concession stand, a clerk can serve 2 customers per 5 minutes on average.
a) What is the probability a clerk can serve at least 3 customers in a 5-minute period?
b) What is the probability a clerk can serve no more than 1 customer in a 2-minute period?
c) If there are 3 clerks working, what is the probability they can collectively serve at least 6 customers in a 3-minute period?
d) If there are 6 clerks working, what is the largest number of customers they can be expected to serve in a 10-minute period based on μ + 3σ? Round to the nearest whole number.
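Question 9 reads like a Poisson setup, where the mean scales with both the length of the period and the number of clerks. The sketch below works under that assumption, and additionally assumes clerks serve independently so their rates add; the helper `poisson_cdf` is illustrative.

```python
from math import exp, factorial, sqrt


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


rate_per_minute = 2 / 5  # one clerk: 2 customers per 5 minutes

# Part a): one clerk, 5 minutes -> mean 2; P(X >= 3) = 1 - P(X <= 2).
p_a = 1 - poisson_cdf(2, rate_per_minute * 5)

# Part b): one clerk, 2 minutes -> mean 0.8; P(X <= 1).
p_b = poisson_cdf(1, rate_per_minute * 2)

# Part c): 3 clerks, 3 minutes -> mean 3.6; P(X >= 6) = 1 - P(X <= 5).
p_c = 1 - poisson_cdf(5, 3 * rate_per_minute * 3)

# Part d): 6 clerks, 10 minutes -> mean 24; for a Poisson count the
# standard deviation is the square root of the mean.
mean_d = 6 * rate_per_minute * 10
upper_d = mean_d + 3 * sqrt(mean_d)

print(f"a) {p_a:.4f}  b) {p_b:.4f}  c) {p_c:.4f}  d) {upper_d:.1f}")
```

Rescaling the mean for each part, rather than reusing the 5-minute mean, is the step most often missed in this kind of problem.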
Question 10
A survey of two cities was taken to determine if there is any significant difference in the percentage of people who recycle milk jugs. These were the results:
| | City A | City B |
| # surveyed | 400 | 500 |
| # recycle milk jugs | 102 | 93 |
The test was conducted at a 5% level of significance.
a) State the p-value and conclusion.
b) Based on the 95% confidence interval, what is the largest difference between the two cities in the percentage of people recycling milk jugs?
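A two-proportion z-test is one standard way to approach Question 10. The sketch below uses the pooled standard error for the test and the unpooled standard error for the confidence interval, which is a common textbook convention, though the course may use a different one.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the Question 10 table.
n_a, x_a = 400, 102
n_b, x_b = 500, 93

p_a, p_b = x_a / n_a, x_b / n_b

# Pooled proportion under H0: the two city proportions are equal.
p_pool = (x_a + x_b) / (n_a + n_b)
se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_a - p_b) / se_pooled

# Two-sided p-value, since the question asks about "any" difference.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# 95% confidence interval for p_a - p_b, using the unpooled standard error.
se_unpooled = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
z_crit = NormalDist().inv_cdf(0.975)
ci = (p_a - p_b - z_crit * se_unpooled, p_a - p_b + z_crit * se_unpooled)

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print(f"95% CI for the difference: ({ci[0]:.4f}, {ci[1]:.4f})")
```

The upper limit of the interval corresponds to the largest plausible difference asked for in part b).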
Question 11
A courier company is concerned that with increased construction in the city there will be more variability in its delivery times. Based on its records, the times have been normally distributed with a mean of 24 minutes and standard deviation of 3.5 minutes. For a sample of 10 deliveries, these were the results:
| 28 | 24 | 31 | 36 | 36 |
| 39 | 19 | 29 | 35 | 25 |
a) State the null and alternative hypotheses.
b) State the value of the test statistic and the conclusion.
c) If a level of significance had not been chosen, why would the same conclusion be reached?
Question 12
When the courier company has 3 drivers working, customers wait 2 hours on average for a delivery.
a) If there are 3 drivers working, what is the probability a customer waits more than 3 hours for a delivery?
b) If there are 5 drivers working, what is the probability a customer waits less than 1.5 hours for a delivery?
c) If there are 2 drivers working, what is the probability a customer waits more than 5 hours for a delivery if the customer has been waiting 3 hours and 45 minutes already?
Question 13
Two focus groups were held, one aged 18-24, the other 45-54, to see if the average cell phone bill was higher for the 18-24 group. These were the results:
| 18-24 group | 78.17 | 56.87 | 79.65 | 80.29 | 83.26 | 89.99 | 60.75 | 74.81 |
| 45-54 group | 51.82 | 68.59 | 54.72 | 62.47 | 64.43 | 61.99 | 51.60 | 68.05 |
The test was conducted at a 5% level of significance.
a) State the null and alternative hypotheses.
b) State the value of the test statistic and the conclusion.
c) Based on the 95% confidence interval, what is the largest difference in the average cell phone bill between the two age groups? Round to the nearest cent.
Question 14
Suppose cell phone bills are normally distributed with a mean of $75 and standard deviation of $11.25.
a) What is the probability a cell phone bill is less than $60?
b) What is the probability a cell phone bill is more than $80?
c) What is the probability a cell phone bill is between $70 and $85?
d) What is the cutoff for a cell phone bill to place in the top 1%? Round to the nearest cent.
e) For a sample of 25 cell phone bills, what is the probability their average is less than $80?
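The normal-distribution calculations in Question 14 can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a minimal sketch; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Individual bills: normal with mean $75 and standard deviation $11.25.
bills = NormalDist(mu=75, sigma=11.25)

p_below_60 = bills.cdf(60)                  # part a)
p_above_80 = 1 - bills.cdf(80)              # part b)
p_70_to_85 = bills.cdf(85) - bills.cdf(70)  # part c)
top_1_cutoff = bills.inv_cdf(0.99)          # part d)

# Part e): the mean of n = 25 bills has standard error sigma / sqrt(n).
sample_mean = NormalDist(mu=75, sigma=11.25 / sqrt(25))
p_mean_below_80 = sample_mean.cdf(80)

print(f"a) {p_below_60:.4f}  b) {p_above_80:.4f}  c) {p_70_to_85:.4f}")
print(f"d) ${top_1_cutoff:.2f}  e) {p_mean_below_80:.4f}")
```

Part e) differs from the other parts only in the standard deviation used: the sampling distribution of the mean is narrower than the distribution of individual bills.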
Question 15
Two focus groups were held, one of executives, the other of regular office workers to see if there was any significant difference in how they rated a new PDA on a scale from 1 to 10, 1 = poor, 10 = excellent. These were the results:
| Executives | 4 | 5 | 7 | 8 | 5 | 8 | 5 | 5 | 6 | 6 |
| Regular | 7 | 7 | 7 | 7 | 9 | 6 | 8 | 7 | 6 | 8 |
a) What is the appropriate test for this data? State the reasons.
b) State the test statistic and conclusion.
Question 16
Based on historical records, 2.4% of households have an annual household income of $100,000 or more and 1.8% of households watch Bravo at least once a week. 96.4% of households have an annual household income below $100,000 and watch Bravo less than once a week.
a) A theatre company will send literature to households that either have annual household income of at least $100,000 or watch Bravo at least once a week. What percentage of households will it send literature to?
b) What percentage of households with annual household income of $100,000 or more watch Bravo at least once a week?
c) What percentage of households that watch Bravo less than once a week have annual household income below $100,000? Round the percentage to 2 decimals.
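Question 16 can be worked with the complement rule and inclusion-exclusion. The sketch below spells out the arithmetic in Python; the variable names are illustrative, and all inputs are the percentages given in the question.

```python
# Percentages from the Question 16 statement.
p_income = 2.4    # income of $100,000 or more
p_bravo = 1.8     # watch Bravo at least once a week
p_neither = 96.4  # income below $100,000 and watch less than once a week

# Part a): "either" is the complement of "neither".
p_either = 100 - p_neither

# Inclusion-exclusion gives the overlap: P(A and B) = P(A) + P(B) - P(A or B).
p_both = p_income + p_bravo - p_either

# Part b): share of high-income households that watch at least once a week.
p_bravo_given_income = 100 * p_both / p_income

# Part c): among households that watch less than once a week,
# the share with income below $100,000.
p_low_income_given_not_bravo = 100 * p_neither / (100 - p_bravo)

print(f"a) {p_either:.1f}%  b) {p_bravo_given_income:.1f}%  "
      f"c) {p_low_income_given_not_bravo:.2f}%")
```

Each conditional percentage divides a joint percentage by the percentage of the conditioning group, so the denominator changes between parts b) and c).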