MGMT2263

Worksheet #6

 

An educational researcher wanted to see which factors influence school grades and examined the average number of hours students studied for a test, the average number of hours of sleep per night, gender, and household income level. For the regression, gender is coded male = 0 and female = 1, and income is coded low = 0 and medium/high = 1. Twenty subjects were randomly chosen. These were the results:

grade   study   sleep   gender   income
   71       4       6        0        0
   75       3       9        0        0
   61       6       7        1        0
   63       8       6        1        1
   63       6      10        0        0
   58       7       7        1        0
   60       5       9        1        0
   90      10       7        0        0
   93       9      10        1        0
   83      11       8        0        1
   73       9       5        0        1
   75      12       6        0        0
   87      12       5        0        1
   88      12       4        1        0
   90      14       6        0        0
   47       4       4        1        0
   98      11       8        1        1
   96      11      10        1        0
   64       7       7        1        1
   45       5       6        1        1

1)      Construct a model with grade as the dependent variable and the other variables as the independent variables. State what the model is (grade = 24.3923 + 3.8911*study + 2.9052*sleep – 3.8384*gender – 2.6087*income)

2)      What percentage of the variation in grade is explained by the model? (70.16%)

3)      Is the model significant? (F = 8.8164; yes, the model is significant)
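The F statistic in question 3 can be cross-checked from the R² in question 2, assuming the usual overall-significance relation F = (R²/k) / ((1 - R²)/(n - k - 1)) with n = 20 observations and k = 4 predictors. The sketch below uses the rounded R² of 0.7016, so it matches the reported F only to about two decimals. The 5% critical value of about 3.06 for (4, 15) degrees of freedom comes from a standard F table and is an assumption, not part of the worksheet.

```python
def f_from_r2(r2, n, k):
    """Overall F statistic for a regression with k predictors and n observations."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

f_stat = f_from_r2(0.7016, n=20, k=4)
F_CRIT_5PCT = 3.06  # F table value for (4, 15) df; assumption, not from the worksheet
print(round(f_stat, 2), f_stat > F_CRIT_5PCT)
```

Because the statistic is well above the critical value, the conclusion that the model is significant holds.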

4)      Which individual variables are significant? (study and sleep; these are the only variables for which the p-value is less than 5%)

5)      Are there any collinearity problems among the independent variables? (no – all the VIF values are less than 10)

6)      Construct a new model using only study and sleep as the independent variables. State what the model is. (grade = 20.0198 + 3.9576*study + 3.0188*sleep)
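As a cross-check on question 6, here is a minimal sketch that fits the two-predictor model by ordinary least squares using only the standard library. It solves the centered normal equations directly, which works for exactly two predictors; the function name `fit_two_predictors` is hypothetical, and a statistics package would normally be used instead.

```python
# (grade, study, sleep) rows copied from the data table
rows = [
    (71, 4, 6), (75, 3, 9), (61, 6, 7), (63, 8, 6), (63, 6, 10),
    (58, 7, 7), (60, 5, 9), (90, 10, 7), (93, 9, 10), (83, 11, 8),
    (73, 9, 5), (75, 12, 6), (87, 12, 5), (88, 12, 4), (90, 14, 6),
    (47, 4, 4), (98, 11, 8), (96, 11, 10), (64, 7, 7), (45, 5, 6),
]

def fit_two_predictors(rows):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via centered normal equations."""
    n = len(rows)
    my = sum(r[0] for r in rows) / n
    m1 = sum(r[1] for r in rows) / n
    m2 = sum(r[2] for r in rows) / n
    # Centered sums of squares and cross-products
    s11 = sum((r[1] - m1) ** 2 for r in rows)
    s22 = sum((r[2] - m2) ** 2 for r in rows)
    s12 = sum((r[1] - m1) * (r[2] - m2) for r in rows)
    s1y = sum((r[1] - m1) * (r[0] - my) for r in rows)
    s2y = sum((r[2] - m2) * (r[0] - my) for r in rows)
    # Solve the 2x2 system for the slopes with Cramer's rule
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

b0, b1, b2 = fit_two_predictors(rows)
print(f"grade = {b0:.4f} + {b1:.4f}*study + {b2:.4f}*sleep")
```

The printed coefficients should agree with the model stated in the answer to four decimal places.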

7)      If someone studies for 10 hours on average and sleeps for 7 hours, what would you expect the person’s grade to be, rounding to the nearest whole number? (81)

8)      Based on 10 hours of study and 7 hours of sleep, in what grade range would you expect the average grade to fall 95% of the time, rounding to the nearest whole number? (76 to 86)

9)      For an individual who studies for 10 hours and sleeps for 7 hours, in what grade range would you expect the person’s grade to fall 95% of the time? (60 to 100 since the mark cannot be above 100)
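Questions 7 to 9 can be reproduced with the sketch below, which refits the study/sleep model and then computes the point prediction, the 95% confidence interval for the mean grade, and the 95% prediction interval for an individual. It assumes the standard textbook formulas (leverage from the centered cross-product matrix, residual standard error on n - 3 degrees of freedom). The t critical value of 2.110 for 17 degrees of freedom is taken from a t table and is not part of the worksheet.

```python
import math

# (grade, study, sleep) rows copied from the data table
rows = [
    (71, 4, 6), (75, 3, 9), (61, 6, 7), (63, 8, 6), (63, 6, 10),
    (58, 7, 7), (60, 5, 9), (90, 10, 7), (93, 9, 10), (83, 11, 8),
    (73, 9, 5), (75, 12, 6), (87, 12, 5), (88, 12, 4), (90, 14, 6),
    (47, 4, 4), (98, 11, 8), (96, 11, 10), (64, 7, 7), (45, 5, 6),
]
n = len(rows)
mean_y = sum(r[0] for r in rows) / n
mean_1 = sum(r[1] for r in rows) / n
mean_2 = sum(r[2] for r in rows) / n

# Centered sums of squares and cross-products
s11 = sum((r[1] - mean_1) ** 2 for r in rows)
s22 = sum((r[2] - mean_2) ** 2 for r in rows)
s12 = sum((r[1] - mean_1) * (r[2] - mean_2) for r in rows)
s1y = sum((r[1] - mean_1) * (r[0] - mean_y) for r in rows)
s2y = sum((r[2] - mean_2) * (r[0] - mean_y) for r in rows)
syy = sum((r[0] - mean_y) ** 2 for r in rows)

# Least-squares coefficients for grade = b0 + b1*study + b2*sleep
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean_y - b1 * mean_1 - b2 * mean_2

# Residual standard error with n - 3 degrees of freedom (intercept + 2 slopes)
sse = syy - b1 * s1y - b2 * s2y
s_resid = math.sqrt(sse / (n - 3))

study_new, sleep_new = 10, 7
fit = b0 + b1 * study_new + b2 * sleep_new
d1, d2 = study_new - mean_1, sleep_new - mean_2
leverage = 1 / n + (s22 * d1 ** 2 - 2 * s12 * d1 * d2 + s11 * d2 ** 2) / det

T_CRIT = 2.110  # t table, 17 df, 95% two-sided; assumption, not from the worksheet
half_mean = T_CRIT * s_resid * math.sqrt(leverage)
half_indiv = T_CRIT * s_resid * math.sqrt(1 + leverage)
ci_mean = (fit - half_mean, fit + half_mean)
pi_indiv = (fit - half_indiv, min(100.0, fit + half_indiv))  # marks cannot exceed 100
print(round(fit), [round(v) for v in ci_mean], [round(v) for v in pi_indiv])
```

The prediction interval is much wider than the confidence interval because it adds the scatter of an individual grade around the mean to the uncertainty in the mean itself.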

10)  Using the criteria of adjusted R², ANOVA p-value and t-test p-values (using a 5% level of significance), which of the two models is better? (Model #2)
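The adjusted R² part of this comparison can be sketched as follows, assuming the usual formula 1 - (1 - R²)(n - 1)/(n - k - 1). The R² of 0.7016 for the first model comes from question 2. The value of about 0.6802 for the second model is not stated in the worksheet; it is what a least-squares fit of grade on study and sleep gives for the table above, so treat it as an assumption.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k predictors fitted to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

adj_model_1 = adjusted_r2(0.7016, n=20, k=4)  # R-squared from question 2
adj_model_2 = adjusted_r2(0.6802, n=20, k=2)  # assumed R-squared, computed from the data
print(round(adj_model_1, 4), round(adj_model_2, 4))
```

Dropping gender and income lowers R² slightly but raises adjusted R², which is consistent with choosing Model #2.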

 

A family resource centre wanted to see if a client’s age group depends on their gender. These were the results for one particular day:

 

          Child   Adolescent   Adult   Total
Male         24            9      12      45
Female       30           15      20      65
Total        54           24      32     110

11)  Test the hypothesis at 5%. (test stat = 0.548; conclude age does not depend on gender)

12)  To what degree does a person’s age depend on their gender? (7.06%)
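The test statistic in question 11 is a chi-square statistic for independence, and the 7.06% in question 12 is consistent with Cramér's V, so the sketch below assumes that is the intended measure. The critical value of 5.991 (2 degrees of freedom, 5% level) is taken from a chi-square table and is not part of the worksheet.

```python
import math

# Rows: Male, Female; columns: Child, Adolescent, Adult
observed = [[24, 9, 12],
            [30, 15, 20]]

def chi_square(table):
    """Return (statistic, degrees of freedom, sample size) for a contingency table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, n

stat, df, n = chi_square(observed)
CHI2_CRIT = 5.991  # chi-square table, df = 2, 5% level; assumption
# Cramér's V: sqrt(chi2 / (n * (min(rows, cols) - 1)))
cramers_v = math.sqrt(stat / (n * (min(len(observed), len(observed[0])) - 1)))
print(round(stat, 3), df, stat > CHI2_CRIT, round(100 * cramers_v, 2))
```

The statistic falls well below the critical value, so the hypothesis of independence is not rejected.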

 

A non-profit organization wanted to see if the number of volunteer hours depended on a person’s work status. These were the results:

Contingency Table

                       Hours group
Group           < 10   10 to 19    20+   Grand Total
Working           10          4      0            14
Semi-retired      13         23      2            38
Retired           15         30      8            53
Grand Total       38         57     10           105

13)  Which categories need to be collapsed? (10 to 19 hours and 20+ hours)

14)  After collapsing categories, test the hypothesis at 5%. (test stat = 9.019; conclude the number of volunteer hours depends on a person’s work status)

15)  To what degree does the number of hours depend on a person’s work status? (29.31%)
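Questions 13 to 15 can be worked through with the sketch below. It assumes the common rule of thumb that every expected count should be at least 5, flags the cells that fail it, merges the 10 to 19 and 20+ columns, and then computes the chi-square statistic and Cramér's V on the collapsed table. Working from unrounded expected counts gives about 9.02, which agrees with the reported 9.019 to two decimals.

```python
import math

# Rows: Working, Semi-retired, Retired; columns: < 10, 10 to 19, 20+
observed = [[10, 4, 0],
            [13, 23, 2],
            [15, 30, 8]]

def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    return [[r * c / n for c in col_tot] for r in row_tot]

def chi_square(table):
    """Chi-square statistic for independence."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

# Cells (row, column) whose expected count is below 5 in the original table
small_cells = [(i, j) for i, row in enumerate(expected_counts(observed))
               for j, e in enumerate(row) if e < 5]

# Merge the last two columns into a single "10+" category
collapsed = [[row[0], row[1] + row[2]] for row in observed]
stat = chi_square(collapsed)
n = sum(sum(row) for row in collapsed)
cramers_v = math.sqrt(stat / (n * (min(len(collapsed), len(collapsed[0])) - 1)))
print(small_cells, round(stat, 2), round(100 * cramers_v, 2))
```

Both flagged cells sit in the 20+ column, which is why that column is merged with its neighbour; after collapsing, the smallest expected count is just above 5.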

 

In a survey at a mall during August, people were asked how much they spent that day. For a sample of 8 people these were the results:

50, 75, 100, 120, 140, 150, 240, 1350

16)  Is the data normally distributed? Test at a 5% level of significance. (test stat = 0.4109; data is not normally distributed)

17)  Clearly 1350 is an outlier. If we remove this value, show why the remaining data is normally distributed.