Bivariate Statistics

Question 1

[Maximum mark: 10]



The table below shows the minutes played by a basketball player and the points he scored.

Minutes played   25   18   29   31   35
Points scored    20   13   21   36   30

a) Which variable do you think will be independent, and which one will be dependent? Why?


b) Draw a scatterplot representing the data above.


c) How would you describe the association?


d) Find the equation for the best-fit line.


e) Hence, what score would you predict for a player who played for 16 minutes?
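
As a way to check a hand or GDC calculation for parts (d) and (e), the least-squares line can be computed directly from the table. The sketch below assumes minutes played is the independent variable \( x \) and points scored is the dependent variable \( y \); the function name `least_squares_line` is illustrative and not part of the question.

```python
def least_squares_line(xs, ys):
    """Return slope m and intercept b of the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    m = s_xy / s_xx
    b = mean_y - m * mean_x  # the line passes through the mean point
    return m, b


minutes = [25, 18, 29, 31, 35]
points = [20, 13, 21, 36, 30]

m, b = least_squares_line(minutes, points)
predicted_16 = m * 16 + b

print(f"y = {m:.3f}x + {b:.3f}")
print(f"Predicted points for 16 minutes: {predicted_16:.2f}")
```

Note that 16 minutes lies below the smallest observed value (18 minutes), so this prediction is an extrapolation and should be treated with caution.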


Question 2

[Maximum mark: 6]



The table below shows, for several students, how many push-ups and how many sit-ups each could do prior to the test.


Push-ups Sit-ups
1 5
3 20
5 30
8 35
30 120
56 200

a) Plot the data in the form of a scatterplot.


b) Find the mean number of push-ups and the mean number of sit-ups.


c) Hence, draw the line of best fit for the data by eye.


Question 3

Which correlation coefficient should be used when the data contain outliers?

  • A) Pearson's correlation coefficient
  • B) Spearman's correlation coefficient
  • C) None of them
  • D) Both are equally good


Question 4

[Maximum mark: 11]



The table below shows the months of experience and weekly salaries of a company's employees.


Round all your answers to three decimal places.

Experience (months)   10    12    18    15    21    24     35
Weekly salary ($)     600   650   600   800   950   1200   1500

a) Calculate Pearson's correlation coefficient.


b) Rank all observations and calculate Spearman's correlation coefficient.


c) The regression equation is in the form \( y = mx + a \). Find the values of \( m \) and \( a \).


d) Predict the weekly salary for an employee with 25 months of experience.


e) The actual salary of that employee is $950. Calculate the percentage error.
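
Parts (a) and (b) can be cross-checked with a short script. The sketch below computes Pearson's coefficient from its definition and obtains Spearman's coefficient by applying the same formula to the ranks, with tied values sharing the average of their ranks (two salaries are tied at $600). The helper names `pearson` and `average_ranks` are illustrative. Because of the tie, the shortcut formula based on squared rank differences can give a slightly different value.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


def average_ranks(values):
    """Rank from smallest (1) to largest; tied values share their mean rank."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # lowest rank occupied by this value
        count = ordered.count(v)       # number of observations tied at v
        ranks.append(first + (count - 1) / 2)
    return ranks


experience = [10, 12, 18, 15, 21, 24, 35]
salary = [600, 650, 600, 800, 950, 1200, 1500]

r = pearson(experience, salary)
r_s = pearson(average_ranks(experience), average_ranks(salary))

print(f"Pearson r    = {r:.3f}")
print(f"Spearman r_s = {r_s:.3f}")
```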


Question 5

Which of the following is NOT a problem when working with linear regression models?

  • A) Extrapolation
  • B) Low correlation
  • C) Large values of Y compared to X
  • D) Predicting X using Y


Question 6

[Maximum mark: 10]



In the table below you can find the distance covered by Jack in his car (in km) and the temperature outside (in degrees Celsius).


Round all your answers to two decimal places.

Distance (km)      220   200   140   145   100   90   100
Temperature (°C)   15    19    22    16    21    25   28

a) Calculate Pearson's correlation coefficient.


b) Calculate the mean values of \( x \) and \( y \), where \( x \) is the temperature and \( y \) is the distance.


c) On the 15th day of the trip, the temperature outside was 21 degrees. Apart from traveling by car, Jack decided to cover an additional 15 km on foot. Estimate the total distance covered by Jack using the regression equation.


d) The total distance covered by Jack on the 15th day was 175 km. Calculate the percentage error.
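
The reasoning in parts (c) and (d) can be sketched as follows, assuming temperature is the independent variable \( x \) and distance by car is the dependent variable \( y \). The regression line only models the distance driven, so the 15 km on foot is added after the prediction. The helper names are illustrative.

```python
def least_squares_line(xs, ys):
    """Return slope and intercept of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


def percentage_error(estimate, actual):
    """Absolute difference between estimate and actual, as a percentage of actual."""
    return abs(estimate - actual) / abs(actual) * 100


temperature = [15, 19, 22, 16, 21, 25, 28]
distance = [220, 200, 140, 145, 100, 90, 100]

slope, intercept = least_squares_line(temperature, distance)
car_estimate = intercept + slope * 21   # predicted distance by car at 21 degrees
total_estimate = car_estimate + 15      # add the distance covered on foot
error = percentage_error(total_estimate, 175)

print(f"Estimated total distance: {total_estimate:.2f} km")
print(f"Percentage error: {error:.2f}%")
```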


Question 7

[Maximum mark: 14]



The following table shows the heights of athletes (in meters) and the number of push-ups each could do.


Person   Height (m)   Number of push-ups
1        1.50         20
2        1.89         48
3        1.75         35
4        1.64         30
5        1.87         44
6        1.96         55
7        1.51         22
8        1.60         27

a) What is the range of push-ups?


b) For the data:

i. Determine Pearson's correlation coefficient, \( r \).

ii. Describe the relationship between the two variables.


c) Use your GDC to find the line of regression, in the form \( y = mx + b \).


d) Another athlete, who is 1.78 meters tall, wants to be included in the test.

i. What is the expected number of push-ups he will do?

ii. He actually did 45 push-ups. What is the percentage error in our predicted value?


Question 8

[Maximum mark: 14]



The data below show the number of weeks students spent preparing for an SAT exam and the scores they obtained.


Round all your answers to two decimal places.

Studying time (weeks)   22     24     23     23     35     37     42
Score                   1400   1430   1450   1350   1500   1520   1540

a) Calculate Pearson's correlation coefficient.


b) Calculate the regression equation and give your answer in the form \( y = a + bx \).


c) Interpret the value of \( a \).


d) What score should be expected by a student who studied for 40 weeks?


e) Would it be reasonable to make predictions for a student who studied for 3 weeks?


Question 9

[Maximum mark: 18]



A marketing company is conducting an experiment to check how advertising spending affects the revenue obtained by its clients. After running the experiment for 30 days, it obtained the following results:


Round all your answers to two decimal places.

Advertising money spent ($)       520    511    356    679    823    765    1100
Sales revenue ($)                 2310   2600   2246   3129   3840   3561   4129
Rank of advertising money spent
Rank of sales revenue

a) Find:

i. Mean advertising money spent \( (\overline{x}) \).

ii. Mean sales revenue \( (\overline{y}) \).


b) Calculate Pearson's correlation coefficient.


c) In the missing rows of the table fill in the ranks of each variable and calculate the Spearman correlation coefficient.


The company wants to estimate by how much each additional dollar spent on advertising affects the sales revenue.


d) Calculate the regression equation, giving your answer in the form \( y = a + bx \).


e) Hence, find how much each additional dollar spent on advertising increases the sales revenue.


One of their clients spent $850 on advertising and obtained a revenue of $3615.82.


f) Show that this point lies on the regression line.


g) Estimate the sales revenue of a company that spent $930 on advertising.


Question 10

[Maximum mark: 12]



A 100 m race took place at a local sports festival. The age of each participant and their respective time are shown in the table below.


Round all your answers to two decimal places.

Age (years)      18     21     19     25     33     41     28     38
Time (seconds)   11.3   12.1   12.0   14.5   15.7   16.1   14.3   15.6

a) Calculate Pearson's correlation coefficient.


b) Calculate the regression equation of \( y \) on \( x \), giving your answer in the form \( y = a + bx \).


c) Calculate the regression equation of \( x \) on \( y \), giving your answer in the form \( x = a + by \).


The latest participant finished the race in 13.7 seconds.


d) What age would we expect this participant to be?
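
The difference between the two regression lines in parts (b) and (c) can be illustrated with a short script. The sketch below assumes \( x \) is age and \( y \) is time, and uses a hypothetical helper `regression(xs, ys)` that returns the coefficients of the line predicting `ys` from `xs`. Since part (d) predicts age from time, it uses the line of \( x \) on \( y \) rather than rearranging the line of \( y \) on \( x \).

```python
def regression(xs, ys):
    """Return (a, b) for the least-squares line ys = a + b*xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b


age = [18, 21, 19, 25, 33, 41, 28, 38]
time = [11.3, 12.1, 12.0, 14.5, 15.7, 16.1, 14.3, 15.6]

a_yx, b_yx = regression(age, time)   # y on x: time predicted from age
a_xy, b_xy = regression(time, age)   # x on y: age predicted from time

expected_age = a_xy + b_xy * 13.7

print(f"y on x: y = {a_yx:.2f} + {b_yx:.2f}x")
print(f"x on y: x = {a_xy:.2f} + {b_xy:.2f}y")
print(f"Expected age for a time of 13.7 s: {expected_age:.2f}")
```

The two lines are generally different, but both pass through the mean point \( (\overline{x}, \overline{y}) \), and the product of their slopes equals \( r^2 \).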


Question 11

[Maximum mark: 12]



The dataset below shows the exam scores of boys and girls in a Math test, as well as their respective study times.


Round all your answers to two decimal places.

Studying time in hours (boys)    2    1    3    5    8    6    10
Score (boys)                     60   66   72   80   85   71   92
Studying time in hours (girls)   3    3    4    2    5    7    6
Score (girls)                    65   55   70   60   76   81   81

a) Assume the regression equation for girls is in the form \( y = ax + b \) and for the boys in the form \( y = cx + d \). Find the values of \( a, b, c, d \).


b) By analyzing both regression equations, who do you think benefits more from an additional hour of studying?


c) Find the intersection point of these two regression equations.


Question 12

[Maximum mark: 7]



Bob ran laps around a field and recorded his heart rate after each lap. His results are shown below.


Lap Heart Rate
1 80
2 85
3 95
4 100
5 120
6 140
7 140
8 140
9 140
10 140

a) Find a linear model that fits the data, taking the natural log of the heart rates.


b) Hence find the exponential model connecting the two original variables.


c) Using this newly derived model, predict Bob's heart rate after 20 laps. Suggest why this might be inaccurate.
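
The linearization in parts (a) to (c) can be sketched as below. The script fits a straight line to the natural log of the heart rate, \( \ln H = kL + c \), and converts it back to the exponential model \( H = A e^{kL} \) with \( A = e^{c} \). The symbols \( H \), \( L \), \( k \), \( c \) and \( A \) and the helper names are assumptions made for illustration.

```python
from math import exp, log


def least_squares_line(xs, ys):
    """Return slope and intercept of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


laps = list(range(1, 11))
heart_rate = [80, 85, 95, 100, 120, 140, 140, 140, 140, 140]

# Linear model on the transformed data: ln(H) = k*L + c
log_hr = [log(h) for h in heart_rate]
k, c = least_squares_line(laps, log_hr)

# Back-transform to the exponential model: H = A * e^(k*L)
A = exp(c)


def predict(lap):
    return A * exp(k * lap)


prediction_20 = predict(20)

print(f"ln(H) = {k:.4f}L + {c:.4f}")
print(f"H = {A:.2f} * e^({k:.4f}L)")
print(f"Predicted heart rate after 20 laps: {prediction_20:.0f}")
```

The observed heart rate levels off at 140 from lap 6 onward, while an exponential model keeps growing, so a prediction at 20 laps lies far outside the data and well above any observed value.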


Question 13

[Maximum mark: 7]



The income of a food store in a small town, in hundreds of dollars, is given in the table below for every second month of the year.


Month   Income (hundreds of $)
2       99
4       420
6       850
8       1500
10      2600
12      3000

a) Plot the given data on a scatter plot diagram.

b) By analyzing the graph obtained in part (a):

i. Identify the possible outlier, and suggest why it could be an outlier.

ii. Suggest why graphing the months versus the natural log of the income, excluding the outlier, would produce a straight-line graph.


c) Linearize the data and hence find a model that predicts the monthly income of the store.
