a) This is a cumulative frequency graph, so the total number of customers is the last value on the y-axis, which is 200.
b) Since the limit is $10, it can be seen from the graph that this value corresponds to 50 people. Thus, since we know from the previous part that there were 200 customers, the answer is \( \frac{50}{200} \) so 25%.
c) The highest 10% of customers are the 20 top-spending customers (10% of 200 is 20). Therefore, the minimum amount needed to qualify is the amount spent by the 180th customer. From the graph, we can see that this value is close to $90.
a) This is a standard mean calculation: we sum all the salaries and divide by 7, so the answer is 110642.86.
b) After ordering the salaries in ascending order (there are 7 observations in total), it can be clearly seen that the middle (4th) observation is 111000.
c) Since the previous mean was 110642.86, adding a new observation higher than the mean will lead to its increase.
a) From the graph, 40km/h corresponds to a cumulative frequency of 60, and 60km/h corresponds to a cumulative frequency of 100. So, \( 100 - 60 = 40 \) drivers drove between 40km/h and 60km/h.
b) (i) Since there were 150 observations, the median is at position \( \frac{150 + 1}{2} = 75.5 \), so between the 75th and 76th observations. From the graph we can see that it is around 45km/h.
b) (ii) Q3 is at position \( \frac{3}{4} \times (n+1) = 113.25 \), so between the 113th and 114th observations. From the graph we can see that it is around 65km/h.
b) (iii) Similarly, Q1 is at position \( \frac{1}{4} \times (n+1) = 37.75 \), so between the 37th and 38th observations, which gives a speed of around 33km/h.
c) Based on the answers from part (b), the IQR is equal to \( 65 - 33 = 32 \).
d) We can see that the speed of 90km/h occurs for the cumulative frequency of 135, so \( p = 150 - 135 = 15 \).
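As a quick check of the positions used in part (b), the following Python sketch applies the \( (n+1) \) rule to \( n = 150 \). The variable names are illustrative and not part of the original solution; the speeds themselves still have to be read off the graph.

```python
# Quartile positions for n observations using the (n + 1) rule from part (b).
n = 150

q1_pos = (n + 1) / 4        # between the 37th and 38th observation
median_pos = (n + 1) / 2    # between the 75th and 76th observation
q3_pos = 3 * (n + 1) / 4    # between the 113th and 114th observation

print(q1_pos, median_pos, q3_pos)
```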
a) Mean and median can be easily obtained in the GDC and they are equal to:
\[ \text{mean} = 78.4 \]
\[ \text{median} = 82 \]
The only repeating value is 68, so this is our mode.
b) The boxplot is shown below including all required values.
c) Using GDC the variance can be quickly calculated:
\[ \sigma = 10.13622... \]
\[ \text{variance} = \sigma^2 \]
\[ \text{variance} \approx 103 \]
a) To find the value of \( x \) we need to work backwards from what we know about the mean. Normally, to calculate the mean we first have to sum up all the values.
Therefore, we know that the value of 0 happened 7 times, the value of 1 happened 11 times, the value of 2 happened 18 times, and so on. This can be written as: 0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,2... Then, all these values have to be summed up and divided by the total number of observations:
\[ \text{mean} = \frac{0 \times 7 + 1 \times 11 + 2 \times 18 + 3 \times x + 4 \times 3 + 5 \times 1}{7+11+18+x+3+1} \]
\[ \text{mean} = \frac{3x+64}{x+40} \]
Now, knowing that the mean is equal to 2:
\[ 2 = \frac{3x+64}{x+40} \]
\[ 2x + 80 = 3x + 64 \]
\[ x = 16 \]
b) The modal value is the one which occurs most often. As we can see from the table, 2 is our modal value, since it occurred 18 times (more than \( x = 16 \)).
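The same steps can be sketched in Python. This is only an illustrative check, assuming the frequencies from the table; names such as `known` and `target_mean` are made up for this example.

```python
from fractions import Fraction

# Frequencies from the table; the count for 3 travels is the unknown x.
known = {0: 7, 1: 11, 2: 18, 4: 3, 5: 1}
unknown_value = 3
target_mean = 2

known_sum = sum(v * f for v, f in known.items())   # sum of value * frequency
known_count = sum(known.values())                  # number of known observations

# mean = (known_sum + 3x) / (known_count + x), rearranged into a linear equation for x
x = Fraction(target_mean * known_count - known_sum, unknown_value - target_mean)

# With x known, the mode is the value with the highest frequency.
freqs = {**known, unknown_value: int(x)}
mode = max(freqs, key=freqs.get)

print(x, mode)
```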
a) (i) To calculate the mean of grouped data we first have to obtain the midpoints of the classes:
Unemployment Rate (%) | Frequency | Midpoint |
---|---|---|
\( 0 \leq x < 2 \) | 2 | 1 |
\( 2 \leq x < 5 \) | 5 | 3.5 |
\( 5 \leq x < 9 \) | 8 | 7 |
\( 9 \leq x < 11 \) | 13 | 10 |
\( 11 \leq x < 15 \) | 10 | 13 |
\( 15 \leq x < 20 \) | 4 | 17.5 |
Then, those midpoints are treated as single observations, so if we were to "ungroup" the data, we would have the value of 1 happening 2 times, value of 3.5 happening 5 times, and so on. So, it would look like this: \( 1,1,3.5,3.5,3.5,3.5,3.5,7,7... \). Then, all these values have to be summed up and divided by the total number of observations:
\[ \text{mean} = \frac{2 \times 1 + 5 \times 3.5 + 8 \times 7 + 13 \times 10 + 10 \times 13 + 4 \times 17.5}{2+5+8+13+10+4} \]
\[ \text{mean} = 9.65 \]
You can also directly input the midpoints and frequencies into the GDC 😉
a) (ii) To get the modal class we have to find the class with the highest frequency. The frequency of 13 is the highest, for the class \( 9 \leq x < 11 \). Hence, the mid-interval value of our modal class is 10.
b) The standard deviation is also calculated using midpoints. All values can be input into the GDC together with their frequencies, so we get:
\[ \text{standard deviation} = 4.24 \]
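For readers without a GDC to hand, the midpoint calculation can be reproduced with a short Python sketch. It assumes the population standard deviation (dividing by \( n \)), which is what matches the value quoted above; the variable names are illustrative.

```python
from math import sqrt

# (lower bound, upper bound, frequency) for each class in the unemployment table.
classes = [(0, 2, 2), (2, 5, 5), (5, 9, 8), (9, 11, 13), (11, 15, 10), (15, 20, 4)]

midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]
n = sum(freqs)

# Each midpoint is treated as a single value that occurs `f` times.
mean = sum(m * f for m, f in zip(midpoints, freqs)) / n

# Population variance (divide by n), an assumption chosen to match the quoted result.
variance = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs)) / n
sd = sqrt(variance)

print(round(mean, 2), round(sd, 2))
```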
a) To find \( a \), we need the total frequency to be 130:
\[ 2 + 4 + 8 + a + 21 + 20 + 37 + 28 + 0 + 1 = 130 \]
Thus, \( a = 9 \).
b)
The midpoints of each score interval are:
Score | Frequency | Midpoint |
---|---|---|
0 ≤ x < 10 | 2 | 5 |
10 ≤ x < 20 | 4 | 15 |
20 ≤ x < 30 | 8 | 25 |
30 ≤ x < 40 | 9 | 35 |
40 ≤ x < 50 | 21 | 45 |
50 ≤ x < 60 | 20 | 55 |
60 ≤ x < 70 | 37 | 65 |
70 ≤ x < 80 | 28 | 75 |
80 ≤ x < 90 | 0 | 85 |
90 ≤ x < 100 | 1 | 95 |
By plugging it into the GDC we get that:
\[ \text{mean} = 55.6 \]
c) To find the maximum and minimum scores which are not considered outliers, we need to calculate the interquartile range (IQR). First, we calculate the cumulative frequencies:
Score | Frequency | Cumulative Frequency |
---|---|---|
0 ≤ x < 10 | 2 | 2 |
10 ≤ x < 20 | 4 | 6 |
20 ≤ x < 30 | 8 | 14 |
30 ≤ x < 40 | 9 | 23 |
40 ≤ x < 50 | 21 | 44 |
50 ≤ x < 60 | 20 | 64 |
60 ≤ x < 70 | 37 | 101 |
70 ≤ x < 80 | 28 | 129 |
80 ≤ x < 90 | 0 | 129 |
90 ≤ x < 100 | 1 | 130 |
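\[ \text{Position of } Q_1 = \frac{1}{4} \times 130 = 32.5\text{th observation} \]
\[ \text{Position of } Q_3 = \frac{3}{4} \times 130 = 97.5\text{th observation} \]
From the cumulative frequencies, the 32.5th observation lies in the class 40 ≤ x < 50 and the 97.5th observation lies in the class 60 ≤ x < 70. Taking the midpoints of these classes:
\[ Q_1 \approx 45 \]
\[ Q_3 \approx 65 \]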
\[ IQR = Q_3 - Q_1 = 65 - 45 = 20 \]
\[ \text{Lower Boundary} = Q_1 - 1.5 \times IQR = 45 - 1.5 \times 20 = 15 \]
\[ \text{Upper Boundary} = Q_3 + 1.5 \times IQR = 65 + 1.5 \times 20 = 95 \]
The scores between 15 and 95 are not considered outliers.
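The cumulative-frequency lookup and the outlier boundaries can be sketched in Python as follows. This assumes, as the estimates above suggest, that each quartile is approximated by the midpoint of the class containing it; `class_midpoint_at` is a hypothetical helper written for this example.

```python
from itertools import accumulate

# (lower bound, upper bound, frequency) for each score class, using a = 9.
classes = [(0, 10, 2), (10, 20, 4), (20, 30, 8), (30, 40, 9), (40, 50, 21),
           (50, 60, 20), (60, 70, 37), (70, 80, 28), (80, 90, 0), (90, 100, 1)]

cumulative = list(accumulate(f for _, _, f in classes))
n = cumulative[-1]

def class_midpoint_at(position):
    """Midpoint of the first class whose cumulative frequency reaches `position`."""
    for (lo, hi, _), cum in zip(classes, cumulative):
        if cum >= position:
            return (lo + hi) / 2
    raise ValueError("position exceeds the total frequency")

q1 = class_midpoint_at(n / 4)       # class containing the 32.5th observation
q3 = class_midpoint_at(3 * n / 4)   # class containing the 97.5th observation
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

print(q1, q3, iqr, lower, upper)
```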
d)
There were 29 students who scored 70 points or more.
\[ \text{Percentage} = \left( \frac{29}{130} \right) \times 100 \approx 22.3\% \]
a)
\[ \text{Mean} = \frac{S}{N} \]
\[ S = 18 \times 9 + 19 \times 11 + 20 \times 13 + 21 \times 10 + 22 \times x + 23 \times 7 \]
\[ S = 162 + 209 + 260 + 210 + 22x + 161 = 1002 + 22x \]
\[ N = 9 + 11 + 13 + 10 + x + 7 = 50 + x \]
The mean age is given as 21:
\[ 21 = \frac{1002 + 22x}{50 + x} \]
\[ 21 \times (50 + x) = 1002 + 22x \]
\[ 1050 + 21x = 1002 + 22x \]
\[ 1050 - 1002 = 22x - 21x \]
\[ 48 = x \]
So, \( x = 48 \).
b)
Total number of students between ages 19 and 22:
\[ N_{19-22} = 11 + 13 + 10 + 48 = 82 \]
Total sum of ages for this range:
\[ S_{19-22} = 19 \times 11 + 20 \times 13 + 21 \times 10 + 22 \times 48 \]
\[ S_{19-22} = 209 + 260 + 210 + 1056 = 1735 \]
\[ \text{Average Age} = \frac{1735}{82} \approx 21.2 \]
c)
By inputting the values into the GDC we get that:
\[ \sigma \approx 1.48 \]
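All three parts can be checked with a short Python sketch. It assumes the age table as given and the population standard deviation (dividing by \( N \)); the names are illustrative only.

```python
from fractions import Fraction
from math import sqrt

ages = {18: 9, 19: 11, 20: 13, 21: 10, 23: 7}   # known frequencies
unknown_age, target_mean = 22, 21

s = sum(a * f for a, f in ages.items())   # sum of age * frequency without the unknown
n = sum(ages.values())                    # number of students without the unknown

# (a) 21 = (s + 22x) / (n + x), rearranged for x
x = Fraction(target_mean * n - s, unknown_age - target_mean)
ages[unknown_age] = int(x)

# (b) mean age of the students aged 19 to 22 inclusive
subset = {a: f for a, f in ages.items() if 19 <= a <= 22}
subset_mean = sum(a * f for a, f in subset.items()) / sum(subset.values())

# (c) population standard deviation of all ages (an assumption about the GDC setting)
total = sum(ages.values())
mean = sum(a * f for a, f in ages.items()) / total
sd = sqrt(sum(f * (a - mean) ** 2 for a, f in ages.items()) / total)

print(x, round(subset_mean, 1), round(sd, 2))
```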
a) (i) & (ii) There are 366 days in a leap year, so:
\[ 366 = 141 + 93 + b + 22 + a \]
\[ 366 = 141 + 93 + b + 22 + 10b \]
\[ b = 10 \]
\[ a = 1 \]
b) The standard deviation can be found by inputting the values into the GDC, using the midpoints:
\[ \sigma \approx 1.82 \]
c) In total there are 23 days with rainfall of at least 6mm, which means that the government will have to pay $30000.
a) We need to rearrange the data.
\[ 1 \ \ 3 \ \ 3 \ \ 4 \ \ 5 \ \ 5 \ \ 6 \ \ 7 \ \ 8 \ \ 10 \]
Since we have 10 values here, the middle value will be between the fifth and sixth values. Since they are both 5, the median is also 5.
b) The mode is the most frequently appearing value. Here, both 3 and 5 appear twice, so they are both the mode.
c) \(Q_1\) is the median of the lower half of the values, thus the median of:
\[ 1 \ \ 3 \ \ 3 \ \ 4 \ \ 5 \]
The median of this set is 3, thus \(Q_1 = 3\). Similarly, we can find \(Q_3 = 7\).
c) (i) \(IQR = Q_3 - Q_1 = 7 - 3 = 4\)
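The median, mode, and quartiles can be verified with Python's standard `statistics` module. The sketch below uses the median-of-halves method described above rather than `statistics.quantiles`, which interpolates and can give slightly different quartiles.

```python
from statistics import median, multimode

data = sorted([1, 3, 3, 4, 5, 5, 6, 7, 8, 10])
half = len(data) // 2

med = median(data)        # average of the 5th and 6th values
modes = multimode(data)   # every value that shares the highest count

# Quartiles as medians of the lower and upper halves of the ordered data.
q1 = median(data[:half])
q3 = median(data[-half:])
iqr = q3 - q1

print(med, modes, q1, q3, iqr)
```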
a) Discrete.
b) The formula for the mean of a set of values is:
\[\frac{1}{n}\sum_{i=1}^n f_i x_i\]
The total number of employees is:
\[ 19 + 15 + 10 + a + 3 + 2 = 49 + a \]
Then, we can calculate:
\[\sum_{i=1}^n f_i x_i = 1 \cdot 19 + 2 \cdot 15 + 3 \cdot 10 + 4a + 5 \cdot 3 + 6 \cdot 2 = 106 + 4a\]
Thus, the mean is:
\[\frac{106+4a}{49+a} = 2.33\]
\[ a = 5 \]
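Because 2.33 is a rounded mean, solving the equation gives a value close to, but not exactly, a whole number. The following Python sketch (with illustrative names) solves for \( a \), rounds to the nearest integer, and then checks that this integer reproduces the stated mean.

```python
coffees = {1: 19, 2: 15, 3: 10, 5: 3, 6: 2}   # known frequencies
unknown_value, rounded_mean = 4, 2.33

s = sum(v * f for v, f in coffees.items())   # sum of value * frequency without a
n = sum(coffees.values())                    # number of employees without a

# rounded_mean = (s + 4a) / (n + a), rearranged for a
a_estimate = (rounded_mean * n - s) / (unknown_value - rounded_mean)
a = round(a_estimate)

# Check that the integer value gives the stated mean to two decimal places.
check = (s + unknown_value * a) / (n + a)

print(a_estimate, a, round(check, 2))
```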
c) Random sampling.
a) The median is the 50% value, which is 76 here.
b) The leftmost end of the diagram represents the minimum mass, so 60kg.
c) The interquartile range (IQR) is calculated as follows:
\[ IQR = Q_3 - Q_1 = 79.5 - 72.25 = 7.25 \]
d) The IQR spans from \(Q_1\) to \(Q_3\), which accounts for 50% of the students. Since there are 10 students, 50% of that is 5 students.
e) A student is considered an outlier if their mass is outside the range:
\[ \text{Lower fence} = Q_1 - 1.5 \times IQR \]
\[ \text{Upper fence} = Q_3 + 1.5 \times IQR \]
Calculating the boundaries:
\[ Q_1 - 1.5 \times IQR = 72.25 - 1.5 \times 7.25 = 61.375 \]
\[ Q_3 + 1.5 \times IQR = 79.5 + 1.5 \times 7.25 = 90.375 \]
Since 92 is greater than 90.375, he is indeed an outlier.
a) (i) 3.33
a) (ii) 1.19
b) To find the median, we can either order the data or read it off the GDC. It is equal to 3.
c) We first need to find \(Q_1\) and \(Q_3\); their difference is the \( IQR \). \(Q_1\) is at position \(30 \cdot 0.25 = 7.5\), so we take the 8th person, and \(Q_3\) is at position \(30 \cdot 0.75 = 22.5\), so we take the 23rd person. They needed 3 and 4 hits, respectively.
\[ IQR = Q_3 - Q_1 = 4 - 3 = 1 \]
This is also shown by the GDC.
d) A data point is considered an outlier if it falls outside the range:
\[ Q_1 - 1.5 \times IQR = 3 - 1.5 = 1.5 \]
\[ Q_3 + 1.5 \times IQR = 4 + 1.5 = 5.5 \]
Thus, anyone outside \( (1.5, 5.5) \) hits is considered an outlier.
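Parts (a) to (d) can be reproduced by "ungrouping" the frequency table in Python. This is a sketch that assumes the hits table from the question and the population standard deviation; the quartile positions follow the 8th and 23rd person used in part (c).

```python
from statistics import mean, median, pstdev

hits = {1: 2, 2: 5, 3: 10, 4: 8, 5: 4, 6: 1}   # hits needed -> number of people

# Expand the table into an ordered list with one entry per person.
data = sorted(v for v, f in hits.items() for _ in range(f))

avg = mean(data)
sd = pstdev(data)    # population standard deviation (divide by n)
med = median(data)

q1 = data[8 - 1]     # 8th person, as in part (c)
q3 = data[23 - 1]    # 23rd person, as in part (c)
iqr = q3 - q1

lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outlier_values = sorted({v for v in data if v < lower or v > upper})

print(round(avg, 2), round(sd, 2), med, iqr, lower, upper, outlier_values)
```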
e) The probability is calculated as:
\[ \text{Probability} = \frac{8 + 4 + 1}{30} = 0.433 \]
f) This is a conditional probability question:
\[ \Pr(P_1 = 5, P_2 = 5 \mid P_1 \ge 4) = \frac{\Pr(P_1 = 5, P_2 = 5)}{\Pr(P_1 \ge 4)} \]
\[ = \frac{\frac{4}{30} \cdot \frac{3}{29}}{0.433} \approx 0.0318 \]
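The two probabilities can be checked with exact fractions in Python. The sketch follows the formula above as written, so it assumes two different people are chosen without replacement; the variable names are illustrative.

```python
from fractions import Fraction

hits = {1: 2, 2: 5, 3: 10, 4: 8, 5: 4, 6: 1}   # hits needed -> number of people
n = sum(hits.values())

# (e) probability that a randomly chosen person needed at least 4 hits
p_at_least_4 = Fraction(sum(f for v, f in hits.items() if v >= 4), n)

# (f) both chosen people needed exactly 5 hits, chosen without replacement
p_both_5 = Fraction(hits[5], n) * Fraction(hits[5] - 1, n - 1)
p_conditional = p_both_5 / p_at_least_4

print(float(p_at_least_4), float(p_conditional))
```

Working with `Fraction` avoids the small rounding error introduced by dividing by the rounded value 0.433.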
Number of travels abroad | 0 | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|---|
Number of people | 7 | 11 | 18 | x | 3 | 1 |
Unemployment Rate (%) | Frequency |
---|---|
\( 0 \leq x < 2 \) | 2 |
\( 2 \leq x < 5 \) | 5 |
\( 5 \leq x < 9 \) | 8 |
\( 9 \leq x < 11 \) | 13 |
\( 11 \leq x < 15 \) | 10 |
\( 15 \leq x < 20 \) | 4 |
Score | Frequency |
---|---|
\( 0 \leq x < 10 \) | 2 |
\( 10 \leq x < 20 \) | 4 |
\( 20 \leq x < 30 \) | 8 |
\( 30 \leq x < 40 \) | a |
\( 40 \leq x < 50 \) | 21 |
\( 50 \leq x < 60 \) | 20 |
\( 60 \leq x < 70 \) | 37 |
\( 70 \leq x < 80 \) | 28 |
\( 80 \leq x < 90 \) | 0 |
\( 90 \leq x < 100 \) | 1 |
Number of coffees per day | Number of employees |
---|---|
1 | 19 |
2 | 15 |
3 | 10 |
4 | \( a \) |
5 | 3 |
6 | 2 |
Age | Number of students |
---|---|
18 | 9 |
19 | 11 |
20 | 13 |
21 | 10 |
22 | x |
23 | 7 |
Rainfall (mm) | Frequency |
---|---|
\( 0 \leq x < 2 \) | 141 |
\( 2 \leq x < 4 \) | 93 |
\( 4 \leq x < 6 \) | \( b \) |
\( 6 \leq x < 8 \) | 22 |
\( 8 \leq x < 10 \) | \( a \) |
Number of hits needed | Number of people |
---|---|
1 | 2 |
2 | 5 |
3 | 10 |
4 | 8 |
5 | 4 |
6 | 1 |