A Level Maths and Further Maths: the importance of accurate vocabulary in statistics
20 November 2024
Amy Dai, Maths Subject Advisor
In this blog, I’ll look at some of the common misconceptions and errors that students are prone to making when using statistics vocabulary. Drawing on insight from the examiners’ reports, it should give you some areas to keep in mind when you next teach statistics.
Statistics forms part of A Level Maths A (H240) and A Level Maths B (MEI) (H640). It also appears in unit Y542 of A Level Further Maths A and in units Y422/Y432 of A Level Further Maths B (MEI) (H645).
Why is vocabulary important?
In statistics, the precise use of vocabulary is essential: there are many scenarios where explanations or conclusions are required. Having a clear understanding of what the key terms mean helps students when they need to communicate their solutions. These key terms are also needed to make sense of the question, so they help students to overcome one of the most common exam fears: not knowing what the question is asking for.
Hypothesis tests
“Candidates who did less well spoiled otherwise correct solutions with a response that was too assertive, or contradictory.”
Examiners’ report for H640/02 Q13 (d), June 2023
When thinking about vocabulary and precise terminology in statistics, most students and teachers will think of hypothesis tests, and specifically the conclusion required at the end of the question. This is a clear example of when overly generic or vague conclusive statements are insufficient and unable to gain marks. On the other hand, statements which are too definite, therefore asserting a conclusion which isn’t supported by the hypothesis test, are also unable to gain marks. To read more about this, see my colleague Steven Walker’s detailed blog about hypothesis tests and the art of being non-assertive.
Another aspect of hypothesis testing that requires candidates to be precise and show clear understanding of the context is defining their parameters. The fundamental use of a hypothesis test is to evaluate the appropriateness of a single model. Many candidates mistakenly believe that a hypothesis test compares two different models. This may seem like a small misconception, but it affects how they set up their null and alternative hypotheses. For example, the examiners’ report for H240/02 Q11, June 2022, highlighted candidates using H0: µ = 3300, H1: µ = 3360 instead of the correct H0: µ = 3300, H1: µ > 3300.
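As a minimal sketch of the correct set-up, the Python below tests H0: µ = 3300 against H1: µ > 3300 using a normal model for the sample mean. The standard deviation, sample size, observed sample mean and significance level are hypothetical values chosen purely for illustration; they are not taken from the exam question.

```python
from math import sqrt
from statistics import NormalDist

# Hypotheses about a single parameter mu (the population mean):
#   H0: mu = 3300
#   H1: mu > 3300   (one-tailed)
MU_0 = 3300

# Hypothetical illustration values, NOT taken from the exam question.
SIGMA = 200          # assumed known population standard deviation
N = 50               # assumed sample size
SAMPLE_MEAN = 3360   # assumed observed sample mean
ALPHA = 0.05         # assumed significance level


def one_tailed_p_value(sample_mean, mu_0, sigma, n):
    """Return P(X bar >= sample_mean) under H0, where X bar ~ N(mu_0, sigma^2 / n)."""
    standard_error = sigma / sqrt(n)
    return 1 - NormalDist(mu_0, standard_error).cdf(sample_mean)


p_value = one_tailed_p_value(SAMPLE_MEAN, MU_0, SIGMA, N)
level = f"{ALPHA:.0%}"

# The conclusion is in context and non-assertive: "evidence to suggest", not "proves".
if p_value < ALPHA:
    conclusion = (f"Reject H0. There is sufficient evidence at the {level} level "
                  f"to suggest that the population mean is greater than {MU_0}.")
else:
    conclusion = (f"Do not reject H0. There is insufficient evidence at the {level} level "
                  f"to suggest that the population mean is greater than {MU_0}.")

print(f"p-value = {p_value:.4f}")
print(conclusion)
```

In this sketch the observed sample mean only enters the calculation as data. Both hypotheses remain statements about µ relative to the single value 3300.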
Central limit theorem
Previous examiners’ reports have identified a number of misconceptions that students hold about the central limit theorem (CLT).
- Some students think that the CLT can’t be applied when the parent distribution isn’t normal. This may come from a mix-up with a different fact: the mean of a sample of any size taken from a normal population is itself normally distributed. In fact, the opposite is true. The CLT’s primary usefulness is that the distribution of the sample mean is approximately normal for large samples even when the parent population is not normal.
Examiners’ report for Y542/01 Q3 (b), June 2023
- In direct contrast to the above misconception, students also wrongly assume that applying the CLT means the parent distribution can be assumed to be normal. It should be stressed to students that in these contexts there are two variables: the value of a single observation, X (which is from the parent distribution), and the mean, X bar, of a sample of observations. The CLT is applied to X bar, and not to X. As summarised by one of our expert examiners, “No sample, however large, can turn a non-normal distribution into a normal one.”
Examiners’ report for Y542/01 Q3 (b), June 2024
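The distinction between X and X bar can be seen in a short simulation. The sketch below is only an illustration: the exponential parent distribution, the sample size of 50 and the number of repetitions are all assumptions chosen for the demonstration. It compares the skewness of single observations with the skewness of sample means, where a value near 0 is what a symmetric distribution such as the normal would give.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable


def skewness(values):
    """Population skewness: close to 0 for a symmetric distribution such as the normal."""
    m = mean(values)
    s = pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)


SAMPLE_SIZE = 50      # hypothetical n for each sample
NUM_SAMPLES = 5000    # hypothetical number of repeated samples

# X: single observations from the parent distribution.
# The parent is exponential with mean 1, which is strongly right-skewed.
single_observations = [random.expovariate(1.0) for _ in range(NUM_SAMPLES)]

# X bar: each value is the mean of SAMPLE_SIZE observations from the same parent.
sample_means = [
    mean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

skew_x = skewness(single_observations)
skew_x_bar = skewness(sample_means)

print(f"Skewness of X (single observations): {skew_x:.2f}")
print(f"Skewness of X bar (sample means):    {skew_x_bar:.2f}")
```

The sample means come out far less skewed than the individual observations, while the parent distribution itself stays just as skewed however many samples are taken.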
Random sampling
Random is a term that is used in everyday conversation, not just in the statistics classroom. This means students will be familiar with the word, and that familiarity can prove to be a stumbling block to understanding its precise definition in maths. In everyday English, random can mean unspecified, made without conscious decision, or unusual. However, in statistics, it describes a sense of systematic unpredictability: in a simple random sample, each possible sample of a given size has an equal chance of being chosen. Here are some common misconceptions that examiners have picked up on in the past.
- Students assuming that a sample which is random is therefore unbiased. These features, although both desirable, are not the same. It is possible for a sample to be random but also biased (and vice versa). For example, consider taking a sample of students from a local university to survey the employment status of young people. The sample is random if all of those students were equally likely to be chosen, but it is biased because it excludes young people who are not students.
Examiners’ report for Y432/01 Q2, June 2023
- Assuming that a random sample will be representative of a population. Although random selection avoids systematic bias in how members are chosen, there is no mechanism in place to make the sample representative, unlike, for example, stratified sampling.
Examiners’ report for Y432/01 Q5 (d), June 2022
These examples have come specifically from Further Mathematics, so they are more nuanced than what is required in A Level Maths, but similar principles apply. A common issue in A Level Maths is candidates referring to ‘random’ without checking whether the sampling approach actually is random. Generally, it is essential that these types of answers are also given in context.
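The difference between a random sample and a representative one can be sketched with a small simulation. Everything here is hypothetical: a population of 600 Year 12 and 400 Year 13 students, a sample size of 20, and 1000 repetitions of each method. "Representative" is taken to mean only that the year groups appear in the same proportions as in the population.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 600 students in Year 12 and 400 in Year 13.
population = [("Y12", i) for i in range(600)] + [("Y13", i) for i in range(400)]
SAMPLE_SIZE = 20
REPEATS = 1000


def simple_random_sample(pop, n):
    """Every possible sample of size n is equally likely to be chosen."""
    return random.sample(pop, n)


def stratified_sample(pop, n):
    """Sample each year group separately, in proportion to its size."""
    sample = []
    for group in ("Y12", "Y13"):
        members = [p for p in pop if p[0] == group]
        quota = round(n * len(members) / len(pop))
        sample.extend(random.sample(members, quota))
    return sample


def count_y12(sample):
    """Number of Year 12 students in a sample."""
    return sum(1 for group, _ in sample if group == "Y12")


# Repeat each method many times and record how many Year 12 students appear.
simple_counts = [count_y12(simple_random_sample(population, SAMPLE_SIZE))
                 for _ in range(REPEATS)]
stratified_counts = [count_y12(stratified_sample(population, SAMPLE_SIZE))
                     for _ in range(REPEATS)]

print("Simple random: Year 12 count ranges from",
      min(simple_counts), "to", max(simple_counts))
print("Stratified:    Year 12 count ranges from",
      min(stratified_counts), "to", max(stratified_counts))
```

Both methods select at random, but only the stratified version has a mechanism that fixes the Year 12 count at 12 out of 20. The simple random samples vary around that figure by chance.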
Correlation or association?
Another term that often gets confused by students is correlation. Although this word is not used often in everyday conversation, students can often have a vague definition of it, thinking it describes any general relationship. This misunderstanding can be quite embedded, as they first learn about correlation back in KS3. It would be worthwhile to emphasise that the term correlation applies to trends that are based on a specific algebraic relationship; in the OCR specifications, only linear relationships are considered.
On the other hand, association describes a more general relationship, where variables can give information about each other. Students are occasionally confused about when to use the different tests for relationships between variables, so here they are listed explicitly:
- PMCC (the product moment correlation coefficient) measures linear correlation
- Spearman’s rank correlation coefficient measures linear correlation between the ranks of the variables (this may involve association or non-linear correlation in the raw data)
- Chi-squared tests measure association.
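The first two measures in this list can be compared on small made-up data sets. The sketch below is an illustration only: the data values are hypothetical, and the helper functions `pmcc`, `ranks` and `spearman` are written out by hand for this example rather than taken from any library.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient: strength of LINEAR correlation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


def ranks(values):
    """Rank from 1 (smallest) upwards. Assumes there are no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's rank correlation coefficient: the PMCC of the ranks."""
    return pmcc(ranks(xs), ranks(ys))


# Hypothetical data 1: always increasing, but curved rather than linear.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys_curved = [2 ** x for x in xs]

# Hypothetical data 2: a symmetric U-shape, so y clearly depends on x.
xs_centred = [-3, -2, -1, 0, 1, 2, 3]
ys_u_shape = [x ** 2 for x in xs_centred]

r_curved = pmcc(xs, ys_curved)
rs_curved = spearman(xs, ys_curved)
r_u_shape = pmcc(xs_centred, ys_u_shape)

print(f"Curved data:  PMCC = {r_curved:.2f}, Spearman = {rs_curved:.2f}")
print(f"U-shape data: PMCC = {r_u_shape:.2f}")
```

For the curved data the ranks agree perfectly, so Spearman’s coefficient is 1 even though the PMCC is noticeably lower (about 0.85). For the U-shaped data the PMCC is 0 despite the obvious relationship, which is a useful reminder that no linear correlation does not mean no association.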
Whether it’s association or correlation, it’s important for students to remember that neither is necessarily an indicator of causality. Although this common statistics misconception is often mentioned both in and out of the classroom, it still shows up in candidate responses. It might also be useful to talk to your students about “confounding variables”. For example, there may seem to be a causal relationship between the number of ice creams eaten and the rate of sunburn, but actually the causal variable is hot weather. Having open discussions in the classroom where students can try to come up with their own scenarios like this will help to drive home the separation of association/correlation and causality.
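The ice cream example can be turned into a simulation for class discussion. This is a sketch under invented assumptions: temperature is drawn uniformly between 10 and 35 degrees, and both ice cream sales and sunburn cases are modelled as a multiple of temperature plus random noise, with no direct link between them. Removing the straight-line effect of temperature from each variable goes beyond the A Level specification and is included only to make the confounding visible.

```python
import random
from math import sqrt

random.seed(2)  # fixed seed so the sketch is repeatable


def pmcc(xs, ys):
    """Product moment correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


def residuals(ys, xs):
    """What is left of ys after removing the least squares regression line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


# Hypothetical model: hot weather drives BOTH variables.
# Ice creams and sunburn have no direct effect on each other.
temperature = [random.uniform(10, 35) for _ in range(500)]
ice_creams = [2.0 * t + random.gauss(0, 5) for t in temperature]
sunburn_cases = [1.5 * t + random.gauss(0, 5) for t in temperature]

raw_r = pmcc(ice_creams, sunburn_cases)
adjusted_r = pmcc(residuals(ice_creams, temperature),
                  residuals(sunburn_cases, temperature))

print(f"PMCC of ice creams and sunburn:         {raw_r:.2f}")
print(f"PMCC after allowing for temperature:    {adjusted_r:.2f}")
```

The raw data show strong positive correlation, yet once the effect of temperature is removed the correlation is close to zero, which is consistent with hot weather being the confounding variable.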
The nuance between these commonly used terms is complex and there’s quite a bit of misunderstanding around it. If you would like to research this further, beyond the scope of the A Level, here are some other articles:
About the author
Amy joined OCR in 2023 after teaching for five years in both state and independent schools. She provides support across all the OCR Maths qualifications, but with a focus on GCSE, A Level Maths and Further Maths. She graduated from the University of York with a degree in Mathematics and Economics before gaining a PGCE in Secondary Mathematics and an MA in Education.