The APM syllabus contains the learning outcome:
‘Advise on the common mistakes and misconceptions in the use of numerical data used for performance measurement’.
The mistakes and misconceptions arise from two causes:
- The quality of the data: which measures have been chosen and how has the data been collected?
- The processing and presentation of the data: has the data been handled in a way that allows valid conclusions to be drawn?
Inevitably, these two causes overlap because the nature of the data collected will influence both processing and presentation.
The collection and choice of data
What to measure?
Deciding what data to measure is the first step, and the first place where wrong conclusions can be generated, whether innocently or deliberately. For example:
- A company boasts about impressive revenue increases but downplays or ignores disappointing profits.
- A manager wishing to promote one of two mutually exclusive projects might concentrate on its impressive IRR whilst glossing over which project has the higher NPV (the sketch after this list illustrates the conflict).
- An investment company with 20 different funds advertises only the five most successful ones.
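The IRR versus NPV conflict is easy to demonstrate with a short calculation. The sketch below uses invented cash flows and a simple bisection search for the IRR; at a 10% cost of capital the larger project adds more value even though its IRR is lower.

```python
# Hypothetical illustration: two mutually exclusive projects where the one
# with the higher IRR has the lower NPV. All cash flows are invented.

def npv(rate, cashflows):
    """Net present value, where cashflows[0] occurs at time 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

def irr(cashflows, lo=-0.99, hi=10.0, tol=1e-9):
    """Internal rate of return found by bisection on the NPV function."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, cashflows) > 0:
            lo = mid          # NPV still positive: the IRR is higher
        else:
            hi = mid
    return (lo + hi) / 2

cost_of_capital = 0.10
project_a = [-100, 70, 70]        # small project: high IRR, modest NPV
project_b = [-1000, 600, 600]     # large project: lower IRR, higher NPV

for name, cf in [("A", project_a), ("B", project_b)]:
    print(f"Project {name}: NPV = {npv(cost_of_capital, cf):7.2f}, "
          f"IRR = {irr(cf):.1%}")
# Project A: NPV =   21.49, IRR = 25.7%
# Project B: NPV =   41.32, IRR = 13.1%
# At a 10% cost of capital, B adds more value despite its lower IRR.
```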
Not only might inappropriate quantities be measured, but they might be left deliberately undefined. For example, a marketing manager in a consumer products company might report that users find the company’s new toothbrush 20% better.
But what is meant by that statement? What is ‘better’? Even if that quality could be defined, is the toothbrush 20% better than, for example, using nothing, competitors’ products, the company’s previous products, or a tree twig?
Another potential ruse is to confuse readers with relative and absolute changes. For example, you will occasionally read reports claiming something like: eating a particular type of food will double your risk of getting a disease. Doubling sounds serious, but what if you were told that consumption would change your risk from 1 in 10 million to 1 in 5 million? To most people, doubling the risk does not look quite so serious now: the event is still rare and the risk very low.
Similarly, if you were told that using a new material would halve the number of units rejected by quality control, you might be tempted to switch to it. But if the rejection rate is falling from 1 in 10,000 to 1 in 20,000, the switch looks much less convincing – although it would depend on the consequences of failure.
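A few lines of arithmetic make the distinction concrete. The figures below follow the two examples above; both ‘relative’ headlines describe tiny absolute changes.

```python
# The same relative change can describe a trivial absolute change.
# Figures follow the two examples in the text.

old_risk, new_risk = 1 / 10_000_000, 1 / 5_000_000      # disease risk 'doubles'
print(f"Relative change: x{new_risk / old_risk:.0f}")   # x2 - sounds alarming
print(f"Absolute change: {new_risk - old_risk:.7f}")    # 0.0000001 - negligible

old_reject, new_reject = 1 / 10_000, 1 / 20_000         # rejections 'halve'
saved_per_million = (old_reject - new_reject) * 1_000_000
print(f"Rejections avoided per million units: {saved_per_million:.0f}")  # 50
```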
Sampling
Many statistical results depend on sampling. The characteristics of a sample of the population are measured and, based on those measurements, conclusions are drawn about the characteristics of the population. There are two potential problems:
- For the conclusions to be valid, the sample must be representative of the population. This means that random sampling must be used so that every member of the population has an equal chance of being selected. Other sorts of sampling are liable to introduce bias, with some elements of the population over- or under-represented, so that false conclusions are drawn. For example, a marketing manager could sample customer satisfaction only at outlets known to be successful (the sketch after this list simulates the effect).
- Complete certainty can only be obtained by looking at the whole population, and there are dangers in relying on samples that are too small. It is possible to quantify these dangers, and results should be reported with information like: to a 95% confidence level, average salaries are $20,000 ± $2,300. This means that, based on the sample, you are 95% confident (the confidence level) that the population’s average salary is between $17,700 and $22,300 (the confidence interval). Of course, there is a 5% chance that the true average salary lies outside this range. Conclusions based on samples are meaningless if confidence levels and confidence intervals are not supplied.
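As a rough illustration of the first point, the following sketch invents satisfaction scores for a hypothetical chain of 200 outlets and compares a random sample against a sample drawn only from the most successful outlets.

```python
# Minimal simulation (invented data) of sampling bias: surveying customer
# satisfaction only at successful outlets overstates the chain-wide average.
import random

random.seed(1)

# Population: 200 outlets, each with a 'true' satisfaction score out of 100.
outlets = [random.gauss(65, 12) for _ in range(200)]

def mean(xs):
    return sum(xs) / len(xs)

# Random sample: every outlet has an equal chance of selection.
random_sample = random.sample(outlets, 30)

# Biased sample: only the 30 most successful outlets are surveyed.
biased_sample = sorted(outlets, reverse=True)[:30]

print(f"Population mean:    {mean(outlets):.1f}")
print(f"Random sample mean: {mean(random_sample):.1f}")   # close to population
print(f"Biased sample mean: {mean(biased_sample):.1f}")   # flattering overstatement
```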
The larger the sample, the greater the reliance that can be placed on the conclusions drawn. In general, the width of the confidence interval is inversely proportional to the square root of the sample size. So, to halve the confidence interval, the sample size has to be quadrupled – often a significant amount of extra work and expense.
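A minimal sketch of this arithmetic, using invented salary data: a 95% confidence interval for a mean is approximately the sample mean ± 1.96 × s/√n, so quadrupling n halves the ± term.

```python
# Confidence intervals shrink with the square root of the sample size:
# quadrupling n roughly halves the interval's width. Salary data is invented.
import math
import random

random.seed(42)

def ci95(sample):
    """Return (mean, half-width) of an approximate 95% CI for the mean."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    half_width = 1.96 * math.sqrt(var / n)
    return mean, half_width

population = [random.gauss(20_000, 4_000) for _ in range(100_000)]

for n in (50, 200):                       # 200 = 4 x 50
    mean, hw = ci95(random.sample(population, n))
    print(f"n={n:3d}: 95% CI = {mean:,.0f} +/- {hw:,.0f}")
# The +/- term for n=200 is roughly half that for n=50, as 1/sqrt(n) predicts.
```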
More on small samples
Consider a company that has launched a new advert on television. The company knows that before the advert 50% of the population recognised its brand name. The marketing director is keen to show to the board that the advert has been effective in raising brand recognition to at least 60%. To support this contention, a small survey has been quickly conducted by stopping 20 people at ‘random’ in the street and testing their brand recognition. (Note that this methodology can introduce bias: which members of the population are out and about during the survey period? Which street was used? What are the views of people who refuse to be questioned?)
Even if the advert were completely ineffective, it can be shown that there is a 25% chance that at least 12 of the 20 people selected will recognise the brand. So, if the director didn’t get a favourable answer from the first sample of 20, another small sample could quickly be organised. There is a good chance that by the time about four surveys have been carried out, one of the results will show the improved recognition the marketing director wants. It’s rather like flipping a coin 20 times – you intuitively know that there is a good chance of getting a 12:8 split in the results. If instead of just 20 people, 100 were surveyed, then the chance of more than 60 of them recognising the brand would be only 1.8%. (Note: these results make use of the binomial distribution, which you do not need to be able to use.)
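The quoted probabilities can be checked with the binomial distribution. The sketch below assumes scipy is available and that the advert changed nothing, so true recognition stays at 50%.

```python
# Check of the binomial figures quoted above, assuming the advert was
# completely ineffective (true recognition probability remains 0.5).
from scipy.stats import binom

# P(at least 12 of 20 recognise the brand); sf(k) gives P(X > k)
p_small = binom.sf(11, n=20, p=0.5)
print(f"P(>=12 of 20): {p_small:.3f}")                  # ~0.252, the 25% quoted

# Chance that at least one of four independent 20-person surveys 'succeeds'
print(f"P(success within 4 surveys): {1 - (1 - p_small) ** 4:.2f}")   # ~0.68

# With 100 respondents the fluke becomes much rarer
print(f"P(>60 of 100): {binom.sf(60, n=100, p=0.5):.3f}")    # ~0.018, the 1.8% quoted
```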
In general, small samples:
- Increase the chance that results are false positives
- Increase the chance that important effects will be missed (false negatives).
Always be suspicious of survey results that do not tell you how many items were in the sample.
Another example of a danger arising from small samples is that of seeing a pattern where there is none of any significance.
Imagine a small country measuring 100km x 100km. The population is evenly distributed, and four people suffer from a specific disease. In the graphs below, the locations of the sufferers have been generated randomly using Excel and plotted on the 100 x 100 grid. These are actual results from six consecutive recalculations of the spreadsheet data; they are simply six of the many possible scenarios.
Now imagine you are a researcher who believes that the disease might be caused by high-speed trains. The dark diagonal line represents the railway track running through the country.
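The effect is easy to recreate with a simple simulation in the spirit of the Excel exercise: place four sufferers uniformly at random on the grid and count how often two or more happen to fall near the diagonal track. The 10km ‘nearness’ threshold below is an arbitrary assumption for illustration.

```python
# Simulation: how often does pure chance place an apparent disease 'cluster'
# beside the railway? Four sufferers are placed uniformly at random on the
# 100 x 100 grid; the 10km threshold for 'near the track' is an assumption.
import math
import random

random.seed(7)

def near_track(x, y, distance=10):
    """Is the point within `distance` km of the diagonal line y = x?"""
    return abs(x - y) / math.sqrt(2) <= distance

trials, clusters = 100_000, 0
for _ in range(trials):
    points = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(4)]
    if sum(near_track(x, y) for x, y in points) >= 2:
        clusters += 1

print(f"P(2+ of 4 sufferers within 10km of the track) ~= {clusters / trials:.2f}")
# ~0.28: chance alone puts two or more sufferers beside the railway in more
# than a quarter of the recalculations.
```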