Lies, damn lies and statistics: Revision of South Africa’s GDP figures a reason to start effectively harnessing big data


Professor Tshilidzi Marwala is the seventh Rector of the United Nations (UN) University and UN Under Secretary-General.

By definition, statistics are usually an estimation and are never exact. The recent revision of South Africa’s Gross Domestic Product (GDP), indicating that the economy is actually 11% bigger than previously thought, is a glaring example. We need to harness big data more effectively in our statistical analyses.

Benjamin Disraeli, who was prime minister of the United Kingdom first in 1868 and again between 1874 and 1880, was considered a gifted man. He was a good judge of character. It is often speculated that the decisive military defeat the British suffered at the hands of the Zulu Kingdom in 1879, at the Battle of Isandlwana, hastened his death two years later. King Cetshwayo was the leader of the Zulu Kingdom at the time, and the duel between Cetshwayo and Disraeli effectively ended Disraeli’s political career.

Disraeli said in his frank characterisation of the Zulu people, “The Zulus are remarkable people, they defeat our generals, they convert our bishops.” Here, Disraeli was referring to John Colenso, a bishop of Natal who was stationed there to use religion to colonise the Zulu people but ended up being sympathetic to their culture and beliefs. The frankness of Disraeli did not end with his assessment of the Zulu people but extended to the field of statistics. He once remarked that there are three types of lies in the order of severity: “lies, damn lies and statistics”. In this regard, Disraeli considered statistics the highest form of lies.   

A few weeks back, the statistics on the Gross Domestic Product (GDP) of South Africa were revised, indicating that South Africa’s economy is actually 11% bigger than previously thought. Accordingly, the GDP of South Africa is now estimated at R5.521-trillion from the previous R4.973-trillion. The 2020 GDP reduction of 7% has now been revised to 6.4%. The problem with relying on these statistics is that many people planned and executed their plans based on the wrong GDP numbers. The number of bad decisions in our society, politics and economy based on the wrong numbers is incalculable. This revision reinforces Disraeli’s mistrust of statistics. Statistics are sometimes associated with a bad omen. At the height of his reign of terror, Joseph Stalin once remarked, “The death of one man is a tragedy. The death of millions is a statistic.” Recently, in the US, Chris Murphy, the Democratic Senator of Connecticut, wrongly claimed that eight out of 10 US drones miss their targets. Whether this was a deliberate fabrication or just incorrectly calculated statistics, we will never know.

But statistics should not be viewed in a negative light. In fact, by definition, statistics are usually an estimation and are never exact. For example, if one needs to know the average height in Johannesburg, there are two ways of doing this. The first is to measure the height of everyone in Johannesburg and calculate the average, which is practically impossible. The other is to measure a random selection of 1,000 people and find their average height. This is called statistical estimation.
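The idea of estimating from a random selection can be sketched in a few lines of Python. This is only an illustration: the population below is synthetic, and its size, average and spread are made-up assumptions rather than real Johannesburg data.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 200,000 heights in cm (assumed distribution, not real data).
population = [rng.gauss(168.0, 9.0) for _ in range(200_000)]

# Measure a random selection of 1,000 people instead of everyone.
sample = rng.sample(population, 1_000)

true_mean = statistics.fmean(population)  # what a full census would give
estimate = statistics.fmean(sample)       # what the sample gives

print(f"population mean: {true_mean:.2f} cm")
print(f"sample estimate: {estimate:.2f} cm")
```

The two printed values should be close but not identical, which is the sense in which a statistic is an estimate.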

While the need for statistics is apparent, there are several reasons statistics can be wrong. The first is that the data used may not be comprehensive enough, which is called the small sample problem. The truth is that choosing a sample size is not an exact science, and even though best-practice guidelines exist, the choice still involves judgement. To illustrate the sample size problem using the estimation of an average height in Johannesburg, do we sample 1,000 or 50,000 or 200,000 people?
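One way to think about the three candidate sample sizes is through the standard error of a sample mean, which under simple random sampling is the spread of the data divided by the square root of the sample size. The sketch below assumes a spread of 9 cm, which is an illustrative figure and not a measured one.

```python
import math

SIGMA_CM = 9.0  # assumed spread of heights; illustrative only

def standard_error(sigma: float, n: int) -> float:
    """Standard error of a sample mean under simple random sampling."""
    return sigma / math.sqrt(n)

for n in (1_000, 50_000, 200_000):
    se = standard_error(SIGMA_CM, n)
    # A rough 95% margin of error is about 1.96 standard errors.
    print(f"n = {n:>7,}: margin of error about ±{1.96 * se:.2f} cm")
```

Because of the square root, quadrupling the sample only halves the error, so the extra precision has to be weighed against the extra cost of measuring more people.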

The second is the problem of bias. In this regard, if we select our 1,000 samples from Soweto, then the average height will represent Soweto, not the whole of Johannesburg. To deal with this issue, we ought to select the samples randomly from across the entire city. Randomly selecting places to sample is a complex problem that is better handled by a machine than by a human being.
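The effect of sampling from one area only can be shown with a small simulation. The two areas and their height distributions below are hypothetical, invented purely to make the point, and do not describe any real part of Johannesburg.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Two hypothetical areas with made-up height distributions (cm).
area_a = [rng.gauss(165.0, 8.0) for _ in range(60_000)]
area_b = [rng.gauss(171.0, 8.0) for _ in range(140_000)]
city = area_a + area_b

city_mean = statistics.fmean(city)
biased = statistics.fmean(rng.sample(area_a, 1_000))  # drawn from one area only
unbiased = statistics.fmean(rng.sample(city, 1_000))  # drawn from the whole city

print(f"city mean:        {city_mean:.2f} cm")
print(f"one-area sample:  {biased:.2f} cm")
print(f"city-wide sample: {unbiased:.2f} cm")
```

Both samples contain 1,000 people, yet only the city-wide one lands near the city mean. Taking more measurements from the same area would not remove the gap.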

Coming back to the problem of South Africa’s GDP data, it is crucial to understand how GDP is measured. In economics, there are two approaches to estimating GDP: the expenditure and the income approaches. These approaches are intended to measure the value of goods and services produced in the economy. The expenditure approach calculates GDP by adding all consumer, investment and government spending plus the difference between exports and imports in the economy.
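As a rough sketch, the expenditure approach is usually written as GDP = C + I + G + (X − M), where the standard textbook form also counts government spending (G) alongside consumer and investment spending. The function name and all figures below are hypothetical, chosen only to show the arithmetic.

```python
def gdp_expenditure(consumption: float, investment: float, government: float,
                    exports: float, imports: float) -> float:
    """GDP by the expenditure approach: C + I + G + (X - M)."""
    return consumption + investment + government + (exports - imports)

# Hypothetical figures in billions of rand, invented purely for illustration.
gdp = gdp_expenditure(
    consumption=3_600.0,
    investment=900.0,
    government=1_300.0,
    exports=1_500.0,
    imports=1_400.0,
)
print(f"GDP (expenditure approach): R{gdp:,.0f}-billion")
```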

The income approach estimates GDP by adding all national income plus sales taxes plus depreciation plus net foreign factor income, that is, the total income generated by the country’s citizens overseas minus the income earned by foreigners in the country. Theoretically, both approaches are supposed to yield the same result. However, it is difficult to measure all these factors, and for South Africa, the relatively large underground and informal economies exacerbate this issue.
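The income approach, as described above, can be written as a simple sum of four components. The sketch follows the article's wording. How the foreign-income term is defined and signed depends on how national income itself is defined, so the net figure is taken here as a given input. The function name and all figures are hypothetical.

```python
def gdp_income(national_income: float, sales_taxes: float,
               depreciation: float, net_foreign_factor_income: float) -> float:
    """Sum the four components the income approach is described as adding."""
    return national_income + sales_taxes + depreciation + net_foreign_factor_income

# Hypothetical figures in billions of rand, invented purely for illustration.
gdp = gdp_income(
    national_income=4_800.0,
    sales_taxes=600.0,
    depreciation=550.0,
    net_foreign_factor_income=-50.0,  # assumed net figure; negative here by choice
)
print(f"GDP (income approach): R{gdp:,.0f}-billion")
```

In practice each input is itself an estimate, which is one reason the two approaches rarely agree exactly.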

Given this context, why was the GDP of South Africa revised?

Firstly, Statistics South Africa (Stats SA) included new sources of information to estimate these numbers. Secondly, Stats SA added new compilation methods, which may have included factors such as sample sizes. Thirdly, it refined the classification of economic activities and revised the reference year from 2010 to 2015. These changes resulted in the size of our economy being R550-billion bigger.
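The size of the revision can be checked directly from the two GDP figures quoted earlier in the article (R4.973-trillion before and R5.521-trillion after). The short calculation below uses only those two numbers; the function name is just for illustration.

```python
# Figures quoted in the article, in trillions of rand.
OLD_GDP = 4.973
NEW_GDP = 5.521

def revision(old: float, new: float) -> tuple[float, float]:
    """Return (absolute change, percentage change relative to the old value)."""
    diff = new - old
    return diff, 100.0 * diff / old

diff, pct = revision(OLD_GDP, NEW_GDP)
print(f"Absolute revision: R{diff * 1000:.0f}-billion")
print(f"Relative revision: {pct:.1f}%")
```

The difference works out to R548-billion, or about 11% of the old figure, which is consistent with the rounded R550-billion and 11% cited in the text.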

The new numbers now indicate a significant growth in finance, business services and property, and a decline in mining, energy and transport. Despite all these changes, the GDP of South Africa is still grossly undervalued, with the informal economy being underestimated.

There is an argument here for us to augment our approach to collecting statistics even further. Injecting technology into the process ensures more accuracy. After all, machines do not suffer from bias and human error. 

Despite their flaws, it is apparent that statistics are still an essential instrument for evidence-based decision-making. I would argue that Disraeli was wrong: statistics are not the worst form of lying but just an imperfect instrument for a data-driven economy.

Now that we live in the era of big data, where the amount of data available is enormous and the technology to process it, such as artificial intelligence and quantum computing, is increasingly powerful, let us use these tools to improve statistics and make them a better tool for rational decision-making. DM


Comments

  • Michael Forsyth says:

    It’s “Lies, DAMNED lies and statistics”.

  • Bhekinkosi Madela says:

    In 2014 Nigeria “recalculated” their GDP and suddenly managed to “grow” their economy by 245 billion dollars. That made the country’s economy the biggest in Africa. Down south we “recalculated” ours and “gained” 38 billion dollars. I concur with Tshilidzi’s advocacy for taking fuller advantage of AI and quantum computing to improve the reliability of our statistics.

  • André van Niekerk says:

    Things get especially dodgy once you start using ratios to motivate arguments. In a sample of the super rich, the mere millionaire can be the poorest; and with 2 billionaires and 1 millionaire, the latter can be “very poor” in relation to the group. But that doesn’t make them poor. Amongst a group of poor people, the top earning ten percent is not necessarily rich.

    Take the Gini coefficient, for example. It is easy to say, and true, that there is a huge income gap and it should be reduced. But if the only focus were reducing the ratio to acceptable norms, it would be easy to solve. Just ask all the super rich to leave, and then the Gini will come down and everyone can be happy! So the focus should not be the reduction of the coefficient per se; it should be the increase in earnings of the lower tier of earners. While the result will still be an improvement in the ratio, that in itself should not be the focus, but rather the measurement tool. Politicians love to forget that.

    In South Africa, if you earn a barely livable income, you count among the top 10% of richest individuals. That does not make you rich. It just shows how widespread poverty is.

  • Charles Parr says:

    Well I guess it’s one way of achieving economic growth. Unfortunately we pursue cloud cuckooland economics and that’s all we’ll get. Wait until the statisticians recalculate Eskom’s output.