Some call it “the new gold”. Others call it “the new currency”. The conventional phrase is that “data is the new oil”.
Mathematician Clive Humby coined the last of these in 2006, and it was later popularised by a 2017 report in The Economist. Humby went on to say that data is “valuable, but if unrefined, it cannot really be used. Oil has to be changed into gas, plastic, chemicals and so on, to create a valuable entity that drives profitable activity: so, data must be broken down and analysed for it to have value.”
Behind this vast digital technology is an army of invisible artificial intelligence (AI) workers, who make up the human component of the Fourth Industrial Revolution (4IR). These workers sift through the vast amounts of data generated by AI, the internet of things (IoT), social networks and machine learning, among other sources. The process is called data annotation: labelling data, or adding contextual information to it, so that it can be used to train machine-learning models. Just as humans learn patterns, AI can be trained to perform certain tasks.
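To make the idea of annotation concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the `LabelledExample` record, the labels `road_sign` and `pedestrian`, and the toy word-counting "model". Real annotation tools and machine-learning systems are far more sophisticated, but the principle is the same: a human attaches a label to raw data, and the model learns from those labels.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class LabelledExample:
    text: str   # the raw data (a description standing in for an image)
    label: str  # the context a human annotator added


def train(examples):
    """Count how often each word appears under each human-assigned label."""
    counts = defaultdict(Counter)
    for example in examples:
        counts[example.label].update(example.text.lower().split())
    return counts


def predict(counts, text):
    """Pick the label whose training words overlap most with the new text."""
    words = text.lower().split()
    scores = {
        label: sum(word_counts[w] for w in words)
        for label, word_counts in counts.items()
    }
    return max(scores, key=scores.get)


# Hypothetical annotated data: each label was supplied by a person.
annotated = [
    LabelledExample("stop sign ahead", "road_sign"),
    LabelledExample("speed limit sign", "road_sign"),
    LabelledExample("person crossing the road", "pedestrian"),
    LabelledExample("child walking on pavement", "pedestrian"),
]

model = train(annotated)
print(predict(model, "a sign near the school"))
print(predict(model, "person walking"))
```

Without the human-supplied labels, the `train` function would have nothing to learn from, which is why the annotators matter.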
As an example, if you were to step on a sharp nail, your immediate reaction would be to pull your foot away quickly. The lesson is usually learnt.
This sequence of events – and a pierced foot – is stored in your brain, reminding you not to repeat the action. The next time you see a sharp object, you are unlikely to step on it. This is how human intelligence works. In much the same way, AI is based on machines learning patterns and mimicking human intelligence, in some instances even surpassing it. Of course, mimicry calls for a human element to be part of the process. The technology is used in applications such as self-driving vehicles and the detection of TB in X-rays.
An algorithm used in a self-driving vehicle has to be taught the meaning of road signs, how to detect whether a human or an object is in the way, and how to react based on the footage. The process is tedious: an hour of video takes eight hours to annotate.
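The scale of that effort is easier to grasp with some simple arithmetic. The sketch below takes the eight-to-one ratio quoted above; the ten hours of footage and the eight-hour working day are assumptions made purely for illustration.

```python
def annotation_hours(video_hours: float, hours_per_video_hour: float = 8.0) -> float:
    """Hours of human labelling needed, using the eight-to-one ratio cited above."""
    return video_hours * hours_per_video_hour


def working_days(hours: float, hours_per_day: float = 8.0) -> float:
    """Convert labelling hours to working days (eight-hour day is an assumption)."""
    return hours / hours_per_day


footage = 10.0  # hypothetical amount of video, in hours
effort = annotation_hours(footage)
print(f"{footage} hours of video -> {effort} hours of annotation")
print(f"That is about {working_days(effort)} working days for one person")
```

On these assumptions, every hour of footage costs one annotator a full working day.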
In 2018, a McKinsey report listed data labelling as the biggest obstacle to AI adoption in the industry.
The data annotation industry behind this is vast. According to Grand View Research, the global data annotation tools market was valued at $390.1-million in 2019 and is projected to grow at a compound annual rate of 26.9% from 2020 to 2027. Early estimates from Global Market Insights are that the industry will balloon to $5-billion by 2026.
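For readers who want to see how such a projection is built, here is a rough sketch of the compounding arithmetic. It assumes the 26.9% figure is an annual rate that compounds each year from the 2019 base; the function name `project` is invented for illustration. It is not a forecast, and the two research firms quoted above may define the market differently.

```python
def project(base_value: float, annual_rate: float, years: int) -> float:
    """Compound a starting value at a fixed annual growth rate."""
    return base_value * (1 + annual_rate) ** years


BASE_2019 = 390.1  # market size in $-million, as quoted above
RATE = 0.269       # assumed to be a compound annual growth rate

for year in range(2019, 2028):
    value = project(BASE_2019, RATE, year - 2019)
    print(f"{year}: ${value:,.1f}-million")
```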
As the industry sees rapid growth, there is a demand for third-party companies that employ this AI army. While Africa plays a significant role by way of providing cheap labour, the continent reaps little from the industry. Google, Microsoft and Yahoo, for example, use tech labourers in Kenya through US firm Samasource. Workers are paid about $13-$16 a day, higher than the country’s average of $3 a day.
From Silicon Valley’s point of view, tapping into inexpensive labour sources is attractive. Annotation accounts for more than 80% of the time consumed in AI and machine-learning projects. Yet, while this creates employment, Kenya sees little of the technology its labourers are powering.
In a 2019 article for Forbes, Adi Gaskell writes, “it’s helped to create a world in which the haves are increasingly well off, while the have-nots make do with insecure and poorly paid work. Nowhere is this exchange more evident than in the data annotation industry, where people from around the world help to prepare and tidy up the data used by the tech giants to train the algorithms upon which their fortunes increasingly rest.”
While the annotation industry could be an essential employment creator in South Africa, we must ensure that we are reaping the benefits of the digital technologies it powers. There are some instances of this.
For example, Aerobotics, which helps farmers with crop protection by building pest- and disease-monitoring solutions, derives insights from drone and satellite imagery. Yet, according to Aerobotics CEO Benjamin Meltzer, there are lessons to be learnt. There is a specific skill set needed to annotate data, and the focus needs to be on education and training.
The tools used in annotation often prove more expensive than the engineering of digital technologies. These are factors to consider as we look for ways to tap into this industry.
These considerations matter as we begin to implement our national 4IR strategy. Among the recommendations made are investing in human capital and building 4IR infrastructure. The annotation industry certainly speaks to both.
If we are to tap into this soon-to-be $5-billion industry, we have to look at it holistically.
While it could provide employment opportunities for the mass of unskilled workers in the country, we have to be willing to fund training and ultimately ensure that the digital technologies created are deployed for our own benefit.
The human element of AI will continue to be relevant, but we must ensure that our own AI narrative speaks to this. DM