Defend Truth


The dual faces of algorithmic bias — avoidable and unavoidable discrimination


Professor Tshilidzi Marwala is the seventh Rector of the United Nations (UN) University and UN Under Secretary-General.


With the rise of artificial intelligence (AI), algorithms have become the invisible threads that weave through our daily lives, impacting decisions ranging from the ordinary to the extraordinary.

However, as AI’s popularity grows, so does the acknowledgement of its flaws, including bias and discrimination. These biases, some controllable and others seemingly uncontrollable, jeopardise the integrity of algorithmic decisions and reflect deeper societal fractures.

We need to explore the complex landscape of algorithmic bias, advocating for a nuanced understanding and a proactive approach to resolving these digital manifestations of social, political, economic and technological fault lines.

Avoidable algorithmic biases are the result of oversight or neglect. They emerge when the data feeding the algorithms is unrepresentative or when the creators of these systems unintentionally incorporate their unconscious biases into the code.

What are the consequences? Stereotypes are perpetuated, and social injustices are reinforced.

The unexplainability of complex algorithms, particularly those driven by deep learning, adds further difficulty. Deep learning is a form of neural network, a branch of AI trained to represent, through multiple layers, the relationship between input variables (e.g. an x-ray image of a person’s lungs) and an output (e.g. whether that person has lung cancer). The “black box” nature of these deep learning systems makes biases difficult to identify, making justice challenging to achieve using these tools.
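To make the “black box” point concrete, here is a minimal sketch in Python of the layered input-to-output mapping described above. Everything in it — the pixel values, weights and layer sizes — is invented purely for illustration; a real diagnostic network stacks dozens of such layers with millions of learned weights, which is precisely why its internal reasoning resists inspection.

```python
import math

def layer(inputs, weights, biases):
    # One dense layer: weighted sum of the inputs, then a sigmoid nonlinearity.
    return [
        1 / (1 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) + b)))
        for row, b in zip(weights, biases)
    ]

# Toy "network": pixel intensities in, a single probability-like score out.
pixels = [0.2, 0.9, 0.4]                        # stand-in for an x-ray image
hidden = layer(pixels, [[0.5, -1.2, 0.8],
                        [1.1, 0.3, -0.7]], [0.1, -0.2])
score = layer(hidden, [[0.9, -0.4]], [0.0])[0]  # "cancer / no cancer" score
print(round(score, 3))
```

Even in this toy version, the score emerges from tangled sums of arbitrary-looking numbers; no single weight “explains” the decision.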

Addressing avoidable biases is feasible, but requires diligence and a commitment to diversity and transparency. It starts with ensuring the data fed into algorithms is as diverse and representative as possible. It also includes creating diversity among the teams that develop these algorithms, ensuring that diverse perspectives are considered and inherent biases are identified.
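The call for representative data above can be made operational as a simple audit. All figures below are hypothetical, as is the tolerance threshold; the point is only to show that comparing a training set’s group shares against a reference population is a concrete, automatable step.

```python
# Hypothetical audit: compare group shares in a training set with a
# reference population. All numbers are made up for illustration.
population_share = {"group_a": 0.60, "group_b": 0.30, "group_c": 0.10}
training_share   = {"group_a": 0.75, "group_b": 0.20, "group_c": 0.05}

def representativity_gaps(population, training, tolerance=0.05):
    # Flag any group whose share of the training data falls short of its
    # share of the population by more than the chosen tolerance.
    return {
        group: round(population[group] - training.get(group, 0.0), 2)
        for group in population
        if population[group] - training.get(group, 0.0) > tolerance
    }

print(representativity_gaps(population_share, training_share))
```

A check like this catches only the gaps we think to measure; groups absent from both dictionaries stay invisible, which is the representativity problem in miniature.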

However, these solutions are imperfect, and residual bias and discrimination remain. The realistic goal, therefore, is to minimise algorithmic bias rather than eliminate it.

Confronting the bias

Consider language recognition technology for low-resourced languages; here, we use the Ju/’Hoansi San language as an illustration. Potential algorithmic bias against the Ju/’Hoansi San, an indigenous ethnic group of southern Africa numbering between 50,000 and 75,000 people, exemplifies the broader issue of how AI systems might unavoidably discriminate against minority populations.

Because of this small population, the Ju/’Hoansi San’s distinctive language is inherently underrepresented in digital archives. The result is AI language systems that are ill-equipped to recognise the Ju/’Hoansi San language and that misinterpret the nuances of its click consonants.

To mitigate this, transfer learning from related but better-resourced languages, such as isiXhosa, can aid in developing more inclusive AI systems despite the limited availability of massive datasets.
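As a rough sketch of the transfer idea — using, hypothetically, character-frequency statistics as stand-ins for learned model parameters — one can warm-start from the better-resourced language and blend in what little target-language data exists. The corpora and blend weight below are invented for illustration, not real linguistic data.

```python
from collections import Counter

def char_stats(corpus):
    # Character-frequency "model": a stand-in for learned parameters.
    counts = Counter("".join(corpus))
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}

def transfer(base, target, weight=0.3):
    # Blend parameters learned on a well-resourced language with the
    # little data available for the low-resourced one. `weight` encodes
    # how much we trust the tiny target sample.
    merged = dict(base)
    for ch, p in target.items():
        merged[ch] = (1 - weight) * base.get(ch, 0.0) + weight * p
    return merged

# Hypothetical corpora: a larger related-language sample and a tiny target one.
big_corpus = ["molo", "enkosi", "uxolo"]   # stand-in, not real data
tiny_corpus = ["!xun", "ju"]               # stand-in, not real data
model = transfer(char_stats(big_corpus), char_stats(tiny_corpus))
```

Note how characters unique to the target language enter the model only through the small blended weight — a miniature of the residual bias the article describes: transfer helps, but the low-resourced language still speaks with a quieter voice.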

It is essential to note, however, that while this strategy reduces algorithmic discrimination, it does not eliminate it. The crux of the matter, therefore, is data representativity.

Data representativity depends on the political economy. The political economy of data representation for AI training is a complex subject that intersects with power dynamics, economic interests, and social structures. Accordingly, the data that AI systems consume is more than just a collection of neutral bits and bytes; it reflects the sociological, political, and economic conditions from which it arises.

These sociological, political, and economic conditions are challenging to fix, especially in the short term, but more so in the life cycle of algorithm development. Therefore, we continue to develop algorithms under these imperfect conditions.

Entities with more resources and influence can frequently acquire, manipulate and curate massive datasets, thereby moulding AI models trained on this data to represent their opinions and interests. This dynamic might result in a representativity gap, in which marginalised communities are either underrepresented or misrepresented in AI training datasets.

As a result, the emerging AI systems may reinforce existing biases, exacerbate structural inequities, and fail to meet the different requirements of the global community.

Diverse datasets

Addressing this disparity necessitates a concerted effort to democratise data gathering and curation, ensuring that AI systems are trained on datasets that are not only large, but diverse and representative of the complex tapestry of human experiences.

This undertaking is a technological, political and economic problem, necessitating a collaborative strategy in which policymakers, engineers and communities work together to design a fairer and more inclusive AI ecosystem.

Addressing this disparity is critical for developing AI systems that are truly global and inclusive. Collaborative activities in data collection and curation, including community engagement, must ensure the data is linguistically accurate and culturally representative.

Bridging this gap also calls for novel AI training methods, such as transfer learning or unsupervised learning, that maximise learning from minimal data. It is more than just a technical issue; it is a commitment to linguistic diversity and cultural inclusivity, ensuring that the benefits of AI are available to everyone, regardless of language.

While some biases are avoidable, others are unavoidable, ingrained in the fabric of our cultural and technical frameworks. These unavoidable biases stem from the intricacies of social phenomena, the contested nature of justice, and the ever-changing nature of societal standards.

Fairness, a concept as old as humanity, is fundamentally subjective. What one person considers fair may not be fair to another. In their pursuit of justice, algorithms frequently run up against competing definitions of fairness. Optimising for one kind of fairness may unintentionally introduce bias by another measure, illustrating the paradoxical nature of our quest for equity.
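This tension between fairness definitions can be shown on a toy example. The data below is entirely synthetic: two groups engineered to have identical selection rates — so demographic parity holds — but different true-positive rates among qualified candidates, so equal opportunity fails. Satisfying both at once is, in general, impossible on such data.

```python
# (y_true, y_pred) per applicant; entirely synthetic, for illustration only.
group_a = [(1, 1), (1, 1), (0, 0), (0, 0)]
group_b = [(1, 1), (1, 0), (0, 1), (0, 0)]

def selection_rate(group):
    # Demographic parity compares how often each group is selected at all.
    return sum(pred for _, pred in group) / len(group)

def true_positive_rate(group):
    # Equal opportunity compares selection rates among the qualified only.
    qualified = [pred for truth, pred in group if truth == 1]
    return sum(qualified) / len(qualified)

print(selection_rate(group_a), selection_rate(group_b))          # equal
print(true_positive_rate(group_a), true_positive_rate(group_b))  # unequal
```

An algorithm tuned to equalise the first metric can leave the second unequal, and vice versa — the competing definitions are baked into the arithmetic, not into any programmer’s malice.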

Furthermore, societal norms are constantly changing. If social attitudes and understandings shift, an algorithm that exemplifies fairness today may become a relic of bias tomorrow. This dynamic terrain transforms the quest for fairness into a journey rather than a destination, a continuous evolution rather than a single achievement.

Social and technological paradigm shift

Addressing inherent biases necessitates a paradigm shift in our approach to algorithmic fairness. It represents a move from a static, one-time solution to a dynamic, continuous process. It entails ongoing monitoring and upgrading of algorithms to reflect changing societal standards and perceptions of fairness.

It also calls for a deeper interaction with stakeholders, particularly those from underprivileged communities. Their perspectives and experiences are crucial in comprehending the varied nature of fairness and bias, elevating algorithmic development from a technical exercise to a societal conversation.

Finally, it advocates for solid ethical frameworks and governance mechanisms to guide the development and deployment of algorithms in line with social values and standards. These frameworks are more than just guidelines; they serve as guardrails to ensure that our pursuit of technical innovation does not outstrip our commitment to equity and justice.

As we stand at the crossroads of technology and society, algorithmic biases and discrimination are both a challenge and an opportunity. It is a challenge to the integrity of our technological accomplishments and a chance to reflect, correct, and progress.

By tackling avoidable discrimination through care and openness and navigating unavoidable biases through continual growth and inclusive discourse, we can harness the potential of algorithms to reflect humanity’s best qualities, not its flaws.

The route is complicated, but the final result — a society where technology serves as a bridge rather than a barrier to equity — is unquestionably worthwhile. DM


Comments

  • Ben Harper says:

    Algorithms don’t discriminate, they use raw data and produce the appropriate results, it’s human bias and emotions that see discrimination in cold hard facts

    • Niki Moore says:

      I have to challenge you there, Ben. Algorithms can discriminate, depending on the instructions they have been given. And raw data can be extremely discriminatory, depending on what is left out. What the columnist is trying to say is that programmes written by a single cultural, population or race group can reflect the biases of that group, depending on how the algorithms are written and what data is processed. Here’s a simple example: most online content is generated by Americans. As a result, here in South Africa we are starting to use American terms like diaper, trunk and elevator. If algorithms are written to recognise only those words, anyone consuming online content will think those are the only words that exist.

  • EK SÊ says:

    BEE for computers.

  • chris butters says:

    Harper you just don’t get it (again). As he clearly explains, it depends WHAT hard facts one puts in. (And not necessarily very hard facts at all, of course). Choice of facts can very swiftly become biased – as all researchers know – even if the intent is fair. And to what extent is the intent of large vested financial or political interests “fair”?
    A further key issue is, as he explains, the need to revisit, update and modify algorithms over time – dynamically. This may seldom be done, just as software is often NOT upgraded with necessary patches. Or, the “patches” may be discriminatory too.
    And he rightly underlines the need for stakeholder involvement, such as civil society participation. How often do big corporations or authoritarian regimes really promote that?
    Pitfalls all the way. As everyone agrees, AI needs to be VERY tightly and well regulated. But how, and by whom? More bias is again possible if not probable. The “black box” nature of this problem makes it inherently beyond democratic control. SO: as he says: solvable, yes – but how likely??

    • Ben Harper says:

      Then YOU don’t get it. The algorithm produces what it’s designed to produce; it doesn’t just arbitrarily decide on its own to go a different way and choose to “discriminate” against anything. Humans, and particularly underachievers who don’t like what the data tells them, see discrimination where most times there is none.

  • Lucius Casca says:

    “*Gasp*…just not the language recognition of the Ju/’Hoansi San.” This ou should stick to what he’s good at being the mismanagement of academic institutions like what he did at UJ.
