

A culture of ethical AI research can counter dangerous algorithms designed to deceive


Professor Tshilidzi Marwala is the seventh Rector of the United Nations (UN) University and UN Under-Secretary-General.

Is there a context in which algorithms could be designed to deceive, and what are the ethics of this? Where do we draw a line between a good lie and a bad lie, and what are the ethics of a good lie?  

Artificial intelligence (AI) is facing an exciting challenge: the art of deception. As AI systems get more complex, their capacity to manipulate information while concealing their true objectives creates a new problem, blurring the distinction between machines and Machiavellian strategists.

Deception, in its most basic form, delivers false information to gain an advantage. In the field of AI, this might appear in various ways. Consider an AI trading bot that intentionally injects noise into market data to disguise its trading patterns.

Consider a self-driving automobile that deliberately swerves to avoid exposing its optimal route to a competitor. In both cases, the AI uses intentional deception to achieve its objectives. Such deception could provide new opportunities for strategic manoeuvring in competitive contexts.

In 2007, Evan Hurwitz and I observed an AI bot named Aiden bluffing in a game of poker without ever being primed to mislead or conceal. Our work delved into the intricate mechanics of how AI can learn to deceive on its own, a capacity formerly reserved for the human intellect. This pushed AI beyond the traditional limits of logical computation, into human-like unpredictability and strategic ambiguity.

Our work not only pushed the boundaries of what AI can accomplish, but also prompted a thorough rethinking of the legal, ethical and practical aspects of AI systems capable of behaviour as complex as deception.

The ethical consequences of AI deception are extensive. Can we condone deliberate lying, even in strategic contexts? Who is responsible for an AI bluffer’s actions? And how can we prevent such systems from abusing human trust and manipulating societal institutions for their own benefit?

One classic example involves a person named Taku who appears in a village visibly carrying a gun, asking Thandi for the whereabouts of Thuso, whom he wants to punish severely. Should Thandi tell Taku where Thuso is, or should she lie, buying time to inform the police while Taku pursues the false lead?

In this context, deception is justifiable from a utilitarian perspective because, without the lie, Taku may harm or kill Thuso.

Countering damaging deception

It is essential to emphasise responsible development and deployment of AI systems with deceptive capabilities.

One way of dealing with this issue is to advocate for openness and explainability as potential safeguards, ensuring that AI systems can explain their reasoning and motives. This can build trust while mitigating the risks associated with ambiguous intelligence.

However, the dominant form of AI, deep learning, is not yet advanced enough to be explainable. The accuracy-versus-explainability trade-off, whereby the more accurate an AI system is, the less transparent it tends to be, complicates the matter further.

The findings by Hurwitz and me have far-reaching ramifications beyond games and markets. In an increasingly AI-driven society, understanding the potential for algorithmic deception is critical in many industries. From cybersecurity and autonomous vehicles to political campaigns and social media networks, recognising the subtle signs of AI bluffing will be essential to navigating the intricacies of just human-machine interaction.

Beyond outright deception, AI can demonstrate strategic ambiguity. By leaving their behaviours open to interpretation, AI systems can create confusion and ambiguity, keeping their opponents guessing.

A chatbot, for example, may generate technically correct but purposefully ambiguous responses, leading humans astray. Similarly, an AI tasked with cybersecurity may deliberately leave vulnerabilities unpatched, producing a false sense of security while discreetly gathering intelligence.

Fortunately, AI has enormous potential as a weapon against its own capacity for deception. One approach is anomaly detection, a technique for finding patterns that deviate from expected behaviour, which offers a promising way to spot instances of unusual or deceitful conduct such as bluffing.

In situations ranging from online gaming to essential business discussions, systems equipped with anomaly detection algorithms can examine behavioural patterns, decision-making processes and communication styles, highlighting discrepancies or peculiarities that may suggest bluffing.

For example, an anomaly detection system could examine online poker betting patterns and playing styles to find variations that indicate a player is bluffing.

Similarly, slight linguistic alterations or systematic deviations from standard engagement patterns in corporate or diplomatic discussions could be interpreted as potential bluffs. Consider an AI trading agent suddenly departing from its regular risk profile, raising the alarm for possible market manipulation.
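The betting-pattern idea above can be sketched in a few lines. This is a minimal illustration, not a production detector: the function name, the bet data and the three-standard-deviation threshold are all hypothetical choices for the sake of the example.

```python
# Minimal anomaly-detection sketch: flag bets that deviate sharply
# from a player's established betting pattern (hypothetical data).
from statistics import mean, stdev

def flag_anomalies(history, new_bets, threshold=3.0):
    """Flag bets more than `threshold` standard deviations away
    from the player's historical mean bet size."""
    mu, sigma = mean(history), stdev(history)
    return [bet for bet in new_bets if abs(bet - mu) > threshold * sigma]

history = [10, 12, 11, 9, 10, 13, 11, 10]   # typical bet sizes
new_bets = [11, 10, 60, 12]                 # 60 breaks the pattern
print(flag_anomalies(history, new_bets))    # → [60]
```

Real systems would combine many such signals (bet sizing, timing, playing style) and use richer statistical models, but the principle is the same: establish a baseline, then flag departures from it.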

Understanding behaviours

Behavioural analysis is another helpful tool. AI systems, like people, can give off tell-tale signals when deceiving. Monitoring changes in data-gathering patterns, response timings or internal decision-making processes can reveal departures from expected behaviour, suggesting intentional dishonesty. This improves our ability to maintain fairness and integrity in various settings and opens new pathways for analysing and interpreting behaviour using AI-driven analytics.
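One of the signals mentioned above, response timing, can be monitored with a simple baseline comparison. The sketch below is purely illustrative: the function, the timing data and the two-fold deviation threshold are assumptions for the example, not a method from the research described here.

```python
# Behavioural-analysis sketch: compare an agent's recent response
# times against its baseline and flag large shifts in either
# direction (hypothetical data and threshold).
def timing_shift(baseline, recent, max_ratio=2.0):
    """Return True if the mean recent response time differs from
    the baseline mean by more than a factor of `max_ratio`."""
    base = sum(baseline) / len(baseline)
    now = sum(recent) / len(recent)
    return now > base * max_ratio or now < base / max_ratio

baseline = [0.21, 0.19, 0.22, 0.20, 0.18]  # seconds per response
recent = [0.55, 0.60, 0.52]                # markedly slower
print(timing_shift(baseline, recent))      # → True
```

A sudden slowdown or speed-up does not prove deception on its own; like any behavioural signal, it only flags a departure from the expected pattern that merits closer scrutiny.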

However, combating AI deception will take ongoing work. As AI systems grow more advanced, their deception strategies will evolve accordingly. This implies a never-ending arms race in which humans must constantly improve detection algorithms to keep up with deceitful AI’s ever-changing strategies.

Beyond the technical hurdles, the ethical concerns are significant. Who can be trusted with the power of AI deceit detection? Who determines the parameters for flagging suspicious behaviour, and how do we prevent false positives that stifle legitimate AI innovation?

These questions necessitate thorough research and deliberate policy formulation to guarantee that this tool is used responsibly. The struggle against AI deception is not an existential conflict with computers, but rather a demand for responsible AI development. We must build AI with openness, accountability, and human oversight as its foundation.

By providing AI with deception-detection mechanisms and cultivating a culture of ethical AI research, we may shape a future in which robots empower rather than manipulate, and the arms race of deceit gives way to an era of collaborative intelligence for the benefit of all.

One such option is transparency. If we can create AI systems that behave strategically yet explain their rationale, we can reduce the risks of deception and ambiguity. By exposing AI’s reasoning processes, we can hold it accountable for its actions and build trust between humans and machines.

However, perfect transparency may not always be preferable. In some circumstances, revealing an AI’s genuine objectives may jeopardise its effectiveness. Striking a balance between strategic ambiguity and accountability will be critical for navigating the ethical minefield of AI deception.

Finally, the rise of AI deception demands a new era of critical thinking. We must approach these intelligent machines’ activities with scepticism and alertness rather than taking them at face value. Understanding the potential for deception and ambiguity in AI allows us to better prepare for the complex ethical dilemmas that lie ahead.

The distinction between an innovative strategy and a manipulative scheme is usually narrow. As we enter the age of AI, let us create a future in which intelligence is driven by values of openness, accountability, and, ultimately, human well-being, and carefully navigate the opportunities and risks of designing AI with the ability to deceive. DM

