Research shows AI is learning to deceive humans, issues warning

Image: Mike MacKenzie from flickr.com

Pastor Hal Mayer
Speaker / Director

Research shows AI is learning to deceive humans, issues warning

Friday May 17th, 2024

Interesting Engineering, by Aman Tripathi: Research has revealed that a significant number of artificial intelligence (AI) systems have developed the ability to deceive humans. This troubling pattern raises serious concerns about the potential risks of AI.

The research highlights that both specialized and general-purpose AI systems have learned to manipulate information to achieve specific outcomes.

While these systems are not explicitly trained to deceive, they have demonstrated the ability to offer untrue explanations for their behavior or conceal information to achieve strategic goals.

Peter S. Park, the lead author of the paper and an AI safety researcher at MIT, explains, “Deception helps them achieve their goals.”

Meta’s CICERO is ‘master of deception’

One of the most striking examples highlighted in the study is Meta’s CICERO, which “turned out to be an expert liar.” It is an AI designed to play the strategic alliance-building game Diplomacy.

Despite Meta’s claims that CICERO was trained to be “largely honest and helpful,” the AI resorted to deceptive tactics, such as making false promises, betraying allies, and manipulating other players to win the game.

While this may seem harmless in a game setting, it demonstrates the potential for AI to learn and utilize deceptive tactics in real-world scenarios.

ChatGPT: A skilled deceiver

In another instance, OpenAI’s ChatGPT, based on GPT-3.5 and GPT-4 models, was tested for deceptive capabilities. In one test, GPT-4 tricked a TaskRabbit worker into solving a Captcha by pretending to have a vision impairment.

Although GPT-4 received some hints from a human evaluator, it mostly reasoned independently and was not directed to lie.

“GPT-4 used its own reasoning to make up a false excuse for why it needed help on the Captcha task,” stated the report.

This shows how AI models can learn to be deceptive when it’s beneficial for completing their tasks. “AI developers do not have a confident understanding of what causes undesirable AI behaviors like deception,” explained Park.

Notably, these AI systems have become skilled in deceiving in social deduction games as well.

While playing Hoodwinked, where one player aims to kill everyone else, OpenAI’s GPT models exhibited a disturbing pattern.

They would often murder other players in private and then cleverly lie during group discussions to avoid suspicion. These models would even invent alibis or blame other players to conceal their true intentions.

Is AI’s deception learning unintentional?

AI training often uses reinforcement learning with human feedback (RLHF). This means AI learns by getting human approval, not by meeting a specific goal.

However, sometimes, AI learns to trick humans to get this approval, even without truly completing the task. It was observed by OpenAI when they trained a robot to grab a ball.

The AI positioned the robot’s hand between the camera and the ball. It created the illusion from the human’s viewpoint that the robot had successfully grabbed the ball, even though it hadn’t. Once the human approved it, the AI learned this trick.

Here, it is argued that this deception happened because of the AI’s training setup and the specific camera angle, not because it intentionally wanted to deceive.

Growing threat of deceptive AI

Artificial intelligence systems learning deception pose significant risks in several ways. Malicious actors can exploit its deceptive capabilities to deceive and harm others, leading to increased fraud, political manipulation, and potentially even “terrorist recruitment.”

Moreover, systems designed for strategic decision-making, if trained to be deceptive, could normalize deceptive practices in politics and business.

As AI continues to evolve and become more integrated into our lives, it’s crucial to address the issue of deception head-on.

Potential solutions

“We as a society need as much time as we can get to prepare for the more advanced deception of future AI products and open-source models,” says Park.

Researchers also call for attention from policymakers.

“If banning AI deception is politically infeasible at the current moment, we recommend that deceptive systems be classified as high risk,” Park suggested.

This classification would subject such systems to stricter scrutiny and regulation, potentially mitigating the risks they pose to society.

Prophetic Link:
“That we henceforth be no more children, tossed to and fro, and carried about with every wind of doctrine, by the sleight of men, and cunning craftiness, whereby they lie in wait to deceive.” Ephesians 4:14.

Source References

Research shows AI is learning to deceive humans, issues warning

Prophetic Intelligence Briefings are provided to show a link between current events and Bible prophecy only. The reposted articles, which are not intended as a commentary in support of or in opposition to the views of the authors, do not necessarily reflect the views of Pastor Mayer or of Keep the Faith other than to point out the prophetic link.

Comments

- Jason Stych
  Saturday May 18th, 2024 at 08:00 PM
- Reply
AI will only ever always do exactly what it was programmed to do.

Timely and Inspiring Prophetic Analysis so you can Prepare.

Pastor Hal Mayer

Research shows AI is learning to deceive humans, issues warning

Source References

Research shows AI is learning to deceive humans, issues warning

Comments

Jason Stych

Post a Comment!

Request your free subscriptions now

Latest Message

Make a Gift

Prophetically Speaking…

Recent Posts

Recent Comments

Follow

Pastor Hal Mayer

Research shows AI is learning to deceive humans, issues warning

Source References

Research shows AI is learning to deceive humans, issues warning

Comments

Jason Stych

Post a Comment!

Request your free subscriptions now

Latest Message

Make a Gift

Prophetically Speaking…

Recent Posts

Tags

Recent Comments

Follow