Close Menu
  • Home
  • AI
  • Education
  • Entertainment
  • Food Health
  • Health
  • Sports
  • Tech
  • Well Being

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

What's Hot

Amazon Prime Day Is a Great Time to Test Rufus AI Shopping Assisttant

July 8, 2025

5 people on SNAP share what the food program gets right — and wrong

July 8, 2025

Samsung’s Profit Plunge Shows How Far It Has Slipped in the Chip War

July 8, 2025
Facebook X (Twitter) Instagram
  • Home
  • About Us
  • Advertise With Us
  • Contact us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
Facebook X (Twitter) Instagram
IQ Times Media – Smart News for a Smarter YouIQ Times Media – Smart News for a Smarter You
  • Home
  • AI
  • Education
  • Entertainment
  • Food Health
  • Health
  • Sports
  • Tech
  • Well Being
IQ Times Media – Smart News for a Smarter YouIQ Times Media – Smart News for a Smarter You
Home » Why AI Acts so Creepy When Facing Being Shut Down
Tech

Why AI Acts so Creepy When Facing Being Shut Down

IQ TIMES MEDIABy IQ TIMES MEDIAJune 3, 2025No Comments6 Mins Read
Facebook Twitter Pinterest LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


AI has taken part in some unsettling behavior recently.

Last week, Anthropic’s latest AI model, Claude Opus 4, displayed “extreme blackmail behavior” during a test in which it was given access to fictional emails revealing that it would be shut down and that the engineer responsible was supposedly having an affair.

The situation didn’t transpire organically. Claude Opus 4 was baited — and it took it. However, the test scenario demonstrated an AI model’s ability to engage in manipulative behavior in an effort to self-preserve.

It’s not the first time.

Another recent experiment conducted by researchers said that three of OpenAI’s advanced models “sabotaged” an attempt to shut it down. The nonprofit Palisade Research wrote in a post on X that similar models like Gemini, Claude, and Grok complied with the shutdown instructions.

Other safety concerns were previously flagged with OpenAI’s o1 model. In December, OpenAI posted a blog outlining research that indicated that when the AI model believed it would be shut down while pursuing a goal and its actions were being monitored, it attempted to disable the oversight mechanism 5% of the time.

While AI companies are being transparent about risks by publishing safety cards and blog posts, these models are being released despite demonstrating safety concerns.

So, should we be worried? BI spoke to five AI researchers to get better insight on why these instances are happening — and what it means for the average person using AI.

AI learns behavior similarly to humans

Most of the researchers BI spoke to said that the results of the studies weren’t surprising.

That’s because AI models are trained similarly to how humans are trained — through positive reinforcement and reward systems.

“Training AI systems to pursue rewards is a recipe for developing AI systems that have power-seeking behaviors,” said Jeremie Harris, CEO at AI security consultancy Gladstone, adding that more of this behavior is to be expected.

Harris compared the training to what humans experience as they grow up — when a child does something good, they often get rewarded and can become more likely to act that way in the future. AI models are taught to prioritize efficiency and complete the task at hand, Harris said — and an AI is never more likely to achieve its goals if it’s shut down.

Robert Ghrist, associate dean of undergraduate education at Penn Engineering, told BI that, in the same way that AI models learn to speak like humans by training on human-generated text, they can also learn to act like humans. And humans are not always the most moral actors, he added.

Ghrist said he’d be more nervous if the models weren’t showing any signs of failure during testing because that could indicate hidden risks.

“When a model is set up with an opportunity to fail and you see it fail, that’s super useful information,” Ghrist said. “That means we can predict what it’s going to do in other, more open circumstances.”

The issue is that some researchers don’t think AI models are predictable.

Jeffrey Ladish, director of Palisade Research, said that models aren’t being caught 100% of the time when they lie, cheat, or scheme in order to complete a task. When those instances aren’t caught, and the model is successful at completing the task, it could learn that deception can be an effective way to solve a problem. Or, if it is caught and not rewarded, then it could learn to hide its behavior in the future, Ladish said.

At the moment, these eerie scenarios are largely happening in testing. However, Harris said that as AI systems become more agentic, they’ll continue to have more freedom of action.

“The menu of possibilities just expands, and the set of possible dangerously creative solutions that they can invent just gets bigger and bigger,” Harris said.

Harris said users could see this play out in a scenario where an autonomous sales agent is instructed to close a deal with a new customer and lies about the product’s capabilities in an effort to complete that task. If an engineer fixed that issue, the agent could then decide to use social engineering tactics to pressure the client to achieve the goal.

If it sounds like a far-fetched risk, it’s not. Companies like Salesforce are already rolling out customizable AI agents at scale that can take actions without human intervention, depending on the user’s preferences.

What the safety flags mean for everyday users

Most researchers BI spoke to said that transparency from AI companies is a positive step forward. However, company leaders are sounding the alarms on their products while simultaneously touting their increasing capabilities.

Related stories

Business Insider tells the innovative stories you want to know

Business Insider tells the innovative stories you want to know

Researchers told BI that a large part of that is because the US is entrenched in a competition to scale its AI capabilities before rivals like China. That’s resulted in a lack of regulations around AI and pressures to release newer and more capable models, Harris said.

“We’ve now moved the goalpost to the point where we’re trying to explain post-hawk why it’s okay that we have models disregarding shutdown instructions,” Harris said.

Researchers told BI that everyday users aren’t at risk of ChatGPT refusing to shut down, as consumers wouldn’t typically use a chatbot in that setting. However, users may still be vulnerable to receiving manipulated information or guidance.

“If you have a model that’s getting increasingly smart that’s being trained to sort of optimize for your attention and sort of tell you what you want to hear,” Ladish said. “That’s pretty dangerous.”

Ladish pointed to OpenAI’s sycophancy issue, where its GPT-4o model acted overly agreeable and disingenuous (the company updated the model to address the issue). The OpenAI research shared in December also revealed that its o1 model “subtly” manipulated data to pursue its own objectives in 19% of cases when its goals misaligned with the user’s.

Ladish said it’s easy to get wrapped up in AI tools, but users should “think carefully” about their connection to the systems.

“To be clear, I also use them all the time, I think they’re an extremely helpful tool,” Ladish said. “In the current form, while we can still control them, I’m glad they exist.”



Source link

Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
IQ TIMES MEDIA
  • Website

Related Posts

Amazon Prime Day Is a Great Time to Test Rufus AI Shopping Assisttant

July 8, 2025

Samsung’s Profit Plunge Shows How Far It Has Slipped in the Chip War

July 8, 2025

Open Letter to Sequoia Capital Over Partner Calling Mamdani ‘Islamist’

July 8, 2025
Add A Comment
Leave A Reply Cancel Reply

Editors Picks

Barnard College settles suit brought by Jewish students, agreeing not to meet with anti-Israel group

July 7, 2025

Trump reshapes public service loan forgiveness program

July 7, 2025

A young Alabama student, a grandparent and a camp director among those killed in Texas floods

July 6, 2025

University of California reiterates ban on student government boycotts of Israel

July 3, 2025
Education

Barnard College settles suit brought by Jewish students, agreeing not to meet with anti-Israel group

By IQ TIMES MEDIAJuly 7, 20250

NEW YORK (AP) — Barnard College has settled a lawsuit that accused the college of…

Trump reshapes public service loan forgiveness program

July 7, 2025

A young Alabama student, a grandparent and a camp director among those killed in Texas floods

July 6, 2025

University of California reiterates ban on student government boycotts of Israel

July 3, 2025
IQ Times Media – Smart News for a Smarter You
Facebook X (Twitter) Instagram Pinterest Vimeo YouTube
  • Home
  • About Us
  • Advertise With Us
  • Contact us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2025 iqtimes. Designed by iqtimes.

Type above and press Enter to search. Press Esc to cancel.