IQ Times Media – Smart News for a Smarter You

Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts

By IQ TIMES MEDIA | May 10, 2026


Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic.

Last year, the company said that during pre-release tests involving a fictional company, Claude Opus 4 would often try to blackmail engineers to avoid being replaced by another system. Anthropic later published research suggesting that models from other companies had similar issues with “agentic misalignment.”

Anthropic has apparently done more work on that behavior since then, claiming in a post on X, “We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation.”

The company went into more detail in a blog post stating that since Claude Haiku 4.5, Anthropic’s models “never engage in blackmail [during testing], where previous models would sometimes do so up to 96% of the time.”

What accounts for the difference? The company said it found that training on “documents about Claude’s constitution and fictional stories about AIs behaving admirably improve alignment.”

Relatedly, Anthropic said it found training to be more effective when it includes “the principles underlying aligned behavior” and not just “demonstrations of aligned behavior alone.”

“Doing both together appears to be the most effective strategy,” the company said.
