Close Menu
  • Home
  • AI
  • Education
  • Entertainment
  • Food Health
  • Health
  • Sports
  • Tech
  • Well Being

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

What's Hot

Shiva Fund Bets $10M on Small AI Teams for Big Impact

March 9, 2026

17-year-old returns to cheering after rare combined heart, kidney double transplant

March 9, 2026

Anthropic sues Defense Department over supply chain risk designation

March 9, 2026
Facebook X (Twitter) Instagram
  • Home
  • About Us
  • Advertise With Us
  • Contact us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
Facebook X (Twitter) Instagram
IQ Times Media – Smart News for a Smarter YouIQ Times Media – Smart News for a Smarter You
  • Home
  • AI
  • Education
  • Entertainment
  • Food Health
  • Health
  • Sports
  • Tech
  • Well Being
IQ Times Media – Smart News for a Smarter YouIQ Times Media – Smart News for a Smarter You
Home » OpenAI co-founder calls for AI labs to safety-test rival models
AI

OpenAI co-founder calls for AI labs to safety-test rival models

IQ TIMES MEDIABy IQ TIMES MEDIAAugust 27, 2025No Comments5 Mins Read
Facebook Twitter Pinterest LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


OpenAI and Anthropic, two of the world’s leading AI labs, briefly opened up their closely guarded AI models to allow for joint safety testing — a rare cross-lab collaboration at a time of fierce competition. The effort aimed to surface blind spots in each company’s internal evaluations and demonstrate how leading AI companies can work together on safety and alignment work in the future.

In an interview with TechCrunch, OpenAI co-founder Wojciech Zaremba said this kind of collaboration is increasingly important now that AI is entering a “consequential” stage of development, where AI models are used by millions of people every day.

“There’s a broader question of how the industry sets a standard for safety and collaboration, despite the billions of dollars invested, as well as the war for talent, users, and the best products,” said Zaremba.

The joint safety research, published Wednesday by both companies, arrives amid an arms race among leading AI labs like OpenAI and Anthropic, where billion-dollar data center bets and $100 million compensation packages for top researchers have become table stakes. Some experts warn that the intensity of product competition could pressure companies to cut corners on safety in the rush to build more powerful systems.

To make this research possible, OpenAI and Anthropic granted each other special API access to versions of their AI models with fewer safeguards (OpenAI notes that GPT-5 was not tested because it hadn’t been released yet). Shortly after the research was conducted, however, Anthropic revoked the API access of another team at OpenAI. At the time, Anthropic claimed that OpenAI violated its terms of service, which prohibits using Claude to improve competing products.

Zaremba says the events were unrelated and that he expects competition to stay fierce even as AI safety teams try to work together. Nicholas Carlini, a safety researcher with Anthropic, tells TechCrunch that he would like to continue allowing OpenAI safety researchers to access Claude models in the future.

“We want to increase collaboration wherever it’s possible across the safety frontier, and try to make this something that happens more regularly,” said Carlini.

Techcrunch event

San Francisco
|
October 27-29, 2025

One of the most stark findings in the study relates to hallucination testing. Anthropic’s Claude Opus 4 and Sonnet 4 models refused to answer up to 70% of questions when they were unsure of the correct answer, instead offering responses like, “I don’t have reliable information.” Meanwhile, OpenAI’s o3 and o4-mini models refuse to answer questions far less, but showed much higher hallucination rates, attempting to answer questions when they didn’t have enough information.

Zaremba says the right balance is likely somewhere in the middle — OpenAI’s models should refuse to answer more questions, while Anthropic’s models should probably attempt to offer more answers.

Sycophancy, the tendency for AI models to reinforce negative behavior in users to please them, has emerged as one of the most pressing safety concerns around AI models.

In Anthropic’s research report, the company identified examples of “extreme” sycophancy in GPT-4.1 and Claude Opus 4 — in which the models initially pushed back on psychotic or manic behavior, but later validated some concerning decisions. In other AI models from OpenAI and Anthropic, researchers observed lower levels of sycophancy.

On Tuesday, parents of a 16-year-old boy, Adam Raine, filed a lawsuit against OpenAI, claiming that ChatGPT (specifically a version powered by GPT-4o) offered their son advice that aided in his suicide, rather than pushing back on his suicidal thoughts. The lawsuit suggests this may be the latest example of AI chatbot sycophancy contributing to tragic outcomes.

“It’s hard to imagine how difficult this is to their family,” said Zaremba when asked about the incident. “It would be a sad story if we build AI that solves all these complex PhD level problems, invents new science, and at the same time, we have people with mental health problems as a consequence of interacting with it. This is a dystopian future that I’m not excited about.”

In a blog post, OpenAI says that it significantly improved the sycophancy of its AI chatbots with GPT-5, compared to GPT-4o, claiming the model is better at responding to mental health emergencies.

Moving forward, Zaremba and Carlini say they would like Anthropic and OpenAI to collaborate more on safety testing, looking into more subjects and testing future models, and they hope other AI labs will follow their collaborative approach.

Update 2:00pm PT: This article was updated to include additional research from Anthropic that was not initially made available to TechCrunch ahead of publication.

Got a sensitive tip or confidential documents? We’re reporting on the inner workings of the AI industry — from the companies shaping its future to the people impacted by their decisions. Reach out to Rebecca Bellan at rebecca.bellan@techcrunch.com and Maxwell Zeff at maxwell.zeff@techcrunch.com. For secure communication, you can contact us via Signal at @rebeccabellan.491 and @mzeff.88.



Source link

Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
IQ TIMES MEDIA
  • Website

Related Posts

Anthropic sues Defense Department over supply chain risk designation

March 9, 2026

Qualcomm’s partnership with Neura Robotics is just the beginning

March 9, 2026

Sandberg, Clegg join Nscale board as this ‘Stargate Norway’ startup hits $14.6B valuation

March 9, 2026
Add A Comment
Leave A Reply Cancel Reply

Editors Picks

GOP-led fight over allegations of student indoctrination raises tensions at University of Houston

March 9, 2026

Hegseth’s quest to end ‘wokeness’ reshapes military ties with colleges

March 6, 2026

As Trump’s Education Dept. pulls back on civil rights, states step up

March 5, 2026

How to talk about war and conflict with kids

March 4, 2026
Education

GOP-led fight over allegations of student indoctrination raises tensions at University of Houston

By IQ TIMES MEDIAMarch 9, 20260

After two legislative sessions in which Republican lawmakers hammered universities as bastions of liberal indoctrination,…

Hegseth’s quest to end ‘wokeness’ reshapes military ties with colleges

March 6, 2026

As Trump’s Education Dept. pulls back on civil rights, states step up

March 5, 2026

How to talk about war and conflict with kids

March 4, 2026
IQ Times Media – Smart News for a Smarter You
Facebook X (Twitter) Instagram Pinterest Vimeo YouTube
  • Home
  • About Us
  • Advertise With Us
  • Contact us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2026 iqtimes. Designed by iqtimes.

Type above and press Enter to search. Press Esc to cancel.