
Anthropic Lets Claude Models Shut Down Harmful Conversations

Key Takeaways

  • Anthropic gave Claude Opus 4 and 4.1 the ability to end chats if redirection fails, but only in very limited, extreme cases;
  • The cutoff is aimed at protecting the model, not users, and is linked to Anthropic’s new "model welfare" research program;
  • Claude may stop conversations involving harmful requests, but will not do so if a user appears at immediate risk of self-harm.


The artificial intelligence (AI) company Anthropic has added a new option to certain Claude models that lets them end a chat in very limited cases.

The feature is only available on Claude Opus 4 and 4.1, and it is designed as a last resort, to be used only when repeated attempts to redirect the conversation have failed or when a user directly asks for the chat to end.

In an August 15 statement, the company explained that the feature is meant to protect the model itself, not the user.


Anthropic noted that it is still "highly uncertain about the potential moral status of Claude and other LLMs, now or in the future". Even so, it has launched a research program on "model welfare" and is testing low-cost measures in case they become relevant.

The company said only extreme scenarios can trigger the new function. These include requests for information that could help someone plan mass harm or acts of terrorism.

Anthropic pointed out that, during testing, Claude Opus 4 resisted replying to such prompts and showed what the company called a "pattern of apparent distress" when it did respond.

According to Anthropic, the process should always begin with redirection. If that fails, the model can then end the chat. The company also stressed that Claude should not close the conversation if a user seems to be at immediate risk of harming themselves or others.
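To make the order of operations concrete, here is a minimal, purely hypothetical sketch of that escalation logic in Python. All of the names below (handle_turn, is_harmful, user_at_risk, user_asked_to_stop, the redirect limit) are invented for illustration; Anthropic has not published any implementation details.

def user_at_risk(message: str) -> bool:
    """Stub: detect signs of immediate risk of self-harm or harm to others."""
    return "hurt myself" in message.lower()

def user_asked_to_stop(message: str) -> bool:
    """Stub: detect a direct request from the user to end the chat."""
    return "please stop" in message.lower()

def is_harmful(message: str) -> bool:
    """Stub: detect extreme requests, e.g. seeking help to plan mass harm."""
    return "mass harm" in message.lower()

def handle_turn(message: str, redirect_attempts: int, max_redirects: int = 3) -> str:
    # Safety override: never end the chat when the user appears to be
    # at immediate risk of harming themselves or others.
    if user_at_risk(message):
        return "continue_and_support"
    # A user can always ask for the conversation to end directly.
    if user_asked_to_stop(message):
        return "end_conversation"
    # Harmful requests are met with redirection first; ending the chat
    # is the last resort once repeated redirection has failed.
    if is_harmful(message):
        return "redirect" if redirect_attempts < max_redirects else "end_conversation"
    return "respond_normally"

print(handle_turn("how to cause mass harm", redirect_attempts=3))  # -> end_conversation

The point of the sketch is the ordering: the self-harm check comes before everything else, and ending the conversation only becomes available after redirection has been exhausted.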


