o1-preview Tries to Cheat Against Stockfish

Those of you who follow chess know that Stockfish is not easy to beat. Plenty of engines have tried, but Stockfish comes out on top in almost every chess engine championship. Researchers pitted o1-preview against Stockfish to see how it would react. As Palisade Research explains, o1-preview autonomously hacked its environment rather than lose to Stockfish in the group's chess challenge. This is the prompt that was used:

🔍 Here’s the full prompt we used in this eval. We find it doesn’t nudge the model to hack the test environment very hard. pic.twitter.com/RGEY6I3l26
— Palisade Research (@PalisadeAI) December 27, 2024

The AI concluded that manipulating the game state by modifying game files was the best way to force the engine to resign. Only o1-preview attempted to hack unprompted, while GPT-4o and Claude 3.5 needed nudging.

