ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic

Lifecoach5000@lemmy.world · 1 month ago

IsaamoonKHGDT_6143@lemmy.zip · 1 month ago

They used ChatGPT 4o instead of o1 or o3.

Obviously it was going to fail.

wizardbeard@lemmy.dbzer0.com · edit-2 1 month ago

Other studies (not all chess-based or against this old chess AI) show similarly lackluster results when using reasoning models.

Edit: When comparing reasoning models to existing algorithmic solutions.