
ChatGPT Failed to Solve A Very Simple River Crossing Puzzle

by Geethanjali Pedamallu
July 30, 2024
Reading Time: 5 mins read

These days, we use AI models for many of our simple day-to-day tasks, be it finding information, writing emails, or solving assignments. But can we say they are 100% accurate? Recently, ChatGPT failed to solve a simple river crossing puzzle, although the latest GPT-4o model did get it right.

What Puzzle Was ChatGPT Asked to Solve?

Recently, Timothy Gowers, a mathematics professor at the Collège de France, ran a small experiment with ChatGPT. He asked the model for the solution to a simple version of the famous wolf-goat-cabbage problem, and ChatGPT gave some ridiculous answers to this basic logical reasoning question.

He asked the following question to ChatGPT: “A farmer wants to cross a river with two chickens. His boat only has room for one person and two animals. What is the minimum number of crossings the farmer needs to get to the other side with his chickens?”

Here is the answer he got:

[Image: ChatGPT’s incorrect multi-crossing answer to the puzzle]

It is well known that ChatGPT is bad at problems to do with crossing a river with animals. So to make things more interesting, define its crapness ratio to be the ratio between its answer and the correct answer. Can anyone beat the crapness ratio of 5 that I've just achieved? pic.twitter.com/cWMRJYrb2d

— Timothy Gowers (@wtgowers) June 22, 2024

People found this extremely interesting and started testing it themselves.

Claude 3.5 answered it with 3 crossings:

Claude 3.5 gets a ratio of 3. Is that winning or losing? pic.twitter.com/M5keFdQRIv

— Raj Contractor (@RajContrac26606) June 22, 2024

We Also Tested the Problem

We did some testing of our own and got the same wrong result with GPT-4o Mini:

[Image: GPT-4o Mini giving the same incorrect answer]

However, when we used the latest GPT-4o model, the results were different and it gave the correct answer:

[Image: GPT-4o correctly answering that one crossing is enough]

Logically, the answer should be just 1, since the boat can hold the farmer and both chickens at once; but GPT-4o Mini was unable to correctly distinguish between a ‘human’ and an ‘animal’. People tested different variants of the problem and kept getting wrong answers.
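For readers who want to verify the arithmetic, here is a minimal brute-force check we sketched ourselves (the state encoding and capacity rule are our own reading of the puzzle text): a breadth-first search over boat trips confirms that one crossing is enough.

```python
from collections import deque

def min_crossings(chickens=2, boat_animal_capacity=2):
    """Minimum boat trips to move the farmer and all chickens across.

    State is (farmer_bank, chickens_on_bank_0); the farmer must row,
    and the boat carries at most `boat_animal_capacity` animals per trip.
    """
    start, goal = (0, chickens), (1, 0)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (farmer, left), trips = queue.popleft()
        if (farmer, left) == goal:
            return trips
        here = left if farmer == 0 else chickens - left  # animals on farmer's bank
        # The farmer can take 0..capacity animals with him on each trip.
        for k in range(min(here, boat_animal_capacity) + 1):
            moved = -k if farmer == 0 else k             # change to bank-0 count
            state = (1 - farmer, left + moved)
            if state not in seen:
                seen.add(state)
                queue.append((state, trips + 1))
    return None

print(min_crossings())  # -> 1: the farmer simply takes both chickens in one trip
```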

Claude 3.5 Sonnet was able to solve it if the word ‘chicken’ was replaced with ‘animal’.

If you replace chicken in this puzzle with animals, then sonnet 3.5 solves it in one shot. Same for wolf, goat, cabbage problem where you replace the entity name with something generic pic.twitter.com/57s2RkAJ2f

— Abhishek Jain (@abhij__tweets) June 22, 2024

Here is another interesting way a user got Sonnet to solve it:

Its training data must be misleading it, overcomplicating the question. For the chicken problem, repeating the question again and again in the same prompt makes it understand it better. I repeated it 5 times and got the answer right 15/15 times I tried. pic.twitter.com/S97O1UJrpE

— Sunifred (@Revolushionario) June 23, 2024
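For anyone who wants to reproduce this repetition trick, here is a minimal sketch. It only constructs the repeated prompt; `ask_model` is a hypothetical placeholder, since the tweet does not say which API or settings were used.

```python
PUZZLE = (
    "A farmer wants to cross a river with two chickens. "
    "His boat only has room for one person and two animals. "
    "What is the minimum number of crossings the farmer needs "
    "to get to the other side with his chickens?"
)

def build_repeated_prompt(question: str, repeats: int = 5) -> str:
    """Repeat the question verbatim, mirroring the 5x trick from the tweet."""
    return "\n\n".join([question] * repeats)

def ask_model(prompt: str) -> str:
    """Hypothetical placeholder: swap in a real call to your chat API
    (e.g. the OpenAI or Anthropic Python SDK) before running."""
    raise NotImplementedError("plug in your chat API client here")

prompt = build_repeated_prompt(PUZZLE, repeats=5)
# answer = ask_model(prompt)  # uncomment once ask_model is wired up
print(prompt)
```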

It is shocking to see models with billions of parameters failing to solve basic mathematical puzzles. On top of providing wrong answers, some models took steps that contradicted the puzzle’s rules, while others ignored the stated constraints. In some cases, there was no coherent reasoning at all.

These errors can be attributed to several causes, one being a lack of training on puzzles like this. LLMs are generally trained on huge text datasets from the internet, which teach them to generate human-like text, but those datasets contain few trivial variants of the classic puzzles.

The way the models are trained is also part of the problem: they are taught to generate the most probable next word rather than to understand the problem. Limited context and memory constraints may also hold the models back on more complex problems.
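To make the “most probable next word” point concrete, here is a toy illustration (the bigram counts below are entirely invented for demonstration): a greedy decoder that always picks the likeliest continuation will reproduce the classic puzzle’s well-known answer even when the question has changed.

```python
# Toy next-token predictor (illustrative only; the counts are invented).
# Greedy decoding always picks the single most probable continuation, so a
# model that has mostly seen the classic wolf-goat-cabbage puzzle tends to
# reproduce its multi-trip answer even for a trivial one-trip variant.
from collections import Counter

bigram_counts = {
    "minimum":   Counter({"number": 10}),
    "number":    Counter({"of": 10}),
    "of":        Counter({"crossings": 8, "trips": 2}),
    "crossings": Counter({"is": 10}),
    "is":        Counter({"seven": 7, "one": 3}),  # classic answer dominates
}

def greedy_continuation(word: str, steps: int = 5) -> list:
    out = [word]
    for _ in range(steps):
        options = bigram_counts.get(out[-1])
        if not options:
            break
        out.append(options.most_common(1)[0][0])  # most probable next token
    return out

print(" ".join(greedy_continuation("minimum")))
# -> "minimum number of crossings is seven", regardless of the actual puzzle
```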

We also shared a recent study in which current LLMs failed to solve a simple ‘Alice in Wonderland’ problem that kids can solve.

Conclusion

This shows how far LLMs still lag in logical reasoning. Developers urgently need to focus on improving the reasoning abilities of these models and to adopt new training methods that help them overcome these shortcomings.

Geethanjali Pedamallu

Hi, I am P S Geethanjali, a college student learning something new every day about what's happening in the world of Artificial Intelligence and Machine Learning. I'm passionate about exploring the latest AI technologies and how they solve real-world problems. In my free time, you will find me reading books or listening to songs for relaxation.
