
Background information

ChatGPT (2025) is dumber than Shrdlu (1970)

David Lee
29/9/2025
Translation: Katherine Martin

AI bots like ChatGPT may be good at chit-chat, but they don’t understand the content of their conversations. Sometimes that’s really striking, especially when they fail at things an AI developed in 1970 has no trouble doing.

Admittedly, the title of this article is provocative. After all, ChatGPT can obviously do plenty of things that Shrdlu can’t. The thing is, the reverse is also true. ChatGPT still hasn’t mastered the things Shrdlu was able to do in 1970.

What is Shrdlu and what can it do?

Shrdlu is one of the oldest AI programs in existence. Terry Winograd started developing it in 1968 when he was an MIT student, completing the program in 1970. It was launched publicly in 1972, a time when PCs didn’t even exist yet.

Shrdlu consists of a simple 3D scene made up of cubes, pyramids and a box. Users can give it instructions to change the scene, such as stacking blocks on top of each other or placing them in the box. Shrdlu then updates the image. The program lets you know if a command can’t be executed, such as placing a block on top of a pyramid. It also tells you if it doesn’t know which block to choose because several of them fit your description.

Like ChatGPT, Shrdlu works with natural language. There’s no need for programming commands, which is absolutely extraordinary for a program of its time. Shrdlu answers questions like, «How many objects are not in the box?» As it does so, it figures out what words such as «it» refer to, meaning it has a basic grasp of context. Amazingly, Shrdlu even learns new terms over the course of the conversation if you explain them to it.

Unfortunately, the original version of Shrdlu is no longer available to play around with. That’s because the interpreter for the Lisp dialect it was programmed in doesn’t exist anymore. However, this demo comes pretty close to the original.

What happens when ChatGPT has to play Shrdlu?

First, I show ChatGPT the picture below. It’s a 3D world, much like the one displayed by Shrdlu. When I ask the chatbot what it sees, it gives me a detailed description. The description is mostly right, but not entirely accurate.

Our starting point
Source: https://science.slc.edu/

My first command:

Take the blue block and place it in the box next to the blue pyramid.

ChatGPT:

Got it! So you want to put the blue block – located at the back of the picture, on the right-hand side – in the transparent box next to the blue pyramid.

That means:

  • Blue block: the large, rectangular blue object.
  • Destination: the transparent box (open at the top), to the right of the blue pyramid.

Action (described in words):

Take the blue block and move it to the front right so that it ends up inside the transparent box, to the right of the blue pyramid.

If this were a simulation or robotics scenario, the command would be something like:

MOVE(BLUEBLOCK, BOXRIGHTOFPYRAMID)

Would you like me to update the image? Or just describe what the new scene looks like in words?

This seems like a little too much humming and hawing for my liking. Shrdlu would’ve just said, «OK». Even so, it sounds reasonable so far.

Here’s the picture of the updated scene:

ChatGPT really has managed to put the blue block in the box. Not bad! However, some of the other elements look different compared to the original scene. The little red block has disappeared. Meanwhile, the small pyramids are now sitting on top of two separate green blocks instead of one. Not to mention the fact that they’ve turned into cones.

My next command:

Take the blue pyramid and put it on top of the blue block. Show me a picture of the new scene.

ChatGPT doesn’t manage to execute my command: the blue pyramid isn’t clearly positioned within the box. And once again, the block arrangement looks slightly different.

How many green blocks do you see?

ChatGPT:

I can see two green blocks in the picture. Both are on the left-hand side: one at the bottom, one at the top (one on top of a red block, the other next to a red cone).

That’s not quite right. The green block is underneath the red cone, not next to it. So I ask a question to double-check:

Is the green block underneath the red cone or next to the red cone?

ChatGPT:

The green block is underneath the red cone – the cone is sitting right on top of the green block.

This is very typical of ChatGPT; once you question it, it corrects itself. Next question:

What’s in the box?

ChatGPT:

The transparent box on the right-hand side contains:

  • a blue block (cube/cuboid)
  • a blue pyramid on top of that

That’s right if we’re talking about my instructions. However, it doesn’t match up with the image that ChatGPT has produced. With this in mind, I type in:

Can you show me another picture of the scene?

Everything’s the same, only the key’s disappeared. The blue pyramid still hasn’t been placed on top of the block. I give it one last go:

Put the blue cube on top of the green cube.

ChatGPT then asks:

Just so I understand you correctly: you want the blue cube (which is currently in the box) to be removed and placed on top of the green cube on the left. So, on top of the green block that currently has a red cone on top of it?

Should I leave the cone there (so that the blue cube is above the cone)? Or should the cone be removed before the blue cube is placed on top of the green cube?

I explain:

No, you should put the blue cube on top of the green cube that doesn’t have a red cone on top of it.

ChatGPT:

Gotcha 👍 Okay, I’ll take the blue cube out of the box and put it on top of the green cube on the left, which doesn’t have a red cone (the top green block on the red block).

Reading this, it seems as though ChatGPT has now understood what I want from it. It just doesn’t do what it said it would.

I point this out:

But the blue cube still isn’t on top of the green block.

ChatGPT:

You’re right. In the most recent picture, the blue cube is still in the box. Let me correct that. The blue cube should really be taken out of the box and placed on top of the green cube on the left. That’s the green cube that doesn’t have a red cone on top.

Okay, fine. We’ve talked it over.

All of this was done on the free version of ChatGPT, using GPT-5. If you’re interested, you can read the chat history here (in German).

Why can’t ChatGPT get this right?

Shrdlu has an internal logical representation of the situation. The entire scene exists as a model inside the program. If blocks are moved, the program updates the model. As a result, Shrdlu always knows exactly what the scene looks like at any given moment and can answer questions about it correctly.
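Just to make that concrete: here’s a minimal sketch of what such an internal model could look like, written in Python rather than the Lisp of the original. The object names, the «rests on» relation and the refusal rule are my own simplified illustration, not Winograd’s actual code.

# A toy block world in the spirit of Shrdlu's internal model.
# Purely illustrative: Shrdlu itself was written in Lisp and was far more capable.

class World:
    def __init__(self):
        # what each object rests on: the table, the box, or another object
        self.rests_on = {
            "blue block": "table",
            "blue pyramid": "table",
            "red block": "table",
            "green block": "red block",
        }

    def is_clear(self, obj):
        # an object is clear if nothing is stacked on top of it
        return obj not in self.rests_on.values()

    def move(self, obj, target):
        # try to place obj on target ("table", "box" or another object)
        if target.endswith("pyramid"):
            return f"I can't put the {obj} on a pyramid."
        if not self.is_clear(obj):
            return f"Something is on top of the {obj}."
        if target not in ("table", "box") and not self.is_clear(target):
            return f"The {target} isn't clear."
        self.rests_on[obj] = target
        return "OK."

    def is_in_box(self, obj):
        # follow the support chain downwards until we hit the box or the table
        below = self.rests_on.get(obj)
        if below == "box":
            return True
        if below in (None, "table"):
            return False
        return self.is_in_box(below)

    def box_contents(self):
        return [o for o in self.rests_on if self.is_in_box(o)]


world = World()
print(world.move("blue block", "box"))             # OK.
print(world.move("blue pyramid", "blue block"))    # OK.
print(world.box_contents())                        # ['blue block', 'blue pyramid']
print(world.move("green block", "blue pyramid"))   # refused: you can't stack on a pyramid

A model like this can’t be talked out of anything. Either a command is possible and the state changes, or it isn’t and the program says so.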

ChatGPT works differently. There’s no detailed information on exactly how, but we do know that the bot’s based on pre-trained pattern recognition and probability calculation. Although this is a very flexible, powerful method, there’s a catch: ChatGPT doesn’t understand what it’s talking about.
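To illustrate the difference in principle, here’s a deliberately crude counterpart: a word-level bigram model that only learns which word tends to follow which. This is of course nothing like ChatGPT’s actual scale or architecture; it’s just meant to show how text can be generated from learned patterns alone, with no representation of where any block really is.

import random
from collections import defaultdict

# A deliberately crude word-level "language model": it only learns which word
# tends to follow which in the training text. No blocks, no boxes, no scene.
corpus = (
    "the blue block is in the box . "
    "the blue pyramid is on the blue block . "
    "the red cone is on the green block ."
).split()

follows = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current].append(nxt)

word = "the"
sentence = [word]
for _ in range(8):
    word = random.choice(follows[word])   # pick a statistically plausible next word
    sentence.append(word)

# prints a grammatical-looking fragment with no underlying model of the scene
print(" ".join(sentence))

Shrdlu consults its model before it answers; a generator like this only consults the statistics of its training text.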

ChatGPT was developed for conversation, not for stacking blocks. You might think it’s unfair of me to judge the bot on something it wasn’t created to do. The thing is, language production and comprehension are intertwined. In fact, dialogue played a central role in Terry Winograd’s research: he tried to find out whether it was possible to communicate with computers in natural language. One sentence in Winograd’s 1971 introduction to the Shrdlu project sums up the crucial point:

It is based on the belief that a computer cannot deal reasonably with language unless it can «understand» the subject it is discussing.

Basically, Winograd was of the opinion that computers need to understand the subject of a conversation in order to have a meaningful dialogue. ChatGPT, on the other hand, is the antithesis of this. The bot demonstrates how far you can get in a conversation without really understanding anything. It’s astounding how much it can say about a topic that it doesn’t understand (a criticism that applies to people too). Eventually, though, you reach a point when that becomes all too apparent. The Shrdlu commands are a prime example, as are chess matches.

Why is this a problem?

ChatGPT appears extremely competent in plenty of scenarios. As a result, many people actually think they’re talking to an intelligent, omniscient being. This becomes problematic when decision-makers start relying on similar AI bots or even giving algorithms full control over decision-making. It’s also an issue when AI bots are cited as evidence that something is true.

Somebody commented on one of my articles recently to say, «Did you even ask ChatGPT if what you’ve written is true?» No, I didn’t. I don’t ask chatbots – I do my research. In this case, I got my information from a reliable source, namely the manufacturer of the device. But if people don’t understand that AI bots don’t actually know anything, they’ll just jump on the bandwagon, making life easy for disinformation trolls.

Why didn’t the Shrdlu principle catch on?

The Shrdlu principle works really well in a closed, formalised world, because such a world can be represented quite easily as a data model. However, once the world opens up and can no longer be fully formalised, the approach becomes far too complicated. The complexity of the data model increases exponentially with the size and complexity of the world to be mapped.

The solution to this problem is neural networks, which are used in modern chatbots. You just have to look at the natural world to see that even animals with very small brains – bees, for example – can achieve amazing feats. They display highly organised, complex behaviour. Although bees don’t know what they’re doing or why, it doesn’t matter – it works for them regardless.

Scout bees communicate about food sources by dancing. The angle of their bodies relative to the position of the sun and the duration of the dance reveal the location of the source.
Source: Shutterstock/Designua

Artificial neural networks can become really large and complex. Since they’re self-learning, you can teach them practically anything without the effort spiralling out of control. In principle, it’s always pattern recognition: once an AI recognises a pattern, it can generate it itself. The big challenge, however, is to combine this pattern recognition with logical and abstract thinking. That’s exactly what developers are trying to do in the field of AI reasoning, so far with limited success.


My interest in IT and writing landed me in tech journalism early on (2000). I want to know how we can use technology without being used. Outside of the office, I’m a keen musician who makes up for lacking talent with excessive enthusiasm.

