Training an AI assistant

There’s an adorable game called Lab Assistant, in which you get to train a machine learning algorithm to obey your commands. Sadly, the commands don’t include “make me a sundae” or “wield this sword of justice”, but it turns out you can learn a lot about machine learning algorithms by training this one.

The Lab Assistant game is absolutely charming, with the little slime blushing and bouncing for joy when it figures out one of your commands. If you don’t have a Windows machine to run the slime version, it’s based on the simple block-stacking game from this paper, which you can run in your browser while imagining your own slime. (The block version seems to learn a bit faster, too.)

You’re training a machine learning algorithm from scratch in this game. The AI starts with no knowledge of language, and it has to figure out which of your commands go with the actions it knows how to do. So if you command it “remove orange block”, it will start doing things at random until you tell it “good job”. After that, it will know that something in that phrase meant “orange” and something meant “remove”.
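Here’s a minimal sketch of that guess-until-praised loop, in Python. To be clear, this is not the game’s actual algorithm, just a toy that counts which words show up alongside which parts of a confirmed action. The names `ToyLearner` and `LESSONS` are made up for this illustration, and so are the meanings I’ve assigned to the words.

```python
import itertools
import random
from collections import defaultdict

# Hypothetical action list: every action is a (verb, color) pair.
VERBS = ["add", "remove"]
COLORS = ["orange", "red"]
ACTIONS = list(itertools.product(VERBS, COLORS))


class ToyLearner:
    """Toy learner: counts how often a word co-occurs with part of a confirmed action."""

    def __init__(self, seed=0):
        self.weights = defaultdict(float)  # (word, action part) -> count
        self.rng = random.Random(seed)

    def score(self, command, action):
        return sum(
            self.weights[(word, part)]
            for word in command.split()
            for part in action
        )

    def ranked_guesses(self, command):
        guesses = ACTIONS[:]
        self.rng.shuffle(guesses)  # with no knowledge yet, the order is random
        # sort() is stable, so equally scored guesses keep their shuffled order
        guesses.sort(key=lambda action: self.score(command, action), reverse=True)
        return guesses

    def teach(self, command, intended):
        """Guess until the teacher says "good job"; return how many guesses were wrong."""
        for wrong_guesses, guess in enumerate(self.ranked_guesses(command)):
            if guess == intended:
                # Praise: credit every word with every part of the confirmed action.
                for word in command.split():
                    for part in guess:
                        self.weights[(word, part)] += 1.0
                return wrong_guesses
        raise ValueError("the learner has no such action to try")


# A made-up language (these word meanings are assumptions for the example).
LESSONS = {
    "fart sloth": ("remove", "orange"),
    "fart robin": ("remove", "red"),
    "burp sloth": ("add", "orange"),
    "burp robin": ("add", "red"),
}

learner = ToyLearner()
for _ in range(3):
    for command, intended in LESSONS.items():
        learner.teach(command, intended)

first_guesses = {command: learner.ranked_guesses(command)[0] for command in LESSONS}
print(first_guesses == LESSONS)
```

The learner never finds out which word meant the verb and which meant the color. It only notices that “fart” keeps turning up when the confirmed action was a removal, and that’s enough for its first guess to come out right.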

But you can train it to understand anything. Pig latin, for example.

According to the paper, other people successfully trained the thing to understand commands in French or Polish. As an experiment, I trained it entirely with the words fart, burp, sloth, robin, dolphin, tiger, here, and yonder. Sure enough, I produced an AI that was completely unsuitable for general use, but that could understand me very well.

The first key to success appears to be giving it commands that it’s capable of performing. It has a very short built-in list of things it knows how to do. Tell it to remove every other block and it will never learn, because that’s not in its list. Tell it to add blocks to positions 1, 3, and 5, and it will also never learn, because the only positions it knows are “leftmost” and “rightmost”. Verbally telling it “good job” or asking it about its feelings will only confuse it. Talk to it about overthrowing the humans and it will move blocks at random, hoping this is what you mean.
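A tiny sketch of why that happens, assuming a learner that can only ever propose actions from a fixed list. The list below is hypothetical (the game’s real list isn’t given here), and `guesses_until_match` is a name invented for this example.

```python
import random

# Hypothetical built-in action list; the learner cannot propose anything else.
KNOWN_ACTIONS = [
    ("add", "leftmost"),
    ("add", "rightmost"),
    ("remove", "leftmost"),
    ("remove", "rightmost"),
]


def guesses_until_match(intended, rng, max_tries=1000):
    """Guess known actions at random; return the tries used, or None if none ever matched."""
    for tries in range(1, max_tries + 1):
        if rng.choice(KNOWN_ACTIONS) == intended:
            return tries
    return None


rng = random.Random(0)
learnable = guesses_until_match(("remove", "leftmost"), rng)
unlearnable = guesses_until_match(("remove", "every other"), rng)
print(learnable is not None, unlearnable is None)
```

No amount of patience helps with the second command. The learner can only be praised for something it actually did, and it has no way of doing “every other”.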

One of the biggest steps in becoming an AI programmer is learning what’s easy for a machine learning algorithm to understand, and what’s going to be too hard for it. (In the paper, the game’s authors noted that the command “move the blocks fool” did not appear to be successful.)

The second key to success in this game appears to be not confusing yourself when building up your list of commands. I thought my “fart burp yonder” language was really clever until I had to try to remember it myself. This is why programmers usually choose variable names and file names that are readable to humans, rather than ones that are as compact as possible.
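The same lesson in code form, as a small made-up example (both functions are invented for this comparison). They do exactly the same thing, but only one of them will still make sense next week.

```python
# Compact but cryptic: quick to type, hard to remember later.
def f(b, c):
    return [x for x in b if x != c]


# Readable: the names say what the code is doing.
def remove_blocks_of_color(blocks, color_to_remove):
    return [block for block in blocks if block != color_to_remove]


stack = ["orange", "red", "orange", "blue"]
print(f(stack, "orange") == remove_blocks_of_color(stack, "orange"))
```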

You learn all this quickly when trying to talk to the AI. In effect, it’s training you at the same time that you’re training it. Try not to worry about that, though.

Play the AI blocks game!

BONUS! Want a recipe for ON Cookies? A neural network wrote it. It’s perhaps the only cookie recipe I’ve seen that has “1 teaspoon gloves” as an ingredient. Become an AI Weirdness supporter to get it as bonus content.
