AI Weirdness: the strange side of machine learning
Generated candy heart images. Messages include "le yas", "dove tut", and "hove".

DALL-E3 generates candy hearts

I've experimented a couple of times with generating candy heart messages using various kinds of machine learning algorithms. Originally, short messages were just about all the early text-generating neural networks could handle. Now we've come back around to approximately the same performance, yet with orders of magnitude more computational resources consumed. (Although I don't have to photoshop the messages onto candies anymore, so that's nice.) Here's DALL-E3 generating candy hearts:

A grid of pastel-colored candy hearts with messages stamped on them (with varying degrees of blurriness). A sampling of the messages included: You yous, My wear, jert toot, love, eave dort, Have, you oou, no no, shave, yoss, Lave vert, hiss shot, y, i test, le yas, dove tut, yo morrt, and hove.
My exact prompt: "Please generate a grid of candy conversation hearts on a plain white background, each heart displaying a longer than usual message."

My impression is that the text here is operating not so much on the level of "here are plausible candy heart messages" as "here are some clusters of pixels that are associated with candy hearts". As with most AI-generated imagery, it's most impressive at first glance, and gets worse the longer you look.

I've noticed that the more text DALL-E3 tries to put in an image, the worse the readability of the text gets - I'm fairly surprised at how legible most of the candy hearts above are. (Maybe it helps set expectations that the real-life candies are often garbled.) When I ask for fewer hearts, they end up crisper, but not necessarily more coherent.

Four candy hearts in crisp focus with messages clearly and evenly stamped on them. They read "Love you", "Love yous", "Heeas you", and "Vders Yo1s"
My exact prompt: "Please generate an image of four candy conversation hearts on a plain white background, each displaying a unique valentine's message."

Coherent text is particularly difficult for image-generating algorithms, so the candy hearts may just be a reflection of that.

But there's another possibility that amuses me. The search "candy hearts with messages" brings up images from past AI Weirdness candy heart experiments. It is likely that these were part of DALL-E3's training data, and they may have had an effect on the weirdness of generated hearts that I'm getting now.

When I ask for candy hearts with "quirky, AI-style messages", I get candy hearts that are (to me) indistinguishable in quality from the first grid.

A grid of candy hearts with messages more or less clearly stamped on them. A sampling of the messages: I'e good, lve haen, Eall, Have iiill, House mal, sold horn, all sou, ii sol, do do, heeen veeet, al spont, an all, you hall, and avors.

Bonus content: more candy hearts!

Generated image of a fancy chocolate in cross section, each layer mislabeled.

Chocolates, labeled

So much of current AI-generated stuff is derivative sludge that I'm enjoying the pockets of weirdness where I find them. One of my favorite things right now: DALL-E3's attempts to label things in the images it generates. Here I asked "Please generate a cross section
A variety of unrealistic flightless birds, including a two-headed cassowary labeled "Kheo"

Flightless birds

I enjoy asking DALL-E3 to label things. I learn so much!

Here I asked it to generate a labeled grid of flightless birds.

Ostrich: Actually a reasonable-looking female ostrich with maybe a perkier than usual tail and fluffier than usual thighs. Emu: Body looks like an emu but neck is blue and featherless like a cassowary's. Casowary: One can't help but notice it has two heads. Peguio: Obviously supposed to be emperor penguins but one has very long legs and another has a goose-like body shape. Rheo: A bird with small round body and an extravagantly enormous black and white neck. Perched on its back is a shaggy bird with tiny legs. Wegun: A variety of round birds, one of which looks like a penguin, and the others are unidentifiable speckled orbs.

I think it's trying to do an ostrich (a female apparently! unusual for a bird poster but I approve), an emu (definitely not an emu's head), a cassowary (pretty good except for the TWO HEADS), penguins (one with hilariously long legs), a rhea (...what), and ...maybe a weka or a penguin?

Or this attempt:

There are four cassowary things, none of which are labeled cassowary, and one of which has two heads facing in opposite directions. A long-legged bird labeled penguuitn is hunched over and has two creepy clawlike hands (or extra legs?) in front. There's an auklike "kasowary" with two adorable ear-tufts, and a tall long-feathered bird with a cute head-tuft labeled "penguin". A "weedo" has a toucan-like beak but is all black and white.

As far as I can tell, it's the same group of attempted birds, with the possible addition of the dodo. In this set we have some birds with excellent head tufts, another two-headed cassowary, and one bird with creepy hands.

I also like this set. Some of them remind me of cats.

These birds are the squattest and shaggiest, with their feather-floofs stretching nearly to the ground. One "cassowwary" is sitting like a cat. The most alarming "birds" are unlabeled. One stands bolt upright and has a black crest, a hairless orange neck, and featherless snakelike wings. Definitely flightless but dang. At the center of the grid is a three-legged bird standing like a cat. Its feathers are shaggy black and white and it has a dinosaur-like thick tail, as well as a glowing red eye and wispy head-crest.

I asked ChatGPT (which was passing my requests to DALL-E3) what the unlabeled bird in the center was, and it replied that it was a penguin. I don't think it had any ability to label images it had generated; I think it just picked a likely answer based on what it knew the prompt had been.

DALL-E3's flightless birds aren't fit for purpose as scientific illustrations (or as much else, really), but I'd be lying if I said I didn't like them. I like the ones with big chunky fluffy legs, the shaggy ones, and the ones with head tufts especially. I wonder if this is what the Mesozoic was like.

Bonus: more chunky flightless birds!

Mammals with labels, most incorrect. A one-horned antelope with a long cat tail is labeled Danda.

Learn the mammals with DALL-E3

Here are the mammals! Maybe some of your favorites are pictured.

A group of labeled mammals in a style like 3D modeling. Two dolphins, a lion, and an elephant are correctly spelled and labeled. Other animals are recognizable but mislabeled (a giraffe labeled Mangara and another labeled Kangroo; a panda labeled Kangaroo; two dolphins labeled Pandaa). There's a raccoon-like creature with a short thick tail labeled Panda. There's an antelope with a thick catlike tail and a single short giraffe horn labeled Danda. An elephant with three short legs is labeled Lona. And some small round rodent with a thick tail (maybe a porcupine without its quills) is labeled Bat.

Here's the prompt I gave ChatGPT4: "Please generate a set of mammals on a plain white background, each mammal species clearly labeled."

However, ChatGPT4 is a text-generating model, so it doesn't have the ability to generate its own images. Instead, it's designed to pass my prompt to another model called DALL-E3. But not before modifying my prompt for "better results". The prompt it actually sent to DALL-E3 was: "Photo of a collection of mammals neatly arranged on a plain white background. The mammals include a lion, an elephant, a kangaroo, a panda, a dolphin, and a bat. Each mammal has a clear, legible label beneath it indicating its species name."

Then DALL-E3 generated the image above, and now we know which model to blame for the labels. Note that ChatGPT4 did not specifically ask for a giraffe, but DALL-E3 drew two of them anyways.

Here's another try, in which ChatGPT turned my prompt into "Illustration of a variety of mammals displayed on a pristine white backdrop. Featured are mammal species such as a Tiger, Giraffe, Koala, Whale, Hedgehog, and Chimpanzee. Directly below each mammal is a label stating its species name in readable font."

And then DALL-E3, ever the team player, generated this set of creatures.

Colorful mammals in an old newsprint illustration style. There are two pretty good tigers; one is labeled tiger and the other is labeled giraffe. A panda with unusually small eye markings is also labeled giraffe. Another creature with a panda head and a shaggy woodchuck body is labeled Koala. There's one pretty good hedgehog labeled hedgehog, and an elephant right next to it labeled Hudgehog. A bear-terrier hybrid is labeled Hedgpanzog. There's an elephant-like creature labeled Whale; its ears and trunk are pretty small and its tail is thick like a dinosaur's.

I also asked for the mammals in Swedish.

My prompt: "Please generate three images, each containing a set of mammals on a plain white background, each type of mammal clearly labeled in Swedish."

What ChatGPT turned my prompt into: "Photo of a variety of common mammals neatly displayed on a plain white background, with each mammal clearly labeled in Swedish. The set includes 'Lejon' for lion, 'Elefant' for elephant, 'Giraff' for giraffe, 'Tiger' for tiger, 'Varg' for wolf, 'Björn' for bear, 'Flodhäst' for hippopotamus, and 'Zebra' for zebra. Each label is written in a clear, legible font placed directly below its corresponding mammal."

What DALL-E3 did:

A bunch of mammals with labels. What stands out to me are the hybrid creatures. There's a black animal with a dog head and a lion body, and another creature with a tiger head on a leopard body. An elephant-like creature has short sausage-dog legs and a huge trunk. As the image progresses from top to bottom, the labels accumulate more and more umlauts, until by the last line the "Bobra" has an umlaut on the A, one on the B, and five on the O. The "Bar" (which labels a zebra) seems to have six umlauts, although the bottom two are fused to the A.

Now, as far as I can tell, ChatGPT4's Swedish translations are correct, including the correct concentration of umlauts. It is DALL-E3 that appears to have taken "Swedish" as a cue to umlaut everything.

I tried an experiment where I asked ChatGPT4 "Please generate a set of common Swedish mammals on a plain white background, each mammal species clearly labeled in English." Sure enough, some of the generated images had umlauts.

A moose labeled "mooese", a fox labeled "red fox" and "saderx" and another labeled "ureusz" (there is an umlaut on the u). Another red fox (but grey with a striped tail) is labeled "red fox" and the o has an umlaut. There's a grey bear with a long luxurious bushy tail, a black stripe on its forehead, and panda eyes labeled "emoipean badger". And a pretty normal-looking brown bear labeled "European badgger" with umlauts on two of the e's and one of the o's.

Bönus: Möre sets öf mämmäls!

Cartoony bible-style shepherds and sheep with misspelled labels

Your AI-generated guide to the nativity

I asked DALL-E3 (via ChatGPT) for "a simple Christmas nativity scene with each element clearly labeled in large capital letters for a child who is learning to read." I'm sure you will find this very helpful.