So lately I've been getting a kick out of asking DALL-E 3 for images labeled with text. They're just good enough to be legible, and yet:

A top-down view of many recognizable traditional Thanksgiving foods: mashed potatoes labeled grasted potinos, mashed potatoes labeled granbery sauce, mashed potatoes swimming in gravy (labeled mashed turktees), a bowl of tomato soup (labeled cranbery sause), a roasted turkey (no label), a bowl of plain cranberries (labeled greenrry pans), and a slice of pumpkin pie (labeled pumpkin pie).

The food that gets duplicated seems to vary from spread to spread.

An elaborate Thanksgiving spread on a farm-style table with artfully scattered pumpkins (including a couple in mugs for some reason) and lots of silverware. The food is uniformly mislabeled/misspelled. Some highlights: a plate of green beans labeled Maaed Potatogs. A pumpkin pie labeled Braebriny Casserle. A crock of stuffing labeled Crabtes. A big round slice of smoked gouda or something labeled Pumplin Rols. A bowl of tiny rolls labeled Mus. A bowl of larger rolls labeled Wee. The turkey is labeled Sniffs.

I also asked DALL-E 3 to do the dessert table.

A big dessert table with a pumpkin pie labeled Pople Pie, another pumpkin pie labeled Poky (Poxy?) Thannssvimg, a pecan-topped cake labeled both Apple Fie and Cockes, a trifle labeled Pumkim Pie, some shortbread squares labeled Bpowhies Briines, some chocolate chip cookies labeled Broisscke, and some very shallow stacked tarts labeled Pumpkin Crtise.

When I asked ChatGPT (GPT-4, which was sending the image queries to DALL-E 3) what the labels in the above image say, it answered:

Pumpkin Pie
Pecan Pie
Apple Crumble

I asked it whether the labels were correctly spelled in the image, and it said that it couldn't tell me; it could only read the text it had sent as a prompt, not the text in the image itself. (This is probably the correct answer.) But when I asked again, it said, "Yes, the labels on the desserts in the image are spelled correctly. Each dessert is labeled with its appropriate name, ensuring clarity and correctness in the labeling."

Bard, when asked to read the labels in the dessert image, was not influenced by knowledge of the prompt, and did what I would consider a better job:

The image shows a table topped with a variety of desserts, including:

Pies: Pumpkin Pie, Cockles Pie, Apple Pie, Chess Pie, and Cheese Coales Pie
Cookies: Thinssving, BPOWHIES, BRIINES, and BROESCKE

The label "Posode" is not clear to me, but it may be a reference to the type of table or the type of event that the desserts are being served at.

The desserts are all decorated with pumpkins and leaves, suggesting that this is a Thanksgiving-themed spread.

The image is likely taken in a restaurant or bakery, as the desserts are all professionally displayed.

Here is a faithful reading of all the labels in the image:

Punpkan Cockes Apple

This may not be quite correct (or even nearly correct; I think it pulled Posode out of the blurry background), but it does at least capture the spirit of the original. I do like the way the weird spacing and the random code block convey chaos. Maybe Bard does text recognition first and then builds the scene description around that, rather than starting by recognizing the scene and inferring probable text. (Or, in ChatGPT's case, starting with an assumption about what the image is supposed to contain.)

Bonus: A couple more Thanksgiving spreads, and also a scene in which your relatives are cats.
