Here's "Ice Cream Planet Swirl", as generated by Pixray.

Description: a pixelated landscape of swirling pinkish green against a mint background. Ice cream cones rise past spinning globes.
Full prompt: Ice Cream Planet Swirl #8bit #pixelart. Colors are chocolate, minty green, and cream.

Pixray uses CLIP, which OpenAI trained on a bunch of internet photos and associated text. CLIP acts as a judge, telling Pixray how much its images look like the text it's trying to match, and it's up to Pixray to figure out how to improve its images to increase its score.
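The generator-and-judge loop can be sketched in a few lines. This is a toy stand-in, not Pixray's actual code: the `clip_score` function here just measures distance to a hidden target list of numbers, where real CLIP returns a similarity between image and text embeddings. The point is the shape of the loop: propose a small change to the "image," ask the judge, and keep changes that raise the score.

```python
import random

# Toy stand-in for the Pixray/CLIP loop. TARGET plays the role of
# "what the text describes"; real CLIP has no such explicit target,
# only a learned similarity between image and text embeddings.
TARGET = [3, 1, 4, 1, 5]

def clip_score(image):
    # Higher when the image is closer to the target. Real CLIP
    # returns a cosine similarity, not a negative squared error.
    return -sum((a - b) ** 2 for a, b in zip(image, TARGET))

def optimize(image, steps=2000, seed=0):
    """Hill-climb: nudge one value at a time, keep what the judge likes."""
    rng = random.Random(seed)
    best = clip_score(image)
    for _ in range(steps):
        candidate = list(image)
        i = rng.randrange(len(candidate))
        candidate[i] += rng.choice([-1, 1])
        score = clip_score(candidate)
        if score > best:
            image, best = candidate, score
    return image

print(optimize([0, 0, 0, 0, 0]))  # climbs toward TARGET
```

Pixray's real optimizer uses gradients rather than random nudges, but the division of labor is the same: CLIP only scores, and the image-maker does all the improving.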

Lots of image-generating algorithms I've used lately work like this. What's fun about Pixray (which is itself based on some of those) is that Tom White gives it a very limited number of pixels and colors to work with.

Pixelated library in wine, pink, and splashes of blue and green. Red and green plants with thick stems and chunky leaves rise luxuriously. Someone with red hair in an emo bowl cut (covering their eyes) and a fancy vest is standing to the left of the plants.
Prompt: "Carnivorous plants in the grand vault of a gothic library #8bit #pixelart". I've noticed that "carnivorous plants" as a prompt tends to bring up plants but also this one person with the emo haircut. Why?

A fun recent twist on Pixray is Pixray Swirl by altsoph. It makes an animation by taking an image, zooming/moving/rotating it, blurring the image, and then telling Pixray to optimize the image again to match the prompt. With the default settings I made the fun ice cream animation above, and also this animation of "Smoke Swirling from Victorian Chimneys".
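The frame loop above can be sketched like so. This is my reading of the approach rather than altsoph's actual code: each frame, the previous image gets a geometric transform (here a crude center-crop zoom plus a rotation), a blur to soften it, and then a re-optimization pass, which in real Pixray Swirl is the CLIP-guided step and here is just a placeholder.

```python
import numpy as np

def zoom(img, crop=2):
    """Crop `crop` pixels off each edge, then stretch back to full size
    by repeating pixels - a crude nearest-neighbour zoom-in."""
    h, w = img.shape
    inner = img[crop:h - crop, crop:w - crop]
    ys = np.arange(h) * inner.shape[0] // h
    xs = np.arange(w) * inner.shape[1] // w
    return inner[np.ix_(ys, xs)]

def blur(img):
    """Box blur: average each pixel with its four neighbours."""
    p = np.pad(img, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] +
            p[1:-1, :-2] + p[1:-1, 2:] + p[1:-1, 1:-1]) / 5.0

def reoptimize(img, prompt):
    # Placeholder: in Pixray Swirl this is where CLIP pushes the
    # blurred, transformed frame back toward matching the prompt.
    return img

def swirl_frames(img, prompt, n_frames=3):
    frames = [img]
    for _ in range(n_frames):
        img = np.rot90(zoom(img))       # zoom/move/rotate...
        img = blur(img)                 # ...blur...
        img = reoptimize(img, prompt)   # ...then re-match the prompt
        frames.append(img)
    return frames

frames = swirl_frames(np.eye(8), "ice cream planet swirl")
```

Because the re-optimization step only ever sees the current transformed frame, each new image matches the prompt without any memory of what the last frame contained - which explains a lot of the weirdness below.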

Red brick houses with classic Victorian chimneys, from which a blue-grey smoke swirls as it rises. But the buildings and chimneys are swirling and rising too, and it looks very strange.
The individual pixels in these videos look a little grainy because of the compression I used to make them into animated gifs for my blog - in the original images, they're solid pixels.

The problem here is that the "swirl" math is not governed by any sense of what should and should not be swirling and rising through the town. It gives the entire picture a twirl, and CLIP is only using its internet knowledge to improve the appearance of the already-twirled image.

I realized that a selective swirl wasn't going to happen with this algorithm. But zooming in on a photo should apply about equally to everything in it, so I decided to try a landscape. I played with some variations on a theme, and "Straight trail leading past a jelly landscape with jello mountains" gave me a landscape stretching away down a long straight road.

I turned off swirl and translate, set zoom to 20, and got this.

A vivid green and red landscape with wobbly jelly-like bushes and a straight trail leading into the horizon between brown mountains. The camera immediately veers away from the road and into the mountains, which have lots more jelly formations and, eventually, a giant jam jar.

Aside from the apparent multiple interpretations of "jelly" (jelly molds, jellyfish, and a giant looming jar), the main surprise is when the camera immediately jumps off the road and heads for the hills, because apparently I hadn't set the zoom point correctly. Something that zoomed in based on what was in the picture might have stayed on the road. This simple calculation did not.

I gave up on landscapes with roads, and managed to steer the view vaguely toward the center. Here's "The misty volcanoes of dinosaur country."

A green misty landscape scrolls by. Each pointy triangle is either a smoking volcano or the head of an old-timey iguanodon-looking dinosaur. Often the features oscillate between the two. The perspective is all over the place.

I thought it did a good job, although it does seem to struggle with the size scales of dinosaurs and mountains. And if there are no dinosaurs in the current view, it makes do by reshaping a volcano. In fact, volcanoes often transform into foreground objects like dinosaurs or mist or little triangle-shaped bushes when they move into the lower third of the image. Continuity with the previous image isn't the point - it's trying to maximize how much each individual frame looks like a classic landscape.

Another thing you may notice is that as a dinosaur starts to leave the field of view, it sort of leans into the frame as it's leaving. This makes sense if you remember that it's generating each frame independently. It has no information that in the previous frame there was a dinosaur on its way off-screen. All it sees are pixels near the edge to enhance and, in its experience, features aren't usually abruptly chopped off the edge of the image.

You can really see this effect in one of my favorites, "Apocalyptic landscape by Lisa Frank".

A landscape with burnt-out towers, burnt-out houses, and flames. The color palette is pastel, there are rainbows everywhere, and huge purple quadrupeds stalk the landscape among the occasional skull.

Not only do the animals lean in as they approach the edges of the picture, they also turn and face the camera (and become more skull-like?). Just trying to present the optimal Lisa Frank apocalypse.

Try Pixray or Pixray Swirl! (Click on "Open in Colab". To zoom in on the middle of the image in Pixray Swirl, try something like outer_rotation: 0, inner_rotation: 0, zoom: -15, shift_x: -12, shift_y: -6.)

Here's a bonus post in which I attempt to steer the camera down this hole (and a few more holes). For more bonus posts like this, and to get new AI Weirdness posts in your inbox, become an AI Weirdness supporter!

A deep hole into a green crystalline surface, getting darker and bluer as it descends. A couple of colorful pillars lean out from various ledges (they will turn out to be people looking down into the hole).
"Looking down a hole into the center of the earth"
