One thing I've noticed with image-generating algorithms is that the more of something they have to put in an image, the worse it is. I first noticed this with the kitten-generating variant of StyleGAN, which often does okay on one cat, but is terrible at a
DALL-E (and other text-to-image generators) will often add text to their images even when you don't ask for any. Ask for a picture of a Halifax Pier and it could end up covered in messy writing: variously legible versions of "Halifax", as if the model were quietly mumbling "Halifax... Halifax" to
Google's large language model, LaMDA, has recently been making headlines after a Google engineer (now on administrative leave) claimed to be swayed by an interview in which LaMDA described the experience of being conscious. Almost everyone else who has used these large text-generating AIs, myself included, is entirely unconvinced. Why?
I recently started playing with DALL-E 2, which will attempt to generate an image to go with whatever text prompt you give it. Like its predecessor DALL-E, it uses CLIP, which OpenAI trained on a huge collection of internet images and nearby text. I've experimented with a few methods based
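The core trick behind CLIP, roughly, is that it maps images and text into one shared embedding space, so a caption can be scored against an image by cosine similarity: a higher score means a better match. A minimal sketch of that scoring step, with made-up four-dimensional vectors standing in for the embeddings that CLIP's real (trained, 512-plus-dimensional) encoders would produce:

```python
import numpy as np

def cosine_similarity(a, b):
    # CLIP-style matching: compare an image embedding and a text
    # embedding by the cosine of the angle between them.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings for illustration only; real CLIP vectors
# come from trained image and text encoders.
image_emb = [0.9, 0.1, 0.0, 0.2]
caption_match = [0.8, 0.2, 0.1, 0.1]     # caption that fits the image
caption_mismatch = [-0.5, 0.9, 0.0, -0.3] # unrelated caption

assert cosine_similarity(image_emb, caption_match) > \
       cosine_similarity(image_emb, caption_mismatch)
```

Guidance methods for generators like DALL-E 2 lean on exactly this score: nudge the image until its embedding sits closer to the prompt's embedding.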
If you're going to open a late-night donut shop, you're going to need a unique set of over-the-top donuts to set the proper festive atmosphere. But how to keep the ideas coming? I decided to see what donut ideas I could get using OpenAI's GPT-3 text-generating models. I collected seven