There are upsides to working with a neural net that trained on a huge collection of internet images and text. One is that, instead of ominous grey geometric blobs when it doesn't understand your prompt (there is a free interactive demo of AttnGAN here and it is a lot of