Though my new biographical novel The Night Doctor of Richmond is primarily based on historical documents, I added a fantasy character who plays a key role in the book. This creature has no name and lurks about in the shadows, but when at last fully revealed, it is described this way:
“…a mishmash monster with a hundred ogling eyeballs atop a many-armed body ringed with grinning skulls, all balanced atop furry legs tapering down to cloven hooves.”
The creature’s cavernous mouth, by the way, is situated in its torso.
Amidst the current excitement over consumer-level artificial intelligence, I thought it might be interesting to task a few of the new AI text-to-image generators with sketching that creature based on my description. Here are the results. You can judge for yourself their success.
At first, Microsoft's highly touted Copilot blocked my prompt, saying it "violates content policy." When I tried again, leaving out the human skulls, it came up with this leonine version that does a good, creepy job from the waist down (except for paws instead of hooves) and adds a whole lotta monstrous arms, along with ram's horns and a snaky tail (that seems to go off in two different directions) not included in my description. Total fail on the many eyes and belly mouth.
Dream Studio came up with a nightmarish, hairless, Gollum-like creature that, except for the multiple arms, in no way resembles my prompt:
Not sure what Image FX was trying to do:
But then Google's Gemini (formerly Bard) surprised me with a creature that incorporated nearly all of my parameters! The eyeballs, the belly mouth, the cloven hooves! Missing were the many arms (though the graceful horns that echo the creature's elbow spikes fit nicely) and the human skulls, but still. And, though not a part of my prompt, there's a sort of comic smile to that gaping mouth that accords with the leering character of my beast. Gemini is the Winner!
I’ve been wondering why the AI image generators had so much difficulty coming up with a monster that fully fit my description. My guess is that, as the engines combed the internet, they did their best, but since this is a new kind of image, they couldn’t cobble it together from old stock. So here we have a good example of the current limits of consumer-level AI. For the moment at least, it seems that we still need a human to make something new.