There are two schools of thought on generative AI and its contribution to our culture of creation.¹ The pessimists say it will replace humans and result in regurgitated artistic sludge. The optimists say it will boost human creativity, democratize creation, and supercharge the talented. While I understand and agree with points across the spectrum, I sit in the halls of the optimists. I see immense opportunity with these tools.²
Words are important. "Tools" is not a word I chose without care. It's what these things are: exciting, bewildering, frightening tools. Place a chisel in the hands of an expert and they'll create a masterpiece. Place a circular saw in the hands of a careless amateur and they'll create a disaster. Tools can destroy as easily as they can create. And no tool fits all needs, even if we sometimes treat them like they do.
The other evening, my wife had taken our son upstairs to get ready for bed. That left me with my daughter. On any other night I would've herded her upstairs. But I'd been experimenting with DALL·E 3 and I had an idea to bring her along for the ride. She's our little artist. I was curious what she'd decide to create given a new tool.
I told her that dad's computer can make any image she could imagine. That isn't true, but she is only five years old. It intrigued her, but she didn't quite understand. With a dose of skepticism she popped up, a curious look in her eye. I loaded ChatGPT and asked her what she wanted to see. Anything she wanted.
"Draw everything!"
It sputtered. Too broad. One of the tool's many limitations. I tweaked the prompt to generate an image that represented an artist "drawing everything."
Ignore the blatant typo—it's not a great image. There's no doubt we could explore this concept and hone it into something with an opinion. But I was more interested in discovering her idea.
After some discussion, she asked me for a Christmas tree, with ornaments, and stockings hanging on the wall. Guess where we were sitting? In our living room, with a Christmas tree, ornaments, and stockings. To be fair, I'd put her on the spot. It's notable that presents were missing in her vision (we hadn't wrapped any yet). She confirmed she'd like presents. Six presents. Under the tree.
She didn't notice, but there are far more than six presents under the tree. Another limitation.
The longer you use these tools, the more you begin to recognize a "sameness" in the images they generate.³ I don't need a watermark to tell me a generative AI created an image. These limitations will be overcome, but for now the mistakes, uniform texture, mixed perspectives, odd lighting, and muddiness are dead giveaways.
After asking my daughter for tweaks, I realized she couldn't think outside of the room in front of her eyes. I decided that if I wanted to see anything else, I'd have to shift her mind elsewhere. I described to her a massive dragon with a desert on its back. Her contribution?
"Make it a water dragon!"
It missed the mark. Not surprising given the minimal prompt. These tools can lose previous instructions and, if you aren't descriptive in your prompt, they have difficulty reflecting what's in your head. These tools are like personal assistants. Assistants with incredible capabilities and access to an unfathomable amount of information. If you give them poor instructions you'll get poor results. Just like reality.
I admit, the next prompt was mine alone. A desert on the back of a massive water dragon is an interesting concept—I wanted to see it.
Better, but it needed work. With enough effort you might get something special out of this. Ellen wanted to explore a different path.
"Add a Christmas tree and stockings!"
She still couldn't see past the living room. I don't blame her; Christmas is exciting. For a child, it's as close to magic as it gets.
Talk about a literal implementation. I question the feasibility of this creature taking flight. We got the Christmas tree and stockings, but it knocked the desert off the back in the process. Oh well, let's see where she goes with it.
"Let's put Christmas decorations on the cactuses!"
Yes—let's.
She liked it. With minimal prompting, DALL·E 3 leans on itself for creative direction. A better tool, a better assistant, would ask you clarifying questions. Before I could dive too deep into this image, my daughter had a spur-of-the-moment idea.
"The dragon wings turn into crackers! Dragon crackers! For the dragons wings."
Guess what she was eating?
The water is back! What's with the gigantic Ritz Crackers? She was eating saltines. Vague prompts with incremental revision statements are a recipe for inconsistent imagery. But the method has its uses. It will inject ideas into your creative process and make connections you hadn't considered. It doesn't always do a great job, but it can knock you loose from your train of thought. Sometimes that's all you need. Ellen wasn't impressed.
"I only said the dragon's wings! And you lost the Christmas stuff!"
She was talking to me and not the tool. I set my ego aside.
ChatGPT was still referring to this as a "water dragon" in its prompts. It also mentioned that "festive Christmas lights" decorate its back. Not so sure about that. Unexpected results are common. The built-up context confuses it. An interesting feature of this image is its similarity to an I Spy. I dig it.
"Can you put some gum drops on him, and frosting, and a gingerbread house, and more stockings!"
I can.
Generating an image takes a variable amount of time. My daughter wanted instant feedback, but she was leaning into the anticipation. We shared a visceral reaction when the image finally appeared.
We're a long way from a water dragon with a desert on its back. But like most pursuits,⁴ it's not where you start; it's where you end. A competent artist could take this idea and make something great. I'm not a competent artist, so this is where I stop.
Ellen wanted to press on.
"Add more cactuses on him!"
Happy Holidays!
1. Joel Stein explores the two ends of the spectrum. His article covers optimistic and pessimistic viewpoints on AI and creativity.
2. Hamish McKenzie said it better than I could. His piece discusses the opportunity generative AI gives writers.
3. Keith Edwards does an excellent job of breaking down why generative art looks so similar.
4. I originally used the word "art" in place of "pursuits." Whether generative AI is true art is a hot-button issue. You're left to ponder that question on your own.
What a fantastic exercise with your daughter. And a great example of the limitless, yet limited, capabilities of AI today.
Like you, I'm in camp AI optimism. However, it needs some guardrails. We had to add brakes to cars in order to go faster. The same principle applies to AI.