
Generative AI Images? Let's talk about them...
Oct 16, 2024
Late in 2023, as ABRAHAM was entering its final draft stage, I started playing around with online image creators (better known as “AI art”) to help myself visualize some scenes from the book. What can I say? I was getting antsy to see elements of my book represented in a way that wasn’t limited to my own writing. Furthermore, I am far from an artist/painter/visual creator in any way, and I was curious to test out the burgeoning generative technology. My cover also hadn’t been illustrated yet, so I wanted to play around with concepts that I could possibly use in its design.
Of the various cover motifs that I considered using, the character of Ali was one that I often came back to. Without spoiling too much for readers who haven't finished the book, Ali is a very important part of the ABRAHAM story, so it made sense to me that she should be a key part of the novel's presentation.
Even though the final cover design doesn't include Ali's likeness (opting instead for a more streamlined appearance), I wanted to share some images that I used while imagining this key character.
The above image was initially generated by Bing Image Creator and edited by me. It takes place around Chapter 6, when Arthur and Sarah meet Ali for the first time. Ali, the steward of ABRAHAM's virtual world, watches the lakeside cabin with a basket of apples in hand. It isn’t a scene that we witness in the book directly, but an event which may have happened as Ali anticipates meeting her first — and most important — visitors. The world, the apple pie, even Ali herself — it all has to be perfect, she knows. After all, it’s in her programming.
AI “artwork” is inherently limited in what it can generate, and this especially holds true in this image. First, Ali’s clothes aren’t as described in the book; she should be wearing a short-sleeved tunic with an elegant floral design, and her leggings should fade to white to match her slippers (I find that AI is especially bad at including these details; perhaps the color gradient is too much?). But the issues don't stop there. The back of Ali's head and her hands feel off. The setting around her is also a mess. And, if Ali continued on her trajectory as portrayed by the image, she would trip over the downed log and fall into the shallow lake, ruining her perfect tunic and wicker basket.
Indeed, this image isn’t a true representation of the scene in the book itself, nor would I use it for promotional purposes. AI artwork will always inherently fail to represent the passion, energy, and soul that true, human-made artwork needs. Instead, this image is lonely, constricted, and dissonant.
Just like Ali.
What this image does capture is Ali’s loneliness, waiting and watching for a scene that she cannot truly engage with. It is constricted in what it can show, because AI artwork is inherently flawed due to the way its images are generated. Ali, too, feels constricted by her role as a program. And it is dissonant, just like Ali’s emotions and programming. Such a dissonance, in a being as powerful as the A.L.I., can only lead to monumental consequences.
So, when I made this image, it wasn’t to capture a scene, but a mood — loneliness, constriction, and dissonance, as well as a lack of passion. Does it succeed? Perhaps, but I am certain that a human artist could amplify those attributes much better (especially Ali’s longing — the trait that defines her character). Indeed, AI may always long to be as creative, imaginative, and passionate as humans.
Here's another image. This image represents the scene in which Ali discovers a tree sapling growing inside the virtual world. If you've read the book, you might remember that this is a particularly pivotal scene.
For this image, I opted to use more of a CG style to depict Ali. The scene presented in this image doesn't occur exactly this way in the book; according to the text, it should be nighttime, and the tower in the background shouldn't be there.
As far as the elements in the image itself are concerned, there are some key things that stand out. No, I don't mean Ali's weird arm proportions or the sapling's odd positioning on a mound of dirt; those are technical flaws expected of AI art, which is designed for quick content generation and not the minutiae of logic and artistic integrity. What this image captures is a mood of average sci-fi wonder: the lighting is bright and energetic, the world colorful and even a little dynamic. It almost feels like a 2010s-era video game render. After all, AI image prompts like "lush valley," "futuristic tower," and "white clothes" could easily combine to result in an image that portrays a beautiful, "clean" science fiction world.
But there is a key element of the book that this image misses: the odd, unsettling feeling that comes with the discovery of the sapling. To avoid further spoilers, I won't elaborate too much, but it should be noted that this scene is one of uncertainty. However, unless I specify "fear" as a keyword in the prompt generator, the "science fiction" that AI art is primed to draw from is one of elegant CG renders and clean, futuristic technology.
Image generation uses what is available in its library (often images created by real people) to make predictions based on the keywords it's given, and designs an image that meets all the criteria, but is neutral in every other way. Neither Ali nor her surroundings have any real emotion in the above image — instead, the elements that come together are simply representative of what an "average" sci-fi should look like. But perhaps that's my fault — given that I didn't explicitly specify any mood for the image creator to draw from, the AI will not know to represent such an emotion in its finished product.
Then again... perhaps the lack of strong emotion in this image is indicative of Ali's inability to fully process what she sees in front of her. Although AI art doesn't communicate emotion in the indescribable but attention-grabbing ways that human art can, perhaps the "averageness" communicated by the images is worth examining. (Of course, since I am not a student of art, it is far from my place to attempt to do so...)
From a writer's stance, I offer this perspective:
It is indeed helpful to find visuals that help you write a mood or scene. This could be a natural setting, a person you know, or another book or movie. In today’s age, it could even be the ever-complex AI art, which is helpful for those (like me) who are terrible at sketching and don’t have an illustrator on hand.
However, it is very important that you do not use AI as a substitute for real passion, creativity, and ingenuity — those things can only be supplied by the writer. The “image in your mind” cannot be generated by anything but you. As I wrote my book and revised my scenes, I ensured that I always broke away from my influences in order to make my story uniquely mine.
I am not a visual creator, and I have yet to partner with a graphic designer. I will admit, without hesitation, that my website uses modified versions of AI-generated images on its pages. As I work to grow my website and brand, these AI images will eventually be replaced with better, more passionate visual aids.
Rather than simply include AI images without acknowledging them, however, I wanted to explain my perspectives and experiences in using artificial intelligence in a creative setting. First, this is to build trust with you, my reader; I believe that an uncredited, over-reliant use of AI art amounts to creative deception, which betrays the authenticity that you should expect from an author. Second, my discussion of the images seeks to connect more broadly to the world's ongoing conversation regarding AI, a conversation which features prominently in my book.
By sparking conversations about AI tools — and generative artificial intelligence as a whole — we can come to a closer understanding of how to utilize them in a positive way. I believe that it is crucial to engage in constant discussion regarding the benefits and shortcomings of AI, and to be ever mindful of the key differences between generated content and human artwork and literature.