At this point, most of us are familiar with the latest wave of artificial intelligence innovation, namely LLMs and Generative AI. But are you aware that even the experts and developers behind these LLMs cannot fully explain how they work or why they make the decisions they do?
Sam Bowman laid this out clearly in a paper published last year: “Experts are not yet able to interpret the inner workings of LLMs.” And just over a year later, circumstances haven’t changed much. Ethan Mollick, a Wharton professor focused on innovation and AI, recently said that something “fundamental about A.I. is the idea that… we don’t know how they work the way they do, or why they’re as good as they are.” Smart pundits like Ezra Klein have discussed this “interpretability” problem in AI at length. I’m here to highlight that it is a cause for concern.
The interpretability problem is closely intertwined with hallucination, where AI models make up things that are untrue. We humans don’t understand why AI models hallucinate. Fine-tuning and techniques like RAG (Retrieval Augmented Generation), where the model retrieves domain-specific knowledge and uses it in context to produce better-grounded responses, can help. And it is true that Anthropic made a breakthrough in this field of interpretability research a few weeks ago. But these examples of progress are no panaceas, and the interpretability problem very much persists. Better understanding the “black boxes” of AI models is a crucial step toward improving their performance, and toward building broader trust in AI.
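To make the RAG idea concrete, here is a minimal sketch of the pattern in Python. The `embed` and `generate` callables are hypothetical stand-ins for whatever embedding model and LLM you actually use; the point is simply the shape of the flow: retrieve relevant domain text first, then place it in the model’s context, which tends to reduce (but not eliminate) hallucination.

```python
# Illustrative sketch of the RAG pattern, not any particular vendor's API.
# `embed` and `generate` are hypothetical placeholders for an embedding
# model and an LLM call, respectively.

from typing import Callable, List, Sequence


def retrieve(query: str,
             documents: Sequence[str],
             embed: Callable[[str], List[float]],
             top_k: int = 3) -> List[str]:
    """Rank documents by cosine similarity to the query embedding."""
    def cosine(a: List[float], b: List[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
        return dot / norm if norm else 0.0

    q_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q_vec, embed(d)), reverse=True)
    return list(ranked[:top_k])


def answer_with_rag(query: str,
                    documents: Sequence[str],
                    embed: Callable[[str], List[float]],
                    generate: Callable[[str], str]) -> str:
    """Ground the model's response in retrieved domain-specific snippets."""
    context = "\n".join(retrieve(query, documents, embed))
    prompt = (
        "Answer using only the context below. If the answer is not in the "
        f"context, say you don't know.\n\nContext:\n{context}\n\nQuestion: {query}"
    )
    return generate(prompt)
```

In practice the retriever is usually a vector database and the generator a hosted LLM, but the structure is the same: the retrieved context constrains the model, which is exactly why RAG helps with (though does not solve) hallucination.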
A Bellwether for AI
We are still in the early innings of AI, relatively speaking, and especially of Generative AI. Early subscription models give some indication of how LLMs will monetize, but that question remains largely open. Many of the earliest commercial applications of AI were developed around digital advertising, which can serve as a leading indicator of how LLMs may progress as they mature.
For most of Big Tech, which makes the majority of its revenue from advertising, ad products have been moving toward more automation for decades. With more automation comes less control, and less user understanding.
Now, you could argue that automation is just a symptom of technological innovation, or even capitalism, or perhaps a combination. Regardless of the root cause, there’s little doubt that technology companies, big and small, are working to automate more, rather than less.
Digital advertising is no exception. In a piece I wrote last year, I explained how the history of digital advertising is essentially a history of automation. Bidding was manual at first, then became automated. Targeting was the next major area to go algorithmic, and now it’s all about Creative (the text and visuals of an ad). Creative is the next major domain to be taken over by the “black box”, and that brings both existential risk and degraded quality.
Creative Should Never Be Fully Automated
One of my intentions in this piece is to communicate nuance, which is harder and harder in this bite-sized, ever-polarized media environment. I’m not here to claim that automation is bad. In fact, I’m a major proponent of automating tasks at work, as long as they are rote, repetitive, and painstaking. What’s more, my own startup, Purposely.ai, is working to automate exactly these annoying and repetitive tasks for digital media strategists and creatives at ad agencies.
However, what we’re not doing, and what I hope we never do, is fully automate creative work such that there is no more use for human roles like creative directors, designers, and copywriters. I believe that humans will always be best at creative work, because creative work is innately human. It is true that DALL-E, Midjourney, Sora, and other Generative AI models are advancing at a blinding clip, and that their output is often truly astonishing. I mean, just look at how realistic and spectacular the videos below are.
However, if you look a little closer, you’ll likely notice things that can only be described as just plain “weird”. The shot of the California coastline with the red-roofed lighthouse is gorgeous and inviting at first glance. But on closer inspection, it depicts neither Big Sur (as it claims to) nor the famous Point Reyes Lighthouse to the north, but rather some uncanny combination of the two. Why did the Sora model do this? As Sam Bowman and Ethan Mollick said, we simply don’t know. Without a human overseeing the “production” of a video like this, the result is unsettling at best, and dangerous at worst, especially when it is used to purport reality. These oddities underscore the need for human oversight of AI-generated content.
Other examples abound. Check out what happened when KFC tried to use AI for its Creative in an ad campaign. And just google “Google’s AI blunders” (meta, I know :) and you’ll quickly get caught up on the latest unsettling examples of automated, black-boxed AI gone wrong: glue on pizza and African American Nazis, to name a few. This is why the biggest advertisers, like P&G, are exercising caution in this area, and why their CIO says that it is “paramount… that the employee reviews any content or draft that is produced by generative AI.”
Humans should always be in the loop, to some extent. If we humans can’t interpret the decisions AI models make and the output they “create”, can we trust them? My answer is no. And sure, it’s true you can’t always trust humans either, but at least we can poke at our underlying motivations and uncover our reasoning.
Final Thoughts
Keeping humans in the loop in digital advertising creative development is important to minimize risk and maximize output quality. Output isn’t high quality if you regularly need to go back and fix it later.
I would also encourage the tech industry to err on the side of being open and transparent, rather than closed, to the extent possible, so we can all learn more from each other. Counterintuitively, given the names, Meta’s Llama models are open-source and far more “open” than OpenAI’s. Kudos also to Anthropic, another major peer in the space, for its important work to demystify LLMs and improve the interpretability of AI.
When models are opaque, most of us are worse off. But alas, “black box” models are the current state of play, and if the progression of advertising technology is any indication, that’s not going to change anytime soon.
So what should you do if you want to leverage these AI models for work? Work to understand them better, through tools like Purposely.ai, which delivers insights on exactly which text and visual elements perform best (and worst) in social media ads. We essentially “unbox” the black box of advertising creative, and the algorithms behind it all. We’re also developing new features to automate the painstaking parts of new Creative concepting, which, importantly, is different from producing final creative assets. My parting words: if you represent an advertiser, make sure you understand your creatives really well! It’s the last piece of the ad puzzle that humans still have a hand in, and it’s slipping out of our grip, fast.