This is the second instalment in our series about Generative AI. If you missed our first post, which explored the potential and limitations of Generative AI in visual and text-based content, you can find it here.
Of all the waves that generative AI has recently been making, the developments with visual assets - images, videos, and similar - have been the most dramatic. We could debate about whether the most significant developments have been with images or text or code, but no matter which side you align with, there is no disputing that what generative AI can do with visuals is pretty amazing.
Existing uses of AI with imagery
Generative AI has been embedded into image and video editing software for a while now. Users may be using it without even knowing that a particular function is using AI to manipulate or enhance a video. Generative AI has been used extensively in the area of computer graphics to create 3D models and animation.
Upscaling of images and videos
Software with embedded AI can improve the resolution and quality of visual content. The software's algorithms analyse image or video frames and, based on the surrounding pixels, predict and generate new pixels. This makes AI upscaling of visuals time- and cost-effective. By using machine learning algorithms to reduce "image noise", the same process can clean up a picture as well as sharpen it, and can even upscale an entire video to a higher resolution.
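To make the idea of generating new pixels from surrounding pixels concrete, here is a minimal sketch of classic bilinear interpolation in Python. This is not an AI upscaler and not any particular product's method: it uses a fixed weighted average of the nearest source pixels, where an AI upscaler would instead use a trained model to predict the new values. The function name and the tiny sample image are made up for illustration.

```python
def upscale_bilinear(image, factor):
    """Upscale a grayscale image (a list of rows of numbers) by an integer factor.

    Illustrative only: each new pixel is a weighted average of the four
    nearest source pixels. An AI upscaler would replace this fixed formula
    with a prediction learned from training data.
    """
    src_h, src_w = len(image), len(image[0])
    dst_h, dst_w = src_h * factor, src_w * factor
    result = []
    for y in range(dst_h):
        # Map the output row back to a fractional row position in the source.
        src_y = y * (src_h - 1) / (dst_h - 1) if dst_h > 1 else 0.0
        y0 = int(src_y)
        y1 = min(y0 + 1, src_h - 1)
        wy = src_y - y0
        row = []
        for x in range(dst_w):
            # Same mapping for the column position.
            src_x = x * (src_w - 1) / (dst_w - 1) if dst_w > 1 else 0.0
            x0 = int(src_x)
            x1 = min(x0 + 1, src_w - 1)
            wx = src_x - x0
            # Blend horizontally along the two nearest rows, then vertically.
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        result.append(row)
    return result


# A hypothetical 2x2 grayscale image, upscaled to 4x4.
small = [[0, 100], [100, 200]]
large = upscale_bilinear(small, 2)
for line in large:
    print([round(value, 1) for value in line])
```

The corner pixels keep their original values and everything in between is filled in smoothly. Because the formula only averages, it cannot invent detail, which is the gap that learned upscalers aim to close.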
AI image generators
Image generators with embedded AI use machine-learning algorithms. They are trained on a large set of visuals - photos, paintings, 3D models, gaming assets, and so on - and then, given a set of input parameters or conditions, mix and match components from existing images to generate new images.
Recent changes in the use of AI with visual assets
The biggest change in the use of generative AI in visual content is its ability to generate more diverse and creative outputs. The technology has rapidly evolved and expanded its application to generate photorealistic images and videos, which are difficult to distinguish from the real thing. Generative AI models can be trained on massive amounts of data, allowing them to understand patterns and relationships and generate content that is indistinguishable from human-created images and videos. From the outpainting technique in the Girl with a Pearl Earring painting, to the Pope in a Balenciaga puffer coat photo, to the fanciful astronaut cowboy on the moon drawing, AI has already started to change how images and videos are being produced in commercial, industrial, and other environments.
Opportunities for the use of AI for imagery
The opportunities coming out of the application of generative AI to visual content are extensive. Every week, another batch of articles is published about new ways to apply AI to visual content. Aside from the click-bait images used in online publications to grab our attention, there are applications already in use across many verticals. Here are a few examples.
- Data visualisation. The ability to generate data visualisations allows for near-instant interpretation of data in a story-telling way.
- Product design. AI-generated images are used to create realistic product prototypes. These types of images are also used in visualisations for rapid iterations for product design and development.
- Fashion and beauty. In fashion, AI-generated prototypes of clothing and accessories can be extended to user experiences such as virtual try-ons. In beauty, virtual try-on experiences can be extended to make-up transformations with before-and-after comparison panels.
- Augmented and virtual reality. Immersive experiences in augmented and virtual reality applications are already being applied to everything from entertainment and gaming to medical, corporate, industrial, and educational spaces.
- Film and video production. AI-generated images can contribute to creating special effects and computer-generated imagery (CGI) in movies, television, and video games.
- Creative arts. Artists, graphic designers, and other creatives are using AI-generated images to explore new styles and techniques and expand their visions.
Hybrid of imaging and data visualisation
The use of generative AI in healthcare is an interesting example of how to leverage artificial intelligence. Imaging makes up the bulk of healthcare data. Combine AI for medical imaging with data labelling, synthetic data generation, self-supervised learning, and automated notetaking, and you have a powerful ecosystem which can be used to diagnose, create regressive timelines, and predict. The healthcare industry took the initiative a step further by adopting an industry standard for AI data. This makes the data deployable across systems such as Amazon, Oracle, and Google cloud services. This video takes a look at some of the ways that generative AI is working in healthcare. There are also emerging advances in AI, such as transient imaging, which open up the potential for the advancement of science.
The consensus, at least for the moment, for the use of generative AI in marketing is not to use the generated output as is but to use the output as inspiration. The generated images can speed up the creative process as there are visuals to compare against. Of course, getting a result that matches your mental image can be an interesting exercise (as depicted in this image), though as we get better at prompt engineering, the results should improve. In the US, AI-generated artworks are not covered by copyright protection, so using them for things like campaigns is probably a risky decision. On the other hand, images for "quick view" purposes, such as visual interest for a social media post, might be very useful - though a quick scroll through my own social media isn't showing much generic art. The output is very fast and cost-effective, though it is important for someone with high visual literacy to finalise a look before publication so that no offensive or inappropriate elements inadvertently get included in the generation.
Generative AI can also be used to produce illustrations and animations, which is particularly useful when your prompts are built from data sources such as a database or spreadsheet. And those product prototypes you need before the product is ready could be AI-generated art.
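As a rough illustration of prompts driven by a spreadsheet, the sketch below turns each row of a CSV export into a text prompt. Everything in it is hypothetical: the column names, the prompt wording, and the sample rows are invented for the example, and the step of sending each prompt to an image generator is left out because it depends on the tool you use.

```python
import csv
import io

# Hypothetical prompt wording; the placeholders must match the CSV column names.
PROMPT_TEMPLATE = (
    "Product illustration of a {colour} {product} made of {material}, "
    "studio lighting"
)


def prompts_from_csv(csv_text, template=PROMPT_TEMPLATE):
    """Build one text prompt per spreadsheet row.

    Assumes the first CSV line is a header whose column names match the
    placeholders in the template.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [template.format(**row) for row in reader]


# Invented sample data standing in for a spreadsheet export.
sample = (
    "product,colour,material\n"
    "backpack,green,canvas\n"
    "water bottle,blue,steel\n"
)

prompts = prompts_from_csv(sample)
for prompt in prompts:
    print(prompt)
```

Keeping the wording in one template means a change of style is made once and applied to every row, which helps keep a set of generated images visually consistent.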
Limitations and caveats
Before jumping to put AI to use in an organisation, it's important to understand the legal and ethical guardrails surrounding the use of AI.
Limitations of Generative AI
One limitation of generative AI is that it lacks actual creativity. AI cannot replicate the emotional and cultural nuances that, for example, marketers need to connect with audiences. AI models can combine and recombine existing images but not come up with something actually new. One might argue, though, that an informed use of text prompts on an AI platform like Midjourney constitutes creativity to some extent in the image that is created.
Because the images come out of a text-to-image process, the prompts may produce unexpected outputs. To get a specific result, the prompts need to be very detailed. Even then, the results may not be as anticipated, leaving the requestor frustrated by their lack of control over the process.
Another limitation is that generative AI models require vast amounts of data to be trained, and any skew in that data can lead to bias in the generated content. We've seen regular reports of applications being exposed as having biases in a wide range of areas, such as recruitment, facial recognition, policing, automotive, social media, education, and so on. Furthermore, the technology is not yet advanced enough to replace human creativity entirely, and it cannot replicate the emotional and cultural nuances needed for effective human communication.
Legal and ethical caveats of AI-generated images
It's important to understand the legal regulations and ethical considerations of using AI-generated images. As mentioned earlier, US copyright law does not allow AI-generated images to be copyrighted, while the UK does allow copyright where there is no human author. However, using copyrighted elements in an AI-generated image creates different challenges. Getty Images is suing Stability AI (Stable Diffusion) in London for infringement of intellectual property rights for allowing copyrighted works to be processed by an AI generator. These and other issues will be raised as new situations come along.
The ethics of using generative AI images is also in flux. In a world where a deepfake can breed disinformation, where artistic styles and specific images can be used in ways that the original artist finds offensive, and where artists whose works contribute to generated works are not compensated for their contributions, the speed of AI's progress may be outpacing the ability of the average human to make discerning decisions about what they're viewing.
The future of AI-enabled visuals
Generative AI models are continuously being improved so that they can generate content that is more nuanced and varied. There has been an increase in the use of generative AI to create interactive visual content - chatbots, interactive videos, and game assets.
With the advances of emergent behaviour in AI models - that is, behaviour that was not explicitly programmed into the model but emerges from it, and is often more complex than the sum of its parts - we can't predict the variety of ways that AI will help organisations manage their digital assets.
If this article has piqued your interest, here are some resources to help you with your generative AI visuals:
Rahel Bailie, Executive Consultant EMEA
Over the coming weeks, we will be publishing further blog posts on the topic of AI: AI use with Text-Based Content, AI use in Marketing, AI use in Life Sciences, and AI use for Product Content.