The AI tools helping to shape production

On this month's ScreenSkills podcast, we spoke to stage and screen immersive director Robyn Winfield-Smith, Artistic Director of Liminal Stage Productions, Associate Learning Practitioner at the RSC and Development Researcher at Hat Trick Lab. She shared her thoughts on AI technologies and their use across the screen industries. Here, she picks out some of the platforms and tools already in place, as well as those that are on their way.

I'm a relative newcomer to the AI space, and I'm coming at it from the perspective of a values-led stage, screen and immersive director, with an appetite for technical and creative innovation, and a fascination with how the technologies of production are currently converging across different media and genres in truly exciting ways. Emerging technologies such as AI are opening up a whole host of opportunities for live performance, screen and XR companies to engage the Gen Z and Gen Alpha audiences of the future in new, more imaginative, interactive and immersive ways. Already, the creative industries are innovating and adapting in response to the exponentially developing disruptor that is AI, and we're likely to see an evolving cross-industry and cross-sector response in the years ahead.

Here are some of the tools that have caught my eye in recent months:

Veo 2, Sora and Runway

These are the three front-running video generation tools (and the most controversial for the screen industries), and all three are very much worth checking out.

Move AI

Liminal is currently creating a live multi-camera virtual production stage show that will integrate AI into its realtime production workflows - for example, into the live motion-capture of onstage performers. An example of a company currently working in this space is move.ai, who use AI to enable markerless mocap. In the same way that a tool such as Midjourney uses AI to 'predict' what each pixel should look like, based on the images on which the model has been trained, Move uses AI to predict what the next movement of a fixed point on the human body (like an elbow or a wrist) will be, based on its motion-captured libraries of human movement.
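
To make that 'prediction' idea concrete, here's a deliberately simplified Python sketch (my own toy illustration, nothing to do with Move's actual models): it guesses a joint's next position by extrapolating its recent trajectory, where a trained system would instead draw on vast motion-capture libraries.

    import numpy as np

    # Toy stand-in for learned mocap prediction (NOT Move AI's method):
    # given the last few observed 3D positions of one joint (e.g. a wrist),
    # guess where it will be in the next frame. A real model would be
    # trained on large motion-capture libraries; here we simply extrapolate,
    # assuming constant velocity between frames.

    def predict_next_position(recent_positions: np.ndarray) -> np.ndarray:
        """recent_positions: (n_frames, 3) array of x, y, z samples."""
        velocity = recent_positions[-1] - recent_positions[-2]  # last frame-to-frame step
        return recent_positions[-1] + velocity

    # Example: a wrist moving steadily along the x-axis at 0.1 units/frame.
    wrist = np.array([[0.0, 1.0, 0.5],
                      [0.1, 1.0, 0.5],
                      [0.2, 1.0, 0.5]])
    print(predict_next_position(wrist))  # -> [0.3, 1.0, 0.5]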

Eleven Labs

Hat Trick and Liminal encountered Eleven Labs for the first time at the brilliant CogX Festival in 2023, and were instantly impressed by the (slightly terrifying) calibre of their voice-cloning tool. We subsequently integrated it into the technical pipeline for our digital comedian, in collaboration with Digital Catapult and Target3D amongst others - partly because of the way in which Eleven are seeking to empower voice artists to monetise their voices through their online platform.
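
For anyone curious about what 'integrating' a tool like this actually involves, here's a minimal Python sketch of a request to Eleven Labs' documented text-to-speech REST endpoint. The voice ID, API key and spoken line are placeholders, and fields such as model_id may change between API versions, so treat it as a starting point rather than gospel.

    import requests

    # Minimal text-to-speech request to the Eleven Labs REST API.
    # VOICE_ID and API_KEY are placeholders; check the current API docs,
    # as endpoints and fields may change between versions.
    VOICE_ID = "your-voice-id"  # e.g. a cloned voice from your account
    API_KEY = "your-api-key"

    response = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
        headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
        json={
            "text": "Good evening. I'm told I'm hilarious.",
            "model_id": "eleven_multilingual_v2",
        },
    )
    response.raise_for_status()

    # The response body is audio (MP3 by default); save it for the
    # next stage of the pipeline (in our case, facial animation).
    with open("line_01.mp3", "wb") as f:
        f.write(response.content)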

Audio2Face

This is part of the NVIDIA Omniverse pipeline for digital humans, responsible for turning audio files containing human speech into corresponding mouth movements. We used Eleven Labs to create our digital comedian's audio, and then Audio2Face to generate the mouth movements. Actually, it's worth saying that the current limitations of this tool were a great source of creative stimulus for us: we knew that Audio2Face only covers the muscles in the lower part of the face, which can result in 'mouth-flapping' (whereby the bottom part of the face is busily animated while the top part remains uncannily motionless and dead-eyed), so we opted to create an incredibly sarcastic, deadpan comedian who would barely require any facial expressions to pull off the humour.
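
As a crude illustration of the shape of the problem (speech audio in, lower-face motion out), the sketch below maps the loudness of each animation frame's audio to a single 'jaw open' value between 0 and 1. This is emphatically not what Audio2Face's neural network does - it's a naive stand-in, assuming the speech has been converted to 16-bit mono WAV - but it shows exactly the kind of lower-face-only signal that produces 'mouth-flapping' when the rest of the face never moves.

    import wave
    import numpy as np

    # Naive illustration of audio-driven mouth animation (NOT Audio2Face's
    # neural model): map the loudness of each audio slice to a single
    # 'jaw open' blendshape value between 0 and 1.

    FPS = 30  # animation frames per second

    with wave.open("line_01.wav", "rb") as wav:  # assumes 16-bit mono WAV
        rate = wav.getframerate()
        samples = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)

    samples_per_frame = rate // FPS
    jaw_open = []
    for i in range(0, len(samples) - samples_per_frame, samples_per_frame):
        chunk = samples[i:i + samples_per_frame].astype(np.float64)
        loudness = np.sqrt(np.mean(chunk ** 2))       # RMS energy of this slice
        jaw_open.append(min(loudness / 5000.0, 1.0))  # normalise to 0..1 (tuned by eye)

    # jaw_open now holds one lower-face value per animation frame,
    # ready to drive a character rig's jaw blendshape.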

Flawless

Flawless offers extraordinary post-production interventions, enabling productions to avoid costly reshoots with its DeepEditor and TrueSync tools. Again, this is a company seeking to apply AI ethically and in ways that meet the real needs of the industry. Their tools let productions change the language an actor is speaking, using the actor's own voice and matching the mouth movements accordingly; change an actor's expression in post-production to support the storytelling in a scene; and make other significant changes at a far lower cost than a full reshoot.

Udio

And for anyone who wants to have a bit of fun by trying out a free generative AI tool... Udio enables users to generate music (complete with lyrics) from a text prompt. 

Ones to watch:

The future tools I'm most excited about are those that will:

  1. Facilitate realtime interactivity. Liminal is developing a generative AI character capable of realtime, emergent dialogue with audience members (rather than scripted exchanges) across a 2D interface and in immersive 3D spaces - there's a bare-bones sketch of this kind of dialogue loop after this list. We're excited about the storytelling, character-based and world-building possibilities here.
  2. Develop smarter methods for valuing/protecting IP, contracting/remunerating artists, and watermarking/protecting output. These are the biggest issues facing artists, studios and networks at the moment - I'm hoping the spate of innovations in this space may result in a tool that helps navigate and solve some of these challenges.
  3. Integrate multiple models into a seamless, end-to-end workflow. We're going to see a multitude of startups offering these sorts of capabilities - I can't wait to see how they support industry adoption of the most technically useful and creatively rewarding tools.
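
To give a flavour of what 'emergent rather than scripted' means in practice, here's a bare-bones Python sketch of an interactive dialogue loop. The generate_reply function is a placeholder for a call to whichever language model a production chooses; the point is that each reply is generated in the moment, with a persona and a memory of previous turns, rather than looked up in a fixed script.

    # Bare-bones sketch of an emergent (rather than scripted) dialogue loop.
    # generate_reply is a placeholder for a call to whichever language model
    # the production uses; the persona and history are what make the
    # character feel consistent across turns.

    PERSONA = ("You are a dry, sarcastic digital comedian performing live. "
               "Stay in character and keep replies short.")

    def generate_reply(persona: str, history: list[tuple[str, str]],
                       audience_line: str) -> str:
        """Placeholder: in a real build this would call a language model API."""
        return f"(in character, responding to: {audience_line!r})"

    history: list[tuple[str, str]] = []
    while True:
        audience_line = input("Audience: ")
        if not audience_line:
            break
        reply = generate_reply(PERSONA, history, audience_line)
        history.append((audience_line, reply))  # memory across turns, unlike a fixed script
        print("Character:", reply)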

So that's my tuppence. My best advice: stay clear about your values, get actively involved wherever you can, seek out specialist knowledge, join the national conversation, ask probing questions, listen to others' perspectives, be creative and joined-up/collaborative in offering ideas/solutions, and always try to identify the threat where there is clear opportunity, and of course the opportunity when there is clear threat!
