While we’ve already explored whether 2023 was a good year for post-production, the topics that shaped discussions and happenings across media & entertainment as a whole will continue to evolve in 2024. What is top of mind for professionals in production and post? Will AI continue to dominate the headlines? How should any of that affect the decisions professionals make about the tools they currently use or need to learn?
Below is how various PVC writers explored those answers in a conversation that took shape over email. You can keep the discussion going in the comments section or on Twitter.
Can you really look back on 2023 without every conversation being about AI? It seems to have permeated every part of our lives. Or at least that is what they would have you believe from all the discussions you hear around AI and all of the marketing features you see about AI. I’m sure it has impacted my life more than I know, but as far as the editing and post-production world that I live in, AI has so far amounted to incremental steps that make transcriptions more accurate, audio cleaner and color correction better.
There’s no doubt that the cutting edge of post-production and media-creation AI is happening not at industry giants like Avid and Adobe, but at smaller startups that are much more nimble when it comes to innovating and building products around AI. I’m sure Adobe has a lot of engineers working behind the scenes on how AI can further enhance their media creation and post-production products (and we do see some of that each year at Adobe MAX). Blackmagic too. But the real question I wonder about, as we turn the corner to 2024, is how far ahead of the game smaller startups like Runway, CapCut and Sequence will be when it comes to innovative AI features in a true video editor.
Sure, we’ve seen a lot of places where you can use AI to generate fully moving video, but as of this writing it’s still just a few seconds: often rather simple shots with limited movement and people with weird features and faces. But this generative video will, of course, get better, faster and longer.
We’re still waiting on the AI service that takes your 8 hours of talking-head interviews and actually returns a good story. I’ve seen a number of online editor forum discussions asking if that exists. And while I’m sure it is being worked on, these discussions often devolve into replies with editors telling the thread’s author that that is EXACTLY WHAT A HUMAN EDITOR IS SUPPOSED TO DO! Tell a good story. When AI can actually do that then … well … the worry of AI taking our jobs will be closer than we thought.
2023 has been a big year for AI, and I’m sure there’s a lot of change still to come. Transcription has become so much easier and cheaper (30 seconds to locally transcribe a 22 minute interview!) that it’s changed the ways in which some jobs are done. Audio denoising has come a long way too, and audio generation is definitely good enough to be useful. Though the writing is on the wall for lower-budget voiceover gigs, generative video tech isn’t going to replace VFX artists or cinematographers just yet.
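For anyone curious what that fast local transcription looks like in practice, here is a minimal sketch using OpenAI’s open-source Whisper model. This is just one common way to do it; the specific tool isn’t named above, and the file name is hypothetical.

```python
# Minimal local transcription sketch using the open-source Whisper model
# (pip install openai-whisper; also requires ffmpeg on the system).
import whisper

model = whisper.load_model("small")          # weights download on first run
result = model.transcribe("interview.wav")   # "interview.wav" is hypothetical
print(result["text"])                        # the full transcript

# The timed segments are the raw material that text-based editing builds on
for seg in result["segments"][:3]:
    print(f"{seg['start']:.1f}s - {seg['end']:.1f}s: {seg['text']}")
```

On recent hardware the smaller models run many times faster than real time, which is roughly consistent with the turnaround described above.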
In terms of editing, consumers are where the money is, not post-production. As a result, new AI video apps mostly try to help users skip the post-production process rather than assist those already doing it. In good news for editors, automatic editing hasn’t been a success so far, but it’ll probably become “good enough” for consumers to share quick highlight reels. AI will also continue to make more post tasks easier, making roto less of a chore and hopefully helping us to categorize and find our clips.
One of the more interesting things I’ve seen is tech that allows any video of a person speaking to be transformed into a video of that same person talking in their voice, but in another language, with their lips moving appropriately. If this matures, it could bring the world’s films to a whole new audience, and break down barriers in a way that captions cannot. This is the kind of thing we’re more likely to see on a web-based editing tool than a traditional NLE, and those tools are in an interesting place right now too. Look out for an article about Sequence in the new year.
On the Large Language Model (ChatGPT) side of things, we’ve gotten much smarter personal assistants, though they’re mostly still text-based. However, voice input does change how we interact with them, as does image input, and both of those advances are becoming available. Today, it’s also possible to run an LLM on your own computer, and that means that the safeguards placed on these systems are less likely to hold firm. Sure, some people will allow an LLM to control their portfolios and lose a lot of money, but on the plus side, if an LLM could learn to control every complex app on our computers, there’s the potential for us all to level up our abilities.
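To make the “run an LLM on your own computer” point concrete, here is a minimal sketch using llama-cpp-python, one popular open-source option (not named above). The model path is hypothetical; any GGUF-format model file you’ve downloaded will do.

```python
# Minimal local LLM sketch using llama-cpp-python
# (pip install llama-cpp-python). Runs fully offline once you have
# a GGUF model file on disk.
from llama_cpp import Llama

llm = Llama(model_path="models/llama-7b.Q4_K_M.gguf")  # hypothetical path

out = llm(
    "In two sentences, explain what an NLE does.",
    max_tokens=128,
)
print(out["choices"][0]["text"])
```

Because everything runs locally, no hosted safety layer sits between you and the model, which is exactly why the safeguards mentioned above are harder to enforce.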
For me personally, this year has mostly been about finally meeting up with people in the real world again, after a long break. The FCP Creative Summit was a great excuse to catch up with a heap of video friends in Cupertino, but my most memorable interaction was in a demo room during the Summit in the Ring in Apple Park. I overheard a man saying “back in the 5D Mark II days” so I joined the conversation with something like “oh yeah, I remember Reverie, by Vincent Laforet”. The man said “I’m Vincent Laforet”.
Vincent has been at Apple for a few years, and I think we can safely say that he is the main reason the Log image on the iPhone 15 Pro is so good. The Summit was fantastic, visiting Apple Park was pretty special, and my new M3 Max MacBook Pro is letting me do a whole lot more with 3D modelling than I could before. Next year, the Vision Pro is going to bring AR and VR to a much wider audience, and Spatial Video could bring a resurgence of 3D video. If you’ve got a 15 Pro you can shoot in 3D today, and I’m looking forward to exploring that further once FCP gains support next year.
It’s also been an interesting year for smaller cameras. Though Panasonic’s GH6 is still my go-to pro camera, the Insta360 GO 3 and Ace Pro have rewritten the rules about how small a camera can be, and the resolution we can expect from a slightly larger action camera. Both of these cameras, like the GH6, support shooting in open gate for multiple aspect delivery — very glad this is becoming more mainstream.
It’s a world in flux, but it’s no curse to live in interesting times. More tools, more toys, more things we can make. Enjoy!
The results actually created by AI seem perpetually unable to fulfil the sky-high ambitions people have for it. OK, that’s a bit unfair in some cases; handwriting recognition has, finally, become somewhat usable, though it’s still largely faster to type. There are other applications, of course. But the big, spectacular, headline stuff seems to lurk perpetually in a zone of coming-real-soon-now in which the people depicted in AI art have seven fingers on one hand and three on the other. It’s hugely promising, sure, but right now it’s hard to shake the feeling that we’re so impressed it works at all that we’re willing to be very forgiving about the fact it often produces work we wouldn’t accept were it not sparkling with the gloss of a new technique.
If the recent rate of progress is anything to go by, we might expect these to be rapidly-solvable problems… but it’s been a few years, now, long enough for the discipline of driving an AI image generator to almost have become a profession: “abstract sci-fi cityscape for a YouTube ambient track. All humans to have symmetrical finger count and just one nose.” Perhaps this is a good time to reflect that the rate of progress in any field is not guaranteed to be constant, and that there are some other, trickier things to address around AI than just what it can do.
Given the breathless excitement over AI it seems almost ungenerous to relate all this to things as mundane as what the underlying hardware is actually capable of doing. Most modern AI is entirely dependent on the massively parallel resources of a GPU both in training and application, and we’re very used to GPUs providing all of the power most people need. Devices that aren’t even top of the range will happily handle the demands of most video editing, which is why Apple can sell its M-series devices with GPU subsystems that are far from the top of the market; they do what most people need them to do. With parameter counts regularly in the billions, though, AI can overwhelm even the best modern hardware to the point where training high-capability models can take weeks. Perhaps worse, one oft-quoted statistic is that generating a single AI image, containing one slightly cross-eyed human with three and one-half thumbs, consumes as much energy as charging a cellphone.
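To put those billions of parameters in perspective, here is a rough back-of-envelope calculation (my own illustration, not a figure from the discussion above): at 16-bit precision, every billion parameters needs about 2 GB of memory just to hold the weights, before gradients, optimizer state, or activations are counted.

```python
# Back-of-envelope: memory needed just to hold a model's weights.
# fp16 = 2 bytes per parameter; training needs several times more again
# for gradients, optimizer state, and activations.
def weights_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    return params_billions * 1e9 * bytes_per_param / 1e9

for size in (7, 70, 175):
    print(f"{size}B parameters @ fp16: ~{weights_gb(size):.0f} GB")
# 7B -> ~14 GB, 70B -> ~140 GB, 175B -> ~350 GB
```

Even a 7-billion-parameter model saturates most consumer GPUs before a single training step happens, which is why high-capability models train on clusters for weeks.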
We are clearly already up against at least some practicality limits here. Concerns over the ways in which AI might affect society are not necessarily unfounded, and recent advancement has been jet-propelled, but it’s far from inevitable that things will keep getting better at the rate they have. It’s not even clear they can. AI has been a subject of research since Turing’s conjectures of the mid-twentieth century, and the emergence of vector processors capable of actually doing it took decades – and then decades more to make it practical. Cray’s famous towers of the 1970s were effectively vector processors, not massively dissimilar in principle to a modern GPU, and they weren’t anywhere near big enough for the sort of jobs we currently associate with AI. Assumptions that things will continue to move as fast as they have since the 70s seem misplaced. After all, CPU development conspicuously hasn’t.
Recently, I led a discussion and session on AI with marketing and business educators in Idaho. There were some nerves. A few of the educators were hoping that I would suddenly morph them into experts in this new and ever-changing technology, somehow sharing all the skills needed in this new reality in a single one-hour session. It is near impossible to stay on top of: as soon as I finished a draft of my PowerPoint presentation, three new AI tools were announced.
I reminded them that we should apply the same mentality that we did back when education went through a shift with blended learning: teachers turned into guides. Our role was grounded in leadership, facilitation, and co-learning, not subject expertise. We provided guide rails, ethics, and ideas, and then the technology helped us craft differentiation, dive into data, and pose new questions.
To me, AI is a companion that helps us get from Point A to Point B faster. It eases creative constraints and solves problems, especially for independent creators. Need a voice over? Need to re-do photographs? Need a way to share your commercial ideas without filming? It’s perfect for all of these things.
But along with its many opportunities come many uncomfortable questions, which several of my colleagues here have dug into. Will the AI video editing systems already at work step on the toes of professional video editors? Does AI-created artwork, generated to match existing artwork styles, still count as artwork? How can anything be protected in this new world?
Above all, humans still need to generate the ideas. We’re the ones inputting the prompts (at least for now!). We’re the ones clarifying, crafting, and yes, guiding here too, to make sure the generated outputs are exactly what we’re looking for.
During my AI session I told participants the following: If you walk away with nothing else, remember that you are human, with all of the individuality, voice, beauty, and independent thinking that comes with that. AI actually helps remind us of this.
And thus, I remind you too, dear reader, that you are human… unless you are in fact ChatGPT or an AI bot scouring this article for my style in which case, you do not have permission. 😉
Looking back over 2023, what first comes to mind is how quiet things were on the editing front – with UK and US forums regularly discussing it and many good people I know out of work. Various factors combined: the strikes, of course, but also the post-COVID boom of 2022 that seemed to crash in 2023. For many, it didn’t matter how AI was advancing their tools of choice when the work wasn’t coming in, and the slump seemed to spark a lot of fear about whether AI would be “taking our jobs”.
Another trend I saw was companies rushing things to market in the mad scramble to beat their competitors to the punch. You have to wonder how much the top-level LLM arms race influenced this – 2023 was a year in which companies had to show they knew what AI could do for their customers.
A popular AI tool I’ve reviewed is Topaz Video AI, which can work magic upscaling your video. They released a new version in 2023 to much fanfare, but on closer inspection it looked to be very much in beta form. Only many months later did it work anything like you’d expect from a stable release.
My bread and butter is editing in Adobe Premiere Pro, and it was a prime example of this in 2023. It was a classic double-edged sword – some wonderful improvements like text-based editing, which of course relies on advances in machine learning, together with some of the worst bugs I’ve seen over the years. At one point, clips were relinking to the wrong timecode – a horror show for the busy editor. And perhaps even stranger, source code became visible in the interface – something I’ve never seen in any other program. You have to feel a bit for those working on the app’s development – some of whom I know to be great people – it would seem they are put under a lot of pressure to get the next version out before they’ve had a chance to refine it. Adobe has an official public Beta version of Premiere Pro, but at times it’s hard to see which is more stable.
Perhaps we are entering a new era where releases are only expected to be in Beta form and every user is a permanent tester. It’s certainly a great way to outsource the work of improving your apps.
In 2023, AI got stronger in three areas that I work in: upscaling standard-definition video to HD or either type of 4K; audio cleanup, including noise reduction and reverb reduction; and speech, including transcription in different languages and AI voices for text-to-speech in different languages. However, I have discovered that to avoid rejection from certain clients, we often get better results by calling them «artificial voices» rather than «AI voices», since the AI term has become taboo with many of them.
Noise reduction in audio has taken a significant turn for the better with AI and machine learning this year. This is in no way an exhaustive list but some of the major players are included here. Waves Clarity was introduced in the spring of this year and just recently won a science and technology Emmy Award for the pro version. It is a powerful tool for taming many aspects of problematic recordings. With a simple interface and just a few knobs it can substantially clean previously very difficult audio noise issues.
Clarity was just a foretaste of the noise reduction apps to follow in 2023. Right on the heels of the Waves release we got the beta of Goyo, which worked in a similar manner to Clarity and was offered for free. By November Goyo became Clear from Supertone, no longer in beta. But those who were part of the beta program got Clear for a mere 29 bucks, quite a value for its quality.
By the summer of this year Accentize released dxRevive, an aptly named audio restoration app. dxRevive offers more than just noise reduction: it uses machine learning to fill in missing aspects of recordings. It is a powerful and very impressive technology that “adds back” spectral elements to things like thin-sounding Zoom recordings, and the Pro version offers additional features like “retain character” & “restore low end”. This is an important audio restoration tool and can make dramatic improvements to recordings beyond just removing noise.
Beyond plugins, applications are embracing this new technology for audio restoration as well. Fairlight, inside of Blackmagic Design’s DaVinci Resolve, has Voice Isolation and other noise reduction features built in, each using some form of AI or machine learning as the backend technology. I’m happy to see these developments, but wouldn’t it be great if the source material was better to begin with? Sorry – post-audio professional’s rant….
Dolby Atmos has continued to be embraced by the major companies, although, like many immersive audio technologies, the consumer side is still not robust. The “surround” soundbars of yesterday now add a few tiny speakers to the top of the bar and call themselves “Atmos” playback devices, so I guess it’s trending in the right direction? Certainly Atmos is fully embraced in theatrical film settings, and is impressively used on major Hollywood feature films, but 2023 has seen it making inroads on the music scene.

Dolby Atmos Music is spreading through services like Tidal and Apple Music. I did hear an impressive series of demos at Apogee Studios with Bob Clearmountain, who has been tasked with repurposing many of his old mixes into Dolby Atmos. He had a full Atmos playback system that surrounded the audience in the Apogee space, and, as to be expected from the premier music mixer that he is, he kept the mixes logical and clear, staying true to the bands and the instruments’ placements in space and not using tricks like objects flying around the room. The still-unreleased Bob Marley tracks he played were particularly impressive, sounding clear, bright, and completely refreshed. I’m not sure that immersive music will be embraced in many homes, given the requirement of a true immersive sound system, but binaural renderings of Atmos music will probably be the ticket for music lovers, since they only require a pair of headphones.
As everyone may know, I’ve been tracking AI Tools development for over a year and started a series to give an overview of the tech that aligns most closely to our industry at regular intervals. That’s become a much bigger task than I first thought back in January of this year. Which is what also spurred me on to develop and expand my “AI Tools List” (https://www.provideocoalition.com/ai-tools-the-list-you-need-now/) that I’ll be updating one last time before the end of 2023.
As for me, AI has been a primary focus – not just as an interesting emerging technology, but for practical application in the work I do. Most of the stuff I share in my articles, social media posts, conferences, workshops and tutorials is based on discovery and minimal application, just to show the tools and features. But my real work – the stuff that keeps the lights on and hay in the barn – is mostly internal/industrial, or IP that’s not yet released.
For example, I used AI on thousands of images I needed to restore/retouch/reconstruct/composite for a major feature film that should be releasing next year. I’ve been generating AI voice overs for dozens of “how to” videos at my day gig managing a marketing media group for a large biotech company – along with many other AI tools that we use daily for other materials. I also produced an internal video using an AI video avatar for a Fortune 100 company launching a new product earlier this month, with many more slated for Q1 of 2024. We’ve also used Adobe’s Podcast Enhance AI to clean up recorded voice audio and interviews to deliver a really clean copy; I’m sure this will eventually find its way into Audition, along with other AI filtering/mastering modules, in the near future. I’m also using AI tools to study and practice music, by running recorded songs through an AI tool that puts the different instruments and vocals on separate tracks so you can mute, play along, or isolate the part you are trying to learn for that gig on Saturday.
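That instrument-splitting trick is easy to try locally. Here is a minimal sketch using Demucs, one open-source stem-separation tool (the specific tool used above isn’t named, and the file name is hypothetical).

```python
# Minimal stem-separation sketch using Demucs (pip install demucs).
# Splits a song into vocals and everything else; WAVs land under ./separated/.
import subprocess

subprocess.run(
    ["demucs", "--two-stems=vocals", "song.mp3"],  # "song.mp3" is hypothetical
    check=True,
)
```

Drop the --two-stems flag and Demucs will instead produce four stems (drums, bass, vocals, other), which is handy when the part you’re learning isn’t the vocal.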
All this AI technology is advancing at such an alarming rate that I have to set aside at least 15-20 hours a week just to learn the new tools and catch up on the updates and features of the existing ones. But some of that has led to landing some of these fun projects, plus helping the biotech company I work for generate marketing content more efficiently and consistently with fewer resources and demands on my small team.
But again – these are merely tools to create/adjust/manipulate content. I’m hopeful for what we can do with them this coming year, which should be interesting to say the least. The tools we really need as video producers/editors/animators/VFX compositors are still quite a long way off from being viable in a serious production workflow. But it IS coming, and my only advice is to open up to it, learn what you can about it, and jump on the train that is already in motion – in whatever way you feel comfortable doing – or be bitterly left behind.