Video game voice actors fear that generative AI’s ability to replicate their voices may cost them work and, more fundamentally, control of their own voices.
I think courts in the US are slowly coming to the consensus that AI-generated content is not eligible for copyright. My opinion is that this solves the problem rather perfectly; companies now have an incentive to use humans, because if they use AI to make content then anyone is free to rip off that content, and I think that’s the way it should be.
AI should benefit humanity, and its products should be open and available for everyone, rather than being something for corporations to exploit for their own sole benefit.
Doesn’t apply to this case. This wasn’t in a commercial product but a fanmade Skyrim mod.
Apart from that, I fully agree. AI is an amazing tool for prototypes and hobby projects that wouldn’t be made at all without it (because honestly, nobody hires artists and voice actors for something only their friends will ever see). Making all AI-generated content public domain seems like a good compromise. Scientists and companies still have an incentive to improve the technology because people still have use cases where it doesn’t matter if someone copies what they generate; hobbyists can play around as much as they want; and professionals have another tool in their toolbox to speed up prototyping before they start work on the actual handmade product.
So how much human involvement is required for something to become eligible for copyright? If I’m an artist and I draw a character all by myself, but use AI to fill in the background, would that be eligible? If I’m a software developer and I occasionally let Copilot autocomplete a line because it suggested the correct thing, does that mean the entire program is now impossible to copyright? Where is the line?
The Copyright Office recently released a webinar on just this point. Basically anything that is creative and human generated is still granted copyright, but the AI generated components are themselves non-copyrightable. In your examples, those components are fairly de minimis (small and insubstantial) and so the overall copyright of the work wouldn’t be impacted.
AI-generated content is not eligible for copyright.
That’s just not true, no matter how often clueless artists repeat it. What’s not copyrightable is simple, completely AI-generated stuff, like typing “car” into Stable Diffusion and trying to copyright the resulting image of a car. That’s not eligible for copyright, and rightfully so, since AI generation would turn into a minefield if people could copyright sections of its output in bulk. But nobody does that anyway.
People want full books and games and stuff, not singular images of a car. For the time being at least, any full game or book will still be full of human input, it’s not something the AI will spit out without effort. You still need numerous different AI systems, custom training and a whole lot of manual back and forth before you get something to your liking. And the result of all that effort will still be copyrightable.
There might come a day in the not-so-distant future when AI can do it all by itself and make the whole game or movie from start to finish with no human input. But at that point you aren’t just replacing the actor, you are making Disney obsolete, as you no longer need static content; you can just generate it dynamically on the fly like you are on the Star Trek Holodeck. And at that point the fate of the voice actor will be the least of your problems, as you just made a whole lot of other human jobs obsolete too.
No, this isn’t really correct. The US Copyright Office has released policy that pretty clearly states where the line falls, and it’s certainly beyond super simple prompts. In fact, by the reasoning in the policy document, I’d say that any time you’d want a work-for-hire agreement to assign copyright if the AI were replaced with a human, the output is likely non-copyrightable subject matter.
I’ll add that how this works with modern AI art flows remains to be seen, but I think it will probably land on the side of no copyright. Currently, works use very elaborate prompts, some edits, bashes, and masks in an editor, and then img2img and inpainting to really get the work where it needs to be. However, under the current rubric, the nexus of creativity is still happening in the model, so such works are unlikely to be granted copyright.
What’s their definition of AI then? Seems like games that feature heavy procedurally generated content (for example) could fit many common definitions, and that is clearly not in the spirit of what they’re trying to do here.
For a lot of procgen content, I believe the individual assets or component parts are still handcrafted; it’s just the placement of them that is done procedurally. But video game copyright is actually pretty complex (in theory; somewhat in practice, too, but much more answerable), so I’m not sure how this would pan out, assuming a fully genAI set of assets and their placement. I suppose those components would need to be identified for limitations on the copyright under current filing guidelines, but there is still a whole lot in the game that is protected.
There need to be laws to establish protections of voice work for this reason. Any commercial use of an actor’s voice should require compensation.
There are already laws in place concerning use of actors’ likenesses, but I don’t know if that would extend to their voices.
The headline makes it seem like the company used her voice but the article says it was a mod-creator aka non-commercial.
I’m all for this. I’m excited for games to have truly generative NPCs that I can talk to and have conversations with - but I want the voice actors who were used to generate those models to be compensated for it.
I think most people would even pay extra for it. Take the Skyrim mod where you get a follower, like the article lays out. I think most people would gladly accept a follower with an AI-based voice as extra DLC, where the base game is as normal but for, say, $10 more you get an AI companion. I think that’d be a fair tradeoff, and the voice actor gets compensated fairly for it.
If the voice actors get paid fairly for the use of their voice, and I mean fairly, then there really isn’t much of an issue.
Provided you can get the model to land at a place where it’s replying like a well-constructed character and not, well, an AI model (hopefully through the input and effort of a talented and well-supported writing team), I don’t see a future where this isn’t where this kind of tech lands. Games are always striving for some sense of realism (some correct away from reality, but the driving force of the games industry is in that direction), and while I don’t think absolute realism is a healthy direction for people to aim, realistically realizing characters leaves a lot of room for unique and incredible games. Obviously bespoke writing is still the heart and soul of a good narrative, but there are some areas it can’t really cover, and I think that tech like this is great for covering that uncharted water.
Given that, the voice actors who train these models for moment-to-moment interactions, and for other stuff that can’t easily be written for when a game’s content creates a need for it, really NEED to be properly compensated; otherwise it sets an incredibly unhealthy precedent for the industry. The speed at which this sort of AI develops, outstripping proper precedent (legally and professionally), is much scarier to me than some sort of AI-overlord future.
I think the foreseeable future will give us a hybrid solution where a writing team creates most of the content (dialogue for the main story and important side quests, character backstory, distinctive mannerisms) and AI fills in the rest.
One of the main problems with branching narratives is that it makes writing and recording dialogue very expensive. The upcoming Baldur’s Gate 3 has something like 170 hours of cutscenes and players will see less than 10% in a single playthrough. Not to mention hundreds of thousands of dialogue lines. Developers have to find techniques to reuse as much as possible which leads to situations where the ending consists of a loosely connected list of applicable scene snippets. Now imagine that AI can fill in the gaps between those snippets to make them seem like a single continuous sequence.
AI can also fill in events that the developers could never anticipate. Imagine you kill a random blacksmith in Skyrim. With current technology, NPCs would either not react at all or give a generic “killing innocents is bad”. How awesome would it be if the game automatically generated a prompt from the basic facts (NPC refuses to give discount, player kills NPC, NPC was a blacksmith, player steals dead NPC’s wares, wares are needed for a sidequest, …) and then used that to provide not only companion dialogue but also possible replies for the player? If this happens multiple times, maybe the companion will mention it in other situations or confront the player when they’re alone. Imagine if, during a long walk through the wilderness, your companions started talking about what happened during the last few days.
With a fully AI-generated character, this would all become very generic and unnatural but if every character can extrapolate from a few hundred handwritten lines to match their tone, this could actually work.
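To make the "generate a prompt from the basic facts" idea a bit more concrete, here is a rough Python sketch. Everything in it is made up for illustration: the `GameEvent` and `Companion` classes, the `build_prompt` function, the companion name and the sample lines are all hypothetical and not taken from any real game or mod. It only assembles the prompt text from logged events plus a few handwritten lines for tone; it doesn't call any actual language model.

```python
from dataclasses import dataclass, field


@dataclass
class GameEvent:
    """One basic fact the game logged (hypothetical structure)."""
    actor: str
    action: str
    target: str
    detail: str = ""

    def describe(self) -> str:
        base = f"{self.actor} {self.action} {self.target}"
        return f"{base} ({self.detail})" if self.detail else base


@dataclass
class Companion:
    """A companion with handwritten lines that define their tone."""
    name: str
    sample_lines: list[str]
    memory: list[GameEvent] = field(default_factory=list)

    def witness(self, event: GameEvent) -> None:
        # Remember the event so it can come up again later.
        self.memory.append(event)


def build_prompt(companion: Companion, max_events: int = 5, max_samples: int = 3) -> str:
    """Combine tone samples and the most recent events into one text prompt."""
    recent = companion.memory[-max_events:]
    lines = [
        f"You are {companion.name}, a companion in a fantasy RPG.",
        "Match the tone of these handwritten lines:",
    ]
    lines += [f'- "{sample}"' for sample in companion.sample_lines[:max_samples]]
    lines.append("Recent events you witnessed:")
    lines += [f"- {event.describe()}" for event in recent]
    lines.append("React to the most recent event in one or two sentences.")
    return "\n".join(lines)


if __name__ == "__main__":
    mira = Companion(name="Mira", sample_lines=["Steel first, questions later."])
    mira.witness(GameEvent("blacksmith", "refuses", "a discount"))
    mira.witness(GameEvent("player", "kills", "blacksmith", "wares needed for a sidequest"))
    print(build_prompt(mira))
```

Because the companion keeps a memory of events, the same function could be called again later (say, during a long walk) to let the character bring up what happened earlier; how good the resulting dialogue is would depend entirely on the model and the handwritten samples.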
I think one of their biggest concerns is that the majority of the work will be done by the AI and then a significantly smaller team of random writers will be hired on a very short-term contract to merely check the work over and dot the i’s and cross the t’s.
I’m actually not too concerned about that. Yes, companies will try it because it saves money. But that will have a serious impact on quality and I still have hope that players will finally learn to just not buy a bad product. Sure, the bigger publishers will be able to sell through brand recognition alone for a while but not forever. This year, we’ve seen a lot of unfinished games and at least reviewers are starting to notice. The difference is that bugs can be fixed to recover from a bad launch. Bad content not so much.
I think the example of the NPC follower in Skyrim is a great place for it. Being able to ask your follower “which way to Riften?” is kind of neat, as is giving them commands like “go stand by the rock and attack the wolf when they get close”. For actual dialogue, though, take Johnny in Cyberpunk or Liara in Mass Effect: there’s no way AI can stand in for voice actors.
I’m thinking shop owners, random NPCs, etc. Little bits of dialogue.
With games it’s kind of unavoidable in the long run. They are interactive and dynamic, be it outright mods or just new situations arising out of gameplay. Being able to adapt the voice or automatically generate variations of a sentence would be a huge benefit. It’s not exactly a new idea: we’ve had things like iMUSE that dynamically adjust music for 30 years, and effects like reverb have been dynamically added for at least 20. Most cutscenes are also rendered in real time for exactly this reason; you can’t reflect a costume change in a static FMV sequence. Now imagine you want a character to make a comment on the costume or weapon you are currently wearing: you’ll quickly end up with a combinatorial explosion of the amount of stuff you’d have to record.
Expanding it all with AI voices or AI-filtered voices (think RPG with character creator) is just unavoidable. You can’t drive a dynamic medium with static content to its fullest potential. And of course many smaller indie games just don’t have the money for full voice-over to begin with.
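For a feel of how fast that combinatorial explosion grows, here is a tiny back-of-the-envelope sketch in Python. The option counts are invented numbers, not figures from any real game, and the function names are just for illustration. It compares recording one line per combination with recording a single template line plus one clip per option name (the kind of thing a synthesized or adapted voice would make possible).

```python
from math import prod

# Hypothetical number of choices per category; not taken from any real game.
option_counts = {
    "costume": 40,
    "weapon": 60,
    "companion": 8,
}


def lines_per_combination(counts: dict[str, int]) -> int:
    """Lines needed if every combination gets its own recorded comment."""
    return prod(counts.values())


def lines_with_template(counts: dict[str, int]) -> int:
    """Lines needed with one template line plus one clip per option name."""
    return 1 + sum(counts.values())


if __name__ == "__main__":
    print("one recording per combination:", lines_per_combination(option_counts))
    print("template plus per-option clips:", lines_with_template(option_counts))
```

The first number multiplies with every new category while the second only adds, which is why static recordings stop scaling long before the gameplay systems do.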
I also wouldn’t mind AI just as a filter to change a voice, since some voice actors are just way too recognizable.
As long as they pay the actors fairly for use of their voice and/or likeness, it’s fine. The problem right now is the exploitation of people’s personality rights.
Yeah, but there will still be less work. Eventually they will generate new voices not related to anybody. Sure, in some cases, if they want the fandom or popularity of a famous person, that will still be needed, as you say. But for generic NPCs? An AI-generated voice that is, in a way, a new person will be enough, so less work for voice actors for sure.
Ideally they’d just have AI do all the voices and get rid of the voice actors altogether. Maximum profit at all cost.
A company will want to pay voice actors a one-off fee to train on their voices, cut them loose, and then licence the result out in perpetuity. Probably with greater legal protections than the underlying actor.
Yeah, it’ll be like buying instrument sample packs for keyboards. Electronic musicians have been doing this for ages. Now you’ll be able to buy the “12 gruff voices” pack and tell it to read your script. It’ll be great for NPCs, but probably not good enough for main characters in AAA titles.