Can AI Replace Actors? Here’s How Digital Double Tech Works

Inside the orb, the world is reduced to a sphere of white light and flashes. Outside the orb’s metallic, skeletal frame is darkness. Imagine you are strapped into a chair inside this contraption. A voice from the darkness suggests expressions: ways to pose your mouth and eyebrows, scenarios to react to, phrases to say and emotions to embody. At irregular intervals, the voice also tells you not to worry and warns that more flashes are coming soon.

“I don’t think I was freaked out, but it was a very overwhelming space,” says an actor who asked that his name be withheld for privacy reasons. He’s describing his experience with “the orb,” his term for the photogrammetry booth used to capture his likeness during the production of a major video game in 2022. “It felt like being in [a magnetic resonance imaging machine],” he says. “It was really very sci-fi.” This actor’s experience was part of the scanning process that allows media production studios to take photographs of cast members in various positions and create movable, malleable digital avatars that can subsequently be animated to perform virtually any action or motion in a realistic video sequence.

Advances in artificial intelligence are now making it steadily easier to produce digital doubles like this—even without an intense session in the orb. Some actors fear a possible future in which studios will pressure them to sign away their likeness and their digital double will take work away from them. This is one of the factors motivating members of the union SAG-AFTRA (the Screen Actors Guild–American Federation of Television and Radio Artists) to go on strike. “Performers need the protection of our images and performances to prevent replacement of human performances by artificial intelligence technology,” the union said in a statement released a few days after the strike was announced in mid-July.

Although AI replacement is an unsettling possibility, the digital doubles seen in today’s media productions still rely on human performers and special effects artists. Here’s how the technology works—and how AI is shaking up the established process.

How Digital Double Tech Works

Over the past 25 years or so, it has become increasingly common for big-budget media productions to create digital doubles of at least some performers’ faces and bodies. This technology almost certainly plays a role in any movie, TV show or video game that involves extensive digital effects, elaborate action scenes or an actor’s portrayal of a character at multiple ages. “It’s become kind of industry standard,” says Chris MacLean, visual effects supervisor for the Apple TV show Foundation.*

The photogrammetry booth is an area surrounded by hundreds of cameras, sometimes arranged in an orb shape and sometimes around a square room. The cameras capture thousands of intentionally overlapping two-dimensional images of a person’s face at a high resolution. If an actor’s role involves speaking or showing emotion, pictures of many different facial movements are needed. For that reason, starring performers require more extensive scans than secondary or background cast members. Similarly, larger setups are used to scan bodies.

With those data, visual effects (VFX) artists take the model from two-dimensional to three-dimensional. The overlap of the photographs is key. Based on camera coordinates—and those redundant overlapping sections—the images are mapped and folded in relation to one another in a process akin to digital origami. Artists can then rig the resulting 3-D digital double to a virtual “skeleton” and animate it—either by directly following an actor’s real-world, motion-captured performance or by combining that performance with a computer-generated series of movements. The animated figure can then be placed in a digital landscape and given dialogue—technically, it’s possible to use a person’s scans to create photorealistic video footage of them doing and saying things that actor never did or said.
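The geometric core of that “digital origami” step can be illustrated with a toy example. Given two cameras whose positions are known, the overlapping views of the same facial feature pin down its location in 3-D space. The sketch below (a minimal, hypothetical setup, not any studio’s actual pipeline) uses the classic direct linear transform to triangulate one point from two simulated camera views:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Recover a 3-D point from its 2-D projections in two cameras.

    P1, P2: 3x4 camera projection matrices.
    x1, x2: (u, v) image coordinates of the same feature in each view.
    Builds the linear system implied by x ~ P X and solves it with SVD
    (the standard direct linear transform for triangulation).
    """
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # convert from homogeneous coordinates

# Two toy cameras: one at the origin, one shifted sideways -- the
# intentional overlap the photogrammetry booth is built to produce.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

# Project a known 3-D point into each camera, then recover it.
X_true = np.array([0.0, 0.0, 5.0])
x1 = P1 @ np.append(X_true, 1.0)
x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0)
x2 = x2[:2] / x2[2]

print(triangulate(P1, P2, x1, x2))  # ≈ [0. 0. 5.]
```

Real scans repeat this for millions of matched features across hundreds of cameras, then fit a surface mesh to the resulting point cloud.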

Special effects artists can also apply an actor’s digital performance to a virtual avatar that looks completely different from the human person. For instance, the aforementioned video game actor says he made faces in the orb and recorded his lines in a recording booth. He also physically acted out many scenes in a separate studio with his fellow performers for motion capture, a process similar to photogrammetry but designed to record the body’s movements. When players engage with the final product, however, they won’t see this actor on-screen. Instead his digital double was modified to look like a villain with a specific appearance. The final animated character thus manifested both the actor’s work and the video game character’s traits.
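One common way such performance transfer works is with blendshapes: the actor’s capture is distilled into expression weights (how much “smile,” how much “brow raise”), and those weights drive a different character’s face. The following is a deliberately tiny sketch with made-up meshes and shape names, just to show the arithmetic, assuming the actor and the villain share the same set of expression controls:

```python
import numpy as np

# Weights extracted from the actor's facial capture (hypothetical values).
actor_weights = {"smile": 0.8, "brow_raise": 0.3}

# The villain's rig: a "mesh" of 3 vertices, plus per-vertex offsets
# (deltas from the neutral face) for each named expression.
villain_neutral = np.array([[0.0, 0.0, 0.0],
                            [1.0, 0.0, 0.0],
                            [0.5, 1.0, 0.0]])
villain_shapes = {
    "smile":      np.array([[0.1, 0.0, 0.0],
                            [-0.1, 0.0, 0.0],
                            [0.0, 0.0, 0.0]]),
    "brow_raise": np.array([[0.0, 0.0, 0.0],
                            [0.0, 0.0, 0.0],
                            [0.0, 0.2, 0.0]]),
}

def retarget(neutral, shapes, weights):
    """Drive one character's face with another performer's weights:
    final mesh = neutral + sum(weight_i * delta_i)."""
    mesh = neutral.copy()
    for name, w in weights.items():
        mesh += w * shapes[name]
    return mesh

print(retarget(villain_neutral, villain_shapes, actor_weights))
```

The actor’s smile moves the villain’s mouth vertices by the villain’s own smile deltas, which is why the final character reads as the villain while still carrying the actor’s timing and intensity.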

Film and television productions have used this process for decades, although it has historically been both labor intensive and expensive. Despite the difficulty, digital doubles are common. Production teams frequently use them to make small adjustments that involve dialogue and action. The tech is also employed for larger edits, such as taking a group of 100 background actors and morphing and duplicating them into a digital crowd of thousands. But it’s easier to accomplish such feats in a convincing way if the original footage is close to the desired final output. For instance, a background actor scanned wearing a costume meant to replicate clothing worn in 19th-century Europe would be difficult to edit into a dystopian future in which their digital double wears a space suit, MacLean says. “I don’t think there’s any way that the studios would have that much patience,” he adds.

Yet generative artificial intelligence, the same sort of machine-learning technology behind ChatGPT, is starting to make aspects of the digital double process quicker and simpler.

AI Swoops In

Some VFX companies are already using generative AI to speed up the process of modifying a digital double’s appearance, MacLean notes. This makes it easier to “de-age” a famous actor in films such as Indiana Jones and the Dial of Destiny, which includes a flashback with a younger-looking version of now 81-year-old Harrison Ford. AI also comes in handy for face replacement, in which an actor’s likeness is superimposed over a stunt double (essentially a sanctioned deepfake), according to Vladimir Galat, chief technology officer of the Scan Truck, a mobile photogrammetry company.

Galat says advances in AI have made some photogrammetry scans unnecessary: A generative model can be trained on existing photographs and footage—even of someone no longer living. Digital Domain, a VFX production company that worked on Avengers: Endgame, says it’s also possible to create fake digital performances by historical figures. “This is a very new technology but a growing part of our business,” says Hanno Basse, Digital Domain’s chief tech officer.

So far living humans have still been involved in crafting performances “by” the deceased. A real-world actor performs a scene, and then effects artists replace their face with that of the historical person. “We feel the nuances of an actor’s performance, in combination with our AI and machine learning tool sets, is critical to achieving photorealistic results that can captivate an audience and cross the uncanny valley,” Basse says, referring to the eerie sensation sometimes caused by something that looks almost—but not quite—human.

Fears of Robot Replacement

There’s a difference between adjusting a digital double and replacing a person’s performance entirely with AI, says computer engineer Jeong Joon “JJ” Park, who currently researches computer vision and graphics at Stanford University and will be starting a position at the University of Michigan this fall. The uncanny valley is wide, and there’s not yet a generative AI model that can produce a complete, photorealistic, moving scene from scratch—that technology is not even close, Park notes. To get there, “there needs to be a major leap in the intelligence that we’re developing,” he says. (AI-generated images may be hard to tell from the real thing, but crafting realistic still images is much easier than creating video meant to represent 3-D space.)

Still, the threat of abuse of actors’ likenesses looms. If one person’s face can be easily swapped over another’s, then what’s to stop filmmakers from putting Tom Cruise in every shot of every action movie? What will prevent studios from replacing 100 background actors with just one and using AI to create the illusion of many? A patchwork of state laws means that, in most places, people have legal ownership over their own likeness, says Eleanor Lackman, a copyright and trademark attorney. But she notes that there are broad exceptions for artistic and expressive use, and filmmaking could easily fall under that designation. And regardless of the law, a person could legally sign a contract giving their own likeness rights over to a production company, explains Jonathan Blavin, a lawyer specializing in media and tech. When it comes to protecting one’s digital likeness, it all comes down to the specifics of the contract—a situation SAG-AFTRA is well aware of.

The actor who played the video game villain felt comfortable being scanned for his role last year. “The company I worked with was pretty aboveboard,” he says. But in the future, he may not be so quick to enter agreements. “The capabilities of what AI can do with face capture, and what we saw from the [prestrike negotiations], is scary,” he says. The actor loves video games; he was excited to act in one and he hopes to do so again. But first, he says, “I would double-check the paperwork, check in with my agency—and possibly a lawyer.”

*Editor’s Note (7/25/23): This sentence was edited after posting to clarify Chris MacLean’s position at the Apple TV show Foundation.
