DeepFake head-swapping gets an upgrade: Zhejiang University's new model GANs out a full head of hair

Reporting by XinZhiyuan

Editors: Yuan Xie, La Yan

Although DeepFake can swap a face convincingly, it still cannot do the same for hair. Now, researchers at Zhejiang University and in Sweden have broadened the approach, using GANs and CNNs to generate far more realistic virtual hair.

DeepFake technology arrived in the late 2010s, coinciding with the Trump era.

Countless eager pranksters who intended to use DeepFake to have some fun at the president's expense ran into a small obstacle in practice:

Various DeepFake tools could swap in that familiar orange face, but the unruly golden mane was something the AI simply could not reproduce believably.

Look: it is the hair that gives DeepFake output away.

DeepFake can change the face, but not the hair

In fact, this is an old problem meeting a new challenge. Reproducing lifelike human hair has been a hard problem ever since the master sculptors of Greece and Rome.

The average human head carries roughly 100,000 strands of hair, each varying in color and refractive index. Beyond a certain length, even in the computer age, their movement can only be simulated with complex physics models and then re-rendered into images.

At present, only traditional CGI techniques, developed since the end of the 20th century, can do this well.
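To give a sense of what "complex physics models" means in practice, here is a deliberately tiny sketch of the classic approach: treat one strand as a chain of particles joined by springs and integrate it over time. This is an illustrative toy (all names and constants are assumptions), not any production hair system, which would add bending stiffness, collisions, and tens of thousands of interacting strands.

```python
# Toy physics-based hair strand: a chain of particles connected by springs,
# integrated with semi-implicit Euler. Purely illustrative.
import numpy as np

N = 20                       # particles in one strand
rest_len = 0.01              # rest length between neighbors (meters)
k = 500.0                    # spring stiffness
damping = 0.98               # per-step velocity damping
gravity = np.array([0.0, -9.81, 0.0])
dt = 1.0 / 240.0             # simulation time step

pos = np.zeros((N, 3))
pos[:, 1] = -np.arange(N) * rest_len    # strand initially hangs straight down
vel = np.zeros((N, 3))

def step(pos, vel):
    force = np.tile(gravity, (N, 1))
    # spring force between each pair of consecutive particles
    d = pos[1:] - pos[:-1]
    length = np.linalg.norm(d, axis=1, keepdims=True)
    direction = d / np.maximum(length, 1e-9)
    f = k * (length - rest_len) * direction
    force[:-1] += f            # pulls particle i toward particle i+1
    force[1:] -= f             # and vice versa
    vel = (vel + force * dt) * damping
    vel[0] = 0.0               # root particle is pinned to the scalp
    return pos + vel * dt, vel

for _ in range(1000):
    pos, vel = step(pos, vel)
print(pos[-1])                 # tip position after ~4 simulated seconds
```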

CGI hair modeling results from 2017

Current DeepFake technology is still not good at solving this problem. In its several years of existence, DeepFaceLab, an industry-leading package, has only released a "whole head" model that can capture short hairstyles, and even that hair looks stiff.

Recently, FaceSwap, a package of similar lineage to DFL, adopted the BiSeNet semantic segmentation model, which lets users include ears and hair in deepfake output images.

Both packages descend from the controversial 2017 deepfakes source code.

Even when the hair to be rendered by a DeepFake model is very short, the output quality is often poor, with the head looking pasted on rather than integrated into the image.

Using GANs to generate hair

At present, one of the two approaches most used in the industry to simulate portraits is Neural Radiance Fields (NeRF), which captures a scene from multiple viewpoints and encapsulates it in an explorable neural network.

The other is the generative adversarial network (GAN), which is further along than NeRF in human image synthesis, NeRF having only appeared in 2020.

NeRF's inferred understanding of 3D geometry lets it reproduce a scene with high fidelity and consistency. But it currently leaves little room for applying physics models, or indeed for any change unrelated to the camera viewpoint: whatever was in the captured data is reproduced as-is.
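For reference, the heart of NeRF can be written as a single volume-rendering integral: the color of each camera ray is the transmittance-weighted accumulation of the colors and densities the network predicts along that ray, which is why the result is so tightly bound to camera viewpoint (standard formulation from the 2020 NeRF paper):

```latex
% Color of camera ray r(t) = o + t d, for a network predicting color c and density sigma:
C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t),\mathbf{d})\,dt,
\qquad
T(t) = \exp\!\Big(-\int_{t_n}^{t}\sigma(\mathbf{r}(s))\,ds\Big)
```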

For now, though, NeRF's ability to simulate human hair movements is not outstanding.

GANs, by contrast, carry an almost fatal disadvantage here: the latent space of a GAN does not naturally contain any understanding of 3D information.

Face synthesis with 3D-aware GANs has therefore become a hot topic in image generation research in recent years, and InterFaceGAN, from 2019, is one of the most important breakthroughs.

However, even the carefully selected results in InterFaceGAN demonstrations show that temporal consistency in neural-network-generated hair is still a daunting challenge, far too unreliable for use in VFX workflows.

After a face edit with InterFaceGAN, the hair appears to shimmer and "boil"

It is becoming increasingly clear that generating coherent views purely by manipulating the latent space of a neural network may be something of an alchemical art.

More and more papers are therefore taking a different approach: incorporating CGI-style 3D information into the GAN workflow as a stabilizing, normalizing constraint.

The CGI element can be an intermediate 3D representation such as the Skinned Multi-Person Linear model (SMPL).

Alternatively, 3D inference techniques similar to NeRF's can be used to estimate geometry from the source images and video.
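For a sense of what the first route looks like in code, here is a minimal sketch of loading the SMPL body model through the open-source smplx Python package and pulling out mesh vertices that could serve as the geometric constraint. This is an assumption for illustration, not the pipeline of any specific paper, and it requires the SMPL model files to be downloaded separately from the official site.

```python
# Minimal sketch: obtain SMPL mesh vertices to use as a CGI-style geometric prior.
# Assumes the SMPL model files have been downloaded into ./models (not included here).
import torch
import smplx   # pip install smplx

model = smplx.create(model_path="models", model_type="smpl", gender="neutral")

betas = torch.zeros(1, 10)          # body shape coefficients
body_pose = torch.zeros(1, 69)      # 23 joints x 3 axis-angle parameters
global_orient = torch.zeros(1, 3)   # root orientation

output = model(betas=betas, body_pose=body_pose, global_orient=global_orient)
vertices = output.vertices          # (1, 6890, 3) mesh vertices in a canonical pose
faces = model.faces                 # triangle indices of the template mesh

# These vertices/faces can be rasterized or projected to act as a stable
# 3D constraint inside a GAN training loop.
print(vertices.shape, faces.shape)
```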

Just this week, researchers from the ReLER Lab and the AAII institute at the University of Technology Sydney, Alibaba's DAMO Academy, and Zhejiang University jointly published a paper describing a "multi-view consistent generative adversarial network" (MVCGAN) for 3D-aware image synthesis.

MVCGAN-generated avatars

MVCGAN incorporates a generative radiance field network (GRAF) that imposes geometric constraints inside the GAN. In principle, this combination produces arguably the most realistic virtual hair of any GAN-based approach.

Hair on heads generated by MVCGAN, compared with output from other models

As the image above shows, under extreme parameters the results of every model except MVCGAN show implausible distortions.
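The core idea behind that multi-view consistency can be illustrated with a toy reprojection loss: render two views of the same latent code from different camera poses, warp one into the other using the depth implied by the radiance field, and penalize the photometric difference. The sketch below is an illustrative stand-in (all tensor names, shapes, and the loss form are assumptions), not the MVCGAN implementation.

```python
# Toy multi-view reprojection consistency loss in PyTorch.
import torch
import torch.nn.functional as F

def reprojection_loss(image_a, image_b, depth_a, K, R_ab, t_ab):
    """L1 photometric error after warping view B into view A's frame.

    image_a, image_b: (1, 3, H, W) two views rendered from the same latent code
    depth_a:          (1, 1, H, W) depth of view A (e.g. from the radiance field)
    K:                (3, 3) camera intrinsics
    R_ab, t_ab:       rotation (3, 3) and translation (3,) from camera A to camera B
    """
    _, _, H, W = image_a.shape
    device = image_a.device

    # Pixel grid of view A in homogeneous coordinates
    ys, xs = torch.meshgrid(torch.arange(H, device=device),
                            torch.arange(W, device=device), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).float().reshape(3, -1)

    # Back-project to 3D with depth, move into camera B, project again
    cam_a = torch.linalg.inv(K) @ pix * depth_a.reshape(1, -1)
    cam_b = R_ab @ cam_a + t_ab.reshape(3, 1)
    proj = K @ cam_b
    uv = proj[:2] / proj[2:].clamp(min=1e-6)

    # Normalize to [-1, 1] and sample view B at the reprojected locations
    u = 2.0 * uv[0] / (W - 1) - 1.0
    v = 2.0 * uv[1] / (H - 1) - 1.0
    grid = torch.stack([u, v], dim=-1).reshape(1, H, W, 2)
    b_warped = F.grid_sample(image_b, grid, align_corners=True)

    return F.l1_loss(b_warped, image_a)
```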

However, temporally consistent reconstruction of virtual hair remains a challenge even in CGI workflows.

There is therefore no reason for the industry to believe that traditional geometry-based methods will bring temporally consistent hair synthesis into the latent space of these models in the foreseeable future.

Generating stable virtual hair with CNNs

Meanwhile, a forthcoming paper from three researchers at Chalmers University of Technology in Sweden may offer fresh progress on generating images of human hair with neural networks.

The paper, titled "Real-Time Hair Filtering with Convolutional Neural Networks," is due to be presented in May 2022 at the Symposium on Interactive 3D Graphics and Games (i3D), a major academic conference.

The system is based on an autoencoder network that evaluates in real time the rendered resolution of the virtual hair, including the self-shadowing and apparent thickness that the hair produces in the virtual scene. The autoencoder's input is a limited number of stochastic samples rendered from OpenGL geometry.

In this way, only a small number of samples with stochastic transparency need to be rendered, and a U-Net is then trained to reconstruct the original image.
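A minimal sketch of that reconstruction setup might look like the following; the architecture, channel counts, and data shapes are illustrative assumptions rather than the authors' network.

```python
# Train a tiny U-Net to reconstruct a clean hair image from a render made
# with only a few stochastic-transparency samples. Illustrative sketch only.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self, in_ch=3, out_ch=3):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU())
        self.out = nn.Conv2d(64, out_ch, 3, padding=1)   # 64 = 32 decoded + 32 skip

    def forward(self, x):
        e1 = self.enc1(x)                                # full-resolution features
        e2 = self.enc2(e1)                               # downsampled features
        d1 = self.dec1(e2)                               # back to full resolution
        return self.out(torch.cat([d1, e1], dim=1))      # skip connection

device = "cuda" if torch.cuda.is_available() else "cpu"
net = TinyUNet().to(device)
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

# Dummy pairs standing in for (few-sample noisy render, supersampled reference)
loader = [(torch.rand(2, 3, 256, 256), torch.rand(2, 3, 256, 256)) for _ in range(10)]

for noisy_render, reference in loader:
    noisy_render, reference = noisy_render.to(device), reference.to(device)
    loss = nn.functional.l1_loss(net(noisy_render), reference)
    opt.zero_grad()
    loss.backward()
    opt.step()
```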

The network is built and trained in PyTorch and converges in 6 to 12 hours, depending on network size and the number of input features. The trained parameters (weights) are then used by the real-time rendering system.

The training dataset is generated by rendering hundreds of images of straight and wavy hairstyles at random distances, poses, and lighting conditions.

The translucency value of each sample is averaged from images rendered with stochastic transparency at a supersampled resolution.

The raw high-resolution data is downsampled to fit network and hardware limits, then upsampled again for output clarity, in a typical autoencoder workflow.
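A rough sketch of how such a translucency target could be produced: average many stochastic-transparency renders at a supersampled resolution, then downsample to the training resolution. Here render_once is a hypothetical stand-in for the OpenGL hair renderer, and the numbers are assumptions.

```python
# Average many stochastic-transparency renders at supersampled resolution,
# then box-filter down to the training resolution. Illustrative sketch only.
import torch
import torch.nn.functional as F

def translucency_target(render_once, n_samples=64, supersample=2, out_size=512):
    hi_res = out_size * supersample
    acc = torch.zeros(1, 4, hi_res, hi_res)            # RGBA accumulator
    for _ in range(n_samples):
        # each call renders hair with per-fragment alpha decided by a random test
        acc += render_once(hi_res)
    mean = acc / n_samples                              # averaged color and coverage
    return F.avg_pool2d(mean, kernel_size=supersample)  # downsample to training size

# Hypothetical renderer stub so the sketch runs end to end
def fake_render(hi_res):
    rgb = torch.rand(1, 3, hi_res, hi_res)
    alpha = (torch.rand(1, 1, hi_res, hi_res) < 0.3).float()   # stochastic coverage
    return torch.cat([rgb * alpha, alpha], dim=1)

target = translucency_target(fake_render)
print(target.shape)   # torch.Size([1, 4, 512, 512])
```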

Using "real-time" software that uses algorithms derived from the training model, as a real-time inference application for this AI model, a mixture of NVIDIA CUDA, cuDNN, and OpenGL is used.

The initial input eigenvalues are dumped into OpenGL's multisampling color buffer, and the results are shunted to cuDNN tensors before continuing processing in the CNN, which are then copied back into the "real-time" OpenGL texture for application to the final image.

The real-time running hardware of this AI is an NVIDIA RTX 2080 graphics card that produces an image resolution of 1024x1024 pixels.
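For intuition about that inference half, the sketch below runs a small stand-in network on a 1024x1024 frame on the GPU and measures per-frame latency. The real system shares buffers between OpenGL and cuDNN rather than going through copies, and its network is the trained hair filter, so this is only an approximation of the data flow.

```python
# Measure per-frame latency of a stand-in CNN on a 1024x1024 frame.
import time
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
net = torch.nn.Sequential(                     # stand-in for the trained hair filter
    torch.nn.Conv2d(3, 32, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv2d(32, 3, 3, padding=1),
).to(device).eval()

frame = torch.rand(1, 3, 1024, 1024, device=device)   # stands in for the OpenGL color buffer

with torch.no_grad():
    for _ in range(10):                        # warm-up iterations
        net(frame)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    out = net(frame)
    if device == "cuda":
        torch.cuda.synchronize()

print(f"frame time: {(time.perf_counter() - start) * 1000:.1f} ms, output {tuple(out.shape)}")
```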

Since the hair color values are kept entirely separate from the final values processed by the network, changing the hair color is easy, although effects such as gradients and streaks in the virtual strands will still pose challenges for future work.

Conclusion

Navigating the latent space of an autoencoder or GAN is still more like sailing by intuition than driving with precision. Only recently has the industry begun to see reliable results in generating "simpler" geometry such as faces, using methods such as NeRF, GANs, and non-deepfake (2017-style) autoencoder frameworks.

The sheer structural complexity of human hair, combined with the need for properties that current physics models and image synthesis methods cannot yet provide, suggests that hair synthesis is unlikely to remain just a built-in component of general face synthesis models. The task will demand complex, specialized, dedicated networks, even if those networks are eventually folded into broader and more sophisticated face synthesis frameworks.

Resources:

https://www.unite.ai/tackling-bad-hair-days-in-human-image-synthesis/

https://arxiv.org/pdf/2204.06307.pdf
