Give it a single word and it can mimic your handwriting: a Facebook AI so powerful its code is not being open-sourced

Author: Heart of the Machine Pro

Report from Heart of the Machine

Editor: Chen Ping

Write a single word on paper, and the AI can imitate your handwriting after just one glance, convincingly enough to look flawless.

Facebook recently announced a new image AI, TextStyleBrush, which can copy and reproduce the text style in images.

With this technology, you only need to supply a single word as a reference sample, and the AI can imitate your writing style in one click, with impressive results.

In addition, you can use it to replace text in different scenes (such as posters, trash cans, and road signs). In the following image, the original scene is on the left, with the words outlined by blue rectangles; on the right is the image after the text has been replaced.

As the figure shows, the AI can handle almost any font style. In each image pair below, the input source style is on the left and the new content (string) rendered in that style is on the right; the lettering on the two sides looks nearly identical in style. The output appears slightly blurrier than the source image, but in most cases the technique seems to work very well.

Compared with other handwriting-imitation AI, TextStyleBrush is more powerful: it analyzes text style at a finer level of detail, so it can imitate handwriting across a variety of viewing angles and backgrounds.

The following figure shows the process of replacing the word "Soya" on a soy sauce bottle with "Tea":

This powerful imitation tool is TextStyleBrush, released by Facebook AI, which can faithfully reproduce handwriting from just one word. It works much like the style brush tool in word processing apps, which separates the content of a text from its style.

Paper link: https://scontent-sjc3-1.xx.fbcdn.net/v/t39.8562-6/10000000_944085403038430_3779849959048683283_n.pdf?_nc_cat=108&ccb=1-3&_nc_sid=ae5e01&_nc_ohc=Jcq0m5jBvK8AX--fG2A&_nc_ht=scontent-sjc3-1.xx&oh=8b7e8221bba5aba6b6331c643764dec5&oe=60EF2B81

Dataset link: https://github.com/facebookresearch/IMGUR5K-Handwriting-Dataset

TextStyleBrush has the following features:

It only takes one word to replicate the style of text in a photo. Using this AI model, you can edit and replace text in an image.

Unlike most AI systems, which handle only well-defined, specialized tasks, TextStyleBrush is the first self-supervised AI model that replaces text in images of both handwriting and scenes in one shot, using a single example word.

In the future, it could unlock new potential in areas such as personalized messaging and captioning, as well as realistic language translation in augmented reality (AR).

By publishing the capabilities, methods, and results of this study, the researchers hope to spur discussion and research into detecting potential misuse of this kind of technology, such as deepfake text attacks, a major challenge in the field of artificial intelligence.

Because TextStyleBrush could also be used to create misleading images, Facebook's CTO said on his personal social media account that the company is publishing only the paper and the dataset, not the code. He added that, as with Facebook's approach to deepfakes, the company believes that sharing the research and the dataset will help build detection systems and get ahead of attacks.

TextStyleBrush: learning text style representations

AI image generation has been advancing at an astonishing pace; it can already recreate historical scenes or render photographs in the style of Van Gogh. Now, Facebook AI has built an AI that can replace text in both scene images and handwriting, requiring only one word as input.

While most AI systems can accomplish well-defined, specialized tasks, building an AI system that is flexible enough to understand the nuances of text in real-world scenes and of handwriting is challenging. It means understanding a multitude of text styles: not only different fonts and writing styles, but also different transformations, such as rotation and curved text, as well as image noise.

Facebook AI proposed the TSB (TextStyleBrush) architecture. It is trained in a self-supervised manner, with no target style supervision, using only the original style image. The framework automatically works out the actual style of the image. At training time, it assumes that each word box comes with a ground-truth label (the text that appears in the box); at inference time, it takes a single source style image and new content (a string) and generates a new image of the target content in the source style.

The generator architecture is based on the StyleGAN2 model. For this task, however, StyleGAN2 has two important limitations:

First, StyleGAN2 is an unconditional model, which means that it generates images by sampling a random latent vector. TextStyleBrush, however, must generate an image of the specified text.

Second, StyleGAN2 on its own offers no control over the style of the generated text image. Text style combines global information, such as color palette and spatial transformation, with fine-scale information, such as the subtle variations in an individual's handwriting.

The researchers address these limitations by conditioning the generator on content and style representations. They handle the multi-scale nature of text style by extracting layer-specific style information and injecting it into each layer of the generator. In addition to generating the target image in the desired style, the generator also produces a soft mask image that marks the foreground pixels (the text area). In this way, the generator can control both the low-resolution and the high-resolution details of the text to match the desired input style.
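To make the idea of injecting style into each generator layer more concrete, here is a minimal sketch of the weight modulation and demodulation step that StyleGAN2 uses to apply a style vector to a layer. The article does not say exactly how TextStyleBrush injects its layer-specific styles, so treat this as an illustration of the underlying StyleGAN2 mechanism only; the function name, the tiny 2x2 weights, and the per-layer style values are all made up for the example.

```python
import math


def modulate_demodulate(weights, style, eps=1e-8):
    """StyleGAN2-style modulation for a 1x1 convolution (a plain linear map).

    weights: list of rows, weights[o][i] for output channel o, input channel i.
    style:   one scale per input channel, derived from a style representation.
    Each input channel is scaled by its style value (modulation), then every
    output row is rescaled to unit L2 norm (demodulation).
    """
    result = []
    for row in weights:
        if len(row) != len(style):
            raise ValueError("style needs one scale per input channel")
        scaled = [w * s for w, s in zip(row, style)]
        norm = math.sqrt(sum(v * v for v in scaled) + eps)
        result.append([v / norm for v in scaled])
    return result


# Hypothetical values: every generator layer receives its own style vector,
# which is what "layer-specific style information" amounts to in this sketch.
layer_weights = [
    [[1.0, 2.0], [3.0, 4.0]],   # an early (coarse) layer
    [[0.5, -1.0], [2.0, 0.0]],  # a later (fine) layer
]
layer_styles = [
    [2.0, 0.5],  # assumed to carry global cues
    [1.0, 3.0],  # assumed to carry fine-scale cues
]
styled_layers = [
    modulate_demodulate(w, s) for w, s in zip(layer_weights, layer_styles)
]
```

Because each layer gets a different style vector, coarse layers and fine layers can be steered independently, which is one plausible way to cover both global and fine-scale aspects of a text style.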

The study also introduces a new self-supervised training criterion that uses a typeface classifier, a text recognizer, and an adversarial discriminator to preserve the source style and the target content. First, a pre-trained typeface classification network measures how well the generator captures the style of the input text. In addition, a pre-trained text recognition network evaluates the content of the generated image, measuring how well the generator captures the target content. Taken together, these signals enable effective self-supervised training.
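As a rough illustration of how three such signals could be combined into one training objective, the sketch below adds a style term, a content term, and an adversarial term with hypothetical weights. The article does not give the actual loss definitions or weights, so the function names, the character-level negative log-likelihood used for the content term, and the default weights are all assumptions.

```python
import math


def content_loss(char_probs, target):
    """Mean negative log-likelihood of the target characters.

    char_probs: one dict per character position mapping character -> probability,
                standing in for the output of a text recognizer (hypothetical).
    target:     the string the generated image is supposed to show.
    """
    if len(char_probs) != len(target) or not target:
        raise ValueError("need one probability table per target character")
    total = 0.0
    for probs, char in zip(char_probs, target):
        total -= math.log(max(probs.get(char, 0.0), 1e-12))
    return total / len(target)


def generator_loss(style_term, content_term, adversarial_term,
                   w_style=1.0, w_content=1.0, w_adv=1.0):
    """Weighted sum of the three terms; the weights are placeholders."""
    return w_style * style_term + w_content * content_term + w_adv * adversarial_term


# Toy recognizer output for a generated image that should read "Tea".
recognizer_output = [{"T": 0.5, "I": 0.5}, {"e": 0.5, "c": 0.5}, {"a": 0.5, "o": 0.5}]
c_term = content_loss(recognizer_output, "Tea")
loss = generator_loss(style_term=0.3, content_term=c_term, adversarial_term=0.1)
```

No target-style image appears anywhere in this objective: the style term compares the output against the source style, and the content term needs only the target string, which is what makes the training self-supervised.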

Experiments

Table 2 reports an ablation study that evaluates different loss functions, style feature extensions, and the mask when training TSB. The results show that images generated by TextStyleBrush have a much lower MSE (mean squared error), while PSNR (peak signal-to-noise ratio) and SSIM (structural similarity) both improve.
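For readers unfamiliar with these metrics, the following sketch computes MSE and PSNR for two small grayscale images using their standard definitions (SSIM is omitted because it is considerably longer). The pixel values are made up, and the 8-bit range (maximum value 255) is an assumption; the article does not say how the images were scaled in the paper's evaluation.

```python
import math


def mse(image_a, image_b):
    """Mean squared error between two images given as lists of pixel rows."""
    flat_a = [v for row in image_a for v in row]
    flat_b = [v for row in image_b for v in row]
    if len(flat_a) != len(flat_b) or not flat_a:
        raise ValueError("images must be non-empty and the same size")
    return sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)


def psnr(image_a, image_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    error = mse(image_a, image_b)
    if error == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / error)


reference = [[0, 0], [0, 0]]
generated = [[0, 0], [0, 10]]
error = mse(reference, generated)    # one pixel off by 10 -> 100 / 4 = 25.0
quality = psnr(reference, generated)  # 10 * log10(255^2 / 25)
```

Lower MSE means the generated image is closer to the reference pixel by pixel, and because PSNR is a logarithmic function of the inverse of MSE, a lower MSE always yields a higher PSNR.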

Table 3 reports text recognition accuracy measured on images from three datasets. TSB achieves the best recognition accuracy among the compared methods: 97.2% on IC13, 97.6% on IC15, and 95.0% on TextVQA.
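A recognition accuracy of this kind is typically the fraction of images for which a text recognizer reads back exactly the intended word. The sketch below shows that calculation on made-up recognizer outputs; the function name and the choice to ignore letter case by default are assumptions, since the article does not describe the exact matching rule used in the paper.

```python
def word_accuracy(predictions, targets, case_sensitive=False):
    """Fraction of predicted words that exactly match their target word."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("need equally sized, non-empty prediction and target lists")
    matches = 0
    for predicted, target in zip(predictions, targets):
        if not case_sensitive:
            predicted, target = predicted.lower(), target.lower()
        if predicted == target:
            matches += 1
    return matches / len(targets)


# Hypothetical recognizer readings of three generated images.
recognized = ["Tea", "soya", "STOP"]
intended = ["Tea", "Soya", "SHOP"]
accuracy = word_accuracy(recognized, intended)
strict_accuracy = word_accuracy(recognized, intended, case_sensitive=True)
```

Here two of the three words match when case is ignored and only one matches when it is not, which shows how much the matching rule can move the reported number.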

Table 4 gives a quantitative comparison on handwritten text generation, comparing TSB with the state-of-the-art method of Davis et al. [14], which was designed specifically for generating handwritten text. The lower the FID score, the better the generation quality. By this measure, TSB outperforms the previous work.

TextStyleBrush shows that AI can handle text more flexibly and accurately than before, but the technology still has limitations, such as difficulty imitating text on metallic surfaces or text with multicolored characters. Facebook hopes this research will continue to be extended, lowering barriers in translation, self-expression, and deepfake research.

Figure: failure cases.

Reference Links:

https://ai.facebook.com/blog/ai-can-now-emulate-text-style-in-images-in-one-shot-using-just-a-single-word
