The Ghosts of Tencent and ByteDance Hover over China's AI

Author: Silicon Star Man

In April 2023, two days after leaving Tencent, product manager Songgoose (a pseudonym) moved from Shenzhen to Beijing to join a foundation-model startup. At the time the company had little capital and little mindshare among users, only a Chinese name that sounded rather awkward: "Dark Side of the Moon" (Moonshot AI).

At the time, Songgoose was the company's entire product team.

In the same month, Wang Changhu, who had helped build products such as Douyin and TikTok at ByteDance, founded a text-to-video company, Aishi Technology, at a time when China and the United States were both locked in fierce competition over text-to-image generation.

At almost the same time, Wan Lei, then at Tencent, had tried a string of AI projects, including AI speaking software, AI face swapping, and AI psychological counseling. The surge in users and the enthusiasm of investors convinced him that the AI boom had truly arrived.

Ultimately, large models reach the market through products, and that was the opportunity these keen-nosed product managers saw. This is where the story begins.

Six months after Songgoose joined Moonshot AI, the team built Kimi; six months after that, Kimi went viral across the Chinese Internet. Before Sora became popular, Aishi's product team had already begun focusing on the "consistency" problem, iterating and optimizing repeatedly. Wan Lei, after investors repeatedly challenged him that "AIGC applications have no core technical barriers and are easy to imitate", met Jiang Yuchen, a recent ETH Zurich graduate with large-model expertise, at a Lanchi closed-door meeting. One understood the product, the other had the technology, and together they founded Waveform Intelligence.

If we look back at the history of the mobile Internet, the only way a technology reaches into ordinary people's lives is through an explosion of applications. Large models seem to be going through something similar today. Every day one or two new products are born, go viral, and are widely discussed, all aiming to become an "AI Native" super app.

Behind these seemingly new AI star products, an interesting phenomenon is becoming more and more obvious:

You can always find the shadow of the previous era in them, or more precisely, the shadow of Tencent and ByteDance. The two strongest companies of China's mobile Internet era hover like ghosts over China's large-model products.

Part 1: Tencent's "Disciples"

People who leave Tencent always remain very "Tencent". They build products, and they are also loyal disciples.

Songgoose's personal column is called "Goose Library", and Wan Lei keeps a whole collection of Zhang Xiaolong stickers.

The goose and Zhang Xiaolong are Tencent's "totems".

In terms of product style, Tencent's product managers are deeply influenced by Zhang Xiaolong, the "father of WeChat". A product is the "connector" that Ma Huateng has long emphasized, the link between technology and users, and in Tencent's product system a consumer-facing (2C) product is expected to deliver the "ultimate" user experience.

At the beginning of 2023, Songgoose was still on the Tencent Meeting team and, in his own words, was even "very happy" there.

Tencent Meeting is one of the most usable conferencing tools available today, and it was once even called Tencent's next star product after WeChat. While other vendors bundled IM, meetings, documents, and OA into a single piece of software, Tencent split them apart.

Minimalism, the idea that "less is more", was put forward by Zhang Xiaolong in that era, and Songgoose kept practicing it after leaving Tencent Meeting.

Songgoose once wrote on social media: "Building a product by adding a feature is easy. Improving the user experience without adding a feature is hard. Removing a feature is hardest of all. Most products become bloated by constantly adding features." It reads like a paraphrase of Zhang Xiaolong's product maxims.

And that imprint can indeed be found on Kimi.

In early versions of Kimi, there was almost nothing but a dialog box. Over successive updates it "grew" a few buttons.

These include "Home", "New Session", "Historical Session", and "Kimi+", as well as Kimi's distinctive capabilities, "Web Link" and "File Upload". Condensing core functions into buttons is another Tencent tradition.

To help users understand what each button does, Kimi also uses a large number of "bubble" tooltips that explain further. Beyond that, the copy beneath the logo has a literary flavor and varies from one visit to the next.

An interesting example that Songgoose shared publicly shows that this Tencent imprint is not a default principle for everyone who makes products, especially when your team also includes people from ByteDance. On April 18th, during a product discussion, Songgoose proposed that a line of copy was needed somewhere in the product. "A colleague from ByteDance found it hard to understand: when you make products at ByteDance there is never this kind of requirement, because it does not improve conversion," he shared.

"But there really has to be copy here, and it has to be fairly clever copy. Well, it's almost comical now." After a pause, Songgoose said: "That's how we make products at Tencent."

Breaking user needs down one by one, "however large or small", and putting people first is the soul of Tencent's products. Kimi is currently the only domestic large-model product offered as a web version, an app, and a mini program. An independent developer told us: "On the surface, Kimi's web, app, and mini program versions are not much different from other similar products, but when it is used as a plug-in inside a web page, Kimi's rendering is better."

On the right is the Kimi plug-in

The flexible product form makes Kimi more widely accessible. The Kimi plug-in can directly translate and organize important information, and users can also adjust it to their needs with personalized settings. According to public information, the web plug-in version was built by developers who called the API on their own initiative, not by the Moonshot AI team.

Frog Writing, also polished by a former Tencent product manager, has many stylistic similarities with Kimi in product form. In how it breaks down different scenarios, introduces functions, and guides users through tutorials, many details show a "stickler" and "dogged" attitude.

For example, for an AI product that generates text, Wan Lei's team found that current models often cannot produce fully usable content in one pass, and that the usable parts often need a second or even third round of adjustment. Their answer was very Tencent: they added a text-selection function to the model's output, so that users can easily refine a highlighted passage after the AI has produced it.

At the same time, Tencent's other soul, "social", quickly shows up in these products too.

After graduating from university, Wan Lei joined Tencent, where he was responsible for the social relationship chain in National Karaoke (Quanmin K Ge). He then transferred to the "innovation projects" department and kept chasing each new Internet trend: digital humans, metaverse games, apps for meeting strangers, and a number of small AI-related projects.

Before National Karaoke launched, the karaoke category already had a benchmark product, Changba ("Sing It"). By then Changba had attracted and cultivated a large number of KOLs, and had polished product details such as pitch tuning and MV making.

How do you play from behind like that? Following the earlier playbook of WeChat drawing on QQ and WeCom drawing on WeChat, National Karaoke put more emphasis on "who is listening" than on "who is singing". Once it was connected to WeChat, the social relationship chain carried over naturally to the karaoke platform, which also laid the groundwork for later growth.

When you open Frog Writing, you will find that, unlike other products, it emphasizes social sharing. "Invite members" and "Join the group for a gift" take you right back to 2018, when the whole Chinese Internet was using viral referral ("fission") campaigns to acquire new users.

We have talked to a number of Agent entrepreneurs about the current startup environment, and their feedback is often: "What we face today is not the problem of being unable to build it, but the problem of who we are building it for."

"Finding users" is the first core problem, whether in the mobile Internet era or the AI era. For a writing tool, producing the text is one thing; having someone to show it to is another. In the group, users can exchange experiences, share their results with each other, and even sign up for a novel-writing contest through Frog Writing.

Wan Lei shared: "We have interviewed hundreds of users by phone. Sometimes we invite key users to the office to use the product in front of us, and the product team makes adjustments as soon as problems are found."

The "path dependence" of these product styles becomes more prominent as the products grow, and these subjective, fragmentary, even somewhat obsessive practices are very Tencent.

Part 2: The "Inheritors" of ByteDance

ByteDance's style sits at almost the opposite end of the spectrum. In the new batch of AI products, what carries over is not a personal style handed down from person to person but a thoroughgoing rethinking of product logic.

When ByteDance people leave to start companies, they bring a method for making products, not a product style.

"In 2019, a young woman who had come from ByteDance joined our team, and she felt very different. We usually look at data too, but she was clearly more sensitive to data and A/B tests. Every requirement was derived from the data, and good or bad was judged by the data," Wan Lei recalled.

The ByteDance products that took off share a very similar, blunt, and simple model: a recommendation system on a middle platform plus data input at sufficient scale. From Neihan Duanzi to Toutiao to Douyin, and later Dongchedi and Fanqie Novel (Tomato Novel), it is the same formula. Tencent's successful products, by contrast, are almost all inseparable from the social relationship chain.

"Tencent is like a liberal arts student: the people who make its products are all scholars of human society. ByteDance is a science student: it pours in data, runs A/B tests, then reads the numbers and ships the result," is how Mu Zhi, head of product at Aishi Technology, summed it up.

The algorithm is the soul of every "ByteDance-style" product. Public information shows that Wang Changhu was responsible for building the visual algorithm platform and the business middle platform at ByteDance. The methodology of letting algorithms determine what the product looks like came with him to Aishi Technology.

"Building the recommendation algorithm platform is the hardest part. It has to leave enough room and freedom for the product to test many requirements in a short time, and it has to be open enough to withstand the load and adapt to the product's future growth," Mu Zhi said.

Take Douyin in the mobile Internet era: the recommendation framework built when video views were in the millions still applied when views reached the tens of billions, and could still efficiently analyze each user's preferences.

A simple example is a two-way tagging system: tag the users, tag the content, and match the two against each other. However large the content library becomes and however fast the user base grows, such a mechanism can keep delivering "a thousand faces for a thousand people", that is, a feed personalized for each user.
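The two-way tag matching described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not ByteDance's actual system: the function names, the sample catalog, and the dot-product scoring rule are all hypothetical.

```python
from collections import Counter


def tag_profile(tags):
    """Turn a list of tags into normalized weights (hypothetical helper)."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: count / total for tag, count in counts.items()}


def match_score(user_profile, content_profile):
    """Dot product over the tags both sides share; higher means a closer match."""
    shared = user_profile.keys() & content_profile.keys()
    return sum(user_profile[t] * content_profile[t] for t in shared)


def recommend(user_tags, catalog, k=2):
    """Return up to k content names whose tags overlap the user's tags."""
    user_profile = tag_profile(user_tags)
    scored = [
        (match_score(user_profile, tag_profile(tags)), name)
        for name, tags in catalog.items()
    ]
    # Best score first; ties are broken alphabetically so results are stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:k] if score > 0]


# Made-up sample data: content tags on one side, user interest tags on the other.
catalog = {
    "street_food_vlog": ["food", "travel"],
    "ev_road_test": ["cars", "tech"],
    "ramen_recipe": ["food", "cooking"],
}
user_tags = ["food", "food", "cooking"]

print(recommend(user_tags, catalog))
```

Because each user is scored only against tags, the same function serves any number of users or items; a production system would replace the linear scan with an index, which is outside the scope of this sketch.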

ByteDance believes in this methodology, which produces different results and runs into different problems in different scenarios and industries. For Aishi, for example, if the product manager wants to design a mechanism and get a model running, the question becomes: how do you get your Transformer or Diffusion model enough upfront data through the product?

"The first important thing is still to know what the user wants, what they want this video to do, and what kind of data I need. The second is to design a set of mechanisms, whether that means producing data yourself, buying or crawling data, or collecting data back through reinforcement learning, and feeding it into your model," Mu Zhi said.

So before launching PixVerse, the Aishi product team spent a long time on user research in order to complete the first step.

Through a preliminary survey of professional video producers at home and abroad, Aishi found that clarity is a core user requirement, and that consistency is the higher bar that determines whether a video generation model can become a real productivity tool.

In the early stage of the technology, text-to-video can produce only 3-4 seconds of high-quality output, which meets only limited needs, so the product has to find a suitable entry point.

With the entry point in place, the next step is to build a healthy algorithm model.

Mu Zhi gave an example. In training on camera shots, if the product side can establish that what current users need most is certain professional shots, then data collection, labeling, and cleaning will all lean strongly in that direction: data for those professional shots is what is needed. With that judgment, you can often get a better result with less data and lower training costs.

As the number of users grows, more and more metadata is annotated, and the algorithm becomes more and more flexible.

PixVerse has consistently ranked near the top of download charts among overseas text-to-video products. Through continuous algorithm optimization and data accumulation it has achieved a product breakthrough, and the snowball has started to roll.

Part 3: New Bonds between Technology and Products

The difference between the product styles of Tencent and ByteDance points to a fundamental question: the balance between technology and product.

Tencent's products were born and matured in the Internet era and the early days of the mobile Internet. The technology was ready-made, and a product proved its value through an accurate understanding of user needs. ByteDance's products grew rapidly in a later stage of the mobile Internet, when algorithm-centered technology was itself advancing quickly and unstably. The powerful capabilities it brought were the foundation on which products like Toutiao and Douyin appeared, and the product manager's "God" role gave way to algorithm technology.

In today's era of large AI models, this bond between product and technology continues, in a different form, to hang over how every product gets made.

"The biggest difference between making products today and in the mobile Internet era is that we have to think about which problems technology can solve and which problems the product can solve." Almost every product manager we asked gave this answer. But here too, different backgrounds still shape how each of them answers the question.

For video generation products, on the one hand, the quality of the model is closely tied to technical resources: limits on graphics cards and video memory, or insufficient computing power, directly affect the results. On the other hand, video carries a lot of narrative logic, and full control over the plot is not yet possible, so the ideal product form cannot yet be reached. Therefore, just like ByteDance's products, Aishi's product design is to a large extent built on algorithm technology.

In the beginning, most video generation products could only generate 4-second clips, while the average single shot in a film lasts 6 seconds. Breaking through the duration limit is a job for the technology; the product has to work out which scenarios can make use of even a 4-second video.

Even within the 4-second limit, it can already fill in missing establishing shots and frames in traditional film and television production, avoiding the high cost of pickups and reshoots.

Moreover, with the underlying large models iterating constantly, the pursuit of product detail also has to be grounded in technical differences, which can even directly determine the form of the product. Kimi and Frog Writing both look like text-generation products built on large models, but their technical capabilities are completely different.

As is well known, Kimi is good at long-text input and can read "The Three-Body Problem" in one go. In practice, however, you will find that Kimi's long-text output is not strong enough: whatever prompt is given, the output is often around 1,000 words. Kimi's typical use cases are therefore things like "revise part of a paper" or "write Xiaohongshu copy".

As a more vertical product for creative copywriting, Frog Writing's core technical capabilities are long-text output and long-term memory. With novel writing as its entry point, once the background, task, and main plot of the novel are set, Frog Writing can often generate thousands of words of fiction while fully preserving the earlier plot. It is also more at ease with enterprise-level database customization and with imitating the style of given documents.

"Of all text generation, writing a novel is actually the hardest. The output has to follow the worldview framework strictly, the emotional portrayal of the characters has to be on point, the characters' lines have to sound human enough, the plot twists have to tie closely to the setting, and so on. Even though Sora is a simulator of the physical world, it still needs text to set all the premises before it can generate anything. In the future, text creation will serve as the bottom layer, and each independent multimodal technology, to be truly put into use, will still have to be called by that bottom layer in order to deliver greater value," Wan Lei said.

But people who believe that a product is still a "craft" will not leave everything to technology.

You can see that in domestic general-purpose large-model products there are often "like" and "dislike" buttons beneath the generated content. These are human evaluation and feedback for the large model.

Wan Lei said: "Having the technology itself identify which generated content is good and which is bad is very hard to achieve, but adding some design to the product can feed back into the technology and make the generated results more and more accurate."
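One way such button feedback could be fed back to a model is sketched below. This is a hypothetical illustration, not a description of any of these companies' pipelines: it assumes each click is logged with its prompt and response, and it pairs liked and disliked responses for the same prompt into "chosen/rejected" records of the kind commonly used for preference-based fine-tuning. All names and sample data are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Feedback:
    """One button click (hypothetical log record)."""
    prompt: str
    response: str
    liked: bool  # True = "like" button, False = "dislike" button


def build_preference_pairs(events):
    """Pair every liked response with every disliked response for the same prompt."""
    liked = defaultdict(list)
    disliked = defaultdict(list)
    for event in events:
        bucket = liked if event.liked else disliked
        bucket[event.prompt].append(event.response)

    pairs = []
    for prompt, good_responses in liked.items():
        for good in good_responses:
            for bad in disliked.get(prompt, []):
                pairs.append({"prompt": prompt, "chosen": good, "rejected": bad})
    return pairs


# Made-up sample log: two ratings for one prompt, one rating for another.
events = [
    Feedback("Write an opening line", "The rain had not stopped for a week.", True),
    Feedback("Write an opening line", "Once upon a time, stuff happened.", False),
    Feedback("Summarize this memo", "The memo asks for a budget review.", True),
]

pairs = build_preference_pairs(events)
print(pairs)
```

Prompts with only one kind of rating yield no pair, which is why products benefit from collecting both buttons rather than likes alone.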

More importantly, at this stage the product is a "hook" thrown into the market: only with continuous use and continuous feedback can it keep iterating and evolving. Everything that follows makes sense only if the product hooks the user.

"The end result of rapid technological development is that everything becomes more and more homogeneous. At that point products need to differentiate, and I think there will be more room for product managers then," Mu Zhi said.

From the method of making products, to the so-called product philosophy, to the relationship between product and technology, the questions that will decide the future direction of China's AI are, to some extent, a continuation of the story of Tencent and ByteDance in the mobile Internet. These two ghosts will keep wandering over China's AI.
