Dialogue with Yang Heng of Aimo Technology: 15 years of data simulation research and development meets the wave of large models

author:Leifeng.com

Author: Sun Puqian

Editor: Chen Caixian

Interviewee: Yang Heng

Yang Heng is the founder and CEO of Shenzhen Aimo Technology Co., Ltd.

He was a postdoctoral researcher at the University of Cambridge, and holds a Ph.D. from the University of London and a master's degree from the University of Defense Sciences.

He is an adjunct professor and external master's/doctoral supervisor at Fudan University, Xidian University, and Shenzhen University.

He is a Shenzhen overseas high-level talent ("Peacock" talent), a member of the 6th CPPCC committee of Nanshan District, Shenzhen, one of Shenzhen's Top Ten Entrepreneurial Talents, an industry expert of the Shenzhen Artificial Intelligence Industry Association, and an AI field expert of the Shenzhen Software Industry Association.

As the person in charge, he has been deeply involved in the research and development of a number of national 863 Program and international key projects in the field of AI.

He has published more than 30 papers in top AI conferences and journals (such as CVPR, ICCV, NeurIPS, ICML, and IEEE Transactions) and has been granted more than 40 invention patents.

"Data scarcity", "training sources in urgent shortage", "large models are exhausting all the text in the universe"... Recently there has been no end of talk about the lack of training data for large models. In response, "AI training AI" and "synthetic data", the so-called use of magic to defeat magic, have come up again and again.

Sam Altman, CEO of OpenAI, said in an interview in the first half of this year that "all data will become synthetic data in the future". In conversations with different interviewees, AI Technology Review has also found that having AI train AI has quietly become common in the deployment of large models.

The industry has mixed views on synthetic data. Aidan Gomez, one of the authors of the Transformer paper, believes that synthetic data could accelerate the path to "superintelligent" AI systems. Others hold the opposite opinion: they believe that "synthetic data is biased" and that "training with synthetic data will give the model irreversible defects." Some netizens even joke that synthetic data sounds like AI inbreeding.

However, the discussion on the Internet is still a long way from the front line of real-world deployment.

Founded in 2018, Aimo Technology uses artificial intelligence to provide digital solutions for offline consumer retail. Its founder, Dr. Yang Heng, has more than fifteen years of research experience in data simulation and computer vision. He studied "pattern recognition and intelligent systems" for his bachelor's and master's degrees. To go deeper into artificial intelligence research, he then pursued a doctorate at the University of London, focusing on face recognition, and continued with postdoctoral research at the University of Cambridge. In this interview, he describes Aimo Technology's own method of training models on simulated data and how it is applied in practice.

The following is a conversation between AI Technology Review and Yang Heng, founder of Aimo Technology:

When large models meet data simulation

AI Technology Review: We understand that Aimo Technology released its large retail model in April this year. With computer vision as your strength, what led you to enter the field of large models?

Yang Heng: That's a good question. I spent more than ten years in academic research, and in academia you need to go deep along a single technical path. Industry is the opposite: an enterprise thinks from customer needs. Over the past four or five years Aimo Technology has focused mainly on industrializing visual AI, but the service has to be tailored to what customers need. Customers don't care what technology you used; they only care whether their problem is solved. Along the way we found that computer vision alone is not enough to solve customer problems. It also takes what are now known as large models, and in essence enterprises have a need for these solutions.

We have been studying computer vision for a long time, but in fact, before the concept of large models took off, we had already begun similar research and development in 2020, and in 2021 we launched our first such product, called "Ask and Get".

"Ask and Get" is not a purely visual product. It also has a language model, and combining language with vision enriches an AI product's perception of its environment. Customers can quickly get the answers they want through conversation, and the logic of the product is very similar to ChatGPT.

A series of products based on large models in 2020-2022

Take offline consumer retail as an example. Enterprises need to process large amounts of consumer-facing (C-end) data: images, videos, text, and service records. A model with only a single modality cannot meet the customer's needs systematically. Now, with large models as a tool, a model that combines recognition, language understanding, and an understanding of internal business processes becomes a valuable large model for vertical scenarios. So we are now better positioned to keep meeting customer needs with multimodal large models in vertical scenarios.

AI Technology Review: There is a saying about the weakness of AI and large models: "there is a threshold, but no barrier." What do you think of it? And what do you see as Aimo Technology's advantage in entering large models now?

Yang Heng: For AI companies, technology is a basic threshold; without technical capability, a company cannot enter this industry. But it is true that the ability to call various large model interfaces, or to use open-source large models, is slowly lowering the threshold for AI startups. In fact, whether it is a large model, a so-called small model, or traditional machine learning, the model itself has no value. Only with an understanding of the business can the model be put to work.

I think our biggest advantage in entering the game is in two aspects: we have an understanding of the business, and we have industry data.

Over the past few years we have established long-term cooperation with many customers, built high-value business products on scenario know-how, and established benchmark applications in our current market segments.

As for data, we focus on digital applications for offline consumer retail, where data is very scarce. All large model training requires data. Companies like OpenAI mainly crawl Internet data, but that is still not enough for vertical scenarios. For example, fine-grained data such as the business performance and establishment of each offline store cannot be crawled the way online data can. Over the past five years, Aimo Technology has accumulated a large amount of offline consumer retail data and built its own retail data platform, which is the key fuel for building our large model for vertical retail scenarios.

AI Technology Review: You just mentioned the importance of data for model training. How does Aimo Technology deal with the data problem?

Yang Heng: As I just mentioned, whether it is a large model or a small model, its value to industry rests on supervised learning. The basic logic of supervised learning is to label the data well by hand, train on it, and end up with a usable model. Basically every pipeline works this way.

But manual labeling has two major problems. The first is cost: both collecting data and hiring annotators cost money. That is not the main bottleneck, though. I think the biggest problem is that human labeling ability has an upper limit, and that limit caps the model. If people cannot learn it, machines cannot learn it either.

That is why our company has been building Knowledge-driven Intelligence based on Simulation System (K.I.S.S). The simulation system is designed to solve two problems: how to reduce the cost of annotation, and how to break through the limits of manual labeling.

K.I.S.S

Here are two examples.

The first is about recognizing "people". Take faces, which everyone is familiar with. Face recognition has always been a fiercely competitive area for AI companies, yet in 2019 our company was still able to license its face recognition algorithm to large companies at a very high price, thanks to our simulation-based training method. Usually everyone trains on face data in which the face is frontal and well lit. Such images are easy to annotate by hand, and many companies do very well on them. However, pictures taken at very large angles, or that are very blurry or very poorly lit, so that the person cannot be seen clearly, exceed the limit of human annotation. Humans cannot annotate them accurately, so the model may not learn such scenes at all.

This is where our own K.I.S.S simulation system comes in. The customer only needs to provide one face picture. From that 2D picture we generate a 3D face model and simulate face samples under many complex conditions to train the model and improve recognition accuracy. So even in its earliest days, Aimo Technology could compete head-on with all the large companies in the market.

The second example concerns recognizing "things". In offline consumer retail, accurate recognition of product displays is in great demand, and accuracy here means very fine-grained recognition. For example, a customer wants to identify the ice cream in a freezer: whether the brand is Magnum (Menglong) or Wall's (Helu Xue), whether the flavor is chocolate or vanilla, and what share of the display each product occupies. In practice, however, there are many product specifications that look very similar, and items are placed messily and occlude one another, so fast, detailed, and accurate labeling and counting by hand is difficult.

3D simulation data of an ice cream display, generated with Aimo Technology's K.I.S.S

To train recognition models for this kind of retail product, we also use simulation to generate large numbers of self-labeled data samples. The accuracy, maturity, and stability of the resulting models have been verified in market applications, with clear advantages in both training cost and accuracy. This is the core logic of our technical route.
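The idea of self-labeled simulation data can be illustrated with a toy sketch. The code below is not Aimo Technology's K.I.S.S system; it is a minimal illustration, assuming a hypothetical freezer laid out as a grid of slots, of why a simulator that places every product itself already knows each label (brand, flavor, position) and can compute a display share without manual annotation. All names (`CATALOG`, `simulate_freezer`, `display_share`, the brand names) are invented for illustration.

```python
import random
from collections import Counter

# Hypothetical catalog of (brand, flavor) pairs, used only for illustration.
CATALOG = [
    ("BrandA", "chocolate"),
    ("BrandA", "vanilla"),
    ("BrandB", "chocolate"),
    ("BrandB", "vanilla"),
]


def simulate_freezer(rows, cols, seed=0):
    """Place a random product in every slot and return its annotations.

    Because the simulator chooses what goes where, each sample comes with
    an exact label and bounding box; no human has to annotate anything.
    """
    rng = random.Random(seed)
    annotations = []
    for r in range(rows):
        for c in range(cols):
            brand, flavor = rng.choice(CATALOG)
            annotations.append({
                "brand": brand,
                "flavor": flavor,
                # Box in slot units: (x_min, y_min, x_max, y_max).
                "box": (c, r, c + 1, r + 1),
            })
    return annotations


def display_share(annotations):
    """Return each brand's fraction of the occupied slots."""
    counts = Counter(a["brand"] for a in annotations)
    total = sum(counts.values())
    return {brand: n / total for brand, n in counts.items()}


if __name__ == "__main__":
    labels = simulate_freezer(rows=3, cols=4, seed=42)
    print(len(labels), "labeled items")
    print(display_share(labels))
```

A real pipeline would render each simulated layout to an image with a 3D engine and train a detector on the image and annotation pairs; the sketch only shows where the labels come from.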

AI Technology Review: How do you see the future application value of data simulation for large model training?

Yang Heng: I recently watched an interview with Sam Altman, the CEO of OpenAI, from the first half of this year. He said that if OpenAI's current large models are to keep improving, the only solution at present is better synthetic data, which is what we call data simulation.

With simulation technology, we can simulate different business scenarios and generate large amounts of training data covering different lighting, angles, expressions, and kinds of occlusion, which is closer to what a camera may actually capture. But the significance of simulation is not only to increase the quantity of data; the greater value is to make the data distribution more diverse. When the model has seen data from all kinds of scenes during training, it performs better in practice. This lets us keep simulating real-world scenario data more accurately as needs arise, improving model accuracy and performance. At the same time, simulated data comes with its own annotations and no longer requires a lot of labor, so both cost and effect improve.
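The point about distribution diversity can be made concrete with a small sketch. It assumes, purely for illustration, that a face sample is described by three rendering parameters (head yaw, brightness, occlusion) and compares how many yaw-angle bins a frontal-only dataset covers against a simulated one. The function names, parameter ranges, and bin width are hypothetical choices, not details from the interview, and no actual images are rendered.

```python
import math
import random


def sample_scene(rng, max_yaw_deg):
    """Draw one hypothetical set of rendering parameters for a face."""
    return {
        "yaw_deg": rng.uniform(-max_yaw_deg, max_yaw_deg),  # head rotation
        "brightness": rng.uniform(0.1, 1.0),                # 1.0 = well lit
        "occlusion": rng.uniform(0.0, 0.6),                 # fraction hidden
    }


def make_dataset(identity, n, max_yaw_deg, seed=0):
    """Generate n parameter sets; the identity label is known by construction."""
    rng = random.Random(seed)
    return [{"label": identity, **sample_scene(rng, max_yaw_deg)} for _ in range(n)]


def yaw_bins_covered(samples, bin_width_deg=30):
    """Return the set of yaw-angle bins that contain at least one sample."""
    return {math.floor(s["yaw_deg"] / bin_width_deg) for s in samples}


if __name__ == "__main__":
    frontal = make_dataset("person_001", 500, max_yaw_deg=10)
    simulated = make_dataset("person_001", 500, max_yaw_deg=90)
    print("frontal-only bins:", sorted(yaw_bins_covered(frontal)))
    print("simulated bins:   ", sorted(yaw_bins_covered(simulated)))
```

Both datasets have the same size and the same automatic label; only the simulated one spreads its samples across the full range of angles, which is the sense in which simulation adds diversity and not just quantity.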

Each company has its own technical path. In terms of business scenarios, we and other technology providers are in the same market, but each company has a technical roadmap it believes in, and that is the most essential difference between AI companies.

Our choice of a simulation-based approach probably has a lot to do with my own background. I have been studying computer simulation since my undergraduate years, and I believe it is valuable. So from the founding of the company to the present and into the future, Aimo Technology will firmly stick to this route.

Now: Tailor-made AI to empower offline retail

AI Technology Review: Why has Aimo Technology focused on AI applications in offline retail since its founding? What are the main AI solutions you are promoting now?

Yang Heng: Aimo Technology was founded in 2018, which can be regarded as another trough for artificial intelligence. But precisely because of that, everyone could return to the essence of business and think about how to put AI into production. After market research, and considering the strengths of our team, we decided to apply AI to the retail market, which is large enough and closest to the consumer public. The name "Aimo" means AI reaching the end.

We start from the essential needs of offline scenarios. Our first main product, "one-shot core", mainly helps brands run offline marketing campaigns efficiently and evaluate their effect, using AI to strengthen the brand's channel power. In the past, because stores were numerous and scattered, it was difficult for brands to carry out marketing campaigns in offline retail stores and evaluate the results. "One-shot core" not only provides intelligent, instant verification of offline product and material displays with real-time feedback, but also opens up richer and more varied formats for brand marketing campaigns. It is now used in sub-industries such as beverages, dairy products, food, and medicine; Unilever and Dongpeng Beverage are among our customers.

Another main product with which Aimo Technology empowers physical retail is the "virtual store manager". By recognizing and analyzing scene data such as customer flow, store atmosphere, employee operations, and safety and hygiene conditions, it helps store owners keep track of operations in real time. They can adjust the store atmosphere in real time and improve service quality precisely. For example, a waiter greets consumers within one minute of their sitting down, and cleaning staff clear the tableware within two minutes of consumers leaving the table, giving consumers a better experience. It also saves labor costs and enables digital, fine-grained management of stores across all scenarios.

The core function of Aimo Technology's "Virtual Store Manager"

AI Technology Review: AI companies generally find it hard to make a profit. How has Aimo Technology remained profitable in recent years?

Yang Heng: There are many reasons. In terms of business strategy, to sum it up in one sentence: we must do the business that truly belongs to an AI company.

Most AI companies run serious losses. Before finding a business scenario that really needs AI, they invest too much R&D in pseudo-demand scenarios and in the end fail to generate customer value. Or they generate a lot of revenue that is not real AI business, such as installation and integration projects. The revenue looks high, but it is low-gross-margin work at high cost, so a profit is impossible.

We believe product-market fit (PMF) is very important. Aimo Technology works closely with benchmark customers, uncovering business needs for AI from real scenarios in retail, catering, logistics, and other fields, helping customers solve practical problems and creating or enhancing business value for them, which is how our own value shows. Through five years of entrepreneurship, three of them during the epidemic, we have consistently stayed slightly profitable, which shows that we have handled product-market fit fairly well. Of course, there is more than one road to success, but this road suits Aimo Technology better.

Second, teamwork is also important. Our co-founders are highly complementary in their capabilities, with some good at algorithms and others good at architecture. For example, one of my co-founders has more than ten years of experience at Fortune 500 consumer retail companies and a very deep understanding of consumption scenarios. My own background is technical. Without her, I would have had to spend a lot of time learning the know-how of the retail industry: why there are brand owners, why there are retailers, how the brand side operates, how the market works. With her more than ten years of industry experience, the whole team can think about how to build products faster and better on the basis of understanding the scenario.

Future: WPA, intelligence + knowledge + execution

AI Technology Review: What is Aimo Technology's development plan for the future? We see that you have put forward a concept called WPA. How does it differ from RPA, and how does it relate to your development?

Yang Heng: AI is a very large industry, and it breaks down into segments. When it comes to RPA, UiPath is the name fixed in everyone's mind; when it comes to CRM, Salesforce comes to mind first; and now when it comes to ChatGPT, everyone first thinks of OpenAI. Our plan for the future is Workflow Process Automation (WPA). We were the first to propose the WPA concept, there are not many competitors, and I hope that when WPA is mentioned in the future, the industry will think of Aimo Technology.

As for how to explain WPA: in practice, AI empowering enterprise digitalization essentially means automating enterprise operation processes. For example, give ChatGPT a goal today, such as writing a document or a piece of code for me, and it carries out the task automatically as soon as the instruction is issued. Whether or not this is called AI, the essence of a product that achieves this goal is the automation of an operation process.

But many operation processes, such as designing marketing plans or reviewing and judging marketing results, are not simple rule-based work. They require higher-level "intelligence" as a basic capability to drive the automatic execution of the workflow. This higher "intelligence" includes not only human-like intelligence but also knowledge of the specific work, followed by executing decisions, optimizations, and adjustments in real time, to achieve true automation of the work process. That is WPA, and it is what Aimo Technology will do in the future.

The evolution of MPA, RPA, WPA

AI Technology Review: In the face of fierce competition in the industry, what strategy will you adopt to maintain Aimo Technology's competitive advantage?

Yang Heng: We do have a lot of competitors in the broader AI industry, but as I mentioned just now, each company's technical route is different. We have been on the road of data simulation for 5 years, and we have the endorsement of a large number of industry customers. Competition has always existed, and it exists in every dimension, but looking at the overall picture, I am still very optimistic about the future of Aimo Technology.

AI Technology Review: As an AI practitioner for more than ten years, how do you personally see the future of AI?

Yang Heng: I think the current artificial intelligence industry is full of opportunities and challenges. The opportunity lies in the continuous emergence of innovative technologies, attracting more talent and capital to enter. But at the same time, it also faces the risk of excessive hype and irrational development, so practitioners in the industry need to remain calm and rational to ensure the healthy development of artificial intelligence.

For example, excessive promises and exaggerated publicity may make demand-side expectations of artificial intelligence too high, resulting in actual application effects that are not in line with expectations. On the other hand, over-promising may also attract some irrational talents into the industry, which may mislead the direction of the industry.

Whether AI will eliminate humans has also been a very controversial topic. Some time ago, the CEO of Stability AI said in an interview that human programmers would be out of work in five years. But looking at it the other way around, humans can accomplish a great deal with the help of AI tools. The great improvement in productivity brought by artificial intelligence will change the relations of production. AI machines that were completely controlled by humans in the past will gradually become able to collaborate with people, and people and AI can reach a state of co-prosperity and symbiosis. Aimo Technology is also moving in this direction.

You are welcome to add the author on WeChat (Sunpx33) to get in touch.
