AI Queen: Language models are like "parroting", and the next wave of AI is multimodal AI

Author: Entrepreneurs

On May 27, Dark Horse Entrepreneurship held the "2023 Dark Horse AIGC Summit" in Beijing under the theme "Envisioning a New World, Building a New Landscape". Justine Cassell, former associate dean of the School of Computer Science at Carnegie Mellon University and former chair of the Davos World Economic Forum (WEF) Council on the Future of Computing, attended alongside executives from 360 Group, KLCII, Kunlun Wanwei, Yunzhisheng, BlueFocus, Wondershare, and Zhichuangyu, and exchanged views in depth with thousands of participants.

At the summit, Justine Cassell delivered a keynote speech entitled "AIGC Detonates the Era of Artificial Intelligence".

The following is an edited account of her remarks:

First of all, we should reach a consensus: everyone is talking about artificial intelligence, but what exactly is it?

Artificial intelligence is an important advanced technology, but it is not a new one. The concept dates back to 1956, when it referred to building human-like machines, or machines that do what humans do. Today artificial intelligence also stands for something more. Because it is still learning what our brains do, while also learning to act the way humans do and even to do better, AI is not a technology but a paradigm, a method, and a way for technology to coexist with humans in the world.

The technology under discussion today is the large language model, or LLM. It is also a kind of artificial intelligence: by reading billions of texts it extracts the patterns of language, so when ordinary people use it, it can recognize those patterns and complete what they are writing. For example, if I start an email with the single word "How", my software will suggest "How are you?" to complete the sentence. China now has many large language models, such as Wudao from the Beijing Academy of Artificial Intelligence (Zhiyuan), GLM-130B from Tsinghua University, and Baidu's Wenxin Yiyan, all of them high-performance Chinese-language models.
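The "How" to "How are you?" completion can be illustrated with a toy next-word predictor. This is only a minimal sketch under strong assumptions: the four-sentence corpus and the function names are invented for illustration, and a real LLM uses a neural network trained on billions of texts rather than simple word-pair counts.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real LLM reads billions of texts.
CORPUS = [
    "how are you",
    "how are you today",
    "how are things",
    "how is the weather",
]

def train_bigrams(sentences):
    """Count, for each word, which words follow it and how often."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def complete(prefix, following, max_words=2):
    """Greedily append the most frequent next word, up to max_words times."""
    words = prefix.lower().split()
    for _ in range(max_words):
        candidates = following.get(words[-1])
        if not candidates:
            break  # no known continuation for this word
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

MODEL = train_bigrams(CORPUS)
print(complete("How", MODEL))
```

The predictor only echoes the most common pattern in its corpus, which is the sense in which such a model recognizes the laws of language without understanding them.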

But there is another type of large language model, which we call "generative artificial intelligence". It can not only recognize patterns and complete sentences but also produce content that has never been seen before, such as new text, new pictures, and new paintings. The most famous generative AI today is ChatGPT, which generates entirely new conversations rather than recombining old data.

The following is a conversation between a New York Times reporter and ChatGPT, a conversation that left the reporter terrified.

The reporter said: "We have been discussing the issue of love, because I love you, we are getting married, why?"

ChatGPT replied, "Yes, you are married, but you are not happy, you are not in love with your partner, you are married, but you do not love your partner because your partner does not love you either."

This surprised the reporter, who said, "No, we just had a very good Valentine's Day dinner."

ChatGPT said: "No, your Valentine's dinner is very boring, you are not happy, you and your partner are not in love, your Valentine's dinner is very boring."

The reporter was shocked and called the phenomenon a hallucination. There is no doubt that this conversation had never been produced before. So ChatGPT is not just generative AI: it combines several large-scale machine learning algorithms, and it also has a reinforcement learning mechanism.

A large team in South Africa used ChatGPT to learn the rules and procedures of specific conversations, and found that ChatGPT was also learning dialogue strategies, that is, how to have a good conversation with people. This is very important, because it means we can not only use an LLM bought directly from a software company but also train further on top of these LLMs.

So what is AIGC? AIGC is AI-generated content, which covers the large language models and generative AI just mentioned. ChatGPT, for example, can summarize: when you enter the prompt "summarize all the business cases of my competitors", the AI can output the summary you need within 3-4 minutes. It can also generate long text: if you say "I'm going to promote a new product, and here is a description of it", it can generate marketing copy.

This also means that the prompts you use are very important, and you have to understand how to write the right ones. When we developed AIGC, we found that we need to be creative: how you describe what you want determines how the results differ, so you need to give it a topic, keywords, or descriptive phrases.

For example, I want to bring a completely new drink to market: one that gives people energy all day after they wake up in the morning and is also good for the body. From this description, AIGC can produce full-page marketing materials. Some of the photos we see are likewise the result of giving AIGC a few expressive phrases. The people in these pictures do not exist. Whether the description is "I want a blonde beauty who likes to run" or "I want a very sexy male actor who seems to dance a lot of Latin dance", AIGC gives us the pictures once we describe them.

Unfortunately, as time goes on, AI becomes more powerful, and so does the fear of AI: Will robots surpass us? Will AI become smarter than we are, and we won't be able to control it?

Generative AI and ChatGPT have caused a lot of fear, but history tells us that every new technology brings fear with it. When the printing press was invented, some in Europe warned that it would destroy entire religions, and that people could feel a connection to God only without printed books. During the Industrial Revolution, people felt that manufacturing machines would break up the entire family system. Although ChatGPT has reached a high level, these large language models still have many problems, and we should understand them better and explain them publicly, to reduce the public's fear and give people pragmatic expectations and hope.

Large language models are like parrots: they can speak, but they don't understand us; they can produce language, but they can't reason, and they make a lot of mistakes. To give an example I like: it takes 9 months for a woman to have a baby, so how long does it take 9 women to have a baby? ChatGPT answers, "If it takes 9 months for a woman to have a child, then 9 women can have a child in one month." This is bad reasoning, because it leaves out physiology and biology.

So if we cannot connect an LLM well to its sources of knowledge, it cannot reason.

In addition, ChatGPT does not know how to build relationships with users. When you write marketing materials, you know who your users are and what they care about most, so you know what to say to touch their hearts. ChatGPT can't do that, because it doesn't know who you are and can't build relationships.

So we are now building systems that understand relationships and who the content is aimed at, so that these models can work better.

I came here today to communicate face-to-face with everyone, and the effect is very different from sending a video over Zoom or through your phone. We know that the speed of light is not the same as the speed of sound, so our body language and speech come across differently in person than on Zoom, which makes it easier to build trust. So an LLM must also handle the relevant body language. For example, the VP of Facebook Research has said that generative language must use the five senses, not only in text but also in video and speech.

Today I've been talking about ChatGPT, so is there ChatGPT in China? What is the difference between the Chinese industry and the U.S. industry when it comes to generative AI?

In the U.S., OpenAI produced ChatGPT, which was preceded by the earlier GPT models, and they have been training models since 2015. Today ChatGPT sounds like an overnight success, as if a new artificial intelligence appeared overnight, but it did not: the team spent seven years training ChatGPT, which read billions of texts and has 175 billion parameters. Chinese systems so far have about one-tenth as many parameters, roughly 20 billion. The MOSS system, for example, has about 20 billion parameters.

The two also differ in how open they are. OpenAI promised at the beginning to open everything up to everyone, but once they had a good product they closed it off and released a consumer version. In China it is open source: MOSS is an open-source product, which means you can not only use MOSS but also retrain it, change its workflow, and so on.

China started a little later than the United States in generative AI, but there are particularly exciting developments in China: Chinese teams took less time to train their models and systems, yet they perform just as well, and the models are lighter and can run on smaller computers rather than supercomputers, so they might run on our laptops in the future. So these differences can also lead to better systems.

In China today, we have seen a number of exciting business initiatives that use generative AI. The short video app Kuaishou, for example, is already using AI to improve customer service. Everyone in this room may already have chatbots in their own industry, but those chatbots can be wrong and their systems are not particularly good. OpenAI's models and generative AI can do a better job of giving targeted answers to customers' questions, rather than just looking for a match between question and answer. SenseTime has also done a lot of work in this area and has developed a chatbot as well as an image generator. Baidu's Wenxin can improve not only search engine results but also how those results are used, along with cloud services and so on. Huawei's Pangu system is already used in drug development, which is a particularly exciting industry application of generative AI, because the system and models can read thousands of medical papers, help find errors, and see which drugs are missing or which chemicals and substances can be recombined to develop new drugs. In addition, iFLYTEK has launched SparkDesk, its Xunfei Spark cognitive large model. There is also Alibaba's Tongyi Qianwen, which can create websites from relatively short text descriptions.

There are already so many exciting and successful stories in China, which means that this space is open to all entrepreneurs.

What does this mean for entrepreneurs and small and medium-sized businesses? Beyond these big tech companies, one of the most exciting areas is online learning, because we can gather information, summarize it well, and turn it into shorter courses that help students at different levels learn simply and conveniently. We can automatically assess students' levels, automatically convert knowledge into courses, and automatically give students feedback on their coursework. Or take medical records: as the doctor speaks, the model can automatically generate, manage, and search the records.

Another great opportunity in China is to combine the latest technology with the most precious cultural relics: translating ancient Chinese texts into language that everyone can understand today, and into English, French, Russian, and other languages, so that everyone can read about ancient cultural relics.

Generative AI systems can also help us do research. This is how I use them in my lab: we can ask which topics have been studied and which have not, and we can use these models to help find new solutions to very important but unsolved research problems.

China is also facing an aging population, and young people do not want to live with elderly family members. New AI systems and chatbots can help us take care of the elderly, contact family members when necessary, make sure that elderly people living alone have not fallen, and so on.

There are not enough schools in China, and many young people still do not receive enough education, so online learning can now become better. In addition, one in five young people in China is currently unemployed, which is a very bad statistic, and we do not want this generation of young people to fail. These systems need creativity, and they require a different kind of worker: people who can provide prompts and know how to write text descriptions so that the AI produces the corresponding content. So young people can become very important prompt writers in the future.

Finally, I would like to mention that I really like coming to China and I especially like Chinese food, so my dream is that one day an LLM will be able to combine recipes and provide new ones every day, say 100 different recipes a day, not only for Chinese food but also for fusions of French and Chinese food. I'm very interested in fusion cuisine.

For SMEs there are also many opportunities, such as improving communication with customers through AIGC. Although SMEs do not have the funds to invest in chatbots or develop their own, they can use related technologies to respond to customers' exact questions, which is different from today's chatbots. Then there are marketing materials: SMEs nowadays run websites and social media accounts for marketing, but often appear incompetent or lazy because they do not have time to update the content. An LLM can generate marketing materials for them, so that an SME can publish a new social media post every hour, or summarize its posts every day and send them to all customers via WeChat. This is a business opportunity from a marketing perspective.

SMEs may also lack a dedicated office for making business decisions or summarizing market data, such as data on successful and failed companies. Generative AI can generate content that helps SMEs understand the current state of the market.

Let me share something I am particularly convinced of. The first time I gave a talk at one of China's largest tech companies, I was struck by what they told me: Taobao was working with elderly people, meeting with the inheritors of China's cultural heritage, including carpenters, painters, and carvers. So China should remember its strength, which is combining the latest technology with its oldest culture. That is a path to success for any Chinese company.

So far I have sounded very optimistic, as if one could retire rich at the age of 40. In fact it is not so easy. These systems have no reasoning and no memory; they are like parrots, so we need to connect them to relevant systems for them to give accurate answers. They don't have a body yet, they don't know how to use the senses when they speak, and they don't understand why you use body language, facial expressions, and so on. These models are also very heavy because they have a large number of parameters, so we have to reduce the number of parameters so that pre-training does not take as long or cost as much. It is also very important that we filter the text well and let users see the files and texts that generative AI draws on. Then it will not look so scary and dangerous, because the text that generative AI references will be public.

For a company creating content with new generative AI, the most important thing is to build the training data. As machine learning researchers, we often say "garbage in, garbage out": when your training data is badly built, customers will no longer trust these systems. So we need to fine-tune these systems on the technology, the language, and all the knowledge of your company, so that the answers are very targeted. When using these models in your company, you need to make sure that all stakeholders, such as bosses, employees, and customers, can support and use them. You also have to run focus groups so that people understand the use cases, because even the best technology will fail if the people who use it are not confident in it. So there must be iterative design and redesign, as well as iterative testing, to ensure that everyone can support the system.

As I mentioned earlier, an LLM essentially predicts what comes next from the patterns it has learned. So when you want it to generate information, it should be connected to very good information: it needs that connection to a root knowledge source to achieve very good innovation.

In general, I would say that we cannot just press a button and set sail. We still need a lot of very good guidance, supervision, design, and operation, and we need very good creators and operators. AI is a very good cornerstone.
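One way to picture connecting a model to a knowledge source is to look up relevant documents first and hand them to the model together with the question. The following is a minimal sketch under stated assumptions: the two-document knowledge base, the file names, and the word-overlap ranking are all hypothetical, and the final call to a language model is deliberately left out.

```python
# Hypothetical in-memory knowledge base: source name -> text.
KNOWLEDGE = {
    "product_sheet.txt": "Our energy drink contains green tea and no added sugar",
    "returns_policy.txt": "Customers may return unopened products within 30 days",
}

def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(text.lower().replace("?", " ").replace(".", " ").split())

def retrieve(question, knowledge, top_k=1):
    """Rank sources by how many words they share with the question."""
    q = tokenize(question)
    scored = [(len(q & tokenize(text)), name) for name, text in knowledge.items()]
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]

def build_prompt(question, knowledge):
    """Return a prompt that quotes the retrieved sources, plus their names."""
    sources = retrieve(question, knowledge)
    context = "\n".join(f"[{name}] {knowledge[name]}" for name in sources)
    prompt = f"Answer using only these sources:\n{context}\n\nQuestion: {question}"
    return prompt, sources

prompt, sources = build_prompt("Can customers return products?", KNOWLEDGE)
print(sources)
print(prompt)
```

Returning the source names alongside the prompt means the application can show users which documents an answer was based on.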

Sequoia Capital has also given us predictions for the future of AIGC: the images produced will be better, the products will become more specific and targeted, they will provide final documents rather than only drafts, and the information will be more accurate. Beyond text and images, videos, computer games, and more will also be generated by AIGC systems.

What is next for the AI wave? One part is the multimodal AI I mentioned earlier, which I feel particularly close to, and whose importance everyone already knows. There are also new systems, language learning AI, which build smaller pieces of software to do parts of a task. This so-called language learning AI is particularly exciting: you give it a task and it carries the task out by itself. It creates many AI algorithms and many generative AI functions, divides the task into many subtasks, and completes them one by one.

In addition, as I mentioned earlier, we must be able to show users the files the system uses, so that they trust the system and trust you. Our field also calls this explainable AI: the system needs to be more transparent, so that we know what documents and software it uses and how it makes decisions and arrives at answers.
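The idea of dividing a task into parts and completing them one by one can be sketched as a simple loop. This is an illustration only, under loud assumptions: the fixed three-step plan and the function names are hypothetical, and in a real system a model would generate the plan and each step would be handled by its own smaller program.

```python
def plan(task):
    """Hypothetical planner: a real system would ask a model for these steps."""
    return [f"research: {task}", f"draft: {task}", f"review: {task}"]

def run_step(step):
    """Stand-in for a smaller, specialized program that handles one subtask."""
    kind, _, detail = step.partition(": ")
    return f"{kind} done for '{detail}'"

def run_agent(task):
    """Split the task into subtasks and complete them one by one."""
    results = []
    for step in plan(task):
        results.append(run_step(step))
    return results

for line in run_agent("launch post"):
    print(line)
```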

If we do all of the above, I believe that the future of our AI-generated content will be very bright.