
Jeremy Howard, founder of fast.ai: Training models is a craft, and practice makes sense

Author: Big Data Digest

Big Data Digest reprints this article with permission from the KLCII community.

Recently, the KLCII community conducted an exclusive interview with Jeremy Howard, the well-known AI educator, former president and chief scientist of Kaggle, and founder of fast.ai. The interview focuses on core topics such as technology inclusion, the debate between open source and closed source, the gap between China and the United States, and the cultivation of AI talent.

fast.ai is one of the first open source projects backed by a16z, and its course (https://course.fast.ai) has now reached 6 million views. Peter Norvig, Director of Research at Google, strongly recommends it: "'Deep learning is for everyone' is the claim of many tutorials, but this one actually delivers. It is one of the best resources for programmers to become proficient in deep learning."

When it comes to AI education, Jeremy says one of the secrets to making knowledge "easy to understand" is to keep it practical and provide plenty of examples. "We don't want to make learning a privilege limited to a few. Our goal is to make learning open and accessible to everyone."

He also spoke about his experience studying Chinese in China. He treated learning Chinese as a 10-year challenge, and studying roughly 6,000 Chinese characters and complex idioms opened up new perspectives for understanding computing concepts. On the current crop of open source models, Jeremy noted that although many excellent models have emerged in China, there is still a large gap between Chinese and American companies in instruction fine-tuning, and this is a direction for future effort.

The following is the full text of the interview (lightly edited without changing the original meaning).


On technology inclusion: AI learning is not the privilege of a few

Q: Your educational philosophy is "Make deep learning uncool!" Can you elaborate on that?

A: When we call something 'cool', the English word and its Chinese equivalent are similar in meaning. I'm not entirely sure whether the Chinese word carries the same connotation, but in English, 'cool' often has an exclusive flavor. It doesn't just mean that something is great; it also implies a kind of specialness that sets it above ordinary people. We do not want to make learning a privilege limited to a few. Our goal is to make learning open and accessible to everyone.

Therefore, learning should not need to be 'cool'. It shouldn't be something mysterious or awe-inspiring, and it doesn't need to be overly complicated. Learning should be intuitive and clear, make people feel 'this makes sense', and find applications in everyday life. That's what we're after. Our goal is to create a way of learning that is practical, efficient, and accessible to everyone.

Q: As one of the founders of fast.ai, could you share the original intention and goals behind the fast.ai library and courses, and why you decided to provide a completely free deep learning course?

A: My wife and I co-founded fast.ai, and the course is just one of the program's four lines of work. Since 2016, fast.ai's mission has been clear: to make deep learning and AI technology more accessible and understandable.

To prevent AI from being usable only by a few, we want to teach more people to deploy AI proficiently in their own fields, and teaching courses are clearly an effective tool for that. In the course, we show students AI best practices and teach them to build high-quality models at minimal cost and with maximum efficiency.

My wife and I divide the teaching: she is responsible for natural language processing and linear algebra, and I am responsible for machine learning and deep learning. We hope students will not only learn theory in the course, but also apply that knowledge quickly and efficiently in practice.

Q: The fast.ai course is known for its down-to-earth style. How do you ensure the course content is concise and easy to understand, so that more people can benefit from it?

A: We put a lot of time and effort into polishing our courses; for example, we only launch one or two courses per year. The goal we pursue is to simplify complex concepts, but simplifying the complex is not easy. In fact, no single knowledge point is particularly difficult to master; the key is to find the right analogies, examples, and steps.

One of the secrets to making knowledge "easy to understand" is to keep it practical and provide plenty of examples.

As a result, we rarely use complex mathematical formulas. Where a mathematical formula would normally be used, we use a code example instead. Whenever we show code examples, we run them live. We try to keep each run of code to one or two lines so that students can see the results immediately.
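For example, a minimal sketch in the spirit of the fastai high-level API (my illustration, not a verbatim course excerpt; exact names such as vision_learner vary slightly across library versions) trains an image classifier in just a handful of short lines:

```python
# A minimal sketch using the fastai high-level API: train a cats-vs-dogs
# image classifier in a few lines.
from fastai.vision.all import *

# Download the Oxford-IIIT Pet images bundled with fastai
path = untar_data(URLs.PETS)/'images'

# In this dataset, cat images have filenames starting with an uppercase letter
def is_cat(x): return x[0].isupper()

dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

# Fine-tune a pretrained ResNet; older fastai releases call this cnn_learner
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)
```

Each statement does one visible thing, so a student can run it and inspect the result before moving on.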

In this way, complex concepts can be broken down. I think difficulty increases when we try to grasp too much at once, so if you can break the content into smaller pieces, it becomes easier to understand. If students get confused during the learning process, they can always replay the video, find the part where they first became confused, and then resolve the confusion step by step.


fast.ai Homepage: https://www.fast.ai/

Q: What are the most important advantages of the fast.ai course?

A: Practice is the mainstay and theory the supplement. The curriculum has always been designed with application at its core: we teach students the most direct and practical content first, so they can get into practice as soon as they finish the first lesson.

Then the course gradually goes deeper, expanding the depth of learning step by step. This is different from most courses, which tend to be like a university classroom: they start with basic concepts and gradually build up a complete body of knowledge, so students are not exposed to truly practical knowledge until the very end. In contrast, many students prefer our top-down approach to learning.

Q: What prior knowledge do students need before taking the fast.ai courses?

A: Not much theoretical knowledge is necessary, but students should have at least one year of programming experience; there are no other special requirements.

Even if you don't know how to code, you can get started and make progress with the help of tools like ChatGPT. A programming background may matter less than it used to, but you still need some programming skills to become truly proficient in AI.

Q: fast.ai recently released a new course, "From Deep Learning Foundations to Stable Diffusion". Training diffusion models usually requires a lot of computing resources; how can students without enough GPUs take this course?

A: Actually, this course only requires a single GPU. You can run it on Google Colab or a similar platform. I believe many companies and institutions in China also offer students free GPU access, so computing resources will not be a barrier to taking this course.
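For readers who want to check their setup first, here is a quick sanity check (my addition, not part of the course) that a free runtime such as Google Colab actually has a GPU attached:

```python
# Verify that the notebook runtime has a CUDA GPU available
import torch

if torch.cuda.is_available():
    print("GPU available:", torch.cuda.get_device_name(0))
else:
    print("No GPU detected; in Colab, switch the runtime type to GPU.")
```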

Founding the Answer.AI lab: more focused on practical products than the big AI companies


Answer.AI Blog Homepage: https://www.answer.ai/

Q: You led the creation of Answer.AI Lab, can you tell us about it?

A: As we all know, AI research is carried out in various organizations, such as companies like Google, DeepMind, and OpenAI; in China it might be institutions like the Beijing Academy of Artificial Intelligence (BAAI) or companies like BAT. This research, published through academic channels, is usually fairly theoretical, and it is not always straightforward to apply it to practical scenarios and create practical value. Therefore, the goal of establishing Answer.AI is to create a laboratory that integrates research and development.

In our lab, we identify innovative projects that are good for the world, and then think about how we can use AI to accelerate those goals. I want Answer.AI to do critical research with the goal of making the world a little bit better. Of course, Answer.AI follows a for-profit model: it makes money by selling useful, innovative products, and then uses the profits to develop more good products. After all, profitability is necessary for a sustainable laboratory.

The core goal of Answer.AI is to make AI more accessible and usable. Although fast.ai pursues a similar mission, fast.ai is a non-profit organization that relies entirely on my personal funding, and that model is not sustainable in the long term. Therefore, Answer.AI aims to be a more sustainable organization that harnesses and maximizes the potential of language models on a for-profit basis. Ideally, we want to develop products that charge a small monthly fee directly to the end user. In short, our goal is to discover what people need, then build and sell those products profitably.

Q: What kind of team do you hope to assemble to achieve this?

A: Our plan is to build a smaller, well-rounded team. Team members have a wide range of knowledge and a solid foundation, which means they are good at programming, good at problem solving, and highly creative. They may come from all walks of life, but we expect them to have already achieved some success in a certain field, because hard work, intelligence, and creativity often come with success. Of course, the most important thing is whether they can make a real impact through their creativity, intelligence, and hard work.

Q: What makes Answer.AI unique compared to big companies like Google, OpenAI, or DeepMind?

A: Unlike OpenAI and DeepMind, which focus on research, we focus more on practicality. We won't develop chatbots like OpenAI does, but we may develop tools to help kids learn chess and apps to assist lawyers in drafting contracts.

Google, while working on AI-based products, doesn't stand out in terms of practicality, and it has taken them a long time to catch up with OpenAI's progress in chatbots. We are different from them: we avoid formalism and pay more attention to innovation and application.

Talking about the development path: open source, an excellent national strategy

Q: In the battle between open source and closed source paths, how should we choose?

A: Closed source can only help the company that created it, and only people inside the company can improve it. Open source, on the other hand, helps society as a whole, and anyone can contribute to its improvement. From the experience of other software fields, open source is more successful at creating products and services that are useful to society. For example, most of the internet runs on the open-source Linux operating system. Most of the web runs on open-source servers such as Apache, Nginx, and Caddy. Most of our email is stored and sent on open-source servers.

Open source is no accident, and judging from the development of the internet, it can help AI develop faster and more efficiently. At the moment, AI is likely to be even more important than the internet. If development stays closed source, a small number of groups will benefit, but society as a whole will lose: universities and institutions would be unable to use the code and models for scientific research, social innovation would keep declining, and security risks would grow because the models cannot be inspected directly. Therefore, the more AI develops, the more necessary open source becomes.

The importance of open source does not depend on region or nationality; China, Australia, and the United States all need open source to promote AI development. In particular, Chinese people attach great importance to the bond between family and country, and if China can become a major force in open-source AI models, with everyone able to contribute to them, it can be a source of national pride and promote unity and progress. Currently, the best models come mainly from Google and OpenAI in the United States, with Australia and China lagging behind. But with open-source collaboration, we can quickly catch up with these closed-source models.

So open source helps break the monopoly of closed source, and I think it's a great national strategy. Not only is this good for society and the community, but it also means that we can share the benefits of this technology together.

Q: How do we educate the public about the benefits of open source?

A: Many of today's top open source models are made in China. We've seen China do this by creating some of the world's finest open source models, which in itself is a source of great national pride. Continuing to promote open source is not only a way to demonstrate the values of Chinese society, but also an important strategy to avoid the monopoly of technological value by a few large companies. China is a country that cares about the well-being of its communities and people, and the promotion of open source is in line with this philosophy.

The process of promoting open source can be arduous, as many companies are more focused on profitability and competition. For example, in the field of AI, they might try to convince the government to give them some form of monopoly, just as OpenAI encourages the government to enact regulations that are beneficial to them, making it harder for competitors. Therefore, community groups like ours need to work together to fight against this monopoly.

Q: In the entire open source industry chain, what tools and systems are still scarce, and what gaps need to be filled?

A: Open-source AI software faces challenges with ease of use. It is often developed by engineers who are passionate about the technology and who think more about their own needs and those of their peers than about the actual needs of ordinary users. So although open-source software is superior to closed-source software for creating and training models, it still has a high barrier to entry for users, and that is its main problem.

In addition, running AI training or inference requires high-performance GPUs, but the U.S. restrictions on high-end GPUs in the Chinese market are also an opportunity for China to develop better products. Just as Huawei did when it faced restrictions, China has the potential to develop GPUs with better performance and better prices.

No matter how good the software is, if the hardware cost is too high, it is still a significant obstacle. The rapid pace of the AI space means the technology changes quickly, and it is a challenge to improve the ease of use and accessibility of software while staying at the cutting edge. Bringing in more experts in human-computer interaction (HCI) and developing friendlier interfaces would greatly promote the adoption of the software. Software development should not be limited to mathematicians and computer scientists; it needs to involve more experts who understand human needs so that the software becomes more humane.

Talking about the potential of large models: fine-tuning is key

Q: How do you evaluate the current popular open source model, and what are the advantages and disadvantages?

A: Among the best open source models at present is the excellent Yi-34B model developed by Kai-Fu Lee's team. At 34B parameters, it is relatively easy to obtain and can run on an ordinary PC. But one of its main problems is that it is not truly open source: commercial use requires a license, which is currently a limitation.

The Llama 2 model is now slightly dated and has been surpassed by the Mistral model. Although Mistral is not as large as Yi-34B (it is only 7B), it is developing and progressing at a rapid pace, which is very exciting.

One issue I've noticed is that for these models to work effectively, fine-tuning is required. Most of the successful fine-tuning work I've observed seems to have taken place in the United States. While China has contributed greatly to the development of base models, it seems not enough has been done on fine-tuning, which is actually the more important part because it is the key to making a model truly effective. It is an area where anyone can contribute, so the community could try to build better fine-tuned models, especially on bilingual data; that would be a valuable community project.
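To make the fine-tuning point concrete, here is a rough sketch of instruction fine-tuning an open model with LoRA adapters using the Hugging Face transformers, peft, and datasets libraries; the base model name and the instructions.jsonl file are placeholders of mine, not anything Jeremy specifies:

```python
# Sketch of instruction fine-tuning with LoRA adapters (placeholder model and data).
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

base = "mistralai/Mistral-7B-v0.1"        # placeholder: any open base model
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Attach low-rank adapters so only a small fraction of the weights are trained
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Hypothetical bilingual instruction data: one JSON object per line with
# "instruction" and "response" fields
ds = load_dataset("json", data_files="instructions.jsonl")["train"]

def to_tokens(ex):
    text = f"### Instruction:\n{ex['instruction']}\n\n### Response:\n{ex['response']}"
    return tok(text, truncation=True, max_length=512)

ds = ds.map(to_tokens, remove_columns=ds.column_names)

Trainer(
    model=model,
    args=TrainingArguments("lora-out", per_device_train_batch_size=2,
                           num_train_epochs=1, learning_rate=2e-4, logging_steps=10),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
```

The same recipe applies to bilingual data: the only change is the contents of the instruction file.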

Talking about talent training: programming education should start from childhood

Q: What do you think about AI popularization education and the cultivation of AI talents for the future?

A: I think this issue should be considered from two very different angles. First, we need to think about how we can integrate AI into our daily lives. The second is to explore how to push the frontier of AI technology.

One phenomenon I'm particularly concerned about is that in the U.S. and Australia, many teachers oppose students using AI to help them write. I don't know whether this is the case in China, but in my opinion, students should be encouraged to use AI actively. If AI can help them, their future work will certainly involve it; conversely, if AI doesn't help, students should recognize its limitations now.

Banning the use of AI is tantamount to banning the use of computers or the internet, depriving students of the opportunity to master key tools at work and in school.

For teachers, AI represents a huge transformation. Teachers should step out of their comfort zone and work to make their students more competitive. We should teach students how to use this technology. The key to cultivating students' ability to develop AI models and solutions on their own lies in programming education, so programming should start from childhood.

Certain branches of mathematics, such as linear algebra and calculus, are important for AI. I recommend focusing on these courses at the high school and college levels. If you're proficient in linear algebra, calculus, and programming, you have the opportunity to become an outstanding AI researcher and practitioner.
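As a toy illustration of my own (not something from the interview) of where those subjects appear in practice, fitting a line by gradient descent uses a derivative from calculus and vector operations from linear algebra:

```python
# Fit y = w*x + b by gradient descent on mean squared error
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=100)                              # inputs
y = 3.0 * X + 1.0 + rng.normal(scale=0.1, size=100)   # noisy targets

w, b, lr = 0.0, 0.0, 0.1
for _ in range(200):
    pred = w * X + b                        # linear model, vectorized over all samples
    grad_w = 2 * np.mean((pred - y) * X)    # d(MSE)/dw, from calculus
    grad_b = 2 * np.mean(pred - y)          # d(MSE)/db
    w -= lr * grad_w
    b -= lr * grad_b

print(f"learned w={w:.2f}, b={b:.2f}")      # close to the true values 3 and 1
```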

Q: What are the shortcomings of the current AI education model, and what do you think of the fact that most AI education resources are concentrated in the hands of top institutions and universities?


A Hackers' Guide to Language Models:https://www.youtube.com/watch?v=jkrNMKz9pWU

A: Online courses may be able to fill some of the current educational gaps. For example, we offer excellent, free courses. In the West, YouTube has a large amount of excellent video content that tracks the most cutting-edge developments in AI. Recently, I posted a video on YouTube called "A Hackers' Guide to Language Models"; after watching this 90-minute video, you will have everything you need to get started with language models.

I would also like to talk about copyright issues in AI. As we all know, AI models require data for training, and data is protected by copyright. In the West, the question of paying for data rights is hotly debated. In the social landscape of the United States, individual rights and interests tend to take precedence over collective ones. Against this backdrop, a court may order model trainers to pay royalties, which would limit the development of AI technology.

In China, the opposite may be true: people are more likely to believe that the collective good should be valued. Therefore, China may take different measures in such scenarios. I think China could consider not blindly following Western copyright rules that might restrict society's right to use these models, as that could harm the interests of society as a whole.

As far as I know, China has amended its copyright regulations to comply with the standards of the World Intellectual Property Organization (WIPO), but the regulations did not specifically address language models. Therefore, whether copyrighted information can be used to train language models still needs to be clarified by courts and legislators.

Talking about interest learning: Chinese learning opens up new perspectives on AI research

Q: You studied languages in China, can you talk about that experience?

A: I studied at Beijing Language and Culture University for a while, and I often miss that time.

Back then, in order to gain a deeper understanding of machine learning, I began studying how humans learn. I planned to devote 10 years to learning Chinese, with the aim of delving into the nature of human learning. This experience has been invaluable to me. Although I initially learned Chinese only to better understand how humans learn, I soon discovered that the Chinese language and Chinese history and culture are fascinating in their own right. It was not just about learning a language; it actually opened the door to a whole new world for me.


Q: How did you decide to learn Chinese?

A: Actually, there was no particular reason; I just felt I needed a challenging project that would last 10 years. The CIA ranks Arabic, Chinese, and Japanese among the most difficult languages for native English speakers to learn. I'm not very interested in Arabic, and I already had some familiarity with Japanese culture, so I thought it might be fun to learn Chinese, and that's how my journey began.

This decision came with unexpected surprises. For example, every time you learn a new idiom, it's like the beginning of a whole new story. These idioms are not only an integral part of the language, but also embodiments of the culture and art of their era. When I went to China, I was surprised to find that my Chinese friends had in-depth knowledge of Western history and culture, while most Europeans and Americans know very little about Chinese history and culture. It is a whole new world that inspires me and brings me great joy. I feel very fortunate to have had this experience.

Q: How has Chinese learning helped you in your computer research?

A: It does help. Learning Chinese requires mastering a large number of characters, so I invested a lot of time learning about 6,000 of them. To study effectively, I had to develop a variety of strategies, and I think some of the language-learning methods I developed at that time were very beneficial to me. While I can't pinpoint how this directly affects my computer science research, I'm sure there are many similarities between the way humans learn language and the way computers learn. This learning experience has given me valuable perspectives on how to learn new concepts more effectively.

Q: Could you please share your advice and experience on deep learning, especially for beginners?

A: To beginners who are just getting started with deep learning, it's important to remember that deep learning is still a relatively new field. Many of today's experts also started as beginners. At first, everything may seem mysterious and complex, but rest assured that you will eventually be able to understand it and catch up.

My advice is: don't get overly obsessed with math, and don't worry too much about math problems. The focus should be on programming and building actual projects. It's normal not to fully understand every detail at first; the key is to practice and experiment. Deep learning is not only a science, but also an art and an engineering discipline. Therefore, I suggest you focus on the practical and engineering aspects and treat it as a craft. Just like a potter: they can learn a lot of theory about making pottery, but it's the hands-on practice that really matters. If you want to make beautiful ceramics, or high-quality models, you need to produce a lot of work. Just as your first attempts at pottery may be flawed, the first models you train may not be perfect. But the key is not to give up; as long as you persevere, you will keep learning and improving.
