
The Transformer paper has been cited more than 40,000 times, and two of its authors have left Google to start a company

A report from Machine Heart

Editor: Qian Zhang

"At Google, we trained bigger and bigger Transformers, dreaming of one day building a general model to support all ML use cases. But there is a clear limitation: models trained on text can write great prose, but they cannot take action in the digital world. You can't ask GPT-3 to book you a flight, write a check to a vendor, or conduct a scientific experiment."

After a vigorous wave of model-building, the world is now searching for applications and scenarios for these large models, and Ashish Vaswani and Niki Parmar, former Google Brain researchers and key authors of the Transformer paper, are no exception.

In 2017, Ashish Vaswani, Niki Parmar, and several other researchers published "Attention Is All You Need," a landmark paper that ushered in the era of large models. In it, they proposed the now-famous Transformer architecture. In 2018, a model called BERT took the NLP community by storm, setting new SOTA records on 11 NLP tasks, and the Transformer was the architecture behind it.


Ashish Vaswani, Niki Parmar, and others published "Attention Is All You Need" in 2017. Note: the asterisk in the paper indicates that these researchers made different but equally important contributions (the listing order is random). Among them, Ashish and Illia jointly designed and implemented the first Transformer models and were deeply involved in every aspect of the Transformer architecture. Niki designed, implemented, tuned, and evaluated countless model variants in the original codebase and in tensor2tensor.

In the years that followed, the Transformer became the dominant architecture in natural language processing and successfully crossed over into fields such as vision and audio processing; the formulaic title "X is all you need" became a popular template.

Five years later, Ashish Vaswani and Niki Parmar decided to embark on a new journey. In a recent tweet, they announced that they had co-founded a new startup, Adept, dedicated to achieving general intelligence by enabling people and computers to work together creatively. "We believe AI systems should be built with users at the center. Our vision is one in which machines work together with people in the driver's seat: discovering new solutions, enabling more informed decisions, and giving us more time for the work we love," the company wrote in its introduction.


In addition to Ashish Vaswani and Niki Parmar, the company has gathered a number of top researchers in the field of AI (most of whom have worked at Google), including:

Anmol Gulati, a former Google Brain research engineer who was involved in Google's large-scale speech and language modeling research;

Augustus Odena, a former Google Brain research scientist who was involved in building Google's code generation model;

David Luan, former vice president of engineering at OpenAI and later at Google Brain, who was one of the authors of GPT-2 and PaLM (https://mp.weixin.qq.com/s/-Annt2JkAhgv9YxYpc7pXQ) and was involved in part of the GPT-3 work;

Erich Elsen, who has worked at DeepMind, Google Brain, and Baidu; a researcher at the intersection of machine learning and high-performance computing, he helped lead the training of large models at DeepMind, with a focus on improving training efficiency;

Fred Bertsch, a former Google Brain software engineer who is an expert in data and collaborative AI systems;

Kelsey Schroeder, a former product manager on Google ML, who led Google's production infrastructure products for large models;

Maxwell Nye, an MIT PhD student who interned at Google Brain, focuses on combining deep learning and symbolic techniques to automatically write code. During his internship at Google, he used very large language models (more than 100 billion parameters) to write and understand Python programs.


Adept Founding Team.

So why would these heavyweights leave a tech giant like Google to strike out on their own? And what products will their new company build?

David Luan wrote in the company's first blog post:

At Google, we trained bigger and bigger Transformers, dreaming of one day building a general model to support all ML use cases. But there is a clear limitation: models trained on text can write great prose, but they cannot take action in the digital world. You can't ask GPT-3 to book you a flight, write a check to a vendor, or conduct a scientific experiment.

True general intelligence requires models that can not only read and write but also act in ways that help the user. That's why we started Adept: we're training a neural network to use every software tool and API in the world, building on the vast body of capabilities that people have already created.

In effect, we're building a general system that helps people get things done in front of a computer, which we call a "universal collaborator" for every knowledge worker. You can think of it as an overlay on your computer that works alongside you, using the same tools you do.

With Adept, you can focus on the work you actually enjoy and hand other tasks to the model. For example, you could ask it to "generate the monthly compliance report," all using existing software such as Airtable, Photoshop, an ATS, Tableau, and Twilio. We want this collaborator to be a good student: highly trainable and highly helpful.

We are excited about this product vision, not only because it would be useful to everyone who works in front of a computer, but also because we believe it is the most practical and safest path to general intelligence. Unlike large models that generate language or make decisions on their own, our models are narrower in scope: we are an interface to existing software tools, which makes it easier to mitigate issues with bias. Central to our company is how our product becomes a vehicle for understanding people's preferences and integrating human feedback at every step.

This blog post makes clear that although Adept also aims at AGI, it has chosen a different path: rather than rushing to replace humans with AI, it is committed to using AI to augment human capabilities, which sounds more attainable.


Of course, this is not a new concept. Terry Sejnowski, author of The Deep Learning Revolution, said back in 2019 that "in the future, humans and machines will be cooperative rather than competitive." As a transitional stage before AGI arrives, the concept of "collaborative intelligence" is drawing more and more attention. Cécile Paris, chief research scientist at CSIRO, Australia's largest national research agency, has even argued that "[collaborative intelligence] will be the next scientific frontier of digital transformation." A number of technology companies with the vision of "augmenting human capabilities with AI" have already emerged in China and abroad, such as Recurrent AI and Ronglian Cloud. Before true AGI is realized, more and more companies are likely to choose this route.

David Luan revealed that Adept has raised $65 million in funding, with Uber CEO Dara Khosrowshahi and Tesla senior director of AI Andrej Karpathy among its angel investors.


Reference link: https://www.adept.ai/post/introducing-adept
