
Former LangChain employee leaks the news: is a stronger Devin 2.0 coming?

author:InfoQ

Author | Chu Xingjuan

In March, Devin, billed as the "world's first AI programmer", was released and immediately drew intense attention. Devin can reportedly plan and execute complex engineering tasks that require thousands of decisions, recall the relevant context at each step, learn over time, and fix its own mistakes. For a while, programmers panicked.

Recently, Andrew Gao, a former LangChain employee, shared online what he described as the new features of the upcoming Devin 2.0.


First, an interactive mode lets the user help Devin navigate the web. It is very useful when Devin gets stuck on something like an image CAPTCHA. It is a bit slow (the team admits as much), but it works well enough for point-and-click actions.


Second, users had previously complained that they could not step in and edit code alongside Devin. This is now possible by launching a web-based VS Code.


Another update is cookie support, which lets Devin log in to a website under a user's account without the user having to give Devin a password. PhantomBuster does something similar.

Andrew gave the example of having Devin order chicken wings on DoorDash: Devin was able to find the Wingstop store, select the wings, and work through the various checkboxes.
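The article does not say how Devin actually consumes the cookie, so the following is only a minimal sketch of the general idea of cookie-based login: a session cookie exported from an already logged-in browser is attached to a request, so no password changes hands. The function name, the URL, and the cookie name and value are all hypothetical, and no network call is made.

```python
from http.cookies import SimpleCookie
from urllib.request import Request


def build_authenticated_request(url: str, exported_cookies: dict[str, str]) -> Request:
    """Attach previously exported session cookies to a request.

    Hypothetical helper for illustration only; this is not Devin's
    or PhantomBuster's actual mechanism.
    """
    cookie = SimpleCookie()
    for name, value in exported_cookies.items():
        cookie[name] = value
    # Serialize as a single Cookie request header: "name=value; name2=value2"
    header = "; ".join(f"{morsel.key}={morsel.value}" for morsel in cookie.values())
    return Request(url, headers={"Cookie": header})


# Placeholder URL and cookie; a real session cookie would come from the user's browser.
request = build_authenticated_request(
    "https://example.com/account",
    {"session_id": "abc123"},
)
print(request.get_header("Cookie"))
```

The trade-off in this kind of design is that the agent never sees the password, but whoever holds the cookie can act as the user until the session expires or is revoked.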


Devin also now appears to be better at building websites.


Devin also adds a "machine snapshot" feature that lets users save Devin's state so that they can resume from it after the server has been shut down.


Devin also supports integration with GitHub, which allows Devin to make commits.


However, it should be noted that Cognition, the company behind Devin, has not officially released the above features.

The founder's latest interview avoids the fraud allegations

Devin has had two peaks of attention: its release on March 13, and the accusations of fraud that followed more than two weeks later.

Just last month, Carl, an internet blogger who says he has 35 years of experience as a software engineer, accused the Devin demo of being misleading. Carl went through Devin's demo video frame by frame and raised objections including the following:

  • Devin is presented as able to solve arbitrary Upwork tasks. However, the problem addressed in the video does not match the requirements the customer specified (the customer asked for setup instructions, not code);
  • Devin is shown fixing bugs in a GitHub repository's source, but the files it edits do not actually exist in that repository, and some of the bugs it fixes are nonsensical and of a kind a human would never introduce. The inference is that Devin was fixing bugs in files it had created itself, without saying so;
  • The EC2 part needs no coding at all: the README in the repository contains all the instructions needed to complete the task, and it works with a one-line tweak even though the repository is an older version. That is why the customer asked for instructions on how to run it on EC2 rather than for code. Devin does not seem to have read the README and did not understand that it only needed to run a couple of existing Python scripts. The output in the video makes the task look complex, with a long plan and many checkboxes showing completed work, but in reality the work is pointless and redundant;
  • Devin's code changes are poor, such as writing its own low-level file read loop instead of using the standard library correctly;
  • The video makes it look as if Devin completed the task quickly, and Carl himself was able to complete the requested task in about 30 minutes, but the timestamps in the chat show that Devin's task ran for many hours, even into the next day;
  • Devin executes pointless shell commands, such as "head -n 5 foo | tail -n 5".
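Two of the objections above can be illustrated concretely. The sketch below is not Devin's code. It contrasts a hand-rolled, chunk-by-chunk file read with the one-line standard library equivalent, and it emulates `head` and `tail` in Python to show why piping the first five lines through `tail -n 5` is presumably considered pointless: the result is the same five lines. The helper names are hypothetical, and the sample file `foo` is generated on the fly.

```python
from collections import deque
from itertools import islice
from pathlib import Path
import tempfile


def head(lines, n):
    """First n items, emulating `head -n N`."""
    return list(islice(lines, n))


def tail(lines, n):
    """Last n items, emulating `tail -n N`."""
    return list(deque(lines, maxlen=n))


def read_lines_by_hand(path):
    """Hand-rolled loop: read raw chunks, then join, decode and split manually."""
    chunks = []
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8").splitlines()


def read_lines_stdlib(path):
    """The same result using the standard library directly."""
    return Path(path).read_text(encoding="utf-8").splitlines()


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "foo"  # stand-in for the file named in the command
    sample.write_text("".join(f"line {i}\n" for i in range(1, 21)), encoding="utf-8")
    by_hand = read_lines_by_hand(sample)
    via_stdlib = read_lines_stdlib(sample)

first_five = head(via_stdlib, 5)          # head -n 5 foo
piped = tail(head(via_stdlib, 5), 5)      # head -n 5 foo | tail -n 5

print(by_hand == via_stdlib)  # both read paths yield the same lines
print(piped == first_five)    # the tail stage changes nothing
```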

Carl believes that Cognition Labs exaggerated Devin's capabilities and that the video descriptions and tweets contained false statements, creating confusion and misunderstanding. He advises against blindly repeating and amplifying claims found online without proper research.

"There's hardly any AI product that performs satisfactorily after a few weeks of hype," some netizens commented.

While many expected Cognition to respond to these questions, the team has not provided an explanation so far. The only hint of its attitude toward Devin's shortcomings is a mid-April tweet from CEO Scott Wu: Devin today is far from perfect. Devin gets a lot of work done, but it also often makes mistakes, writes faulty code, or gets stuck.

On May 2, a video of an interview with Scott Wu, a little under 30 minutes long, was released. In the video, Scott says that AI will lead to more and more engineers in the future. First, AI will increase the demand for engineering, since many problems can be solved with code and many things can be built with code. Second, Devin is not the one who decides what to do: the person using it has to know what to build and what problem to solve. He therefore thinks that Devin simply lets engineers focus on the essence of engineering.

According to Scott, Devin is better at DevOps and development environment setup. "The first moment Devin really excited us was when the database tables were being spun up and Kubernetes was launched." Another good use case is data analytics. Scott emphasizes that Devin is the executor, and that its focus is on accurately understanding the requirements and turning them into working code.


"They gave him every opportunity to respond to criticism of the video, but he kept dodging. He didn't say anything substantive. The interview did not inspire any confidence in his company," one netizen commented under the interview video. Another even joked, "Cryptocurrency scammers are interviewed by cryptocurrency scammers."

Of course, some netizens are strongly supportive: "It's crazy to see so many haters here. Scott has built a very good team and is developing a revolutionary product."

According to LinkedIn, the company currently has more than 35 employees, and its most recent update there still dates from the day Devin was first announced.


"Unable to disclose further details"

Cognition has three founders: CEO Scott Wu, CTO Steven Hao, and Chief Product Officer Walden Yan.

Scott Wu says he has been programming since he was 9 years old and loves the feeling of bringing his ideas to life. Someone also dug up a video of Scott Wu competing in MathCounts at the age of 14. In the competition, he needs almost no time to think about the Olympiad-style questions: as soon as the host finishes reading a question, he gives the answer.

Hao previously worked as a top engineer at Scale AI, a highly valued start-up that specializes in training AI systems. Yan, who had just dropped out of Harvard, asked that this be kept quiet because he had not yet broken the news to his parents. The founders also reported that the team holds a total of 10 IOI gold medals.

This team has already raised $21 million in Series A funding led by Peter Thiel's Founders Fund. According to Bloomberg, former Twitter executive Elad Gil has also invested in Cognition AI.

But how Cognition managed to make a major breakthrough in such a short period of time remains a mystery.

Scott declined to reveal many details about the underlying technology, saying only that his team had found a unique way to combine large language models (LLMs) such as OpenAI's GPT-4 with reinforcement learning techniques. Cognition, for its part, also declined to say how much Devin relies on existing large language models.

Scott also said in the interview that he could not give more details about how Devin works.

The entire Cognition team has stayed silent about everything to do with how the product is implemented, which adds to the mystery and makes outsiders more suspicious. After all, "Talk is cheap. Show me the code." has become common wisdom.

Original article: "Former LangChain employee broke the news that the stronger Devin 2.0 is coming? So, is the 'world's first AI programmer' fake?", Chu Xingjuan, InfoQ selected articles (AI & large models)
