Devin, billed as the world's first AI software engineer, is accused of faked demos, false advertising, and stoking anxiety

Author: Not Bald Programmer

In March 2024, Cognition Labs (also known as Cognition AI) launched Devin, billed as the world's first fully autonomous AI software engineer, and raised $21 million.

Cognition AI claims that Devin can independently complete an entire software project in minutes, is capable of complex multi-step reasoning, and can perform thousands of tasks without error.

Now the blogger Internet of Bugs has accused the Devin demo of being faked.

Devin is claimed to work like a real software engineer taking jobs on an outsourcing platform. In practice, Devin could not complete the full task as the client specified. On the one hand, it picked only part of the requirements to complete. On the other hand, unlike a real engineer, it could not propose a plan to the client and confirm the requirements.

Specifically, when the so-called "world's first AI software engineer" was launched, the company claimed that its video showed Devin completing freelance work on Upwork and getting paid. According to the blogger, nothing of the sort happened.

The blogger analyzed Devin's Upwork video frame by frame, spending 36 minutes going through the Upwork task in the video and showing what Devin was supposed to do, what it actually did, and how badly it did it. The debunking is thorough and convincing. The main points are as follows:

  • Devin is advertised as being able to solve arbitrary Upwork tasks. In the video demonstration, however, the problem it solved did not match the customer's request, which was for instructions, not code.
  • The video shows Devin fixing bugs in the source code of a GitHub repository, but the files it edits do not actually exist in that repository, and some of the bugs it fixes are nonsensical, not mistakes a human would make. This implies that Devin may have been fixing bugs in files it created itself, though the video never says so explicitly.
  • No coding was needed at all. The README file in the repository already contains all the instructions needed to complete the task, and a simple one-line modification is enough to make it work, even though the repository is quite old. That is why the customer asked for instructions on running it on EC2, not for code. Devin does not appear to have read the README and did not realize that only a couple of pre-existing Python scripts needed to be executed. The output shown in the video, with its long plan and many completed checkpoints, gives the impression of a complex and sophisticated task, but in reality the work was pointless and redundant.
  • Devin's code modifications are poor. For example, it writes its own low-level file-reading loop instead of using the standard library properly.
  • The video gives the impression that Devin completed the task quickly, and the blogger himself completed the requested task in about 30 minutes, yet the timestamps in the chat log show that Devin's work lasted many hours and continued into the next day.
  • Devin ran some pointless shell commands, such as "head -n 5 foo | tail -n 5".
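Two of the points above are easy to illustrate. The sketch below is not Devin's actual code, which is not reproduced in this article. It is a hypothetical Python example, with made-up function names, of what a hand-rolled low-level file-reading loop looks like next to the standard-library idiom, and of why taking the last 5 lines of the first 5 lines of a file does nothing useful.

```python
import os
import tempfile


def read_lines_low_level(path):
    """Hypothetical hand-rolled version: raw file descriptor, manual chunk loop."""
    fd = os.open(path, os.O_RDONLY)
    try:
        data = b""
        while True:
            chunk = os.read(fd, 4096)  # read up to 4096 bytes at a time
            if not chunk:              # empty bytes means end of file
                break
            data += chunk
    finally:
        os.close(fd)
    return data.decode("utf-8").splitlines()


def read_lines_stdlib(path):
    """Idiomatic version: let the standard library handle buffering and decoding."""
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


def head_then_tail(lines, n):
    """Model of `head -n N | tail -n N` on a list of lines (n must be positive)."""
    first_n = lines[:n]   # head -n N
    return first_n[-n:]   # tail -n N on at most N lines returns them all


def demo():
    lines = [f"line {i}" for i in range(1, 21)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "foo")
        with open(path, "w", encoding="utf-8", newline="\n") as f:
            f.write("\n".join(lines) + "\n")
        # Both readers return the same lines; one is simply far more code.
        assert read_lines_low_level(path) == read_lines_stdlib(path) == lines
    # The tail step is redundant: the result is just the first 5 lines.
    assert head_then_tail(lines, 5) == lines[:5]


if __name__ == "__main__":
    demo()
```

The second function in each pair does the same job with less room for mistakes, which is the kind of thing a reviewer would expect from an experienced engineer.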

Epilogue

When Devin first came out, the overwhelming hype created a lot of anxiety and led many non-technical people to believe that AI might soon replace programmers.

In the words of blogger Internet of Bugs: "The part of a software developer's job that AI doesn't do well, the difficult, critical, complex, and time-consuming part, is mainly communication with customers, bosses, and other stakeholders. Figuring out what exactly needs to be done, going back and forth, and saying, 'It would be a lot easier to do it this way, how about we do that?' are all things that AI can't do right now, and that's exactly what we do."

I'm curious to see how the Devin team will respond. The Internet of Bugs blogger said that he was not targeting the engineers who made Cognition AI's Upwork demo, nor the engineers who developed Devin. He was targeting only the company's promotional tactics, at a time when there is too much hype about AI.
