
Adobe trained Firefly on rival AI images, and employee leaks sparked controversy: users call it shameless money-making

Author: Not-Bald Programmer

The AIGC world really is magical: the face-slapping moment is always on its way!

For example, Mistral, long regarded as an open-source unicorn, began weighing the "balance between mission and commercial interests" and launched its closed-source flagship model, Large. Of course, the last company to say something like this was OpenAI.

In the eyes of outsiders, however, the reason comes down to four words: I, want, to, earn!

For another example, Perplexity AI, the AI search darling that initially attacked Google for flooding search with ads, has announced that it will start selling ad space on its own website, and has changed its tune on monetizing through ads: as long as the ads are good enough, they won't hurt the user experience.

So who is today's target of netizens' "face-slapping"? Well, this time it's Adobe.


Questioning Midjourney,

understanding Midjourney,

becoming Midjourney?

When Adobe released its image-generating software, Firefly, last year, the company said that the AI model was primarily trained on Adobe Stock, with a database of hundreds of millions of licensed images. Adobe claims that Firefly is a "commercially safe" alternative to competitors like Midjourney, which learns by scraping images from the internet.

In addition, Adobe has criticized its competitors' data collection practices. Scott Belsky, the company's chief strategy officer, said last year that other models were built on "publicly scraped" data.

One reason Firefly is better than OpenAI's comparable models, Adobe says on its website, is that it respects the creative community and trains only on licensed or freely usable data. And in a blog post last March titled "Responsible Innovation in the Age of Generative AI," general counsel Dana Rao noted that generative AI is "only as good as the quality of its training data."

"Trained on a curated, diverse dataset, your model naturally has a competitive advantage in terms of commercial security and ethical outcomes," she wrote. At the same time, she pointed out that Adobe's training on Firefly is based on Adobe Stock images, licensed content, and public domain content whose copyrights have expired.

Ashley Still, senior vice president at Adobe, said at a Bloomberg Intelligence event earlier this month: "When we launched Firefly, our enterprise customers came to us and said, 'We love what you're doing, and we're really grateful that you're not stealing all of our intellectual property on the open internet.'"

However, across its many speeches and public articles, Adobe never made clear that the model it touted as safer than its competitors' actually used some images generated by those very competitors.

A carefully built public image often collapses from the inside. According to the latest revelations, the behind-the-scenes reality is that Adobe also relied to some extent on AI-generated content to train Firefly, including content produced by those AI competitors.


Bloomberg report: Adobe's 'Ethical' AI Tools Used Rival AI Images for Training


Employees couldn't stand it anymore and revealed that rival material was used for training

AI-generated content made its way into Firefly's training set because creators are allowed to submit images made with other companies' technology, millions of them, to the Adobe Stock marketplace. "The generative AI images in the Adobe Stock collection are a small part of the Firefly training dataset," Adobe representative Michelle Haarhoff wrote last September in a Discord group for photographers and artists.

Adobe says only a relatively small fraction (about 5%) of the images used to train its AI tools are generated by other AI platforms. "Every image submitted to Adobe Stock, including a small selection of AI-generated images, goes through a rigorous review process to ensure it does not contain intellectual property, trademarks, recognizable characters or logos, or the artist's name," a company spokesperson said.

The practice has caused frustration within the company: according to several employees familiar with Firefly's development (who requested anonymity because the discussions were private), there has been internal debate over the ethics and optics of incorporating AI-generated images into the model since Firefly's early days. Some have suggested gradually reducing the system's use of generated images, but one insider said there are no plans to do so at this time.


Adobe Stock has added a number of AI-generated images

Adobe has never publicly acknowledged that Firefly was trained in part on imagery from rival tools it had implied was unethical. According to information viewed by Bloomberg, however, Adobe did disclose these details in at least two online Discord discussion groups run by the company: one for Adobe Stock, the other specifically for Firefly.


Confirmed by a user: awkwardly, the AI images used for training really did earn a bounty

In March 2023, Adobe released a beta version of Firefly. That month, Raúl Cerón, who works with the Adobe Stock community, posted on Discord that the company did not plan to use generated images to train the upcoming public release of Firefly.

"Once we're done testing and officially launched, we'll build a new training database for it, which won't contain content for generative AI. He wrote in a June post.

When Adobe announced Firefly's public launch on September 13, the company also paid a special "Firefly bounty" to Adobe Stock contributors whose content "was used to train the first commercial Firefly model." According to a Discord message from Mat Hayward, who works with the Adobe Stock community, contributors who use generative AI were among those who received the bounty.

Hayward wrote that the AI-generated images in Adobe Stock "enhanced our training dataset, and we decided to include them in the commercially released version of Firefly."

Users have since confirmed this. One user uploaded Midjourney images to Adobe, which promptly used them for training, and he even received a bounty.

Brian Penny is a writer and stock image contributor who has submitted thousands of AI-generated images to Adobe Stock, most of them made with Midjourney. When he received the bonus, he was surprised, because he didn't think he was eligible for it as an AI contributor. Despite the financial gain, Penny thinks including contributions like his in Firefly's training was a bad decision, and says the company should be more candid about how it trained its image creation software.

"They need to be ethical, they need to be more transparent, they need to do more," he said. ”

Since Adobe Stock officially began accepting AI content at the end of 2022, its library has flourished. Today, about 57 million images, roughly 14% of the total, are tagged as AI-generated. Artists submitting AI images must indicate that the work was created with the technology, but they are not required to say which tool was used. To feed its AI training set, Adobe has also offered to compensate contributors for submitting large numbers of photos, such as images of bananas or flags, for AI training.


What exactly is ethical, responsible AI? It's a mess

Massive amounts of data are needed to train the AI models that underpin popular content creation products, and AI companies' use of copyrighted material in the process is coming under increasing scrutiny.

Companies such as Midjourney, OpenAI, the maker of Dall-E, and Stability AI, the maker of Stable Diffusion, have used image datasets scraped from the internet to build their media generation models, a practice that has sparked outrage and lawsuits from many artists.

As an assistant professor who studies the legal and ethical implications put it: "This shows the ambiguity of the definition of responsible AI, and the difficulty of moving away from the social, cultural, and ethical issues (if not the legal ones) that come with generated content."

Adobe's decision to build Firefly using content it holds the rights to, or that is in the public domain, was apparently meant to differentiate its AI image tools in the fast-growing generative AI market.

The company advertises Firefly as a more ethical and legal option for customers who want to generate images from simple prompts but are concerned about potential copyright issues. Adobe also says the tool does not generate content based on other people's intellectual property or branding, and that it avoids producing harmful images.

Rebecca Tushnet, a Harvard University professor who focuses on copyright and advertising law, said that training on AI-generated content may not make Adobe's Firefly image generator less commercially safe, and that as long as the company doesn't mislead consumers, it doesn't need to disclose what it trained on. However, training on AI images like those made with Midjourney undercuts the idea that Firefly is different from competing services.

"Adobe basically wants to position itself as a higher-level alternative, but it also wants very cheap input, and AI is a really good way to get cheap input," she said.

So what exactly counts as ethical, responsible AI? I suspect many people find the whole thing a mess by now.
