
Llama 3 Failed to Force Out GPT-5! OpenAI Doubles Down on the To B Battlefield

Author | 51CTO

Compile | Yifeng

Produced by | 51CTO Technology Stack (WeChat ID: blog51cto)

Meta is the AI superstar of the week: the just-launched Llama 3 has quickly climbed the LLM leaderboards on the strength of its performance and open-source ecosystem.

By most accounts, the open-source Llama 3 has reached roughly "GPT-3.7" level, which inevitably makes paying for GPT-4 look less attractive, especially to enterprise users, who have the resources to deploy Llama 3 themselves. Netizens promptly "encouraged" OpenAI to ship GPT-5 if it wants to keep its seat on the large-model throne.

Not only netizens: even OpenAI researchers couldn't sit still, taking to X with cryptically worded posts.


Now netizens are speculating even more anxiously that the powerful Llama 3 may disrupt GPT-5's release schedule, perhaps even pulling it forward to April 22.

It wasn't until Thursday that OpenAI's long-awaited countermove finally arrived.


OpenAI has rolled out an expanded set of enterprise-grade API capabilities for customers: a richer Assistants API, new tools designed to enhance security and administrative control, and better ways to manage costs.

OpenAI has high hopes for the release: "When you talk to developers and businesses about doing meaningful work with AI models, OpenAI is still ahead of the curve," said Olivier Godement, head of API product at OpenAI.

However, OpenAI's roundabout strategy surprised many. One unmoved netizen replied: "Did you spell GPT-5 wrong?"


However, as Zuckerberg candidly said in an interview, Meta will open-source its models, but not its products. Great products are a company's real moat and cash cow. The era of blindly burning money has passed; the main theme of AI now is extracting more business value.

OpenAI's high-profile enterprise upgrade at this moment signals its determination to fight on the To B track. So will the capabilities of this newly upgraded API make enterprises willing to pay?

1. Private Link and enhanced security features

Chief among the security upgrades, the new API offering introduces Private Link, which allows direct communication between Microsoft's Azure cloud services and OpenAI, helping minimize the "exposure to the open internet" of customer data and queries sent through the API.

This addition complements the existing security stack, including SOC 2 Type II certification, single sign-on (SSO), AES-256 encryption of data at rest, TLS 1.2 encryption in transit, and role-based access control.

In addition, OpenAI has introduced native multi-factor authentication (MFA) to strengthen access controls to meet the growing demand for compliance.

For healthcare companies that require HIPAA compliance, OpenAI continues to offer a business associate agreement and a zero data retention policy for eligible API customers.

2. Upgraded Assistants API handles 500 times more files

One less-hyped but highly important enterprise product from OpenAI is its Assistants API. It lets enterprises deploy the custom fine-tuned models they train, pull in specific documents via retrieval-augmented generation (RAG), and offer conversational assistants on top.

For example, e-commerce company Klarna boasted earlier this year that its AI assistant, built on the OpenAI Assistants API, does the work of 700 full-time human agents, cut repeat queries by 25%, and reduced resolution time by nearly 82% (from 11 minutes to 2).

OpenAI has upgraded the Assistants API with a new "file_search" feature that improves file retrieval; each assistant can now handle up to 10,000 files.

This represents a 500x increase over the previous limit of 20 files, and comes with additional features such as parallel queries, improved reranking, and query rewriting.

Additionally, the API now supports streaming for real-time conversational responses, meaning models like GPT-4 Turbo and GPT-3.5 Turbo return output as soon as it is generated, rather than making the client wait for the full response.

It also integrates new "vector_store" objects for better file management and offers more granular token-usage controls to help manage costs effectively.
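The streaming behavior described above can be sketched as follows. This is a minimal illustration of the consumption pattern only: the generator below is a stand-in for the SDK's streamed delta events, not a live API call, and the strings are placeholder output.

```python
def fake_stream():
    """Stand-in generator for streamed text deltas (no live API call)."""
    for delta in ["Stream", "ing ", "cuts ", "perceived ", "latency."]:
        yield delta

def consume(stream):
    """Accumulate deltas as they arrive; a real client would render
    each piece immediately instead of waiting for the full reply."""
    parts = []
    for delta in stream:
        parts.append(delta)  # e.g. print(delta, end="", flush=True)
    return "".join(parts)

full_text = consume(fake_stream())
print(full_text)  # → Streaming cuts perceived latency.
```

The point of streaming is exactly this incremental assembly: the user starts reading after the first delta arrives, so perceived latency drops even though total generation time is unchanged.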

3. New "Projects" feature controls access to specific tasks

A new feature called "Projects" provides improved administrative oversight that allows organizations to manage roles and API keys at the project level.

This feature allows enterprise customers to limit permissions, control available models, and set usage-based limits to avoid unexpected costs—enhancements that promise significant simplification of project management.

Essentially, they can isolate a fine-tuned model, or even a stock model, to a specific task or set of documents, and allow only specific people to work on each task.

So if your business has one team working on a set of public-facing documents and another working on confidential internal documents, you can give each its own project within OpenAI's API, and the two teams can use AI models separately without mixing data or jeopardizing the confidential set.

"As more organizations and even individual developers deploy AI, they want to do things in a constrained box," said Miqdad Jaffer, a member of OpenAI's product team, in the same video interview with VentureBeat. "What Projects lets you do is isolate your resources and your people into a small, personalized project. You get a separate usage report. You have the ability to control access, security, latency, throughput, and cost, and an organization can really be built in a very secure way. If you're a solo developer, you can deploy hundreds of projects without any worries."

This last point is especially helpful for development teams that consult or work with multiple clients at the same time.
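The isolation model described above can be sketched with a small data structure. This is purely illustrative: OpenAI enforces these rules server-side, and the project names, keys, model lists, and budget field are all hypothetical placeholders, not OpenAI's actual schema.

```python
# Hypothetical sketch of project-level isolation: each project carries its
# own API key, allowed models, and spending limit, mirroring the controls
# the Projects feature exposes (keys, model allow-lists, usage caps).
projects = {
    "public-docs":   {"api_key": "sk-proj-AAA", "models": {"gpt-3.5-turbo"}, "budget_usd": 100},
    "internal-docs": {"api_key": "sk-proj-BBB", "models": {"gpt-4-turbo"},   "budget_usd": 500},
}

def can_call(project: str, model: str, spent_usd: float) -> bool:
    """A request is allowed only if it stays inside its project's
    model allow-list and remaining budget."""
    cfg = projects[project]
    return model in cfg["models"] and spent_usd < cfg["budget_usd"]

# The public-docs team cannot reach the model reserved for internal work,
# and an over-budget project is cut off regardless of model.
print(can_call("public-docs", "gpt-4-turbo", 10.0))    # → False
print(can_call("internal-docs", "gpt-4-turbo", 600.0)) # → False
```

Keeping a separate key and limits per project is what makes the two-team scenario above safe: a leaked public-docs key never grants access to the internal-docs resources.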

4. Other new upgrades

To further help organizations scale their AI operations in an economical way, OpenAI has introduced new cost management capabilities.

These include discounted rates for customers with consistent levels of token usage per minute, as well as a 50% cost reduction for asynchronous workloads through a new Batch API that also has higher rate limits and a commitment to deliver results within 24 hours.

However, to use it, customers must bundle their batch of requests together in a single upload – whatever prompts or files they want the AI model to analyze – and be willing to wait up to 24 hours for OpenAI's models to respond.

While that may sound like a long time, OpenAI's executives told VentureBeat that responses can come back in as little as 10 to 20 minutes.

It's designed for customers and businesses that don't need an instant response – say, an investigative journalist who wants GPT-4 Turbo to sift through a pile of government documents and pull out the relevant details for a long-form feature.

Or a business preparing a report on its financial performance over the past few weeks, where the deadline is days away rather than minutes.
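Preparing such a batch upload can be sketched like this. The field names follow OpenAI's published batch input format (one JSON request per line of a JSONL file), but the model name and prompts here are placeholders, and the upload and batch-creation API calls themselves are omitted.

```python
import json

# Placeholder workload: one request per document to summarize.
prompts = ["Summarize document A.", "Summarize document B."]

# Build the JSONL input file for the Batch API: each line is a complete,
# self-describing request with a custom_id for matching results to inputs.
lines = []
for i, prompt in enumerate(prompts):
    lines.append(json.dumps({
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4-turbo",
            "messages": [{"role": "user", "content": prompt}],
        },
    }))

# This string would be written to a .jsonl file, uploaded, and referenced
# when creating the batch; results arrive within the 24-hour window.
batch_jsonl = "\n".join(lines)
```

Because every line carries its own `custom_id`, the asynchronous results file can be joined back to the original documents no matter what order OpenAI returns them in.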

As OpenAI enhances its offerings with a focus on enterprise-grade security, administrative controls, and cost management, the update signals the company's interest in giving businesses a more "plug-and-play" experience – a response to the liftoff of Llama 3 and the rise of open models like Mistral, which may require more setup on the enterprise side.

Reference link: https://venturebeat.com/ai/openai-shrugs-off-metas-llama-3-ascent-with-new-enterprise-ai-features/

Source: 51CTO Technology Stack
