OpenAI releases Assistants API update: up to 10,000 files can be retrieved

Tencent Technology News reported on April 18 that, according to foreign media, on Wednesday local time in the United States, artificial intelligence startup OpenAI released an update to its Assistants API, adding a number of features to this tool for building AI assistants, including a faster and more accurate file search tool, vector stores, and a tool-choice parameter.

The beta version of the Assistants API launched in November 2023; the new version is identified as OpenAI-Beta: assistants=v2. The tool is designed to help developers build assistants for a variety of software tasks: it can invoke OpenAI models with specific instructions, access multiple tools in parallel, support persistent threads, and work with files in multiple formats.

Each assistant can retrieve up to 10,000 files

In the latest update, OpenAI has introduced an improved retrieval tool called file_search. Each assistant can now retrieve up to 10,000 files, a 500-fold increase over the previous limit. Retrieval is also significantly faster, with support for multithreaded and parallel queries. In addition, file_search gains improved reranking and query rewriting, making search results more accurate and useful.
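As a rough sketch, enabling the new tool comes down to listing file_search among an assistant's tools. The request body below mirrors the Assistants API v2 REST fields; the model name and instructions are illustrative, and the OpenAI-Beta header is what selects the v2 beta.

```python
# Sketch of a request for creating an assistant with the improved
# file_search tool (Assistants API v2). Field names follow the OpenAI
# REST API; the header below opts in to the v2 beta.
headers = {
    "Authorization": "Bearer $OPENAI_API_KEY",  # placeholder, not a real key
    "OpenAI-Beta": "assistants=v2",             # select the v2 Assistants API
}

create_assistant_body = {
    "model": "gpt-4-turbo",                     # illustrative model choice
    "name": "Docs Helper",
    "instructions": "Answer questions using the attached files.",
    # Enabling file_search lets the assistant query up to 10,000 files.
    "tools": [{"type": "file_search"}],
}
```

The same structure is accepted by the official SDKs' assistant-creation helpers, which set the beta header for you.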

In addition to file_search, OpenAI has introduced a vector_store object in the Assistants API. When files are added to a vector store, they are automatically parsed, chunked, and embedded, making them ready for efficient searching. Vector stores simplify file management and billing, and can be shared across assistants and threads, giving users a more flexible and convenient experience.
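A minimal sketch of how this fits together, using illustrative request bodies (the file and store IDs are hypothetical): a vector store is created from previously uploaded files, and its ID is then handed to an assistant or thread via tool_resources.

```python
# Sketch: create a vector store from uploaded files, then attach it to an
# assistant via tool_resources (field names per the Assistants API v2).
create_vector_store_body = {
    "name": "Product Documentation",
    # Hypothetical IDs of files previously uploaded via the Files API;
    # each file is parsed, chunked, and embedded automatically.
    "file_ids": ["file-abc123", "file-def456"],
}

# The resulting store (id assumed here) can be shared across assistants
# and threads by referencing it in tool_resources.
attach_body = {
    "tool_resources": {
        "file_search": {"vector_store_ids": ["vs_abc123"]}
    }
}
```

Because the store is referenced by ID rather than copied, several assistants and threads can search the same embedded files without re-uploading them.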

Managing threads and messages

Together, threads and messages form the flow of conversation between the assistant and the user. The Assistants API places no limit on the number of messages a thread can store, so users can hold long, in-depth conversations. When the total size of the messages exceeds the model's context window, however, the thread truncates intelligently, dropping the messages it judges least important first.

When you create a thread, you can include an initial list of messages in many forms, such as text, images, or file attachments. To make it easier to add files to threads, OpenAI provides tool_resources as a helper; users can also attach files directly through the thread's tool_resources. Note that user-created messages do not yet support image files, though OpenAI plans to add this capability in the future.

The Assistants API automatically manages the truncation of messages, ensuring that they stay within the maximum context length that the model can handle.
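The options above can be sketched as a single thread-creation body: an initial user message carries a file attachment routed to file_search, while thread-level files arrive through tool_resources. All IDs are hypothetical placeholders.

```python
# Sketch of a thread-creation body (Assistants API v2). The message-level
# attachment and the thread-level tool_resources are the two ways to make
# files searchable in the thread; IDs here are hypothetical.
create_thread_body = {
    "messages": [
        {
            "role": "user",
            "content": "Summarize the attached report.",
            # Attach a file to this message and route it to file_search.
            "attachments": [
                {"file_id": "file-abc123", "tools": [{"type": "file_search"}]}
            ],
        }
    ],
    # Thread-level files can also be supplied via an existing vector store.
    "tool_resources": {"file_search": {"vector_store_ids": ["vs_abc123"]}},
}
```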

Controlling the maximum number of tokens used

OpenAI gives users more control by letting them cap the number of tokens used per run in the Assistants API. This helps manage token costs and makes better use of resources. Users can also limit how many previous or recent messages are included in each run, to suit different scenarios and needs.

To control token usage precisely in a single run (Run), users can set two parameters, max_prompt_tokens and max_completion_tokens, when creating the run. These limits apply to the total number of tokens used across the entire run lifecycle, keeping each run within a predetermined resource budget.

As a practical example, suppose we initialize a run with max_prompt_tokens set to 500 and max_completion_tokens set to 1000. The first completion will truncate the thread to 500 tokens and cap the output at 1000 tokens. If the first completion uses only 200 prompt tokens and 300 completion tokens, the second completion has the remaining 300 prompt tokens and 700 completion tokens available.
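The arithmetic in that example is just run-level budgeting, which a small helper makes explicit (the function name is ours, not part of the API):

```python
def remaining_budget(max_prompt, max_completion, used_prompt, used_completion):
    """Tokens still available for later completions within one run.

    max_prompt_tokens / max_completion_tokens apply to the total usage
    across the entire run lifecycle, not to each completion separately.
    """
    return max_prompt - used_prompt, max_completion - used_completion

# The example from the text: max_prompt_tokens=500, max_completion_tokens=1000,
# with the first completion using 200 prompt and 300 completion tokens.
print(remaining_budget(500, 1000, 200, 300))  # → (300, 700)
```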

If the completion process reaches the max_completion_tokens limit, the run terminates with an incomplete status.

Polling helpers

To keep the run's status up to date, the user needs to retrieve the Run object periodically. On each retrieval, the user can check the run's status and decide what the application should do next. To simplify this process, OpenAI provides polling helpers in the Node and Python SDKs.

These helpers automatically poll the Run object and return it once it reaches a terminal state. This greatly reduces the user's workload, letting them focus on the application's core logic.

While a run is in the in_progress state (i.e., a non-terminal state), the thread is locked: users cannot add new messages to the thread or create new runs on it. This mechanism keeps threads consistent while a run is being processed, avoiding conflicts and errors from concurrent operations.
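The loop the SDK helpers wrap can be sketched generically. The terminal-state list below reflects the v2 run statuses (including the new incomplete status from token limits); the retrieve callable is an assumption standing in for an SDK call such as a closure over the Python SDK's run-retrieval method.

```python
import time

# Run statuses after which polling can stop (Assistants API v2).
TERMINAL_STATES = {"completed", "failed", "cancelled", "expired", "incomplete"}

def poll_run(retrieve, interval=1.0):
    """Poll until the run reaches a terminal state.

    `retrieve` is any zero-argument callable returning an object with a
    `.status` attribute; in practice it would wrap an SDK retrieval call.
    The SDKs' built-in polling helpers implement this same loop for you.
    """
    while True:
        run = retrieve()
        if run.status in TERMINAL_STATES:
            return run
        time.sleep(interval)  # avoid hammering the API between checks
```

While this loop runs and the status is non-terminal, the thread stays locked, so the application should not attempt to add messages or start another run on it.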

In addition, OpenAI has made the following updates:

OpenAI has added a tool_choice parameter that can force the use of a specific tool, such as file_search, code_interpreter, or a function, in a particular run.

Users can now also create messages with the assistant role to build custom conversation histories in threads.

Assistants and Runs now support popular model configuration parameters such as temperature, response_format (JSON mode), and top_p.
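These new knobs can all be combined in a single run-creation body. The sketch below uses illustrative values and a hypothetical assistant ID; field names follow the Assistants API v2.

```python
# Sketch of a run-creation body combining the new options: forcing the
# file_search tool, sampling/format parameters, and token caps.
create_run_body = {
    "assistant_id": "asst_abc123",               # hypothetical ID
    "tool_choice": {"type": "file_search"},      # force this tool for the run
    "temperature": 0.2,                          # lower = more deterministic
    "top_p": 0.9,
    "response_format": {"type": "json_object"},  # JSON mode
    "max_prompt_tokens": 500,                    # caps from the earlier section
    "max_completion_tokens": 1000,
}
```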

In addition, users can use fine-tuned models in the Assistants API. Currently, only fine-tuned versions of gpt-3.5-turbo-0125 are supported.
