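Input prices slashed! OpenAI announces Structured Outputs in its API, with 100% JSON accuracy

[Xinzhiyuan summary] Good news for programmers! OpenAI's newest API models all support Structured Outputs, with a 100% JSON Schema match rate in OpenAI's evals and input costs cut in half.

Still racking your brain over prompts, only to end up with a headache from wildly inconsistent outputs?

OpenAI has finally heard the crowd and delivered the feature developers have wanted most.

OpenAI announced today that the OpenAI API now supports Structured Outputs in JSON.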

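JSON (JavaScript Object Notation) is the industry standard for file and data interchange because it is easy for humans to read and easy for machines to parse.

LLMs, however, often fight JSON. They hallucinate, and they either produce responses that follow the instructions only partially or produce gibberish that cannot be parsed at all.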

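Developers have had to work around this with open-source tooling, prompt tweaking, and repeated requests to get the output they want, which costs time and effort.

Structured Outputs, released today, solves these problems by ensuring that model-generated output matches the schema specified in JSON.

Structured Outputs has long been developers' most requested feature, and Altman said in a post that this release came "by popular demand."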

The new feature clearly struck a chord with many developers, who agreed that "This is a big deal."

Others left comments praising it as "Excellent!"

Not everyone is celebrating: the update has once again raised worries that OpenAI will swallow startups.

For many ordinary users, though, the bigger question is when GPT-5 will actually ship. As for JSON Schema: "What's that?"

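After all, with no news of GPT-5, OpenAI's DevDay this fall may be much quieter than last year's.

Schema conformance made easy

With Structured Outputs, you only need to define a JSON Schema, and the model stops being "willful" and outputs data exactly as instructed.

The new feature does more than make the model obedient; it also greatly improves the reliability of its output.

In OpenAI's evals of complex JSON schema following, the new model gpt-4o-2024-08-06 with Structured Outputs scored a perfect 100%. By comparison, gpt-4-0613 scored less than 40%.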

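In fact, OpenAI introduced JSON mode at last year's DevDay. JSON mode makes the model more likely to produce valid JSON, but it does not guarantee that the output conforms to a particular schema.

Now OpenAI has extended this capability in the API so that model-generated output exactly matches the JSON Schema supplied by the developer.

Generating structured data from unstructured input is one of the core use cases for AI in today's applications.

Developers use the OpenAI API to build powerful assistants that fetch data and answer questions through function calling, to extract structured data for data entry, and to build multi-step agentic workflows that let LLMs take actions.

How it works

OpenAI took a two-part approach to improving how well model output matches a JSON Schema.

First, the latest gpt-4o-2024-08-06 model was trained to better understand complex schemas and produce output that matches them.

Although model performance improved significantly, reaching 93% on the benchmark, model behavior remains inherently non-deterministic.

So, to keep developers' applications stable, OpenAI also offers a more reliable way to constrain the model's output, which is what brings reliability to 100%.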

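Constrained decoding

OpenAI uses a technique called constrained sampling, or constrained decoding. By default, a model is entirely unconstrained when it generates output and may pick any token in its vocabulary as the next token.

That flexibility can lead to mistakes, such as emitting a character that is invalid at that position in a JSON document.

To avoid such mistakes, OpenAI uses dynamic constrained decoding to ensure that every generated token conforms to the supplied schema.

To do this, OpenAI converts the supplied JSON Schema into a context-free grammar (CFG).

For each JSON Schema, OpenAI computes a grammar that represents the schema and preprocesses its components so they can be accessed efficiently during sampling.

This makes the output more accurate and also avoids unnecessary latency. The first request with a new schema may incur extra processing time, but subsequent requests are fast thanks to caching.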
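
To make the idea concrete, here is a toy sketch of constrained greedy decoding. It is not OpenAI's implementation: the four-token vocabulary, the `fake_logits` stand-in for a model, and the balanced-bracket "grammar" are all assumptions chosen to keep the example small. The point is only the masking step, where tokens the grammar forbids are excluded before the next token is picked.

```python
import math

# Hypothetical toy vocabulary; "x" stands for a character that is never valid here.
VOCAB = ["[", "]", "x", "<eos>"]


def allowed_tokens(prefix):
    """Return the tokens that keep `prefix` extendable to a balanced-bracket string."""
    depth = prefix.count("[") - prefix.count("]")
    allowed = {"["}                      # opening a new bracket is always legal
    if depth > 0:
        allowed.add("]")                 # can only close what is open
    if depth == 0 and prefix:
        allowed.add("<eos>")             # may stop once everything is closed
    return allowed


def fake_logits(prefix):
    """Stand-in for a model: it 'prefers' the invalid token "x"."""
    return {"x": 3.0, "]": 2.0, "<eos>": 1.5,
            "[": 1.0 if len(prefix) < 2 else -1.0}


def constrained_greedy(max_steps=10):
    prefix = []
    for _ in range(max_steps):
        logits = fake_logits(prefix)
        allowed = allowed_tokens(prefix)
        # Mask: tokens the grammar forbids get a score of -infinity.
        masked = {t: (s if t in allowed else -math.inf) for t, s in logits.items()}
        token = max(masked, key=masked.get)
        if token == "<eos>":
            break
        prefix.append(token)
    return "".join(prefix)


print(constrained_greedy())
```

Without the mask, greedy decoding over `fake_logits` would pick "x" at every step; with it, only grammar-valid tokens can ever be emitted.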

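Alternative approaches

Besides the CFG approach, constrained decoding is commonly implemented with finite state machines (FSMs) or regular expressions.

Those approaches are limited in how they can dynamically update the set of valid tokens, and FSMs in particular tend to struggle with deeply nested or recursive data structures.

OpenAI's CFG approach can express more complex schemas. For example, JSON schemas that use recursion are supported in the OpenAI API but cannot be expressed with an FSM.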
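
As an illustration of what "recursive" means here, the sketch below defines a small schema whose `children` items refer back to the schema root through the standard JSON Schema reference `"$ref": "#"`. The field names and the hand-written `matches` checker are assumptions for this example only; it is not a general JSON Schema validator.

```python
# A node may contain child nodes of the same shape, to any depth.
NODE_SCHEMA = {
    "type": "object",
    "properties": {
        "label": {"type": "string"},
        "children": {"type": "array", "items": {"$ref": "#"}},  # recursion
    },
    "required": ["label", "children"],
    "additionalProperties": False,
}


def matches(node, schema=NODE_SCHEMA):
    """Check one node against NODE_SCHEMA, recursing into its children."""
    if not isinstance(node, dict) or set(node) != set(schema["required"]):
        return False
    if not isinstance(node["label"], str) or not isinstance(node["children"], list):
        return False
    return all(matches(child, schema) for child in node["children"])


def depth(node):
    """Nesting depth of a valid node tree."""
    return 1 + max((depth(child) for child in node["children"]), default=0)


tree = {"label": "root", "children": [
    {"label": "a", "children": [{"label": "a1", "children": []}]},
    {"label": "b", "children": []},
]}
print(matches(tree), depth(tree))
```

Because nothing in the schema bounds the depth, a recognizer has to track how many objects are currently open, which is the kind of unbounded nesting a finite state machine cannot count.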

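Input costs cut in half

Structured Outputs is available on every model that supports function calling, including the latest GPT-4o and GPT-4o-mini models as well as fine-tuned models.

It can be used with the Chat Completions API, the Assistants API, and the Batch API, and it is compatible with vision inputs.

gpt-4o-2024-08-06 is also cheaper than gpt-4o-2024-05-13: developers save 50% on input ($2.50 per 1M input tokens) and 33% on output ($10.00 per 1M output tokens).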
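
A quick worked example of what those discounts mean in practice. The new prices are the ones quoted above; the old prices ($5.00 and $15.00 per 1M tokens) are the values implied by the stated 50% and 33% savings, and the monthly workload is an invented figure for illustration.

```python
# USD per 1M tokens. Old prices are implied by the quoted discounts.
PRICES = {
    "gpt-4o-2024-05-13": {"input": 5.00, "output": 15.00},
    "gpt-4o-2024-08-06": {"input": 2.50, "output": 10.00},
}


def cost_usd(model, input_tokens, output_tokens):
    """Total cost of a workload at per-million-token prices."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


# Hypothetical workload: 10M input tokens and 2M output tokens.
old = cost_usd("gpt-4o-2024-05-13", 10_000_000, 2_000_000)
new = cost_usd("gpt-4o-2024-08-06", 10_000_000, 2_000_000)
print(old, new, f"{1 - new / old:.1%}")
```

For this input-heavy mix the overall saving lands between the two headline figures, because the 50% input discount and the 33% output discount are weighted by how many tokens of each kind the workload uses.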

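How to use Structured Outputs

Structured Outputs is available in the API in two forms:

Function calling

Setting strict: true in a function definition enables Structured Outputs via tools.

This works on all models that support tools, including every model from gpt-4-0613 and gpt-3.5-turbo-0613 onward.

Once Structured Outputs is enabled, the model's output will match the supplied tool definition.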
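
The schemas in this article all set "additionalProperties": false on every object and list every property under "required", which is what strict mode expects. The helper below is a hedged sketch of a pre-flight check for those two conventions; `strict_violations` is a hypothetical name, and the function covers only `properties`, `items`, and `anyOf`, not the full set of rules the API enforces.

```python
def strict_violations(schema, path="$"):
    """Collect places where a schema breaks the two strict-mode conventions."""
    problems = []
    if not isinstance(schema, dict):
        return problems
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not in required: {missing}")
        for name, sub in props.items():
            problems += strict_violations(sub, f"{path}.{name}")
    if "items" in schema:
        problems += strict_violations(schema["items"], f"{path}[]")
    for i, sub in enumerate(schema.get("anyOf", [])):
        problems += strict_violations(sub, f"{path}.anyOf[{i}]")
    return problems


good = {
    "type": "object",
    "properties": {"final_answer": {"type": "string"}},
    "required": ["final_answer"],
    "additionalProperties": False,
}
bad = {"type": "object", "properties": {"final_answer": {"type": "string"}}}

print(strict_violations(good))
print(strict_violations(bad))
```

Running a check like this before sending a request can surface schema problems locally instead of as an API error.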

Example request:

POST /v1/chat/completions
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function."
    },
    {
      "role": "user",
      "content": "look up all my orders in may of last year that were fulfilled but not delivered on time"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "query",
        "description": "Execute a query.",
        "strict": true,
        "parameters": {
          "type": "object",
          "properties": {
            "table_name": {
              "type": "string",
              "enum": ["orders"]
            },
            "columns": {
              "type": "array",
              "items": {
                "type": "string",
                "enum": [
                  "id",
                  "status",
                  "expected_delivery_date",
                  "delivered_at",
                  "shipped_at",
                  "ordered_at",
                  "canceled_at"
                ]
              }
            },
            "conditions": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "column": {
                    "type": "string"
                  },
                  "operator": {
                    "type": "string",
                    "enum": ["=", ">", "<", ">=", "<=", "!="]
                  },
                  "value": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "number"
                      },
                      {
                        "type": "object",
                        "properties": {
                          "column_name": {
                            "type": "string"
                          }
                        },
                        "required": ["column_name"],
                        "additionalProperties": false
                      }
                    ]
                  }
                },
                "required": ["column", "operator", "value"],
                "additionalProperties": false
              }
            },
            "order_by": {
              "type": "string",
              "enum": ["asc", "desc"]
            }
          },
          "required": ["table_name", "columns", "conditions", "order_by"],
          "additionalProperties": false
        }
      }
    }
  ]
}

Example output:

{
  "table_name": "orders",
  "columns": ["id", "status", "expected_delivery_date", "delivered_at"],
  "conditions": [
    {
      "column": "status",
      "operator": "=",
      "value": "fulfilled"
    },
    {
      "column": "ordered_at",
      "operator": ">=",
      "value": "2023-05-01"
    },
    {
      "column": "ordered_at",
      "operator": "<",
      "value": "2023-06-01"
    },
    {
      "column": "delivered_at",
      "operator": ">",
      "value": {
        "column_name": "expected_delivery_date"
      }
    }
  ],
  "order_by": "asc"
}
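
Because the arguments are guaranteed to have this shape, application code can consume them without defensive parsing. The sketch below turns the example output into a parameterized SQL string; `to_sql` and the allowlists are hypothetical names written for this article, not part of any OpenAI SDK. Note that the schema leaves the condition's `column` as a free-form string, so the sketch still checks it against an allowlist, and it ignores `order_by` because the schema gives a direction but no sort column.

```python
import json

ALLOWED_COLUMNS = {"id", "status", "expected_delivery_date", "delivered_at",
                   "shipped_at", "ordered_at", "canceled_at"}
ALLOWED_OPERATORS = {"=", ">", "<", ">=", "<=", "!="}


def to_sql(args):
    """Build (sql, params) from tool-call arguments shaped like the example output."""
    if args["table_name"] != "orders":
        raise ValueError("unknown table")
    if not set(args["columns"]) <= ALLOWED_COLUMNS:
        raise ValueError("unknown column in select list")
    clauses, params = [], []
    for cond in args["conditions"]:
        column, op, value = cond["column"], cond["operator"], cond["value"]
        if column not in ALLOWED_COLUMNS or op not in ALLOWED_OPERATORS:
            raise ValueError("unsupported condition")
        if isinstance(value, dict):          # comparison against another column
            other = value["column_name"]
            if other not in ALLOWED_COLUMNS:
                raise ValueError("unknown column in condition value")
            clauses.append(f"{column} {op} {other}")
        else:                                # literal value goes in as a parameter
            clauses.append(f"{column} {op} ?")
            params.append(value)
    sql = f"SELECT {', '.join(args['columns'])} FROM orders"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


EXAMPLE_ARGS = json.loads("""
{"table_name": "orders",
 "columns": ["id", "status", "expected_delivery_date", "delivered_at"],
 "conditions": [
   {"column": "status", "operator": "=", "value": "fulfilled"},
   {"column": "ordered_at", "operator": ">=", "value": "2023-05-01"},
   {"column": "ordered_at", "operator": "<", "value": "2023-06-01"},
   {"column": "delivered_at", "operator": ">",
    "value": {"column_name": "expected_delivery_date"}}],
 "order_by": "asc"}
""")

sql, params = to_sql(EXAMPLE_ARGS)
print(sql)
print(params)
```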

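A new option for the response_format parameter

Developers can now supply a JSON Schema through json_schema, a new option for the response_format parameter.

This is useful when the model is not calling a tool but is responding to the user in a structured way.

It is available on the latest GPT-4o models: gpt-4o-2024-08-06, released today, and gpt-4o-mini-2024-07-18.

When a response_format is supplied with strict: true, the model's output will match the supplied schema.

Example request: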

POST /v1/chat/completions
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful math tutor."
    },
    {
      "role": "user",
      "content": "solve 8x + 31 = 2"
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "math_response",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "steps": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "explanation": {
                  "type": "string"
                },
                "output": {
                  "type": "string"
                }
              },
              "required": ["explanation", "output"],
              "additionalProperties": false
            }
          },
          "final_answer": {
            "type": "string"
          }
        },
        "required": ["steps", "final_answer"],
        "additionalProperties": false
      }
    }
  }
}

Example output:

{
  "steps": [
    {
      "explanation": "Subtract 31 from both sides to isolate the term with x.",
      "output": "8x + 31 - 31 = 2 - 31"
    },
    {
      "explanation": "This simplifies to 8x = -29.",
      "output": "8x = -29"
    },
    {
      "explanation": "Divide both sides by 8 to solve for x.",
      "output": "x = -29 / 8"
    }
  ],
  "final_answer": "x = -29 / 8"
}

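Developers can use Structured Outputs to have the model produce an answer step by step, guiding it toward the intended output.

According to OpenAI, developers no longer need to validate or retry incorrectly formatted responses, and the feature allows for simpler prompts.

Native SDK support

OpenAI says its Python and Node SDKs have been updated with native support for Structured Outputs.

Supplying a schema for a tool or a response format is as simple as supplying a Pydantic or Zod object. OpenAI's SDKs convert the data type into a supported JSON schema, automatically deserialize the JSON response into the typed data structure, and parse refusals.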

from enum import Enum
from typing import Union

from pydantic import BaseModel

import openai
from openai import OpenAI


# Enums restrict each field to a fixed set of string values.
class Table(str, Enum):
    orders = "orders"
    customers = "customers"
    products = "products"


class Column(str, Enum):
    id = "id"
    status = "status"
    expected_delivery_date = "expected_delivery_date"
    delivered_at = "delivered_at"
    shipped_at = "shipped_at"
    ordered_at = "ordered_at"
    canceled_at = "canceled_at"


class Operator(str, Enum):
    eq = "="
    gt = ">"
    lt = "<"
    le = "<="
    ge = ">="
    ne = "!="


class OrderBy(str, Enum):
    asc = "asc"
    desc = "desc"


# A condition value may refer to another column instead of a literal.
class DynamicValue(BaseModel):
    column_name: str


class Condition(BaseModel):
    column: str
    operator: Operator
    value: Union[str, int, DynamicValue]


# The tool's arguments; the SDK converts this model into a JSON Schema.
class Query(BaseModel):
    table_name: Table
    columns: list[Column]
    conditions: list[Condition]
    order_by: OrderBy


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function.",
        },
        {
            "role": "user",
            "content": "look up all my orders in may of last year that were fulfilled but not delivered on time",
        },
    ],
    tools=[
        openai.pydantic_function_tool(Query),
    ],
)

# parsed_arguments is a Query instance built from the tool call's JSON arguments.
print(completion.choices[0].message.tool_calls[0].function.parsed_arguments)

Native Structured Outputs support is also available for response_format.

from pydantic import BaseModel

from openai import OpenAI


class Step(BaseModel):
    explanation: str
    output: str


class MathResponse(BaseModel):
    steps: list[Step]
    final_answer: str


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "solve 8x + 31 = 2"},
    ],
    response_format=MathResponse,
)

message = completion.choices[0].message

# parsed holds a MathResponse; if the model refused, refusal holds the reason.
if message.parsed:
    print(message.parsed.steps)
    print(message.parsed.final_answer)
else:
    print(message.refusal)

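Other use cases

Developers frequently use OpenAI's models to generate structured data for a variety of use cases.

Some additional examples include:

- Dynamically generating user interfaces based on the user's intent

Developers can use Structured Outputs to build code-generation or UI-generation applications.

With the same response_format, different UIs can be generated from different user input.

For example, a "landing page for a gardener" was generated from the following structured output: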

{
  "type": "div",
  "label": "",
  "children": [
    {
      "type": "header",
      "label": "",
      "children": [
        {
          "type": "div",
          "label": "Green Thumb Gardening",
          "children": [],
          "attributes": [{ "name": "className", "value": "site-title" }]
        },
        {
          "type": "div",
          "label": "Bringing Life to Your Garden",
          "children": [],
          "attributes": [{ "name": "className", "value": "site-tagline" }]
        }
      ],
      "attributes": [{ "name": "className", "value": "header" }]
    },
    {
      "type": "section",
      "label": "",
      "children": [
        {
          "type": "div",
          "label": "",
          "children": [
            {
              "type": "div",
              "label": "About Us",
              "children": [
                {
                  "type": "div",
                  "label": "At Green Thumb Gardening, we specialize in transforming your outdoor spaces into beautiful, thriving gardens. Our team has decades of experience in horticulture and landscape design.",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "about-description" }
                  ]
                }
              ],
              "attributes": [{ "name": "className", "value": "about-section" }]
            }
          ],
          "attributes": [{ "name": "className", "value": "content" }]
        }
      ],
      "attributes": [{ "name": "className", "value": "about-container" }]
    },
    {
      "type": "section",
      "label": "",
      "children": [
        {
          "type": "div",
          "label": "",
          "children": [
            {
              "type": "div",
              "label": "Our Services",
              "children": [
                {
                  "type": "div",
                  "label": "Garden Design",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                },
                {
                  "type": "div",
                  "label": "Plant Care & Maintenance",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                },
                {
                  "type": "div",
                  "label": "Seasonal Cleanup",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                },
                {
                  "type": "div",
                  "label": "Custom Landscaping",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                }
              ],
              "attributes": [{ "name": "className", "value": "services-list" }]
            }
          ],
          "attributes": [{ "name": "className", "value": "content" }]
        }
      ],
      "attributes": [{ "name": "className", "value": "services-container" }]
    }
  ],
  "attributes": [{ "name": "className", "value": "landing-page" }]
}
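
One way an application might consume such a tree is a short recursive renderer. The sketch below is an assumption about how a client could do it, not OpenAI's demo code: `render` is a hypothetical helper that maps the `className` attribute to HTML's `class` and trusts `type` to be a tag name constrained by the schema.

```python
from html import escape


def render(node):
    """Render one UI node, and its children recursively, as an HTML string."""
    parts = []
    for attr in node["attributes"]:
        # The tree uses the React-style name; plain HTML calls it "class".
        name = "class" if attr["name"] == "className" else attr["name"]
        parts.append(f' {name}="{escape(attr["value"])}"')
    inner = escape(node["label"]) + "".join(render(c) for c in node["children"])
    tag = node["type"]  # assumed to be limited to safe tag names by the schema
    return f"<{tag}{''.join(parts)}>{inner}</{tag}>"


header = {
    "type": "header",
    "label": "",
    "children": [
        {
            "type": "div",
            "label": "Green Thumb Gardening",
            "children": [],
            "attributes": [{"name": "className", "value": "site-title"}],
        }
    ],
    "attributes": [{"name": "className", "value": "header"}],
}

print(render(header))
```

Labels and attribute values are escaped, so a label such as "Plant Care & Maintenance" is emitted with `&amp;` rather than a raw ampersand.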

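- Separating a final answer from supporting reasoning or additional commentary

Giving the model a separate field for its chain of thought can improve the final quality of the response.

Request: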

{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant"
    },
    {
      "role": "user",
      "content": "9.11 and 9.9 -- which is bigger?"
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "reasoning_schema",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "reasoning_steps": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "description": "The reasoning steps leading to the final conclusion."
          },
          "answer": {
            "type": "string",
            "description": "The final answer, taking into account the reasoning steps."
          }
        },
        "required": ["reasoning_steps", "answer"],
        "additionalProperties": false
      }
    }
  }
}

Structured output:

{
  "reasoning_steps": [
    "First step is to compare the numbers 9.11 and 9.9.",
    "Both numbers have the same whole number part, which is 9.",
    "To compare the decimal parts, convert them to the same number of decimal places.",
    "9.11 has two decimal places: it is 9.11.",
    "9.9 has one decimal place: it can be rewritten as 9.90.",
    "Now, compare 9.11 and 9.90 by looking at the decimal parts.",
    "Compare 11 with 90.",
    "90 is greater than 11, so 9.90 is greater than 9.11."
  ],
  "answer": "9.9 is bigger than 9.11."
}

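- Extracting structured data from unstructured data

For example, instructing the model to extract action items, due dates, and owners from meeting notes.

Request: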

POST /v1/chat/completions
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "Extract action items, due dates, and owners from meeting notes."
    },
    {
      "role": "user",
      "content": "...meeting notes go here..."
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "action_items",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "action_items": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "description": {
                  "type": "string",
                  "description": "Description of the action item."
                },
                "due_date": {
                  "type": ["string", "null"],
                  "description": "Due date for the action item, can be null if not specified."
                },
                "owner": {
                  "type": ["string", "null"],
                  "description": "Owner responsible for the action item, can be null if not specified."
                }
              },
              "required": ["description", "due_date", "owner"],
              "additionalProperties": false
            },
            "description": "List of action items from the meeting."
          }
        },
        "required": ["action_items"],
        "additionalProperties": false
      }
    }
  }
}

Structured output:

{
  "action_items": [
    {
      "description": "Collaborate on optimizing the path planning algorithm",
      "due_date": "2024-06-30",
      "owner": "Jason Li"
    },
    {
      "description": "Reach out to industry partners for additional datasets",
      "due_date": "2024-06-25",
      "owner": "Aisha Patel"
    },
    {
      "description": "Explore alternative LIDAR sensor configurations and report findings",
      "due_date": "2024-06-27",
      "owner": "Kevin Nguyen"
    },
    {
      "description": "Schedule extended stress tests for the integrated navigation system",
      "due_date": "2024-06-28",
      "owner": "Emily Chen"
    },
    {
      "description": "Retest the system after bug fixes and update the team",
      "due_date": "2024-07-01",
      "owner": "David Park"
    }
  ]
}

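Safe Structured Outputs

Safety is OpenAI's top priority: the new Structured Outputs feature abides by OpenAI's existing safety policies and still allows the model to refuse unsafe requests.

To make development simpler, API responses carry a new refusal string value that lets developers detect programmatically whether the model generated a refusal instead of output matching the schema.

When the response contains no refusal and was not cut off prematurely (as indicated by finish_reason), the model's response will reliably be valid JSON that matches the supplied schema.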
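
Those two checks can be sketched as a small guard around JSON parsing. The function below works on a plain dictionary shaped like a Chat Completions response, so it runs without a network call; `extract_structured` is a hypothetical helper name, and raising `ValueError` is just one possible way to report the two failure cases.

```python
import json


def extract_structured(response):
    """Return the parsed JSON content, or raise if it cannot be trusted."""
    choice = response["choices"][0]
    message = choice["message"]
    if message.get("refusal"):
        # The model declined; there is no schema-conforming content to parse.
        raise ValueError(f"model refused: {message['refusal']}")
    if choice["finish_reason"] != "stop":
        # For example "length": the JSON may have been cut off mid-object.
        raise ValueError(f"incomplete output: finish_reason={choice['finish_reason']}")
    return json.loads(message["content"])


ok = {"choices": [{"finish_reason": "stop",
                   "message": {"content": '{"answer": "9.9 is bigger than 9.11."}',
                               "refusal": None}}]}
print(extract_structured(ok))
```

Only after both checks pass does the guarantee described above apply, which is why the sketch parses the content last.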
