【New Zhiyuan Guide】A gospel for programmers! OpenAI's newly launched model API supports structured output, matching JSON Schemas with up to 100% reliability, with input costs cut in half.
Still racking your brains over piles of prompts, only to end up with a headache from wildly inconsistent outputs?
OpenAI has finally heard the voice of the people and delivered the feature developers have wanted most.
OpenAI today announced a new feature: the ChatGPT API now supports structured JSON output.
JSON (JavaScript Object Notation) is the industry standard for file and data exchange because it is both human-readable and easy for machines to parse.
However, LLMs often fight against JSON: they hallucinate, generating responses that only partially follow instructions, or produce indecipherable output that cannot be fully parsed.
This forces developers to juggle multiple open-source tools, try different prompts, or repeat requests to get the desired output, which is time-consuming and labor-intensive.
With today's release of structured output, this tough nut is finally cracked: the model's output is guaranteed to match the JSON schema the developer specifies.
Structured output has long been the number one request from developers, and Altman said in a tweet that this release came at the request of a wide range of users.
OpenAI's new feature clearly hit home for many developers, who agreed: "This is a big deal."
Comments poured in expressing admiration, with many simply exclaiming "Excellent!"
Some are happy while others worry: this OpenAI update raises fears that it will swallow up startups.
For more casual users, though, the bigger question is when exactly GPT-5 will be released; as for JSON schemas, "What's that?"
After all, without news of GPT-5, OpenAI's DevDay this fall may be much quieter than last year.
Easily ensure schema consistency
With structured output, developers only need to define a JSON schema; the AI is no longer "capricious" and obediently outputs data according to that schema.
The new feature not only makes the AI more obedient but also greatly improves the reliability of its output.
With structured output, the new gpt-4o-2024-08-06 model scores a perfect 100% on OpenAI's eval of complex JSON schema following. In comparison, GPT-4-0613 scored less than 40%.
In fact, OpenAI first introduced JSON mode at last year's DevDay.
Now, OpenAI has extended this functionality into the API, ensuring that the output generated by the model exactly matches the JSON schema provided by the developer.
Generating structured data from unstructured inputs is one of the core use cases for AI in today's applications.
Developers use the OpenAI API to build powerful assistants that can fetch data and answer questions through function calls, extract structured data for data input, and build multi-step agentic workflows that allow LLMs to take action.
Technical principle
OpenAI takes a two-pronged approach to making model output match the JSON schema.
The latest gpt-4o-2024-08-06 model is trained to better understand complex schemas and generate outputs that match them.
Although model performance has improved significantly, achieving 93% accuracy in benchmarks, inherent uncertainties remain.
To give developers building applications a firm guarantee, OpenAI also provides a stricter way to constrain the model's output, achieving 100% reliability.
Constrained decoding
OpenAI employs a technique known as constrained sampling, or constrained decoding. By default, a model's output is completely unconstrained: it may select any token in the vocabulary as its next output.
This flexibility can lead to errors, such as inserting invalid characters in the middle of otherwise valid JSON.
To avoid such errors, OpenAI uses dynamic constrained decoding to ensure that every generated token conforms to the provided schema.
To achieve this, OpenAI converts the provided JSON schema into a context-free grammar (CFG).
For each JSON schema, OpenAI computes a grammar that represents the schema and efficiently accesses its precomputed components during sampling.
This approach not only makes the generated output more accurate but also avoids unnecessary latency: the first request with a new schema may incur extra processing time, but subsequent requests hit a cache and respond quickly.
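To make the idea concrete, here is a toy sketch of constrained decoding (an illustration only, not OpenAI's actual implementation): at each sampling step, tokens the grammar would not accept are masked out, so the model can only pick a token that keeps the output valid.
import math
import random

def constrained_sample(logits, vocab, allowed):
    """Sample one token, restricted to the set the grammar currently allows."""
    # keep only tokens the grammar accepts in the current state
    candidates = [(tok, logit) for tok, logit in zip(vocab, logits) if tok in allowed]
    # softmax over the surviving tokens only
    max_logit = max(logit for _, logit in candidates)
    weights = [(tok, math.exp(logit - max_logit)) for tok, logit in candidates]
    total = sum(w for _, w in weights)
    r = random.uniform(0, total)
    for tok, w in weights:
        r -= w
        if r <= 0:
            return tok
    return weights[-1][0]

# Hypothetical grammar state: after emitting '"order_by": ', only the strings
# "asc" or "desc" may start the value, so every other token is masked away.
vocab = ['"asc"', '"desc"', '"orders"', '{', '}', '42']
logits = [1.2, 0.7, 2.5, 0.3, 0.1, 1.9]
print(constrained_sample(logits, vocab, allowed={'"asc"', '"desc"'}))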
Alternatives
Besides the CFG method, other approaches typically use finite state machines (FSMs) or regular expressions for constrained decoding.
However, these methods can only update the set of valid tokens in a limited way, and FSMs in particular struggle with complex nested or recursive data structures.
OpenAI's CFG approach excels at expressing complex schemas. For example, JSON schemas with recursive structures work on the OpenAI API but cannot be expressed with an FSM.
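To illustrate the kind of recursion at stake, the sketch below writes a self-referencing JSON schema as a Python dict; the $defs/$ref style follows standard JSON Schema conventions and is used here as an assumption, not a quote from OpenAI's documentation.
# A self-referencing schema: each "children" entry must again satisfy the whole schema.
ui_node_schema = {
    "type": "object",
    "properties": {
        "type": {"type": "string", "enum": ["div", "section", "button"]},
        "label": {"type": "string"},
        "children": {
            "type": "array",
            "items": {"$ref": "#"},  # recursion back to the root schema
        },
    },
    "required": ["type", "label", "children"],
    "additionalProperties": False,
}
# A plain FSM or regular expression cannot track arbitrarily deep nesting of this
# kind, whereas a context-free grammar derived from the schema can.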
Half the input cost
Structured output is available for all models that support function calls, including the latest GPT-4o and GPT-4o-mini models, as well as fine-tuned models.
This feature is available on the Chat Completions API, the Assistants API, and the Batch API, and it is compatible with vision inputs.
Compared with gpt-4o-2024-05-13, gpt-4o-2024-08-06 is also more cost-effective: developers save 50% on input ($2.50/1M tokens) and 33% on output ($10.00/1M tokens).
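As a quick sanity check on those percentages (assuming the previously published gpt-4o-2024-05-13 prices of $5.00/1M input tokens and $15.00/1M output tokens, which this article does not itself state):
old_input, old_output = 5.00, 15.00   # $ per 1M tokens, gpt-4o-2024-05-13 (assumed)
new_input, new_output = 2.50, 10.00   # $ per 1M tokens, gpt-4o-2024-08-06

print(f"input savings:  {1 - new_input / old_input:.0%}")    # -> 50%
print(f"output savings: {1 - new_output / old_output:.0%}")  # -> 33%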
How to use structured output
Structured output is being introduced in the API in two forms:
Function calls
Setting strict: true in the function definition enables structured output through tool calls.
This works on all models that support tools, including gpt-4-0613 and gpt-3.5-turbo-0613 and later.
When structured output is enabled, the model output will match the provided tool definition.
Example request:
POST /v1/chat/completions
{
"model": "gpt-4o-2024-08-06",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function."
},
{
"role": "user",
"content": "look up all my orders in may of last year that were fulfilled but not delivered on time"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "query",
"description": "Execute a query.",
"strict": true,
"parameters": {
"type": "object",
"properties": {
"table_name": {
"type": "string",
"enum": ["orders"]
},
"columns": {
"type": "array",
"items": {
"type": "string",
"enum": [
"id",
"status",
"expected_delivery_date",
"delivered_at",
"shipped_at",
"ordered_at",
"canceled_at"
]
}
},
"conditions": {
"type": "array",
"items": {
"type": "object",
"properties": {
"column": {
"type": "string"
},
"operator": {
"type": "string",
"Is": ["=", ">", "<", ">=", "<=", "!=" ]
},
"value": {
"anyOf": [
{
"type": "string"
},
{
"type": "number"
},
{
"type": "object",
"properties": {
"column_name": {
"type": "string"
}
},
"required": ["column_name"],
"additionalProperties": false
}
]
}
},
"required": ["column", "operator", "value"],
"additionalProperties": false
}
},
"order_by": {
"type": "string",
"enum": ["asc", "desc"]
}
},
"required": ["table_name", "columns", "conditions", "order_by"],
"additionalProperties": false
}
}
}
]
}
Sample output:
{
"table_name": "orders",
"columns": ["id", "status", "expected_delivery_date", "delivered_at"],
"conditions": [
{
"column": "status",
"operator": "=",
"value": "fulfilled"
},
{
"column": "ordered_at",
"operator": ">=",
"value": "2023-05-01"
},
{
"column": "ordered_at",
"operator": "<",
"value": "2023-06-01"
},
{
"column": "delivered_at",
"operator": ">",
"value": {
"column_name": "expected_delivery_date"
}
}
],
"order_by": "asc"
}
A new option for the response_format parameter
Developers can now supply a JSON schema through json_schema, a new option for the response_format parameter.
This is useful when the model should respond to the user in a structured format rather than call a tool.
This feature works with the latest GPT-4o models: gpt-4o-2024-08-06, released today, and gpt-4o-mini-2024-07-18.
With response_format set to json_schema and strict: true, the model's output will match the provided schema.
Example request:
POST /v1/chat/completions
{
"model": "gpt-4o-2024-08-06",
"messages": [
{
"role": "system",
"content": "You are a helpful math tutor."
},
{
"role": "user",
"content": "solve 8x + 31 = 2"
}
],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "math_response",
"strict": true,
"schema": {
"type": "object",
"properties": {
"steps": {
"type": "array",
"items": {
"type": "object",
"properties": {
"explanation": {
"type": "string"
},
"output": {
"type": "string"
}
},
"required": ["explanation", "output"],
"additionalProperties": false
}
},
"final_answer": {
"type": "string"
}
},
"required": ["steps", "final_answer"],
"additionalProperties": false
}
}
}
}
Sample output:
{
"steps": [
{
"explanation": "Subtract 31 from both sides to isolate the term with x.",
"output": "8x + 31 - 31 = 2 - 31"
},
{
"explanation": "This simplifies to 8x = -29.",
"output": "8x = -29"
},
{
"explanation": "Divide both sides by 8 to solve for x.",
"output": "x = -29 / 8"
}
],
"final_answer": "x = -29 / 8"
}
Developers can use structured output to have the model produce answers step by step, steering it toward the desired output.
According to OpenAI, developers no longer need to validate or retry malformed responses, and the feature allows prompts to be much simpler.
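For contrast, here is the kind of defensive retry loop that was previously common and that strict structured output makes unnecessary (a sketch; call_model is a placeholder for any chat-completion call that returns raw text):
import json

def get_json_with_retries(call_model, max_attempts=3):
    """Old pattern: keep retrying until the model happens to emit parseable JSON."""
    for _ in range(max_attempts):
        raw = call_model()
        try:
            return json.loads(raw)            # hope the output is valid JSON
        except json.JSONDecodeError:
            continue                          # malformed output: try again
    raise RuntimeError("model never produced valid JSON")

# With strict: true, the try/except and the retry loop disappear: the response is
# guaranteed to be valid JSON that matches the provided schema.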
Native SDK support
OpenAI says their Python and Node SDKs have been updated to natively support structured output.
Providing a schema for a tool or a response format is as simple as supplying a Pydantic or Zod object; the SDKs convert those data types into a supported JSON schema, automatically deserialize the JSON response into typed data structures, and parse refusals.
from enum import Enum
from typing import Union
from pydantic import BaseModel
import openai
from openai import OpenAI
class Table(str, Enum):
orders = "orders"
customers = "customers"
products = "products"
class Column(str, Enum):
id = "id"
status = "status"
expected_delivery_date = "expected_delivery_date"
delivered_at = "delivered_at"
shipped_at = "shipped_at"
ordered_at = "ordered_at"
canceled_at = "canceled_at"
class Operator(str, Enum):
eq = "="
gt = ">"
lt = "<"
    le = "<="
ge = ">="
ne = "!="
class OrderBy(str, Enum):
asc = "asc"
desc = "desc"
class DynamicValue(BaseModel):
column_name: str
class Condition(BaseModel):
column: str
operator: Operator
value: Union[str, int, DynamicValue]
class Query(BaseModel):
table_name: Table
columns: list[Column]
conditions: list[Condition]
order_by: OrderBy
client = OpenAI()
completion = client.beta.chat.completions.parse(
model="gpt-4o-2024-08-06",
messages=[
{
"role": "system",
"content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function.",
},
{
"role": "user",
"content": "look up all my orders in may of last year that were fulfilled but not delivered on time",
},
],
tools=[
openai.pydantic_function_tool(Query),
],
)
print(completion.choices[0].message.tool_calls[0].function.parsed_arguments)
Also, native structured output support is available for response_format.
from pydantic import BaseModel
from openai import OpenAI
class Step(BaseModel):
explanation: str
output: str
class MathResponse(BaseModel):
steps: list[Step]
final_answer: str
client = OpenAI()
completion = client.beta.chat.completions.parse(
model="gpt-4o-2024-08-06",
messages=[
{"role": "system", "content": "You are a helpful math tutor."},
{"role": "user", "content": "solve 8x + 31 = 2"},
],
response_format=MathResponse,
)
message = completion.choices[0].message
if message.parsed:
print(message.parsed.steps)
print(message.parsed.final_answer)
else:
print(message.refusal)
Other use cases
Developers often use OpenAI's models to generate structured data for a variety of use cases.
Some other examples include:
- Dynamically generate user interfaces based on user intent
Developers can use structured output to build applications that generate code or user interfaces.
With the same response_format, different UIs can be generated depending on user input (a hedged SDK sketch follows after the generated JSON below).
For example, to create a landing page for a gardening business:
The page is described by the following generated JSON:
{
"type": "div",
"label": "",
"children": [
{
"type": "header",
"label": "",
"children": [
{
"type": "div",
"label": "Green Thumb Gardening",
"children": [],
"attributes": [{ "name": "className", "value": "site-title" }]
},
{
"type": "div",
"label": "Bringing Life to Your Garden",
"children": [],
"attributes": [{ "name": "className", "value": "site-tagline" }]
}
],
"attributes": [{ "name": "className", "value": "header" }]
},
{
"type": "section",
"label": "",
"children": [
{
"type": "div",
"label": "",
"children": [
{
"type": "div",
"label": "About Us",
"children": [
{
"type": "div",
"label": "At Green Thumb Gardening, we specialize in transforming your outdoor spaces into beautiful, thriving gardens. Our team has decades of experience in horticulture and landscape design.",
"children": [],
"attributes": [
{ "name": "className", "value": "about-description" }
]
}
],
"attributes": [{ "name": "className", "value": "about-section" }]
}
],
"attributes": [{ "name": "className", "value": "content" }]
}
],
"attributes": [{ "name": "className", "value": "about-container" }]
},
{
"type": "section",
"label": "",
"children": [
{
"type": "div",
"label": "",
"children": [
{
"type": "div",
"label": "Our Services",
"children": [
{
"type": "div",
"label": "Garden Design",
"children": [],
"attributes": [
{ "name": "className", "value": "service-item" }
]
},
{
"type": "div",
"label": "Plant Care & Maintenance",
"children": [],
"attributes": [
{ "name": "className", "value": "service-item" }
]
},
{
"type": "div",
"label": "Seasonal Cleanup",
"children": [],
"attributes": [
{ "name": "className", "value": "service-item" }
]
},
{
"type": "div",
"label": "Custom Landscaping",
"children": [],
"attributes": [
{ "name": "className", "value": "service-item" }
]
}
],
"attributes": [{ "name": "className", "value": "services-list" }]
}
],
"attributes": [{ "name": "className", "value": "content" }]
}
],
"attributes": [{ "name": "className", "value": "services-container" }]
}
],
"attributes": [{ "name": "className", "value": "landing-page" }]
}
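Here is that sketch: one way such a UI tree could be requested with the Python SDK, using a component model that references itself and mirrors the recursive JSON above. The model and field names are illustrative, not OpenAI's exact schema.
from __future__ import annotations

from enum import Enum
from pydantic import BaseModel
from openai import OpenAI

class UIType(str, Enum):
    div = "div"
    header = "header"
    section = "section"

class Attribute(BaseModel):
    name: str
    value: str

class UIComponent(BaseModel):
    type: UIType
    label: str
    children: list[UIComponent]    # recursion: components can nest components
    attributes: list[Attribute]

UIComponent.model_rebuild()        # resolve the self-reference

client = OpenAI()
completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Generate a UI component tree for the user's request."},
        {"role": "user", "content": "Make a landing page for a gardening business."},
    ],
    response_format=UIComponent,
)
print(completion.choices[0].message.parsed)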
- Separate the final answer from supporting reasoning or additional comments
Providing a separate chain-of-thought field for the model can improve the final quality of the response.
Request:
{
"model": "gpt-4o-2024-08-06",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant"
},
{
"role": "user",
"content": "9.11 and 9.9 -- which is bigger?"
}
],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "reasoning_schema",
"strict": true,
"schema": {
"type": "object",
"properties": {
"reasoning_steps": {
"type": "array",
"items": {
"type": "string"
},
"description": "The reasoning steps leading to the final conclusion."
},
"answer": {
"type": "string",
"description": "The final answer, taking into account the reasoning steps."
}
},
"required": ["reasoning_steps", "answer"],
"additionalProperties": false
}
}
}
}
Structured Output:
{
"reasoning_steps": [
"First step is to compare the numbers 9.11 and 9.9.",
"Both numbers have the same whole number part, which is 9.",
"To compare the decimal parts, convert them to the same number of decimal places.",
"9.11 has two decimal places: it is 9.11.",
"9.9 has one decimal place: it can be rewritten as 9.90.",
"Now, compare 9.11 and 9.90 by looking at the decimal parts.",
"Compare 11 with 90.",
"90 is greater than 11, so 9.90 is greater than 9.11."
],
"answer": "9.9 is bigger than 9.11."
}
- Extract structured data from unstructured data
For example, instruct the model to extract things like to-dos, due dates, and assignments from meeting minutes.
Request:
POST /v1/chat/completions
{
"model": "gpt-4o-2024-08-06",
"messages": [
{
"role": "system",
"content": "Extract action items, due dates, and owners from meeting notes."
},
{
"role": "user",
"content": "... meeting notes go here..."
}
],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "action_items",
"strict": true,
"schema": {
"type": "object",
"properties": {
"action_items": {
"type": "array",
"items": {
"type": "object",
"properties": {
"description": {
"type": "string",
"description": "Description of the action item."
},
"due_date": {
"type": ["string", "null"],
"description": "Due date for the action item, can be null if not specified."
},
"owner": {
"type": ["string", "null"],
"description": "Owner responsible for the action item, can be null if not specified."
}
},
"required": ["description", "due_date", "owner"],
"additionalProperties": false
},
"description": "List of action items from the meeting."
}
},
"required": ["action_items"],
"additionalProperties": false
}
}
}
}
Structured Output:
{
"action_items": [
{
"description": "Collaborate on optimizing the path planning algorithm",
"due_date": "2024-06-30",
"owner": "Jason Li"
},
{
"description": "Reach out to industry partners for additional datasets",
"due_date": "2024-06-25",
"owner": "Aisha Patel"
},
{
"description": "Explore alternative LIDAR sensor configurations and report findings",
"due_date": "2024-06-27",
"owner": "Kevin Nguyen"
},
{
"description": "Schedule extended stress tests for the integrated navigation system",
"due_date": "2024-06-28",
"owner": "Emily Chen"
},
{
"description": "Retest the system after bug fixes and update the team",
"due_date": "2024-07-01",
"owner": "David Park"
}
]
}
Safe, structured output
Safety is OpenAI's top priority: the new structured output feature adheres to OpenAI's existing safety policies and still allows the model to refuse unsafe requests.
To make development simpler, the API response includes a new refusal string value, letting developers programmatically detect whether the model produced a refusal instead of output matching the schema.
When the response contains no refusal and the model's answer was not cut off prematurely (as indicated by finish_reason), the model's response will reliably be valid JSON matching the provided schema.
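Putting those checks together, a minimal sketch of this handling with the Python SDK might look like the following (the schema is a simplified version of the math_response example above):
import json
from openai import OpenAI

client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "solve 8x + 31 = 2"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "math_response",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {"final_answer": {"type": "string"}},
                "required": ["final_answer"],
                "additionalProperties": False,
            },
        },
    },
)

choice = completion.choices[0]
if choice.message.refusal:                     # the model declined the request
    print("Refused:", choice.message.refusal)
elif choice.finish_reason != "stop":           # e.g. "length": the output was cut off
    print("Incomplete response:", choice.finish_reason)
else:
    data = json.loads(choice.message.content)  # guaranteed to match the schema
    print(data["final_answer"])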