A common way to use Chat Completions is to instruct the model to always return JSON in some format that makes sense for your use case, by providing a system message. This works well, but occasionally the models may generate output that does not parse to valid JSON.
To prevent these errors and improve model performance, when calling gpt-4-1106-preview or gpt-3.5-turbo-1106, you can set response_format to { type: "json_object" } to enable JSON mode. When JSON mode is enabled, the model is constrained to only generate strings that parse into valid JSON.
In the past you could ask ChatGPT to generate JSON or use function calling to output JSON, but in my experience working with both approaches on Preceden, each would still occasionally return invalid JSON (due to characters not being escaped properly, for example). So this new JSON mode is a welcome addition and will simplify quite a bit of my prompting.
Here’s a simple Ruby script demonstrating how to use JSON mode with GPT-4 Turbo:
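The original gist isn't shown here, but a minimal sketch looks something like this (assuming an OPENAI_API_KEY environment variable and plain Net::HTTP rather than a client gem; the exact prompt is my own illustration):

```ruby
require "json"
require "net/http"
require "uri"

# JSON mode is enabled via response_format. Note that the system
# message mentions JSON, which the API requires.
payload = {
  model: "gpt-4-1106-preview",
  response_format: { type: "json_object" },
  messages: [
    { role: "system", content: "You are a helpful assistant that responds in JSON." },
    { role: "user", content: 'Generate JSON about Bill Gates: {"full_name": "", "title": ""}' }
  ]
}

# Only hit the API if a key is configured.
if ENV["OPENAI_API_KEY"]
  uri = URI("https://api.openai.com/v1/chat/completions")
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true
  request = Net::HTTP::Post.new(uri, {
    "Content-Type" => "application/json",
    "Authorization" => "Bearer #{ENV['OPENAI_API_KEY']}"
  })
  request.body = payload.to_json
  response = JSON.parse(http.request(request).body)
  puts response.dig("choices", 0, "message", "content")
end
```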
This instructs GPT-4 Turbo to generate valid JSON by setting the response_format to { type: "json_object" }. And sure enough, ChatGPT fills in the JSON values:
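The content of the response is valid JSON along these lines (illustrative values; the exact output varies from run to run):

```json
{
  "full_name": "William Henry Gates III",
  "title": "Co-founder of Microsoft"
}
```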
Let’s try some other prompt variations and see what happens.
Not mentioning JSON in the prompt
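If the word JSON appears nowhere in the messages, the API rejects the request with a 400 error whose message reads roughly like this (paraphrased, not copied from the original gist):

```
'messages' must contain the word 'json' in some form, to use 'response_format' of type 'json_object'.
```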
To use JSON mode, your system message must instruct the model to produce JSON. To help ensure you don’t forget, the API will throw an error if the string "JSON" does not appear in your system message.
Lowercase JSON
Does JSON have to be properly cased in the prompt?
It turns out lowercase is fine: send a system message containing "json" rather than "JSON" and the request goes through, with the model returning valid JSON as before. The API just checks that the word appears in some form.
Instead of giving it a JSON template, what if we just described what we want?
Generate json about Bill Gates with two keys: full_name and title.
That works too:
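The response content comes back with the two requested keys, for example (illustrative):

```json
{
  "full_name": "William Henry Gates III",
  "title": "Co-founder and former CEO of Microsoft"
}
```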
Generate json about Bill Gates with three keys: full_name, title, and facts. facts should be an array with 3 items.
The facts array in the response came back like this:

```json
[
  "Bill Gates started Microsoft with his childhood friend Paul Allen in 1975.",
  "He held the position of chairman, CEO, and chief software architect at Microsoft during his career.",
  "Beyond technology, Gates is also known for his philanthropic work with the Bill & Melinda Gates Foundation, which focuses on health, education, and poverty alleviation."
]
```
Generate json about Bill Gates with three keys: full_name, title, and events.
events should be an array of 3 objects representing important milestones in his life. Each of those objects should have two keys: date and description. The dates should be formatted like “Nov 7, 2023”.
How does it do?
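A response in the requested shape looks something like this (an illustrative reconstruction, not the original gist's output):

```json
{
  "full_name": "William Henry Gates III",
  "title": "Co-founder of Microsoft",
  "events": [
    { "date": "Oct 28, 1955", "description": "Born in Seattle, Washington." },
    { "date": "Apr 4, 1975", "description": "Co-founded Microsoft with Paul Allen." },
    { "date": "Jun 27, 2008", "description": "Transitioned out of a day-to-day role at Microsoft to focus on philanthropy." }
  ]
}
```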
Another version of this is to provide it a JSON template with comments above each key with guidance about how to generate those values:
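For example, a prompt along these lines (my own illustration, since the original gist isn't shown; the comments aren't valid JSON, but they guide the model):

```
Generate JSON about Bill Gates using this template:
{
  // His full legal name
  "full_name": "",
  // His most recent title at Microsoft
  "title": "",
  // An array of 3 one-sentence facts about him
  "facts": []
}
```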
This is more verbose, but might work better for complex use cases where you want to give ChatGPT a lot of guidance about the structure and content of the output.
Without reading the docs, it’s easy to come away believing that the API will always return valid JSON if you’re requesting the output in JSON mode. However, there is one situation when it doesn’t, and that’s if ChatGPT reaches its maximum output token limit while generating the response. From the docs:
The JSON in the message the model returns may be partial (i.e. cut off) if finish_reason is length, which indicates the generation exceeded max_tokens or the conversation exceeded the token limit. To guard against this, check finish_reason before parsing the response.
Both gpt-4-1106-preview and gpt-3.5-turbo-1106 have a maximum output of 4,096 tokens, which means that if you request a large amount of data, or if you explicitly set max_tokens to a lower limit and ChatGPT hits that limit while generating the response, it will stop generating the JSON midway through. Here’s an example to demonstrate:
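A sketch of such a request (my own reconstruction; the prompt and the specific max_tokens value are assumptions):

```ruby
# Deliberately cap max_tokens so the model runs out of room mid-generation.
payload = {
  model: "gpt-4-1106-preview",
  response_format: { type: "json_object" },
  max_tokens: 50,
  messages: [
    { role: "system", content: "You are a helpful assistant that responds in JSON." },
    { role: "user", content: "Generate json about Bill Gates with 20 facts." }
  ]
}
```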
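The response content simply stops mid-string, something like (illustrative):

```
{"facts": ["Bill Gates started Microsoft with his childhood fr
```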
Attempting to parse this output with JSON.parse will raise a JSON::ParserError.
Following the docs’ advice to check the finish_reason, we can update the script to return nil if ChatGPT couldn’t generate the entire JSON object:
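A sketch of that check (assuming `response` is the parsed API response hash; `extract_json` is my own name for the helper):

```ruby
require "json"

# Returns the parsed JSON, or nil if the model hit the token limit
# before it could finish generating the object.
def extract_json(response)
  choice = response["choices"].first
  return nil if choice["finish_reason"] == "length"
  JSON.parse(choice["message"]["content"])
end
```

Callers that get nil back can then retry with a higher max_tokens or a smaller request instead of crashing on a parse error.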
Hopefully this gives you an idea of what’s possible with the new JSON mode. Shout out to the OpenAI team for implementing it. And if you have any other tips about using this new feature, please drop a comment below and I’ll update this post accordingly. Cheers!