Learn how to use JSON mode to get structured outputs from LLMs including Llama 3.1.
To use JSON mode, pass the `response_format` parameter to the Chat Completions API with `{"type": "json_object"}`. A JSON Schema for the output can be specified with the `schema` property of `response_format`.
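As a sketch, the parameter looks like this (the schema shown is a hypothetical example; any valid JSON Schema dictionary works):

```python
# Illustrative response_format value:
# "type" enables JSON mode; "schema" optionally constrains output to a JSON Schema.
response_format = {
    "type": "json_object",
    "schema": {
        "type": "object",
        "properties": {"name": {"type": "string"}},
        "required": ["name"],
    },
}
```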
JSON mode is supported on the following models:

- `meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo`
- `meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo`
- `mistralai/Mixtral-8x7B-Instruct-v0.1`
- `mistralai/Mistral-7B-Instruct-v0.1`
- `togethercomputer/CodeLlama-34b-Instruct`
When using JSON mode, always instruct the model to produce JSON, either in a system or user message, in addition to providing the `response_format` parameter. This is important for ensuring the model responds only with JSON.
With JSON mode, you can specify a schema for the output of the LLM. In Python we'll define the schema with Pydantic, and in TypeScript we'll define it with Zod. Here's a Python example of JSON mode using Llama 3.1.
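The following is a minimal sketch, assuming the `together` Python SDK (`pip install together`) and a `TOGETHER_API_KEY` environment variable; the `User` model and the prompt are illustrative:

```python
import json

from pydantic import BaseModel, Field
from together import Together

# Define the desired output structure with Pydantic (hypothetical example schema).
class User(BaseModel):
    name: str = Field(description="The user's full name")
    address: str = Field(description="The user's street address")

client = Together()  # reads TOGETHER_API_KEY from the environment

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    # Instruct the model to produce JSON in the prompt as well as via response_format.
    messages=[
        {"role": "system", "content": "You are a helpful assistant that only answers in JSON."},
        {"role": "user", "content": "Create a user named Alice who lives at 42 Wonderland Avenue."},
    ],
    # Enable JSON mode and constrain output to the Pydantic model's JSON Schema.
    response_format={"type": "json_object", "schema": User.model_json_schema()},
)

# The reply is a JSON string that should conform to the schema.
user = json.loads(response.choices[0].message.content)
print(json.dumps(user, indent=2))
```

Passing the parsed reply back through the Pydantic model, e.g. `User.model_validate(user)`, is a convenient way to verify that the output actually matches the schema.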