POST /chat/completions
Create chat completion
curl --request POST \
  --url https://api.together.xyz/v1/chat/completions \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "messages": [
    {
      "role": "system",
      "content": "<string>"
    }
  ],
  "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
  "max_tokens": 128,
  "stop": [
    "<string>"
  ],
  "temperature": 0.7,
  "top_p": 0.7,
  "top_k": 50,
  "repetition_penalty": 1,
  "stream": true,
  "logprobs": 0,
  "echo": true,
  "n": 64,
  "min_p": 0.05,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "logit_bias": {
    "105": 21.4,
    "1024": -10.5
  },
  "response_format": {
    "type": "json",
    "schema": {}
  },
  "tools": [
    {
      "type": "tool_type",
      "function": {
        "description": "A description of the function.",
        "name": "function_name",
        "parameters": {}
      }
    }
  ],
  "tool_choice": {
    "type": "tool_choice_type",
    "function": {
      "name": "function_name"
    }
  },
  "safety_model": "safety_model_name"
}'
{
  "id": "<string>",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "<string>"
      },
      "finish_reason": "stop",
      "logprobs": {
        "tokens": [
          "<string>"
        ],
        "token_logprobs": [
          123
        ]
      }
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  },
  "created": 123,
  "model": "<string>",
  "object": "chat.completion"
}
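The curl request above can be reproduced in any HTTP client. Below is a minimal sketch in Python using only the standard library to build the same request; the URL and header names come from the curl example, while the helper name and the commented-out send step are illustrative only.

```python
import json
import urllib.request

API_URL = "https://api.together.xyz/v1/chat/completions"

def build_request(api_key: str, model: str, user_message: str,
                  **options) -> urllib.request.Request:
    """Build a POST request for the chat completions endpoint.

    Extra keyword arguments (max_tokens, temperature, stop, ...) are
    merged into the JSON body unchanged.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        **options,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("<api-key>", "mistralai/Mixtral-8x7B-Instruct-v0.1",
                    "Hello!", max_tokens=128, temperature=0.7)
# urllib.request.urlopen(req) would send it; the parsed JSON response
# follows the schema shown above (id, choices, usage, created, model, object).
```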

Authorizations

Authorization
string
header
default:default
required

Body

application/json
messages
object[]
required

A list of messages comprising the conversation so far.

model
string
required

The name of the model to query.

Example:

"mistralai/Mixtral-8x7B-Instruct-v0.1"

max_tokens
integer

The maximum number of tokens to generate.

stop
string[]

A list of string sequences that will truncate (stop) inference text output.

temperature
number

Determines the degree of randomness in the response.

top_p
number

The top_p (nucleus) parameter is used to dynamically adjust the number of choices for each predicted token based on the cumulative probabilities.

top_k
integer

The top_k parameter is used to limit the number of choices for the next predicted word or token.

repetition_penalty
number

A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

stream
boolean

If set, tokens are returned as Server-Sent Events as they are made available. Stream terminates with data: [DONE]. If false, return a single JSON object containing the results.
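When stream is true, each event arrives as a line of the form data: {...} and the stream ends with data: [DONE], as noted above. A minimal sketch of consuming such a stream follows; the per-chunk "delta" shape is an assumption (this reference does not spell out the streaming chunk schema), and the sample lines are illustrative, not real API output.

```python
import json

def parse_sse_lines(lines):
    """Yield parsed JSON chunks from Server-Sent Event lines until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # stream terminator documented above
        yield json.loads(payload)

# Illustrative sample stream (the "delta" field is an assumed chunk shape):
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
chunks = list(parse_sse_lines(sample))
text = "".join(c["choices"][0]["delta"]["content"] for c in chunks)
```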

logprobs
integer

Determines the number of most likely tokens to return, with their log probabilities, at each token position.

Required range: 0 <= x <= 1
echo
boolean

If set, the response will include the prompt, and will also include prompt log probabilities when logprobs is set as well.

n
integer

Number of generations to return

Required range: 1 <= x <= 128
min_p
number

The min_p parameter is a number between 0 and 1; it is an alternative to temperature-based sampling.

presence_penalty
number

The presence_penalty parameter is a number between -2.0 and 2.0 where a positive value will increase the likelihood of a model talking about new topics.

frequency_penalty
number

The frequency_penalty parameter is a number between -2.0 and 2.0 where a positive value will decrease the likelihood of repeating tokens that were mentioned prior.
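Several of the numeric parameters above have documented ranges. A small sketch of client-side validation against exactly the ranges stated in this reference (the helper name is hypothetical; the API performs its own validation regardless):

```python
# Ranges taken from this reference: logprobs 0 <= x <= 1, n 1 <= x <= 128,
# min_p in [0, 1], presence_penalty and frequency_penalty in [-2.0, 2.0].
RANGES = {
    "logprobs": (0, 1),
    "n": (1, 128),
    "min_p": (0.0, 1.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}

def validate_sampling_params(params: dict) -> list:
    """Return a list of error messages for out-of-range parameters."""
    errors = []
    for name, (lo, hi) in RANGES.items():
        if name in params and not (lo <= params[name] <= hi):
            errors.append(f"{name}={params[name]} outside [{lo}, {hi}]")
    return errors
```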

logit_bias
object

The logit_bias parameter adjusts the likelihood of specific tokens appearing in the generated output.

Example:
{ "105": 21.4, "1024": -10.5 }
response_format
object

Specifies the format of the response.
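The request example above shows response_format with "type" and "schema" keys. A sketch of a request-body fragment using them; the specific schema here is hypothetical, chosen only to illustrate the shape.

```python
import json

response_format = {
    "type": "json",  # value taken from the request example above
    "schema": {      # hypothetical schema: an object with one string field
        "type": "object",
        "properties": {"answer": {"type": "string"}},
        "required": ["answer"],
    },
}
payload_fragment = json.dumps({"response_format": response_format})
```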

tools
object[]

A list of tools to be used in the query.

tool_choice
object

The choice of tool to use.
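The tools and tool_choice objects in the request example share the same nested shape. A sketch of building both for a single tool; the "function" type and the get_weather tool are hypothetical, since this reference only shows placeholder values.

```python
# Hypothetical function tool; the nested fields (type, function, name,
# description, parameters) mirror the request example above.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

request_fragment = {
    "tools": [weather_tool],
    # Point tool_choice at the same tool to request that it be called:
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```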

safety_model
string

The name of the safety model to use.

Example:

"safety_model_name"

Response

200

id
string
choices
object[]
usage
object | null
created
integer
model
string
object
enum<string>
Available options:
chat.completion