POST /chat/completions
Create chat completion
curl --request POST \
  --url https://api.together.xyz/v1/chat/completions \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "messages": [
    {
      "role": "system",
      "content": "<string>"
    }
  ],
  "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
  "max_tokens": 128,
  "stop": [
    "<string>"
  ],
  "temperature": 0.7,
  "top_p": 0.7,
  "top_k": 50,
  "repetition_penalty": 1,
  "stream": true,
  "logprobs": 0,
  "echo": true,
  "n": 64,
  "min_p": 0.05,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "logit_bias": {
    "105": 21.4,
    "1024": -10.5
  },
  "response_format": {
    "type": "json",
    "schema": {}
  },
  "tools": [
    {
      "type": "tool_type",
      "function": {
        "description": "A description of the function.",
        "name": "function_name",
        "parameters": {}
      }
    }
  ],
  "tool_choice": {
    "type": "tool_choice_type",
    "function": {
      "name": "function_name"
    }
  },
  "safety_model": "safety_model_name"
}'
{
  "id": "<string>",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "<string>"
      },
      "finish_reason": "stop",
      "logprobs": {
        "tokens": [
          "<string>"
        ],
        "token_logprobs": [
          123
        ]
      }
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  },
  "created": 123,
  "model": "<string>",
  "object": "chat.completion"
}
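The curl request above can be reproduced in any HTTP client. Below is a minimal sketch in Python using only the standard library to build the same request; the URL and header names come from the curl example, while the helper name and the commented-out send step are illustrative only.

```python
import json
import urllib.request

API_URL = "https://api.together.xyz/v1/chat/completions"

def build_request(api_key: str, model: str, user_message: str,
                  **options) -> urllib.request.Request:
    """Build a POST request for the chat completions endpoint.

    Extra keyword arguments (max_tokens, temperature, stop, ...) are
    merged into the JSON body unchanged.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        **options,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("<api-key>", "mistralai/Mixtral-8x7B-Instruct-v0.1",
                    "Hello!", max_tokens=128, temperature=0.7)
# urllib.request.urlopen(req) would send it; the parsed JSON response
# follows the schema shown above (id, choices, usage, created, model, object).
```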

Authorizations

Authorization
string
header
default:default
required

Body

application/json
messages
object[]
required

A list of messages comprising the conversation so far.

model
string
required

The name of the model to query.

Example:

"mistralai/Mixtral-8x7B-Instruct-v0.1"

max_tokens
integer

The maximum number of tokens to generate.

stop
string[]

A list of string sequences that will truncate (stop) inference text output.

temperature
number

Determines the degree of randomness in the response.

top_p
number

The top_p (nucleus) parameter is used to dynamically adjust the number of choices for each predicted token based on the cumulative probabilities.

top_k
integer

The top_k parameter is used to limit the number of choices for the next predicted word or token.

repetition_penalty
number

A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.

stream
boolean

If set, tokens are returned as Server-Sent Events as they are made available. Stream terminates with data: [DONE]. If false, return a single JSON object containing the results.
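When stream is true, each event arrives as a line of the form data: {...} and the stream ends with data: [DONE], as noted above. A minimal sketch of consuming such a stream follows; the per-chunk "delta" shape is an assumption (this reference does not spell out the streaming chunk schema), and the sample lines are illustrative, not real API output.

```python
import json

def parse_sse_lines(lines):
    """Yield parsed JSON chunks from Server-Sent Event lines until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # stream terminator documented above
        yield json.loads(payload)

# Illustrative sample stream (the "delta" field is an assumed chunk shape):
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
chunks = list(parse_sse_lines(sample))
text = "".join(c["choices"][0]["delta"]["content"] for c in chunks)
```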

logprobs
integer

Determines the number of most likely tokens to return, with their log probabilities, at each token position.

Required range: 0 <= x <= 1
echo
boolean

If set, the response will include the prompt, and will also include prompt log probabilities when logprobs is set as well.

n
integer

Number of generations to return

Required range: 1 <= x <= 128
min_p
number

The min_p parameter is a number between 0 and 1; it is an alternative to temperature-based sampling.

presence_penalty
number

The presence_penalty parameter is a number between -2.0 and 2.0 where a positive value will increase the likelihood of a model talking about new topics.

frequency_penalty
number

The frequency_penalty parameter is a number between -2.0 and 2.0 where a positive value will decrease the likelihood of repeating tokens that were mentioned prior.
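Several of the numeric parameters above have documented ranges. A small sketch of client-side validation against exactly the ranges stated in this reference (the helper name is hypothetical; the API performs its own validation regardless):

```python
# Ranges taken from this reference: logprobs 0 <= x <= 1, n 1 <= x <= 128,
# min_p in [0, 1], presence_penalty and frequency_penalty in [-2.0, 2.0].
RANGES = {
    "logprobs": (0, 1),
    "n": (1, 128),
    "min_p": (0.0, 1.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}

def validate_sampling_params(params: dict) -> list:
    """Return a list of error messages for out-of-range parameters."""
    errors = []
    for name, (lo, hi) in RANGES.items():
        if name in params and not (lo <= params[name] <= hi):
            errors.append(f"{name}={params[name]} outside [{lo}, {hi}]")
    return errors
```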

logit_bias
object

The logit_bias parameter adjusts the likelihood of specific tokens appearing in the generated output.

Example:
{ "105": 21.4, "1024": -10.5 }
response_format
object

Specifies the format of the response.
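The request example above shows response_format with "type" and "schema" keys. A sketch of a request-body fragment using them; the specific schema here is hypothetical, chosen only to illustrate the shape.

```python
import json

response_format = {
    "type": "json",  # value taken from the request example above
    "schema": {      # hypothetical schema: an object with one string field
        "type": "object",
        "properties": {"answer": {"type": "string"}},
        "required": ["answer"],
    },
}
payload_fragment = json.dumps({"response_format": response_format})
```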

tools
object[]

A list of tools to be used in the query.

tool_choice
object

The choice of tool to use.
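The tools and tool_choice objects in the request example share the same nested shape. A sketch of building both for a single tool; the "function" type and the get_weather tool are hypothetical, since this reference only shows placeholder values.

```python
# Hypothetical function tool; the nested fields (type, function, name,
# description, parameters) mirror the request example above.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

request_fragment = {
    "tools": [weather_tool],
    # Point tool_choice at the same tool to request that it be called:
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```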

safety_model
string

The name of the safety model to use.

Example:

"safety_model_name"

Response

200

id
string
choices
object[]
usage
object | null
created
integer
model
string
object
enum<string>
Available options:
chat.completion