Query a language, code, or image model.
A string providing context for the model to complete.
"<s>[INST] What is the capital of France? [/INST]"
The name of the model to query.
"mistralai/Mixtral-8x7B-Instruct-v0.1"
The maximum number of tokens to generate.
A list of string sequences that will truncate (stop) inference text output.
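As an illustration, a minimal request combining the fields above might look like the following Python sketch. The endpoint URL, the TOGETHER_API_KEY environment variable, and the use of the requests library are assumptions for the example, not part of this reference.

```python
import os
import requests

# Assumed endpoint and bearer-token auth scheme; adjust for your deployment.
API_URL = "https://api.together.xyz/v1/completions"
headers = {"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"}

payload = {
    "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "prompt": "<s>[INST] What is the capital of France? [/INST]",
    "max_tokens": 64,
    "stop": ["</s>"],  # stop sequences are model specific; this one is illustrative
}

resp = requests.post(API_URL, headers=headers, json=payload)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])
```

The response shape (choices[0].text) is also an assumption here; check the response schema for the exact fields.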
Determines the degree of randomness in the response.
The top_p (nucleus) parameter dynamically adjusts the number of token choices considered at each step, based on their cumulative probability.
The top_k parameter limits the number of candidate tokens considered for the next prediction.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
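The four sampling controls above are plain request fields. A sketch of how they might be combined, with illustrative values rather than recommendations:

```python
# Sampling controls added to the request payload from the earlier sketch.
payload = {
    "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "prompt": "<s>[INST] Write a haiku about Paris. [/INST]",
    "max_tokens": 64,
    "temperature": 0.7,         # higher values -> more random output
    "top_p": 0.9,               # keep tokens within 90% cumulative probability
    "top_k": 50,                # consider only the 50 most likely next tokens
    "repetition_penalty": 1.1,  # values above 1 discourage repeated sequences
}
```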
If set, tokens are returned as Server-Sent Events as they become available. The stream terminates with data: [DONE].
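A sketch of consuming the stream with the requests library, assuming the same endpoint and headers as the first example, and assuming each event line carries a JSON chunk shaped like the non-streaming response:

```python
import json
import os
import requests

API_URL = "https://api.together.xyz/v1/completions"  # assumed endpoint
headers = {"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"}
payload = {
    "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "prompt": "<s>[INST] What is the capital of France? [/INST]",
    "max_tokens": 64,
    "stream": True,
}

with requests.post(API_URL, headers=headers, json=payload, stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:
            continue
        data = line.decode("utf-8").removeprefix("data: ")
        if data == "[DONE]":  # the stream terminates with data: [DONE]
            break
        chunk = json.loads(data)
        print(chunk["choices"][0]["text"], end="", flush=True)
```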
Determines the number of most likely tokens, along with their log probabilities, to return at each token position.
0 <= x <= 1
If set, the response will contain the prompt, and will also return prompt log probabilities when used together with logprobs.
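For example, requesting log probabilities together with the echoed prompt might look like this (field shapes as described above; values illustrative):

```python
payload = {
    "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "prompt": "<s>[INST] What is the capital of France? [/INST]",
    "max_tokens": 16,
    "logprobs": 1,  # return the top log probability at each token position
    "echo": True,   # include the prompt (and its logprobs) in the response
}
```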
Number of generations to return.
1 <= x <= 128
The name of the safety model to use.
"safety_model_name"
The min_p parameter is a number between 0 and 1 and an alternative to sampling with temperature: tokens whose probability falls below min_p relative to the most likely token are filtered out.
The presence_penalty parameter is a number between -2.0 and 2.0 where a positive value increases the model's likelihood of talking about new topics.
The frequency_penalty parameter is a number between -2.0 and 2.0 where a positive value decreases the likelihood of repeating tokens that have already appeared in the output.
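A sketch showing these three parameters together (values illustrative, not recommendations):

```python
payload = {
    "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "prompt": "<s>[INST] Brainstorm five blog post ideas. [/INST]",
    "max_tokens": 128,
    "min_p": 0.05,             # drop tokens far less likely than the top token
    "presence_penalty": 0.6,   # positive: nudge the model toward new topics
    "frequency_penalty": 0.4,  # positive: discourage already-used tokens
}
```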
The logit_bias parameter adjusts the likelihood of specific tokens appearing in the generated output.
{ "105": 21.4, "1024": -10.5 }