Authorizations
Body
A list of messages comprising the conversation so far.
The name of the model to query.
"mistralai/Mixtral-8x7B-Instruct-v0.1"
The maximum number of tokens to generate.
A list of string sequences that will truncate (stop) inference text output.
Determines the degree of randomness in the response.
The top_p (nucleus) parameter is used to dynamically adjust the number of choices for each predicted token based on the cumulative probabilities.
The top_k parameter is used to limit the number of choices for the next predicted word or token.
A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition.
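As a rough illustration of how these sampling controls fit together, the sketch below sends a single chat completion request with the temperature, top_p, top_k, and repetition-penalty settings described above. The endpoint URL, environment variable, body field names (which follow the common OpenAI-compatible convention), and the specific values are assumptions for the example, not part of this reference.

```python
import os
import requests

# Assumed endpoint and auth handling; adjust to your deployment.
URL = "https://api.together.xyz/v1/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"}

payload = {
    "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "messages": [{"role": "user", "content": "Name three uses for a paperclip."}],
    "max_tokens": 128,
    "stop": ["</s>"],
    # Sampling controls described above (illustrative values):
    "temperature": 0.7,         # degree of randomness
    "top_p": 0.9,               # nucleus sampling cutoff
    "top_k": 50,                # limit candidate tokens per step
    "repetition_penalty": 1.1,  # higher values discourage repeated sequences
}

resp = requests.post(URL, headers=HEADERS, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```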
If set, tokens are returned as Server-Sent Events as they become available, and the stream terminates with data: [DONE]. If false, a single JSON object containing the results is returned.
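A minimal sketch of consuming the streamed variant, assuming the same endpoint as above and an SSE body in which each event line is prefixed with data: and the stream closes with data: [DONE]; the chunk layout is assumed to follow the OpenAI-style delta format.

```python
import json
import os
import requests

URL = "https://api.together.xyz/v1/chat/completions"  # assumed endpoint
HEADERS = {"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"}

payload = {
    "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "messages": [{"role": "user", "content": "Write one sentence about rivers."}],
    "stream": True,
}

with requests.post(URL, headers=HEADERS, json=payload, stream=True, timeout=60) as resp:
    resp.raise_for_status()
    for raw in resp.iter_lines():
        if not raw:
            continue
        line = raw.decode("utf-8")
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":  # stream terminator described above
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        print(delta.get("content", ""), end="", flush=True)
```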
Determines the number of most likely tokens, with their log probabilities, to return at each token position.
0 <= x <= 1
If set, the response will contain the prompt, and prompt log probabilities will also be returned if logprobs is set.
Number of generations to return
1 <= x <= 128
The min_p parameter is a number between 0 and 1 and is an alternative to temperature.
The presence_penalty parameter is a number between -2.0 and 2.0 where a positive value increases the likelihood of the model talking about new topics.
The frequency_penalty parameter is a number between -2.0 and 2.0 where a positive value decreases the likelihood of repeating tokens that were mentioned earlier.
The logit_bias parameter allows you to adjust the likelihood of specific tokens appearing in the generated output.
{ "105": 21.4, "1024": -10.5 }
Specifies the format of the response.
A list of tools to be used in the query.
The choice of tool to use.
The name of the safety model to use.
"safety_model_name"