POST /v2/chat
Generates a text response to a user message and streams it back token by token. To learn how to use the Chat API with streaming, follow our Text Generation guides.
Follow the Migration Guide for instructions on moving from API v1 to API v2.
Servers
- https://api.cohere.com
Request headers
Name | Type | Required | Description |
---|---|---|---|
Content-Type | String | Yes | The media type of the request body. Default value: `"application/json"` |
X-Client-Name | String | No | The name of the project that is making the request. |
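For orientation, here is a minimal non-streaming request sketch in Python using the `requests` library. The bearer-token authentication scheme and the example model name are assumptions (neither appears in the tables on this page), so substitute your own values.

```python
# Minimal sketch of a non-streaming /v2/chat request.
# The bearer auth scheme and the model name are assumptions.
import os
import requests

response = requests.post(
    "https://api.cohere.com/v2/chat",
    headers={
        "Authorization": f"Bearer {os.environ['COHERE_API_KEY']}",  # assumed auth scheme
        "Content-Type": "application/json",
        "X-Client-Name": "my-workflow",  # optional, see table above
    },
    json={
        "model": "command-r-plus-08-2024",  # any compatible Cohere model
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=30,
)
response.raise_for_status()
# Response shape assumed: an assistant message holding a list of
# content blocks, each with a "text" field.
print(response.json()["message"]["content"][0]["text"])
```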
Request body fields
Name | Type | Required | Description |
---|---|---|---|
stream | Boolean | No | Defaults to `false`. When `true`, the response is delivered incrementally as server-sent events. Streaming is beneficial for user interfaces that render the contents of the response piece by piece, as it gets generated. (A streaming sketch follows this table.) |
k | Integer | No | Ensures that only the top `k` most likely tokens are considered for generation at each step. When set to `0`, top-k sampling is disabled. Default value: 0 |
safety_mode | String | No | Used to select the safety instruction inserted into the prompt. Defaults to `CONTEXTUAL`. When `OFF` is specified, the safety instruction is omitted. Safety modes are not yet configurable in combination with the `tools` and `documents` parameters. Note: This parameter is only compatible with newer Cohere models, starting with Command R 08-2024 and Command R+ 08-2024. Possible values: `CONTEXTUAL`, `STRICT`, `OFF` |
temperature | Number | No | Defaults to `0.3`. A non-negative float that tunes the degree of randomness in generation. Lower temperatures mean less random generations, and higher temperatures mean more random generations. Randomness can be further maximized by increasing the value of the `p` parameter. (The sampling sketch after this table exercises these controls.) |
p | Number | No | Ensures that only the most likely tokens, with total probability mass of `p`, are considered for generation at each step. If both `k` and `p` are enabled, `p` acts after `k`. Default value: 0.75 |
tools[] | Array | No | A list of tools (functions) available to the model. The model response may contain 'tool_calls' to the specified tools. Learn more in the Tool Use guide. (A tool-definition sketch follows this table.) |
tools[].function | Object | No | The function to be executed. |
tools[].function.name | String | Yes | The name of the function. |
tools[].function.description | String | No | The description of the function. |
tools[].function.parameters | Object | Yes | The parameters of the function as a JSON schema. |
tools[].type | String | No | Possible values: `function` |
response_format | Object | No | Configuration for forcing the model output to adhere to the specified format. Supported on Command R, Command R+ and newer models. The model can be forced into outputting JSON objects by setting `{ "type": "json_object" }`. A JSON Schema can optionally be provided, to ensure a specific structure. Note: When using `json_object` mode, your message should always explicitly instruct the model to generate a JSON; otherwise the model may generate an unbounded stream of characters and run out of context length. Note: When a schema is not specified, the generated object can have any structure. Limitation: The parameter is not supported when used in combination with the `tools` parameter. (A structured-output sketch follows this table.) |
tool_choice | String | No | Used to control whether or not the model will be forced to use a tool when answering. When `REQUIRED` is specified, the model is forced to call at least one of the supplied tools; when `NONE` is specified, the model is forced to answer directly without calling a tool. If unset, the model decides for itself. Note: This parameter is only compatible with models Command-r7b and newer. Note: The same functionality can be achieved in v1 using the `force_single_step` parameter. Possible values: `REQUIRED`, `NONE` |
model | String | Yes | The name of a compatible Cohere model or the ID of a fine-tuned model. |
frequency_penalty | Number | No | Defaults to `0.0`. Used to reduce repetitiveness of generated tokens. The higher the value, the stronger the penalty applied to previously present tokens, proportional to how many times they have already appeared in the prompt or prior generation. |
reasoning_effort | String | No | The reasoning effort level of the model. This affects the model's performance and the time it takes to generate a response. |
documents[] | Array | No | A list of relevant documents that the model can cite to generate a more accurate reply. Each document is either a string or a document object with content and metadata. (A RAG sketch follows this table.) |
seed | Integer | No | If specified, the backend will make a best effort to sample tokens deterministically, such that repeated requests with the same seed and parameters should return the same result. However, determinism cannot be totally guaranteed. |
messages[] | Array | Yes | A list of chat messages in chronological order, representing a conversation between the user and the model. Messages can have the `user`, `assistant`, `tool`, or `system` role. |
strict_tools | Boolean | No | When set to `true`, tool calls are forced to strictly follow the provided tool schemas. Note: The first few requests with a new set of tools will take longer to process. |
citation_options | Object | No | Options for controlling citation generation. |
citation_options.mode | String | No | Defaults to `ACCURATE`. Dictates the approach taken to generating citations as part of the RAG flow. Possible values: `FAST`, `ACCURATE`, `OFF` |
presence_penalty | Number | No | Defaults to `0.0`. Used to reduce repetitiveness of generated tokens. Similar to `frequency_penalty`, except that this penalty is applied equally to all tokens that have already appeared, regardless of their exact frequencies. |
max_tokens | Integer | No | The maximum number of tokens the model will generate as part of the response. Note: Setting a low value may result in incomplete generations. |
logprobs | Boolean | No | Defaults to `false`. When set to `true`, the log probabilities of the generated tokens are included in the response. |
stop_sequences[] | Array | No | A list of up to 5 strings that the model will use to stop generation. If the model generates a string that matches any of the strings in the list, it will stop generating tokens and return the generated text up to that point, not including the stop sequence. |
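The `stream` field above switches the response to server-sent events. A consumption sketch follows; the event type name (`content-delta`) and the delta shape are assumptions, so consult the Text Generation guides for the authoritative event schema.

```python
# Sketch of consuming the SSE stream produced by stream=true.
# The "content-delta" event name and delta shape are assumptions.
import json
import os
import requests

with requests.post(
    "https://api.cohere.com/v2/chat",
    headers={"Authorization": f"Bearer {os.environ['COHERE_API_KEY']}"},
    json={
        "model": "command-r-plus-08-2024",
        "messages": [{"role": "user", "content": "Write a haiku about rain."}],
        "stream": True,
    },
    stream=True,  # tell requests not to buffer the whole body
    timeout=60,
) as response:
    response.raise_for_status()
    for line in response.iter_lines():
        if not line or not line.startswith(b"data:"):
            continue  # skip blank keep-alives and non-data SSE lines
        event = json.loads(line[len(b"data:"):].strip())
        if event.get("type") == "content-delta":
            print(event["delta"]["message"]["content"]["text"], end="", flush=True)
    print()
```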
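Putting the `tools[]` fields together, a request could look like the sketch below. The `get_weather` tool is hypothetical; only the field layout mirrors the schema in the table, and the shape of `tool_calls` on the returned message is an assumption.

```python
# Sketch of a tool-use request with a hypothetical "get_weather" tool.
import os
import requests

payload = {
    "model": "command-r-plus-08-2024",
    "messages": [{"role": "user", "content": "What's the weather in Toronto?"}],
    "strict_tools": True,  # force tool calls to follow the schema exactly
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # required
                "description": "Gets the current weather for a city.",
                "parameters": {  # required: a JSON schema
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}
response = requests.post(
    "https://api.cohere.com/v2/chat",
    headers={"Authorization": f"Bearer {os.environ['COHERE_API_KEY']}"},
    json=payload,
    timeout=30,
)
response.raise_for_status()
# If the model chose to call a tool, the calls are expected on the
# assistant message under "tool_calls" (shape assumed).
for call in response.json()["message"].get("tool_calls") or []:
    print(call["function"]["name"], call["function"]["arguments"])
```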
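A structured-output sketch for `response_format` follows. The nested `json_schema` key name is an assumption; note that the prompt explicitly asks for JSON, per the note in the table.

```python
# Sketch of response_format with an optional JSON Schema.
# The "json_schema" key name is an assumption.
import os
import requests

payload = {
    "model": "command-r-plus-08-2024",
    "messages": [
        # Explicitly instruct the model to generate JSON, per the note above.
        {"role": "user", "content": "Generate a JSON object describing a book, with title and author."}
    ],
    "response_format": {
        "type": "json_object",
        "json_schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "author": {"type": "string"},
            },
            "required": ["title", "author"],
        },
    },
}
response = requests.post(
    "https://api.cohere.com/v2/chat",
    headers={"Authorization": f"Bearer {os.environ['COHERE_API_KEY']}"},
    json=payload,
    timeout=30,
)
print(response.json())
```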
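A RAG sketch combining `documents[]` with `citation_options`. The object form of a document shown here (an `id` with a `data` payload) is an assumed reading of the "content and metadata" shape described in the table, and the `citations` field on the response is likewise assumed.

```python
# Sketch of grounded generation with documents and citation options.
import os
import requests

payload = {
    "model": "command-r-plus-08-2024",
    "messages": [{"role": "user", "content": "Where do emperor penguins live?"}],
    "documents": [
        "Emperor penguins breed on Antarctic sea ice.",  # plain-string form
        {  # object form: content plus metadata (shape assumed)
            "id": "doc-1",
            "data": {"text": "African penguins live near Cape Town.", "source": "example"},
        },
    ],
    "citation_options": {"mode": "ACCURATE"},
}
response = requests.post(
    "https://api.cohere.com/v2/chat",
    headers={"Authorization": f"Bearer {os.environ['COHERE_API_KEY']}"},
    json=payload,
    timeout=30,
)
message = response.json()["message"]
print(message["content"][0]["text"])
print(message.get("citations"))  # citation list, if any were generated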
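Finally, a payload sketch exercising the sampling and length controls from the table (`temperature`, `p`, `k`, `seed`, `max_tokens`, `stop_sequences`, and the two penalties). The values are illustrative only.

```python
# Sketch of the sampling and length controls; values are arbitrary.
import os
import requests

payload = {
    "model": "command-r-plus-08-2024",
    "messages": [{"role": "user", "content": "Name three sea birds."}],
    "temperature": 0.3,          # lower = less random
    "p": 0.75,                   # nucleus sampling mass
    "k": 40,                     # consider only the 40 most likely tokens
    "seed": 42,                  # best-effort determinism
    "max_tokens": 100,           # cap the length of the reply
    "stop_sequences": ["\n\n"],  # up to 5 strings that halt generation
    "frequency_penalty": 0.2,
    "presence_penalty": 0.1,
}
response = requests.post(
    "https://api.cohere.com/v2/chat",
    headers={"Authorization": f"Bearer {os.environ['COHERE_API_KEY']}"},
    json=payload,
    timeout=30,
)
print(response.json()["message"]["content"][0]["text"])
```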
How to start integrating
- Add an HTTP Task to your workflow definition.
- Search for the API you want to integrate with and click on the name.
- This loads the API reference documentation and prepares the HTTP request settings.
- Click Test request to send a test request to the API and see its response.