POST /v1/generate
This API is marked as "Legacy" and is no longer maintained. Follow the [migration guide](https://docs.cohere.com/docs/migrating-from-cogenerate-to-cochat) to start using the Chat API.

Generates realistic text conditioned on a given input.

Servers
- https://api.cohere.com
Request headers
Name | Type | Required | Description |
---|---|---|---|
`Content-Type` | String | Yes | The media type of the request body. Default value: `"application/json"` |
`X-Client-Name` | String | No | The name of the project that is making the request. |
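For illustration, a minimal sketch of building these headers in Python with the `requests` library's conventions. The `Authorization` header and the `CO_API_KEY` environment variable are assumptions based on Cohere's standard bearer-token authentication, which this reference section does not cover; the project name is hypothetical.

```python
import os

# Headers for a /v1/generate request. Content-Type and X-Client-Name are the
# two headers documented above; Authorization is an assumed bearer token.
headers = {
    "Content-Type": "application/json",      # required media type
    "X-Client-Name": "my-workflow-project",  # optional; hypothetical project name
    "Authorization": f"Bearer {os.environ['CO_API_KEY']}",  # assumed auth scheme
}
```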
Request body fields
Name | Type | Required | Description |
---|---|---|---|
`prompt` | String | Yes | The input text that serves as the starting point for generating the response. Note: the prompt will be pre-processed and modified before reaching the model. |
`num_generations` | Integer | No | The maximum number of generations that will be returned. Defaults to `1`, with a minimum of `1` and a maximum of `5`. |
`stream` | Boolean | No | When `true`, the response is streamed as a series of events. The final event will contain the complete response, and will contain an `is_finished` field set to `true`. |
`preset` | String | No | Identifier of a custom preset. A preset is a combination of parameters, such as prompt, temperature etc. You can create presets in the playground. When a preset is specified, the `prompt` parameter becomes optional, and any included parameters will override the preset's parameters. |
`k` | Integer | No | Ensures only the top `k` most likely tokens are considered for generation at each step. Defaults to `0` (disabled), with a maximum of `500`. |
`temperature` | Number | No | A non-negative float that tunes the degree of randomness in generation. Lower temperatures mean less random generations. See Temperature for more details. Defaults to `0.75`. |
`p` | Number | No | Ensures that only the most likely tokens, with total probability mass of `p`, are considered for generation at each step. If both `k` and `p` are set, `p` acts after `k`. Defaults to `0.75`. |
`truncate` | String | No | One of `NONE`, `START`, or `END`, specifying how the API handles inputs longer than the maximum token length. Passing `START` discards the start of the input and `END` discards the end, until the remaining input is exactly the maximum input token length for the model. If `NONE` is selected, an error is returned when the input exceeds the maximum input token length. Possible values: `NONE`, `START`, `END`. Default value: `"END"` |
`model` | String | No | The identifier of the model to generate with. Currently available models include `command` (default), `command-nightly` (experimental), `command-light`, and `command-light-nightly` (experimental). Smaller, "light" models are faster, while larger models generally perform better. |
`frequency_penalty` | Number | No | Used to reduce repetitiveness of generated tokens. The higher the value, the stronger a penalty is applied to previously present tokens, proportional to how many times they have already appeared in the prompt or prior generation. |
`seed` | Integer | No | If specified, the backend will make a best effort to sample tokens deterministically, such that repeated requests with the same seed and parameters should return the same result. However, determinism cannot be totally guaranteed. Compatible deployments: Cohere Platform, Azure, AWS SageMaker/Bedrock, Private Deployments. |
`return_likelihoods` | String | No | One of `GENERATION`, `ALL`, or `NONE`, specifying how and whether token likelihoods are returned with the response. If `GENERATION` is selected, likelihoods are provided only for the generated text; if `ALL` is selected, they are provided for both the prompt and the generated text. Possible values: `GENERATION`, `ALL`, `NONE`. Default value: `"NONE"` |
`end_sequences[]` | Array | No | The generated text will be cut at the beginning of the earliest occurrence of an end sequence. The sequence will be excluded from the text. |
`presence_penalty` | Number | No | Defaults to `0.0`. Can be used to reduce repetitiveness of generated tokens. Similar to `frequency_penalty`, except that this penalty is applied equally to all tokens that have already appeared, regardless of their exact frequencies. |
`max_tokens` | Integer | No | The maximum number of tokens the model will generate as part of the response. Note: setting a low value may result in incomplete generations. This parameter is off by default; if it is not specified, the model will continue generating until it emits an EOS completion token. See BPE Tokens for more details. Can only be set to `0` if `return_likelihoods` is set to `ALL`, to get the likelihood of the prompt. |
`stop_sequences[]` | Array | No | The generated text will be cut at the end of the earliest occurrence of a stop sequence. The sequence will be included in the text. |
`raw_prompting` | Boolean | No | When enabled, the user's prompt will be sent to the model without any pre-processing. |
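To make the parameter reference concrete, here is a sketch of a complete request in Python with `requests`. The parameter values are illustrative, not recommendations, and the `Authorization` header is the same assumed bearer-token scheme as above.

```python
import os
import requests

url = "https://api.cohere.com/v1/generate"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['CO_API_KEY']}",  # assumed auth scheme
}

# A representative body: `prompt` is the only required field; the rest
# illustrate the parameters documented in the table above.
body = {
    "prompt": "Write a one-line product tagline for a reusable water bottle.",
    "model": "command",          # one of the documented model identifiers
    "num_generations": 2,        # return up to 2 candidate generations
    "max_tokens": 50,            # cap the response length
    "temperature": 0.75,         # documented default
    "stop_sequences": ["\n\n"],  # cut generation at the first blank line
    "return_likelihoods": "NONE",
}

response = requests.post(url, headers=headers, json=body)
response.raise_for_status()
print(response.json())
```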
How to start integrating
- Add an HTTP Task to your workflow definition.
- Search for the API you want to integrate with and click its name.
- This loads the API reference documentation and prepares the HTTP request settings.
- Click Test request to run your request against the API and see its response.
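For reference when inspecting a test run's output: a minimal sketch of reading the generated text, continuing from the request sketch above. It assumes the documented v1 Generate response shape, in which a `generations` array carries each candidate's `text`; no other response fields are assumed.

```python
# Pull the generated candidates out of the response from the request above.
data = response.json()
for generation in data["generations"]:
    print(generation["text"].strip())
```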