POST /chat/{assistant_name}
Chat with an assistant and get back citations in structured form.
This is the recommended way to chat with an assistant, as it offers more functionality and control over the assistant's responses and references than the OpenAI-compatible chat interface.
For guidance and examples, see Chat with an assistant.
Servers
- https://{assistant_host}
Path parameters
| Name | Type | Required | Description |
|---|---|---|---|
| assistant_name | String | Yes | The name of the assistant to chat with. |
Request headers
| Name | Type | Required | Description |
|---|---|---|---|
| Content-Type | String | Yes | The media type of the request body. Default value: "application/json" |
| X-Pinecone-Api-Version | String | Yes | Required date-based API version header. Default value: "2025-10" |
Request body fields
| Name | Type | Required | Description |
|---|---|---|---|
| stream | Boolean | No | If false, the assistant returns a single JSON response. If true, the assistant returns a stream of responses. Default value: false |
| include_highlights | Boolean | No | If true, the assistant is instructed to return highlights from the referenced documents that support its response. Default value: false |
| context_options | Object | No | Controls the context snippets sent to the LLM. |
| context_options.top_k | Integer | No | The maximum number of context snippets to use. Default is 16. Maximum is 64. |
| context_options.snippet_size | Integer | No | The maximum context snippet size. Default is 2048 tokens. Minimum is 512 tokens. Maximum is 8192 tokens. |
| context_options.multimodal | Boolean | No | Whether or not to send image-related context snippets to the LLM. Default value: true |
| context_options.include_binary_content | Boolean | No | If image-related context snippets are sent to the LLM, this field determines whether or not they include base64 image data. Default value: true |
| messages[] | Array | Yes | The list of messages in the conversation. |
| messages[].content | String | No | Content of the message. |
| messages[].role | String | No | Role of the message, such as 'user' or 'assistant'. |
| temperature | Number | No | Controls the randomness of the model's output: lower values make responses more deterministic, while higher values increase creativity and variability. If the model does not support a temperature parameter, it is ignored. Default value: 0.0 |
| filter | Object | No | Optionally filter which documents can be retrieved using metadata fields. |
| model | String | No | The large language model to use for answer generation. Default value: "gpt-4o" |
| json_response | Boolean | No | If true, the assistant is instructed to return a JSON response. Cannot be used with streaming. Default value: false |
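As a minimal sketch of how the fields above fit together, the following Python (standard library only) builds a non-streaming request to this endpoint. The host, assistant name, and API key values are placeholders, and the `Api-Key` authentication header is an assumption not documented in the header table above — substitute whatever authentication your deployment requires.

```python
import json
import urllib.request

# Hypothetical values -- replace with your own assistant host,
# assistant name, and API key.
ASSISTANT_HOST = "example-host.pinecone.io"
ASSISTANT_NAME = "example-assistant"
API_KEY = "YOUR_API_KEY"

# Request body using the fields documented in the table above.
payload = {
    "messages": [
        {"role": "user", "content": "What is the refund policy?"}
    ],
    "stream": False,
    "model": "gpt-4o",
    "temperature": 0.0,
    "context_options": {"top_k": 16, "snippet_size": 2048},
    # Restrict retrieval to documents whose metadata matches this filter.
    "filter": {"category": "docs"},
}

request = urllib.request.Request(
    url=f"https://{ASSISTANT_HOST}/chat/{ASSISTANT_NAME}",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "X-Pinecone-Api-Version": "2025-10",
        "Api-Key": API_KEY,  # assumed auth header; adjust as needed
    },
    method="POST",
)

# Uncomment to send the request and print the structured response:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Note that `json_response` cannot be combined with `stream: true`; to stream, set `"stream": True` and consume the response incrementally instead of reading it in one call.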
How to start integrating
- Add an HTTP Task to your workflow definition.
- Search for the API you want to integrate with and click its name.
- This loads the API reference documentation and prepares the HTTP request settings.
- Click Test request to test run your request to the API and see the API's response.