POST /v2/dedicated-inferences
Creates a new Dedicated Inference for your team. Send a POST request to
/v2/dedicated-inferences with a spec object (version, name, region, vpc,
enable_public_endpoint, model_deployments) and optional access_tokens (e.g.
hugging_face_token for gated models). A 202 Accepted response indicates only
that the request was accepted for processing; it does not indicate success or
failure. Token values are returned only on create; store them securely.
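The request shape described above can be sketched with the standard library alone. This is an illustrative example, not an official client: the region, model slug, GPU slug, and UUID values are placeholder assumptions, and the environment variable names (`DIGITALOCEAN_TOKEN`, `HF_TOKEN`) are conventions chosen for the sketch.

```python
import json
import os
import urllib.request

# Illustrative payload; region, slugs, scale, and UUID are placeholder
# assumptions, not confirmed valid values for this API.
payload = {
    "spec": {
        "version": 1,
        "name": "example-inference",
        "region": "nyc2",
        "enable_public_endpoint": True,
        "vpc": {"uuid": "11111111-2222-3333-4444-555555555555"},
        "model_deployments": [
            {
                "model_slug": "meta-llama/Llama-3.1-8B-Instruct",
                "model_provider": "hugging_face",
                "accelerators": [
                    {
                        "type": "prefill_decode",
                        "scale": 1,
                        "accelerator_slug": "gpu-h100x1",
                    }
                ],
            }
        ],
    },
    # Optional: provider tokens, e.g. for gated Hugging Face models.
    "access_tokens": {"hugging_face_token": os.environ.get("HF_TOKEN", "")},
}

req = urllib.request.Request(
    "https://api.digitalocean.com/v2/dedicated-inferences",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ.get('DIGITALOCEAN_TOKEN', '')}",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it. A 202 Accepted response means
# the request was queued for processing, not that provisioning succeeded.
```

Note that the token returned on create never comes back in later reads, so the response to this call is the only chance to capture and store it.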
Servers
- https://api.digitalocean.com
Request headers
| Name | Type | Required | Description |
|---|---|---|---|
| Content-Type | String | Yes | The media type of the request body. Default value: `application/json` |
Request body fields
| Name | Type | Required | Description |
|---|---|---|---|
| access_tokens | Object | No | Key-value pairs for provider tokens (e.g. Hugging Face). |
| spec | Object | Yes | Structured configuration for a Dedicated Inference deployment. |
| spec.region | String | Yes | DigitalOcean region where the Dedicated Inference is hosted. Valid values: |
| spec.name | String | Yes | Name of the Dedicated Inference. Must be unique within the team. |
| spec.model_deployments[] | Array | Yes | At least one model deployment is required. |
| spec.model_deployments[].workload_config | Object | No | Workload-specific configuration (e.g. ISL/OSL in future). |
| spec.model_deployments[].model_id | String | No | Identifies an existing deployment when updating; empty means create new. |
| spec.model_deployments[].accelerators[] | Array | No | Accelerator configuration for this deployment. |
| spec.model_deployments[].accelerators[].type | String | Yes | Accelerator type (e.g. prefill_decode). |
| spec.model_deployments[].accelerators[].status | String | No | Current state of the accelerator. Valid values: |
| spec.model_deployments[].accelerators[].scale | Integer | Yes | Number of accelerator instances. |
| spec.model_deployments[].accelerators[].accelerator_slug | String | Yes | DigitalOcean GPU slug. |
| spec.model_deployments[].model_slug | String | No | Model identifier (e.g. Hugging Face slug). |
| spec.model_deployments[].model_provider | String | No | Model provider. Valid values: |
| spec.version | Integer | Yes | Spec version. |
| spec.enable_public_endpoint | Boolean | Yes | Whether to expose a public LLM endpoint. |
| spec.vpc | Object | Yes | VPC configuration for the Dedicated Inference. |
| spec.vpc.uuid | String | Yes | VPC UUID for the Dedicated Inference. |
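A quick client-side check against the Required column above can catch an incomplete spec before the request is sent. The helper below is a sketch based only on the field list in this document; it checks field presence, not the API's full schema or value constraints, and the function name is an invention for this example.

```python
# Required fields taken from the table above.
REQUIRED_SPEC = ("version", "name", "region",
                 "enable_public_endpoint", "vpc", "model_deployments")
REQUIRED_ACCELERATOR = ("type", "scale", "accelerator_slug")


def missing_spec_fields(spec):
    """Return dotted paths of required fields absent from a spec dict."""
    missing = [f"spec.{f}" for f in REQUIRED_SPEC if f not in spec]
    if "vpc" in spec and "uuid" not in spec["vpc"]:
        missing.append("spec.vpc.uuid")
    for i, dep in enumerate(spec.get("model_deployments", [])):
        for j, acc in enumerate(dep.get("accelerators", [])):
            missing += [
                f"spec.model_deployments[{i}].accelerators[{j}].{f}"
                for f in REQUIRED_ACCELERATOR if f not in acc
            ]
    return missing


# Example: a spec missing everything but its name and an empty vpc.
problems = missing_spec_fields({"name": "demo", "vpc": {}})
```

Here `problems` would list `spec.version`, `spec.region`, `spec.enable_public_endpoint`, `spec.model_deployments`, and `spec.vpc.uuid`, but not `spec.name`.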