PATCH /v2/dedicated-inferences/{dedicated_inference_id}

Update an existing Dedicated Inference. Send a PATCH request to /v2/dedicated-inferences/{dedicated_inference_id} with an updated spec and/or access_tokens. The instance's status moves to updating and returns to active once the update completes.
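As a sketch, the request can be constructed with Python's standard library. The API host, token, instance ID, and payload values below are placeholders, not real resources; the host is assumed to be the standard DigitalOcean API endpoint:

```python
import json
import urllib.request

API_TOKEN = "dop_v1_example_token"  # placeholder; substitute your API token
INFERENCE_ID = "123e4567-e89b-12d3-a456-426614174000"  # placeholder instance ID

# Partial update: only the fields being changed need to be sent.
payload = {
    "spec": {
        "name": "my-inference",
        # ... remaining spec fields as required by the API
    }
}

req = urllib.request.Request(
    url=f"https://api.digitalocean.com/v2/dedicated-inferences/{INFERENCE_ID}",
    data=json.dumps(payload).encode(),
    method="PATCH",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_TOKEN}",
    },
)
# urllib.request.urlopen(req) would send the request; the returned instance
# should report status "updating" until the change is applied.
```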

Path parameters

Name Type Required Description
dedicated_inference_id String Yes

A unique identifier for a Dedicated Inference instance.

Request headers

Name Type Required Description
Content-Type String Yes The media type of the request body.

Default value: "application/json"

Request body fields

Name Type Required Description
access_tokens Object No

Provider tokens for model access (e.g. gated Hugging Face models).

access_tokens.hugging_face_token String No

Hugging Face token required for gated models.

spec Object No

Structured configuration for a Dedicated Inference deployment.

spec.region String Yes

DigitalOcean region where the Dedicated Inference is hosted.

Valid values:

  • "atl1"
  • "nyc2"
  • "tor1"

spec.name String Yes

Name of the Dedicated Inference. Must be unique within the team.

spec.model_deployments[] Array Yes

List of model deployments. At least one model deployment is required.

spec.model_deployments[].workload_config Object No

Workload-specific configuration (e.g. ISL/OSL in a future release).

spec.model_deployments[].model_id String No

Identifies an existing deployment to update; leave empty to create a new deployment.

spec.model_deployments[].accelerators[] Array No

Accelerator configuration for this deployment.

spec.model_deployments[].accelerators[].type String Yes

Accelerator type (e.g. prefill_decode).

spec.model_deployments[].accelerators[].status String No

Current state of the Accelerator.

Valid values:

  • "active"
  • "provisioning"
  • "new"

spec.model_deployments[].accelerators[].scale Integer Yes

Number of accelerator instances.

spec.model_deployments[].accelerators[].accelerator_slug String Yes

DigitalOcean GPU slug.

spec.model_deployments[].model_slug String No

Model identifier (e.g. Hugging Face slug).

spec.model_deployments[].model_provider String No

Model provider.

Valid values:

  • "hugging_face"

spec.version Integer Yes

Spec version.

spec.enable_public_endpoint Boolean Yes

Whether to expose a public LLM endpoint.

spec.vpc Object Yes

VPC configuration for the Dedicated Inference.

spec.vpc.uuid String Yes

VPC UUID for the Dedicated Inference.
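Putting the fields above together, a complete request body might look like the following. Every value here is illustrative (the model slug, accelerator slug, UUID, and token are examples, not real resources):

```python
import json

# Illustrative PATCH body covering the required spec fields.
body = {
    "spec": {
        "name": "example-inference",
        "region": "atl1",
        "version": 1,
        "enable_public_endpoint": True,
        "vpc": {"uuid": "11111111-2222-3333-4444-555555555555"},
        "model_deployments": [
            {
                "model_provider": "hugging_face",
                "model_slug": "meta-llama/Llama-3.1-8B-Instruct",  # example slug
                "accelerators": [
                    {
                        "type": "prefill_decode",
                        "scale": 1,
                        "accelerator_slug": "gpu-h100x1-80gb",  # example slug
                    }
                ],
            }
        ],
    },
    # Only needed for gated models.
    "access_tokens": {"hugging_face_token": "hf_example_token"},
}

print(json.dumps(body, indent=2))
```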
