Update Inference Instance

PUT /cloud/v2/inference/deployments/{instance_id}

curl --request PUT \
  --url https://api.gcore.com/cloud/v2/inference/deployments/{instance_id} \
  --header 'Authorization: apikey <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "api_key_ids": [
    "3fa85f64-5717-4562-b3fc-2c963f66afa6"
  ],
  "auth_enabled": false,
  "command": [
    "nginx",
    "-g",
    "daemon off;"
  ],
  "containers": [
    {
      "region_id": 7,
      "scale": {
        "cooldown_period": 60,
        "max": 3,
        "min": 1,
        "triggers": {
          "cpu": {
            "threshold": 80
          },
          "memory": {
            "threshold": 70
          }
        }
      }
    }
  ],
  "description": "My first instance",
  "envs": {
    "DEBUG_MODE": "False",
    "KEY": "12345"
  },
  "flavor_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "image": "nginx:latest",
  "image_registry_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "listening_port": 8080,
  "name": "my-instance",
  "probes": {
    "liveness_probe": {
      "enabled": true,
      "probe": {
        "exec": {
          "command": [
            "ls",
            "-l"
          ]
        },
        "failure_threshold": 3,
        "http_get": {
          "headers": {
            "Authorization": "Bearer token 123"
          },
          "host": "127.0.0.1",
          "path": "/healthz",
          "port": 80,
          "schema": "HTTP"
        },
        "initial_delay_seconds": 0,
        "period_seconds": 5,
        "success_threshold": 1,
        "tcp_socket": {
          "port": 80
        },
        "timeout_seconds": 1
      }
    },
    "readiness_probe": {
      "enabled": true,
      "probe": {
        "exec": {
          "command": [
            "ls",
            "-l"
          ]
        },
        "failure_threshold": 3,
        "http_get": {
          "headers": {
            "Authorization": "Bearer token 123"
          },
          "host": "127.0.0.1",
          "path": "/healthz",
          "port": 80,
          "schema": "HTTP"
        },
        "initial_delay_seconds": 0,
        "period_seconds": 5,
        "success_threshold": 1,
        "tcp_socket": {
          "port": 80
        },
        "timeout_seconds": 1
      }
    },
    "startup_probe": {
      "enabled": true,
      "probe": {
        "exec": {
          "command": [
            "ls",
            "-l"
          ]
        },
        "failure_threshold": 3,
        "http_get": {
          "headers": {
            "Authorization": "Bearer token 123"
          },
          "host": "127.0.0.1",
          "path": "/healthz",
          "port": 80,
          "schema": "HTTP"
        },
        "initial_delay_seconds": 0,
        "period_seconds": 5,
        "success_threshold": 1,
        "tcp_socket": {
          "port": 80
        },
        "timeout_seconds": 1
      }
    }
  },
  "project_id": 1,
  "timeout": 120
}'

Response example

{
  "address": "https://example.com",
  "api_key_ids": [
    "3fa85f64-5717-4562-b3fc-2c963f66afa6"
  ],
  "auth_enabled": false,
  "command": [
    "nginx",
    "-g",
    "daemon off;"
  ],
  "containers": [
    {
      "deploy_status": {
        "ready": 1,
        "total": 3
      },
      "region_id": 7,
      "scale": {
        "cooldown_period": 60,
        "max": 3,
        "min": 1,
        "triggers": {
          "cpu": {
            "threshold": 80
          },
          "memory": {
            "threshold": 70
          }
        }
      }
    }
  ],
  "created_at": "2023-08-22T11:21:00Z",
  "description": "My first instance",
  "envs": {
    "DEBUG_MODE": "False",
    "KEY": "12345"
  },
  "flavor_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "image": "nginx:latest",
  "image_registry_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "listening_port": 8080,
  "name": "my-instance",
  "probes": {
    "liveness_probe": {
      "enabled": true,
      "probe": {
        "exec": {
          "command": [
            "ls",
            "-l"
          ]
        },
        "failure_threshold": 3,
        "http_get": {
          "headers": {
            "Authorization": "Bearer token 123"
          },
          "host": "127.0.0.1",
          "path": "/healthz",
          "port": 80,
          "schema": "HTTP"
        },
        "initial_delay_seconds": 0,
        "period_seconds": 5,
        "success_threshold": 1,
        "tcp_socket": {
          "port": 80
        },
        "timeout_seconds": 1
      }
    },
    "readiness_probe": {
      "enabled": true,
      "probe": {
        "exec": {
          "command": [
            "ls",
            "-l"
          ]
        },
        "failure_threshold": 3,
        "http_get": {
          "headers": {
            "Authorization": "Bearer token 123"
          },
          "host": "127.0.0.1",
          "path": "/healthz",
          "port": 80,
          "schema": "HTTP"
        },
        "initial_delay_seconds": 0,
        "period_seconds": 5,
        "success_threshold": 1,
        "tcp_socket": {
          "port": 80
        },
        "timeout_seconds": 1
      }
    },
    "startup_probe": {
      "enabled": true,
      "probe": {
        "exec": {
          "command": [
            "ls",
            "-l"
          ]
        },
        "failure_threshold": 3,
        "http_get": {
          "headers": {
            "Authorization": "Bearer token 123"
          },
          "host": "127.0.0.1",
          "path": "/healthz",
          "port": 80,
          "schema": "HTTP"
        },
        "initial_delay_seconds": 0,
        "period_seconds": 5,
        "success_threshold": 1,
        "tcp_socket": {
          "port": 80
        },
        "timeout_seconds": 1
      }
    }
  },
  "status": "ACTIVE",
  "timeout": 120
}

Authorizations

Authorization
string
header
required

API key for authentication. Make sure to include the word apikey, followed by a single space and then your token. Example: apikey 1234$abcdef
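
For example, the Authorization header passed to curl looks like this (using the placeholder token from the example above):

  --header 'Authorization: apikey 1234$abcdef'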

Path Parameters

instance_id
string
required

Inference Instance ID

Body

application/json
containers
ContainerInSerializer · object[]
required

List of containers for the inference instance.

Minimum length: 1
Examples:
[
  {
    "region_id": 7,
    "scale": {
      "cooldown_period": 60,
      "max": 3,
      "min": 1,
      "triggers": {
        "cpu": { "threshold": 80 },
        "memory": { "threshold": 70 }
      }
    }
  }
]
flavor_id
string<uuid>
required

Flavor ID for the inference instance.

Examples:

"3fa85f64-5717-4562-b3fc-2c963f66afa6"

image
string
required

Docker image for the inference instance. This field should contain the image name and tag in the format 'name:tag', e.g., 'nginx:latest'. It defaults to Docker Hub as the image registry, but any accessible Docker image URL can be specified.

Examples:

"nginx:latest"

listening_port
integer
required

Listening port for the inference instance.

Required range: 1 <= x <= 65535
Examples:

8080

name
string
required

Inference instance name.

Required string length: 4 - 30
Examples:

"my-instance"

api_key_ids
string<uuid>[]

List of API key IDs to attach to the inference instance.

Examples:
["3fa85f64-5717-4562-b3fc-2c963f66afa6"]
auth_enabled
boolean
default:false

Set to true to enable API key authentication for the inference instance. Manage API keys through the '/v1/inference_instances/keys' endpoint.

Examples:

false
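
For example, to require API key authentication, the request body could set this flag together with api_key_ids (combining the two fields this way is an assumption drawn from the example request, not an explicitly documented requirement):

  "auth_enabled": true,
  "api_key_ids": ["3fa85f64-5717-4562-b3fc-2c963f66afa6"]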

command
string[] | null

Command to be executed when running a container from an image.

Examples:
["nginx", "-g", "daemon off;"]
description
string | null

Inference instance description.

Examples:

"My first instance"

envs
object

Environment variables for the inference instance.

Examples:
{ "DEBUG_MODE": "False", "KEY": "12345" }
image_registry_id
string<uuid> | null

Image registry ID for authentication in private registries. Leave this parameter empty if no authentication is required for the repository.

Examples:

"3fa85f64-5717-4562-b3fc-2c963f66afa6"

probes
object | null

Probes configured for all containers of the inference instance. If probes are not provided and the image is from the Model Catalog registry, the default probes are used.
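
A minimal sketch of the probes object, reduced from the full request example above to a single HTTP liveness check (the request example sets exec, http_get, and tcp_socket together; whether one check type alone is sufficient is an assumption here):

  "probes": {
    "liveness_probe": {
      "enabled": true,
      "probe": {
        "http_get": {
          "path": "/healthz",
          "port": 80,
          "schema": "HTTP"
        },
        "failure_threshold": 3,
        "period_seconds": 5,
        "success_threshold": 1,
        "timeout_seconds": 1
      }
    }
  }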

project_id
integer | null

Project ID. If not provided, your default project ID will be used.

Examples:

1

timeout
integer | null
default:120

Duration in seconds without any requests after which the containers are scaled down to the minimum defined by scale.min. This helps optimize resource usage by reducing the number of container instances during periods of inactivity. If the parameter is not set, the default value is 120.

Required range: x >= 0
Examples:

120
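
For instance, with the values from the request example above, containers that receive no requests for 120 seconds are scaled down to the scale.min of 1 defined per container (excerpt from the request body, with the scaling triggers kept as in the example):

  "timeout": 120,
  "containers": [
    {
      "region_id": 7,
      "scale": {
        "cooldown_period": 60,
        "min": 1,
        "max": 3,
        "triggers": {
          "cpu": { "threshold": 80 },
          "memory": { "threshold": 70 }
        }
      }
    }
  ]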

Response

Inference instance

containers
ContainerOutSerializer · object[]
required

List of containers for the inference instance

Examples:
[
  {
    "deploy_status": { "ready": 1, "total": 3 },
    "region_id": 7,
    "scale": {
      "cooldown_period": 60,
      "max": 3,
      "min": 1,
      "triggers": {
        "cpu": { "threshold": 80 },
        "memory": { "threshold": 70 }
      }
    }
  }
]
flavor_id
string<uuid>
required

Flavor ID for the inference instance

Examples:

"3fa85f64-5717-4562-b3fc-2c963f66afa6"

id
string<uuid>
required

Inference instance ID.

Examples:

"3fa85f64-5717-4562-b3fc-2c963f66afa6"

image
string
required

Docker image for the inference instance. This field should contain the image name and tag in the format 'name:tag', e.g., 'nginx:latest'. It defaults to Docker Hub as the image registry, but any accessible Docker image URL can be specified.

Examples:

"nginx:latest"

listening_port
integer
required

Listening port for the inference instance.

Examples:

8080

name
string
required

Inference instance name.

Examples:

"my-instance"

status
enum<string>
required

Inference instance status

Available options:
ACTIVE,
DELETED,
DELETING,
DEPLOYING,
DISABLED,
ERROR,
FAILED,
NEW,
PARTIALLYDEPLOYED,
PENDING
address
string<uri> | null

Address of the inference instance

Minimum length: 1
Examples:

"https://example.com"

api_key_ids
string<uuid>[]

List of API key IDs attached to the inference instance.

Examples:
["3fa85f64-5717-4562-b3fc-2c963f66afa6"]
auth_enabled
boolean
default:false

Set to true if the instance uses API key authentication. Manage API keys through the '/v1/inference_instances/keys' endpoint.

Examples:

false

command
string[] | null

Command to be executed when running a container from an image.

Examples:
["nginx", "-g", "daemon off;"]
created_at
string<date-time> | null

Inference instance creation date in ISO 8601 format.

Examples:

"2023-08-22T11:21:00Z"

description
string | null

Inference instance description.

Examples:

"My first instance"

envs
object | null

Environment variables for the inference instance

Examples:
{ "DEBUG_MODE": "False", "KEY": "12345" }
image_registry_id
string<uuid> | null

Image registry ID for authentication in private registries. This parameter is empty if no authentication is required for the repository.

Examples:

"3fa85f64-5717-4562-b3fc-2c963f66afa6"

probes
object | null

Probes configured for all containers of the inference instance.

timeout
integer | null
default:120

Duration in seconds without any requests after which the containers are scaled down to the minimum defined by scale.min. This helps optimize resource usage by reducing the number of container instances during periods of inactivity.

Required range: x >= 0
Examples:

120