GET /cloud/v3/inference/{project_id}/deployments/{deployment_name}
Python
import os
from gcore import Gcore

client = Gcore(
    api_key=os.environ.get("GCORE_API_KEY"),  # This is the default and can be omitted
)
inference = client.cloud.inference.deployments.get(
    deployment_name="my-instance",
    project_id=1,
)
print(inference.project_id)
{
  "address": "https://example.com",
  "auth_enabled": false,
  "command": [
    "nginx",
    "-g",
    "daemon off;"
  ],
  "containers": [
    {
      "deploy_status": {
        "ready": 1,
        "total": 3
      },
      "region_id": 1,
      "scale": {
        "cooldown_period": 60,
        "max": 3,
        "min": 1,
        "triggers": {
          "cpu": {
            "threshold": 80
          },
          "memory": {
            "threshold": 70
          }
        }
      }
    }
  ],
  "created_at": "2023-08-22T11:21:00Z",
  "credentials_name": "dockerhub",
  "description": "My first instance",
  "envs": {
    "DEBUG_MODE": "False",
    "KEY": "12345"
  },
  "flavor_name": "inference-16vcpu-232gib-1xh100-80gb",
  "image": "nginx:latest",
  "ingress_opts": {
    "disable_response_buffering": true
  },
  "listening_port": 8080,
  "logging": {
    "destination_region_id": 1,
    "enabled": true,
    "retention_policy": {
      "period": 45
    },
    "topic_name": "my-log-name"
  },
  "name": "my-instance",
  "probes": {
    "liveness_probe": {
      "enabled": true,
      "probe": {
        "exec": {
          "command": [
            "ls",
            "-l"
          ]
        },
        "failure_threshold": 3,
        "http_get": {
          "headers": {
            "Authorization": "Bearer token 123"
          },
          "host": "127.0.0.1",
          "path": "/healthz",
          "port": 80,
          "schema": "HTTP"
        },
        "initial_delay_seconds": 0,
        "period_seconds": 5,
        "success_threshold": 1,
        "tcp_socket": {
          "port": 80
        },
        "timeout_seconds": 1
      }
    },
    "readiness_probe": {
      "enabled": true,
      "probe": {
        "exec": {
          "command": [
            "ls",
            "-l"
          ]
        },
        "failure_threshold": 3,
        "http_get": {
          "headers": {
            "Authorization": "Bearer token 123"
          },
          "host": "127.0.0.1",
          "path": "/healthz",
          "port": 80,
          "schema": "HTTP"
        },
        "initial_delay_seconds": 0,
        "period_seconds": 5,
        "success_threshold": 1,
        "tcp_socket": {
          "port": 80
        },
        "timeout_seconds": 1
      }
    },
    "startup_probe": {
      "enabled": true,
      "probe": {
        "exec": {
          "command": [
            "ls",
            "-l"
          ]
        },
        "failure_threshold": 3,
        "http_get": {
          "headers": {
            "Authorization": "Bearer token 123"
          },
          "host": "127.0.0.1",
          "path": "/healthz",
          "port": 80,
          "schema": "HTTP"
        },
        "initial_delay_seconds": 0,
        "period_seconds": 5,
        "success_threshold": 1,
        "tcp_socket": {
          "port": 80
        },
        "timeout_seconds": 1
      }
    }
  },
  "project_id": 1,
  "status": "ACTIVE",
  "timeout": 120
}

Authorizations

Authorization
string
header
required

API key for authentication. Make sure to include the word apikey, followed by a single space and then your token. Example: apikey 1234$abcdef
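As a minimal sketch of the header format described above, the snippet below builds the `Authorization` value and prepares a raw GET request without sending it. The base URL `https://api.gcore.com` and the helper names `build_auth_header` and `build_request` are assumptions made for illustration; only the path and the `apikey <token>` format come from this page.

```python
import urllib.request

# Assumption: the API base URL. This page only documents the path.
BASE_URL = "https://api.gcore.com"


def build_auth_header(token: str) -> dict:
    """Return the header: the word 'apikey', a single space, then the token."""
    return {"Authorization": f"apikey {token}"}


def build_request(project_id: int, deployment_name: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) the GET request for one deployment."""
    url = f"{BASE_URL}/cloud/v3/inference/{project_id}/deployments/{deployment_name}"
    return urllib.request.Request(url, headers=build_auth_header(token), method="GET")


# Example token taken from the description above; use your real token in practice.
req = build_request(1, "my-instance", "1234$abcdef")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the prepared request to `urllib.request.urlopen(req)` would send it. The SDK client shown in the Python sample handles this header for you.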

Path Parameters

project_id
integer
required

Project ID

Examples:

1

deployment_name
string
required

Inference instance name.

Required string length: 4 - 30
Examples:

"my-instance"
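The length constraint above can be checked on the client before a request is made. The following is a sketch only: `deployment_path` is a hypothetical helper, and whether the bounds are inclusive is an assumption based on the "4 - 30" wording.

```python
from urllib.parse import quote

# "Required string length: 4 - 30" (assumed to be inclusive on both ends)
MIN_NAME_LEN, MAX_NAME_LEN = 4, 30


def deployment_path(project_id: int, deployment_name: str) -> str:
    """Validate the name length and return the request path for this endpoint."""
    if not MIN_NAME_LEN <= len(deployment_name) <= MAX_NAME_LEN:
        raise ValueError(
            f"deployment_name must be {MIN_NAME_LEN}-{MAX_NAME_LEN} characters, "
            f"got {len(deployment_name)}"
        )
    # Percent-encode the name so it is safe to place in a URL path segment.
    name = quote(deployment_name, safe="")
    return f"/cloud/v3/inference/{project_id}/deployments/{name}"


print(deployment_path(1, "my-instance"))
```

Validating locally avoids a round trip for names the API would reject anyway.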

Response

200 - application/json

OK

address
string<uri> | null
required

Address of the inference instance

Minimum length: 1
Examples:

"https://example.com"

auth_enabled
boolean
required

true if the instance uses API key authentication. When enabled, requests to the instance must include either an "Authorization": "Bearer *****" header or an "X-Api-Key": "*****" header.

Examples:

false
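A caller can use `auth_enabled` to decide which headers to attach when sending requests to the instance itself. This is a sketch under stated assumptions: `instance_headers` is a hypothetical helper, and the key passed in is the instance's own API key, which this endpoint does not return.

```python
from typing import Optional


def instance_headers(auth_enabled: bool, api_key: Optional[str] = None,
                     use_bearer: bool = True) -> dict:
    """Return the headers needed to call the inference instance."""
    if not auth_enabled:
        # No authentication header is needed when auth is disabled.
        return {}
    if not api_key:
        raise ValueError("auth_enabled is true, so an API key is required")
    if use_bearer:
        return {"Authorization": f"Bearer {api_key}"}
    return {"X-Api-Key": api_key}


print(instance_headers(False))
print(instance_headers(True, "secret", use_bearer=False))
```

Either header form is accepted according to the description above, so the `use_bearer` flag is only a matter of preference.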

command
string[] | null
required

Command to be executed when running a container from an image.

Examples:
["nginx", "-g", "daemon off;"]
containers
ContainerOutSerializerV3 · object[]
required

List of containers for the inference instance

Examples:
[
  {
    "deploy_status": { "ready": 1, "total": 3 },
    "region_id": 1,
    "scale": {
      "cooldown_period": 60,
      "max": 3,
      "min": 1,
      "triggers": {
        "cpu": { "threshold": 80 },
        "memory": { "threshold": 70 }
      }
    }
  }
]
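The `deploy_status` counters in each container entry can be summarized per region. The sketch below works on the parsed JSON response as plain dictionaries; `readiness_by_region` and `fully_ready` are hypothetical helper names, and the input is the example from this page.

```python
# Example "containers" value copied from the response sample.
containers = [
    {
        "deploy_status": {"ready": 1, "total": 3},
        "region_id": 1,
        "scale": {"cooldown_period": 60, "max": 3, "min": 1},
    }
]


def readiness_by_region(items: list) -> dict:
    """Map each region_id to a 'ready/total' string."""
    return {
        c["region_id"]: f'{c["deploy_status"]["ready"]}/{c["deploy_status"]["total"]}'
        for c in items
    }


def fully_ready(items: list) -> bool:
    """True only if every region reports ready == total."""
    return all(c["deploy_status"]["ready"] == c["deploy_status"]["total"] for c in items)


print(readiness_by_region(containers))  # {1: '1/3'}
print(fully_ready(containers))          # False
```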
created_at
string | null
required

Inference instance creation date in ISO 8601 format.

Examples:

"2023-08-22T11:21:00Z"

credentials_name
string
required

Registry credentials name

Examples:

"dockerhub"

description
string
required

Inference instance description.

Examples:

"My first instance"

envs
object | null
required

Environment variables for the inference instance

Examples:
{ "DEBUG_MODE": "False", "KEY": "12345" }
flavor_name
string
required

Flavor name for the inference instance

Examples:

"inference-16vcpu-232gib-1xh100-80gb"

image
string
required

Docker image for the inference instance. This field should contain the image name and tag in the format 'name:tag', e.g., 'nginx:latest'. It defaults to Docker Hub as the image registry, but any accessible Docker image URL can be specified.

Examples:

"nginx:latest"

ingress_opts
object | null
required

Ingress options for the inference instance

Examples:
{ "disable_response_buffering": true }
listening_port
integer
required

Listening port for the inference instance.

Examples:

8080

logging
object | null
required

Logging configuration for the inference instance

Examples:
{
  "destination_region_id": 1,
  "enabled": true,
  "retention_policy": { "period": 45 },
  "topic_name": "my-log-name"
}
name
string
required

Inference instance name.

Examples:

"my-instance"

probes
object | null
required

Probes configured for all containers of the inference instance.

project_id
integer
required

Project ID. If not provided, your default project ID will be used.

Examples:

1

status
enum<string>
required

Inference instance status. Value can be one of the following:

  • DEPLOYING - The instance is being deployed. Containers are not yet created.
  • PARTIALLYDEPLOYED - All containers have been created, but some may not be ready yet. An instance stuck in this state typically means that either the image is still being pulled or a failure has occurred. In the latter case, the error_message field of the respective container object in the containers collection explains the reason for the failure.
  • ACTIVE - The instance is running and ready to accept requests.
  • DISABLED - The instance is disabled and not accepting any requests.
  • PENDING - The instance is running but scaled to zero. It will be automatically scaled up when a request is made.
  • DELETING - The instance is being deleted.
Available options: ACTIVE, DELETING, DEPLOYING, DISABLED, PARTIALLYDEPLOYED, PENDING
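The status values above lend themselves to a simple polling loop that waits until a deployment leaves its transitional states. The sketch below is one possible approach, not the SDK's own mechanism: `wait_until_settled` is a hypothetical helper, the grouping of statuses into "in progress" is an interpretation of the descriptions above, and `fetch_status` stands in for whatever call returns the current status string.

```python
import time
from typing import Callable

# Interpretation of the list above: these states are expected to change on their own.
IN_PROGRESS = {"DEPLOYING", "PARTIALLYDEPLOYED"}


def wait_until_settled(fetch_status: Callable[[], str], interval: float = 5.0,
                       max_polls: int = 60,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the status is no longer DEPLOYING or PARTIALLYDEPLOYED."""
    for _ in range(max_polls):
        status = fetch_status()
        if status not in IN_PROGRESS:
            return status
        sleep(interval)
    raise TimeoutError(f"status still in progress after {max_polls} polls")


# Demo with a canned sequence instead of real API calls.
fake_statuses = iter(["DEPLOYING", "PARTIALLYDEPLOYED", "ACTIVE"])
print(wait_until_settled(lambda: next(fake_statuses), interval=0))
```

In real use, `fetch_status` could wrap the `client.cloud.inference.deployments.get(...)` call from the Python sample and return the response's status field. Since a deployment can stay in PARTIALLYDEPLOYED after a failure, the `max_polls` limit keeps the loop from running forever.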
timeout
integer | null
required

Idle period in seconds. When the instance receives no requests for this long, its containers are scaled down to the minimum defined by scale.min. Setting this value reduces resource usage during periods of inactivity.

Required range: x >= 0
Examples:

120
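To show how `timeout` relates to `scale.min`, the sketch below reads both from a parsed response and reports the scale each region would fall back to after the idle period. `idle_downscale_plan` is a hypothetical helper, and it only restates values from the response; the actual scaling is performed by the platform.

```python
from typing import Optional

# Trimmed-down version of the response sample on this page.
deployment = {
    "timeout": 120,
    "containers": [
        {"region_id": 1, "scale": {"cooldown_period": 60, "max": 3, "min": 1}},
    ],
}


def idle_downscale_plan(dep: dict) -> Optional[dict]:
    """Return the idle period and per-region minimum scale, or None if no timeout is set."""
    timeout = dep.get("timeout")
    if timeout is None:
        return None
    return {
        "idle_seconds": timeout,
        "min_by_region": {c["region_id"]: c["scale"]["min"] for c in dep["containers"]},
    }


print(idle_downscale_plan(deployment))
```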