Authorizations
API key for authentication. Make sure to include the word apikey, followed by a single space and then your token.
Example: apikey 1234$abcdef
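As a minimal sketch of the format described above, the helper below builds the credential string. The function name `build_auth_header` is hypothetical, and sending the value in the standard HTTP `Authorization` header is an assumption; this section does not name the header.

```python
def build_auth_header(token: str) -> dict:
    """Return headers carrying the credential in the 'apikey <token>' form."""
    token = token.strip()
    if not token:
        raise ValueError("token must not be empty")
    # Assumption: the credential travels in the HTTP Authorization header.
    # The word 'apikey' is followed by exactly one space, then the token.
    return {"Authorization": f"apikey {token}"}
```

Calling `build_auth_header("1234$abcdef")` yields a header whose value matches the example line above.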
Path Parameters
Inference Instance ID
Body
List of containers for the inference instance.
1
[
{
"region_id": 7,
"scale": {
"cooldown_period": 60,
"max": 3,
"min": 1,
"triggers": {
"cpu": { "threshold": 80 },
"memory": { "threshold": 70 }
}
}
}
]
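The container entry shown above can be assembled programmatically. The sketch below is illustrative only: `make_container` is a hypothetical helper, and the check that `min` does not exceed `max` is an assumption about sensible input rather than a documented rule.

```python
from typing import Optional


def make_container(
    region_id: int,
    scale_min: int,
    scale_max: int,
    cooldown_period: int,
    cpu_threshold: Optional[int] = None,
    memory_threshold: Optional[int] = None,
) -> dict:
    """Build one container entry in the shape used by the example above."""
    # Assumption: a minimum larger than the maximum is never intended.
    if scale_min > scale_max:
        raise ValueError("scale min must not exceed scale max")

    # Only include the triggers that were actually requested.
    triggers = {}
    if cpu_threshold is not None:
        triggers["cpu"] = {"threshold": cpu_threshold}
    if memory_threshold is not None:
        triggers["memory"] = {"threshold": memory_threshold}

    return {
        "region_id": region_id,
        "scale": {
            "cooldown_period": cooldown_period,
            "max": scale_max,
            "min": scale_min,
            "triggers": triggers,
        },
    }
```

Wrapping the result in a list, for example `[make_container(7, 1, 3, 60, cpu_threshold=80, memory_threshold=70)]`, reproduces the example value.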
Flavor ID for the inference instance.
"3fa85f64-5717-4562-b3fc-2c963f66afa6"
Docker image for the inference instance. This field should contain the image name and tag in the format 'name:tag', e.g., 'nginx:latest'. It defaults to Docker Hub as the image registry, but any accessible Docker image URL can be specified.
"nginx:latest"
Listening port for the inference instance.
1 <= x <= 65535
8080
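A client may want to check the port against the documented range before sending the request. This is a small sketch; `validate_port` is a hypothetical name, not part of the API.

```python
def validate_port(port: int) -> int:
    """Return the port unchanged if it lies within 1 <= x <= 65535."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(port, bool) or not isinstance(port, int):
        raise TypeError("port must be an integer")
    if not 1 <= port <= 65535:
        raise ValueError(f"port {port} is outside the range 1..65535")
    return port
```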
Inference instance name.
Length: 4 - 30 characters
"my-instance"
List of API key IDs to attach to the inference instance.
["3fa85f64-5717-4562-b3fc-2c963f66afa6"]
Set to true to enable API key authentication for the inference instance. Manage API keys through the '/v1/inference_instances/keys' endpoint.
false
Command to be executed when running a container from an image.
["nginx", "-g", "daemon off;"]
Inference instance description.
"My first instance"
Environment variables for the inference instance.
{ "DEBUG_MODE": "False", "KEY": "12345" }
Image registry ID for authentication in private registries. Leave this parameter empty if no authentication is required for the repository.
"3fa85f64-5717-4562-b3fc-2c963f66afa6"
Probes configured for all containers of the inference instance. If probes are not provided and the image_name is from the Model Catalog registry, the default probes are used.
Project ID. If not provided, your default project ID will be used.
1
Specifies the duration in seconds without any requests after which the containers will be downscaled to their minimum scale value as defined by scale.min. If set, this helps in optimizing resource usage by reducing the number of container instances during periods of inactivity. The default value when the parameter is not set is 120.
x >= 0
120
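The idle-timeout behaviour described above can be modelled as follows. This is an illustrative sketch, not the platform's implementation: `target_replicas` and its parameters are hypothetical names, and treating an idle time exactly equal to the timeout as expired is an assumption.

```python
from typing import Optional

DEFAULT_TIMEOUT_SECONDS = 120  # documented default when the parameter is not set


def target_replicas(
    current: int,
    scale_min: int,
    idle_seconds: float,
    timeout: Optional[int] = None,
) -> int:
    """Return the replica count expected after the idle check."""
    effective = DEFAULT_TIMEOUT_SECONDS if timeout is None else timeout
    if effective < 0:
        raise ValueError("timeout must satisfy x >= 0")
    # Assumption: reaching the timeout exactly already counts as idle.
    if idle_seconds >= effective:
        # Downscale to the minimum defined by scale.min.
        return scale_min
    return current
```

With the default timeout, three replicas that have been idle for 130 seconds drop to `scale.min`, while 60 seconds of idleness leaves them unchanged.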
Response
Inference instance
List of containers for the inference instance
[
{
"deploy_status": { "ready": 1, "total": 3 },
"region_id": 7,
"scale": {
"cooldown_period": 60,
"max": 3,
"min": 1,
"triggers": {
"cpu": { "threshold": 80 },
"memory": { "threshold": 70 }
}
}
}
]
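The `deploy_status` object in the response can be turned into a simple progress figure. In this sketch, `deploy_progress` is a hypothetical helper, and reading `ready` out of `total` as a completion ratio is an assumption; this section does not define what the two counters measure.

```python
def deploy_progress(container: dict) -> float:
    """Return ready/total for one container entry, or 0.0 if total is zero."""
    status = container["deploy_status"]
    total = status["total"]
    if total == 0:
        # Avoid division by zero when nothing has been scheduled yet.
        return 0.0
    return status["ready"] / total
```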
Flavor ID for the inference instance
"3fa85f64-5717-4562-b3fc-2c963f66afa6"
Inference instance ID.
"3fa85f64-5717-4562-b3fc-2c963f66afa6"
Docker image for the inference instance. This field should contain the image name and tag in the format 'name:tag', e.g., 'nginx:latest'. It defaults to Docker Hub as the image registry, but any accessible Docker image URL can be specified.
"nginx:latest"
Listening port for the inference instance.
8080
Inference instance name.
"my-instance"
Inference instance status
ACTIVE, DELETED, DELETING, DEPLOYING, DISABLED, ERROR, FAILED, NEW, PARTIALLYDEPLOYED, PENDING
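On the client side, the status values listed above can be captured in an enumeration so that unexpected strings fail early. The class name `InstanceStatus` and the helper `parse_status` are hypothetical; only the ten values come from this reference.

```python
from enum import Enum


class InstanceStatus(str, Enum):
    """Status values listed in this reference."""

    ACTIVE = "ACTIVE"
    DELETED = "DELETED"
    DELETING = "DELETING"
    DEPLOYING = "DEPLOYING"
    DISABLED = "DISABLED"
    ERROR = "ERROR"
    FAILED = "FAILED"
    NEW = "NEW"
    PARTIALLYDEPLOYED = "PARTIALLYDEPLOYED"
    PENDING = "PENDING"


def parse_status(raw: str) -> InstanceStatus:
    """Convert a response string to an enum member; raises ValueError if unknown."""
    return InstanceStatus(raw)
```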
Address of the inference instance
1
"https://example.com"
List of API key IDs attached to the inference instance.
["3fa85f64-5717-4562-b3fc-2c963f66afa6"]
Set to true if the instance uses API key authentication. Manage API keys through the '/v1/inference_instances/keys' endpoint.
false
Command to be executed when running a container from an image.
["nginx", "-g", "daemon off;"]
Inference instance creation date in ISO 8601 format.
"2023-08-22T11:21:00Z"
Inference instance description.
"My first instance"
Environment variables for the inference instance
{ "DEBUG_MODE": "False", "KEY": "12345" }
Image registry ID for authentication in private registries. This parameter is empty if no authentication is required for the repository.
"3fa85f64-5717-4562-b3fc-2c963f66afa6"
Probes configured for all containers of the inference instance.
Specifies the duration in seconds without any requests after which the containers will be downscaled to their minimum scale value as defined by scale.min. If set, this helps in optimizing resource usage by reducing the number of container instances during periods of inactivity.
x >= 0
120