Authorizations
API key for authentication. Make sure to include the word apikey, followed by a single space and then your token.
Example: apikey 1234$abcdef
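The header format above can be sketched as a small helper. This is a minimal illustration, not part of the API: the function name build_auth_header is hypothetical, and the header name Authorization is an assumption, since this page only specifies the value format.

```python
def build_auth_header(token: str) -> dict[str, str]:
    """Build the header value as: the word 'apikey', one space, then the token.

    The header name 'Authorization' is an assumption; this page only
    documents the value format.
    """
    if not token or token != token.strip():
        raise ValueError("token must be non-empty with no surrounding whitespace")
    return {"Authorization": f"apikey {token}"}


headers = build_auth_header("1234$abcdef")
```

The resulting dictionary can be passed as the headers argument of whichever HTTP client you use.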
Path Parameters
Project ID
1
Inference instance name.
4 - 30
"my-instance"
Response
OK
Address of the inference instance
1
"https://example.com"
true if the instance uses API key authentication. When enabled, requests to the instance must include either an "Authorization": "Bearer *****" header or an "X-Api-Key": "*****" header.
false
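As a rough sketch of how a client might react to this flag, the helper below picks one of the two documented header forms when authentication is enabled and sends no auth header otherwise. The function and parameter names are hypothetical; only the two header shapes come from the description above.

```python
def instance_request_headers(
    auth_enabled: bool, api_key: str = "", use_bearer: bool = True
) -> dict[str, str]:
    """Return auth headers for calls made to the inference instance itself.

    auth_enabled mirrors the boolean flag described above; the names used
    here are illustrative only.
    """
    if not auth_enabled:
        # No key is required when the flag is false.
        return {}
    if not api_key:
        raise ValueError("api_key is required when authentication is enabled")
    if use_bearer:
        return {"Authorization": f"Bearer {api_key}"}
    return {"X-Api-Key": api_key}
```

Either header form is sufficient on its own, so a client only needs to send one of them.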
Command to be executed when running a container from an image.
["nginx", "-g", "daemon off;"]
List of containers for the inference instance
[
{
"deploy_status": { "ready": 1, "total": 3 },
"region_id": 1,
"scale": {
"cooldown_period": 60,
"max": 3,
"min": 1,
"triggers": {
"cpu": { "threshold": 80 },
"memory": { "threshold": 70 }
}
}
}
]
Inference instance creation date in ISO 8601 format.
"2023-08-22T11:21:00Z"
Registry credentials name
"dockerhub"
Inference instance description.
"My first instance"
Environment variables for the inference instance
{ "DEBUG_MODE": "False", "KEY": "12345" }
Flavor name for the inference instance
"inference-16vcpu-232gib-1xh100-80gb"
Docker image for the inference instance. This field should contain the image name and tag in the format 'name:tag', e.g., 'nginx:latest'. It defaults to Docker Hub as the image registry, but any accessible Docker image URL can be specified.
"nginx:latest"
Ingress options for the inference instance
{ "disable_response_buffering": true }
Listening port for the inference instance.
8080
Logging configuration for the inference instance
{
"destination_region_id": 1,
"enabled": true,
"retention_policy": { "period": 45 },
"topic_name": "my-log-name"
}
Inference instance name.
"my-instance"
Probes configured for all containers of the inference instance.
Project ID. If not provided, your default project ID will be used.
1
Inference instance status. Value can be one of the following:
DEPLOYING - The instance is being deployed. Containers are not yet created.
PARTIALLYDEPLOYED - All containers have been created, but some may not be ready yet. Instances stuck in this state typically indicate either an image being pulled or a failure of some kind. In the latter case, the error_message field of the respective container object in the containers collection explains the failure reason.
ACTIVE - The instance is running and ready to accept requests.
DISABLED - The instance is disabled and not accepting any requests.
PENDING - The instance is running but scaled to zero. It will be automatically scaled up when a request is made.
DELETING - The instance is being deleted.
ACTIVE, DELETING, DEPLOYING, DISABLED, PARTIALLYDEPLOYED, PENDING
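The status values and the error_message lookup described for PARTIALLYDEPLOYED can be sketched as follows. This is an illustration under assumptions: the response key "status" is a guess (this page does not show the field name), and failure_reasons is a hypothetical helper. The containers collection and the error_message field are taken from the description above.

```python
from enum import Enum


class InstanceStatus(str, Enum):
    ACTIVE = "ACTIVE"
    DELETING = "DELETING"
    DEPLOYING = "DEPLOYING"
    DISABLED = "DISABLED"
    PARTIALLYDEPLOYED = "PARTIALLYDEPLOYED"
    PENDING = "PENDING"


def failure_reasons(instance: dict) -> list[str]:
    """Collect container error messages for a partially deployed instance.

    Assumes the parsed response stores the status under the key 'status'.
    """
    status = InstanceStatus(instance["status"])
    if status is not InstanceStatus.PARTIALLYDEPLOYED:
        return []
    return [
        container["error_message"]
        for container in instance.get("containers", [])
        if container.get("error_message")
    ]
```

An empty result for a PARTIALLYDEPLOYED instance suggests the image may still be pulling rather than that a failure occurred.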
Duration in seconds without any requests after which the containers are scaled down to their minimum scale value, as defined by scale.min. If set, this reduces resource usage during periods of inactivity by running fewer container instances.
x >= 0
120
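The idle rule described above can be expressed as a small predicate. This only mirrors the documented behavior for illustration; the platform's actual scaling logic is not shown on this page, and the names should_downscale, idle_seconds, and timeout are hypothetical.

```python
from typing import Optional


def should_downscale(idle_seconds: float, timeout: Optional[int]) -> bool:
    """Return True once the instance has been idle for at least `timeout` seconds.

    A timeout of None models the "not set" case, where no idle-based
    downscaling applies. Negative values violate the x >= 0 constraint.
    """
    if timeout is None:
        return False
    if timeout < 0:
        raise ValueError("timeout must be >= 0")
    return idle_seconds >= timeout
```

With the example value of 120, an instance that has received no requests for 150 seconds would qualify for scaling down to scale.min, while one idle for 30 seconds would not.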