Create agent by name
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Logical object representing an agent, with its deployment definitions included
The date and time when the resource was created
The user or service account who created the resource
The date and time when the resource was updated
The user or service account who updated the resource
Agent display name
Labels
Agent name
Workspace name
Agent deployments
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
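As a minimal sketch of the header format described above, the helper below builds the request headers from a token. The function name `bearer_headers` and the `Content-Type` entry are illustrative assumptions, not part of the documented API; only the `Bearer <token>` form comes from this page.

```python
def bearer_headers(token: str) -> dict:
    """Build request headers carrying the bearer token.

    Hypothetical helper: only the "Bearer <token>" form is documented.
    """
    if not token or token.strip() != token:
        raise ValueError("token must be non-empty with no surrounding whitespace")
    return {
        # Documented form: "Bearer <token>"
        "Authorization": f"Bearer {token}",
        # Assumption: the request body is sent as JSON
        "Content-Type": "application/json",
    }


headers = bearer_headers("my-secret-token")
```

Rejecting empty or padded tokens early is a design choice for this sketch; it avoids sending a malformed header that the server would refuse anyway.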
Body
Logical object representing an agent, with its deployment definitions included
The date and time when the resource was created
The user or service account who created the resource
The date and time when the resource was updated
The user or service account who updated the resource
Agent display name
Agent name
Workspace name
Agent deployments
The date and time when the resource was created
The user or service account who created the resource
The date and time when the resource was updated
The user or service account who updated the resource
The name of the agent
Agent configuration, stored as key-value pairs. In your agent, you can retrieve a value with config[key]
Agent description. A clear description is important if you want the agent to work with agent chaining
Whether the agent deployment is enabled
The name of the environment
Functions used by the agent. These functions must be created before they are set here
The integration connections for the model deployment
Beamlit model to use for the agent. It should be compatible with function calling
The pod template, should be a valid Kubernetes pod template
Set of configurations for a deployment
The arguments to pass to the deployment runtime
The command to run the deployment
The environment variables to set in the deployment. Should be a list of Kubernetes EnvVar types
The Docker image for the deployment
The slug name of the origin model. Only used if the deployment is a ModelDeployment
The readiness probe. Should be a Kubernetes Probe type
The resources for the deployment. Should be a Kubernetes ResourceRequirements type
The type of origin for the deployment
Configuration for a serverless deployment
The minimum amount of time that the last replica will remain active AFTER a scale-to-zero decision is made
The maximum number of replicas for the deployment.
Metric watched to make scaling decisions. Can be "cpu", "memory", "rps", or "concurrency"
The minimum number of replicas for the deployment. Can be 0 or 1 (in which case the deployment is always running in at least one location).
The time window which must pass at reduced concurrency before a scale-down decision is applied. This can be useful, for example, to keep containers around for a configurable duration to avoid a cold start penalty if new requests come in.
The minimum number of replicas that will be created when the deployment scales up from zero.
The sliding time window over which metrics are averaged to provide the input for scaling decisions
Target value for the watched metric
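The constraints in the serverless configuration above can be checked client-side before sending a request. The sketch below is one possible way to do that. The class name, field names, and default values are hypothetical, since this page lists descriptions rather than JSON keys; the allowed metric values and the 0-or-1 minimum come from the descriptions, while the check that the maximum is not below the minimum is an added assumption.

```python
from dataclasses import dataclass

# Metric values listed in the field description above
ALLOWED_METRICS = {"cpu", "memory", "rps", "concurrency"}


@dataclass
class ServerlessConfig:
    # Hypothetical field names and defaults, for illustration only
    metric: str = "concurrency"
    target: float = 100.0
    min_replicas: int = 0
    max_replicas: int = 10

    def validate(self) -> None:
        """Raise ValueError if a documented constraint is violated."""
        if self.metric not in ALLOWED_METRICS:
            raise ValueError(f"metric must be one of {sorted(ALLOWED_METRICS)}")
        if self.min_replicas not in (0, 1):
            raise ValueError("min_replicas must be 0 or 1")
        # Assumption: a maximum below the minimum (or below 1) is not meaningful
        if self.max_replicas < max(self.min_replicas, 1):
            raise ValueError("max_replicas must be at least 1 and >= min_replicas")


config = ServerlessConfig(metric="rps", target=50, min_replicas=1, max_replicas=5)
config.validate()  # passes silently when all checks succeed
```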
Create from a store-registered function
The workspace the agent deployment belongs to
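To make the request shape concrete, the sketch below assembles a POST request for creating an agent using only the Python standard library. The base URL, the `/agents` path, and the JSON keys (`name`, `display_name`, `workspace`) are placeholders chosen for illustration; this page does not state the actual endpoint or key names, so check them against the API schema before use. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: placeholder base URL, not the real API host
BASE_URL = "https://api.example.com/v0"


def build_create_agent_request(token, agent_name, display_name, workspace):
    """Build (but do not send) a POST request that creates an agent."""
    # Hypothetical JSON keys mirroring the body fields described above
    body = {
        "name": agent_name,
        "display_name": display_name,
        "workspace": workspace,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/agents",  # hypothetical path
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_agent_request(
    "my-secret-token", "support-bot", "Support Bot", "my-workspace"
)
```

Passing the resulting object to `urllib.request.urlopen(request)` would send it. Keeping construction separate from sending makes the payload easy to inspect or test first.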
Response
Agent parent of AgentDeployment
The date and time when the resource was created
The user or service account who created the resource
The date and time when the resource was updated
The user or service account who updated the resource
Agent display name
Agent name
Workspace name