Whenever you deploy a model on Beamlit, an inference endpoint is generated on the Global Inference Network.

The inference URL looks like this:

run.beamlit.dev/{your-workspace}/models/{your-model}

There is one distinct endpoint for each model deployment, i.e. for each combination of a model and an environment it is deployed on.
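As an illustration, here is a minimal Python sketch for calling such an endpoint. The HTTP method (POST), the JSON payload shape, and the bearer-token Authorization header are assumptions for illustration, not a confirmed request format; see the product guide linked at the end of this page for the exact contract.

```python
import requests

# Hypothetical placeholders: substitute your own workspace, model, and API key.
WORKSPACE = "your-workspace"
MODEL = "your-model"
API_KEY = "YOUR_API_KEY"  # assumed bearer-token authentication

# Endpoint pattern described above.
url = f"https://run.beamlit.dev/{WORKSPACE}/models/{MODEL}"

# Assumed request shape: a POST with a JSON payload; the actual payload
# depends on the model being served.
response = requests.post(
    url,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"inputs": "Hello, world!"},
)
response.raise_for_status()
print(response.json())
```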

For example, if you have one version of model “your-model” deployed on the production environment and one version deployed on the development environment:

  • run.beamlit.dev/{your-workspace}/models/{your-model}?environment=production will call the production deployment

  • run.beamlit.dev/{your-workspace}/models/{your-model}?environment=development will call the development deployment

If you do not specify an environment in the inference request, the production environment is called by default. If the model is not deployed on the production environment, the request returns an error.
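The sketch below extends the earlier example to show environment selection via the query parameter, including the default-to-production behavior. It reuses the same assumptions (POST, JSON payload, bearer-token auth), and all placeholder names are hypothetical:

```python
import requests

WORKSPACE = "your-workspace"  # hypothetical placeholder
MODEL = "your-model"          # hypothetical placeholder
API_KEY = "YOUR_API_KEY"      # assumed bearer-token authentication

def query_model(environment: str | None = None) -> dict:
    """Call the model deployment, optionally targeting an environment.

    Omitting `environment` targets the production deployment by default.
    """
    resp = requests.post(
        f"https://run.beamlit.dev/{WORKSPACE}/models/{MODEL}",
        # Only send the query parameter when an environment is requested.
        params={"environment": environment} if environment else None,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"inputs": "Hello, world!"},  # assumed payload shape
    )
    resp.raise_for_status()
    return resp.json()

# Explicitly call the development deployment.
dev_result = query_model("development")

# No environment given: falls back to production. This raises an HTTPError
# if the model has no production deployment.
prod_result = query_model()
```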

Product documentation

Read our product guide on querying a model.