Inference API
Run inferences on your Beamlit deployments.
Whenever you deploy a model on Beamlit, an inference endpoint is generated on the Global Inference Network.
The inference URL looks like this:
run.beamlit.dev/{your-workspace}/models/{your-model}
There is one distinct endpoint for each model deployment, i.e. for each combination of a model and an environment on which it is deployed.
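As a rough illustration, the sketch below calls such an endpoint over HTTP. The request method, bearer-token authentication header, and JSON payload shape are assumptions made for the example, not guarantees about the actual API; adapt them to your model's input schema and your workspace credentials.

```python
import requests

# Hypothetical values -- replace with your own workspace, model name, and API key.
WORKSPACE = "my-workspace"
MODEL = "my-model"
API_KEY = "YOUR_BEAMLIT_API_KEY"  # assumed bearer-token auth; check your workspace settings

url = f"https://run.beamlit.dev/{WORKSPACE}/models/{MODEL}"

# Assumed JSON payload; the exact schema depends on the model you deployed.
payload = {"inputs": "Hello, world!"}

response = requests.post(
    url,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
)
response.raise_for_status()
print(response.json())
```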
For example, if you have one version of model “your-model” deployed on the production environment and one version deployed on the development environment:
- run.beamlit.dev/{your-workspace}/models/{your-model}?environment=production will call the production deployment
- run.beamlit.dev/{your-workspace}/models/{your-model}?environment=development will call the development deployment
If you do not specify an environment in the inference request, the production deployment is called by default. If the model is not deployed on the production environment, the request returns an error.
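The sketch below shows one way to target a specific deployment by passing the environment query parameter, again assuming bearer-token authentication and a JSON payload; omitting the parameter falls back to production as described above.

```python
import requests

WORKSPACE = "my-workspace"   # hypothetical workspace
MODEL = "my-model"           # hypothetical model
API_KEY = "YOUR_BEAMLIT_API_KEY"

url = f"https://run.beamlit.dev/{WORKSPACE}/models/{MODEL}"
headers = {"Authorization": f"Bearer {API_KEY}"}  # assumed auth scheme

# Explicitly target the development deployment via the environment query parameter.
dev_response = requests.post(
    url,
    params={"environment": "development"},
    json={"inputs": "Hello"},
    headers=headers,
)

# Omitting the parameter routes the request to the production deployment by default.
prod_response = requests.post(url, json={"inputs": "Hello"}, headers=headers)

print(dev_response.status_code, prod_response.status_code)
```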