Whenever you deploy a model on Beamlit, an inference endpoint is generated on the Global Inference Network.

The inference URL looks like this:

run.beamlit.dev/{your-workspace}/models/{your-model}

There is one distinct endpoint for each model deployment, i.e. for each combination of a model and an environment it is deployed on.
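As an illustration, here is a minimal Python sketch for calling such an endpoint. The HTTP method (POST), the JSON payload shape, and the bearer-token Authorization header are assumptions for illustration, not a confirmed request format; see the product guide linked at the end of this page for the exact contract.

```python
import requests

# Hypothetical placeholders: substitute your own workspace, model, and API key.
WORKSPACE = "your-workspace"
MODEL = "your-model"
API_KEY = "YOUR_API_KEY"  # assumed bearer-token authentication

# Endpoint pattern described above.
url = f"https://run.beamlit.dev/{WORKSPACE}/models/{MODEL}"

# Assumed request shape: a POST with a JSON payload; the actual payload
# depends on the model being served.
response = requests.post(
    url,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"inputs": "Hello, world!"},
)
response.raise_for_status()
print(response.json())
```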

For example, if you have one version of model “your-model” deployed on the production environment and one version deployed on the development environment:

  • run.beamlit.dev/{your-workspace}/models/{your-model}?environment=production will call the production deployment

  • run.beamlit.dev/{your-workspace}/models/{your-model}?environment=development will call the development deployment

If you do not specify an environment in the inference request, the production environment is called by default. If the model is not deployed on the production environment, the request returns an error.
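The sketch below extends the earlier example to show environment selection via the query parameter, including the default-to-production behavior. It reuses the same assumptions (POST, JSON payload, bearer-token auth), and all placeholder names are hypothetical:

```python
import requests

WORKSPACE = "your-workspace"  # hypothetical placeholder
MODEL = "your-model"          # hypothetical placeholder
API_KEY = "YOUR_API_KEY"      # assumed bearer-token authentication

def query_model(environment: str | None = None) -> dict:
    """Call the model deployment, optionally targeting an environment.

    Omitting `environment` targets the production deployment by default.
    """
    resp = requests.post(
        f"https://run.beamlit.dev/{WORKSPACE}/models/{MODEL}",
        # Only send the query parameter when an environment is requested.
        params={"environment": environment} if environment else None,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"inputs": "Hello, world!"},  # assumed payload shape
    )
    resp.raise_for_status()
    return resp.json()

# Explicitly call the development deployment.
dev_result = query_model("development")

# No environment given: falls back to production. This raises an HTTPError
# if the model has no production deployment.
prod_result = query_model()
```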

Product documentation

Read our product guide on querying a model.