Monitoring
Get visibility over all inference requests on your models.
Beamlit stores inference logs for model deployments to give you visibility and troubleshooting capabilities. Only model deployments on the Beamlit Global Inference Network generate logs. Logs for models you have deployed privately or on-premise are not collected; however, logs for inference requests that overflow onto Beamlit are.
Logs overview
Beamlit collects logs about what happens inside the inference runtime while an inference request is processed. Logs typically contain the following information:
- Inference runtime initialization
- Model download
- Inference duration (broken down into tokenization, queue, and inference time)
- Idle period finished
Deployment mode
Depending on how your models are deployed on the Global Inference Network, log storage and querying behave differently.
- Models that are deployed on Beamlit, either through direct deployment or when overflowing onto Beamlit, automatically export inference logs. These logs can be viewed per model deployment.
- Models that are deployed on your own clusters as an origin to be offloaded onto Beamlit do not export logs. Please contact us if you are interested in centralizing these logs with your Beamlit logs, either on Beamlit or in your own cluster.
Retention duration
Inference logs are kept and accessible for 72 hours by default. After this retention period, logs are erased from Beamlit.
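The 72-hour retention rule above can be sketched as a simple timestamp check. This is an illustrative sketch only: the ISO-8601 timestamp format and the helper name are assumptions, not part of the Beamlit product.

```python
from datetime import datetime, timedelta, timezone

# Default retention period stated in the docs: 72 hours.
RETENTION = timedelta(hours=72)

def within_retention(log_timestamp, now=None):
    """Return True if a log entry is still queryable, i.e. newer than 72 hours.

    `log_timestamp` is assumed to be an ISO-8601 string with a UTC offset
    (an assumption for illustration, not a documented Beamlit format).
    """
    now = now or datetime.now(timezone.utc)
    ts = datetime.fromisoformat(log_timestamp)
    return now - ts <= RETENTION
```

For example, with a reference time of 2024-01-04 12:00 UTC, an entry from 24 hours earlier is still queryable, while one from just over 72 hours earlier has been erased.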
Accessing logs
Using the Beamlit console
On the Beamlit console, open a model deployment and go to the Logs tab. From there, you can search logs and filter them on a specific time window.
Using the API
Read our reference for the metrics API.
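As a rough sketch of what a programmatic query might look like, the snippet below builds a URL scoped to one model deployment and a time window. The base URL, path segments, and parameter names are all assumptions for illustration; consult the API reference for the actual endpoints.

```python
from urllib.parse import urlencode

# Assumed base URL for illustration only; not the documented Beamlit endpoint.
BASE_URL = "https://api.beamlit.example"

def build_logs_query(model, deployment, start, end):
    """Build a hypothetical URL querying logs for one model deployment
    over a time window. Path and parameter names are assumptions."""
    params = urlencode({"start": start, "end": end})
    return f"{BASE_URL}/models/{model}/deployments/{deployment}/logs?{params}"
```

Filtering on a time window server-side, as sketched here, keeps responses small given the 72-hour retention period.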