v1.69.0-stable - Loadbalance Batch API Models
Deploy this version
- Docker

```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
docker.litellm.ai/berriai/litellm:main-v1.69.0-stable
```

- Pip

```shell
pip install litellm==1.69.0.post1
```
Key Highlights
LiteLLM v1.69.0-stable brings the following key improvements:
- Loadbalance Batch API Models: Easily load balance across multiple Azure Batch deployments using LiteLLM Managed Files.
- Email Invites 2.0: Send an email invite to new users onboarded to LiteLLM.
- Nscale: An LLM API built for compliance with European regulations.
- Bedrock /v1/messages: Use Bedrock Anthropic models with Anthropic's /v1/messages API.
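As a rough sketch of the /v1/messages support above: the payload below is a standard Anthropic-style request body that can be POSTed to a LiteLLM proxy's /v1/messages endpoint. The Bedrock model ID and proxy URL in the comments are illustrative placeholders, not values confirmed by this release.

```python
import json

# Anthropic-format request body for /v1/messages.
# The Bedrock model ID below is an illustrative placeholder.
payload = {
    "model": "bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    "max_tokens": 256,
    "messages": [
        {"role": "user", "content": "Hello via /v1/messages"},
    ],
}

# Serialize for sending, e.g. to http://localhost:4000/v1/messages
# with an "Authorization: Bearer <LITELLM_KEY>" header.
body = json.dumps(payload)
```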
Batch API Load Balancing
This release brings LiteLLM Managed File support to Batches. This is great for:
- Proxy Admins: You can now control which Batch models users can call.
- Developers: You no longer need to know the Azure deployment name when creating your batch .jsonl files - just specify the model your LiteLLM key has access to.
Over time, we expect LiteLLM Managed Files to become the way most teams use Files across the /chat/completions, /batch, and /fine_tuning endpoints.
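For illustration, one request line of such a batch input .jsonl file might look like the sketch below, where "model" is the model name your LiteLLM key has access to rather than an Azure deployment name. The model name used here is hypothetical.

```python
import json

# One request line for a batch input .jsonl file.
# "gpt-4o-batch" is a hypothetical LiteLLM model name that the proxy
# load balances across its Azure Batch deployments.
request_line = {
    "custom_id": "task-1",
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {
        "model": "gpt-4o-batch",
        "messages": [{"role": "user", "content": "Summarize this document."}],
    },
}

# A full batch file is one such JSON object per line.
with open("batch_input.jsonl", "w") as f:
    f.write(json.dumps(request_line) + "\n")
```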
Email Invites
This release brings the following improvements to our email invite integration:
- New templates for user invited and key created events.
- Fixes for using SMTP email providers.
- Native support for Resend API.
- Ability for Proxy Admins to control email events.
For LiteLLM Cloud Users, please reach out to us if you want this enabled for your instance.
New Models / Updated Models
- Gemini (VertexAI + Google AI Studio)
- Perplexity:
- Azure OpenAI:
- Fixed passing through of the `azure_ad_token_provider` parameter - PR
- OpenAI:
- Added support for PDF URLs in the 'file' parameter - PR
- Sagemaker:
- Fixed content length for the `sagemaker_chat` provider - PR
- Azure AI Foundry:
- Added cost tracking for the following models - PR
- DeepSeek V3 0324
- Llama 4 Scout
- Llama 4 Maverick
- Bedrock:
- OpenAI: Added `reasoning_effort` support for `o3` models - PR
- Databricks:
- Fixed issue when Databricks uses external model and delta could be empty - PR
- Cerebras: Fixed Llama-3.1-70b model pricing and context window - PR
- Ollama:
- 🆕 Nscale:
- Added support for chat, image generation endpoints - PR
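A minimal sketch of calling the new Nscale chat endpoint, assuming the usual LiteLLM provider-prefix convention; the model name is a hypothetical placeholder, and the completion call itself is commented out since it requires an Nscale API key.

```python
# Request parameters for a chat completion against the Nscale provider.
# The "nscale/..." model name is a hypothetical placeholder.
params = {
    "model": "nscale/meta-llama/Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Hello"}],
}

# With NSCALE_API_KEY set in the environment:
# import litellm
# response = litellm.completion(**params)
# print(response.choices[0].message.content)
```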
LLM API Endpoints
- Messages API:
- Moderations API:
- Fixed bug to allow using LiteLLM UI credentials for /moderations API - PR
- Realtime API:
- Fixed setting 'headers' in scope for websocket auth requests and infinite loop issues - PR
- Files API:
- Batches API:
Spend Tracking / Budget Improvements
- Bug Fix - PostgreSQL Integer Overflow Error in DB Spend Tracking - PR
