lorax
https://github.com/predibase/lorax
Python
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Python not yet supported
1 Subscriber
Help out
- Issues
- Add helm chart to OCI repo
- Error: Warmup(Generation("'bool' object has no attribute 'dtype'"))
- Ensure api_token is not included in the response on error
- Quickstart example not working
- Question regarding Punica integration
- Fuse q,k,v LoRAs
- Server error: This model was initialized with the adapter xxx and therefore does not support dynamic adapter loading. Please initialize a new model instance from the base model in order to use the dynamic adapter loading feature.
- Multiple base models
- mixtral adapters returning broadcast shape error
- performance issue
- Docs
- Python not yet supported