lorax
https://github.com/predibase/lorax
Python
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
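For context, LoRAX serves many LoRA adapters on top of a single shared base model, and each request can select which adapter to apply at generation time. A minimal sketch of such a request, assuming a LoRAX server running locally on port 8080 and an adapter name `my-org/my-adapter` (the URL and adapter name are hypothetical placeholders; check the project's docs for the exact request schema):

```python
import json
from urllib import request

# Hypothetical endpoint; a real deployment substitutes its own host/port.
LORAX_URL = "http://localhost:8080/generate"


def build_payload(prompt: str, adapter_id: str, max_new_tokens: int = 64) -> dict:
    """Build a JSON body for a generate request.

    The adapter_id tells the server which fine-tuned LoRA adapter to
    apply on top of the shared base model for this request.
    """
    return {
        "inputs": prompt,
        "parameters": {
            "adapter_id": adapter_id,
            "max_new_tokens": max_new_tokens,
        },
    }


def generate(prompt: str, adapter_id: str) -> str:
    """Send the request (requires a live LoRAX server at LORAX_URL)."""
    body = json.dumps(build_payload(prompt, adapter_id)).encode("utf-8")
    req = request.Request(
        LORAX_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]


if __name__ == "__main__":
    # Without a running server we can still inspect the payload shape.
    print(json.dumps(build_payload("What is LoRA?", "my-org/my-adapter"), indent=2))
```

Because the base model weights are shared, switching adapters per request is cheap compared to hosting one full model per fine-tune, which is what lets a single server scale to thousands of adapters.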
- Issues
- otel fixups
- Passing a `--revision` causes failure in loading tokenizer config
- If LoRAX is based on Punica kernels, will it be able to support LoRA adapters for Mistral NeMo 12B?
- Support LoRA adapters generated by mistral-finetune
- LORAX_USE_GLOBAL_HF_TOKEN is not applied on the first call to an adapter from the Hugging Face private hub
- Apply LORAX_USE_GLOBAL_HF_TOKEN correctly even when the request has no api_token
- Stop word is included in the output on phi-2
- RuntimeError: CUDA error: no kernel image is available for execution on the device
- Refactor the lora load function for clarity and simplicity
- Adding Whisper model