deepspeed
https://github.com/microsoft/deepspeed
Python
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Python not yet supported
11 Subscribers
Help out
- Issues
  - [BUG] Deepspeed inference time distribution and max tokens
  - [BUG] ZeRO Stage 2 seems to train MoE models incorrectly
  - Fixed bug with hybrid engine generation when inference_tp_size > 1
  - [BUG] The code for deepspeed.comm.comm.monitored_barrier()
  - [BUG] High memory usage on first GPU, despite perfectly-balanced stages in pipeline
  - How to get average loss across all ranks using custom loss function
  - [BUG] container dose
  - [BUG] Activation Offloading with Residual-Type Connections
  - [REQUEST]: Add Meta-Transformer: A Unified Framework for Multimodal Learning
  - Fix assert on Lamb optimizers with BF16
- Docs
  - Python not yet supported