Orientation Paper Reading List

<aside> <img src="/icons/info-alternate_gray.svg" alt="/icons/info-alternate_gray.svg" width="40px" /> Papers are ordered by importance within each section. Start with the first one.

</aside>

Decentralized Training and Inference

SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
Decentralized Training of Foundation Models in Heterogeneous Environments
Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
HexGen: Generative Inference of Large-Scale Foundation Model over Heterogeneous Decentralized Environment
Distributed Deep Learning in Open Collaborations
Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts
DiLoCo: Distributed Low-Communication Training of Language Models
FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs

Model Stealing/Imitation

Byzantine Gradient Robustness