We have hosted the application FairScale so that you can run it in our online workstations with Wine or directly.
Quick description of FairScale:
FairScale is a collection of PyTorch performance and scaling primitives that pioneered many of the ideas now used for large-model training. It introduced Fully Sharded Data Parallel (FSDP) style techniques that shard model parameters, gradients, and optimizer states across ranks so that larger models fit into the same memory budget. The library also provides pipeline parallelism, activation checkpointing, mixed precision, Optimizer State Sharding (OSS), and auto-wrapping policies that reduce boilerplate in complex distributed setups. Its components are modular, so teams can adopt just the sharding optimizer or the pipeline engine without rewriting their training loop. FairScale emphasizes correctness and debuggability, offering hook points, logging, and reference examples for common trainer patterns. Although many of its ideas have since landed in core PyTorch, FairScale remains a valuable reference and a practical toolbox for squeezing more performance out of multi-GPU and multi-node jobs.
Features:
- Fully Sharded Data Parallel style parameter, grad, and optimizer sharding
- Pipeline parallelism utilities with schedule control
- Activation checkpointing to trade compute for memory
- Optimizer State Sharding (OSS) drop-in optimizers
- Mixed precision and auto-wrap policies for easy adoption
- Examples and hooks for production-grade distributed training
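
A minimal sketch of how some of these pieces compose, assuming a two-process CPU run with the Gloo backend (the address, model, and hyperparameters are illustrative, not taken from FairScale's documentation): OSS wraps a standard torch optimizer class to shard its state across ranks, ShardedDataParallel reduces each gradient to the rank that owns its shard, and checkpoint_wrapper trades recompute for activation memory.

import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from fairscale.nn.checkpoint import checkpoint_wrapper
from fairscale.nn.data_parallel import ShardedDataParallel as ShardedDDP
from fairscale.optim.oss import OSS

def train(rank, world_size):
    # OSS and ShardedDDP both require an initialized process group.
    dist.init_process_group("gloo", init_method="tcp://127.0.0.1:29500",
                            rank=rank, world_size=world_size)

    # checkpoint_wrapper: this block's activations are recomputed during
    # the backward pass instead of being kept in memory.
    model = torch.nn.Sequential(
        torch.nn.Linear(32, 32),
        torch.nn.ReLU(),
        checkpoint_wrapper(torch.nn.Linear(32, 32)),
    )

    # OSS shards optimizer state across ranks; it wraps a standard
    # torch optimizer class (note: the class, not an instance).
    optimizer = OSS(params=model.parameters(), optim=torch.optim.SGD, lr=0.01)

    # ShardedDDP reduces each gradient to the rank owning its shard.
    model = ShardedDDP(model, optimizer)

    for _ in range(3):
        # A grad-requiring input keeps the checkpointed block simple here.
        inp = torch.randn(8, 32, requires_grad=True)
        loss = model(inp).sum()
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    # FSDP instead folds parameter, gradient, and optimizer-state sharding
    # into a single wrapper; it is typically run on GPUs with NCCL:
    #   from fairscale.nn import FullyShardedDataParallel as FSDP
    #   model = FSDP(model)

    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(train, args=(2,), nprocs=2)

Because the components are modular, the same loop works with only one of the wrappers applied, for example OSS around a plain single-process model, or checkpoint_wrapper alone to cut activation memory.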
Programming Language: Python.