Learn about the distributed techniques of Colossal-AI to maximize the runtime performance of your large neural networks.
Colossal-AI provides a collection of parallel components. Our goal is to let you write distributed deep learning models the same way you write a model on your laptop: user-friendly tools kickstart distributed training and inference in just a few lines (see the sketch after the feature list below). Supported features include:
Data Parallelism
Pipeline Parallelism
1D, 2D, 2.5D, 3D Tensor Parallelism
Sequence Parallelism
Zero Redundancy Optimizer (ZeRO)
Auto-Parallelism
Heterogeneous Memory Management (PatrickStar)
Friendly usage: parallelism driven by a configuration file (see the legacy config sketch below)
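To make the "few lines of code" claim concrete, the sketch below shows how a plain PyTorch model, optimizer, and loss can be handed to Colossal-AI, with a plugin selecting the parallel and memory-management strategy. This is a minimal sketch, assuming a recent release that provides the Booster API (`colossalai.booster.Booster`, `GeminiPlugin`) and `HybridAdam`; exact signatures vary between versions, so treat it as illustrative rather than the canonical upstream example.

```python
# Minimal sketch (not the verbatim upstream example): boosting a plain PyTorch
# model with Colossal-AI. Assumes a recent release with the Booster API and
# GeminiPlugin; exact signatures differ across versions.
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin
from colossalai.nn.optimizer import HybridAdam

# Set up the distributed environment; the script is started with
# `colossalai run --nproc_per_node <N> train.py` or torchrun.
# (Older releases require a `config` argument here.)
colossalai.launch_from_torch()

model = torch.nn.Linear(1024, 1024)
optimizer = HybridAdam(model.parameters(), lr=1e-3)
criterion = torch.nn.MSELoss()

# The plugin chooses the strategy (Gemini heterogeneous memory management here;
# TorchDDPPlugin, LowLevelZeroPlugin, or HybridParallelPlugin are alternatives).
booster = Booster(plugin=GeminiPlugin())
model, optimizer, criterion, _, _ = booster.boost(model, optimizer, criterion)

# A training step then looks like ordinary PyTorch, with the booster
# handling the backward pass.
inputs = torch.randn(8, 1024).cuda()
targets = torch.randn(8, 1024).cuda()
loss = criterion(model(inputs), targets)
booster.backward(loss, optimizer)
optimizer.step()
optimizer.zero_grad()
```

In this API, moving between data parallelism, ZeRO, Gemini, and hybrid tensor/pipeline parallelism is largely a matter of swapping the plugin; the training loop itself stays the same.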
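The "parallelism driven by a configuration file" item refers to the older, declarative way of describing the parallel layout. Below is a sketch in the style of that legacy format; the field names are taken from older releases and are an assumption for current ones, where plugins have largely replaced this mechanism. Such a file would be passed to `colossalai.launch` via its `config` argument.

```python
# config.py -- legacy-style configuration sketch; field names follow older
# Colossal-AI releases and may have moved (e.g., under colossalai.legacy)
# in current versions.
from colossalai.amp import AMP_TYPE

BATCH_SIZE = 128
NUM_EPOCHS = 2

fp16 = dict(mode=AMP_TYPE.NAIVE)        # naive mixed-precision training
parallel = dict(
    pipeline=2,                         # two pipeline stages
    tensor=dict(size=4, mode='2d'),     # 4-way 2D tensor parallelism
)
```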
ColossalChat: an open-source solution for cloning ChatGPT with a complete RLHF pipeline. [code] [blog] [demo] [tutorial]
LLaMA3: 70-billion-parameter model training accelerated by 18%