1. Model Serving (Local & Global)

2. Continuous Batching

3. Parallelism Strategies (Pros, Cons, Distinguishing Features)

A. Data Parallelism

B. Tensor (Model) Parallelism