SGLang: An Efficient Open-Source Framework for Large-Scale LLM Serving

September 14, 10:15 - 10:50

Location: Venue 3 - 268

SGLang is an efficient open-source framework for large-scale LLM serving. Over the past year, it has iterated rapidly and advanced significantly. This talk gives an overview of SGLang's key features, including KV cache reuse, zero-overhead batch scheduling, speculative decoding, prefill/decode disaggregation, and large-scale expert parallelism.

Speakers

Yi Zhang

Software Engineer, Alibaba Cloud Computing Co., Ltd.