ws-sglang

High-Performance Inference Practice for Large Language Models on iFLYTEK MaaS Platform

September 14

11:25 - 12:00

Location: Venue 3 - 268

A technical introduction to iFLYTEK's Prefill-Decode (PD) separation approach, which builds on open-source engines combined with iFLYTEK's own inference service framework.

Speakers