High-Performance Inference Practice for Large Language Models on iFLYTEK MaaS Platform
September 14 • 11:25 - 12:00
Location: Venue 3 - 268
A technical introduction to iFLYTEK's prefill-decode (PD) separation approach, which is built on open-source inference engines combined with the company's own inference service framework.