Multimodal Large Language Model Practices at Ant Group
September 14 • 11:25 - 12:00
Location: Venue 4 - 338
This talk introduces Ant Group's open-source Ming-Omni series of large models and shares practices and progress in multimodal model architecture evolution, cross-modal fusion, and unified generation and understanding. By jointly designing and optimizing the model architecture and the training process, we aim to give models capabilities across all modalities: multimodal foundation models that can see, hear, speak, and draw.