High-acceptance-rate speculative decoding schemes, exemplified by EAGLE and MTP, are driving the deployment of speculative inference. By computing multiple candidate tokens in a single forward pass, speculative inference can fully exploit Ascend's high ratio of compute density to memory bandwidth. To this end, we developed omniinfer, a high-performance inference framework that fully unleashes Ascend's performance. Targeting the model-structure characteristics of schemes such as EAGLE and MTP, we optimized the speculative-inference scheduling framework to reduce Ascend idle time, and refined the sampling method to improve acceptance rates while preserving model accuracy. We also applied hardware-specific optimizations tailored to Ascend's architecture.
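To make the draft-then-verify mechanism concrete, the sketch below shows the standard lossless verification rule for speculative decoding (accept each draft token with probability min(1, p/q), resample from the residual on rejection). This is a generic illustration of the technique, not omniinfer's actual scheduler or its specific sampling optimization; the function name and array shapes are hypothetical.

```python
import numpy as np

def verify_draft(draft_tokens, draft_probs, target_probs, rng):
    """Generic speculative-decoding verification via rejection sampling.

    draft_tokens    : tokens proposed by the draft model
    draft_probs[i]  : draft model's distribution at step i, shape (vocab,)
    target_probs[i] : target model's distribution at step i, shape (vocab,)
                      (one extra entry for the bonus token)
    Returns the accepted tokens plus one corrected or bonus token, so each
    target forward pass yields at least one token and often several.
    """
    accepted = []
    for i, tok in enumerate(draft_tokens):
        p = target_probs[i][tok]
        q = draft_probs[i][tok]
        # Accept the draft token with probability min(1, p/q); this keeps
        # the output distribution identical to the target model's.
        if rng.random() < min(1.0, p / q):
            accepted.append(int(tok))
        else:
            # On rejection, resample from the residual max(0, p - q),
            # renormalized, and stop consuming further draft tokens.
            residual = np.maximum(target_probs[i] - draft_probs[i], 0.0)
            residual /= residual.sum()
            accepted.append(int(rng.choice(len(residual), p=residual)))
            return accepted
    # All draft tokens accepted: sample one bonus token from the target.
    bonus = target_probs[len(draft_tokens)]
    accepted.append(int(rng.choice(len(bonus), p=bonus)))
    return accepted
```

Because several tokens are verified in one target-model forward pass, the per-token memory-bandwidth cost drops, which is why the scheme suits hardware with a high compute-to-bandwidth ratio.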