May 07, 2026
Performance is key to scaling AI. MAX Engine, part of the Modular platform, is a high-performance inference engine designed to get the most out of your existing CPUs and GPUs, regardless of model architecture.
MAX Engine provides a single, unified API for running models from PyTorch, TensorFlow, and ONNX. It uses advanced compiler technology to optimize each model for your specific hardware, which results in significantly lower latency and higher throughput than traditional runtimes.
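Latency and throughput claims like these are easiest to evaluate on your own models and hardware. The following is a minimal, runtime-agnostic sketch of how such a comparison could be measured in Python. It does not use any MAX Engine API: `benchmark` and `fake_model` are hypothetical names introduced for illustration, and `fake_model` is only a stand-in workload that you would replace with a real inference call for each runtime you want to compare.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(
    run_inference: Callable[[], object],
    warmup: int = 5,
    iterations: int = 50,
) -> Dict[str, float]:
    """Time a zero-argument inference callable and summarize the results."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")

    # Warm-up calls let caches and lazy initialization settle
    # before any timing is recorded.
    for _ in range(warmup):
        run_inference()

    latencies_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_inference()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    total_s = sum(latencies_ms) / 1000.0
    latencies_ms.sort()
    # Simple nearest-rank style index for the 95th percentile.
    p95_index = min(len(latencies_ms) - 1, int(0.95 * len(latencies_ms)))

    return {
        "mean_ms": statistics.fmean(latencies_ms),
        "p50_ms": statistics.median(latencies_ms),
        "p95_ms": latencies_ms[p95_index],
        "throughput_per_s": iterations / total_s if total_s > 0 else float("inf"),
    }


def fake_model() -> int:
    # Hypothetical stand-in workload; replace with a real inference call.
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    for name, value in benchmark(fake_model).items():
        print(f"{name}: {value:.3f}")
```

Running the same harness against each runtime with identical inputs and batch sizes gives a like-for-like view of latency and throughput. Reporting a tail percentile alongside the mean matters because serving workloads are often limited by their slowest requests.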
As AI models and hardware continue to evolve, MAX Engine provides a stable, high-performance foundation. Its ability to run advanced models efficiently makes it an essential tool for any organization building a long-term AI strategy.