The DARPA MOCHA program aims to reinvent compiler technology for the age of heterogeneous computing. Modern systems combine CPUs, GPUs, and specialized accelerators, but today’s compilers require significant manual engineering to support each new hardware type. MOCHA seeks to reduce the cost of supporting next-generation hardware by using machine learning and advanced optimization to automatically model new architectures, select optimizations, and generate efficient code with minimal human intervention.
DARPA’s program-level goals include:
- Reducing human effort to adapt compilers to new hardware by 90%
- Improving code performance (throughput, power, and memory efficiency) by up to 5× over current compilers
- Building an open, extensible compiler foundation for future heterogeneous architectures
Our Approach
Aarno Labs leads the Compiler 2.0 team under MOCHA, partnering with MIT CSAIL and the University of Illinois Urbana–Champaign (UIUC) to create the next generation of data-driven, verifiable compiler infrastructure. Our project leverages neuro-symbolic techniques, combining deep neural networks with formal verification and data-frugal performance modeling, all built into robust compiler infrastructure.
Key Technical Contributions
- Representations (MIT – Prof. Saman Amarasinghe) Development of new intermediate representations (IRs) that enable more expressive and optimizable compiler transformations.
- Rewrite System and Optimization Synthesis (MIT – Profs. Amarasinghe & Ragan-Kelley) Automated generation of equivalence-preserving rewrite rules using LLMs, combined with an equality-saturation engine that explores and selects optimal program variants guided by learned cost models.
- Formal Validation (MIT – Prof. Adam Chlipala) A Rocq-verified proof framework ensures all rewrite rules and extracted optimizations are semantically correct, creating a mathematically trusted compiler optimization pipeline.
- Performance Modeling and Autotuning (UIUC – Profs. Charith Mendis & Edgar Solomonik) Data-frugal cost modeling using tensor completion and transfer learning techniques to predict performance from limited samples, enabling fast adaptation to new architectures.
- Architecture Description via Rewrite System (MIT – Profs. Amarasinghe & Armando Solar-Lezama) A novel method for expressing hardware instruction sets as rewrite rules into a generic IR, allowing automatic retargeting to new accelerators with minimal manual effort.
- Learning-Guided Optimization and Surrogate Compilation (MIT – Prof. Michael Carbin) Development of machine learning–based frameworks that learn optimization policies and performance surrogates directly from empirical data while maintaining verified correctness through integration with the rewrite and validation framework.
- Infrastructure Integration and Program Management (Aarno Labs – Dr. Michael Gordon) Aarno integrates all research components into production-quality capabilities, ensuring robust, tested, and documented open-source releases. Aarno manages the end-to-end toolchain, DARPA reporting, and Year 1–3 evaluation readiness across MIT and UIUC partners.
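To make the rewrite-and-select idea in the list above concrete, here is a minimal Python sketch. It is an illustration only, not the Compiler 2.0 implementation: the expression encoding, the rewrite rules, the `COST` table, and every function name are hypothetical. The rules are written by hand rather than generated by an LLM, the cost table is hand-assigned rather than learned, the search is a bounded enumeration of equivalent programs rather than a true equality-saturation engine built on an e-graph, and the final check evaluates on sample inputs rather than producing a Rocq proof.

```python
# Toy cost-guided rewriting. Expressions are nested tuples such as
# ("mul", "x", 2); leaves are variable names (str) or integer constants.

# Hypothetical per-operator costs standing in for a learned cost model.
COST = {"add": 1, "shl": 1, "mul": 4}


def cost(e):
    """Sum operator costs over the whole expression tree; leaves are free."""
    if isinstance(e, tuple):
        op, a, b = e
        return COST[op] + cost(a) + cost(b)
    return 0


def rewrites(e):
    """Yield expressions obtained by applying one rule at the root of e."""
    if not isinstance(e, tuple):
        return
    op, a, b = e
    if op == "mul" and b == 2:
        yield ("shl", a, 1)   # x * 2  ->  x << 1
    if op == "mul" and b == 1:
        yield a               # x * 1  ->  x
    if op == "add" and b == 0:
        yield a               # x + 0  ->  x
    if op in ("add", "mul"):
        yield (op, b, a)      # commutativity


def neighbors(e):
    """Yield expressions one rewrite away from e, at the root or any subterm."""
    yield from rewrites(e)
    if isinstance(e, tuple):
        op, a, b = e
        for a2 in neighbors(a):
            yield (op, a2, b)
        for b2 in neighbors(b):
            yield (op, a, b2)


def optimize(e, max_nodes=1000):
    """Enumerate equivalent expressions breadth-first and return the cheapest."""
    seen = {e}
    frontier = [e]
    while frontier and len(seen) < max_nodes:
        next_frontier = []
        for cur in frontier:
            for n in neighbors(cur):
                if n not in seen:
                    seen.add(n)
                    next_frontier.append(n)
        frontier = next_frontier
    return min(seen, key=cost)


def evaluate(e, env):
    """Interpret an expression under a variable assignment (a sanity check)."""
    if isinstance(e, str):
        return env[e]
    if isinstance(e, int):
        return e
    op, a, b = e
    x, y = evaluate(a, env), evaluate(b, env)
    if op == "add":
        return x + y
    if op == "mul":
        return x * y
    return x << y  # "shl"


original = ("add", ("mul", "x", 2), 0)
best = optimize(original)
print(best, cost(original), cost(best))
```

On this input the search reaches `("shl", "x", 1)`, which under the assumed cost table is cheaper than the original multiply-and-add. A real equality-saturation engine would share subterms in an e-graph instead of enumerating whole expressions, which is what lets it scale beyond toy programs.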
Impact
MOCHA will produce next-generation open-source compiler frameworks that learn to optimize for new hardware and formally guarantee correctness.
By combining machine learning with verified transformation, Compiler 2.0 will reduce the human effort required to support emerging architectures, accelerate compiler retargeting, and enable trustworthy AI-assisted code generation for heterogeneous systems.
Funding Source
DARPA: Machine Learning and Optimization-Guided Compilers for Heterogeneous Architectures (MOCHA)
Program Dates
Start: September 2025
End: September 2028