Alif adds ExecuTorch support to Ensemble and Balletto MCUs

Alif Semiconductor has announced that its latest Ensemble microcontroller products are now compatible with Meta’s ExecuTorch library. ExecuTorch is designed to reduce the size of models developed in PyTorch so that they fit within the limited compute and memory resources typical of embedded Edge systems.

PyTorch is an open-source machine learning framework used in applications such as computer vision, deep learning research, and natural language processing. ExecuTorch extends this ecosystem to resource-constrained devices, including embedded systems, wearables, mobile products, and microcontrollers, enabling developers to deploy PyTorch models at the Edge.

I spoke to Henrik Flodell, Senior Product Marketing Director at Alif Semiconductor, in this Q&A to learn more about the company’s decision to support ExecuTorch, and what this means for its Edge-focused customers.

Why does Alif see PyTorch as a key framework for Edge AI developers?

Although the TensorFlow framework still holds an edge in deployed solutions, we have seen a significant uptick in interest in PyTorch among researchers and data scientists over the last few years for the development of new AI-based use cases (Google Trends, industry adoption vs research interest study), and it is generally expected that PyTorch deployments will soon eclipse TensorFlow in commercial products. ExecuTorch’s main goal is to simplify development and deployment on resource-constrained devices: it allows developers to take models created in PyTorch and run them directly on edge devices such as wearables and microcontrollers, without the loss of model accuracy that intermediate conversions to formats like TFLite or ONNX can introduce.

How does ExecuTorch make PyTorch models suitable for embedded Edge systems?

ExecuTorch provides a very straightforward migration path from Cloud-based model deployment to deployment directly on embedded devices, without the need for complex intermediate conversion steps. Developers can author and train models in the familiar PyTorch environment and then deploy them on resource-constrained hardware, helping them overcome memory limits, keep power consumption in check, and preserve processing speed and model accuracy.
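To make that concrete, the sketch below shows the basic export flow: a model authored in PyTorch is captured with torch.export, converted to ExecuTorch’s Edge dialect, and serialised to a .pte program that the on-device runtime can load. The tiny model and file name are placeholders for illustration, not Alif-supplied code.

```python
# Minimal sketch of the PyTorch -> ExecuTorch export flow.
# The model below is a placeholder; real applications export their own nn.Module.
import torch
from torch.export import export
from executorch.exir import to_edge


class TinyClassifier(torch.nn.Module):
    """Stand-in for a model authored and trained in PyTorch."""

    def __init__(self) -> None:
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(64, 32),
            torch.nn.ReLU(),
            torch.nn.Linear(32, 4),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


model = TinyClassifier().eval()
example_inputs = (torch.randn(1, 64),)

exported = export(model, example_inputs)   # capture the graph with torch.export
edge = to_edge(exported)                   # convert to ExecuTorch's Edge dialect
et_program = edge.to_executorch()          # ahead-of-time lowering and memory planning

# Serialise the program; the .pte file is what the on-device runtime loads.
with open("tiny_classifier.pte", "wb") as f:
    f.write(et_program.buffer)
```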

What challenges did you face in making Ensemble MCUs compatible with ExecuTorch?

Thanks to the built-in support for TOSA (Tensor Operator Set Architecture) in ExecuTorch, we did not run into any significant issues while bringing it up on our silicon. ExecuTorch uses TOSA as an intermediate representation to define and compile models for specific hardware, like the Ethos-U NPUs Alif uses. The ExecuTorch runtime then takes this compiled TOSA program (a .pte file) and executes it on the target device.
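As a rough illustration of where that compilation step sits, the sketch below delegates the exported graph to the Arm backend before serialisation. The ArmCompileSpecBuilder and ArmPartitioner names and the "ethos-u55-128" target string are assumptions based on the Arm backend in the ExecuTorch source tree; the exact class names and arguments change between releases, so check them against the version in use.

```python
# Sketch of delegating a PyTorch model to the Arm/Ethos-U backend, which
# compiles it through the TOSA intermediate representation into a .pte program.
# ASSUMPTION: the ArmCompileSpecBuilder / ArmPartitioner names and the
# "ethos-u55-128" target string come from the Arm backend in the ExecuTorch
# source tree and have changed between releases -- verify against your version.
import torch
from torch.export import export
from executorch.exir import to_edge

from executorch.backends.arm.arm_backend import ArmCompileSpecBuilder   # assumed path
from executorch.backends.arm.arm_partitioner import ArmPartitioner      # assumed path

# Placeholder model standing in for a real PyTorch network.
model = torch.nn.Sequential(
    torch.nn.Linear(64, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 4),
).eval()
example_inputs = (torch.randn(1, 64),)

# Describe the target NPU so the backend can compile TOSA for it.
compile_spec = (
    ArmCompileSpecBuilder()
    .ethosu_compile_spec("ethos-u55-128")   # assumed target identifier
    .build()
)

edge = to_edge(export(model, example_inputs))

# Subgraphs claimed by the partitioner are compiled ahead of time via TOSA;
# anything unsupported falls back to the runtime's portable CPU operators.
edge = edge.to_backend(ArmPartitioner(compile_spec))

# The resulting .pte file contains the compiled program that the ExecuTorch
# runtime executes on the device.
et_program = edge.to_executorch()
with open("model_ethos_u.pte", "wb") as f:
    f.write(et_program.buffer)
```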

Can you explain how PyTorch and ExecuTorch helped achieve high-performance AI on Ensemble MCUs?

ExecuTorch delivers a direct, practical impact on the embedded development lifecycle by allowing engineers to move from a trained model to a working prototype on a target device in a fraction of the time. Developers working in PyTorch can now target a larger ecosystem that includes Edge devices, such as Alif’s Ensemble and Balletto MCUs and fusion processors, reducing development effort and minimising fragmentation.

Ahead-of-time (AOT) compilation and memory planning yield higher performance and lower latency, which is essential for real-time applications, and PyTorch’s robust debugging tools improve developer productivity.
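Continuing the export sketch from earlier, the ahead-of-time memory planning happens when the Edge program is lowered to the final ExecuTorch program. The config field and pass class below are assumptions based on recent ExecuTorch releases and are shown only to indicate where this step lives in the flow.

```python
# Continuation of the export sketch above (`edge` is the EdgeProgramManager
# returned by to_edge()). ASSUMPTION: the ExecutorchBackendConfig field and the
# MemoryPlanningPass import path reflect recent ExecuTorch releases and may differ.
from executorch.exir import ExecutorchBackendConfig
from executorch.exir.passes.memory_planning_pass import MemoryPlanningPass

et_program = edge.to_executorch(
    ExecutorchBackendConfig(
        # Tensor lifetimes and buffer offsets are resolved ahead of time, so the
        # on-device runtime performs no dynamic allocation on the inference path.
        memory_planning_pass=MemoryPlanningPass(),
    )
)
```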

Are there specific types of AI models or applications that perform especially well on Ensemble MCUs using ExecuTorch?

As an example, wearable healthcare devices can be developed with the ability to provide personalised health coaching. Smartwatches and fitness rings can use generative AI to analyse a user’s biometric data (like heart rate, sleep patterns, and activity levels) to provide personalised, real-time coaching, goal setting, and motivation.

Clinical wearables, such as continuous glucose monitors or implantable sensors, can also leverage Edge AI to analyse patient data locally for real-time diagnostics and early warning alerts. Processing the data locally also ensures sensitive patient data remains private and reduces reliance on consistent cloud connectivity.

There are also several use cases in consumer electronics, such as next-generation smart speakers and personal assistants that will be able to move beyond simple command-and-response. Using small, local language models, they can generate more natural, conversational, and context-aware responses without sending all voice data to the Cloud, again improving privacy and performance.

What do the new demo designs showcase about Ensemble MCUs’ capabilities with PyTorch models?

Generative AI on battery-powered devices enables developers to create highly personalised, context-aware products that operate with enhanced privacy and autonomy. Local data processing provides faster responses and minimises energy and bandwidth requirements by reducing or eliminating the need for transmitting data to the cloud.

How does supporting ExecuTorch benefit Alif’s customers in practical terms, for example, development speed or system efficiency?

One of Alif’s value propositions since day one has been to provide low-power characteristics, not only when the device is in its sleep state, but also when applications are running, data is being collected, and computational workloads are executed. Alif has developed a unique system architecture that enables applications to run in a “High-Efficiency” domain for the always-on part of their duty cycle, and in “High-Performance” mode when there is computationally intensive work to be done. Hardware accelerators for tasks including AI operations, graphics, and image processing are available in both modes of operation, letting applications finish the work that needs to be done quickly and spend more time in low-power states. Alif provides integrated Ethos-based NPUs in its devices that offer hardware acceleration for convolutional, recurrent, and transformer-based networks, which enables ExecuTorch models to be deployed across any device in the Balletto and Ensemble product families.

Will existing Ensemble MCU users be able to adopt ExecuTorch easily, or are there hardware/software prerequisites?

Developers will be able to deploy AI models using ExecuTorch on all Alif devices in the Balletto and Ensemble family.

How do you see the adoption of PyTorch and ExecuTorch shaping the future of Edge AI development for microcontrollers?

The rapid expansion of AI capabilities we have seen over the last few years shows no signs of slowing down, and alongside hardware to run the models being created, the tools and software must keep pace with those advancements. Platforms like ExecuTorch play a key role here by streamlining the task of moving models from cloud to endpoint.
