
Lionel Weicker
Senior Data Scientist Expert

TinyML, the Next (Big) Thing


In the expansive landscape of machine learning, where attention often gravitates towards Large Language Models (LLMs), a quiet revolution is underway: the ascent of TinyML. In stark contrast to the grandeur of LLMs, TinyML focuses on deploying machine learning models directly onto microcontrollers.

What are Microcontrollers?

Microcontrollers are compact integrated circuits designed to execute specific tasks within embedded systems. They can be seen as tiny computers with a modest footprint, small enough to run even on battery power. They are used in a wide range of applications, from household appliances to industrial machinery.

Here is a comparison of key aspects of microcontrollers versus traditional computing devices:

Aspect              Microcontrollers    Traditional Computing
Memory              8 KB - 8 MB         GBs
Flash Storage       32 KB - 8 MB        GBs - TBs
Power Consumption   Milliwatts          Watts - Kilowatts
Processing Power    30 - 400 MHz        GHz
Price               < 15 €              1k - 10k €

As the comparison shows, microcontrollers operate several orders of magnitude below traditional computing devices on each of these key aspects. This vast disparity highlights the unique position microcontrollers occupy in the computing landscape.

However, it also brings significant challenges when it comes to deploying machine learning models on microcontrollers.

Challenges in deploying ML on microcontrollers

When contemplating the implementation of machine learning on microcontrollers, significant challenges arise. For instance, in the case of image classification, even lightweight models are often too large for microcontrollers. An illustrative example is MobileNet-V2, which has a model size of 14.0 MB.
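
To see where that figure comes from, here is a rough back-of-the-envelope estimate, assuming the commonly cited count of roughly 3.5 million parameters for MobileNet-V2, each stored as a 32-bit float:

```python
# Rough model-size estimate: parameter count x bytes per weight.
params = 3.5e6          # approximate parameter count of MobileNet-V2
bytes_per_weight = 4    # 32-bit float
size_mb = params * bytes_per_weight / 1e6
print(f"Estimated size: {size_mb:.1f} MB")  # ~14 MB
```

Fourteen megabytes of weights alone already exceed the flash storage of most microcontrollers, before even accounting for code, buffers, and intermediate activations.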

The primary question revolves around reducing the model size to make it compatible with microcontrollers. This involves addressing specific limitations:

  • Limited Memory: Microcontrollers typically operate within memory constraints ranging from 8KB to 8MB, a significant departure from the gigabytes of RAM found in traditional computing devices.
  • Flash Storage Constraints: Microcontrollers offer only a fraction of the storage space available in traditional computing devices, with flash storage capacities ranging from 32KB to 8MB.
  • Modest Processing Power: Operating in the range of 30 to 400 MHz, microcontrollers exhibit a substantial reduction in processing power compared to traditional computing devices operating in the gigahertz range.

Given those limitations, a fundamental question arises: why opt for TinyML, i.e. why deploy machine learning models on microcontrollers at all?

Why opt for TinyML?

Well, it offers several advantages:

  • Reduced Data Transmission: Processing data directly on the edge minimizes the need for transmitting large volumes of raw data over networks, conserving bandwidth and enhancing privacy.
  • Lower Latency: On-device processing eliminates the latency associated with sending data to remote servers, crucial for real-time decision-making in applications like autonomous systems and IoT devices.
  • Cost Efficiency: Edge computing with TinyML reduces reliance on powerful cloud-based servers, leading to cost savings in infrastructure and data storage. This makes machine learning more accessible to resource-constrained applications and industries.

The combination of addressing microcontroller constraints and adopting on-device processing makes TinyML an important field of research, introducing an era where intelligence is embedded at the edge rather than centralized.

How to shrink ML models?

Efforts to reduce the model size and accommodate microcontrollers involve various optimization techniques. However, it’s crucial to note that these optimizations may come at a cost, potentially affecting the model’s performance. Here are some strategies:

  1. Lighter Neural Network Architectures: Opt for models with fewer parameters and reduced operational complexity, tailoring the architecture to the specific constraints of microcontrollers.
  2. Quantization: Reduce the precision of the numerical values in the model, for example from 32-bit floats to 8-bit integers (see the sketch right after this list).
  3. Pruning: Remove connections that contribute little to the network’s output, reducing its size.
  4. Distillation: Train a smaller “student” model to replicate the behaviour of a larger, more complex “teacher” model (a sketch follows further below).
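
As a concrete illustration of quantization, here is a minimal sketch using TensorFlow Lite’s post-training quantization; the small example network and file name are placeholders, and any comparable toolchain could be used instead:

```python
import tensorflow as tf

# A small example network standing in for your trained model.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu", input_shape=(96, 96, 1)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2),
])

# Post-training quantization: the default optimization stores weights as
# 8-bit integers instead of 32-bit floats, roughly a 4x size reduction.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
print(f"Quantized model size: {len(tflite_model) / 1024:.1f} KB")
```

The resulting .tflite file can then be executed on-device with a lightweight interpreter such as TensorFlow Lite for Microcontrollers.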

When selecting optimization techniques, consider the specific requirements of your case and be mindful of the potential impact on the model’s capabilities.
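
The distillation idea (item 4 above) can be sketched in a similarly compact way. The snippet below is a minimal illustration with toy models and random data; in practice the teacher would be a large pretrained network, the student a much smaller architecture sized for the target device, and the distillation loss would usually be combined with a standard supervised term:

```python
import tensorflow as tf

# Toy stand-ins for a large "teacher" and a much smaller "student" network.
teacher = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(10),
])
student = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(10),
])

def distillation_loss(teacher_logits, student_logits, temperature=3.0):
    # The student learns to match the teacher's softened output distribution.
    soft_targets = tf.nn.softmax(teacher_logits / temperature)
    log_soft_preds = tf.nn.log_softmax(student_logits / temperature)
    return -tf.reduce_mean(tf.reduce_sum(soft_targets * log_soft_preds, axis=-1))

optimizer = tf.keras.optimizers.Adam(1e-3)
x = tf.random.normal((64, 32))  # dummy unlabeled batch

for step in range(100):
    teacher_logits = teacher(x, training=False)
    with tf.GradientTape() as tape:
        student_logits = student(x, training=True)
        loss = distillation_loss(teacher_logits, student_logits)
    grads = tape.gradient(loss, student.trainable_variables)
    optimizer.apply_gradients(zip(grads, student.trainable_variables))
```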

Shrinking ML models: Beyond microcontrollers

The ability to reduce the footprint of machine learning models holds significance for various reasons:

  • Cost efficiency: Shrinking model size leads to reduced infrastructure costs. With smaller models, you require less computational power to execute tasks, translating into cost savings for cloud-based or on-premises AI solutions.
  • Lower latency: Smaller models contribute to decreased latency in AI applications. Faster model inference times enable quicker responses, enhancing the overall user experience in real-time applications.
  • Bandwidth conservation: Reduced model size translates to smaller model files, conserving bandwidth during model deployment and updates. This is particularly crucial for applications operating in bandwidth-limited scenarios or with constraints on data transmission.

In summary, mastering the art of shrinking model size not only benefits microcontroller applications but is a fundamental skill for anyone involved in AI development. It aligns with the broader goals of cost-effectiveness, low-latency experiences, and efficient bandwidth utilization.

Conclusion

In the field of machine learning, TinyML emerges as a crucial force, enabling always-on machine learning solutions directly on microcontrollers.

One of the key tasks in TinyML revolves around reducing the size of machine learning models so that they fit within microcontroller constraints. This involves leveraging well-established optimization techniques such as pruning, quantization, and distillation.

Mastering the skill of shrinking model size is not only crucial for microcontroller enthusiasts but also an essential proficiency for AI developers, as it brings efficiency to any AI solution, whether deployed in the cloud or on premises. This approach holds the promise of significant benefits, including reduced infrastructure costs, lower latency, and bandwidth conservation.

Moreover, ongoing research is likely to unveil new optimization techniques, enhancing not only the capabilities of TinyML but also contributing to advancements in the broader field of machine learning.

At Arhs Spikeseed, we’re dedicated to tackling the challenges of each project. We carefully consider our clients’ hardware, budget, and latency requirements, whether we’re working on in-house solutions, cloud setups, or edge device projects. As an integral part of our process, we incorporate model size reduction, ensuring our solutions are production-ready.

In conclusion, TinyML represents the democratization of intelligence, extending accessibility to machine learning across industries. Within the realm of TinyML, the blend of innovation, creativity, and efficiency points to a future where intelligence transcends constraints.

Now it’s your turn!

Schedule a 1-on-1 with an ARHS Machine Learning Expert today!