Mamba-Enhanced Emotion Analysis TinyML Models for Embedded Devices Deployment
Article 2025 en
Authors
Xing Jin
Shakir Khan
Mehdi Hosseinzadeh
Abstract
The accuracy of emotion analysis has improved rapidly thanks to breakthroughs in convolutional neural networks (CNNs) and Transformers. In particular, the multi-head self-attention (MSA) mechanism of the Transformer is well suited to modeling the dependencies between expressions and different facial regions. However, CNNs struggle to capture global dependencies, and the quadratic complexity of Transformers makes them hard to deploy on low-power devices. To resolve these issues, we design a robust and efficient hybrid tiny machine learning (TinyML) model, named HCMTMM, for emotion recognition on ultra-low-power embedded devices. Specifically, we combine a CNN with a Mamba module built on the state space model (SSM) framework. The resulting hybrid deep model exploits both the local and global dependencies of different facial regions to improve emotion recognition performance with linear computational complexity. We further use multi-loss distillation learning to enhance recognition performance. Extensive comparative experiments on four publicly available datasets show that, when compared against the CNN family of models, our proposed solution outperforms the other implementations in accuracy and model size. Finally, we port the proposed model to the ESP32 Cam embedded platform and test it there, where it achieves remarkable inference speed.
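To illustrate why an SSM-based module scales linearly with sequence length, the following is a minimal sketch of a diagonal linear state space recurrence in plain Python. It is not the HCMTMM architecture or the authors' code: the function name `ssm_scan` and the parameters `a`, `b`, and `c` are hypothetical, and the parameters are fixed here, whereas Mamba makes them depend on the input (the selective mechanism).

```python
from typing import List, Sequence


def ssm_scan(
    x: Sequence[float],
    a: Sequence[float],
    b: Sequence[float],
    c: Sequence[float],
) -> List[float]:
    """Run a diagonal linear state space recurrence over a 1-D sequence.

    For each step t (elementwise over the state dimension):
        h_t = a * h_{t-1} + b * x_t
        y_t = sum(c * h_t)

    Assumption: a, b, c are fixed per-state-dimension parameters of equal
    length. A Mamba block would instead compute them from the input.
    """
    if not (len(a) == len(b) == len(c)):
        raise ValueError("a, b and c must have the same length")

    state = [0.0] * len(a)  # hidden state h_0 = 0
    outputs: List[float] = []
    for x_t in x:  # one pass over the sequence: cost grows linearly in len(x)
        state = [a_i * h_i + b_i * x_t for a_i, h_i, b_i in zip(a, state, b)]
        outputs.append(sum(c_i * h_i for c_i, h_i in zip(c, state)))
    return outputs


if __name__ == "__main__":
    # Impulse input: the output decays geometrically with factor a = 0.5.
    print(ssm_scan([1.0, 0.0, 0.0], a=[0.5], b=[1.0], c=[2.0]))
```

Each step touches only the fixed-size hidden state, so the total work is proportional to the sequence length times the state size. Self-attention, by contrast, compares every position with every other position, which is the source of the quadratic cost mentioned above.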