INTEGRATING MULTIPLE MODALITIES FOR ACCURATE EMOTION RECOGNITION: A DEEP LEARNING ENSEMBLE APPROACH

Neeraj Kumar

doi:10.56726/irjmets33651


Article 2023 en

Abstract

Recognizing emotions in multimodal data is a complex task, particularly in fields such as Human-Computer Interaction (HCI). This study proposes a multimodal approach to emotion recognition using the MELD dataset, which includes audio, text, and facial features; only the audio and text features are used in this research. The audio data is converted to Mel-frequency cepstral coefficients (MFCCs), which are fed to a bidirectional LSTM that performs emotion classification. The text data is tokenized with BERT and then fed to a second bidirectional LSTM for emotion classification. The outputs of the two modalities are combined using a voting ensemble, and performance is evaluated using F1-score and confusion matrices. The unimodal audio model achieved an F1-score of 41.69% and the unimodal text model achieved 47.29%. The voting ensemble achieved an F1-score of 47.47%, a gain of 0.18 percentage points over the text model alone. The paper also discusses future work on improving the deep learning models and combining them with ensemble methods to enhance emotion recognition on multimodal datasets.
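The abstract does not say which voting scheme is used, so the following is only a minimal sketch of one common variant, soft voting, in which the per-class probabilities of the audio and text classifiers are averaged and the highest-scoring class is chosen. The function name `soft_vote`, the equal default weights, and the example probabilities are assumptions for illustration and are not taken from the paper; the seven labels are the MELD emotion classes.

```python
import numpy as np

# The seven emotion labels used in MELD.
EMOTIONS = ["anger", "disgust", "fear", "joy", "neutral", "sadness", "surprise"]


def soft_vote(audio_probs, text_probs, weights=(0.5, 0.5)):
    """Combine two models' class probabilities by weighted averaging.

    audio_probs, text_probs: arrays of shape (n_utterances, n_classes).
    weights: hypothetical per-modality weights (equal by default).
    Returns the index of the winning class for each utterance.
    """
    audio_probs = np.asarray(audio_probs, dtype=float)
    text_probs = np.asarray(text_probs, dtype=float)
    if audio_probs.shape != text_probs.shape:
        raise ValueError("both models must score the same utterances and classes")
    w_audio, w_text = weights
    combined = w_audio * audio_probs + w_text * text_probs
    return combined.argmax(axis=1)


if __name__ == "__main__":
    # Made-up probabilities for two utterances, for illustration only.
    audio = [[0.1, 0.0, 0.0, 0.6, 0.2, 0.1, 0.0],
             [0.3, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1]]
    text = [[0.0, 0.0, 0.0, 0.3, 0.6, 0.1, 0.0],
            [0.1, 0.0, 0.0, 0.1, 0.1, 0.6, 0.1]]
    print([EMOTIONS[i] for i in soft_vote(audio, text)])
```

With only two voters, hard (majority) voting ties whenever the models disagree, which is one reason a probability-averaging scheme is a natural choice for a sketch like this; the paper may well resolve disagreements differently.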

