Jiby Mariya Jose

Energy-reduced bio-inspired 1D-CNN for audio emotion recognition

Authors : Jiby Mariya Jose, Jeeva Jose

Journal title : International Journal of Scientific Research in Computer Science, Engineering and Information Technology

Publisher : Technoscience Academy

Online ISSN : 2456-3307

Page Number : 1034-1054

Journal volume : 11

Journal issue : 3


This paper proposes EPyNet, a deep learning architecture designed for energy-reduced audio emotion recognition. In audio-based emotion recognition, where discerning emotional cues from audio input is crucial, the integration of artificial intelligence techniques has driven a transformative shift in accuracy and performance. Deep learning, renowned for its ability to decipher intricate patterns, spearheads this evolution. However, the energy efficiency of deep learning models, particularly in resource-constrained environments, remains a pressing concern. Convolutional operations are the cornerstone of deep learning systems, but their heavy computational demands lead to energy-inefficient computation and make them poorly suited to deployment in scenarios with limited resources. To address these challenges, researchers introduced one-dimensional convolutional neural network (1D-CNN) array convolutions as an alternative to traditional two-dimensional CNNs with reduced resource requirements. While this array-based operation lowered resource requirements, its impact on energy consumption had not been studied. To bridge this gap, we introduce EPyNet, a deep learning architecture crafted for energy efficiency with a particular emphasis on neuron reduction. Focusing on the task of audio emotion recognition, we evaluate EPyNet on five public audio corpora: RAVDESS, TESS, EMO-DB, CREMA-D, and SAVEE. We propose three versions of EPyNet, a lightweight neural network designed for efficient emotion recognition, each optimized for a different trade-off between accuracy and energy efficiency. Experimental results show that the 0.06M-parameter EPyNet reduced energy consumption by 76.5% while improving accuracy by 5% on RAVDESS, 25% on TESS, and 9.75% on SAVEE. The 0.2M and 0.9M models reduced energy consumption by 64.9% and 70.3%, respectively. Additionally, we compared the proposed 0.06M system with the MobileNet models on the CIFAR-10 dataset and achieved significant improvements. The proposed system reduces energy by 86.2% and memory by 95.7% compared to MobileNet, with a slightly lower accuracy of 0.8%. Compared to MobileNetV2, it improves accuracy by 99.2% and reduces memory by 93.8%. Compared to MobileNetV3, it achieves a 57.2% energy reduction, an 85.1% memory reduction, and a 24.9% accuracy improvement. We further test the scalability and robustness of the proposed solution across different data dimensions and frameworks.
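The abstract leans on two technical points: array (1D) convolutions in place of 2D convolutions, and very small parameter budgets (0.06M to 0.9M). As a purely illustrative sketch, the PyTorch snippet below builds a compact 1D-CNN classifier over frame-level audio features such as MFCCs; the feature choice, layer sizes, and eight-class output are assumptions for demonstration and do not reproduce the published EPyNet architecture or its energy measurements.

```python
# Illustrative sketch only: a compact 1D-CNN audio emotion classifier in PyTorch.
# Feature type, layer widths, and class count are assumptions for demonstration;
# this is NOT the published EPyNet configuration.
import torch
import torch.nn as nn

class Tiny1DCNN(nn.Module):
    """A small 1D-CNN over per-frame audio features (e.g., 40 MFCC coefficients)."""
    def __init__(self, n_features: int = 40, n_classes: int = 8):
        super().__init__()
        self.features = nn.Sequential(
            # Array (1D) convolutions slide over the time axis only, so each filter
            # holds n_features * kernel_size weights rather than a full 2D patch.
            nn.Conv1d(n_features, 32, kernel_size=5, padding=2),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # global pooling keeps the classifier head tiny
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features, time_frames)
        z = self.features(x).squeeze(-1)
        return self.classifier(z)

if __name__ == "__main__":
    model = Tiny1DCNN()
    n_params = sum(p.numel() for p in model.parameters())
    print(f"parameters: {n_params / 1e6:.3f}M")   # roughly 0.017M, well under 0.1M
    dummy = torch.randn(4, 40, 200)               # 4 clips, 200 feature frames each
    print(model(dummy).shape)                     # torch.Size([4, 8])
```

For the same channel counts, a 5×5 2D kernel carries five times the weights per filter of a length-5 1D kernel, which is the resource argument behind the array convolutions that the abstract builds on.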

DOI : https://doi.org/10.32628/CSEIT25113386

More Articles by Jiby Mariya Jose

An efficient sarcasm detection in audio using parameter-reduced depthwise CNN

In this study, we implement a lightweight CNN for sarcasm detection using audio input. To achieve this goal, we propose the DepthFire block. We propose a lightweight version of the tra...
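For context on the parameter-reduction idea this abstract mentions, the generic PyTorch snippet below contrasts a standard 1D convolution with a depthwise-separable one. It illustrates depthwise separability in general and is not the DepthFire block or the architecture from that paper; the channel counts and kernel size are arbitrary.

```python
# Illustrative sketch only: standard vs. depthwise-separable 1D convolution.
# Channel counts and kernel size are arbitrary; this is not the paper's block.
import torch.nn as nn

def standard_conv1d(cin: int, cout: int, k: int) -> nn.Module:
    return nn.Conv1d(cin, cout, kernel_size=k, padding=k // 2)

def depthwise_separable_conv1d(cin: int, cout: int, k: int) -> nn.Module:
    return nn.Sequential(
        # depthwise: one k-tap filter per input channel (groups=cin)
        nn.Conv1d(cin, cin, kernel_size=k, padding=k // 2, groups=cin),
        # pointwise: 1x1 convolution that mixes channels
        nn.Conv1d(cin, cout, kernel_size=1),
    )

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(standard_conv1d(64, 128, 5)))             # 64*128*5 + 128 = 41,088
print(count(depthwise_separable_conv1d(64, 128, 5)))  # 320 + 64 + 8,192 + 128 = 8,704
```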

A lightweight deep learning framework using resource-efficient batch normalization for sarcasm detection

Communication is not always direct; it often involves nuanced elements like humor, irony, and sarcasm. This study introduces a novel two-level approach for sarcasm detection, lever...

Optimizing neural network energy efficiency through low-rank factorisation and PDE-driven dense layers

As deep learning models continue to grow in complexity, the computational and energy demands associated with their training and deployment are becoming increasingly significant, part...
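As a generic illustration of the low-rank idea named in this title, the PyTorch snippet below replaces a dense layer with a rank-r factorisation and compares parameter counts. The layer sizes and rank are arbitrary assumptions, and the snippet does not represent the PDE-driven dense layers proposed in that paper.

```python
# Illustrative sketch only: approximating a dense layer W with a rank-r product B @ A.
# Sizes and rank are arbitrary; this is not the paper's PDE-driven layer.
import torch.nn as nn

d_in, d_out, rank = 1024, 1024, 32

dense = nn.Linear(d_in, d_out)          # d_in*d_out + d_out = 1,049,600 parameters
low_rank = nn.Sequential(               # A: rank x d_in, B: d_out x rank
    nn.Linear(d_in, rank, bias=False),
    nn.Linear(rank, d_out),
)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(dense))      # 1,049,600
print(count(low_rank))   # 1024*32 + 32*1024 + 1024 = 66,560
```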