Speech Emotion Recognition (SER) Using the Bidirectional LSTM Method

Maryamah Maryamah; Nicholas Juan Kalvin Pradiptamurty; Hafiyyah Khayyiroh Shafro; Mohammad Sihabudin Al Qurtubi; Giovanny Alberta Tambahjong; Qothrotunnidha' Almaulidiyah

doi:10.33005/senada.v3i1.105

Maryamah Maryamah, Universitas Airlangga
Nicholas Juan Kalvin Pradiptamurty, Universitas Airlangga
Hafiyyah Khayyiroh Shafro, Universitas Airlangga
Mohammad Sihabudin Al Qurtubi, Universitas Airlangga
Giovanny Alberta Tambahjong, Universitas Airlangga
Qothrotunnidha' Almaulidiyah, Universitas Airlangga

Keywords: Speech Emotion Recognition, Bidirectional Long Short-Term Memory (Bi-LSTM), Audio Classification

Abstract

Emotions are part of being human and arise as a response to experienced events. Recognizing emotion from the voice, known as speech emotion recognition (SER), attracts many researchers because such systems can assist in criminal investigations, in monitoring and detecting potentially dangerous events, and in health care. This study therefore proposes a Bidirectional Long Short-Term Memory (Bi-LSTM) approach to SER. The dataset was scraped from YouTube and labeled manually, after which features were extracted as Mel Frequency Cepstral Coefficients (MFCC). In the experiments, the Bi-LSTM method achieved an AUC-ROC of 0.97 and an F1-score of 0.878. These results indicate that the proposed method predicts speech emotion better than the comparison methods. The model was also more precise in classifying human voices into four emotions: happy, sad, angry, and neutral.
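To make the idea of a bidirectional LSTM over MFCC frames concrete, the following is a minimal NumPy sketch of a forward pass only. It is not the authors' implementation: the layer sizes, the random untrained weights, the synthetic stand-in for MFCC features, and all function names are assumptions chosen for illustration. It shows one LSTM reading the frames forward, a second reading them in reverse, and a softmax over the four emotion classes named in the abstract.

```python
import numpy as np

# The four classes named in the abstract.
EMOTIONS = ["happy", "sad", "angry", "neutral"]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm(n_in, n_hidden, rng):
    """Random, untrained parameters for one LSTM direction (illustrative only)."""
    return {
        "W": rng.normal(0.0, 0.1, (4 * n_hidden, n_in)),      # input weights
        "U": rng.normal(0.0, 0.1, (4 * n_hidden, n_hidden)),  # recurrent weights
        "b": np.zeros(4 * n_hidden),
    }


def lstm_last_state(frames, p):
    """Run one LSTM over frames of shape (T, n_in); return the final hidden state."""
    n_hidden = p["U"].shape[1]
    h = np.zeros(n_hidden)
    c = np.zeros(n_hidden)
    for x in frames:
        z = p["W"] @ x + p["U"] @ h + p["b"]
        i, f, g, o = np.split(z, 4)  # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


def bilstm_classify(mfcc, fwd, bwd, w_out, b_out):
    """Concatenate forward and backward final states, then apply softmax."""
    h_fwd = lstm_last_state(mfcc, fwd)
    h_bwd = lstm_last_state(mfcc[::-1], bwd)  # same frames, reversed in time
    logits = w_out @ np.concatenate([h_fwd, h_bwd]) + b_out
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()


rng = np.random.default_rng(0)
n_mfcc, n_hidden, n_frames = 13, 8, 50         # assumed sizes, not from the paper
mfcc = rng.normal(size=(n_frames, n_mfcc))     # synthetic stand-in for MFCC features
fwd = init_lstm(n_mfcc, n_hidden, rng)
bwd = init_lstm(n_mfcc, n_hidden, rng)
w_out = rng.normal(0.0, 0.1, (len(EMOTIONS), 2 * n_hidden))
b_out = np.zeros(len(EMOTIONS))

probs = bilstm_classify(mfcc, fwd, bwd, w_out, b_out)
print(dict(zip(EMOTIONS, probs.round(3))))
```

Because the weights are random, the printed probabilities carry no meaning; a real system would extract MFCCs from audio and train the parameters on the labeled data.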
