Speech Emotion Recognition (SER) Using the Bidirectional LSTM Method

  • Maryamah Maryamah Universitas Airlangga
  • Nicholas Juan Kalvin Pradiptamurty Universitas Airlangga
  • Hafiyyah Khayyiroh Shafro Universitas Airlangga
  • Mohammad Sihabudin Al Qurtubi Universitas Airlangga
  • Giovanny Alberta Tambahjong Universitas Airlangga
  • Qothrotunnidha' Almaulidiyah Universitas Airlangga
Keywords: Speech Emotion Recognition, Bidirectional Long Short-Term Memory (Bi-LSTM), Audio Classification

Abstract

Emotions are part of being human and arise as responses to experienced events. Speech emotion recognition (SER), a form of emotion analysis, interests many researchers because voice recognition systems can assist in criminal investigations, in monitoring and detecting potentially dangerous events, and in health care. This study therefore proposes a Bidirectional Long Short-Term Memory (Bi-LSTM) model for SER. The dataset was scraped from YouTube and labeled manually, and features were then extracted using Mel Frequency Cepstral Coefficients (MFCC). In the experiments, the Bi-LSTM method achieved a ROC AUC of 0.97 and an F1-score of 0.878. Based on these results, the proposed method predicts speech emotion better than the comparison methods. The model was also more precise in classifying human voices into four emotions: happy, sad, angry, and neutral.
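The pipeline described in the abstract (MFCC features fed to a Bi-LSTM that outputs one of four emotions) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the weights are random and untrained, the sizes (100 frames, 13 coefficients, 32 hidden units) are hypothetical, and a random matrix stands in for real MFCC features. It only shows how a forward and a backward LSTM pass are combined before a four-class softmax.

```python
import numpy as np

EMOTIONS = ["happy", "sad", "angry", "neutral"]  # the four classes named in the abstract


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm(n_in, n_hidden, rng):
    """Random (untrained) parameters for one LSTM direction."""
    return {
        "W": rng.normal(0.0, 0.1, (4 * n_hidden, n_in)),      # input weights
        "U": rng.normal(0.0, 0.1, (4 * n_hidden, n_hidden)),  # recurrent weights
        "b": np.zeros(4 * n_hidden),                          # gate biases
    }


def lstm_last_state(x, p):
    """Run one LSTM over x of shape (frames, features); return the final hidden state."""
    n_hidden = p["U"].shape[1]
    h = np.zeros(n_hidden)
    c = np.zeros(n_hidden)
    for x_t in x:
        z = p["W"] @ x_t + p["U"] @ h + p["b"]
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


def bilstm_predict(mfcc, fwd, bwd, W_out, b_out):
    """Concatenate forward and backward final states, then apply a softmax layer."""
    h_fwd = lstm_last_state(mfcc, fwd)        # reads frames first to last
    h_bwd = lstm_last_state(mfcc[::-1], bwd)  # reads frames last to first
    h = np.concatenate([h_fwd, h_bwd])
    logits = W_out @ h + b_out
    e = np.exp(logits - logits.max())         # numerically stable softmax
    return e / e.sum()


rng = np.random.default_rng(0)
n_frames, n_mfcc, n_hidden = 100, 13, 32     # hypothetical sizes, not from the paper
mfcc = rng.normal(size=(n_frames, n_mfcc))   # stand-in for real MFCC features
fwd = init_lstm(n_mfcc, n_hidden, rng)
bwd = init_lstm(n_mfcc, n_hidden, rng)
W_out = rng.normal(0.0, 0.1, (len(EMOTIONS), 2 * n_hidden))
b_out = np.zeros(len(EMOTIONS))

probs = bilstm_predict(mfcc, fwd, bwd, W_out, b_out)
predicted = EMOTIONS[int(np.argmax(probs))]
```

With untrained weights the predicted label is meaningless. In practice the MFCC matrix would come from an audio feature library and the parameters would be learned from the labeled recordings.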

Published
2023-11-07
How to Cite
Maryamah, M., Pradiptamurty, N., Shafro, H., Al Qurtubi, M., Tambahjong, G., & Almaulidiyah, Q. (2023, November 7). Speech Emotion Recognition (SER) dengan Metode Bidirectional LSTM. PROSIDING SEMINAR NASIONAL SAINS DATA, 3(1), 153-161. https://doi.org/10.33005/senada.v3i1.105
