Recognition Of Emotion In Speech Using Variogram Based Features

Zeynab Esmaileyan
Hosein Marvi

Abstract

Speech Emotion Recognition (SER) is a relatively new and challenging branch of speech processing. In this study, we propose new features for emotion recognition that are derived from the speech spectrogram using image processing techniques. For this purpose, variogram graphs are calculated from the speech spectrogram, and the significant Discrete Cosine Transform (DCT) coefficients of the variogram are used as the proposed features. The contribution of these features as a complement to the widely used prosodic and spectral features is also investigated. Feature selection is performed using the Fisher Discriminant Ratio (FDR) filtering method. Finally, a linear Support Vector Machine (SVM) classifier is employed. All results are obtained under 10-fold cross-validation on the Berlin and PDREC speech databases. Our results show that combining the proposed features with prosodic and spectral features significantly improves the classification accuracy. For the Berlin database, adding the proposed features to the prosodic and spectral ones improved the recognition rates from 83.18% to 86.82% for females and from 89.36% to 90.43% for males. On the PDREC database, combining the proposed features with the prosodic and spectral features improves the recognition rates for females and males by 3.72% and 0.27%, respectively. For this database, the best classification accuracies of 63.18% and 57.37% were obtained for females and males, respectively.
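The feature pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the lag direction, the number of lags, how many DCT coefficients count as "significant", or the multi-class form of the FDR. The sketch therefore assumes a variogram computed along the time axis of a magnitude spectrogram, an unnormalised DCT-II, and a two-class FDR. All function names and default parameters (`max_lag`, `n_coeffs`) are hypothetical.

```python
import numpy as np


def empirical_variogram(spec, max_lag):
    """Semivariance of a 2-D array along axis 1 (assumed to be time), lags 1..max_lag.

    gamma(h) = 0.5 * mean((z(t + h) - z(t))^2), averaged over all rows and valid t.
    """
    spec = np.asarray(spec, dtype=float)
    if max_lag >= spec.shape[1]:
        raise ValueError("max_lag must be smaller than the number of frames")
    gammas = []
    for h in range(1, max_lag + 1):
        diff = spec[:, h:] - spec[:, :-h]  # pairs of frames separated by lag h
        gammas.append(0.5 * np.mean(diff ** 2))
    return np.array(gammas)


def dct2(x):
    """Unnormalised DCT-II of a 1-D signal, written out to avoid extra dependencies."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = np.arange(n)[:, None]  # output (frequency) index
    m = np.arange(n)[None, :]  # input (sample) index
    return np.cos(np.pi * (m + 0.5) * k / n) @ x


def variogram_dct_features(spec, max_lag=20, n_coeffs=8):
    """Leading DCT coefficients of the variogram curve (assumed feature definition)."""
    return dct2(empirical_variogram(spec, max_lag))[:n_coeffs]


def fisher_discriminant_ratio(X, y):
    """Two-class FDR per feature column: (mu_a - mu_b)^2 / (var_a + var_b)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    a, b = X[y == classes[0]], X[y == classes[1]]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0) + b.var(axis=0) + 1e-12  # small constant avoids division by zero
    return num / den
```

In such a setup, the features with the highest FDR scores would be kept and passed to a linear SVM evaluated with 10-fold cross-validation, for example with a general-purpose machine learning library.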

Article Details

How to Cite
Esmaileyan, Z., & Marvi, H. (2014). Recognition Of Emotion In Speech Using Variogram Based Features. Malaysian Journal of Computer Science, 27(3), 156–170. Retrieved from https://jml.um.edu.my/index.php/MJCS/article/view/6811
Section
Articles