Analysis of Speech Features for Gender Identification in Tai Language
DOI:
https://doi.org/10.26713/cma.v15i1.2450
Keywords:
Machine learning methods, Neural networks, Gender identification, MFCC, Pitch, Formant frequency, Chroma
Abstract
The vast amount of information packed into the human speech signal makes its analysis difficult. This complexity is especially evident in speaker recognition tasks such as gender identification. In this paper, we examine the effectiveness of four speech features, namely Pitch, Formant Frequency, MFCC (Mel Frequency Cepstral Coefficients), and Chroma, for gender identification in the Tai language, which is spoken by the Tai people of Assam. We apply four machine learning methods (SVM, KNN, Decision Tree, and Neural Network) to these features. Our results show that MFCC consistently outperforms the other features, delivering the highest accuracy across all four methods. This demonstrates MFCC’s ability to capture gender information in Tai speech signals and suggests that it is a sound basis for more accurate gender identification systems. Beyond gender identification, the study extends voice analysis in linguistics and broadens the use of spoken language data, supporting improved communication systems and linguistic insights. In summary, our findings highlight the importance of MFCC for gender identification in the Tai language, with implications for voice analysis that extend beyond this local context.
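As a rough illustration of the feature-then-classifier pipeline the abstract describes, the sketch below estimates pitch by autocorrelation and classifies it with a small k-nearest-neighbour vote. It is not the authors' implementation: the sampling rate, the pure-sine stand-in for a voiced frame, the function names, and the training pitches are all assumptions made for the example, and it uses only Pitch, the simplest of the four features, rather than MFCC.

```python
import math
from collections import Counter

SR = 8000  # assumed sampling rate in Hz

def synth_tone(f0, dur=0.1, sr=SR):
    """Pure sine at f0 Hz: a crude stand-in for one voiced speech frame."""
    return [math.sin(2 * math.pi * f0 * i / sr) for i in range(int(sr * dur))]

def estimate_pitch(frame, sr=SR, fmin=60.0, fmax=400.0):
    """Return sr / lag for the lag that maximizes the raw autocorrelation."""
    best_lag, best_score = None, float("-inf")
    for lag in range(int(sr / fmax), int(sr / fmin) + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sr / best_lag

def knn_predict(train, value, k=3):
    """Majority label among the k training pitches closest to value."""
    nearest = sorted(train, key=lambda item: abs(item[0] - value))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical training pitches in Hz (illustrative, not the study's data).
TRAIN = [(105.0, "male"), (120.0, "male"), (135.0, "male"),
         (195.0, "female"), (210.0, "female"), (230.0, "female")]

for f0 in (125.0, 200.0):
    pitch = estimate_pitch(synth_tone(f0))
    print(f0, round(pitch, 1), knn_predict(TRAIN, pitch))
```

Real speech would need framing, voicing detection, and a larger labelled corpus; the same structure applies when the scalar pitch is replaced by a vector feature such as MFCC and the distance becomes Euclidean.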
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License (CCAL) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.