Accent based speech recognition: A critical overview

DOI:

https://doi.org/10.26637/MJM0804/0070

Abstract

A substantial body of research on speech recognition, including accent-based speech recognition, has been conducted in recent decades. Recognizing speech across the dialects of a natural language is regarded as one of the most complicated problems in Automatic Speech Recognition (ASR). The growing significance of dialect-aware speech recognition is attributable to the ever-increasing demand for applications that handle human-machine interaction through geographically influenced natural languages. The objective of this paper is to provide an overview of recent developments in dialect- and accent-based speech processing. It concentrates on accent-based speech recognition techniques in various languages and the technologies used to implement them.

Keywords:

Spoken language identification, Dialect identification, Accent recognition, Speech recognition, Acoustic modeling, HMM, DNN, Acoustic features

Mathematics Subject Classification:

Mathematics
  • Pages: 1743-1750
  • Date Published: 01-10-2020
  • Vol. 8 No. 04 (2020): Malaya Journal of Matematik (MJM)

How to Cite

Rizwana Kallooravi Thandil and K. P. Mohamed Basheer. “Accent Based Speech Recognition: A Critical Overview”. Malaya Journal of Matematik, vol. 8, no. 04, Oct. 2020, pp. 1743-50, doi:10.26637/MJM0804/0070.