Non-Audible Speech Classification Using Deep Learning Approaches
Recent advances in human-computer interaction (HCI) research aim to help post-stroke patients cope with physiological impairments such as speech deficits caused by aphasia. This paper investigates deep learning approaches to non-audible speech recognition from electromyography (EMG) signals and proposes a novel approach that combines continuous wavelet transforms (CWT) with convolutional neural networks (CNNs). To compare its performance with other popular deep learning approaches, we collected facial surface EMG bio-signals from subjects with binary and multi-class labels, then trained and tested four models: a long short-term memory (LSTM) model, a bidirectional LSTM model, a 1-D CNN model, and our proposed CWT-CNN model. Experimental results show that the proposed approach outperforms the LSTM models but is less efficient than the 1-D CNN model on our data set. In comparison with previous research, we gained insights into how to improve model performance for binary and multi-class silent speech recognition.
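The abstract page does not include code, but the CWT front end it describes can be illustrated with a minimal NumPy-only sketch. All function names, the Morlet wavelet choice, and the parameter values below are our own assumptions for illustration, not the authors' implementation: a 1-D EMG-style signal is convolved with scaled wavelets to produce a 2-D scalogram of the kind a CNN could take as an image-like input.

```python
import numpy as np

def morlet(t, scale, w0=5.0):
    """Complex Morlet wavelet sampled at times t, dilated by `scale` (seconds)."""
    x = t / scale
    return np.exp(1j * w0 * x) * np.exp(-0.5 * x ** 2) / np.sqrt(scale)

def cwt_scalogram(signal, scales, fs=1000.0):
    """Magnitude scalogram: one row per scale, one column per sample."""
    n = len(signal)
    t = (np.arange(n) - n // 2) / fs          # centered time axis for the wavelet
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        wav = morlet(t, s)
        # cross-correlate the signal with the conjugate wavelet at this scale
        conv = np.convolve(signal, np.conj(wav)[::-1], mode="same")
        out[i] = np.abs(conv)
    return out

# Synthetic EMG-like burst: a 60 Hz oscillation gated in time, plus noise.
fs = 1000.0
t = np.arange(1024) / fs
rng = np.random.default_rng(0)
sig = np.sin(2 * np.pi * 60 * t) * ((t > 0.3) & (t < 0.7)) + 0.1 * rng.standard_normal(1024)
scales = np.geomspace(0.002, 0.05, 32)        # 32 scales, log-spaced
scalogram = cwt_scalogram(sig, scales, fs)
print(scalogram.shape)                        # (32, 1024)
```

The resulting 32x1024 array could then be resized and fed to a 2-D CNN, which is the general shape of the CWT-CNN pipeline the abstract names; the exact network architecture and preprocessing used in the paper are not given here.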
Rommel Fernandes, Lei Huang, and Gustavo Vejarano, "Non-Audible Speech Classification Using Deep Learning Approaches," in 6th Annual Conference on Computational Science & Computational Intelligence (CSCI'19), Las Vegas, NV, USA, Dec. 2019