Please use this identifier to cite or link to this item: http://202.28.34.124/dspace/handle123456789/2165
Title: The Automated Sign Language Translation System Using Deep Learning
ระบบแปลภาษามืออัตโนมัติโดยใช้การเรียนรู้เชิงลึก
Authors: Siriwiwat Lata
สิริวิวัฒน์ ละตา
Olarik Surinta
โอฬาริก สุรินต๊ะ
Mahasarakham University
Olarik Surinta
โอฬาริก สุรินต๊ะ
olarik.s@msu.ac.th
Keywords: Thai Fingerspelling Recognition
Hand Gesture Recognition
Sign Language Recognition
Dynamic Fingerspelling Recognition
Convolutional Neural Network
Hand Detection
YOLO Architecture
Long Short-Term Memory
Sequence Pattern
Word Sign Language Recognition
Extend Data Frames
Connectionist Temporal Classification
Dynamic Thai Fingerspelling Dataset
Thai Sign Language Sentence Dataset
Thai Sign Language Recognition
Issue Date:  31
Publisher: Mahasarakham University
Abstract: Sign language is essential for communicating with people who are hearing impaired, but it is difficult for hearing people to understand, which hinders communication and interpretation. This thesis proposes an automatic Thai Sign Language translation system built on deep learning techniques. The system recognizes both static and dynamic fingerspelling in Thai Sign Language, as well as the words in a sentence.
In the first approach, we proposed an end-to-end deep learning method for recognizing static Thai Sign Language (1-stage) in complex images. The method uses the YOLOv3 architecture for hand detection, so that only the hand is kept as the region of interest (ROI). We then performed feature extraction and compared the recognition performance of five CNNs: MobileNetV2, DenseNet121, InceptionResNetV2, NASNetMobile, and EfficientNetB2. DenseNet121 and MobileNetV2 outperformed the other CNN models and achieved high accuracy in Thai Sign Language image recognition.
The second approach addresses signs that require a sequence of gestures (two or more steps) to convey meaning. We proposed dynamic sign language recognition that uses the YOLOv5 algorithm for human detection and combines two deep learning methods, a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN), for sequence-based image recognition. We evaluated two RNN types, LSTM and GRU, each with 32 and 64 units, on top of CNN models based on three architectures: MobileNetV2, ResNet50, and DenseNet201. We call this combination the CNN-LSTM architecture. We also added an extra densely connected layer with 64 units between the global average pooling (GAP) layer and the softmax activation function. The ResNet50-LSTM architecture achieved the highest accuracy on the validation set.
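The hand-detection step in the first approach passes only the detected hand region to the CNN. The abstract does not say how boxes are represented, so the following is a minimal sketch that assumes the normalized center format (cx, cy, w, h) commonly used with YOLO models. The function names `yolo_box_to_pixels` and `crop_roi` are hypothetical and are not taken from the thesis.

```python
def yolo_box_to_pixels(cx, cy, w, h, img_w, img_h):
    """Convert a normalized center-format box to pixel corners (x1, y1, x2, y2).

    Assumption: cx, cy, w, h are fractions of the image size in [0, 1].
    """
    x1 = int(round((cx - w / 2) * img_w))
    y1 = int(round((cy - h / 2) * img_h))
    x2 = int(round((cx + w / 2) * img_w))
    y2 = int(round((cy + h / 2) * img_h))
    # Clamp to the image bounds so the crop never indexes outside the image.
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(img_w, x2), min(img_h, y2)
    return x1, y1, x2, y2


def crop_roi(image, box):
    """Crop a region from an image stored as a list of pixel rows."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


# Illustrative 8x8 "image" whose pixel values encode (row, column).
image = [[(r, c) for c in range(8)] for r in range(8)]
box = yolo_box_to_pixels(0.5, 0.5, 0.5, 0.5, img_w=8, img_h=8)
hand = crop_roi(image, box)
```

In a real pipeline the cropped region would be resized to the CNN's input size before feature extraction.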
The third approach addresses continuous sentences, in which Thai Sign Language is generally used to communicate through a sequence of word gestures. To increase the amount of data, we selected 32 and extended the data frames 100 times to obtain different datasets. Thai Sign Language sentence recognition then predicts the words in a sentence based on three CNN architectures: ResNet50, DenseNet121, and VGG16. The resulting sequence patterns were passed to Conv1D and LSTM layers to classify the words in the sentence. We compared Conv1D stacks of 1 to 4 layers and BiRNN input sizes of 64, 128, 256, and 512. Finally, we used the softmax activation function and CTC decoding to produce the sentence prediction. The best results used 4 Conv1D layers followed by a bidirectional RNN (BiRNN); of the two RNN types tested, GRU and LSTM, BiLSTM was more efficient than BiGRU, with an RNN size of 128. We measured performance with the word error rate (WER). Overall, the DenseNet121 BiLSTM-based architecture with an RNN input size of 128 and 4 Conv1D layers performed best, with a WER of 0.4286.
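The CTC decoding and WER evaluation mentioned above can be illustrated with a short sketch. It assumes greedy (best-path) CTC decoding with label 0 as the blank, and the standard WER definition of word-level edit distance divided by the number of reference words. The abstract does not specify which decoding variant was used, and the vocabulary and frame labels below are invented for illustration.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse consecutive repeated labels, then drop blanks."""
    decoded = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and hypothesis[:j]; it starts as the row for an empty prefix.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical vocabulary and per-frame argmax labels (0 is the CTC blank).
vocab = {1: "I", 2: "go", 3: "school", 4: "today"}
frames = [0, 1, 1, 0, 2, 2, 2, 0, 0, 4, 4]
predicted = [vocab[i] for i in ctc_greedy_decode(frames)]
reference = ["I", "go", "school", "today"]
wer = word_error_rate(reference, predicted)
print(predicted, wer)
```

Here the decoded sentence misses one of the four reference words, so the WER is 1/4. A lower WER is better, and a value of 0 means the predicted sentence matches the reference exactly.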
URI: http://202.28.34.124/dspace/handle123456789/2165
Appears in Collections:The Faculty of Informatics

Files in This Item:
File: 61011261004.pdf
Size: 4.07 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.