Please use this identifier to cite or link to this item:
http://202.28.34.124/dspace/handle123456789/2165
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor | Siriwiwat Lata | en |
dc.contributor | สิริวิวัฒน์ ละตา | th |
dc.contributor.advisor | Olarik Surinta | en |
dc.contributor.advisor | โอฬาริก สุรินต๊ะ | th |
dc.contributor.other | Mahasarakham University | en |
dc.date.accessioned | 2023-09-07T14:25:21Z | - |
dc.date.available | 2023-09-07T14:25:21Z | - |
dc.date.created | 2023 | |
dc.date.issued | 31/5/2023 | |
dc.identifier.uri | http://202.28.34.124/dspace/handle123456789/2165 | - |
dc.description.abstract | Sign language is essential for communication with the hearing impaired, yet it is difficult for hearing people to understand or interpret. This thesis aims to develop an automatic Thai sign language translation system based on the proposed deep learning techniques. The system recognizes Thai sign language covering both static and dynamic fingerspelling, as well as words in sentences. In the first approach, we proposed an end-to-end method for recognizing static (one-stage) Thai sign language in complex images. This approach used the YOLOv3 architecture to detect the region of interest (ROI) containing only the hand. We then performed feature extraction and evaluated the recognition performance of five CNNs: MobileNetV2, DenseNet121, InceptionResNetV2, NASNetMobile, and EfficientNetB2. The results showed that DenseNet121 and MobileNetV2 outperformed the other CNN models, achieving high accuracy in Thai sign language image recognition. In the second approach, we addressed dynamic Thai sign language, which requires a sequence of gestures (two or more steps) to convey meaning. We proposed dynamic sign language recognition based on deep learning: the YOLOv5 algorithm for human detection, followed by a combination of a convolutional neural network (CNN) and a recurrent neural network (RNN) for sequence-based image recognition. The RNN used two variants, LSTM and GRU, each with 32 or 64 units. We built CNN models based on three architectures: MobileNetV2, ResNet50, and DenseNet201, yielding what we called the CNN-LSTM architecture, and added an extra densely connected layer with 64 units between the global average pooling (GAP) layer and the softmax activation function. The ResNet50-LSTM architecture achieved the highest accuracy on the validation set.
In the third approach, we addressed Thai sign language as it is generally used: words expressed as continuous gestures within a sentence. To enlarge the dataset, we selected 32 frames and extended the data frames 100 times to obtain varied samples. Thai sign language sentence recognition then predicted the words in each sentence based on three CNN architectures: ResNet50, DenseNet121, and VGG16. The resulting sequence patterns were passed to Conv1D and LSTM layers to classify the words in each sentence. We also compared the performance of one to four Conv1D layers and bidirectional RNN (BiRNN) input sizes of 64, 128, 256, and 512. Finally, we applied the softmax activation function and CTC decoding to produce sentence predictions. The best results used four Conv1D layers followed by a BiRNN based on one of two architectures, GRU or LSTM; BiLSTM proved more efficient than BiGRU, with an RNN size of 128. We measured performance with the word error rate (WER). Overall, the DenseNet121-based BiLSTM architecture with an RNN input size of 128 and four Conv1D layers performed best, with a WER of 0.4286. | en |
dc.description.abstract | - | th |
dc.language.iso | en | |
dc.publisher | Mahasarakham University | |
dc.rights | Mahasarakham University | |
dc.subject | Thai Fingerspelling Recognition | en |
dc.subject | Hand Gesture Recognition | en |
dc.subject | Sign Language Recognition | en |
dc.subject | Dynamic Fingerspelling Recognition | en |
dc.subject | Convolutional Neural Network | en |
dc.subject | Hand Detection | en |
dc.subject | YOLO Architecture | en |
dc.subject | Long Short-Term Memory | en |
dc.subject | Sequence Pattern | en |
dc.subject | Word Sign Language Recognition | en |
dc.subject | Extended Data Frames | en |
dc.subject | Connectionist Temporal Classification | en |
dc.subject | Dynamic Thai Fingerspelling Dataset | en |
dc.subject | Thai Sign Language Sentence Dataset | en |
dc.subject | Thai Sign Language Recognition | en |
dc.subject.classification | Computer Science | en |
dc.subject.classification | Information and communication | en |
dc.subject.classification | Computer science | en |
dc.title | The Automated Sign Language Translation System Using Deep Learning | en |
dc.title | ระบบแปลภาษามืออัตโนมัติโดยใช้การเรียนรู้เชิงลึก | th |
dc.type | Thesis | en |
dc.type | วิทยานิพนธ์ | th |
dc.contributor.coadvisor | Olarik Surinta | en |
dc.contributor.coadvisor | โอฬาริก สุรินต๊ะ | th |
dc.contributor.emailadvisor | olarik.s@msu.ac.th | |
dc.contributor.emailcoadvisor | olarik.s@msu.ac.th | |
dc.description.degreename | Doctor of Philosophy (Ph.D.) | en |
dc.description.degreename | ปรัชญาดุษฎีบัณฑิต (ปร.ด.) | th |
dc.description.degreelevel | Doctoral Degree | en |
dc.description.degreelevel | ปริญญาเอก | th |
dc.description.degreediscipline | Information Technology | en |
dc.description.degreediscipline | สาขาเทคโนโลยีสารสนเทศ | th |
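The sentence-recognition pipeline described in the abstract ends with softmax outputs decoded by CTC. As an illustration only (not the thesis's implementation), a minimal greedy (best-path) CTC decoder collapses repeated per-frame labels and removes the blank token:

```python
# Greedy CTC decoding sketch -- illustrates the decoding step named in the
# abstract, not the thesis's actual code. Assumes per-frame softmax
# probabilities with index 0 reserved for the CTC blank (an assumed,
# though common, convention).

BLANK = 0

def ctc_greedy_decode(frame_probs):
    """Collapse the per-frame argmax path into a label sequence.

    frame_probs: list of per-frame probability vectors (one per time step).
    Returns decoded label indices with repeats merged and blanks removed.
    """
    # Best label at each time step (the "best path").
    path = [max(range(len(p)), key=lambda i: p[i]) for p in frame_probs]
    decoded, prev = [], None
    for label in path:
        # Merge consecutive repeats, then drop blank tokens.
        if label != prev and label != BLANK:
            decoded.append(label)
        prev = label
    return decoded

# Example: 6 frames over 3 classes (0 = blank, 1 and 2 = word labels).
probs = [
    [0.1, 0.8, 0.1],   # -> 1
    [0.1, 0.7, 0.2],   # -> 1 (repeat, merged)
    [0.9, 0.05, 0.05], # -> blank
    [0.1, 0.2, 0.7],   # -> 2
    [0.2, 0.1, 0.7],   # -> 2 (repeat, merged)
    [0.8, 0.1, 0.1],   # -> blank
]
print(ctc_greedy_decode(probs))  # [1, 2]
```

Greedy decoding is the simplest CTC decoder; beam-search variants trade speed for accuracy, and the abstract does not specify which the thesis used.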
Appears in Collections: | The Faculty of Informatics |
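The abstract reports performance as word error rate (WER), defined as the word-level edit distance between the hypothesis and reference sentences divided by the number of reference words. A minimal sketch, with a hypothetical sentence pair (not taken from the thesis's dataset):

```python
# Word error rate (WER) sketch: word-level Levenshtein distance divided by
# the reference length. Illustrative only; the thesis's evaluation code is
# not part of this record, and the example sentences are invented.

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j].
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[-1][-1] / len(ref)

# 3 word-level errors over 7 reference words -> WER = 3/7 ~ 0.4286.
ref = "I would like to buy some water"
hyp = "I like to buy water today"
print(round(wer(ref, hyp), 4))  # 0.4286
```

A WER of 0.4286 therefore corresponds to roughly three word errors for every seven reference words.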
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
61011261004.pdf | | 4.07 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.