Please use this identifier to cite or link to this item:
http://202.28.34.124/dspace/handle123456789/1508
Title: | Deep Learning Approach for Food Image Recognition (Thai title: กระบวนการเรียนรู้เชิงลึกสำหรับการรู้จำรูปภาพอาหาร) |
Authors: | Sirawan Phiphitphatphaisit (ศิรวรรณ พิพิธพัฒน์ไพสิฐ); Olarik Surinta (โอฬาริก สุรินต๊ะ); Mahasarakham University, The Faculty of Informatics |
Keywords: | Food Image Recognition; Convolutional Neural Network; Data Augmentation; Deep Feature Extraction Method; Long Short-Term Memory; Adaptive Feature Fusion Technique; Spatial and Temporal Features |
Issue Date: | 14 |
Publisher: | Mahasarakham University |
Abstract: | Food image recognition plays an important role in healthcare applications that monitor eating habits, diet, and nutrition. Accordingly, different deep learning approaches have been proposed to address food image recognition. This dissertation presents three methods that deal with several challenges in recognizing food images.
Chapter 1 briefly introduces food image recognition systems and the research questions. Additionally, the objectives and contributions of the dissertation are described.
Chapter 2 proposes a new CNN model that modifies the MobileNetV1 architecture, decreasing the number of parameters while still achieving high accuracy. I replaced the average pooling layer and the fully connected (FC) layer with a global average pooling (GAP) layer, followed by a batch normalization (BN) layer and a rectified linear unit (ReLU) activation function. Moreover, I added a dropout layer to avoid overfitting. The experimental results show that the modified MobileNetV1 architecture significantly outperforms other architectures when combined with data augmentation techniques.
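The parameter saving from replacing a flattened FC classifier with global average pooling can be illustrated with a toy NumPy sketch. The shapes below (a 7x7x1024 feature map and 101 classes) are illustrative assumptions, not the dissertation's exact configuration:

```python
import numpy as np

# Hypothetical final conv feature map (7x7x1024, MobileNetV1-like) and 101 classes.
h, w, c, n_classes = 7, 7, 1024, 101

# Flatten + fully connected classifier: every spatial position gets its own weights.
fc_params = h * w * c * n_classes          # weight matrix only, bias omitted

# Global average pooling collapses each channel to a single value first,
# so the classifier needs only c x n_classes weights.
gap_params = c * n_classes

feature_map = np.random.rand(h, w, c)
gap_output = feature_map.mean(axis=(0, 1))  # average over the spatial axes

print(gap_output.shape)          # (1024,)
print(fc_params // gap_params)   # 49 -> 49x fewer classifier weights with GAP
```

The 49x ratio is simply the number of spatial positions (7x7) that the FC layer would otherwise weight individually.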
Chapter 3 concentrates on extracting robust features using a deep feature extraction technique. First, I extracted spatial features using CNN architectures. The spatial features were then fed into a Conv1D-LSTM network to extract temporal features. Finally, the deep features were classified using the softmax function. I evaluated six state-of-the-art CNN architectures (VGG16, VGG19, ResNet50, DenseNet201, MobileNetV1, and MobileNetV2) for extracting robust spatial features. The experimental results found that the ResNet50+Conv1D-LSTM network significantly outperformed the other CNNs on the ETH Food-101 dataset.
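The temporal stage of this pipeline can be sketched with a minimal 1-D convolution slid over a sequence of spatial feature values. This is a toy NumPy illustration under assumed sizes (sequence length 10, kernel size 3), not the dissertation's Conv1D-LSTM implementation:

```python
import numpy as np

def conv1d(sequence, kernel):
    """Valid 1-D convolution (cross-correlation, as in deep learning
    frameworks) over the time axis, single input/output channel."""
    t, k = len(sequence), len(kernel)
    return np.array([np.dot(sequence[i:i + k], kernel) for i in range(t - k + 1)])

# Pretend the CNN produced T = 10 spatial feature values over time.
spatial_features = np.arange(10, dtype=float)
kernel = np.array([0.25, 0.5, 0.25])        # simple smoothing kernel

temporal_features = conv1d(spatial_features, kernel)
print(temporal_features.shape)              # (8,) -> T - k + 1 output steps
```

In the full method, such temporal feature sequences would then be passed to an LSTM before softmax classification.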
Chapter 4 presents an adaptive spatial-temporal feature fusion network (ASTFF-Net) that combines a state-of-the-art CNN model and the LSTM network. First, I extracted spatial features using the state-of-the-art ResNet50 architecture. Second, temporal features were extracted using the LSTM network. Third, the spatial and temporal features were mapped to the same resolution before concatenation. The experimental results showed that ASTFF-Net achieved the best performance and outperformed other methods on the Food11, UEC Food-100, UEC Food-256, and ETH Food-101 datasets.
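The fusion step, projecting spatial and temporal features of different sizes to a shared resolution before concatenating, can be sketched as below. All sizes are illustrative assumptions (2048-d ResNet50-like features, a 512-d LSTM state, a 256-d fusion space), and the random projection matrices stand in for weights that would be learned in the actual network:

```python
import numpy as np

rng = np.random.default_rng(0)
spatial = rng.standard_normal(2048)      # e.g. pooled CNN features (assumed size)
temporal = rng.standard_normal(512)      # e.g. LSTM hidden state (assumed size)

d = 256                                  # shared fusion dimension (assumed)
w_s = rng.standard_normal((2048, d))     # learned projections in practice;
w_t = rng.standard_normal((512, d))      # random here for illustration only

# Map both feature types to the same d-dimensional space, then concatenate.
fused = np.concatenate([spatial @ w_s, temporal @ w_t])
print(fused.shape)                       # (512,) -> two 256-d projections joined
```

The fused vector would then feed the final classifier, letting it weigh spatial and temporal evidence jointly.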
Chapter 5 comprises two main sections: answers to the research questions and suggestions for future work. This chapter briefly reviews the proposed approaches and answers the three main research questions in food image recognition. Two directions are planned for future work. The first is to reduce the training set size by applying instance selection techniques to decrease computation time. The second is to apply an instance segmentation technique that segments and learns from only the exact food location, which should improve the performance of the food image recognition system. |
Description: | Doctor of Philosophy (Ph.D.) ปรัชญาดุษฎีบัณฑิต (ปร.ด.) |
URI: | http://202.28.34.124/dspace/handle123456789/1508 |
Appears in Collections: | The Faculty of Informatics |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
61011261005.pdf | | 2.91 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.