Facial Action Unit Detection with Deep Convolutional Neural Networks
General Material Designation
[Thesis]
First Statement of Responsibility
Padwal, Siddhesh
Subsequent Statement of Responsibility
Mahoor, Mohammad
PUBLICATION, DISTRIBUTION, ETC.
Name of Publisher, Distributor, etc.
University of Denver
Date of Publication, Distribution, etc.
2020
PHYSICAL DESCRIPTION
Specific Material Designation and Extent of Item
84
DISSERTATION (THESIS) NOTE
Dissertation or thesis details and type of degree
M.S.
Body granting the degree
University of Denver
Text preceding or following the note
2020
SUMMARY OR ABSTRACT
Text of Note
Facial features are among the most important cues for understanding an individual's state of mind. Automated recognition of facial expressions, and in particular of the Facial Action Units defined by the Facial Action Coding System (FACS), is a challenging research problem in computer vision and machine learning, and researchers are developing deep learning algorithms to advance the state of the art in this area. Automated recognition of facial action units has many applications, ranging from developmental psychology to human-robot interface design, where companies use the technology to improve consumer devices (e.g., phone unlocking) and for entertainment applications such as FaceApp. Recent studies suggest that detecting these facial features, which is a multi-label classification problem, can be addressed with a problem transformation approach in which the multi-label problem is converted into a set of single-label problems using a Binary Relevance classifier. In this thesis, a convolutional neural network is used because it can go substantially deeper and achieve higher accuracy, although it requires a large amount of training data; it yields a significant feature map from each layer of the network. We introduce Modified DenseNet, taking DenseNet as the baseline model. Averaging the features obtained from each block of DenseNet gives weight to every level of features, which can otherwise be lost when layers are concatenated in DenseNet and other state-of-the-art classification models. Facial Action Units (AUs) are detected by selecting thresholds on the probabilities produced by the trained Modified DenseNet model, and the thresholds are selected using the Matthews Correlation Coefficient. With the Matthews Correlation Coefficient, correlations among AUs can be taken into account, which was missing from previous studies based on the Binary Relevance classifier, since that approach treats every target variable independently and ignores label correlations.
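The threshold-selection step described above can be illustrated with a minimal sketch: given a model's predicted probabilities for one AU and the ground-truth labels, sweep a grid of candidate thresholds and keep the one that maximizes the Matthews Correlation Coefficient. The data below is synthetic and the function name `best_threshold` is a hypothetical helper, not code from the thesis.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef

def best_threshold(y_true, y_prob, grid=np.linspace(0.05, 0.95, 19)):
    """Return the threshold in `grid` that maximizes MCC, and that MCC."""
    scores = [matthews_corrcoef(y_true, (y_prob >= t).astype(int)) for t in grid]
    i = int(np.argmax(scores))
    return float(grid[i]), float(scores[i])

# Toy stand-in for one AU: labels and well-separated probabilities
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)
y_prob = np.clip(y_true * 0.6 + rng.random(200) * 0.5, 0.0, 1.0)

t, mcc = best_threshold(y_true, y_prob)
print(f"best threshold={t:.2f}, MCC={mcc:.3f}")
```

In a multi-label setting, the same sweep would be run once per AU, giving each AU its own operating threshold rather than a fixed 0.5 cut-off.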
Modifying the DenseNet model improved results by reusing features and alleviating the vanishing-gradient problem. We evaluated the proposed architecture on a competitive Facial Action Unit detection benchmark, the EmotioNet database, which includes 950,000 images with annotated AUs. Modified DenseNet obtains significant improvements over state-of-the-art methods on most evaluation metrics, including accuracy, while requiring less computation time than problem transformation methods.