3D-CNN for Facial Emotion Recognition in Videos

7 Dec 2020 · Jad Haddad, Olivier Lezoray, Philippe Hamel

In this paper, we present a video-based emotion recognition neural network that operates on three dimensions. We show that 3D convolutional neural networks (3D-CNNs) are well suited to predicting facial emotions expressed over a sequence of frames. We optimize the 3D-CNN architecture through hyper-parameter search and show that this has a very strong influence on the results, even though architecture tuning of 3D-CNNs has received little attention in the literature. The resulting architecture improves on state-of-the-art techniques when tested on the CK+ and Oulu-CASIA datasets. We compare the results obtained under different cross-validation methods. The designed 3D-CNN yields 97.56% using Leave-One-Subject-Out cross-validation and 100% using 10-fold cross-validation on the CK+ dataset, and 84.17% using 10-fold cross-validation on the Oulu-CASIA dataset.
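The two evaluation protocols named above differ in how samples are assigned to folds. The following is a minimal sketch of that difference, not the authors' evaluation code. The function names, the toy subject labels, and the choice of contiguous, unshuffled folds are all assumptions made for illustration. The abstract does not say whether the 10-fold protocol groups sequences by subject, so the k-fold version below simply ignores subject identity.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)
    for subject in sorted(by_subject):
        test = by_subject[subject]
        held_out = set(test)
        # Every sample from every other subject goes into the training set.
        train = [i for i in range(len(subject_ids)) if i not in held_out]
        yield subject, train, test


def k_fold(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Assumption: no shuffling and no grouping by subject.
    """
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical toy data: one subject label per video sequence.
subjects = ["s1", "s1", "s2", "s3", "s3", "s3"]
loso_folds = list(leave_one_subject_out(subjects))
kfold_folds = list(k_fold(len(subjects), k=3))
```

In this toy example, the second k-fold split tests on one sequence of subject `s3` while training on the other two, which Leave-One-Subject-Out never allows. This is one possible reason the two protocols can report different scores on the same dataset.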
