Self-Supervised Learning via Multi-view Facial Rendezvous for 3D/4D Affect Recognition

In this paper, we present Multi-view Facial Rendezvous (MiFaR): a novel multi-view self-supervised learning model for 3D/4D facial affect recognition. Our self-supervised architecture learns collaboratively across multiple views. For each view, our model computes embeddings via different encoders and robustly aims to correlate two distorted versions of the input batch. We additionally present a novel loss function that not only leverages the correlation among the underlying facial patterns across multi-views but is also robust and consistent across different batch sizes. Finally, our model is equipped with distributed training to ensure better learning along with computational efficiency. We conduct extensive experiments and report ablations to validate the competence of our model on widely-used datasets for 3D/4D facial expression recognition (FER).
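The abstract does not give the exact formulation of the loss, but a batch-size-robust objective that correlates two distorted views of a batch can be sketched as a cross-correlation loss over standardized embeddings (the function name, hyperparameter, and formulation below are illustrative assumptions, not the paper's actual method):

```python
import numpy as np

def cross_correlation_loss(z_a, z_b, lambd=5e-3):
    """Illustrative sketch (hypothetical, not MiFaR's exact loss):
    correlate embeddings of two distorted views of the same batch.
    Per-feature standardization makes the cross-correlation matrix,
    and hence the loss, largely insensitive to batch size."""
    n, _ = z_a.shape
    # Standardize each embedding dimension over the batch.
    z_a = (z_a - z_a.mean(0)) / (z_a.std(0) + 1e-8)
    z_b = (z_b - z_b.mean(0)) / (z_b.std(0) + 1e-8)
    # Cross-correlation matrix between the two views.
    c = z_a.T @ z_b / n
    # Pull matched features together (diagonal toward 1)...
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()
    # ...while decorrelating mismatched features (off-diagonal toward 0).
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()
    return on_diag + lambd * off_diag
```

Because the correlation matrix is an average over the batch, dividing by `n` keeps the loss on a comparable scale for small and large batches, which matches the robustness property claimed in the abstract.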

Behzad Muzammil, Zhao Guoying

A4 Article in conference proceedings

Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, Jodhpur, India (virtual event), December 15-18, 2021

M. Behzad and G. Zhao, "Self-Supervised Learning via Multi-view Facial Rendezvous for 3D/4D Affect Recognition," 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021), 2021, pp. 1-5, doi: 10.1109/FG52635.2021.9666942

https://doi.org/10.1109/FG52635.2021.9666942
http://urn.fi/urn:nbn:fi-fe202201146969