Self-supervised pain intensity estimation from facial videos via statistical spatiotemporal distillation

Automatic pain assessment, in particular the automatic detection of pain from facial expressions, has recently been developed to improve the quality of pain management and has attracted increasing attention. In this paper, we propose a self-supervised learning method for automatic and efficient pain assessment that reduces the cost of collecting large amounts of labeled data. To achieve this, we introduce a novel similarity function to learn generalized representations using a Siamese network in the pretext task. The learned representations are then fine-tuned on the downstream task of pain intensity estimation. To make the method computationally efficient, we propose Statistical Spatiotemporal Distillation (SSD), which encodes the spatiotemporal variations underlying a facial video into a single RGB image, enabling the use of less complex 2D deep models for video representation. Experiments on two publicly available pain datasets, together with a cross-dataset evaluation, demonstrate promising results and show the good generalization ability of the learned representations.

Authors:
Mohammad Tavakolian, Miguel Bordallo Lopez, Li Liu

Publication type:
A1 Journal article – refereed

Place of publication:

Keywords:
pain assessment, representation learning, self-supervised learning, statistical spatiotemporal distillation

Published:
2020

Full citation:
Mohammad Tavakolian, Miguel Bordallo Lopez, Li Liu, Self-supervised pain intensity estimation from facial videos via statistical spatiotemporal distillation, Pattern Recognition Letters, Volume 140, 2020, Pages 26-33, ISSN 0167-8655, https://doi.org/10.1016/j.patrec.2020.09.012

DOI:
https://doi.org/10.1016/j.patrec.2020.09.012

Read the publication here:
http://urn.fi/urn:nbn:fi-fe2020112593022