PainDiffusion: Can robot express pain?

Ritsumeikan University
Submitted to ICRA 2025

Abstract

Pain is an intuitive and user-friendly way of communicating problems, making it especially useful in rehabilitation nurse training robots. While most previous methods have focused on classifying or recognizing pain expressions, we introduce PainDiffusion, a model that generates facial expressions in response to pain stimuli, with controllable expressiveness and emotional state. PainDiffusion utilizes diffusion forcing to roll out predictions over arbitrary lengths using a conditioned temporal U-Net. It operates as a latent diffusion model within EMOCA's facial expression latent space. For training, we process the BioVid Heatpain Database, extracting expression codes and subject identity configurations. We also propose a novel set of metrics to evaluate pain expressions, covering expressiveness, diversity, and the appropriateness of model-generated outputs. Finally, we demonstrate that PainDiffusion outperforms the autoregressive method, both qualitatively and quantitatively.

Methodology
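
The abstract's mention of diffusion forcing can be illustrated with a generic sketch of its training-time noising step, in which every frame of a latent sequence receives its own independently sampled noise level rather than one shared level for the whole sequence. This is not the authors' implementation: the linear beta schedule, NUM_STEPS, and the function names alpha_bar and noise_sequence are assumptions chosen for illustration, and the conditioned temporal U-Net that would denoise the result is omitted.

```python
import math
import random

NUM_STEPS = 1000  # hypothetical number of diffusion steps (assumption)


def alpha_bar(t, num_steps=NUM_STEPS, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction after t steps of a linear beta schedule."""
    prod = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def noise_sequence(latents, rng):
    """Noise each frame with its own independently sampled noise level."""
    noisy, levels = [], []
    for frame in latents:
        t = rng.randrange(NUM_STEPS + 1)  # per-frame level; 0 means clean
        a = alpha_bar(t)
        noisy.append([
            math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for x in frame
        ])
        levels.append(t)
    return noisy, levels


# Toy latent sequence: 8 frames with 4 made-up values each.
rng = random.Random(0)
latents = [[0.5] * 4 for _ in range(8)]
noisy, levels = noise_sequence(latents, rng)
```

Because the noise level varies per frame, a model trained this way can, in principle, keep past frames nearly clean while denoising future ones, which is what makes rolling out sequences of arbitrary length possible.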

Output Comparison

FSQ-VAE Autoregressive baseline

PainDiffusion w/ Full-seq Diffusion

PainDiffusion w/ Diffusion Forcing

Ground Truth

Stimuli Signal

We randomly sample from the validation set and compare the output of different baselines.


Controllability Experiment

Stimuli Signal

Predicted Facial Expression

Stimuli level

Expressiveness index

Emotion index

The area of movement for each controllable condition.

Emotion: Anger, Contempt, Disgust, Fear, Happiness, Neutral, Sadness, Surprise

Stimuli: 1, 2, 3, 4

Expressiveness: 5, 6, 7, 8, 9, 10, 11

Dataset Analysis

For every 64-frame chunk of the videos in the training dataset, we compute the average pain expressiveness, the average emotion configuration, and the maximum stimulus value. We plot histograms with 50 bins to check their distributions.
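
The per-chunk statistics described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' analysis code: the function names chunk_stats and histogram are hypothetical, the chunks are assumed to be non-overlapping with any trailing partial chunk dropped, and the emotion configuration is left out for brevity (it would be averaged per component in the same way as expressiveness).

```python
import random

CHUNK_LEN = 64   # frames per chunk, as stated in the text
NUM_BINS = 50    # histogram bins, as stated in the text


def chunk_stats(expressiveness, stimulus, chunk_len=CHUNK_LEN):
    """Return (mean expressiveness, max stimulus) for each full chunk."""
    stats = []
    # Assumption: non-overlapping chunks; a trailing partial chunk is dropped.
    for start in range(0, len(expressiveness) - chunk_len + 1, chunk_len):
        expr = expressiveness[start:start + chunk_len]
        stim = stimulus[start:start + chunk_len]
        stats.append((sum(expr) / chunk_len, max(stim)))
    return stats


def histogram(values, num_bins=NUM_BINS):
    """Count values into equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins or 1.0  # avoid zero width if all values match
    counts = [0] * num_bins
    for v in values:
        idx = min(int((v - lo) / width), num_bins - 1)  # clamp max into last bin
        counts[idx] += 1
    return counts


# Synthetic stand-in for one video: 640 frames of made-up signals.
rng = random.Random(0)
expressiveness = [rng.uniform(7.0, 10.0) for _ in range(640)]
stimulus = [rng.uniform(32.0, 52.0) for _ in range(640)]

stats = chunk_stats(expressiveness, stimulus)
expr_hist = histogram([mean for mean, _ in stats])
stim_hist = histogram([peak for _, peak in stats])
```

With real data, the two input lists would be replaced by the per-frame expressiveness and stimulus signals of each training video, and the resulting counts passed to any plotting library.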

Pain Expressiveness
Stimulus
Emotion

In BioVid Heatpain part C, pain expressiveness mostly falls in the range of 7 to 10, with a small group of video chunks above 10. The maximum stimulus value of most chunks lies between 48 and 52 degrees Celsius. The average emotion distribution indicates that many samples are neutral, sad, or contemptuous.

BibTeX

@article{dam2024paindiffusion,
  title={PainDiffusion: Can robot express pain?},
  author={Quang Tien Dam and Tri Tung Nguyen Nguyen and Dinh Tuan Tran and Joo-Ho Lee},
  year={2024},
  url={https://arxiv.org/abs/2409.11635},
}