Which measures of vocal stability are consistently linked to distress across multiple studies informing SER and Synthesis?

Answer

Jitter or shimmer

Research on how acoustic features map to emotional categories continues to refine the objective targets available to emotion-aware algorithms. Jitter and shimmer quantify the stability and regularity of vocal fold vibration, and both are sensitive to vocal effort and physiological state. Jitter is the cycle-to-cycle variation in fundamental frequency (perceived as pitch); shimmer is the cycle-to-cycle variation in amplitude (perceived as loudness). Under distress, stress, or strong physiological arousal, both measures often increase, reflecting more irregular vibration. When independent studies consistently find that changes in jitter or shimmer correlate with distress, the result gives algorithms concrete, measurable targets: SER systems can use them for detection, and emotion-aware TTS systems can reproduce them in synthesis.
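The two definitions above can be made concrete with a small sketch. It uses one common formulation, often called local jitter and local shimmer: the mean absolute difference between consecutive cycles, divided by the mean value. The function names and the sample numbers are hypothetical, and the code assumes the per-cycle periods and peak amplitudes have already been extracted by a pitch-marking tool.

```python
def relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by their mean."""
    values = [float(v) for v in values]
    if len(values) < 2:
        raise ValueError("need at least two glottal cycles")
    mean_value = sum(values) / len(values)
    if mean_value == 0:
        raise ValueError("mean value must be non-zero")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / mean_value


def local_jitter(periods_s):
    """Cycle-to-cycle variation of the glottal period (seconds per cycle)."""
    return relative_perturbation(periods_s)


def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle variation of the per-cycle peak amplitude."""
    return relative_perturbation(peak_amplitudes)


# Illustrative, made-up values: a steady voice versus a less regular one.
steady_periods = [0.0100, 0.0100, 0.0100, 0.0100]
irregular_periods = [0.0100, 0.0104, 0.0097, 0.0103]
steady_amplitudes = [0.80, 0.80, 0.80, 0.80]
irregular_amplitudes = [0.80, 0.72, 0.85, 0.70]

print(local_jitter(steady_periods), local_jitter(irregular_periods))
print(local_shimmer(steady_amplitudes), local_shimmer(irregular_amplitudes))
```

Both values are ratios, so they are usually reported as percentages. An SER system could use them as input features, while a TTS system would need to introduce comparable perturbation into the synthesized signal; how either system does that is outside the scope of this sketch.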
