Emily Mower Provost

Assistant Professor, Computer Science and Engineering

Emily Mower Provost

Assistant Professor

University of Michigan
EECS Department
Computer Science & Engineering
3620 BBB
Ann Arbor, MI 48109-2121
Tel: 734-647-1802
Email:




Computational Human-Centered Analysis and Integration

Research

The CHAI lab focuses on behavior recognition from audio-visual speech. We have two main areas of study:

  1. Emotion modeling (classification and perception)
  2. Assistive technology (aphasia and bipolar disorder)

Emotion Modeling: Classification

Emotion has intrigued researchers for generations. This fascination has permeated the engineering community, motivating the development of affective computational models for classification. However, human emotion remains notoriously difficult to interpret, in part due to the presence of complex emotions: emotions that contain shades of multiple affective classes. Proper representations of emotion would ameliorate this problem by introducing multidimensional characterizations of the data that permit the quantification and description of the varied affective components of each utterance. We develop methods to characterize emotion, quantifying the presence of multiple shades of affect and avoiding the need for hard-labeled assignments. This set of techniques can be used to determine a most likely assignment for an utterance, to map out the evolution of the emotional tenor of an interaction, or to interpret utterances that have multiple affective components.
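
As a toy illustration of this idea (and not the lab's actual models), the Python sketch below represents each utterance as a soft distribution over emotion classes rather than a single hard label; the emotion set, numbers, and helper functions are hypothetical.

    # Illustrative sketch only: each utterance carries a soft distribution over
    # emotion classes rather than one hard label. All names and numbers below
    # are hypothetical.
    EMOTIONS = ["angry", "happy", "neutral", "sad"]

    # Hypothetical evaluator profiles: the fraction of evaluators who perceived
    # each emotion in a given utterance (shades of multiple affective classes).
    utterances = {
        "utt_01": [0.10, 0.70, 0.15, 0.05],   # predominantly happy
        "utt_02": [0.40, 0.05, 0.20, 0.35],   # blend of angry and sad
        "utt_03": [0.05, 0.45, 0.40, 0.10],   # ambiguous happy/neutral
    }

    def most_likely_emotion(profile):
        """Collapse a soft profile to a single most likely class when needed."""
        return EMOTIONS[max(range(len(profile)), key=lambda i: profile[i])]

    def emotional_tenor(profiles):
        """Average profiles over an interaction to track its overall tenor."""
        n = len(profiles)
        return [sum(p[i] for p in profiles) / n for i in range(len(EMOTIONS))]

    for utt_id, profile in utterances.items():
        print(utt_id, dict(zip(EMOTIONS, profile)), "->", most_likely_emotion(profile))

    print("interaction tenor:", dict(zip(EMOTIONS, emotional_tenor(list(utterances.values())))))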

Recent publications (from 2013 to present):

  1. Biqiao Zhang, Emily Mower Provost, Georg Essl. “Cross-Corpus Acoustic Emotion Recognition From Singing And Speaking: A Multi-Task Learning Approach.” International Conference on Acoustics, Speech and Signal Processing (ICASSP). Shanghai, China, March 2016. [pdf]

  2. Duc Le and Emily Mower Provost. “Data Selection for Acoustic Emotion Recognition: Analyzing and Comparing Utterance and Sub-Utterance Selection Strategies.” Affective Computing and Intelligent Interaction (ACII). Xi’an, China, September 2015. [Note: oral presentation][pdf]

  3. Biqiao Zhang, Georg Essl, and Emily Mower Provost. “Recognizing Emotion from Singing and Speaking Using Shared Models.” Affective Computing and Intelligent Interaction (ACII). Xi’an, China, September 2015. [Note: oral presentation][pdf]

  4. Yuan (June) Shangguan and Emily Mower Provost. “EmoShapelets: Capturing Local Dynamics of Audiovisual Affective Speech.” Affective Computing and Intelligent Interaction (ACII). Xi’an, China, September 2015. [Note: oral presentation][pdf]

  5. Yelin Kim and Emily Mower Provost. “Leveraging Inter-rater Agreement for Audio-Visual Emotion Recognition.” Affective Computing and Intelligent Interaction (ACII). Xi’an, China, September 2015. [pdf]

  6. Yelin Kim, Jixu Chen, Ming-Ching Chang, Xin Wang, Emily Mower Provost, Siwei Lyu. “Modeling Transition Patterns Between Events for Temporal Human Action Segmentation and Classification." IEEE International Conference on Automatic Face and Gesture Recognition (FG). Ljubljana, Slovenia, May, 2015. [Note: oral presentation][pdf]

  7. Biqiao Zhang, Emily Mower Provost, Robert Swedberg, Georg Essl. "Predicting Emotion Perception Across Domains: A Study of Singing and Speaking." AAAI. Austin, TX, USA, January 2015. [pdf]

  8. Yelin Kim and Emily Mower Provost. "Say Cheese vs. Smile: Reducing Speech-Related Variability for Facial Emotion Recognition." Proceedings of the ACM International Conference on Multimedia. Florida, USA, November, 2014. [pdf, winner, best student paper!]

  9. Duc Le and Emily Mower Provost. “Emotion Recognition From Spontaneous Speech Using Hidden Markov Models With Deep Belief Networks.” Automatic Speech Recognition and Understanding (ASRU). Olomouc, Czech Republic. December, 2013. [pdf]

  10. Theodora Chaspari, Emily Mower Provost, and Shrikanth S. Narayanan. "Analyzing the Structure of Parent-Moderated Narratives from Children with ASD Using an Entity-Based Approach." Interspeech. Lyon, France. August, 2013. [pdf]

  11. Yelin Kim and Emily Mower Provost. "Emotion Classification via Utterance-Level Dynamics: A Pattern-Based Approach to Characterizing Affective Expressions." International Conference on Acoustics, Speech and Signal Processing (ICASSP). Vancouver, British Columbia, Canada. May, 2013. [pdf]

  12. Yelin Kim, Honglak Lee, and Emily Mower Provost. "Deep Learning for Robust Feature Generation in Audio-Visual Emotion Recognition." International Conference on Acoustics, Speech and Signal Processing (ICASSP). Vancouver, British Columbia, Canada. May, 2013. [pdf]

  13. Emily Mower Provost. "Identifying Salient Sub-Utterance Emotion Dynamics Using Flexible Units and Estimates of Affective Flow." International Conference on Acoustics, Speech and Signal Processing (ICASSP). Vancouver, British Columbia, Canada. May, 2013. [pdf]

Emotion Modeling: Perception

Humans integrate audio and visual information when they make perceptual judgments of emotion. However, we do not yet have computational models that can describe this process. Our work focuses on modeling emotion perception against the backdrop of the Emotional McGurk Effect paradigm, in which clips of mismatched emotional content (e.g., an angry face and a happy voice) are used as stimuli during perception experiments (examples). This knowledge will inform the design of emotional behaviors for use in human-computer and human-robot interaction, a step necessary for the widespread adoption of affective technology.
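
As a toy illustration of how such stimuli can be enumerated (and not the actual experimental materials), the Python sketch below pairs voice and face clips of different emotions and keeps only the mismatched combinations; the file names and emotion set are hypothetical.

    # Illustrative sketch only: enumerate mismatched audio-visual stimuli in the
    # spirit of the Emotional McGurk Effect paradigm. Clip names are hypothetical.
    from itertools import product

    audio_clips = {"angry": "voice_angry.wav", "happy": "voice_happy.wav"}
    video_clips = {"angry": "face_angry.mp4", "happy": "face_happy.mp4"}

    # Pair every voice emotion with every face emotion; keep only mismatches
    # (e.g., an angry face presented with a happy voice).
    stimuli = [
        {"audio": audio_clips[a], "video": video_clips[v],
         "voice_emotion": a, "face_emotion": v}
        for a, v in product(audio_clips, video_clips)
        if a != v
    ]

    for stimulus in stimuli:
        print(stimulus)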

Recent publications (from 2013 to present):
  1. Emily Mower Provost, Yuan (June) Shangguan, Carlos Busso, "UMEME: University of Michigan Emotional McGurk Effect Dataset," IEEE Transactions on Affective Computing, 10:1(395-409), 2015. [pdf]

  2. Emily Mower Provost, Irene Zhu, and Shrikanth Narayanan. "Using Emotional Noise to Uncloud Audio-Visual Emotion Perceptual Evaluation." International Conference on Multimedia and Expo (ICME). San Jose, CA. July, 2013. [pdf][Examples][Data Mining Workshop Talk]

Assistive Technology: Aphasia

Aphasia is a common language disorder that can severely affect an individual’s ability to communicate with others. Aphasia rehabilitation requires intensive practice accompanied by appropriate feedback, the latter of which is difficult to provide outside of therapy. We work toward intelligent systems capable of providing automatic feedback to patients with aphasia. We have collected (and continue to collect) a speech corpus in collaboration with the University of Michigan Aphasia Program (UMAP). We investigate methods to automate transcription and to estimate speech intelligibility based on human perceptual judgments.
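
As a toy illustration of the intelligibility-estimation step (and not the actual system), the Python sketch below fits a regularized linear model mapping hypothetical utterance-level acoustic features to mean listener intelligibility ratings; every feature, rating, and dimension is invented for the example.

    # Illustrative sketch only: predict a mean perceptual intelligibility rating
    # (1-5 scale) from utterance-level acoustic features with ridge regression.
    # All data here are synthetic stand-ins.
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical features per utterance (e.g., speaking rate, pause, pitch
    # statistics) and mean listener ratings generated from a made-up model.
    X = rng.normal(size=(50, 3))
    y = np.clip(3.0 + X @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=0.2, size=50), 1.0, 5.0)

    # Ridge-regularized least squares: w = (X^T X + lambda I)^{-1} X^T y.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append a bias column
    lam = 1.0
    w = np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ y)

    new_utterance = np.array([0.2, -1.0, 0.5, 1.0])  # features plus bias term
    print("predicted intelligibility:", round(float(new_utterance @ w), 2))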

Recent publications (from 2013 to present):
  1. Duc Le and Emily Mower Provost. "Modeling Pronunciation, Rhythm, and Intonation for Automatic Assessment of Speech Quality in Aphasia Rehabilitation." Interspeech. Singapore. September, 2014. [pdf]

  2. Duc Le, Keli Licata, Elizabeth Mercado, Carol Persad, Emily Mower Provost. "Automatic Analysis of Speech Quality for Aphasia Treatment." International Conference on Acoustics, Speech and Signal Processing (ICASSP). Florence, Italy. May 2014. [pdf]

Assistive Technology: Bipolar Disorder

Speech patterns are modulated by the emotional and neurophysiological state of the speaker. There exists a growing body of work that examines this modulation in patients suffering from depression, autism, and post-traumatic stress disorder. However, the majority of the work in this area focuses on the analysis of structured speech collected in controlled environments. Here we expand on the existing literature by examining bipolar disorder (BP). BP is characterized by mood transitions between a healthy euthymic state and states marked by mania or depression. The speech patterns associated with these mood states provide a unique opportunity to study the modulations characteristic of mood variation. We develop methodology to collect unstructured speech continuously and unobtrusively by recording day-to-day cellular phone conversations, and to model these data to estimate an individual’s mood.
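
As a toy illustration of this modeling step (and not the study's actual classifier), the Python sketch below trains an off-the-shelf logistic regression on hypothetical call-level acoustic features labeled with mood states, then aggregates per-call predictions into a single mood estimate; the features, labels, and classifier choice are assumptions made for the example.

    # Illustrative sketch only: classify call-level acoustic feature vectors into
    # mood states and aggregate per-call predictions. All data are synthetic.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    MOODS = ["euthymic", "manic", "depressed"]

    # Hypothetical training data: one feature vector per recorded call
    # (e.g., speaking rate, pitch variability, pause statistics).
    X_train = np.vstack([rng.normal(loc=m, size=(40, 3)) for m in (0.0, 1.5, -1.5)])
    y_train = np.repeat(MOODS, 40)

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # Score a new day's calls and report the most frequent predicted mood state.
    X_new = rng.normal(loc=1.2, size=(5, 3))
    predictions = clf.predict(X_new)
    values, counts = np.unique(predictions, return_counts=True)
    print("per-call predictions:", list(predictions))
    print("estimated mood state:", values[np.argmax(counts)])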

Recent publications (from 2013 to present):
  1. John Gideon, Emily Mower Provost, Melvin McInnis. “Mood State Prediction From Speech Of Varying Acoustic Quality For Individuals With Bipolar Disorder.” International Conference on Acoustics, Speech and Signal Processing (ICASSP). Shanghai, China, March 2016. [pdf]

  2. Zahi N. Karam, Emily Mower Provost, Satinder Singh, Jennifer Montgomery, Christopher Archer, Gloria Harrington, Melvin McInnis. "Ecologically Valid Long-term Mood Monitoring of Individuals with Bipolar Disorder Using Speech." International Conference on Acoustics, Speech and Signal Processing (ICASSP). Florence, Italy. May 2014. [pdf]