A key challenge for interactive artificial agents is to produce multimodal communicative behavior that is effective and robust in a given, dynamically evolving interaction context. This project investigates the automatic generation of speech and gesture. We develop cognitive, generative models that incorporate information about the real-time interaction context to allow for adaptive multimodal behavior that can steer and support the conversational interaction. Our goals are to (a) learn models for generating speech and meaningful (representational) gestures in real time, (b) make these models adaptive to changes in the interlocutor’s behavior, and (c) validate them empirically in human-agent studies.
Contact: Hendric Voß