Recent progress in end-to-end (E2E) open-domain dialogue agents has shown a dramatic improvement in the ability of these models to learn patterns and mimic conversational behaviors from the training data. Unfortunately, these models still exhibit many concerning problems. First, they may learn undesirable features present in the training data, such as biased, toxic, or otherwise harmful language. Second, they do not respond reliably in safety-critical situations, such as when a user asks for medical advice, indicates an intention of self-harm, or raises a public health crisis such as COVID-19.
These problems are difficult: what is deemed “offensive” or even “sensitive” is both contextually and culturally dependent, and picking up on more subtle examples of unsafe language often requires a level of language understanding that is well beyond current capabilities.
A first workshop on these topics was held in October 2020. Following on from that workshop, this special session will take place at SIGDIAL 2021.