Although conversational agents based on large language models (LLMs) demonstrate strong fluency and coherence, they continue to exhibit undesirable behaviors, such as logical inconsistencies and factual inaccuracies. Detecting and mitigating these errors is critical for developing trustworthy systems. However, current response correction methods rely heavily on LLMs, which require information about the nature of an error, or hints about its occurrence, for accurate detection. This limits their ability to identify errors that are not defined in their instructions or covered by external tools, such as errors arising from updates to the response-generation model or shifts in user behavior. In this work, we introduce Automated Error Discovery, a framework for detecting and defining errors in conversational AI, and propose SEEED (Soft-clustering Extended Encoder-Based Error Detection), an encoder-based alternative to LLMs for error detection. We enhance the Soft Nearest Neighbor Loss by amplifying distance weighting for negative samples and introduce Label-Based Sample Ranking to select highly contrastive examples for better representation learning. SEEED outperforms adapted baselines across multiple error-annotated dialogue datasets, improving the accuracy of detecting novel errors by up to 8 points and demonstrating strong generalization to unknown intent detection.
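For intuition, the sketch below shows one way a Soft Nearest Neighbor Loss with amplified distance weighting for negative samples could be written in PyTorch. The `neg_weight` factor (which scales squared distances between samples of different classes) and the default hyperparameter values are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
import torch

def soft_nearest_neighbor_loss(embeddings, labels, temperature=0.1, neg_weight=2.0):
    """Soft Nearest Neighbor Loss with amplified distance weighting for
    negative samples. neg_weight=1.0 recovers the standard loss; values > 1
    push samples of different classes further apart.

    embeddings: (B, D) float tensor of utterance representations.
    labels:     (B,) long tensor of error-type labels.
    """
    # Pairwise squared Euclidean distances, shape (B, B).
    dist = torch.cdist(embeddings, embeddings, p=2).pow(2)

    same_label = labels.unsqueeze(0) == labels.unsqueeze(1)
    self_mask = torch.eye(len(labels), dtype=torch.bool, device=embeddings.device)

    # Amplify distances between negative pairs (different labels).
    dist = torch.where(same_label, dist, dist * neg_weight)

    # Similarities; exclude each sample's similarity to itself.
    sim = torch.exp(-dist / temperature).masked_fill(self_mask, 0.0)

    # Numerator: same-class neighbors; denominator: all other samples.
    pos = sim.masked_fill(~same_label, 0.0).sum(dim=1)
    total = sim.sum(dim=1)

    eps = 1e-8
    return -torch.log(pos.clamp_min(eps) / total.clamp_min(eps)).mean()

# Toy usage: 8 embeddings with 3 error-type labels.
emb = torch.randn(8, 32)
lbl = torch.tensor([0, 1, 2, 0, 1, 2, 0, 1])
loss = soft_nearest_neighbor_loss(emb, lbl)
```

In an encoder-based setup such as the one the abstract describes, the embeddings would come from the trained encoder; the random tensors above only demonstrate the loss computation itself.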
@article{petrak2025towards,
title={Towards Automated Error Detection: A Study in Conversational AI},
author={Petrak, Dominic and Tran, Thy and Gurevych, Iryna},
journal={arXiv preprint arXiv:2509.10833},
url={http://arxiv.org/abs/2509.10833},
year={2025}
}