Abstract: In task-oriented dialogue systems, intent recognition and entity extraction are key to driving system understanding and state updates. However, traditional structured systems often show ...
Abstract: Multimodal large language models (MLLMs) act as essential interfaces, connecting humans with AI technologies in multimodal applications. However, current MLLMs face challenges in accurately ...