Task-Oriented Dialogue Systems through Function Calling

Labruna, Tiziano; Bonetta, Giovanni; Magnini, Bernardo
2025-01-01

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities in generating dialogues and handling a broad range of user queries. However, their effectiveness as end-to-end Task-Oriented Dialogue (TOD) systems remains limited due to their reliance on static parametric memory, which fails to accommodate evolving knowledge bases (KBs). This paper investigates a scalable function-calling approach that enables LLMs to retrieve only the necessary KB entries via schema-guided queries, rather than embedding the entire KB into each prompt. This selective retrieval strategy reduces prompt size and inference time while improving factual accuracy in system responses. We evaluate our method on the MultiWOZ 2.3 dataset and compare it against a full-KB baseline that injects the entire KB into every prompt. Experimental results show that our approach consistently outperforms the full-KB method in accuracy, while requiring significantly fewer input tokens and considerably less computation time, especially when the KB size increases.
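
The contrast between selective retrieval and full-KB prompting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy KB, the `query_kb` function, its schema, and the example call are all hypothetical, and no LLM is invoked. A hard-coded dictionary stands in for the function call a model might emit.

```python
import json

# Hypothetical toy KB; slot names loosely follow MultiWOZ-style restaurant entries.
KB = {
    "restaurant": [
        {"name": "Toy Trattoria", "area": "centre", "food": "italian", "pricerange": "cheap"},
        {"name": "Toy Wok", "area": "north", "food": "chinese", "pricerange": "moderate"},
        {"name": "Toy Curry", "area": "centre", "food": "indian", "pricerange": "expensive"},
    ]
}

# Hypothetical function schema that would be exposed to the model (an assumption,
# not the schema used in the paper).
QUERY_KB_SCHEMA = {
    "name": "query_kb",
    "description": "Return KB entries of a domain that match all slot constraints.",
    "parameters": {
        "type": "object",
        "properties": {
            "domain": {"type": "string", "enum": list(KB)},
            "constraints": {"type": "object"},
        },
        "required": ["domain", "constraints"],
    },
}


def query_kb(domain, constraints):
    """Return the entries of `domain` whose slots equal every given constraint."""
    return [
        entry
        for entry in KB.get(domain, [])
        if all(entry.get(slot) == value for slot, value in constraints.items())
    ]


def full_kb_context():
    """Baseline: serialize the whole KB into the prompt context."""
    return json.dumps(KB)


def selective_context(call):
    """Function-calling variant: serialize only the entries the call retrieves."""
    args = call["arguments"]
    return json.dumps(query_kb(args["domain"], args["constraints"]))


# Stand-in for a function call a model might emit for
# "I am looking for an Indian restaurant in the centre."
example_call = {
    "name": "query_kb",
    "arguments": {
        "domain": "restaurant",
        "constraints": {"area": "centre", "food": "indian"},
    },
}

if __name__ == "__main__":
    print("full-KB context chars:  ", len(full_kb_context()))
    print("selective context chars:", len(selective_context(example_call)))
```

In this sketch the selective context contains one matching entry while the full-KB context contains every entry, so the gap between the two grows with the size of the KB.
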
Files in this item:
2025.ranlp-1.72.pdf (open access)
Type: post-print document
License: Creative Commons
Size: 353.09 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/371647