I hea- umm think that's what they say: A Dataset of Inferences from Natural Language Dialogues

Abstract:
In this paper we describe a dataset for Natural Language Inference (NLI) in the dialogue domain and present several baseline models that predict whether a given hypothesis can be inferred from a dialogue. We describe an approach for collecting hypotheses in the ENTAILMENT, CONTRADICTION and NEUTRAL categories, based on transcripts of natural spoken dialogue. We present the dataset and perform experiments using flat-concatenation and hierarchical neural network models, comparing these to baseline models that exploit lexical regularities at the utterance level. We also pre-train BERT on additional dialogue data and find that this further pre-training helps. Our experiments show that hierarchical models perform better on a random split of the data, while flat-concatenation models perform better on out-of-domain data. Lastly, we perform LLM prompting with two models, Llama 2 and Zephyr: the former barely exceeds the baseline, while the latter shows an incremental increase in performance as context length increases.
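As an illustrative sketch (not the authors' code), the flat-concatenation setup mentioned above can be thought of as joining the speaker-tagged utterances of a dialogue into a single premise string and pairing it with the hypothesis before classification; the `[SEP]` separator and `Speaker: utterance` format here are assumptions for illustration only:

```python
# Hedged sketch of flat-concatenation input construction for dialogue NLI.
# The separator token and speaker-tagging format are illustrative
# assumptions, not the paper's actual preprocessing.

LABELS = ["ENTAILMENT", "CONTRADICTION", "NEUTRAL"]  # the three hypothesis categories

def flat_concatenate(dialogue, hypothesis, sep=" [SEP] "):
    """Join speaker-tagged utterances into one flat premise string,
    then append the hypothesis after a separator (hypothetical format)."""
    premise = " ".join(f"{speaker}: {utterance}" for speaker, utterance in dialogue)
    return premise + sep + hypothesis

# Example: a two-turn dialogue paired with a candidate hypothesis.
dialogue = [
    ("A", "I hea- umm think that's what they say."),
    ("B", "Yeah, I've heard that too."),
]
pair = flat_concatenate(dialogue, "Speaker B agrees with Speaker A.")
```

A hierarchical model, by contrast, would encode each utterance separately before combining the utterance representations, rather than flattening the dialogue into one sequence.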
Year:
2024
Type of Publication:
In Proceedings
Book title:
Proceedings of the 28th Workshop on the Semantics and Pragmatics of Dialogue - Full Papers
ISSN:
2308-2275