
In this paper, the authors investigate the efficacy of Instruction Tuning (IT) for building conversational agents from Large Language Models (LLMs). Instruction Tuning is the process of training LLMs on instruction-response pairs and is the most widely used method for turning base pre-trained LLMs into conversational agents. The goal of this work is to uncover the limitations of instruction tuning through a series of experiments, focusing on how these limitations affect the performance and capabilities of LLMs.

Overview

The authors employ a mixed-method approach for evaluation and design a comprehensive set of experiments that allow for an in-depth analysis of IT’s impact across various dimensions, including knowledge retention, hallucination rates, and conversational abilities. The experiments are conducted on different LLMs and IT datasets to ensure the findings are robust and generalizable. The authors compare LoRA fine-tuning (LFT) with standard full-parameter fine-tuning (SFT). LFT keeps the pre-trained weight matrices frozen and learns low-rank update matrices that are added to them, drastically reducing the number of trainable parameters and making fine-tuning faster and cheaper. In contrast, SFT adjusts all (or most) of the model’s weights.
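
To make the distinction concrete, here is a minimal PyTorch sketch of the LoRA idea. This is illustrative, not the authors’ code: the class name and the hyperparameters `r` and `alpha` are assumptions following common LoRA conventions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer augmented with a trainable low-rank update.

    The effective weight is W + (alpha / r) * B @ A, where W is the
    frozen pre-trained weight and only A and B are trained.
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pre-trained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.scaling = alpha / r
        # Low-rank factors: A starts small and random, B starts at zero,
        # so the update is a no-op before training begins.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base output plus the scaled low-rank correction.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)
```

Full-parameter fine-tuning would instead leave every weight trainable, which is exactly why it can overwrite what the model learned during pre-training.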

The key points are:

  • LoRA fine-tuning (LFT) preserves the pre-training token distribution, while SFT does not. This means that after LFT, the model still relies heavily on its pre-training and does not acquire new knowledge (see the measurement sketch after this list).

  • Scaling up the IT dataset is ineffective for LFT.

  • LoRA fine-tuning mainly improves response initiation and style, without substantially adding knowledge.

  • Full-parameter fine-tuning tends to degrade the LLM’s knowledge base and increase hallucination occurrences.

  • Popular methods and adjustments fail to significantly outperform simple LoRA fine-tuned models in terms of conversational quality and accuracy.
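
One way to probe the distribution-preservation claim in the first point is to compare the next-token distributions of the base and fine-tuned models on the same prompt. Below is a minimal sketch using Hugging Face transformers; the model names are hypothetical placeholders, and this illustrates the idea rather than reproducing the authors’ evaluation pipeline.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical model names; substitute any base model and its fine-tuned variant.
BASE, TUNED = "base-llm", "base-llm-lora-ft"

tok = AutoTokenizer.from_pretrained(BASE)
base_lm = AutoModelForCausalLM.from_pretrained(BASE).eval()
tuned_lm = AutoModelForCausalLM.from_pretrained(TUNED).eval()

prompt = "The capital of France is"
ids = tok(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    p = F.log_softmax(base_lm(ids).logits[0, -1], dim=-1)   # base next-token dist
    q = F.log_softmax(tuned_lm(ids).logits[0, -1], dim=-1)  # tuned next-token dist

# KL(base || tuned): small values mean the fine-tuned model kept the
# pre-training token distribution largely intact.
kl = F.kl_div(q, p, log_target=True, reduction="sum")
print(f"KL divergence at the next-token position: {kl.item():.4f}")
```

A small KL divergence means the fine-tuned model’s token distribution stays close to the base model’s, which is the behavior the paper reports for LFT.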

Conclusion

This paper shows that Instruction Tuning exhibits limitations in enhancing the conversational capabilities and knowledge base of LLMs. These limitations underscore the need for further research and development in this area. More details in the full paper.

Congrats to the authors for their work!

Ghosh, Sreyan, et al. “A Closer Look at the Limitations of Instruction Tuning.” arXiv preprint arXiv:2402.05119 (2024).
