
Today’s paper studies how to improve the ability of large language models (LLMs) to acquire and access knowledge from new documents through continued training. Specifically, it proposes an approach called pre-instruction-tuning (PIT).

Method Overview

Let’s first compare the two training procedures.

Given a pre-trained LLM, the standard procedure continues pre-training on the new documents and then instruction-tunes on question-answer (QA) pairs. In contrast, the proposed pre-instruction-tuning (PIT) reverses the order: it first instruction-tunes on QA pairs and only then continues training on the new documents. The authors also study several PIT variants.
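To make the difference in ordering concrete, here is a minimal sketch of the two schedules as described above. It is an illustration only, not the authors' code: the names `standard_schedule`, `pit_schedule`, `run`, and the `train_step` callback are hypothetical, and the actual gradient update is left abstract.

```python
from typing import Callable, List, Tuple

# A stage is (stage name, list of training examples).
Stage = Tuple[str, List[str]]


def standard_schedule(docs: List[str], qa_pairs: List[str]) -> List[Stage]:
    """Standard order: continued pre-training on documents, then QA tuning."""
    return [("continued_pretraining", docs), ("instruction_tuning", qa_pairs)]


def pit_schedule(docs: List[str], qa_pairs: List[str]) -> List[Stage]:
    """PIT order: QA tuning first, then continued training on documents."""
    return [("instruction_tuning", qa_pairs), ("continued_pretraining", docs)]


def run(schedule: List[Stage], train_step: Callable[[str, str], None]) -> List[str]:
    """Run the stages in order and return the stage name of every step taken.

    `train_step` is a hypothetical stand-in for one optimizer update.
    """
    log = []
    for stage_name, examples in schedule:
        for example in examples:
            train_step(stage_name, example)
            log.append(stage_name)
    return log


if __name__ == "__main__":
    docs = ["document A", "document B"]
    qa_pairs = ["Q: ... A: ..."]
    print(run(pit_schedule(docs, qa_pairs), lambda stage, ex: None))
```

The only thing that changes between the two schedules is the order of the stages; the data and the training step are the same.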

The authors argue that by mastering QA pairs first, the LLM can more effectively absorb facts from intricate documents. They compare different orderings of QA and document training to show that prioritizing QA understanding is key. In the paper's figure of these variants, the dotted line represents training on QA pairs and their associated documents sequentially.

To test the approach, the authors collect a dataset from Wikipedia 2023 articles, which are very unlikely to have appeared in the base LLM's pre-training data. This guards against cases where the LLM answers the questions from prior knowledge rather than from the new documents.
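The idea behind this kind of dataset can be sketched as a simple date filter. This is an assumption-laden illustration rather than the authors' pipeline: the `Article` type, the `unseen_articles` helper, and the cutoff date are all hypothetical.

```python
from datetime import date
from typing import List, NamedTuple


class Article(NamedTuple):
    title: str
    created: date  # hypothetical field: when the article first appeared


def unseen_articles(articles: List[Article], cutoff: date) -> List[Article]:
    """Keep only articles created after the (assumed) pre-training cutoff."""
    return [a for a in articles if a.created > cutoff]


if __name__ == "__main__":
    # The cutoff below is a made-up example, not a value from the paper.
    cutoff = date(2022, 12, 31)
    articles = [
        Article("Older article", date(2021, 5, 1)),
        Article("Newer article", date(2023, 3, 15)),
    ]
    print([a.title for a in unseen_articles(articles, cutoff)])
```

Anything the filter keeps is something the base model is unlikely to have seen, so a correct answer has to come from the continued training.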

Results

Experiments on the Wiki2023 benchmark show that pre-instruction-tuning substantially improves how well the LLM absorbs and recalls knowledge from new documents. Specifically, it improves QA accuracy by up to 17.8% over the standard baseline of continued pre-training followed by instruction-tuning.

Conclusion

This paper introduces pre-instruction-tuning, where the model is fine-tuned on QA pairs before it learns from new documents. It achieves promising gains over the standard training order. More details are in the paper.

Congrats to the authors for their work!

Jiang, Zhengbao et al. “Instruction-tuned Language Models are Better Knowledge Learners.” (2024).
