
Tag-LLM is a new method from Microsoft Research designed to enhance the performance of Large Language Models (LLMs) in specialized domains. This approach addresses the challenge of applying general-purpose LLMs to tasks in areas such as the physical and biomedical sciences, where specific domain knowledge is crucial and often not well-represented in the models’ pretraining data.

Overview of Tag-LLM

Tag-LLM introduces a model-agnostic mechanism that incorporates custom input tags, parameterized as continuous vectors, into the LLM's embedding layer. These tags come in two types: _domain tags_, which provide context relevant to a specific area of expertise, and _function tags_, which steer the model toward solving a particular task. Below you can see an overview of the process for the task of protein-drug binding affinity prediction:

Overview of Tag-LLM for the task of protein-drug binding affinity prediction

Tag-LLM injects the _domain tags_ **_<Protein>_** and **_<SMILES>_** and a _function tag_ **_<Binding Affinity (BA)>_** into the input. These tags are mapped to specially trained embeddings. Finally, the model's last hidden state is passed to a task-specific head to generate predictions.
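To make this concrete, here is a minimal PyTorch sketch of the idea (my own illustration, not the authors' code): a frozen pretrained embedding table is augmented with a small trainable tag table, and the tag vectors are spliced around the embedded inputs before a task-specific head reads off the final hidden state. `TagEmbedding`, `predict_affinity`, and the one-vector-per-tag simplification are assumptions, and the model is assumed to be a HuggingFace-style decoder that accepts `inputs_embeds`:

```python
import torch
import torch.nn as nn

class TagEmbedding(nn.Module):
    """Frozen pretrained vocabulary embeddings plus a small trainable tag table."""

    def __init__(self, base_embedding: nn.Embedding, tag_names: list[str]):
        super().__init__()
        self.base = base_embedding
        self.base.weight.requires_grad = False    # the LLM's own vocabulary stays frozen
        self.tag_ids = {name: i for i, name in enumerate(tag_names)}
        # One trainable vector per tag (a simplification; real tags may span
        # several embedding positions).
        self.tags = nn.Parameter(
            torch.randn(len(tag_names), base_embedding.embedding_dim) * 0.02
        )

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.base(token_ids)               # ordinary tokens: frozen lookup

    def tag_vector(self, name: str, batch: int) -> torch.Tensor:
        # The tag's trainable embedding, broadcast to shape (batch, 1, d_model).
        return self.tags[self.tag_ids[name]].view(1, 1, -1).expand(batch, 1, -1)

def predict_affinity(llm, tag_emb, head, protein_ids, smiles_ids):
    """Splice tag vectors around the embedded inputs; regress from the last hidden state."""
    b = protein_ids.size(0)
    inputs = torch.cat([
        tag_emb.tag_vector("<Protein>", b),       # domain tag: protein sequence
        tag_emb(protein_ids),
        tag_emb.tag_vector("<SMILES>", b),        # domain tag: drug SMILES string
        tag_emb(smiles_ids),
        tag_emb.tag_vector("<BA>", b),            # function tag: binding affinity
    ], dim=1)
    hidden = llm(inputs_embeds=inputs).last_hidden_state
    return head(hidden[:, -1, :])                 # task-specific prediction head
```

With, for example, `head = nn.Linear(d_model, 1)`, the output is one scalar affinity per example.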

More precisely, to distinguish general domain knowledge from task-specific instructions, the method extends the model's vocabulary with these input tags: the pre-trained model receives an embedding matrix augmented with new entries for the tags, which are simply appended to the LLM's embedding layer. The tag embeddings are trained in three stages, sketched in the code below:

- Stage 1 learns domain tag embeddings through unsupervised next-token prediction on in-domain data.
- Stage 2 learns single-domain function tag embeddings from supervised single-domain data.
- Stage 3 learns cross-domain function tag embeddings from supervised cross-domain data.

In this way, the LLM can adapt to specialized tasks by being conditioned with the appropriate context and instructions, which improves its performance and generalization across domains.
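A compressed sketch of that three-stage schedule, continuing the example above. This is again illustrative: `run_stage`, the parameter groups, the loss functions, and the data loaders are placeholders, and the key point is that only the listed parameters receive gradients while the LLM itself stays frozen:

```python
def run_stage(params, dataloader, loss_fn, lr=1e-4, steps=1000):
    """Optimize only the given tag embeddings (and head); the LLM itself stays frozen."""
    opt = torch.optim.AdamW(params, lr=lr)
    for _, batch in zip(range(steps), dataloader):
        opt.zero_grad()
        loss_fn(batch).backward()
        opt.step()

# Placeholder parameters, losses, and loaders are assumed to be defined elsewhere.
# Stage 1: domain tags via unsupervised next-token prediction on in-domain text.
run_stage([domain_tag_params], in_domain_text, next_token_loss)
# Stage 2: single-domain function tags (plus task head) on supervised single-domain data.
run_stage([single_fn_tag_params, *head.parameters()], single_domain_data, task_loss)
# Stage 3: cross-domain function tags (plus task head) on supervised cross-domain data.
run_stage([cross_fn_tag_params, *head.parameters()], cross_domain_data, task_loss)
```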

Results

Tag-LLM not only preserves the LLM's general capabilities but also improves its performance on specialized tasks: the model outperforms or matches dedicated expert systems in predicting protein properties, predicting chemical properties, and modeling drug-target interactions. As the results below show, it also provides a significant boost over the base tuning-free LLM.

Results on different specialized tasks

Conclusion

Tag-LLM represents a step forward in applying LLMs to specialized domains, offering a flexible and effective method for enhancing model performance on tasks that require specific domain knowledge. For more details, please read the full paper.

Congrats to the authors for their work!

Shen, Junhong, et al. “Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains.” arXiv preprint arXiv:2402.05140 (2024).
