How to Precisely Update Large Language Models' Knowledge While Avoiding Catastrophic Forgetting
Abstract
Recent advancements in Large Language Models (LLMs) have showcased their remarkable capabilities in text understanding and generation. However, even strong LLMs are susceptible to acquiring erroneous or obsolete information from their training corpora. Directly fine-tuning such models a second time on data containing new knowledge can fail to update that knowledge because of conflicts between the old and the new. In this paper, we propose a new fine-tuning paradigm called DFT. This method uses parameter arithmetic to precisely pinpoint where knowledge is stored and updates only the minimal set of relevant parameters. Experimental results on two publicly available datasets demonstrate that our proposed DFT significantly improves the knowledge-updating performance of full fine-tuning, while also outperforming existing baselines in most cases.
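The abstract only describes the idea at a high level. As one possible reading (a minimal sketch, not the paper's actual implementation), the snippet below illustrates how parameter arithmetic could localize knowledge-bearing weights and confine updates to them: the difference between a base model and a copy briefly fine-tuned on the new-knowledge data ranks parameters by relevance, and a mask then restricts gradient updates to that minimal set. All names here (`knowledge_mask`, `masked_update_step`, `base_model`, `probe_model`, `keep_ratio`) are hypothetical, and PyTorch is assumed.

```python
# Hypothetical sketch of parameter-arithmetic knowledge localization.
# `base_model` is the original model; `probe_model` is a copy of the same
# architecture briefly fine-tuned on the new-knowledge data.
import torch


def knowledge_mask(base_model, probe_model, keep_ratio=0.01):
    """Rank parameters by |theta_probe - theta_base| and keep the top fraction."""
    masks = {}
    for (name, p_base), (_, p_probe) in zip(
        base_model.named_parameters(), probe_model.named_parameters()
    ):
        delta = (p_probe.detach() - p_base.detach()).abs().flatten()
        k = max(1, int(keep_ratio * delta.numel()))
        threshold = torch.topk(delta, k).values.min()
        masks[name] = (delta >= threshold).view(p_base.shape)
    return masks


def masked_update_step(model, masks, optimizer):
    """Zero out gradients outside the localized parameters before stepping,
    so only the minimal relevant set of weights is updated."""
    for name, p in model.named_parameters():
        if p.grad is not None:
            p.grad.mul_(masks[name].to(device=p.grad.device, dtype=p.grad.dtype))
    optimizer.step()
    optimizer.zero_grad()
```

Under this reading, confining updates to the localized parameters is what limits interference with unrelated knowledge and thus mitigates catastrophic forgetting.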