LoRA: Low-Rank Adaptation of Large Language Models

Hu, Edward J.; Shen, Yelong; Wallis, Phillip; Allen‑Zhu, Zeyuan; Li, Yuanzhi; Wang, Shean; Wang, Lu; Chen, Weizhu

dc.contributor.author	Hu, Edward J.
dc.contributor.author	Shen, Yelong
dc.contributor.author	Wallis, Phillip
dc.contributor.author	Allen‑Zhu, Zeyuan
dc.contributor.author	Li, Yuanzhi
dc.contributor.author	Wang, Shean
dc.contributor.author	Wang, Lu
dc.contributor.author	Chen, Weizhu
dc.date.accessioned	2025-06-02T13:28:10Z
dc.date.available	2025-06-02T13:28:10Z
dc.date.issued	2021-06-17
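dc.description	Proposes a method that keeps the model parameters frozen and injects LoRA, a low-rank trainable structure, greatly improving the fine-tuning efficiency and memory usage of large language models. It achieves 10,000 times fewer trainable parameters and 3 times lower memory usage. ©2021 Microsoft Research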
dc.description.abstract	An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example -- deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive. We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by 10,000 times and the GPU memory requirement by 3 times. LoRA performs on-par or better than fine-tuning in model quality on RoBERTa, DeBERTa, GPT-2, and GPT-3, despite having fewer trainable parameters, a higher training throughput, and, unlike adapters, no additional inference latency. We also provide an empirical investigation into rank-deficiency in language model adaptation, which sheds light on the efficacy of LoRA. We release a package that facilitates the integration of LoRA with PyTorch models and provide our implementations and model checkpoints for RoBERTa, DeBERTa, and GPT-2 at this https URL.
dc.description.sponsorship	Microsoft Research
dc.identifier.uri	https://arxiv.org/abs/2106.09685
dc.identifier.uri	http://data.inu.ac.kr/handle/123456789/1959
dc.language.iso	en_US
dc.publisher	arXiv
dc.subject	LoRA
dc.subject	Low-Rank Adaptation
dc.subject	Efficient Fine-Tuning
dc.subject	Transformer
dc.subject	NLP
dc.title	LoRA: Low-Rank Adaptation of Large Language Models
dc.type	Article
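
The abstract above describes freezing the pre-trained weights and injecting trainable rank decomposition matrices. A minimal sketch of that idea for a single weight matrix follows, in plain Python. The names matvec, lora_forward, trainable_params and the scale argument are hypothetical and chosen for illustration, and the zero initialization of B is an assumption not stated in this record. The layer sizes are toy values, so the printed ratio is not the 10,000-fold GPT-3 175B figure quoted in the abstract.

```python
import random

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, scale=1.0):
    """Return W x + scale * B (A x).

    W (d_out x d_in) stays frozen; only A (r x d_in) and B (d_out x r)
    would receive gradient updates during adaptation.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]

def trainable_params(d_out, d_in, r):
    """Trainable parameter counts: full update vs. rank-r update of one matrix."""
    return d_out * d_in, r * (d_in + d_out)

d_in, d_out, r = 8, 6, 2
rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen weights
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # small random init (assumption)
B = [[0.0] * r for _ in range(d_out)]                               # zero init (assumption)
x = [rng.gauss(0, 1) for _ in range(d_in)]

# With B = 0 the adapted layer reproduces the frozen layer exactly.
assert lora_forward(W, A, B, x) == matvec(W, x)

# Toy sizes: one 4096 x 4096 matrix adapted with rank 8.
full, low_rank = trainable_params(4096, 4096, 8)
print(full, low_rank, full // low_rank)
```

Because the update B A can be added into W once training ends, a layer adapted this way can be served as a single matrix, which is consistent with the abstract's claim of no additional inference latency.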

Files

Original bundle

Name: 2106.09685v2.pdf
Size: 1.53 MB
Format: Adobe Portable Document Format

Collections

Natural Language Processing