- automatically loads the base model (e.g. LLaMA, Qwen, GPT-NeoX), using the base model path saved in the adapter config (or the path you pass in);
- then loads the adapter weights on top of it;
- and yields a complete, runnable PEFT model.

In other words, the args.output_dir you pass in actually contains:

- the adapter weights
- the adapter config
- a resolvable base_model_name_or_path field pointing to the original model (see the loading sketch below)
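Putting this together, here is a minimal loading sketch. It assumes the directory is opened with peft's AutoPeftModelForCausalLM; the path string is a placeholder for args.output_dir:

```python
from peft import AutoPeftModelForCausalLM

# The adapter config's base_model_name_or_path tells PEFT which base model
# to fetch first; the adapter weights are then loaded on top of it.
model = AutoPeftModelForCausalLM.from_pretrained(
    "path/to/output_dir",  # placeholder for args.output_dir
    device_map="auto",
)

# Optional: fold the LoRA weights into the base model for adapter-free inference
model = model.merge_and_unload()
```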
Using TRL with PEFT
```python
from peft import LoraConfig
from transformers import AutoModelForCausalLM

# Configure LoRA
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

# Load model on a specific device
model = AutoModelForCausalLM.from_pretrained(
    "your-model-name",
    load_in_8bit=True,  # Optional: use 8-bit precision
    device_map="auto",
)
# Note: from_pretrained does not take a PEFT config; lora_config is handed
# to the trainer via peft_config (see the SFTTrainer example below).
```
Practice
Parameters
Define the PEFT parameters.
```python
from peft import LoraConfig

# TODO: Configure LoRA parameters
# r: rank dimension for LoRA update matrices (smaller = more compression)
rank_dimension = 6
# lora_alpha: scaling factor for LoRA layers (higher = stronger adaptation)
lora_alpha = 8
# lora_dropout: dropout probability for LoRA layers (helps prevent overfitting)
lora_dropout = 0.05

peft_config = LoraConfig(
    r=rank_dimension,  # Rank dimension - typically between 4-32
    lora_alpha=lora_alpha,  # LoRA scaling factor - typically 2x rank
    lora_dropout=lora_dropout,  # Dropout probability for LoRA layers
    bias="none",  # Bias handling ("none", "all", "lora_only"); "none" trains no bias parameters
    target_modules="all-linear",  # Which modules to apply LoRA to
    task_type="CAUSAL_LM",  # Task type for model architecture
)
```
Add the PEFT parameters to the SFTTrainer.
```python
max_seq_length = 1512  # max sequence length for model and packing of the dataset

# Create SFTTrainer with LoRA configuration
trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    peft_config=peft_config,  # LoRA configuration
    max_seq_length=max_seq_length,  # Maximum sequence length
    tokenizer=tokenizer,
    packing=True,  # Enable input packing for efficiency
    dataset_kwargs={
        "add_special_tokens": False,  # Special tokens handled by template
        "append_concat_token": False,  # No additional separator needed
    },
)
```
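From here, training and saving follow the usual Trainer flow. A short sketch, using the trainer and args defined above; this is what produces the adapter-only output directory described earlier:

```python
# Fine-tune: only the LoRA adapter parameters receive gradient updates
trainer.train()

# Save the adapter weights and adapter config (including base_model_name_or_path)
# into args.output_dir; the full base model is not copied.
trainer.save_model()
```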
Load adapters from the Hub
- First, the PeftConfig is loaded;
- then the base model is loaded based on that PeftConfig;
- finally, the LoRA-equipped model is loaded on top of it (sketched below).
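A minimal sketch of those three steps (the adapter repo id is a placeholder):

```python
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

adapter_id = "your-username/your-lora-adapter"  # placeholder Hub repo id

# 1. Load the adapter's PeftConfig from the Hub
config = PeftConfig.from_pretrained(adapter_id)

# 2. Load the base model that the config points to
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# 3. Attach the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(base_model, adapter_id)
```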
The snippet below uses the same get_peft_model API, but with prompt tuning instead of LoRA:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from peft import PeftModel, PeftConfig
from peft import PromptTuningConfig, TaskType
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model
model = AutoModelForCausalLM.from_pretrained("your-base-model")
tokenizer = AutoTokenizer.from_pretrained("your-base-model")

# Configure prompt tuning
peft_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=8,  # Number of trainable virtual tokens
    prompt_tuning_init="TEXT",  # Initialize the virtual tokens from text
    prompt_tuning_init_text="Classify if this text is positive or negative:",
    tokenizer_name_or_path="your-base-model",
)

# Create prompt-tunable model
model = get_peft_model(model, peft_config)
```
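As a quick sanity check (assuming the model above loaded successfully), PEFT can report how few parameters prompt tuning actually trains:

```python
# Only the virtual-token embeddings are trainable; the base model stays frozen
model.print_trainable_parameters()
```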