[Training issue] After SimPO training, inference output repeats itself (the "parrot" problem) #6572
Replies: 2 comments
Could you share the command you use for inference?
import json
model_name = "/mnt4/ckpt/chonghan.ll/dpo/qwen2.5_72b_fanzhou/full/20250105_212352/checkpoint-148"
model = AutoModelForCausalLM.from_pretrained(
text = ''' '''
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
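To put a number on the repetition in the decoded output, a small helper along these lines may be useful. It is only a sketch: `repeated_ngram_ratio` is a hypothetical name, not part of transformers or llama-factory, and it assumes the generated text is already available as a Python string. It counts character-level n-grams, so it also works on Chinese text.

```python
from collections import Counter


def repeated_ngram_ratio(text: str, n: int = 4) -> float:
    """Fraction of character n-grams in `text` that duplicate an earlier n-gram.

    0.0 means every n-gram is unique; values close to 1.0 mean the text
    consists mostly of the same n-grams repeated.
    (Hypothetical helper for illustration only.)
    """
    # Drop whitespace and work on characters, since Chinese text has
    # no spaces between words.
    chars = [c for c in text if not c.isspace()]
    ngrams = [tuple(chars[i:i + n]) for i in range(len(chars) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence after the first one counts as a duplicate.
    duplicates = sum(c - 1 for c in counts.values())
    return duplicates / len(ngrams)
```

Comparing this ratio for the DPO and SimPO checkpoints on the same prompts would show whether the repetition is consistent or limited to a few inputs.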
Reminder
System Info
Linux, llama-factory (latest version)
Reproduction
DPO training works normally, but the model trained with SimPO repeats itself at inference (the "parrot" problem).
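For context on how the two objectives differ, here is a minimal sketch of the per-pair losses as defined in the DPO and SimPO papers. It is not llama-factory's implementation and does not diagnose this issue. The function names and the default `beta` / `gamma` values are illustrative assumptions. The inputs are summed sequence log-probabilities of the chosen (`w`) and rejected (`l`) responses.

```python
import math


def log_sigmoid(x: float) -> float:
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(pol_w: float, pol_l: float, ref_w: float, ref_l: float,
             beta: float = 0.1) -> float:
    """DPO: compares policy log-probs against a frozen reference model."""
    margin = (pol_w - ref_w) - (pol_l - ref_l)
    return -log_sigmoid(beta * margin)


def simpo_loss(pol_w: float, pol_l: float, len_w: int, len_l: int,
               beta: float = 2.0, gamma: float = 1.0) -> float:
    """SimPO: length-normalized policy log-probs and a target margin gamma.

    There is no reference-model term in this objective.
    """
    margin = beta * (pol_w / len_w - pol_l / len_l) - gamma
    return -log_sigmoid(margin)
```

The visible difference is that SimPO drops the reference-model term and divides each log-probability by the response length, so the two methods are not guaranteed to behave alike under the same hyperparameters.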
Example:
Training command
Others
No response