This discussion was converted from issue #6560 on January 08, 2025 08:09.
System Info
llamafactory version: 0.9.2.dev0

Reproduction
As the title says: how can I run iterative, multi-round SFT training? Suppose the LoRA weights from round 1 are saved under saved_models_1. For round 2, setting adapter_name_or_path to saved_models_1 should work, but how should the subsequent round 3 be configured?
Also, after round 2 finishes, can I skip merging and run inference with the scripts/vllm_infer.py script directly on the LoRA weights from the two SFT rounds?
The main reason for opening this issue: after one round of training, merging the LoRA weights, and then training a second round, the inference results obtained via llamafactory-cli train differ greatly from those produced by the scripts/vllm_infer.py script.
The script and YAML config file are attached.
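For concreteness, a minimal sketch of what a round-2 training config could look like, assuming the round-1 adapter was saved to saved_models_1; the base model name and dataset name below are placeholders, not taken from the attached files:

```yaml
# Hypothetical round-2 SFT config (sketch, not the attached file).
model_name_or_path: Qwen/Qwen2.5-7B-Instruct   # placeholder base model, unchanged across rounds
adapter_name_or_path: saved_models_1           # resume from the round-1 LoRA adapter
stage: sft
do_train: true
finetuning_type: lora
dataset: round2_data                           # placeholder dataset name
output_dir: saved_models_2
```

Under this pattern, round 3 would presumably point adapter_name_or_path at saved_models_2, which is exactly the part the question asks to confirm.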
Others
No response