May 26, 2026

framework

LlamaFactory v0.9.4 Raises the Floor on Fine-Tuning Infrastructure

LlamaFactory v0.9.4 drops Python 3.9 and 3.10, migrates to uv, and ships OFT, Megatron-LM, KTransformers, and over 20 new model integrations. Here is what changes for teams running fine-tuning pipelines today.

LlamaFactory v0.9.4 ships with several breaking changes that affect every team running the framework in production. Act on these before upgrading.

First, the repository itself is renamed. It was LLaMA-Factory; it is now LlamaFactory. Update any internal docs, CI scripts, or install references accordingly.
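One way to find leftover references is a small scan over docs and CI files. The sketch below is illustrative only: the helper names, the file suffix list, and the case-insensitive match are assumptions, not part of LlamaFactory.

```python
from pathlib import Path

OLD_NAME = "LLaMA-Factory"  # previous repository name, per the release notes
# Hypothetical set of file types worth checking; adjust to your repository.
SUFFIXES = (".md", ".yml", ".yaml", ".sh", ".txt", ".toml")


def find_old_name(text: str) -> list[int]:
    """Return 1-based line numbers that mention the old name (case-insensitive)."""
    needle = OLD_NAME.lower()
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line.lower()
    ]


def scan_tree(root: str) -> dict[str, list[int]]:
    """Map each matching file under root to the line numbers that need review."""
    hits: dict[str, list[int]] = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if path.suffix not in SUFFIXES and path.name != "Dockerfile":
            continue
        lines = find_old_name(path.read_text(encoding="utf-8", errors="ignore"))
        if lines:
            hits[str(path)] = lines
    return hits
```

Calling scan_tree(".") from a repository root returns a dictionary of files and line numbers to review by hand. The match is deliberately loose, so it also flags lowercase directory names such as llama-factory.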

Second, Python 3.9 and 3.10 are no longer supported. The framework now requires Python 3.11 to 3.13. If your training environment pins an older interpreter, upgrade it before moving to this release.
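A quick interpreter check can run at the top of a setup script or CI job. This is a minimal sketch that assumes the supported range is exactly 3.11 through 3.13 as stated above; the function and constant names are hypothetical.

```python
import sys

# Supported range as described in this article (inclusive on both ends).
MIN_SUPPORTED = (3, 11)
MAX_SUPPORTED = (3, 13)


def is_supported(version: tuple[int, int]) -> bool:
    """Return True if a (major, minor) version falls inside the supported range."""
    return MIN_SUPPORTED <= version <= MAX_SUPPORTED


current = sys.version_info[:2]
print(f"Python {current[0]}.{current[1]} supported: {is_supported(current)}")
```

In a real pipeline you would likely fail the job when the check returns False rather than only printing the result.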

Third, the project migrates from pip to uv. The new install command is uv pip install llamafactory. Swap this into your Dockerfiles and environment setup scripts now.
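If you have many Dockerfiles or setup scripts, the swap can be automated. The sketch below is one possible approach, not a tool shipped with LlamaFactory: the function name and the regular expression are assumptions, and it deliberately leaves python -m pip invocations alone so they can be reviewed manually.

```python
import re

# Matches "pip install" or "pip3 install" unless it is already prefixed by
# "uv " or invoked as "python -m pip".
_PIP_INSTALL = re.compile(r"(?<!uv )(?<!-m )\bpip3? install\b")


def use_uv(line: str) -> str:
    """Rewrite a plain pip install command to go through uv; leave other lines unchanged."""
    return _PIP_INSTALL.sub("uv pip install", line)


for example in (
    "RUN pip install llamafactory",
    "RUN uv pip install llamafactory",
    "RUN python -m pip install llamafactory",
):
    print(use_uv(example))
```

The rewrite assumes uv is already installed in the image or environment. Note also that uv pip install expects an active virtual environment by default, so container builds that install into the system interpreter typically need the --system flag.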

What is new on the technique side

The headline addition is Orthogonal Fine-Tuning (OFT), a parameter-efficient method that joins the existing adapter toolkit. It arrives alongside Semantic Initialization for newly added tokens, which should matter to anyone extending a model's vocabulary for domain-specific tasks.

For teams training at scale, Megatron-LM support lands via MCoreAdapter, and DeepSpeed AutoTP is also now supported. Both target distributed training setups where a single device is not enough.

FP8 training is now supported, opening the door to lower-precision training runs on hardware that can handle it.

The KTransformers backend is now available as an inference option, contributed by the KTransformers team.

The framework also picks up support for Transformers v5, TRL 0.24, the MPO algorithm, efficient NPU fused kernels, and function call messages that carry reasoning and plain text.

Model coverage expands broadly

This release adds integrations for a large number of models, including Falcon H1, Kimi-VL, GLM-4.5V, Gemma3n, Granite4, Qwen3-2507, MiniCPM-V 4.0, Qwen3-VL, Qwen3-Omni, ERNIE-4.5-Text, ERNIE-4.5-VL, InternVL-3.5, MiniMax-M1, MiniMax-M2, and others. If your team has been waiting for any of these to land in a fine-tuning framework, they are here.

The project also launches an official blog at blog.llamafactory.net, which is worth bookmarking for release notes and technique writeups going forward.

What to do today

Audit your environment before upgrading. Pin Python to a version from 3.11 through 3.13, replace your pip install commands with uv pip install llamafactory, and update any CI references to the old repository name. If you are running vocabulary-extended models, test Semantic Initialization on a small experiment first. If you are training on NPUs or planning a distributed run, the new fused kernels and Megatron-LM support are worth evaluating on your next job.