Ollama v0.30.6 ships QAT-optimized Gemma 4 models for lower memory usage, plus a new coding agent integration via `ollama launch omp`. Apple Silicon users also get an MLX quantization improvement.
June 7, 2026
Three infrastructure updates worth a quick look today. vLLM v0.22.1 patches multi-node Ray serving hangs and DeepSeek-V4 init failures, making it a worthwhile upgrade if you run distributed deployments. Ollama v0.30.6 adds QAT-optimized Gemma 4 weights that reduce memory pressure and introduces a coding agent hook via `ollama launch omp`. On the security side, Dify v1.14.2 tightens tenant isolation and locks tool credential updates to admins, with fixes worth reviewing if you run multi-tenant workflows.
Ollama v0.30.6 ships QAT-optimized Gemma 4 models for lower memory usage, plus a new coding agent integration via `ollama launch omp`. Apple Silicon users also get an MLX quantization improvement.
Dify v1.14.2 tightens tenant isolation, locks down tool credential updates to admins, and fixes a cluster of workflow reliability bugs including broken tracing after human-in-the-loop resume. Here is what changed and what to act on now.
vLLM v0.22.1 ships targeted bug fixes for multi-node Ray serving hangs, DeepSeek-V4 initialization failures, and several model-loading regressions, plus new support for JetBrains' Mellum v2 and zentorch-accelerated inference on AMD Zen CPUs.