A new method called DelTA reshapes how reinforcement learning updates propagate to individual tokens during LLM training, raising math reasoning scores by more than 3 points over top baselines. Engineers building reasoning-focused models now have a concrete technique for keeping high-frequency formatting tokens from adding noise to gradient updates.