ZO-Act is a zeroth-order fine-tuning method that avoids backpropagation. It uses input activations to build low-rank subspaces, then optimizes only a lightweight coefficient matrix through forward-pass loss evaluations, which substantially reduces memory and compute overhead.

Why does ZO-Act matter?

The method is compatible with momentum optimizers such as Adam, natively supports quantized models, and offers an efficient path to real-time model adaptation on edge devices and other resource-constrained hardware, where the cost of a backward pass is prohibitive.

What should we watch next?

As zeroth-order optimization theory matures and hardware accelerators advance, ZO-Act could become a standard option for efficient LLM fine-tuning and help bring fine-tuned models to a wider range of real-world deployments.

ZO-Act: An Efficient Activation-Informed Zeroth-Order Fine-Tuning Method

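This paper proposes ZO-Act, an efficient zeroth-order fine-tuning method for optimizing large language models when backpropagation is unavailable or GPU memory is limited. Existing zeroth-order methods typically perturb the full weights or a random subspace, which leads to high-variance gradient estimates and limited performance. ZO-Act instead uses input activations to construct a low-rank subspace: the activation basis is computed once at initialization, and only a lightweight coefficient matrix is optimized afterwards. Optimization relies on loss evaluations from forward passes, which substantially reduces the effective perturbation dimension, makes the optimized variables compatible with momentum optimizers such as Adam, and naturally supports fine-tuning of quantized models. Experiments on Llama-3-8B, OPT-13B, and their INT4-quantized versions show that ZO-Act significantly outperforms strong existing baselines on language understanding, question answering, and commonsense reasoning tasks, demonstrating its potential for fine-tuning large models in resource-constrained settings.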
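The general idea can be illustrated with a minimal NumPy sketch on a toy linear layer. This is not the paper's implementation. It assumes, for illustration only, that the activation basis is taken from the top right singular vectors of the input activations, that the weight update has the form `W0 + C @ U.T`, and that the gradient with respect to `C` is estimated with a standard two-point (SPSA-style) finite difference. The names `activation_basis`, `zo_gradient`, `fit_coefficients`, and `toy_problem` are hypothetical.

```python
import numpy as np

def activation_basis(X, rank):
    """Top-`rank` right singular vectors of activations X (n_samples x d_in)."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:rank].T  # shape (d_in, rank), orthonormal columns

def zo_gradient(loss_fn, C, eps, rng):
    """Two-point estimate of dLoss/dC from two forward evaluations only."""
    Z = rng.standard_normal(C.shape)
    delta = loss_fn(C + eps * Z) - loss_fn(C - eps * Z)
    return (delta / (2.0 * eps)) * Z

def fit_coefficients(W0, X, Y, rank=4, steps=400, lr=0.05, eps=1e-3, seed=0):
    """Optimize only the coefficient matrix C; W0 and the basis U stay fixed."""
    rng = np.random.default_rng(seed)
    U = activation_basis(X, rank)        # computed once, at initialization
    C = np.zeros((W0.shape[0], rank))    # the only trainable variables

    def loss_fn(C_):
        W = W0 + C_ @ U.T                # assumed form of the low-rank update
        return float(np.mean((X @ W.T - Y) ** 2))

    # Plain Adam applied to the zeroth-order gradient estimate.
    m, v = np.zeros_like(C), np.zeros_like(C)
    b1, b2, tiny = 0.9, 0.999, 1e-8
    history = [loss_fn(C)]
    for t in range(1, steps + 1):
        g = zo_gradient(loss_fn, C, eps, rng)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        C -= lr * m_hat / (np.sqrt(v_hat) + tiny)
        history.append(loss_fn(C))
    return C, U, history

def toy_problem(seed=1, n=64, d_in=16, d_out=3, rank=4):
    """Synthetic regression task whose inputs lie in a rank-`rank` subspace."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, rank)) @ rng.standard_normal((rank, d_in))
    W0 = rng.standard_normal((d_out, d_in))
    W_target = W0 + 0.5 * rng.standard_normal((d_out, d_in))
    return W0, X, X @ W_target.T

if __name__ == "__main__":
    W0, X, Y = toy_problem()
    C, U, history = fit_coefficients(W0, X, Y)
    print(f"trainable values: {C.size} (full weight matrix: {W0.size})")
    print(f"loss: {history[0]:.4f} -> {history[-1]:.4f}")
```

In this toy setup the perturbation acts on `d_out * rank` coefficients rather than on all `d_out * d_in` weights, which is the sense in which the effective perturbation dimension shrinks. Because `W0` is only read during the forward pass, the same loop would in principle work when `W0` is stored in a quantized format and dequantized on the fly, though that is not shown here.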

Sources

arXiv