What is the Matching Principle in machine learning?

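It treats robustness, domain adaptation, invariance, and alignment as one task: estimating the covariance of label-preserving nuisances encountered at deployment. Its central result is that the range of the regularizer applied to an encoder's Jacobian must cover that covariance.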

Why does this framework matter for AI research?

It offers a single geometric view of existing robustness methods. Under this view, whether a regularizer's range covers the deployment nuisance covariance explains why specific regularizers work and gives a criterion for designing new ones.

What do experiments reveal about its limits and future impact?

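The theory's predictions were checked on thirteen pre-registered blocks, ranging from classical machine learning to Qwen2.5-7B, using a new label-free probe metric, TDI. Twelve tests passed; only Office-31 failed, owing to a feature gap, which remains the main open limitation. On the 7B model, matching-style regularization improved selective honesty and preserved style TDI, whereas standard DPO led to degradation.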

The Matching Principle: A Geometric Theory of Loss Functions for Nuisance-Robust Representation Learning

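This paper proposes the "Matching Principle," which unifies scattered problems such as robustness, domain adaptation, invariance, and alignment as a single task: estimating the covariance matrix of label-preserving deployment nuisances. The core contribution is a proof that the range of the regularizer on the encoder's Jacobian must cover this covariance. On the theory side, the paper derives a closed-form optimum and a cube-root water-filling strategy in the linear-Gaussian model, and proves that range coverage is necessary for quadratic Jacobian penalties. On the experimental side, it introduces TDI, a label-free probe metric, and tests the theory's predictions on thirteen pre-registered blocks ranging from classical machine learning to Qwen2.5-7B. The results indicate that methods following the Matching Principle perform well on geometric structure and deployment drift: twelve tests pass, and only Office-31 fails, owing to a feature gap. On the 7B model, matching-style regularization improves selective honesty and preserves style TDI, whereas standard DPO leads to degradation. The work offers a unified geometric perspective for understanding existing robustness methods.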
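
The range-coverage condition for quadratic Jacobian penalties can be illustrated with a toy example. The sketch below is not the paper's code. It assumes a linear encoder f(x) = W x, whose Jacobian is W, and a penalty of the form tr(W M Wᵀ). The matrices `sigma`, `M_matched`, and `M_partial` and the helper `jacobian_penalty` are hypothetical names chosen for illustration. The point is that when the regularizer matrix misses a direction that the nuisance covariance contains, an encoder can be sensitive to that direction at zero penalty.

```python
import numpy as np


def jacobian_penalty(W, M):
    """Quadratic Jacobian penalty tr(W M W^T) for a linear encoder f(x) = W x."""
    return float(np.trace(W @ M @ W.T))


# Assumed toy setup: 3-D inputs, nuisances perturb axes 0 and 1 only.
sigma = np.diag([1.0, 0.5, 0.0])      # hypothetical deployment nuisance covariance
M_matched = sigma.copy()              # regularizer whose range covers range(sigma)
M_partial = np.diag([1.0, 0.0, 0.0])  # regularizer whose range misses axis 1

# An encoder that reads only axis 1, a direction the nuisance does perturb.
W = np.array([[0.0, 2.0, 0.0]])

# Expected squared output shift under a zero-mean nuisance with covariance sigma:
# E||W d||^2 = tr(W sigma W^T).
leak = jacobian_penalty(W, sigma)

pen_matched = jacobian_penalty(W, M_matched)  # penalizes the leak
pen_partial = jacobian_penalty(W, M_partial)  # blind to the leak

print(f"nuisance leak:   {leak:.2f}")
print(f"matched penalty: {pen_matched:.2f}")
print(f"partial penalty: {pen_partial:.2f}")
```

In this toy case the partial regularizer assigns zero cost to an encoder whose output still moves under the nuisance, while the matched one penalizes it in proportion to the leak. The example covers only the linear case and says nothing about the paper's closed-form optimum or its water-filling allocation.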

Sources

arXiv