What is SHERLOC and how does it solve code repair localization bottlenecks?

SHERLOC is a training-free diagnostic localization framework for code repair. It combines structured, hypothesis-driven exploration with self-recovery mechanisms so that a reasoning LLM can locate bugs precisely and produce the diagnostic explanations a repair step needs.

What is the practical impact of SHERLOC on code repair?

Injecting SHERLOC's localization results into a repair agent raises the SWE-Bench Verified resolution rate by 5.95 percentage points on average, while reducing localization token usage by 36.7% and total token usage by 23.1%.

What innovations does SHERLOC bring compared to traditional methods?

SHERLOC needs no training, fine-tuning, or multi-agent orchestration. At roughly 30B parameters it reaches 84.33% accuracy@1 on SWE-Bench Lite, which suggests that well-designed reasoning and tooling can outperform most agent-based approaches.

SHERLOC: A Training-Free Code Repair Agent Built on a Structured Diagnostic Localization Framework

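When solving repository-level coding tasks, large language model agents often waste more than half of their compute budget on fault localization. Existing localization frameworks are mostly reduced to file retrieval and lack the diagnostic context that repair requires. This paper proposes SHERLOC, a structured diagnostic localization framework that requires no training, no fine-tuning, and no multi-agent coordination. The framework pairs a reasoning LLM with a compact set of repository tools and a self-recovery mechanism. It reaches 84.33% accuracy@1 on SWE-Bench Lite and 81.27% recall@1 on SWE-Bench Verified, surpassing most agent-based methods at a scale of roughly 30B parameters. Injecting its localization results into a repair agent raises the SWE-Bench Verified resolution rate by 5.95 percentage points on average, while lowering localization and total token consumption by 36.7% and 23.1% respectively, which markedly improves both the efficiency and the accuracy of code repair.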
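The summary does not define the accuracy@1 and recall@1 metrics it quotes. As a rough illustration only, the sketch below uses one common convention from the fault-localization literature: accuracy@k counts an instance as correct when every ground-truth location appears in the top k predictions, and recall@k is the fraction of ground-truth locations found in the top k. The function names and this choice of definitions are assumptions made here, and the paper's exact definitions may differ.

```python
from typing import Sequence, Set


def accuracy_at_k(ranked: Sequence[str], gold: Set[str], k: int = 1) -> bool:
    """Assumed definition: True if every gold location is in the top-k predictions."""
    return gold.issubset(ranked[:k])


def recall_at_k(ranked: Sequence[str], gold: Set[str], k: int = 1) -> float:
    """Assumed definition: fraction of gold locations found in the top-k predictions."""
    if not gold:
        raise ValueError("gold must contain at least one location")
    found = gold.intersection(ranked[:k])
    return len(found) / len(gold)


# Hypothetical file names, used purely for illustration.
predictions = ["pkg/parser.py", "pkg/utils.py"]
single_file_bug = {"pkg/parser.py"}
two_file_bug = {"pkg/parser.py", "pkg/io.py"}
```

Under these assumed definitions, a single-file bug whose file is ranked first scores on both metrics, while a two-file bug with only one file ranked first gets a recall@1 of 0.5 and fails accuracy@1.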

Sources

arXiv