What is the new source attribution framework for LLM deep research agents?

It is an evaluation framework, presented by its authors as the first of its kind, that uses a reproducible AST parser to extract inline citations from LLM-generated Markdown reports and audit them at scale.

Why does reliable citation matter in AI research agents?

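Citations in agent-generated reports cannot currently be verified reliably. Existing approaches either trust the model to cite accurately on its own, which risks bias, or use retrieval-augmented generation (RAG) without checking the sources. Verifying that each source is accessible, relevant to the claim it supports, and factually consistent with that claim is what makes automated research trustworthy.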

What should developers watch for in the coming months?

Expect rapid competitor responses, shifting infrastructure demand, and a market shift from model capability races toward verifiable, commercially ready AI research tools.

Cited but Not Verified: Parsing and Evaluating Source Attribution in LLM Deep Research Agents

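Deep research agents powered by large language models can synthesize information from hundreds of web sources and generate reports with citations, but those citations cannot be reliably verified. Current approaches either rely on the model to cite accurately on its own, which risks bias, or use retrieval-augmented generation (RAG) without verifying that sources are accessible, relevant, and factually consistent. We introduce the first source attribution evaluation framework, which uses a reproducible AST parser to extract and evaluate, at scale, the inline citations in LLM-generated Markdown reports. Unlike approaches that only check whether a URL is reachable, it parses citation structure at the level of the abstract syntax tree and systematically evaluates each citation's accessibility, its relevance to the claim it supports, and its factual consistency.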
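To make the idea of structure-aware citation extraction concrete, here is a minimal Python sketch. It is not the framework's parser: it is a simplified line scanner, written for illustration, that tracks fenced code blocks and inline code spans so that link-like text inside code is not counted as a citation. A real implementation would presumably walk the AST produced by a full Markdown parser. The names `Citation`, `Verdict`, `extract_citations`, and `pass_rates` are hypothetical, and the three boolean fields simply mirror the three criteria named above (accessibility, relevance, factual consistency).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Citation:
    text: str   # anchor text between [ and ]
    url: str    # link target between ( and )
    line: int   # 1-based line number in the report
    claim: str  # the prose line that carries the citation


@dataclass
class Verdict:
    # One flag per criterion; how each flag gets set (HTTP check,
    # human or model judgment) is outside this sketch.
    accessible: bool
    relevant: bool
    consistent: bool


def extract_citations(markdown: str) -> List[Citation]:
    """Collect inline [text](url) links that appear in prose, not in code."""
    citations: List[Citation] = []
    in_fence = False
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.lstrip()
        # A fence marker toggles code-block state; fenced lines are skipped.
        if stripped.startswith("```") or stripped.startswith("~~~"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        i = 0
        in_code = False
        while i < len(line):
            ch = line[i]
            if ch == "`":  # a backtick toggles inline-code state
                in_code = not in_code
                i += 1
                continue
            is_image = i > 0 and line[i - 1] == "!"
            if in_code or ch != "[" or is_image:
                i += 1
                continue
            close = line.find("]", i + 1)
            # Only "[...]" immediately followed by "(" is an inline link.
            if close == -1 or line[close + 1:close + 2] != "(":
                i += 1
                continue
            end = line.find(")", close + 2)
            if end == -1:
                i += 1
                continue
            citations.append(
                Citation(
                    text=line[i + 1:close],
                    url=line[close + 2:end].strip(),
                    line=lineno,
                    claim=line.strip(),
                )
            )
            i = end + 1
    return citations


def pass_rates(verdicts: List[Verdict]) -> Dict[str, float]:
    """Fraction of citations passing each criterion (empty dict if none)."""
    n = len(verdicts)
    if n == 0:
        return {}
    return {
        "accessible": sum(v.accessible for v in verdicts) / n,
        "relevant": sum(v.relevant for v in verdicts) / n,
        "consistent": sum(v.consistent for v in verdicts) / n,
    }
```

The sketch ignores nested brackets, reference-style links, and footnotes, which is part of why a real AST parser is preferable to ad hoc scanning. Keeping the citing line alongside each URL matters because relevance and factual consistency are judged against the claim, not against the URL alone.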

Sources

arXiv