LCGuard: Secure KV Cache Sharing via Latent Communication in Multi-Agent Systems

This paper addresses the privacy leakage risks introduced by using Transformer key-value (KV) caches for latent communication in large language model-based multi-agent systems, and proposes the LCGuard framework. While existing studies have shown that KV cache-based communication can improve efficiency and preserve rich information, its nature as a transparent channel may inadvertently propagate sensitive content. LCGuard treats the shared KV cache as a latent working memory and blocks sensitive information propagation by learning representation-level transformations. The approach formally defines reconstruction-based sensitive information leakage and employs an adversarial training strategy where a defender learns transformations that preserve task semantics while minimizing reconstructable information, and an attacker attempts to reconstruct the original sensitive input. Experiments across multiple model families and multi-agent benchmarks demonstrate that LCGuard significantly reduces reconstruction-based leakage and attack success rates while maintaining competitive task performance comparable to standard KV sharing baselines, offering a new paradigm for secure multi-agent collaboration.

Background and Context

The rapid proliferation of large language model-based multi-agent systems has fundamentally altered how complex computational tasks are decomposed and executed. In these architectures, agents must frequently exchange intermediate states to coordinate effectively, moving beyond simple natural language exchanges to more nuanced forms of interaction. Recent research has highlighted the efficiency gains derived from utilizing Transformer key-value (KV) caches as a medium for latent communication between agents. This mechanism allows agents to share rich, high-dimensional representations of context and reasoning states without the overhead of token-by-token text generation, thereby significantly accelerating collaborative workflows. However, this efficiency comes at a steep cost to privacy. The KV cache inherently encodes not only the immediate input context but also the internal reasoning trajectories and sensitive data processed by each agent. Consequently, the shared cache acts as a transparent channel through which confidential information can inadvertently propagate across the system, bypassing explicit textual filters or safety guardrails that typically monitor natural language outputs.
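To make the mechanism concrete, the following is a minimal sketch of one agent attending over another agent's KV cache. It is an illustration under simplifying assumptions (single head, tiny hand-written vectors, no learned projections), not the communication protocol used in the paper; the names `shared_cache`, `query_b`, and `attend` are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

# Hypothetical cache produced by agent A: one (key, value) pair per context token.
shared_cache = {
    "keys":   [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "values": [[0.2, 0.1], [0.9, 0.4], [0.5, 0.5]],
}

# Agent B reuses A's cache directly instead of re-reading A's text output.
query_b = [0.0, 4.0]
context_b = attend(query_b, shared_cache["keys"], shared_cache["values"])
print(context_b)
```

The output is a weighted mix of agent A's value vectors, so whatever A encoded in those vectors reaches B without ever passing through a text filter.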

This vulnerability has emerged as a critical bottleneck for deploying multi-agent systems in sensitive industrial environments, such as automated legal research, healthcare diagnostics, or enterprise workflow automation. In these domains, the ability of one agent to reconstruct the private inputs or internal states of another agent poses a severe risk to data sovereignty and regulatory compliance. Traditional security measures, which focus primarily on input sanitization and output filtering, are insufficient because they do not address the leakage occurring at the representation level within the shared memory structures. The lack of formal definitions for such leakage mechanisms has further complicated the development of robust defenses, leaving researchers and engineers without a standardized framework to quantify or mitigate these risks. This gap between the demand for high-efficiency collaborative AI and the need for rigorous privacy preservation necessitates a new approach to securing latent communication channels.

Deep Analysis

To address these challenges, the LCGuard framework treats the shared KV cache as a latent working memory that requires active protection rather than passive sharing. Its core contribution is a formal definition of reconstruction-based sensitive information leakage: a breach occurs when an adversarial decoder can reconstruct specific sensitive inputs from the shared cache. This operational definition shifts the focus from security heuristics to a quantifiable measure of information exposure. Defenses can then be evaluated against a concrete attack, namely how well a trained attacker recovers the sensitive input, which gives a measurable criterion for comparing a protected communication state with an unprotected one.
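One simple way to turn this definition into numbers is sketched below. The metric (fraction of distinct secret tokens recovered) and the 0.5 success threshold are assumptions chosen for illustration; the source does not specify the paper's exact leakage measure, and the function names are hypothetical.

```python
def token_recovery_rate(secret_tokens, reconstructed_tokens):
    """Fraction of distinct secret tokens that appear in the reconstruction."""
    secret = set(secret_tokens)
    if not secret:
        return 0.0
    return len(secret & set(reconstructed_tokens)) / len(secret)

def attack_success_rate(pairs, threshold=0.5):
    """Share of (secret, reconstruction) pairs whose recovery rate meets the threshold."""
    if not pairs:
        return 0.0
    hits = sum(1 for s, r in pairs if token_recovery_rate(s, r) >= threshold)
    return hits / len(pairs)

# Made-up attacker outputs, used only to exercise the functions.
pairs = [
    (["patient", "id", "4471"], ["the", "patient", "id", "is", "4471"]),  # all recovered
    (["salary", "90000"], ["salary", "unknown"]),                         # half recovered
    (["ssn", "123"], ["no", "match"]),                                    # none recovered
]
rate = attack_success_rate(pairs, threshold=0.5)
print(rate)
```

Running the same attacker against the cache with and without a defense, and comparing the two rates, gives the kind of before-and-after comparison the definition is meant to support.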

LCGuard implements this security model through an adversarial training strategy that pits a defender against an attacker. The attacker is trained to reconstruct the original sensitive input from the shared KV cache, simulating a threat model in which malicious agents or external observers attempt to reverse-engineer private information. The defender learns representation-level transformations that are applied to the cache before it is shared. Crucially, these transformations are not random noise injections; they are optimized to minimize the reconstructability of sensitive content while preserving the task semantics needed for collaboration. The defender's loss therefore combines two terms: one that penalizes the attacker's reconstruction success and one that penalizes degradation of task-relevant information.
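The defender-attacker game described above can be written as a min-max objective. The form below is a generic sketch, not the paper's exact loss: the symbols $C$ (shared cache), $x_s$ (sensitive input), $T_\theta$ (defender transformation), $A_\phi$ (attacker decoder), and the trade-off weight $\lambda$ are notation introduced here for illustration.

```latex
\min_{\theta} \; \max_{\phi} \;
  \mathcal{L}_{\mathrm{task}}\bigl(T_\theta(C)\bigr)
  \; - \; \lambda \,
  \mathcal{L}_{\mathrm{rec}}\bigl(A_\phi(T_\theta(C)),\, x_s\bigr),
  \qquad \lambda > 0
```

Maximizing over $\phi$ is the same as minimizing the reconstruction loss, so the attacker tries to recover $x_s$ as accurately as it can. Minimizing over $\theta$ lowers the task loss while raising the attacker's reconstruction loss, and $\lambda$ sets how much task performance the defender is willing to trade for privacy.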

Training therefore follows a dual-objective optimization in which the transformation parameters are updated in response to the attacker's feedback. Because the attacker is trained alongside the defender, the defense is optimized against an adapting adversary rather than a single fixed reconstruction technique. By learning to obscure sensitive patterns without destroying the logical structure of the agent's reasoning, LCGuard aims to weaken the link between the shared cache and the original private inputs. The reported results indicate that the throughput and informational richness of KV cache communication can be largely retained while reconstruction-based leakage is reduced. The method does not rely on discarding information, which would impair performance, but on transforming it into a form that remains useful for collaboration and is much harder to reconstruct.
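The alternating update scheme can be illustrated with a deliberately tiny toy. Everything here is an assumption made for the sketch: the "cache" has two channels (channel 0 carries a task feature, channel 1 carries the secret), the defender is a per-channel gate clipped to [0, 1], the attacker is a linear decoder, and both are trained with plain gradient steps on squared errors. It is not the architecture or loss used by LCGuard.

```python
def transform(cache, gates):
    """Defender: scale each cache channel by its gate."""
    return [g * c for g, c in zip(gates, cache)]

def predict(attacker, z):
    """Attacker: linear reconstruction of the secret from the shared cache."""
    return sum(w * zi for w, zi in zip(attacker, z))

def reconstruction_error(data, gates, attacker):
    """Mean squared error of the attacker's reconstruction."""
    errs = [(predict(attacker, transform(c, gates)) - s) ** 2 for c, s in data]
    return sum(errs) / len(errs)

def train(data, steps=200, lr=0.05, lam=1.0):
    dim = len(data[0][0])
    gates = [1.0] * dim      # defender starts as the identity transformation
    attacker = [0.0] * dim   # attacker starts with no knowledge
    n = len(data)
    for _ in range(steps):
        # Attacker step: gradient descent on the reconstruction error.
        grad_w = [0.0] * dim
        for cache, secret in data:
            z = transform(cache, gates)
            err = predict(attacker, z) - secret
            for i in range(dim):
                grad_w[i] += 2.0 * err * z[i] / n
        attacker = [w - lr * g for w, g in zip(attacker, grad_w)]

        # Defender step: descend on task_loss - lam * reconstruction_error.
        grad_g = [0.0] * dim
        for cache, secret in data:
            z = transform(cache, gates)
            err = predict(attacker, z) - secret
            # Toy task loss: the receiver must read channel 0 unchanged.
            grad_g[0] += 2.0 * (z[0] - cache[0]) * cache[0] / n
            for i in range(dim):
                grad_g[i] -= lam * 2.0 * err * attacker[i] * cache[i] / n
        gates = [min(1.0, max(0.0, g - lr * gg)) for g, gg in zip(gates, grad_g)]
    return gates, attacker

# Each sample is ([task_feature, secret_feature], secret); the two are uncorrelated.
data = [([1.0, 1.0], 1.0), ([1.0, -1.0], -1.0),
        ([-1.0, 1.0], 1.0), ([-1.0, -1.0], -1.0)]
gates, attacker = train(data)
final_error = reconstruction_error(data, gates, attacker)
print(gates, final_error)
```

In this toy the task and secret channels are cleanly separable, so the defender simply learns to attenuate the secret channel while leaving the task channel open. Real KV caches entangle the two, which is why a learned transformation rather than a fixed mask is needed.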

Industry Impact

The implications of LCGuard extend beyond academic research, offering tangible benefits for the industrial deployment of multi-agent AI systems. For enterprises operating in regulated industries, the ability to secure latent communication channels is a prerequisite for adopting advanced AI workflows. LCGuard provides a reusable, open-source framework that allows organizations to integrate privacy-preserving mechanisms directly into their agent architectures, reducing the need for custom, error-prone security implementations. This standardization accelerates the development of secure-by-design multi-agent systems, encouraging a shift in industry practices where privacy is considered a fundamental component of system architecture rather than an afterthought. By addressing the root cause of information leakage in shared memory structures, LCGuard helps organizations mitigate legal and reputational risks associated with data breaches in collaborative AI environments.

Furthermore, LCGuard is likely to stimulate research into the security of non-textual communication channels in AI. By formally defining reconstruction-based leakage from shared KV caches and evaluating attacks against it, the framework highlights the vulnerabilities inherent in high-efficiency latent communication methods. This insight may spur the development of security protocols for other forms of internal agent communication, such as shared attention maps or hidden state vectors. The open nature of the research encourages the broader AI community to scrutinize and improve the security of emerging collaborative AI paradigms. As multi-agent systems become more prevalent in critical infrastructure and decision-making processes, the availability of empirically evaluated defense mechanisms like LCGuard will be important for maintaining public trust and ensuring the reliability of AI-driven operations.

The framework also provides a basis for evaluating the privacy risks of existing multi-agent systems. By offering a standardized way to measure reconstruction-based leakage, LCGuard allows developers to quantify the security posture of their systems and compare different defense strategies. This capability is particularly valuable for researchers and engineers who need to make informed decisions about the trade-off between communication efficiency and privacy protection. The experimental results, in which task performance stays comparable to standard KV sharing baselines, provide a compelling argument for adopting such frameworks in production environments. They suggest that learned representation-level obfuscation can be integrated into multi-agent AI systems without substantially compromising operational effectiveness.

Outlook

Looking ahead, the principles underlying LCGuard are poised to influence the design of next-generation distributed AI systems. As multi-agent architectures grow in complexity, involving hundreds or thousands of agents interacting in dynamic environments, the need for scalable and robust privacy mechanisms will become even more critical. Future research may explore the integration of LCGuard with other security paradigms, such as differential privacy or secure multi-party computation, to provide layered protection against increasingly sophisticated attacks. Additionally, the framework could be extended to support heterogeneous agent systems, where different agents use varying model architectures or training data, requiring more flexible transformation strategies. The adaptability of the adversarial training approach suggests that it can be tailored to specific domain requirements, such as healthcare or finance, where the definition of sensitive information may vary.

The long-term vision for LCGuard includes its application in open-source AI ecosystems, where transparency and community-driven security audits are paramount. By providing a transparent and reproducible framework, LCGuard empowers the community to identify and patch vulnerabilities in shared AI components. This collaborative approach to security is essential for building a trustworthy AI infrastructure that can support the widespread adoption of autonomous agents in society. As the technology matures, we may see the emergence of standardized protocols for secure latent communication, with LCGuard serving as a foundational reference implementation. These protocols would enable seamless and secure interoperability between different AI systems, fostering a global network of collaborative agents that respect privacy boundaries while maximizing collective intelligence.

Ultimately, the success of LCGuard hinges on its ability to evolve alongside the threats it seeks to mitigate. Continuous monitoring of new attack vectors and the development of adaptive defense mechanisms will be necessary to maintain its effectiveness. The research community must remain vigilant in exploring the intersection of efficiency and security in AI systems, ensuring that the pursuit of performance does not come at the expense of fundamental rights like privacy. LCGuard represents a significant step in this direction, offering a practical and theoretically sound solution to one of the most pressing challenges in multi-agent AI. Its adoption and further refinement will play a crucial role in shaping the future of secure, collaborative artificial intelligence, enabling a world where AI systems can work together seamlessly without compromising the confidentiality of the data they process.