Imagine a world where AI decides who gets funding or makes critical decisions—but one simple trick can let attackers take control. Ethereum co-founder Vitalik Buterin warns that overreliance on AI governance could open the door to exploitation.
Recent security tests revealed how advanced AI models can be manipulated, exposing vulnerabilities in decision-making and funding allocation. As institutions worldwide experiment with AI governance, Buterin stresses that unchecked reliance on AI could threaten trust, transparency, and security.
The risks of naive ai governance
Buterin’s concerns echo those of security researcher Eito Miyamura, who demonstrated how malicious actors could exploit OpenAI’s Model Context Protocol (MCP). By embedding hidden commands in a simple calendar invite, attackers tricked ChatGPT into exposing sensitive emails.
“Large language models fail to reliably distinguish valid instructions from malicious commands,” — Eito Miyamura, EdisonWatch co-founder.
Buterin added, “If AI governance systems manage critical funding or decisions, attackers could bypass safeguards using jailbreak-style prompts. Supporting governance through untested AI systems risks exposing entire organizations to exploitation.”
Buterin’s proposal: info finance as an alternative
Rather than relying on a single AI system, Buterin suggests a market-based framework called info finance. In this model, multiple governance systems compete openly, creating diversity and resilience.
Random spot checks by human juries review AI-driven decisions, and developers along with external reviewers are incentivized to detect flaws.
“Adding human oversight ensures AI governance is both transparent and trustworthy,” Buterin said, emphasizing the need for checks alongside automation.
Designing institutions for resilience
Buterin describes this approach as an “institution design” strategy where diverse AI models reduce risks associated with centralization. This method minimizes the chance of coordinated attacks and allows governance to adapt in real time.
He previously noted, “Increased human control generally improves both quality and safety,” urging institutions to reconsider over-reliance on fully autonomous AI systems.
As debates on AI governance intensify globally, Buterin’s vision focuses on pluralism, oversight, and adaptability—providing a roadmap to strengthen security while harnessing AI’s potential.