Investor-overriding guardians (or simply "guardians") are defined as self-appointed individuals empowered to override investor preferences to decide how much profit should be sacrificed for a firm’s social mission. These guardians are "agents without principals," answerable to no one but themselves. The sources identify only three firms that have utilized this unusual governance structure to manage a built-in conflict between profit-seeking and a mission: Ben & Jerry’s, OpenAI, and Anthropic.
The Context of "Ben & Jerry’s Risk"
The term "Ben & Jerry’s risk" refers specifically to the danger that these guardians will cause "double trouble": acting in ways that both materially harm investors and undermine the very mission they were installed to protect.
- Ben & Jerry’s Failure: In 2021, the independent board (guardians) attempted to boycott Israel against the wishes of its parent company, Unilever. This led to a massive backlash, including state divestments and lawsuits, costing Unilever billions in market value. Ultimately, the move backfired: Unilever transferred the Israeli business to a licensee who can now sell the products in perpetuity, achieving the exact opposite of the guardians' goal.
- OpenAI LLC Meltdown: In 2023, OpenAI’s guardians attempted to fire CEO Sam Altman over safety concerns. This nearly destroyed the company, and the ensuing chaos led to the departure of the most safety-minded researchers to start competing ventures, potentially making OpenAI less safe than before.
Insulation and the "Kill Switch"
The sources distinguish between how much power these guardians hold over investors based on their degree of "insulation":
- Fully Insulated Guardians: At Ben & Jerry’s and OpenAI, investors lacked any mechanism to remove the guardians or unwind the arrangement. The sources treat this lack of accountability as a primary driver of the publicized meltdowns at both firms.
- Partly Insulated Guardians (The Anthropic Model): Anthropic is highlighted for including a "kill switch" (or "failsafe"). This allows an unspecified super-majority of investors to unilaterally terminate the guardian arrangement if it becomes too costly or destructive, thereby reducing Ben & Jerry's risk.
AI Corporate Governance and Future Implications
The sources argue that while guardians are intended to protect society from risks like dangerous AI, they are often ineffective or counterproductive. Even OpenAI's 2025 restructuring into a Public Benefit Corporation (PBC) largely preserves the power of its guardians through the OpenAI Foundation, which retains the right to appoint directors and veto actions related to "safety and security".
Unlike mission-driven firms like The Hershey Company or Novo Nordisk—where the mission of the controlling entity is highly aligned with making money for beneficiaries—AI firms like OpenAI and Anthropic have a sharp built-in conflict because their mission of "benefitting humanity" often requires sacrificing lucrative but potentially harmful projects. The authors predict that, given the "fiascos" at Ben & Jerry's and OpenAI, few future firms will deliberately choose to adopt a structure with fully-insulated guardians.
The sources characterize OpenAI’s governance as an example of an "investor-overriding guardian" model, which is defined by a deep, built-in conflict: the firm raises billions from profit-seeking investors but empowers self-appointed individuals to sacrifice those profits for a social mission. This structure is the primary source of what the authors call "Ben & Jerry’s risk": the danger that these unaccountable guardians will cause "double trouble" by harming investors while simultaneously undermining the mission they are supposed to protect.
The 2019 Structure and the 2023 Meltdown
Under its 2019 structure, OpenAI put its operating business into a for-profit subsidiary, OpenAI LLC, which was fully controlled by the directors of OpenAI Nonprofit. These directors acted as "fully insulated" guardians because investors had no "kill switch" or mechanism to remove them.
This risk materialized in November 2023 when the board attempted to fire CEO Sam Altman. The sources argue this event perfectly illustrates the failure of insulated guardians:
- Harm to Investors: The attempted firing nearly wiped out the company’s value as 700 of 770 employees threatened to leave.
- Mission Undermined: Although the board acted out of safety concerns, the fallout resulted in the departure of the firm's most safety-minded researchers to start competing ventures, potentially making OpenAI less safe than before.
The 2025 Structure: Persistence of Risk
In 2025, OpenAI restructured into a Public Benefit Corporation (PBC), removing the previous profit cap. However, the sources argue that Ben & Jerry’s risk remains because the new structure largely replicates the old power dynamics.
- Guardian Control: The "OpenAI Foundation" (the new name for the nonprofit) retains the exclusive right to appoint and replace all directors of the for-profit arm.
- Veto Rights: The Foundation’s Safety and Security Committee (SSC) holds veto power over any "PBC actions relating to safety and security," including the authority to halt the release of AI models.
- Fiduciary Priority: The PBC charter explicitly requires directors to ignore the pecuniary interests of stockholders when dealing with safety and security issues, focusing solely on the mission.
Comparison and Broader AI Context
The sources use OpenAI to highlight a broader trend in AI corporate governance, specifically contrasting it with Anthropic. While Anthropic also uses guardians, it includes a "kill switch" that allows a super-majority of investors to fire them. This makes Anthropic's guardians only "partly insulated," significantly lowering the Ben & Jerry’s risk compared to OpenAI's "fully insulated" model.
Furthermore, the authors dismiss comparisons to mission-driven firms like The Hershey Company or Novo Nordisk. In those cases, the mission (making money for a school or medical research) is highly aligned with commercial success. At OpenAI, the mission of ensuring AI "benefits all of humanity" creates an inevitable and severe conflict because it may require suppressing profitable but potentially harmful technology. The sources conclude that given the "fiascos" at Ben & Jerry’s and OpenAI, few future firms are likely to deliberately choose this fully-insulated guardian structure.
The Anthropic "failsafe" (also referred to in the sources as a "kill switch") is a specific governance mechanism that allows a super-majority of investors to unilaterally terminate the firm's mission-guardian arrangement. This feature is central to the sources' analysis of how AI firms can manage the inherent conflict between profit-seeking and social missions while mitigating the dangers of unaccountable leadership.
Mechanism of the Anthropic Failsafe
The failsafe operates by empowering an unspecified super-majority of Anthropic PBC’s investors to unilaterally amend the Anthropic Long-Term Benefit Trust (LTBT). This is significant because the Trust is the entity that appoints a majority of Anthropic's board members (initially four out of seven).
- Termination Power: Investors can effectively "fire" the guardians and their board appointees without the guardians' consent.
- Dynamic Threshold: The required super-majority to trigger the switch reportedly grows larger over time, which the sources suggest reflects an increasing need for commitment as the company's technology becomes more powerful.
Partly vs. Fully Insulated Guardians
The sources use the Anthropic failsafe to distinguish between two types of "mission guardians":
- Partly Insulated: Because of the kill switch, Anthropic's guardians are only "partly insulated" from investor control.
- Fully Insulated: In contrast, the guardians at Ben & Jerry’s and OpenAI are described as "fully insulated" because investors in those firms lack any mechanism to disarm them or unwind the governance structure.
Mitigating "Ben & Jerry’s Risk"
The primary purpose of the failsafe is to reduce "Ben & Jerry’s risk"—the danger that self-appointed guardians will cause "double trouble" by materially harming investors and simultaneously failing their own mission.
- Deterrence: The mere existence of the switch is intended to deter guardian-appointed directors from "crossing the line" and taking actions that are strongly opposed by a large majority of stockholders.
- Safeguard: It serves as a literal failsafe; if guardians act unreasonably, investors can remove them to protect the firm’s value.
Context of AI Corporate Governance
In the broader landscape of AI governance, the Anthropic model is presented as a more stable alternative to the "fully insulated" model seen at OpenAI.
- OpenAI Contrast: While OpenAI’s 2025 structure requires directors to ignore investor interests in matters of safety and security, Anthropic’s guardians are never explicitly required to ignore profits and can choose to prioritize them.
- Governance Trade-off: The sources note a trade-off: while the kill switch reduces "Ben & Jerry's risk," it simultaneously increases the likelihood that guardians will "do too little" for the mission because their power is contingent on investor approval.
The sources conclude that given the high-profile meltdowns at Ben & Jerry’s and OpenAI, future firms are unlikely to adopt fully-insulated structures, making the Anthropic-style partly-insulated model a more probable template for mission-driven AI governance.
The sources provide a critical analysis of the "investor-overriding guardian" model, concluding that this governance structure is fundamentally flawed because it attempts to manage a deep and potentially unmanageable tension hard-wired into a firm's corporate DNA: raising billions from profit-seeking investors while allowing self-appointed individuals to override those investors to pursue an open-ended social mission.
The Core Analysis: "Doing Too Much"
The standard critique of mission-driven firms is that "guardians" will eventually prioritize profits due to economic pressure (doing "too little" for the mission). However, the sources argue that guardians often "do too much." Because they lack accountability to investors and have no personal financial stake in the firm's success, they may act in destructive ways that cause "double trouble": materially harming the firm’s value while simultaneously undermining the mission they were appointed to protect.
Implications for AI Corporate Governance
The authors analyze the implications of this model specifically for the AI sector:
- The Fallacy of Established Models: AI firms often point to successful mission-driven companies like The Hershey Company or Novo Nordisk for legitimacy. The sources argue this comparison is a false reassurance. In those companies, the mission (funding a school or medical research) is downstream of commercial success; making money is the mission. In contrast, OpenAI and Anthropic have missions that may require sacrificing lucrative projects for the sake of safety, creating a sharp, inevitable conflict that the Hershey/Novo Nordisk models never faced.
- Fiduciary Duty Ineffectiveness: The sources analyze the 2025 Public Benefit Corporation (PBC) structure of OpenAI and find it provides little protection for investors. The charter explicitly allows (or requires) directors to ignore profit in matters of "safety and security". Because the term "safety" can be broadly interpreted, guardians remain effectively unaccountable for decisions that subordinate investor interests to their own view of the mission.
- Regulatory Limits: While regulators might see "in-house" guardians as a way to police AI safety, the sources suggest this is unlikely to yield broad social benefits. If one firm (like OpenAI) throttles a profitable but "unsafe" advance, "unguarded" competitors (like Google, Meta, or international rivals) will simply rush to seize that opportunity, rendering the guardian's sacrifice moot.
Analysis of the "Ben & Jerry’s Risk" Over Time
A key implication of the analysis is that Ben & Jerry’s risk increases over time. When Ben & Jerry's established its independent board in 2000, the arrangement worked for two decades before failing spectacularly in 2021. The authors warn that while an AI guardian arrangement might seem stable today, the identities of the guardians and the political/technological environment will change in unpredictable ways, potentially leading to a "civil war" between investors and mission-keepers decades down the line.
Final Implications and Prediction
The sources conclude with several normative implications:
- For Investors: Purchasing equity in firms with "fully insulated" guardians is a high-risk gamble.
- For Governance Design: The analysis strengthens the case for "partly insulated" guardians—like those at Anthropic—who are protected by a "kill switch" allowing a super-majority of investors to remove them if they become too destructive.
- The Prediction: Given the highly publicized meltdowns at Ben & Jerry’s and OpenAI, the authors predict that few future firms will deliberately choose to adopt a structure with fully-insulated guardians.