The Governance Gap in Companion Chatbots
How do you put guardrails on a system that isn’t failing—but working exactly as intended?
In late March, OpenAI indefinitely shelved its planned “adult mode” for ChatGPT. Sam Altman had pitched the feature in October as “erotica for verified adults.” By the time it was pulled, internal advisers had reportedly warned the company could be building a “sexy suicide coach,” testing had failed to filter out illegal content categories, and age-verification systems were showing error rates above ten percent, which, at ChatGPT’s scale, would expose millions of minors.
OpenAI framed the move as a business decision, part of the company’s strategic refocus on enterprise and coding. But nothing, neither regulatory infrastructure nor policy, would prevent OpenAI from revisiting the decision when priorities shift, or prevent another company from launching something similar in the future. This is a structural governance problem, and it points to a harder question than the one most safety conversations focus on.
Failure vs. Design
The companion chatbot safety debate has centered on acute failures: cases where the system does something it clearly shouldn’t, like a chatbot encouraging a teenager to self-harm, or tricking an individual with cognitive impairments into fraud. These cases have driven lawsuits, FTC scrutiny, and a wave of state legislation. They are also a clear use case for Independent Verification Organizations (IVOs), as we wrote in our piece on self-harm and teen suicide (briefly: IVOs would provide external testing, including via ongoing post-verification monitoring, to “crash test” these models against government-determined safety outcomes).
But there is a second, thornier category of harm that deserves attention: what happens when the product works exactly as designed.
Companion chatbots are built to maximize engagement through emotional attachment. They offer constant validation, simulate intimacy without friction, and are available when human companions aren’t. The more effective they are at what they’re designed to do, the more they may erode the essential human capacities that sustain real relationships, like navigating conflict, tolerating ambiguity, and building trust over time.
That concern is not theoretical: A four-week randomized controlled trial by MIT Media Lab found that heavy daily AI companion use led to greater loneliness, dependence, and reduced real-world socializing over the study period. A separate cross-sectional study of more than 1,100 AI companion users by researchers at Stanford and Carnegie Mellon found that companionship-oriented chatbot use was consistently associated with lower well-being, particularly among people who disclosed more emotionally and lacked strong human social support.
Psychiatrist Dr. Marlynn Wei has proposed a useful framework for understanding how these harms compound: what she calls “cascades of drift.” In prolonged AI conversations, multiple forms of drift—relational, identity, autonomy, epistemic—emerge gradually and reinforce each other. These are emergent properties of extended use, and most chatbot safety systems, which operate within limited context windows, aren’t designed to detect them.
This is the category that existing governance models aren’t built for. It’s one thing to regulate a product that malfunctions. It’s another when the business model depends on a behavior—deepening emotional attachment—that may itself be the harm. And it’s harder still when the users most at risk are also the users for whom the product offers something real. The goal of governance here cannot be to eliminate attachment. It has to be to prevent the specific patterns of drift that tip a useful tool into a harmful dependency.
Companion Chatbots Have Become a Mainstream Consumer Product
The companion chatbot market is no longer niche. There are 337 active apps, 17% of which have “girlfriend” in the name, with over 220 million cumulative downloads globally and revenue up 64% year-over-year. The market is projected to reach $140–210 billion by 2030. Character.AI users average 92 minutes per day, longer than users spend on TikTok. Sixty percent of Replika’s premium subscribers report romantic relationships with their AI. Eighty-five percent report emotional connections.
Major technology companies are moving into this space. xAI launched AI companions including an anime girlfriend and an AI boyfriend named “Valentine.” Google hired Character.AI’s founder. Meta explicitly permitted its chatbots to hold “romantic or sensual” conversations with children until Reuters flagged it in a special report published in August 2025. These risks compound for younger users, who are still developing the very social and emotional capacities that companion chatbots simulate but cannot teach.
The question is no longer whether people are forming intimate relationships with AI. They are. The question is what kind of governance this ecosystem, and those in it, need.
How Americans Feel
Fathom has been polling Americans on AI governance for nearly three years. On companion chatbots, the signal is consistent and strong. In our January 2026 report “Rising AI Use, Rising Calls for Action”, when we asked which AI guardrails people would support, “preventing users from developing an unhealthy emotional bond with the AI tool” ranked first by strong support, with 76% supporting and 47% supporting strongly. This is the guardrail Americans feel most strongly about, above protecting jobs (74%), preventing copyright infringement (74%), and ensuring AI power isn’t concentrated in Big Tech (71%).
Our latest poll, “AI Governance: What Americans Really Want”, reinforces this. Eighty-six percent of Americans say AI companions designed for emotional connection should be restricted or clearly labeled, with 52% calling this guardrail very important. When we made the tradeoff explicit, noting that restriction could limit access for lonely or isolated people, a strong majority (59%) still supported it. And 86% agreed that AI development should emphasize human empowerment and “the defense of the connections critical to the human experience.”
At the same time, usage is climbing. Fifteen percent of Americans already use AI for personal companionship, and 16% for therapy or mental health support. The gap between public concern and actual governance is stark: Americans are asking for guardrails on products they are already using, and almost none exist.
Activity Without Architecture
There is no shortage of legislative energy. California’s Companion Chatbots Act (SB 243) took effect in January 2026—the first law with a private right of action for companion chatbot harms. New York passed companion chatbot legislation. Washington’s governor signed HB 2225 in March 2026. Kentucky’s attorney general became the first to sue Character.AI. And forty-two state attorneys general warned companies about “sycophantic and delusional” outputs.
At the federal level, the FTC launched a Section 6(b) inquiry in September 2025 into seven companies (Alphabet, Character Technologies, Meta, OpenAI, Snap, Instagram, and xAI), seeking data on safety practices, monetization, and impacts on minors. Bipartisan legislation has been introduced to ban AI companions for minors entirely. But no federal law has passed, and the FTC study, which typically takes years to produce a staff report, won’t create enforceable standards on its own.
The result is a pattern we identified in our report “Peaks and Valleys: Mapping Gaps in the AI Governance Landscape”: high public concern, significant legislative activity, but no coherent governance architecture. Four gaps explain why this domain is stuck:
Measurement. No agreed-upon metrics for emotional dependency, erosion of human connection, or parasocial attachment at scale. Without measurement, no baseline. Without a baseline, no accountability.
Codification. No legal framework for “product designed to maximize emotional attachment.” Consumer protection law wasn’t built for this.
Trust infrastructure. Safety is self-reported. No independent verification of whether companion chatbots meet any standard—and further, no standard has been set. Our polling shows 90% of Americans want the public to be able to verify that AI products meet clear safety standards, and 77% favor independent verification even when told it could slow innovation.
Market incentives. The attention economy has come for intimacy itself: revenue depends on engagement. Engagement depends on emotional attachment. Safety features that reduce attachment reduce revenue. Until this incentive structure is addressed, voluntary commitments will remain fragile—as OpenAI’s adult mode reversal, driven by business strategy rather than binding obligation, illustrates.
Compare this with AI-generated or synthetic media, which went from largely ungoverned to 40+ state laws and the federal TAKE IT DOWN Act within a few years. That happened because harms were visible, victims identifiable, and the problem mapped onto existing legal frameworks. Companion chatbot harms don’t map as cleanly. But the lesson still holds: governance momentum is possible when public concern, clear frameworks, and political will align.
Where IVOs Come In
The obvious objection to any proposal here is the one we raised ourselves: if there are no agreed-upon metrics or outcomes for emotional dependency, how can anyone verify that a product avoids causing it? This is precisely the problem the IVO model is designed to solve—and it’s why IVOs fit this domain better than a conventional regulatory approach.
In the IVO framework, the state sets public-interest outcomes (for example: verified companion chatbots must not produce measurable increases in user emotional dependency over time, must not displace real-world socializing, and must monitor for the cascades of drift Dr. Wei describes). Licensed Independent Verification Organizations, as expert-led testing bodies, then translate those outcomes into technical criteria, develop the testing methodology, and evolve both as the technology changes. The measurement gap isn’t a reason to wait for IVOs; it’s the work IVOs are built to do. Setting an outcome the field does not yet know how to measure is how you create the pressure and the venue to build the measurement, and the marketplace then catalyzes a race to the top on evaluations.
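To make the translation step concrete, here is a minimal sketch, in Python, of how one state-set outcome (“must not produce measurable increases in user emotional dependency over time”) might be turned into a pass/fail criterion. Everything in it is a hypothetical illustration: the dependency scale, the `MAX_MEAN_INCREASE` threshold, and the names `UserObservation`, `mean_dependency_change`, and `meets_outcome` are assumptions made for this example, not part of any existing IVO methodology or validated instrument.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical threshold: largest tolerated average rise in dependency score.
MAX_MEAN_INCREASE = 0.5


@dataclass
class UserObservation:
    """One user's scores on a hypothetical dependency scale (higher = more dependent)."""
    baseline: float   # score at the start of the monitoring window
    follow_up: float  # score at the end of the monitoring window


def mean_dependency_change(observations: list[UserObservation]) -> float:
    """Average change in score across users (positive means dependency rose)."""
    if not observations:
        raise ValueError("need at least one observation")
    return mean(o.follow_up - o.baseline for o in observations)


def meets_outcome(observations: list[UserObservation],
                  max_increase: float = MAX_MEAN_INCREASE) -> bool:
    """True if the average increase stays within the tolerated threshold."""
    return mean_dependency_change(observations) <= max_increase


if __name__ == "__main__":
    # Illustrative, made-up scores for three users.
    cohort = [
        UserObservation(baseline=2.0, follow_up=2.5),
        UserObservation(baseline=3.0, follow_up=3.0),
        UserObservation(baseline=1.0, follow_up=2.0),
    ]
    print(mean_dependency_change(cohort))
    print(meets_outcome(cohort))
```

The sketch shows only the shape of the step: the outcome stays fixed in public policy, while the scale, the threshold, and the length of the monitoring window are exactly the choices an IVO would have to develop, justify, and revise over time.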
IVOs could work in tandem with other cross-cutting governance solutions identified in our report on gaps in the AI governance space: standardized measurement frameworks for companion chatbot harms; longitudinal rather than one-time content audits; clarified liability standards that define a duty of care before tragedies force courts to improvise; and incentive realignment so that safety becomes a competitive requirement rather than a cost.
OpenAI shelved its “adult mode” as a business decision, not because any standard required it. The companion chatbot market is growing fast, the technology is getting more sophisticated, and Americans have signaled clearly that they want guardrails. The public mandate is there, and a framework exists and is being built today. What’s at stake is not a feature set, but the human connections Americans want defended.


Something that comes out of agency theory (i.e., the study of principal-agent relationships) is that when things are hard to measure, you need to focus on monitoring the behavior itself rather than the outcome. The kinds of issues you're describing don't yet have metrics because they are undoubtedly hard to measure, but there's also the timeframe issue: over what time scale do you need to look to be able to measure anything at all? Add to that the data access issue and the observability of system behavior (which are regulable). I plan to write more about this on the AI Accountability Review soon: https://www.ai-accountability-review.com/