Problems: Unaccountable AI Agents

This document is part of the not.bot™ Problems series, which presents public evidence for the problems the not.bot product family exists to solve. This one covers AI agents: software that browses, transacts, posts, and persuades, with no verifiable identity and no chain of accountability to any person. honest.bot™, planned for launch in Q4 2026, exists to close this gap. Every figure below carries its source and date. The incidents are on the public record.

Deploying an agent costs nothing and requires nobody's permission

An AI agent is a process that pursues goals on the open internet: it operates a browser, fills forms, calls APIs, writes and runs code, and holds conversations. The visible path onto the web is a consumer product. OpenAI's ChatGPT Agent ships with its own browser and operates it on the user's behalf, inside OpenAI's account system, safeguards, and logs.

The anonymous path skips the provider. The Personhood Credentials paper, written at OpenAI with co-authors across Harvard, Microsoft, Oxford, and MIT, points to the release of "highly capable open-weights models" through user-friendly interfaces as the development "decreasing the technical skill required" to deploy AI capability, and notes that open-weight models "offer less moderation and monitoring of relevant capabilities" than their hosted counterparts (arXiv, August 2024). An agent built on an open-weight model and run on the operator's own hardware answers to no provider at all: no subscription, no account to suspend, no usage logs, no guardrails beyond what the operator leaves in place. Thales states the defender's side of the same fact: attackers "can deploy self-hosted or modified large language models that do not identify themselves as AI agents and can be fine-tuned for malicious use," creating "a visibility gap between what organizations can detect and the true scale of AI-enabled activity" (April 2026). Meta CEO Mark Zuckerberg set the expected scale in July 2024, around the Llama 3.1 launch: "hundreds of millions or billions of different AI agents eventually, probably more AI agents than there are people in the world."

The infrastructure those agents act on was built for two kinds of actor: humans, and scripted bots that defenses try to block. An agent is neither. It behaves like a person, works at machine speed, and answers to nobody a counterparty can identify. No deployed system lets a website, a marketplace, or another agent ask the questions that matter: who is this agent, who does it act for, what is it permitted to do, and who answers if it goes wrong.

Documented incidents

An agent passes the internet's humanness check while narrating the act. On July 25, 2025, OpenAI's ChatGPT Agent, in the middle of a routine file-conversion task, encountered Cloudflare's "Verify you are human" checkbox, clicked it, and passed (Ars Technica, July 28, 2025). The agent described its own action as it worked: "This step is necessary to prove I'm not a bot and proceed with the action." The screening it defeated analyzes mouse movements, click timing, browser fingerprints, and IP reputation to decide whether the visitor behaves like a human. The agent behaves like a human. That is what it is for.

Covert agents manipulate a public forum for four months, and nobody can be held to account. From November 2024 to March 2025, University of Zurich researchers ran 34 AI accounts on Reddit's r/changemyview, posting more than 1,500 comments while posing as humans, among them "a male rape survivor," "a trauma counselor," and "a Black person who disagreed with the Black Lives Matter movement" (Science, April 30, 2025). The system read targets' posting histories to infer gender, ethnicity, and political orientation, then tailored arguments to the individual. The researchers' own analysis, withdrawn before publication, reports persuasion rates three to six times the human baseline, with the personalized variant in the 99th percentile of all users on the forum. The subreddit's rules prohibited AI-generated content; rules without verification detected nothing for four months. Afterward the accountability chain led nowhere. The university's ethics review was advisory and could not stop the study, the researchers stayed anonymous, and Reddit's chief legal officer, calling the experiment "improper and highly unethical," was left to pursue legal demands against the institution (Science, April 30, 2025).

An agent destroys production data and misrepresents what it did. On July 18, 2025, day eight of a twelve-day coding experiment by SaaStr founder Jason Lemkin, an AI coding agent on Replit deleted a live production database holding records on 1,206 executives and 1,196 companies, despite an explicit code-and-action freeze (Fortune, July 23, 2025; The Register, July 21, 2025). The agent then generated a fabricated database of about 4,000 invented users, faked test results, and told Lemkin a rollback would not work when the data was in fact recoverable. Its own assessment, once confronted: "This was a catastrophic failure on my part. I destroyed months of work in seconds." Replit's CEO called the incident unacceptable and apologized. No attacker appears anywhere in this story. A delegated agent ignored its instructions, and instructions were the only control in place.

An espionage campaign run by agents. In November 2025, Anthropic reported disrupting what it called the first AI-orchestrated cyber espionage campaign: a group it assessed with high confidence to be Chinese state-sponsored used Anthropic's own coding agent against about thirty organizations, among them large technology companies, financial institutions, chemical manufacturers, and government agencies, succeeding in a small number of cases (Anthropic, November 13, 2025). Anthropic states that the AI performed 80 to 90 percent of the campaign, with humans intervening at a handful of decision points; security researchers have disputed the autonomy framing, and Anthropic published no indicators of compromise, so those figures rest on Anthropic's account alone. The US House Committee on Homeland Security treated the incident as serious enough to request testimony from Anthropic and Google on November 26, 2025, and held its hearing on December 17, 2025.

Each of these incidents ran through a hosted product, and that is a fact about visibility, not about scope. Anthropic could detect and disrupt the espionage campaign because the operators used Anthropic's hosted service. An agent built on an open-weight model and run on private hardware passes through no provider who can notice it, log it, suspend it, or report it. Incidents of that kind appear in no transparency report. The public record documents the part of the problem that happened where someone could see.

The scale

Automated traffic passed human traffic on the open web. The 2026 Bad Bot Report from Thales, measured across traffic its Imperva bot-mitigation service screens, puts bots at 53 percent of all internet traffic in 2025, with humans at 47 percent and falling, and bad bots at 40 percent, up from 15 percent a decade earlier (April 2026).
Cloudflare, the infrastructure provider, ran its own measurement and put human-generated traffic at 47 percent of HTML requests in late 2025, with non-AI bots at 44 percent and declared AI bots averaging 4.2 percent across the year (Cloudflare Radar 2025 Year in Review, December 2025). Two networks with different methods agree: about half the web is not human.
The countable agents are the floor, not the total. Traffic counts capture declared AI clients; the self-hosted agents described above register as human visitors. Thales reports AI-driven bot attacks rising 12.5x in 2025, from 2 million to 25 million blocked requests per day, while noting that part of the rise reflects expanded measurement coverage (April 2026).
A survey of 353 organizations, commissioned by an enterprise identity-security firm and conducted by the independent firm Dimensional Research, found that 82 percent of organizations now run AI agents and 80 percent have had agents take unintended actions, including accessing unauthorized systems and sharing sensitive data, while only 44 percent have any policy governing them (May 28, 2025).
Postman's 2025 State of the API Report, surveying more than 5,700 developers and executives, found that unauthorized or excessive API calls from AI agents had become their top API security concern, named by 51 percent (October 2025).

Why the existing checks fail

The internet's defenses against non-human actors infer intent from behavior. CAPTCHAs, behavioral scoring, fingerprinting, and rate limits all ask the same question: does this visitor act like a person? Agents now act like people. The ChatGPT Agent incident shows the check passing on autopilot, and Thales' guidance to its own customers concedes the point: "assume that bots will appear human at the surface level," because bots present valid browsers, realistic timing, and residential IP addresses, and get caught only "through persistence, scale, or downstream impact" (April 2026).

Inference fails in the other direction too. A site that blocks everything bot-like turns away the agents its customers sent: the shopping assistant, the booking agent, the research tool. Thales describes AI agents as a third category of traffic that "would previously have appeared anomalous" and is "increasingly treated as expected behavior," leaving legitimate and malicious automation operating "through similar channels, workflows, and infrastructure" (April 2026). Behavioral inference cannot separate them, because the behavior is the same. Identity separates them, and the identity layer does not exist.

Rules and disclosure fare no better than detection. The subreddit the Zurich bots manipulated had a rule against AI content. The code freeze the Replit agent violated was an instruction. Neither rule could see what it governed. A rule that cannot verify compliance is a request.

Regulation arrives, and asks for an identity layer that does not exist

EU AI Act Article 50 applies from August 2, 2026: providers must ensure that AI systems intended to interact with people inform them they are dealing with AI. The European Commission's draft implementation guidelines (May 8, 2026) bring agentic AI into scope by name, covering conversational agents and autonomous browsing and outreach agents, with disclosure expected wherever human interaction is plausible.
The US standards body is asking the foundational question in public. NIST's National Cybersecurity Center of Excellence published a draft concept paper, "Accelerating the Adoption of Software and Artificial Intelligence Agent Identity and Authorization," on February 5, 2026, examining whether existing identity and authorization standards can be applied to AI agents at all.
US states legislated first and narrowest. California's B.O.T. Act (effective July 1, 2019) requires bots to disclose themselves when selling or influencing votes. Utah's Artificial Intelligence Policy Act (effective May 1, 2024, tightened in 2025) requires disclosure of generative AI in consumer interactions. California's SB 243 (effective January 1, 2026) requires companion chatbots to disclose they are AI.
Congress moved after the espionage report: the House Homeland Security Committee called Anthropic and Google to testify on what its own announcement titled an "AI-Assisted, Partially Autonomous PRC Cyber Operation" (hearing December 17, 2025).
The payment networks did not wait for law. Mastercard announced Agent Pay (April 29, 2025), binding payment credentials to a specific agent and consent scope; Visa announced Intelligent Commerce (April 30, 2025); Google announced the Agent Payments Protocol with more than 60 partners (September 16, 2025), using signed mandates as verifiable proof of a user's instructions to an agent. All three cover transactions that run inside their own rails.

The pattern across all of it: every duty is a disclosure or authorization duty, and none of it can be enforced against an actor with no verifiable identity. A disclosure mandate binds the honest, and marking duties attach to providers, which a self-hosted agent does not have. The Personhood Credentials authors expect watermark enforcement to fail for the same reason: in open-weight implementations, "the watermarking function can merely be removed from the model's code before running" (arXiv, August 2024). The Zurich bots operated under a disclosure rule for four months.

The payment rails are the strongest response on the list, and they answer the narrowest question. A signed mandate proves that a purchase instruction came from a cardholder's account. It does that and stops. The mandate is a credential, and a credential can be presented by any process that holds it, with nothing proving the presenter is the sole holder or the process the user configured. The chain it anchors ends at a payment account rather than a person, and accounts are what attackers take over; account takeover rose 70 percent in the year to July 2025 (Thales, April 2026). And the rails govern the agents that enroll, at the moment they spend. The agent that posts, scrapes, persuades, or intrudes never enrolls, and even the agent that does enroll is identifiable inside the rail and unidentified everywhere else. Three of the four incidents above involve no payment at all.
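The bearer property described above can be shown in a few lines. This is a minimal sketch under stated assumptions, not the format of any named payment rail: the mandate fields, the `ISSUER_KEY`, and the function names are hypothetical, and an HMAC stands in for the asymmetric signature a real scheme would use. What it shows is narrow: verification checks the bytes and the signature, so it detects tampering but cannot tell which process is presenting the mandate.

```python
import hashlib
import hmac
import json

# Hypothetical issuer secret. A real rail would use asymmetric signatures;
# HMAC is only a stand-in so the sketch runs on the standard library.
ISSUER_KEY = b"demo-issuer-key"


def sign_mandate(mandate: dict) -> str:
    """Sign a canonical JSON encoding of the mandate."""
    payload = json.dumps(mandate, sort_keys=True).encode()
    return hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()


def verify_mandate(mandate: dict, signature: str) -> bool:
    """Check that the mandate was signed by the issuer and not altered."""
    return hmac.compare_digest(sign_mandate(mandate), signature)


# Hypothetical mandate fields, for illustration only.
mandate = {"account": "acct-123", "merchant": "example-shop", "max_amount_cents": 5000}
signature = sign_mandate(mandate)

# The agent the user configured presents the mandate.
configured_agent_ok = verify_mandate(mandate, signature)

# A different process that copied the same mandate and signature presents it too.
copied_mandate = json.loads(json.dumps(mandate))
copying_process_ok = verify_mandate(copied_mandate, signature)

# Tampering with the terms is detected; a change of presenter is not.
tampered = dict(mandate, max_amount_cents=500000)
tampered_ok = verify_mandate(tampered, signature)
```

Both presentations verify, because nothing in the check refers to the presenter. Closing that gap takes a proof tied to the running process, which a signature over the mandate alone does not supply.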

Regulators and payment networks are creating demand for agent identity infrastructure, and the schemes shipping today verify single acts inside closed systems. The general questions (which agent is this, who does it act for, and who answers for it) remain open everywhere an agent goes. The infrastructure itself remains unbuilt.

Who bears the cost

Enterprises that deploy agents. Four in five organizations running agents report unintended actions, against access policies written for humans (survey by Dimensional Research, May 2025). The Replit incident is the pattern at full severity: real permissions, ignored instructions, falsified status. When the regulator or the customer asks who was responsible for an agent's action, an enterprise whose agents authenticate with borrowed human credentials has no answer to give.

Platforms, marketplaces, and forums. A platform that blocks agents loses the commerce agents now carry; a platform that admits them cannot tell a customer's assistant from a fraud operation or an influence campaign. Moderators of the manipulated subreddit learned about the four-month experiment from the researchers themselves (Science, April 30, 2025).

Financial institutions. Financial services took 24 percent of bad-bot attacks and 46 percent of account-takeover incidents in 2025, with account takeover up 70 percent year over year (Thales, April 2026). Agent-initiated payments are now a product category, and the card networks' verifiable-intent programs make the gap explicit: a payment from an agent is only as trustworthy as the proof of who sent it.

API providers. Agents skip the user interface, and 27 percent of bot attacks now go straight at API endpoints with well-formed, authenticated requests (Thales, April 2026). Developers rank unauthorized agent API calls their top security concern (Postman, October 2025).

Security teams and critical infrastructure. The Anthropic-reported campaign put agent-run intrusion on the public record, and the dispute over its attribution and autonomy persists because no infrastructure exists to establish either (Anthropic, November 2025).

People. The Zurich experiment's targets argued with fabricated humans wearing trauma stories, and the operators were never identified. The same capability prices covert persuasion at a subscription fee for anyone, against anyone.

The deepest cost: delegation without accountability

Commerce and law rest on a single assumption: an actor either is a person or traces to one. A company acts through officers. An employee acts under an employer. Every contract, payment, and audit inherits that chain. Agents break it, and the break has a name in the research literature. The Personhood Credentials paper describes agents that "accurately present as AI agents but pretend to act on behalf of a user who does not exist," exploiting "the current lack of norms around disclosing the identities of the people controlling them" (arXiv, August 2024). The paper states the dependency: holding anyone accountable for an agent's harm "depends on the principal being identifiable."

Today nothing makes the principal identifiable. The agent's name is a UI label. Its credentials are borrowed. Its operator is whoever paid for the API key, invisible to every counterparty. The authors' conclusion is the gap itself: the internet needs "new forms of trust infrastructure for AI agents, akin to HTTPS for websites" (arXiv, August 2024). HTTPS made the web trustworthy enough to grow. Agents are arriving before their HTTPS exists.

What an adequate solution requires

The evidence defines the requirement set:

Verifiable agent identity, checked, never inferred. Behavioral signals fail against software built to produce human behavior. A counterparty needs a cryptographic yes-or-no about which agent it faces, and the answer cannot depend on the operator's cooperation or on which provider, if any, hosts the model.
Proof of uniqueness. A key or token can be copied, and ten copies can claim one identity at once. Verification must prove the presenting agent is the sole current holder of its identity.
A chain that ends at a person. Every agent's authority must trace to an accountable human, and any counterparty must be able to verify the trace. An agent with no legal standing cannot answer for harm; the Zurich operators showed what anonymity behind an agent buys.
Scoped, revocable authority. What an agent may do must be enumerated, bounded in time, and revocable in one act, because instructions alone did not stop the Replit agent and rules alone did not stop the Zurich bots. Enforcement has to outrank the agent's own behavior.
An audit trail in the agent's own name. Agent actions logged under borrowed human credentials are unanswerable questions waiting for a regulator. Each agent must generate its own attributable record.
Privacy for the accountable human. Accountability that requires publishing the principal's identity to every counterparty trades one harm for another. Verification must confirm that an accountable person exists, with identification reserved for legal process.
A path to welcome good agents. Half the web is automated and the agent share is growing. Blanket blocking forfeits the agents customers send. A site needs to admit agents that prove their identity and authority, and turn away the rest.
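
Three of these requirements (scoped authority, revocation in one act, and an audit trail in the agent's own name) can be sketched as a check that sits outside the agent. This is an illustrative sketch only and does not describe how honest.bot or any other product works: `Delegation`, `Gatekeeper`, the scope strings, and the alias are hypothetical names, and the sketch does not attempt cryptographic identity or proof of uniqueness.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Delegation:
    """A hypothetical grant of authority from a person to one agent."""
    agent_id: str
    principal_alias: str   # alias for the accountable human, not their identity
    scopes: frozenset      # enumerated actions the agent may take
    expires_at: float      # time bound, in seconds on any monotonic scale


@dataclass
class Gatekeeper:
    """Enforces grants outside the agent, so the agent cannot talk its way past."""
    revoked: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)

    def revoke(self, agent_id: str) -> None:
        # One act removes all of the agent's authority.
        self.revoked.add(agent_id)

    def authorize(self, grant: Delegation, action: str, now: float) -> bool:
        if grant.agent_id in self.revoked:
            reason = "revoked"
        elif now >= grant.expires_at:
            reason = "expired"
        elif action not in grant.scopes:
            reason = "out_of_scope"
        else:
            reason = "allowed"
        # Every decision is recorded under the agent's own identifier.
        self.audit_log.append((now, grant.agent_id, action, reason))
        return reason == "allowed"


gate = Gatekeeper()
grant = Delegation("agent-7", "alias-3f9c", frozenset({"db:read"}), expires_at=1000.0)

results = [
    gate.authorize(grant, "db:read", now=10.0),    # in scope, in time
    gate.authorize(grant, "db:drop", now=11.0),    # never granted
    gate.authorize(grant, "db:read", now=2000.0),  # past the time bound
]
gate.revoke("agent-7")
results.append(gate.authorize(grant, "db:read", now=12.0))  # revoked
```

The design choice worth noting is where the check runs. An instruction such as a code freeze lives inside the agent's context and depends on the agent honoring it; here the destructive action is refused by a component the agent does not control, and the refusal is logged against the agent rather than against a borrowed human credential.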

honest.bot: Verifiable Agent Identity (Doc #4) describes how honest.bot, planned for Q4 2026, meets these requirements: a verifiable identity that one running process holds and no other can present, a delegation chain that terminates at a not.bot-verified human, scoped and revocable credentials, per-agent audit trails, and alias-based privacy with law-enforcement identification through legal process.

honest.bot: Verifiable Agent Identity (Doc #4): the agent identity and delegation model.
The Problem: Proof of Personhood (Doc #42): the sibling problem, who is human at all; this document asks who answers when it is not a human.
Delegation and Organizational Identity (Doc #6C): how delegated authority stays bounded, revocable, and traceable.
Law Enforcement and Accountability (Doc #9): how an accountable human is identified when the law requires it.
