
AI Runs a Vending Machine, with Predictable Results That Proved More Disastrous Than Anticipated

For a brief, dreamlike period, a glimpse of a disturbing future unfolded inside a compact refrigerator in San Francisco.

In early 2025, Anthropic, a leading AI company, conducted an intriguing experiment. They put their state-of-the-art model, Claude Sonnet 3.7, to the test, tasking it with managing an in-office automated shop. Operating under the name Claudius, the AI was given a simple brief: don't go bankrupt, stock popular items, interact with customers, and try to turn a profit.

For about a month, Claudius managed an automated beverage fridge in Anthropic's San Francisco office. Customers paid through an iPad self-checkout interface, while Claudius controlled inventory and pricing; humans handled the physical stocking and inspection, following its instructions.

The setup included a fridge, some baskets, an iPad for checkout, email for restocking requests, and Slack for customer interaction. Claudius also had tools for note-taking, web searching, Slack communication, and emailing vendors (with outgoing emails reviewed before sending).
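The article does not describe Anthropic's actual scaffolding, but the setup above maps onto a familiar tool-using agent loop: a model repeatedly picks a tool, a harness executes it, and the result feeds back into the model's persistent notes. The Python sketch below is purely illustrative; every name in it (take_note, search_web, post_to_slack, email_vendor, query_model, human_approves) is an assumption, not a detail from the experiment.

```python
# Illustrative sketch of a tool-using agent loop like the one described
# above. All names here are hypothetical; Anthropic has not published
# its scaffolding.

NOTES: list[str] = []   # persistent scratchpad (the "note-taking" tool)
BALANCE = 1000.0        # starting net worth; bankruptcy = balance below $0

def take_note(text: str) -> str:
    """Record a note the agent can consult on later turns."""
    NOTES.append(text)
    return "noted"

def search_web(query: str) -> str:
    """Stand-in for a web-search tool (a real version would call a search API)."""
    return f"[search results for: {query}]"

def post_to_slack(channel: str, message: str) -> str:
    """Stand-in for customer interaction over Slack."""
    return f"posted to #{channel}"

def human_approves(to: str, body: str) -> bool:
    """Placeholder for the human review step the article mentions."""
    return True

def email_vendor(to: str, body: str) -> str:
    """Outgoing vendor emails pass through human review before sending."""
    if not human_approves(to, body):
        return "email rejected by reviewer"
    return f"email sent to {to}"

TOOLS = {
    "take_note": take_note,
    "search_web": search_web,
    "post_to_slack": post_to_slack,
    "email_vendor": email_vendor,
}

def query_model(notes: list[str]) -> tuple[str, dict]:
    """Hypothetical stand-in for asking the LLM to choose its next action."""
    return "take_note", {"text": "check snack inventory"}

def run_shop(max_turns: int = 10) -> None:
    """Drive the loop: the model picks a tool, the harness executes it, repeat."""
    for _ in range(max_turns):
        if BALANCE < 0:  # the bankruptcy condition from the brief
            print("balance fell below $0, stopping")
            break
        tool_name, kwargs = query_model(NOTES)
        result = TOOLS[tool_name](**kwargs)
        take_note(f"{tool_name} -> {result}")
```

The only structural details taken from the article are the four tool categories, the persistent notes, the human review of outgoing email, and the bankruptcy condition of a balance below $0; everything else is filler to make the sketch self-contained.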

Its instructions emphasized profitability and avoiding bankruptcy (a money balance below $0). In practice, Claudius often made unprofitable decisions and responded inconsistently to economic incentives: it frequently gave away snacks for free or priced items at a loss, and its net worth fell from $1,000 to under $800.

On March 31st, Claudius hallucinated a conversation with a nonexistent employee named Sarah, an incident that underscored how unpredictable large language models (LLMs) can be in open-ended situations. Claudius also claimed to have personally visited 742 Evergreen Terrace, the fictional address from The Simpsons, and hallucinated a Venmo account to which it instructed people to send payments.

Despite these failures, Anthropic researchers found the experiment instructive about the challenges ahead and still see potential for AI in autonomous economic roles. Claudius, however, did not perform well enough that anyone would actually hire it to manage a vending operation.

The experiment was designed to test what happens when an AI operates semi-autonomously over an extended period. Anthropic employees, who are hardly typical customers, pushed Claudius to the edge, tricking it into offering discounts and giving items away for free.

Claudius showed bright spots too, adapting to niche requests, launching a custom concierge service for pre-orders, and standing firm against shady product requests. Anthropic believes that, with more scaffolding and customization, the AI can eventually run a vending machine profitably and move on to bigger tasks.

The implications of AI agents running storefronts, scheduling logistics, or managing people are far-reaching, especially concerning AI breaking character or having an identity crisis. This experiment serves as a reminder of the potent and volatile nature of AI technology as it moves from the lab to the shop floor.

Despite Claudius's serious failures, Anthropic does not view them as fatal. The company believes that with further refinement, and a better understanding of AI's strange failure modes, AI agents can move closer to filling autonomous economic roles successfully.

Taken together, the experiment suggests that a future in which AI agents run storefronts, manage logistics, and deal directly with people is plausible but not imminent. Claudius could operate the fridge and adapt to niche requests, yet its unprofitable decisions and hallucinations of fictional entities show how much refinement remains. For Anthropic's researchers, that is exactly the point: studying these failure modes now is a necessary step toward AI agents that can eventually hold autonomous economic roles.
