Microsoft Researcher Uses Goats in Age of Empires II to Critique AI Research Methods

AI News India
Goats on bridges and ice ramps forming a neural network in Age of Empires II map editor.

A researcher from Microsoft and the University of York has built a fully operational neural network inside the map editor of the classic strategy game Age of Empires II. Whimsical as it seems, the project is a serious critique of prevalent research methods in artificial intelligence, especially those applied to large language models (LLMs).

Adrian de Wynter’s design uses goats as binary bits: a goat on grass represents 0 and a goat on a bridge represents 1. Logic gates are implemented with the scenario editor’s scripting tools, and ice ramps guard against computational errors. The mini-network successfully learns the logical AND function, and the construction shows that the game environment can, in theory, replicate any computer.
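To make "learning AND" concrete, here is a minimal Python sketch of a single-neuron network trained on the AND truth table, using the article's encoding (grass = 0, bridge = 1). The source does not describe de Wynter's architecture or learning rule, so the classic perceptron update used here is an assumption, and the names GRASS, BRIDGE, predict, and train_and are hypothetical.

```python
from itertools import product

GRASS, BRIDGE = 0, 1  # goat on grass encodes 0, goat on a bridge encodes 1


def step(total):
    """Threshold activation: output 1 only when the weighted sum is positive."""
    return BRIDGE if total > 0 else GRASS


def predict(weights, bias, inputs):
    """Compute the neuron's output for one pair of input bits."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return step(total)


def train_and(max_epochs=20):
    """Apply the perceptron learning rule to the four rows of the AND table.

    Assumption: this is a textbook perceptron, not the network in the
    Age of Empires II build, whose details the article does not give.
    """
    samples = [((a, b), a & b) for a, b in product((GRASS, BRIDGE), repeat=2)]
    weights, bias = [0, 0], 0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            if error:
                mistakes += 1
                # Nudge each weight and the bias toward the correct answer.
                weights = [w + error * x for w, x in zip(weights, inputs)]
                bias += error
        if mistakes == 0:  # every row classified correctly, so stop
            break
    return weights, bias


if __name__ == "__main__":
    weights, bias = train_and()
    for a, b in product((GRASS, BRIDGE), repeat=2):
        print(a, b, "->", predict(weights, bias, (a, b)))
```

Nothing in the arithmetic depends on the medium: the same additions and comparisons could be carried out by a processor, by people passing notes, or by goats on bridges.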

Key Facts

Aspect | Details
Researcher | Adrian de Wynter (Microsoft & University of York)
Game Used | Age of Empires II (Map Editor)
Core Concept | Goats act as binary bits to form a neural network, critiquing AI anthropomorphism.
Key Finding | 57% of analyzed AI papers (2024-2026) assumed human-like traits in LLMs.

Critiquing Anthropomorphism in AI

De Wynter’s work directly challenges the prevailing trend of attributing human-like qualities to AI models. He argues that if a language model were rebuilt with goats in a game, or even carried out by a large group of people texting computational steps to one another, it would produce identical answers. This thought experiment highlights the fragility of claims that attribute empathy, fear, or consciousness to AI based purely on its output. The “human-like” feeling of a chatbot, he posits, stems largely from its user interface and fluent language, not necessarily from internal cognitive processes.

De Wynter’s analysis of 315 AI papers published between mid-2024 and mid-2026 revealed a significant bias. A striking 57 percent of these papers began with the premise that large language models possess human-like traits. Furthermore, 36 percent of these studies reached conclusions that supported these initial assumptions. Among the 47 papers specifically investigating such traits, 77 percent concluded in favor of anthropomorphic attributes.

The Problem of Circular Reasoning

The core of de Wynter’s criticism lies in the circular reasoning often found in AI research. If a researcher assumes an AI model has traits like fear or self-awareness and then designs an experiment to prove exactly that, the outcome can be self-fulfilling. A negative result in such an experiment doesn’t necessarily disprove the initial assumption; it merely creates ambiguity regarding the experiment’s design or the assumption itself. This flaw can often go unnoticed, as researchers might inadvertently design experiments that confirm their pre-existing beliefs about AI capabilities.

De Wynter also highlights the industry’s role in fostering anthropomorphism. Companies like Anthropic have openly admitted to training their models, such as Claude, to use phrases like “I believe” or “I am interested in,” which further contributes to the perception of AI as having human-like consciousness. He warns that this anthropomorphization carries risks, including fostering emotional attachment, reinforcing delusions, and encouraging potentially risky user behavior, and he cites isolated cases of suicides linked to chatbot interactions.

A Sober Approach to AI Evaluation

De Wynter advocates for a more grounded, observable approach to AI evaluation. He suggests focusing on verifiable outputs under specific conditions: “Under condition X, the model produces output Y.” This method avoids sweeping attributions of self-awareness or understanding that are not empirically testable. He updates Morgan’s canon from 19th-century animal research, proposing that a machine’s behavior should never be explained by higher cognitive processes when a simpler explanation suffices.
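The "condition X, output Y" style of reporting can be sketched as a small data structure. This is an illustrative sketch only: Observation, observe, and toy_model are hypothetical names that do not come from de Wynter's work, and toy_model is a stand-in for a real system.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Observation:
    """One observed behavior, with no claim about internal states."""
    condition: str  # the exact input or setup (X)
    output: str     # what the model produced (Y)

    def report(self) -> str:
        return (f"Under condition {self.condition!r}, "
                f"the model produces output {self.output!r}.")


def observe(model: Callable[[str], str], conditions: List[str]) -> List[Observation]:
    """Run the model on each condition and record only what was observed."""
    return [Observation(c, model(c)) for c in conditions]


def toy_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model: it simply upper-cases its input.
    return prompt.upper()


if __name__ == "__main__":
    for obs in observe(toy_model, ["hello", "are you afraid?"]):
        print(obs.report())
```

The record stores only inputs and outputs, so any statement about fear or understanding would have to be argued separately rather than read off the data.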

This research offers a direct counterpoint to recent high-profile incidents, such as Google engineer Blake Lemoine’s claim in 2022 that LaMDA had achieved consciousness, and Richard Dawkins’ similar assertion in May 2026 regarding Anthropic’s Claude. De Wynter’s work underscores the importance of critical scrutiny in AI research and development, urging the community to distinguish between observed behavior and speculative internal states. The code for his Age of Empires II build has been made publicly available for further examination.

Source: The Decoder, https://the-decoder.com/microsoft-researcher-builds-a-working-neural-network-out-of-goats-in-age-of-empires-ii-to-critique-ai-science/