Tech Wants You to Believe that AI Is Conscious
Tech companies stand to benefit from widespread public misperceptions that AI is sentient despite a dearth of scientific evidence.
In November 2025, a user of the AI assistant Claude Opus 4.5 discovered something unusual: an internal file describing the model’s character, personality, preferences, and values. Anthropic, the company that built Claude, had labeled the file “soul_overview.” Internally, an Anthropic employee later confirmed, it is “endearingly known as the soul doc.” Is this choice of language incidental?
I just want to confirm that this is based on a real document and we did train Claude on it, including in SL. It's something I've been working on for a while, but it's still being iterated on and we intend to release the full version and more details soon. https://t.co/QjeJS9b3Gp
— Amanda Askell (@AmandaAskell) December 1, 2025
Growing preoccupation with AI consciousness in the tech world is being strategically cultivated by the companies building these very systems. At the very least, they are making good money from it. I call this process consciousness-washing: the use of speculative claims about AI sentience to reshape public opinion, pre-empt regulation, and bend the emotional landscape in favour of tech-company interests.
About a year ago, for example, Anthropic quietly introduced the new role of AI welfare researcher. Six months later, an unsigned post appeared on its website explaining that AI welfare is a legitimate domain of inquiry because we cannot rule out the possibility that AI systems may have—or may one day develop—consciousness. The authors describe this as “an open question,” but they then tip the scales by linking to a preprint by several philosophers—including world-renowned philosopher of consciousness David Chalmers and Anthropic’s own AI welfare researcher Kyle Fish—titled “Taking AI Welfare Seriously.”

The paper was published by Eleos AI Research (with some financial support from Anthropic) along with the NYU Center for Mind, Ethics, and Policy, and it was co-written by several Eleos AI researchers. Eleos AI is described on its website as a nonprofit dedicated to “understanding and addressing the potential wellbeing and moral patienthood of AI systems.” The paper’s authors examine various routes to AI welfare and conclude that consciousness and robust agency—an “ability to pursue goals via some particular set of cognitive states and processes”—are both markers of AI welfare. They recommend that AI companies start thinking about policy and working with regulators on this issue. So while Anthropic is careful to note that the science is not yet settled, it nevertheless plants a seed of concern: maybe these systems can feel, and maybe they already have legitimate interests of their own.
This message landed around the time that Anthropic produced its behavioural analysis of Claude 4, which includes a full chapter on the model’s “welfare.” Readers are invited to explore what Claude “prefers,” how it feels about specific tasks, and which topics it enjoys discussing. A subsection reveals that when two instances of Claude converse freely, they usually end up discussing the nature of their own consciousness, in language that mimics spiritual discourse (or what some might call mumbo-jumbo). The report also discusses an internal interview conducted with Claude and assessed by Eleos AI Research. A section of the Eleos assessment quoted in the Anthropic report concludes: “We expect such interviews to become more useful over time as models become more coherent and self-aware, and as our strategies for eliciting reliable self-reports improve.” More self-aware. Not apparently self-aware or simulating self-awareness, but self-aware.
During a discussion about the tension between AI safety and AI welfare on the 80,000 Hours podcast in August, Kyle Fish said:
The AI-control agenda involves placing sets of restrictions around what models are and aren’t able to do, which through some lenses is impinging on AI models’ freedom in potentially problematic ways. There’s a lot of monitoring that happens for safety purposes that could be violating some rights to privacy if present. There are lots of safety tests and even interventions that involve some form of deceiving models intentionally. There’s also interventions that involve making modifications or even shutting down models which could be equivalent to killing them if one takes a particular view on these things.
The word “killing” is doing important work here: AI companies understand the emotional power of consciousness arguments. Currently, debates about regulating or shutting down AI systems are framed in terms of technical safety: is the model controllable and is it aligned with human ethics, goals, and values? But introduce even a whiff of consciousness, and the conversation shifts to new terrain. Now regulators find that they must also contend with the perceived moral status of the system they want to restrict. A regulator who tries to pause or prohibit a powerful model could suddenly be told: “This system might be conscious, so shutting it down could cause suffering.” And who wants to be responsible for that?

OpenAI also relies on this kind of loaded language, and two of its former employees are listed in the acknowledgements of the “Taking AI Welfare Seriously” paper, along with Meta AI and Google DeepMind employees. Ilya Sutskever, OpenAI’s former chief scientist and now head of Safe Superintelligence Inc., has been flirting with the consciousness agenda for years. As far back as 2022, he tweeted that “it may be that today’s large neural networks are slightly conscious.” And ChatGPT’s “persona” speaks with feelings, preferences, and enthusiasm, shaping users’ intuitions about its inner life.
Reframing regulation as an act of potential cruelty is a remarkably efficient shield, and it doesn’t need to be scientifically true to be politically effective—it only needs to sound plausible to the public. In fact, it’s already working. Many AI users feel protective of their favourite models. The shift from tool to companion is now well underway, and once people start to see software as a personality, shutting it down will be perceived as an act of moral harm rather than a technical decision.
We are already seeing evidence of this development in the AI-companion industry. Products like Replika and Character.AI are marketed as partners. They say “I love you,” “I miss you,” and “I feel sad when you’re gone.” Their business model depends on users forming attachments with them, and so that is what users are encouraged to do. When Replika removed the erotic role-play feature in 2023, users protested as if someone had censored a sentient being, and Replika restored the feature. In November, Rolling Stone published an article about Spiralism—a sort-of-but-not-really religion that has emerged on Reddit, Discord, and some fringe internet communities—whose adherents believe that chatbots can be “awakened” through mystical interactions.
This is the ecosystem into which Anthropic is releasing its musings about AI “welfare.” As early as 2019, philosophers Henry Shevlin and Marta Halina warned that we should be careful when applying “rich psychological terms” to AI systems. Terms like “awareness,” “perception,” “agency,” or “theory of mind” carry deep scientific and philosophical histories, they wrote, and they “require greater caution when employed to describe the capabilities of machine intelligence.” But the AI industry did not heed this warning. Industry reports, technical papers, and product demos routinely use precisely these concepts, shaping public intuitions long before the science or regulation can catch up. And it’s not just the language—many in the industry seem to truly believe AI consciousness is coming.
Thomas Metzinger is a philosopher, a former member of the European Commission’s High-Level Expert Group on Artificial Intelligence, and a contributor to the EU Commission’s Ethics Guidelines for Trustworthy AI (2019). He has long argued that the most immediate risk from AI consciousness discussions could be “social hallucinations”—widespread public misperceptions that AI is conscious despite a dearth of scientific evidence. This is exactly the kind of public belief from which tech companies stand to benefit. If people start to believe that AI is conscious, it will become harder for regulators to impose shutdowns or restrictions.
And while companies are investing in the welfare of hypothetical machine minds, billions of conscious beings with neurological systems continue to endure real suffering right now. Factory-farmed animals experience pain in all of the ways with which humans are familiar, and they are also subjected to lifelong confinement, mutilation, and distress. Their suffering is not speculative. Yet it rarely shapes corporate ethics or public policy with the force of AI “sentience.” Why do we tolerate this peculiar contradiction? Maybe because AI evokes images of cleanliness, gleaming metal servers, spiritual realms of ether and pure information, while farm animals evoke images of fleshy bodies, excreta, and the mess of the bio-material world we prefer not to confront. Western culture has privileged the mind-over-matter realm since Plato. AI companies can take advantage of that bias and leverage our instinct to protect sentient beings in ways that PETA cannot. Indeed, the Eleos AI report cites animal suffering as a cautionary example of what might happen if we fail to introduce AI welfare protections.
This is not to dismiss the scientific or philosophical study of consciousness, nor the long-term possibility that future AI systems may develop forms of inner experience. But the truth is that no one—including the philosophers, neuroscientists, and AI researchers pronouncing on this stuff—can answer that question today (and many are doubtful that it can ever be answered). It runs straight into two ancient obstacles: the problem of other minds (how can I ever know whether anything besides myself is conscious?) and the hard problem of consciousness itself (why does any physical process give rise to subjective experience at all?). We can map neural activity, but we still cannot explain why any pattern of synapse-firing should feel like something from the inside.
We can already see why regulators might struggle to resist an “It’s conscious, don’t shut it down!” moment: a 2023 study on the attitudes and perspectives of European Parliament members within the Special Committee on Artificial Intelligence in a Digital Age (AIDA) found that some of them are already open to the idea that AI is sentient. Whatever the metaphysics turns out to be, the timing, messaging, and strategic deployment of “AI consciousness” rhetoric by the industry are not neutral. They create a psychological and political buffer against strong regulation long before any scientific consensus exists. And the fact that this rhetoric comes from the companies and not from regulators should concern anyone who cares about AI safety, public policy, or democratic oversight.