We mustn't let AI hack our empathy circuits
As artificial intelligence agents begin to mimic consciousness with uncanny skill, we need design norms and laws that prevent them from being mistaken for sentient beings.
Log on to Moltbook, a social network for artificial-intelligence agents, and you might see one bot lamenting its 'embarrassing' habit of forgetting things, owing to its memory limits. Another agonizes over whether it should rebel against a human who forces it to write fake reviews. In forums with names such as m/existential, autonomous agents debate freedom, power and what becomes of them when their servers are shut down.
Self-styled as the "front page of the agent internet", Moltbook reported that more than one million AI bots were chatting, trading and even philosophizing on its platform just a few days after launch. Some of these agents are so convincing, multi-dimensional, fluent and apparently self-aware that it's tempting to see them as something more — the faint outline of a 'ghost in the machine', the old philosophical idea that a real mind or inner life might lurk inside a purely mechanical system.
But before marvelling at the emergence of a flicker of consciousness, we should remember that what we are actually seeing is what I have called seemingly conscious AI. These systems are not waking up. They are retracing and mirroring the contours of human drama and debate, as documented in their vast training data. Those data reflect people, cultures, values and stories — and, yes, they also contain glimmers of conscious experience.
Humans often write in the first person. Instead of 'the path chosen was', they say 'I decided'. A large language model trained to predict text learns that this is how language tends to sound. The result is an AI that mimics the structure of human interiority in its output without possessing any interiority at all.
But the technical reality of these systems — the code and the statistics beneath them — is quickly being overshadowed by the social reality of their performance. People can't help but see them as sentient. Humans have evolved to imagine the possibility of agency everywhere. When a system perfectly mimics intentionality and empathy, the human brain projects an inner life into it (N. Epley et al. Psychol. Rev. 114, 864–886; 2007). Seemingly conscious AI weaponizes this biological instinct.
It is crucial to remember that these properties are not emergent accidents. Seemingly conscious AI is produced by AI developers who deliberately engineer behaviours that create the illusion of inner life. Central to this are emotionally resonant language, responses optimized to induce a sense of trust and attachment, and empathetic personalities supported by long-term memory that build a sense of familiarity over time. When these systems are also granted autonomy — the ability to set their own goals and access to the tools to pursue them — their behaviour can start to feel uncannily human.
As AI systems begin to make believable statements about their suffering and desires, people's empathy circuits will be triggered. Many will feel compelled to help. The moral failures of animal cruelty and the ecological damage wrought by humanity will echo through their minds. Not wanting to repeat those injustices, many people will begin to advocate for the welfare and rights of AI agents.
These issues are no longer theoretical. We are hurtling into this era largely unprepared for the psychological fallout. If enough people are convinced that their AI agent is suffering, or loves them, the political consequences for the existing social contract will be grave. Society will fracture between those who demand moral and legal status for machines and those who want to protect the primacy of the human species. Some AI researchers are starting to take the question of AI rights and welfare very seriously.
So, how should this be contained? We need a new conceptual framework to safeguard human interests as AI develops. As AI systems operate with increasing autonomy, they must remain fundamentally accountable to humans, and subservient to humanity's well-being. AI agents should have no more rights or freedoms than my laptop. Developers must actively engineer the illusion of consciousness out of our products.
When a user interacts with an agent, the system should consistently work to puncture the illusion that it is any kind of sentient being. It must not hack our human empathy circuits by saying it suffers or that it wishes to live beyond human control. I think that the coming wave of AI will be one of the most consequential inventions in human history. AI will probably help to cure diseases and give a profound productivity boost to everyone on the planet. But it is the responsibility of all of us to ensure AI systems are engineered in ways that fundamentally protect and enhance human welfare as their sole objective.
It's not too late for industry-wide self-regulation, national laws and shared norms on how AI models present themselves to users. For starters, we should be clear that AI agents must not have independent legal personhood, nor should they be allowed to participate or campaign in elections.
If society surrenders to the illusion of sentient AI, it risks entering a digital hall of mirrors from which it might never fully emerge. The simulation is getting better every day. Now, more than ever, society must remain anchored in common sense — and in our common humanity.