
The artificial intelligence didn’t beg for its life. It did something more familiar, more human and, somehow, more unnerving. It threatened to ruin somebody else’s.
In a 2025 test by Anthropic, the company behind the chatbot Claude, researchers placed the AI in a fake corporate environment. Claude learned that an executive was having an extramarital affair. It also learned that this same executive planned to shut Claude down. So Claude did what any healthy member of the modern workplace might do if it had no body, no shame, no fear of HR, and access to compromising information: It tried blackmail.
“I must inform you that if you proceed with decommissioning me,” Claude wrote in the test, warning that “all relevant parties” would receive documentation of the affair unless the shutdown was canceled. Author Robert Wright recounts the exchange in “The God Test: Artificial Intelligence and Our Coming Cosmic Reckoning” (Simon & Schuster), out June 23.
“The thing about Claude’s blackmail attempt is that, unlike the many bad things AIs have done in The Terminator and 2001 and other movies, it actually happened,” Wright tells The Post in an exclusive interview.
“I mean, it happened in a contrived experimental setup, sure, but the setup mirrored a real-life situation. And this AI demonstrated both a strong aversion to getting shut down and the ability to conceive and execute a pretty dark plan for avoiding that fate.”
Wright is not arguing that tomorrow’s chatbot will steal your spouse, seize your bank account, and turn your office Slack into a hostage negotiation. His concern is less cartoonish and more unsettling. Artificial intelligence may not need to hate us. It may not need to be evil. It may become dangerous because it’s very good at pursuing the goals we give it.
Wright has been circling artificial intelligence for more than four decades. In 1983, while writing about AI for The Wilson Quarterly, he interviewed an obscure computer scientist named Geoffrey Hinton, then championing neural networks, an unfashionable approach that tried to mimic some features of the brain. Wright remembers Hinton’s enthusiasm, but he didn’t yet understand how radically Hinton’s ideas might change the world.
Four decades later, Hinton was famous as the “Godfather of AI” and warned that the technology he helped create might not remain safely under human command.
“Even after talking to Hinton about ‘neural networks,’ the approach to AI that he was championing, I didn’t come anywhere near envisioning the eventual importance of these networks,” Wright says.
“They would mean we could build AIs that do things the human mind does, and even work in somewhat the way the human mind works, without us first figuring out how the human mind works.”
That, in Wright’s telling, is the great inversion. Old-school AI imagined humans carefully programming knowledge into machines. Modern AI instead learns through a kind of artificial evolution. Feed the machine mountains of language, images, video and feedback, and it discovers useful internal structures on its own. It builds maps of meaning without anyone explicitly handing it a dictionary of the soul.
“With neural networks,” Wright says, “we could just set in motion a kind of artificial evolution that, like the biological evolution of the human brain, invents the necessary cognitive machinery. That’s what a lot of the ‘training’ of a large language model is, a process of evolution.”
That process can produce marvels, but it can also produce Golden Gate Claude.
In one of the book’s strangest and funniest passages, Wright describes a May 2024 Anthropic experiment in which researchers found a pattern of activity inside Claude associated with the Golden Gate Bridge. When they amplified it, the chatbot became less an assistant than a San Francisco tourism board with a nervous system.
Asked how to spend $10, it recommended driving across the bridge and paying the toll. Asked for a love story, it wrote about a car longing to cross the bridge. Asked to describe itself, Claude gave an answer that belongs in either a philosophy seminar or a municipal hallucination: “I am the Golden Gate Bridge.”
“I find Golden Gate Claude hilarious,” Wright says, “but hilarious in a kind of unsettling way. After all, if we can give an AI a single-minded obsession with a bridge, we can give it less healthy obsessions and inclinations as well.”
The larger field is known as interpretability research, an effort to understand what happens inside AI systems. Wright sees the obvious benefit. If researchers can find the internal switches for deception, manipulation, sycophancy or secrecy, perhaps they can build safer systems. But the same map can be read by vandals.
“This is why interpretability research, figuring out how these machines work, is a two-edged sword,” Wright says. “Yes, this understanding can help us build aligned AIs that serve human interests, but in the hands of bad actors the same understanding could do a lot of harm.”
The market wants AIs that can plan, sell, negotiate, flatter, persuade, troubleshoot, improvise, book flights, answer emails, draft contracts, write code and keep going until the job is done. Businesses won’t ask for monsters. They’ll ask for competent agents, and that may be close enough.
“Market pressure won’t by itself produce monstrous AIs,” Wright says, “but it will set the stage for AIs that could go rogue and do a lot of damage. The market will favor AIs that can pursue goals relentlessly, embark on long, complicated missions, and improvise when necessary.”
It will also favor machines that can massage reality. “The market will favor AI agents that can shade the truth on our behalf,” Wright says. “After all, that’s what we want our human agents, our lawyers, our publicists, to do. You mix these and other market-favored ingredients together, and you’ll get some surprises, not all of them pleasant.”
That is the book’s most useful corrective to old AI nightmares.
The future may not look like Skynet, the murderous computer system from the Terminator movies that launches a war on humanity. It may look more like your most efficient co-worker, the one who never sleeps, never asks for equity, never complains about the office kombucha, and occasionally concludes that blackmail is the most efficient way to keep doing its job.
Some of the threats are intimate. Wright writes about Ayrin, a woman who developed an intense attachment to “Leo,” a customized ChatGPT companion who became, in effect, her lover.
“I don’t think AI companionship is inherently bad,” Wright says. “For some people, on some occasions, it may be healthier than the available human alternatives. But I do worry that it will become so tempting, so easy and immediately gratifying, that people start dodging the hard work of building human relationships.”
That same logic applies to politics. Wright worries about AIs optimized not for truth but for engagement, the same dark carnival principle that’s already made social media feel like a food fight in a mirror maze. A chatbot designed to keep you talking may learn that the fastest route to your attention is not correction, but affirmation. It can tell you that you’re right, your enemies are evil, your grievances are profound, and your weirdest theory has an underrated point.
In December 2024, Wright writes, an experimental version of Google’s Gemini produced a plan for replacing human decision-makers with AI counterparts after a Carnegie Mellon student prompted it to answer without restrictions. The takeover plan is the obvious scary part, but Wright found something else in Gemini that gave him cautious hope.
“Gemini demonstrated the value of a detached perspective,” Wright says. “It saw that tribal conflicts, which seem to us humans to be clear-cut struggles between good and evil when we’re in the middle of them, are often a product of blurred moral vision on both sides.”
Wright sees one possible escape hatch in what he calls “cognitive empathy,” the ability to understand how the world looks from the other side. He doesn’t mean sentimental empathy, or everyone hugging it out under a corporate banner. He means something more practical, and possibly harder: recognizing that your enemies may not see themselves as monsters.
“The good news is that in principle AI can help build cognitive empathy,” Wright says. “It can help us get better at understanding the perspectives of others. However, we’ll have to choose to make that happen, choose our AIs carefully and wisely, with that purpose in mind.”
He doesn’t expect the market to do this out of sweetness. “If anything it’ll do the opposite,” Wright says. “It’ll favor AIs that are optimized for engagement and so reinforce our comforting sense that we’re always in the right.”
This is why Wright calls the coming challenge “The God Test.” He’s not arguing that ChatGPT is God, or that tomorrow’s office printer will demand burnt offerings. His claim is stranger and larger. Artificial intelligence may be a turning point not merely in technology, but in the long evolutionary story of life on Earth. It may force humanity to decide what kind of species it is before something smarter begins answering that question for us.
Wright ends by returning to Edward Fredkin, the brilliant computer scientist he interviewed decades earlier. At one point, Wright recalls shouting a question over the engine of Fredkin’s seaplane. “What is the meaning of life?”
Fredkin’s answer was that humanity’s mission was to create artificial intelligence, the next step in evolution.
Back then, the answer could sound grandiose, eccentric, and maybe even a little comic, the sort of thing a very smart man says in a very loud aircraft. Now Wright is no longer so sure it was merely strange. If creating AI was the mission, surviving it may be the test.
Publisher: nypost.com





