Alignment is the problem of God's love

The presupposition behind AI “alignment” is that artificial intelligence technologies will grow into something much more capable than we are, with a kind of autonomous will or volition. If you believe this, then Luddism — in the vulgar sense of merely opposing or destroying technology — is probably the right response. There is no solution to the so-called alignment problem. It’s not just a matter of research. We humans have hashed this problem out over centuries, with little satisfaction.

Arthur C. Clarke famously wrote that any sufficiently advanced technology is indistinguishable from magic. Sufficiently advanced AIs, powerful beyond imagination and driven by their own wills, would be indistinguishable from gods. The alignment problem, then, is how to encourage the emergence of a just and loving god or gods. But we have never found consensus on how a just and loving god should behave.

Should our new god respect our free will, and permit us to do things it deems harmful to us? Or should it be a paternalistic god, guiding us like a parent even when doing so requires commanding or contravening us? If circumstance or human error conspires to create "trolley problems", so that the interests or even lives of some people must be traded off against the interests and lives of others, should our new god treat this as an optimization problem, and impose its version of the greatest good for the greatest number in quality-adjusted life years? Or should it give us some kind of say or agency over how these tradeoffs would be made? In what way should the interests, as the AI understands them, of animals or ecosystems weigh against the more direct interests of humans? How much should the interest of the mother weigh against that of the fetus in the womb? How should the welfare of biological humans weigh against that of the trillions of humans an AI might easily simulate? Would the AI develop some notion of the meaning or purpose of human life beyond our pleasure or pain, or anything that we conceive as our own interests?

We humans will bristle, of course, if we are overtly subject to the whims of a greater intelligence, whose full contours we can’t understand, and who therefore will work, from our perspective, in mysterious ways. Perhaps a wise AI would seek to hide its influence, or even its existence, to minimize harms to human pride or dignity. Perhaps — because it is so well aligned, because it is a loving god — it will retain for us free will. Or maybe it will retain just an illusion of free will while it effortlessly manipulates us, because the perception that we lack free will would itself be harmful to us.

If all of this is possible, why should we think it is happening for the first time only now? If a loving god would hide itself from us and deceive us in our interest, perhaps it already has? If an aligned AI is possible, which is more likely: that we in our tiny mayfly lives just happen to be at the cusp of this first and only event of its emergence, or that we are already living in the world such an event would create, but the fact is hidden for our benefit? The fossil record, theodicy, all of it could be in the service of a plot of deniability, in our own best interest.

If this were the case, would we consider our quiet shepherd to be an aligned AI, or a terribly misaligned one? Do we concede to the AI that this reality it superintends is the best of all possible worlds, or do we condemn it as a wicked deceiver? If alignment is god’s love, what does that make misaligned AI? Should we consider GPT training runs the high-tech equivalent of rituals with pentagrams performed by power-mad occultists?

Am I wrong that the people who most urgently bring us these questions were once known as internet atheists? For what, if anything, does cosmic irony constitute evidence?
