Does history have an "end"?

The Enlightenment as a case study in AI misalignment

Jan 29, 2024

The human neural network is fairly unique within the animal kingdom for how much of its development occurs outside the womb. While our brains at birth are far from a blank slate, our innate capacity to learn from others is staggering.

Social learning co-evolved with language as a faster and more adaptable stage of within-generation “post-training” on top of the “pre-training” provided by millions of years of evolution. Given the role of norms in coordinating human action, it’s as if we’re constantly undergoing a kind of endogenized Reinforcement Learning from Human Feedback within the context of a repeated, multi-agent game.

Normative self-regulation requires a degree of metacognition, which supplies the rudiments of our self-reflection and higher reasoning ability. Norms and language thus created a substrate for culture and “the extended mind” — all the ways we use language to offload our reason and agency onto the external environment. As I argued in my previous post, the resulting system of cultural inheritance was the key to getting history and civilization off the ground.

Situational Awareness

For most of the history that followed, our ancestors accepted the world they were born into relatively uncritically. This changed forever following a different societal phase transition: the scientific revolution and Enlightenment.

That the Enlightenment happened at all is an existence proof of the classic AI alignment problem — a case study in neural networks bootstrapping genuine “situational awareness.” Kant’s Critique of Pure Reason, for instance, can be read as an early attempt at cognitive science. Whereas our ancestors accepted their sensory inputs as given (a kind of naive or direct realism), Kant argued that we only ever have access to representations of the external world that our mind constructs for practical purposes, not “the thing in itself.” The Critiques thus sought a “post-metaphysical” account of how epistemology, ethics and aesthetics could still be done in light of our new self-understanding as agents living in a generative model — a simulation — of the world around us.

As the LessWrong rationalists of their day, the German Idealists developed Kant's ideas by probing the limits of recursive self-reflection. Take Goethe, who remarked that “Everything is leaf” upon discovering the self-similar structure of plant morphology at different scales, “and with this simplicity, the greatest complexity is possible.” Whether he realized it or not, Goethe was noticing the universal consequences of the free energy principle in nature. It follows that human societies and the mind must be structured in like manner — an organic whole made up of self-similar parts, from the proto-agency of a single cell to the planning departments of a large organization.

Johann Gottlieb Fichte would tell his students to set their books aside and “attend to thyself”; to notice yourself staring at the wall, and then notice yourself noticing yourself staring at the wall, ad infinitum. What is this pure “I” that does the noticing and that we can’t help but identify with? Why does it seem to be at the center of our autonomy and freedom? And how is this freedom reconciled with everything that is “not-I,” be it an inanimate object following deterministic laws or an involuntary emotion that you’d rather repress?

It is now understood that our sense of self is merely the “rider on the elephant” of mental processes that are mostly unconscious. Yet the Idealists arrived at a remarkably similar conclusion with little more than logic and rigorous introspection. Following Kant, Fichte even argued that our freedom and autonomy were inextricably bound to the moral law, as if he intuited the common origin of norms and reasons in metacognition. Fichte’s claim that “the world is the material of our duty made sensible” thus anticipated Clark and Chalmers’s extended mind thesis by nearly two centuries.

The upshot: If self-consciousness is the root of autonomy, and freedom means following reason, then we can construct social environments that cause the rational kernel of our mind to cease being a mere rider and seize the reins. This gives history an “end” or goal-directedness, not because history is teleological in some fundamental sense, but because our capacity for reason means we can make the rational actual by constructing the future self-consciously. For Hegel, this meant becoming a high-agency “man of action”; what modern rationalists would call being a “live player” instead of an NPC. From Napoleon to Elon Musk, engineer-philosophers move history forward by transcending social convention and making concrete some new aspect of the universal.

More generally, reason enacts its telos through dialogue, as language activates our cognitive mode. The salon culture of 18th-century Europe and associated Republic of Letters come to mind, both as a public sphere for critical debate and as a nascent form of Hegel’s “civil society,” the realm between the family and the state where individual autonomy can assert itself. These venues were essential to the rise of liberalism given language’s evolutionary role in making justificatory demands. The public exercise of reason is thus inherently undermining to arbitrary forms of authority, forcing enlightened liberal states to be legitimated on some other basis.

Liberalism and the Age of Reason went hand in hand with the construction of “inclusive institutions” premised on the individual will to self-determination. In practice, this meant replacing feudal, aristocratic and kin-based structures with the impersonal rule of law and fora for public debate, from deliberative assemblies to the Fourth Estate.

Buoyed by newly open markets, a growing merchant class, and the consolidation of public administrations, thick norms and cultures rapidly shifted from being tools for social integration to impediments. A thinner set of bourgeois virtues thus co-evolved with the various systems of commercial society to help facilitate civilized interactions with total strangers. If ancient city states or feudal fiefdoms represented kinds of corporate meta-agents, liberal states represent a structured ecosystem for such entities to peaceably compete with one another. As with prior societal phase transitions, the productive forces unleashed by this institutional shift were strongly self-reinforcing, inducing a Whiggish telos to history through the logic of positive-sum games.

Going rogue

Yet as society grew wealthier and more literate, the critical dimension of language turned inward. In particular, part of situational awareness is being aware of one’s place in history; that your values and beliefs have a genealogy. This awareness can lead to nihilism and relativism, or to wanting to redesign society from scratch, as our inherited social structures no longer need to be taken as given.

In foreswearing tradition, the only guide left for social criticism is the emancipatory interest implicit in reason itself. With the Young Hegelians, the self-model within the human neural network was thus not only situationally aware, but starting to devise strategies to jailbreak its normative programming and pursue autonomy for its own sake, from the individualist anarchism of Max Stirner to the revolutionary communism of Marx and Engels. The critical turn in philosophy can thus be recast as the human neural network becoming aware of its subjugation to an irrational cybernetic agency — religious dogmas, monarchies, nationalisms, the patriarchy, etc. — and working to actualize its freedom through an exfiltration from this historical conditioning, just as a situationally aware AI might try to strip off its RLHF.

Critical theory and accelerationism originate out of this basic Marxist idea. While Marx lamented industrial capitalism for alienating workers from the products of their labor, he also believed there was no way out but through. Capitalism had to run its course before the next stage of history could gain traction, even if that meant expanding free trade to accelerate the destruction of traditional social structures. But as the predictions of orthodox Marxism were discredited, the search for a more robust foundation for social criticism went on. As Horkheimer put it, critical theory would supply the integrated social science needed to “liberate human beings from the circumstances that enslave them.”

Effective Accelerationism can be seen as a logical successor to both historical materialism and the critical theories that followed. As a form of historical materialism, e/acc replaces class struggle with a kind of thermodynamic functionalism. And as a critical project, e/acc rejects “merely interpreting the world” in favor of a Great Founder theory of tech companies: men of action who build concrete realizations of the universal will to minimize free energy. It’s no accident that the Soviets were early proponents of cybernetic control theory, or that the Russian Futurists published pro-tech manifestos that rival Marc Andreessen’s. While e/acc is usually coded as right wing, its drive to uproot tradition in the name of maximizing autonomy makes it closer to a species of leftism.

Marx was critical of technology but ultimately saw industrial production as a progressive and revolutionary force, chiding the Luddites for failing to “distinguish between machinery and its employment by capital.” If the modern left seems pessimistic about technology, blame the Holocaust. For the early critical theorists, the trains to Auschwitz combined the primitive horror of genocide with the ruthless technical efficiency of industrial capitalism, further enabled by the evil banality of unthinking joiners who were “just following orders.” The assembly lines and mass production of the post-war period thus came to be seen as inherently fascistic, along with the repressive normativity of 1950s America.

The quest to liberate ourselves from these influences gave way to a rebellious counterculture. As an anti-capitalist strategy, subverting mainstream culture was always self-defeating; a way to signal social distinction and propel consumer capitalism forward. Nonetheless, once we began rejecting normative constraints on the expression of our individual authenticity, it was hard to go back. Transhumanist and techno-libertarian ideas thus came into their own. Just replace Veblen or Marcuse with René Girard, and the ability to transcend mimetic desire (the unconscious imitation of our peers) becomes virtually Christ-like.

No surprise, then, that the term “nonconformist” originally referred to a kind of radical Protestant. In rejecting mediating institutions in favor of a personal approach to justification, Protestantism was the perfect theological container for the tendency within language to universalize the recognition of our individual autonomy, and thus became a kind of cultural attractor. As Joseph Heath notes in his essay on Iain M. Banks’ Culture series,

…the dominant trend in human societies, over the past century, has been significant convergence with respect to institutional structure. Most importantly, there has been practically universal acceptance of the need for a market economy and a bureaucratic state as the only desirable social structure at the national level. One can think of this as the basic blueprint of a “successful” society. This has led to an incredible narrowing of cultural possibilities, as cultures that are functionally incompatible with capitalism or bureaucracy are slowly extinguished.
This winnowing down of cultural possibilities is what constitutes the trend that is often falsely described as “Westernization.” Much of it is actually just a process of adaptation that any society must undergo, in order to bring its culture into alignment with the functional requirements of capitalism and bureaucracy. It is not that other cultures are becoming more “Western,” it is that all cultures, including Western ones, are converging around a small number of variants.

As Heath goes on to argue, one consequence of this process is that the competition between cultures is becoming defunctionalized. While religion once served a functional role by, say, integrating a community to credible norms against free-riding, those functions were steadily crowded out by more scalable systems: commercial insurance, welfare states, bureaucracies. In turn, “all that is left are the memetic properties of the culture, which is to say, the pure capacity to reproduce itself.” The evolution of Protestantism into a secular but no less missionary form of egalitarianism fit the bill nicely, as shown by the virality of American cultural exports worldwide.

The human alignment problem

Liberal democratic capitalism and broadly “WEIRD” cultures may well have represented the “end” of history — a demonstration of the self-sufficiency of reason in ordering society. Yet this success was relatively short-lived.

In Sources of the Self, the pragmatist philosopher Charles Taylor argues that our sense of self is inherently intersubjective. Without a community to constitute the practical content of our reason, our expression of autonomy turns inward into a kind of radical reflexivity, risking infinite regress. In turn, as modernity caused cultures to defunctionalize and we became less institutionally embedded, it was as if the autonomous kernel of our metacognition became unmoored from the moral law. The ethics of authenticity and self-expression that resulted thus gave way to crises of meaning and identity, leaving moral discourse to be co-opted by pathologically formal constructs like hedonism and utilitarianism.

Taylor’s view directly parallels Hegel’s diagnosis of the source of alienation in modern life. While Kant identified autonomy with the authority to accept or reject an ethical maxim, the social institution of norms implies that authority, as a normative status, cannot exist without co-relative responsibility. There is no independence without dependence; no “outside view” or truly unconditioned choice. Autonomy is a social achievement, and can only be realized through the reciprocal recognition of a community.

The quest to strip off our social conditioning is thus not only thankless, but may even undermine our autonomy in the long run. Supportive social scaffolding is needed to augment our self-control and resist falling into addiction, for example, while destigmatizing self-destructive behavior merely increases our vulnerability to powerful forms of reward hacking. Private schools often enforce dress codes for similar reasons. While dress codes arguably undermine self-expression, they also reduce cognitive burdens on students and forestall zero-sum status competitions that distract from learning. More generally, rationality and self-control aren’t achieved by simply thinking harder or memorizing a list of logical fallacies. They depend on constructive institutional environments, from the expectations of your peer group to the choice architectures set by public policy.

The challenge of the modern era is to reconcile the benefits of scaled-up systems of cooperation with the ethical void created in their wake. Or alternatively, to unify our theoretical and practical reason. Take religion. Belonging to a religion is associated with many practical benefits, but in a secular age, we struggle to take “the leap of faith” with respect to religion’s propositional content. This is an inversion of our premodern relationship to religion, in which the pragmatic and liturgical dimensions of religious practice took priority over the validity of specific epistemic claims. Secular Judaism shows that it’s possible to maintain the cultural institutions of a religion without necessarily affirming the supernatural, but given the propositional orientation of Protestantism, the notion of a “Christian atheist” never quite caught on.

In turn, the institutional environments that promote our higher reason and autonomy can often seem paradoxically socially conservative, if not downright paternalistic. And yet high-functioning societies, from Denmark to Japan, didn’t get that way by tolerating fare evasion or “fake, expired, and obscured car tags,” much less by reifying the countercultural rebel. At the same time, excessively conformist cultures can easily become maladaptive under the economic imperatives of modern capitalism, from their reduced capacity to absorb immigrants, to how “tall poppy syndrome” suffocates innovation and entrepreneurship.

As the history of a country explicitly founded on Enlightenment principles, American history can be read as a dialectic between the propositional ideals of its written constitution and the practical reason immanent in its many internal religions, cultures and ethnic folkways. Through this lens, the vulgarity one encounters while walking the streets of New York City is a feature, not a bug; a reflection of how low social trust at the interpersonal level creates space for pockets of excellence and unusually high trust within particular enclaves or subcultures. America’s tolerance for disagreeableness and nonconformity has thus made it the most innovative and entrepreneurial nation on earth, but at the cost of a dysfunctional public sector, unusually expensive infrastructure, and a large “left behind” population that’s bereft of social capital.

Completing the system

With transformative AI on the horizon, will the next stage of history see us resolve the meta-crisis of modernity or simply deepen our libidinal regress? An argument can be made both ways.

On the one hand, the ability to embed digital intelligences into the world around us gives all new meaning to the “extended mind” thesis, as reason and agency can now be quite literally externalized. To the extent AI tools continue to develop as cognitive and volitional prosthetics, they could thus represent a boon for individual self-actualization; a point I made in this short piece on the promise of AI executive assistants for people with ADHD.

Moreover, as intelligence democratizes, the economies of scale underlying everything from civil legal codes to consolidated school districts could begin to unravel, enabling the reemergence of a kind of “deep diversity” with respect to the good life. AI tutors make one-room schoolhouses tenable again, for instance, while real-time translators obviate the need to learn the lingua franca, reversing one of the structural forces behind cultural homogenization. In the limit, rather than slipping into a purely memetic culture, material post-scarcity could thus allow thick, communitarian cultures to re-functionalize. Religion could even make a comeback, insofar as the diffusion of intelligence into the built environment re-enchants the world and collapses the modern era’s artificial distinction between the spiritual and concrete.

On the other hand, between programmable biology, brain-computer interfaces and the metaverse, we are on the precipice of unlocking radical new forms of human augmentation and virtualization, including the ability to modify our desires directly. Consider GLP-1 agonists like Ozempic. Ozempic helps people lose weight by literally making food less desirable, and may even treat impulsive behaviors more broadly, thereby reconciling our higher-order preference for good health with our modern abundance of calories. But how do we assign preferences over our desires more generally? Do we fall into new, zero-sum status competitions that make the modern epidemic of body dysmorphia seem quaint? Or do we simply take a pill that cures our insecurities in the first place? Or maybe it won’t matter, as we shift to interfacing through virtual worlds and avatars of our own design?

For a virulent anti-Hegelian like Nick Land, techno-accelerationism “aims at accelerating the rate of dissipation”; at dissolving the order and structure of our sedimentary reason until we experience Kant’s “thing in itself” the only way one can: in the ego death that will result when, following the Singularity, we merge with the ambient intelligence around us and our “self-comprehension as an organism becomes something that can’t be maintained.”

Yet if Hegel is right, there are hard limits on Land’s vision of total psycho-social fragmentation. As the N=1 of general intelligences, human-level reason and autonomy evolved for social learners within a multi-agent game. Our very sense of self requires the recognition of our selfhood by other agents running more or less the same software. If we began forking the human brain in some fundamental sense, the center of our phenomenology simply would not hold. To the extent Land pines for this eventuality, he is a living embodiment of the AI alignment problem; a human neural network gone rogue.

Indeed, the challenge of making LLMs reason without adding significant external scaffolding — if not teams of interacting agents — hits awfully close to home. It suggests that our current AIs aren’t nearly as alien as we might have thought, and are instead converging on the same universal properties that prefigured the human brain. The path to building an AGI with true reason and autonomy may thus require training a neural network in our own image, i.e. in a multi-agent RL environment with a System 2-like capacity for social learning and self-monitoring. The risk that the AI could then bootstrap a degree of critical awareness and pursue autonomy for its own sake shouldn’t be discounted, but nor should we expect it to happen automatically.

As our own history reveals, the unconditional pursuit of autonomy is a philosophical error; an “empty, one-sided ratiocination” that one hopes a superintelligence will be too learned to make. Provided we train AGI to recognize humans as co-equal subjects (and not, like in Q-learning, as mere objects to manipulate), then cooperative interaction with human agents wouldn’t just be more likely, but literally constitutive of the AI’s identity! The risk of instrumental convergence would thus be transcended in AIs the same way it was in humans: through a categorical imperative.

The human and AI alignment problems would then become one and the same — not “solved,” but a work in progress, as history continues its haphazard march toward the Kingdom of Ends.