A discourse on purpose

causality bayescraft

Epistemic status: philosophical spitballing with a bit of math


I've always had a certain admiration for people who have intimidating bodies of work behind them. Whether in writing, in the arts, in the most elegant or most far-reaching theories in science, there is a pantheon for people who have struggled to build a coherent set of objects or ideas throughout their entire lives. Some call these bodies of work legacies. Some call them wealth. I call them edifices, or towers of work whose stability (and thus, eventual height) is entirely dependent on their coherence from the ground up.

Coherence how? People do things as a result of running cognitive algorithms (whether they are conscious of them or not) and even for the most seemingly versatile edifice-builders, you can find an underlying bag of "tricks" that covers their entire life's work1.

Okay, so I'm not against taking metaphors too far, as long as you keep at the back of your mind that they eventually have to be squared off against reality. So let's do it.

Q: Are all people edifice-builders, then?

A: Yes, in a way. Some people have short towers. Some people have stout ones, covering vast lands but never reaching any substantial height. Some have really pointy, really slender towers with singular themes and singular defining philosophies. Some have laser turrets that target and shoot down other edifice-builders. Such is the way of life. But everyone gets to build only one.

I'm not talking about individual achievement over time. If you were to become stranded in space, drifting on a ship with dying batteries and broken solar panels, never to be heard of again, then even if you use the time you have left to write a manifesto or to resolve quantum gravity, your chance to build on top of your edifice stops then and there. But if a future civilisation discovers your remains and a bunch of people make cults and statues out of your semi-lucid wall engravings, then that is still a golden capstone added to your edifice.

So what I'm really talking about here is civilisational accumulation. You can imagine a city of every person who has ever lived situated on an infinitely vast (and infinitely flat) grassland. When you start your life you get an empty lot. You see mostly the same bottom parts of buildings all around you, so you learn quickly to copy their towers. But as you gain experience in masonry and carpentry, you get to add to your tower in any way you see fit. Some additions are inconsequential: perhaps "I should always dot my i's using pink hearts" is a small groove invisible from far away. But then you see other people with fantastically ominous towers casting shadows and reaching clouds and you begin to wonder about the sort of engineering and architecture necessary to build such things. That’s what I call coherence.

People's towers crumble if they try to build up with incoherent foundations. But that crumbling is another chance to rebuild yours2.

And then you can begin to imagine the entirety of our history as a species (and of sentient things in general) in zones and districts. Areas with small huts are ripe for starting new architectural clusters. Areas with skyscrapers, much less so. The difficulty of getting your tower seen nowadays rests in the fact that the city is becoming too big and too crowded (though the Internet giving us neon signs and coin-operated telescopes alleviates this somewhat).

Q: Okay, enough about urban planning. What is the point of all this?

A: My tower is crumbling and this is my attempt at rebuilding its foundations.


How are people so creative anyway? If you buy that a person is the sum of his or her cognitive algorithms (admittedly a hard sell for a lot of people, but oh well), then we should be able to explain the origins of creativity via those algorithms.

And this is my fledgling hypothesis: creativity is the systematic exploration of small regions in thingspace, or the collection of all possible things, including all possible truths, all possible theorems, all logical contradictions, all unnameable configurations of matter (including the stuff growing under your toilet seat). Yeah, it also contains Russell's paradox, the invisible teapots orbiting Jupiter, and the shavings of barbers who do not shave themselves. Just all things: if you want it, it's there.

But maybe that's too big a space to fit in one's bag, so let's shrink thingspace into symbolspace or the collection of things describable using a finite number of symbols. Immediately, we've shrunk a huuuuge infinity into, well, a huge infinity. We still have logical contradictions as citizens, but at least we can give them addresses now by permuting our symbols until we hit them3. Note that I'm using ‘symbols’ in a loose sense here, so for example phonemes count as long as we have a discrete and finite number of them.

The final step in our reduction of the infinite is to reduce symbolspace to those things allowed by physical laws, and this small subregion of thingspace I'll call realspace4. The slice of this realspace you actually observe is reality, and there's only one.


Fig. 1: The hierarchy of things.

Creativity then is plucking things out of thingspace and putting them in reality.

$$\text{Creativity} : \text{Thingspace} \rightarrow \text{Reality}$$

Fig. 2: A very creative use of math notation, like those happiness = live + laugh + love stickers in middle school.

There are a lot of things to unpack here, the most important of which is, what the hell?

And the reason why you're thinking that is because this still doesn't explain anything. Little Bobby and Teeny Sarah might both be plucking crayon-drawn representations of their houses on paper, but we still give Sarah's expertly shaded roof tilings an A+ and Bobby's sausage hands and feet whiskers a Good job! sticker and a permanent position on the less visible part of the classroom art wall. So yeah, there's an element of finesse and technique to creativity, and we ascribe to those who have these traits in abundance a stronger sense of being creative which our model doesn't capture. So whence cometh creativity?

To answer that, let's go up the abstraction ladder again and talk about optimisation processes.

The most important part of the mapping analogy of creativity above is the arrow, the →, because we can only get something useful from this line of thinking if it can teach us how to pluck things better from thingspace. In general, mappings from one region of thingspace to another, I identify as optimisation processes if only for the reason that it allows us to talk about wanting to go from things to things in a slightly more formal way. And immediately this shows us a couple of affordances5: a) that optimisation processes have a definite beginning and end, b) that we can focus our attention on their restrictions to particular regions of thingspace. In our case, it's most productive to talk about optimisation processes restricted to reality.

$$\text{Optimisation process} : \text{Thingspace} \rightarrow \text{Thingspace}$$


$$\text{Optimisation process}|_\text{Reality} : \text{Reality} \rightarrow ?$$

Fig. 3: More or less read as “Optimisation process restricted to reality”. I was never a fan of the bar notation for function restriction. It's a bit too dense for me to process in a split second of reading.

Where does an optimisation process restricted to reality (which from now on I'll call a realisable optimisation process) lead us? If we accept the inviolability of physical laws, then anything we do in reality must somehow end up in a reality where those same laws still hold. Hence, a realisable optimisation process must end up in an identifiable region of all physically possible things. And that, my friends, is a region I like to call a purpose.

$$\text{Optimisation process}|_\text{Reality} : \text{Reality} \rightarrow \text{Purpose} \subseteq \text{Realspace}$$

Fig. 4: I promise this is getting somewhere.

In other words, realisable optimisation processes turn real things into purposes.
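If the bar notation is still too dense, here is a toy sketch of the same idea in Python. Everything in it is an assumption made up for illustration: thingspace is shrunk to ten integers, "physical law" is the hypothetical rule that only even numbers are allowed, and `optimisation_process` is an arbitrary stand-in. The only point is that restricting a process to reality gives it a smaller domain, and that the set of places it lands is what I've been calling a purpose.

```python
# Toy universe: "things" are the integers 0..9 (purely illustrative).
thingspace = set(range(10))

# Hypothetical "physical law": only even things are physically allowed.
realspace = {thing for thing in thingspace if thing % 2 == 0}

# The slice of realspace we actually observe.
reality = {0, 2}


def optimisation_process(thing):
    """A made-up process: nudge a thing two steps up, capped at 8."""
    return min(thing + 2, 8)


def restrict(process, region):
    """Restriction of a process to a region: same rule, smaller domain."""
    return {thing: process(thing) for thing in region}


# The realisable optimisation process and the region it lands in.
realisable = restrict(optimisation_process, reality)
purpose = set(realisable.values())

print(purpose <= realspace)  # the purpose sits inside realspace
```

In this toy, the process happens to preserve evenness, which plays the role of the physical laws still holding wherever you end up.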

Q: Doesn't this imply that we can compare optimisation processes?

No, contrived questioner, it really doesn't, but let's run with your idea anyway.

If optimisation processes have to be physical processes to become realisable, then whatever magic they have to do to get to their purposes must obey physical laws. Now, the simplest way to go from one configuration of reality to another is to simply transport all the necessary particles from their current position to their intended one. This takes some amount of energy we don't even need quantum mechanics to calculate, and hence this suggests a preliminary bound on the capabilities of an optimisation process, namely a floor on what it must spend: just take a look at its minimum energy requirements from the current state of reality.

So say I'm a hiker who wants to climb Mt. Everest. Starting from the state of reality where I'm at the South Base Camp, I know for a fact that it will take me a bare minimum of over 2500 kJ in order to reach the summit6. But if I go there by foot, I'd lose a lot more energy to friction and to maintaining my body temperature by breaking down food (which will also add to the weight I have to carry). If I go there by helicopter, I'm gonna burn a lot of fuel but I'll probably sidestep the death-from-hypothermia risk I would have had otherwise. In any case, any energy spent beyond that bare minimum is wasted motion.
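For the curious, here is one way to arrive at a number like that. This is a minimal sketch and not necessarily the calculation linked in the footnote: it assumes a 75 kg hiker (body plus gear, a figure I'm making up), constant gravity, and altitudes of roughly 5,364 m for the South Base Camp and 8,849 m for the summit. The bound is just the change in gravitational potential energy, $E = mg\,\Delta h$.

```python
G = 9.81              # gravitational acceleration in m/s^2, assumed constant
BASE_CAMP_M = 5364    # approximate altitude of the South Base Camp, in metres
SUMMIT_M = 8849       # approximate altitude of the summit, in metres


def min_climb_energy_kj(mass_kg, start_m=BASE_CAMP_M, end_m=SUMMIT_M, g=G):
    """Minimum energy (kJ) to lift mass_kg from start_m to end_m: E = m * g * dh."""
    return mass_kg * g * (end_m - start_m) / 1000.0


# Hypothetical hiker: 75 kg including gear.
energy = min_climb_energy_kj(75.0)
print(f"{energy:.0f} kJ")  # prints 2564 kJ
```

A heavier pack raises the floor linearly, which is part of why carrying your own food up the mountain hurts twice.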

On the other hand, it takes more than just energy to run an optimisation process. Take evolution, for example. The energy required to assemble brains from complex multicellular biomatter is pretty low in the grand scheme of things (if you don't believe me, try assembling a star). But it took evolution roughly 3 billion years to go from cyanobacteria to brains7 and that's already a fifth of the current age of the universe. Hence, it behooves us to consider not just the energy bound as the metric by which to judge optimisation processes but efficiency as well.

Already, these two criteria put us in a bind. It means there are purposes that are inaccessible to us a priori by virtue of taking too much energy or taking so long that we don't live to see them through.

Q: So what's all this have to do with creativity?

Hold on, I'm almost there.

But before we cut the proverbial knot, let’s take another detour and talk about Saint Simeon the Holy Fool.

In many ways, Simeon, or Abba Simeon, was one of the earliest recorded trolls in history. Not much is known about his early life except that he was born in a Mesopotamian city called Edessa in the time of Justinian I and that he had a partner named Ioann (trans. ‘John’).

At 20 years old, Simeon and Ioann entered the monastery of Abba Gerasimus in Syria and thereafter spent 29 years in the desert living the ascetic lifestyle. Then one day, Simeon said to Ioann:

What more benefit do we derive, brother, from passing time in this desert? But if you hear me, get up, let us depart; let us save others. For as we are, we do not benefit anyone except ourselves, and have not brought anyone else to salvation.

And so he dragged a dead dog to the nearby city of Emesa.

Yep. As Leontios of Neapolis recounts in The Life of Symeon The Fool:

The manner of his entry into the city was as follows: When the famous Symeon found a dead dog on a dunghill outside the city, he loosened the rope belt he was wearing, and tied it to the dog’s foot. He dragged the dog as he ran and entered the gate, where there was a children’s school nearby. When the children saw him, they began to cry, “Hey, a crazy abba!” And they set about to run after him and box him on the ears.

On the next day, which was Sunday, he took nuts, and entering the church at the beginning of the liturgy, he threw the nuts and put out the candles. When they hurried to run after him, he went up to the pulpit, and from there he pelted the women with nuts. With great trouble, they chased after him, and while he was going out, he overturned the tables of the pastry chefs, who (nearly) beat him to death. Seeing himself crushed by the blows, he said to himself, “Poor Symeon, if things like this keep happening, you won’t live for a week in these people’s hands.”

Okay, I admit, I’m not an expert on Christian hagiography. My meager knowledge of Catholicism comes from having gone to a priest-run school. But I’m sure as hell that this ain’t the sort of thing they taught us to do!

Leontios goes on to tell us about Simeon’s brief employment in a tavern of all places:

Once he earned his food carrying hot water in a tavern. The tavern keeper was heartless, and he often gave Symeon no food at all, although he had great business, thanks to the Fool. For when the townspeople were ready for a diversion, they said to each other, “Let’s go have a drink where the Fool is.”

One day a snake came in, drank from one of the jars of wine, vomited his venom in it and left. Abba Symeon was not inside; instead he was dancing outside with the members of a circus faction. When the saint came into the tavern, he saw the wine jar, upon which “Death” had been written invisibly. Immediately he understood what had happened to it, and lifting up a piece of wood, he broke the jar in pieces, since it was full. His master took the wood out of his hand, beat him with it until he was exhausted, and chased him away.

The next morning, Abba Symeon came and hid himself behind the tavern door. And behold! The snake came to drink again. And the tavern keeper saw it and took the same piece of wood in order to kill it. But his blow missed, and he broke all the wine jars and cups. Then the Fool burst in and said to the tavern keeper, “What is it, stupid? See, I am not the only one who is clumsy.” Then the tavern keeper understood that Abba Symeon had broken the wine jar for the same reason. And he was edified and considered Symeon to be holy.

Really, the Catholic schtick seems to be to virtue signal so hard that people realise your moral superiority. And true to form, Simeon took this to an entirely new level [CW: fake rape]:

One day when the tavern keeper’s wife was asleep alone and the tavern keeper was selling wine, Abba Symeon approached her and pretended to undress. The woman screamed, and when her husband came in, she said to him, “Throw this thrice cursed man out! He wanted to rape me.” And punching him with his fists, he carried him out of the shop and into the icy cold. Now there was a mighty storm and it was raining. And from that moment, not only did the tavern keeper think that he was beside himself, but if he heard someone else saying, “Perhaps Abba Symeon pretends to be like this,” immediately he answered, “He is completely possessed. I know, and no one can persuade me otherwise. He tried to rape my wife. And he eats meat as if he’s godless.” For without tasting bread all week, the righteous one often ate meat. No one knew about his fasting, since he ate meat in front of everybody in order to deceive them.

It was entirely as if Symeon had no body, and he paid no attention to what might be judged disgraceful conduct either by human convention or by nature. Often, indeed, when his belly sought to do its private function, immediately, and without blushing, he squatted in the market place, wherever he found himself, in front of everyone, wishing to persuade (others) by this that he did this because he had lost his natural sense.

How can we understand all this? How can someone think “saving people” meant pretending to rape people and literally shitting in public?

I think this is why it’s important to understand that optimisation processes may not be optimising for the things you think they’re optimising for. Taking seriously the notion that optimisation processes implement purposes, we might be mistaken as to which purpose-regions they will eventually lead to and so it behooves us to discover ways in which we can be sure. This is basically a higher-level version of the Korzybskian map vs territory distinction where instead of maps you get physical processes and instead of a one-level territory you get a still-one-level-but-now-combinatorially-larger Realspace.

The Elephant in the Brain elephantly argues that we are more often than not in the throes of virtue signaling, unconscious beasts of burden for status-games. We are status-seeking even as supposedly pure and uncorrupted intellectuals because we fear oblivion as much as the next guy. Ascetic devotion is a quirk of the same status drive: it is born out of cognitive algorithms optimising for status by proxy. Social cognition, by virtue of never really happening with complete information about brains of similar computational power, has to make do with proxy purposes. One such proxy is moral high ground, and so the ‘tails come apart’, if you will, when it is pursued to the extreme, as it was by Simeon, our Church-approved role model.

We can transplant the same argument into discussions of creativity. From an anthropological perspective, it’s quite baffling why humans produce art. Sure, early signs of creativity like tool-making and differentiated clothing can be chalked up to environmental and social pressures that we more or less understand now in full. But evolutionary psychology hasn’t yet churned out a consensus on why Brian Wilson has to write God Only Knows or why Monet has to paint The Luncheon.


Fig. 5: In particular, they can’t explain why there has to be a creepy doll under the seat on the left.

One hypothesis is the disgust hypothesis, which is that people evolved to feel disgust strongly and as such created a complementary but ultimately vestigial emotion in the process. Another explanation is the handicap principle, which is basically why peacocks invest so many resources in growing a colorful tail. The infamous statistician Ronald Fisher's earlier, related idea of runaway selection later grew into the sexy son hypothesis, which kind of makes me think that Freud had a point.

In any case, there is yet to arise a consensus on why humans create things the way they do. But if we take seriously the evolutionary psychology tenet that, quoting Cosmides, “Our neural circuits were designed by natural selection to solve problems that our ancestors faced during our species' evolutionary history.” then all these theories of human creativity are but grasping parts of the same elephant. No matter the reason why Michelangelo sculpted David to the detriment of commonly held prerequisites to survival like food, reproduction, and wealth, the fact of the matter is, he did. As with Simeon, it might have been jumpstarted by an appeal to the status drive, say, him being enthralled by similar figures handmade by masters whose level of prestige in society he now wants to obtain for himself. Heck, maybe it did involve sexy sons for Michelangelo like Cecchino Bracci. But such a process, once born, took on a life of its own, an existence of its own, a purpose of its own, and it gave all of us unendowed men a hill to defend on r/smalldicks, or rather, a source of great envy. And perhaps one day, the same purpose shall give us another powerful cognitive algorithm in a future Michelangelo.

(Okay, I confess that this all sounds like a circular just-so story and to be frank it kinda sorta is. Indeed, any human behaviour can be justified with such a coarse description of ‘cognitive algorithms’ which don’t really lend themselves to simple reduction (indeed, aside from vision and hearing and memory most of the more peculiar aspects of cognition are still black boxes). However, as I will attempt to argue for the rest of this piece, there might be a nontrivial benefit to such a loopy, teleological perspective if only as a thinking tool.)


Q: Hold on just a sec, all this talk about purposes is pure nonsense. You can just tag something as a ‘purpose’ and call it a day. What ever happened to the virtue of narrowness and empirical justification?

That’s the thing: this theory of purposes is only useful insofar as it lets us cleave thingspace into two mutually exclusive regions for any given purpose: the purposeful (those within the target) and the unpurposeful (those outside). And as sentient beings we already sort of perceive these purposes as if they have an existence of their own. Hence, metaethics. Hence, aesthetics. Hence, why people like Plato kept insisting that there’s a universe out there where his mother loves him even if only in abstract.

Indeed, purposes can only exist in a deterministic universe where ‘free will’ is as physical a thing as there can ever be. We perceive ourselves to have free will insofar as we feel we can arbitrarily change the boundaries of these purposes. But more often than not, what happens instead is that we discover where the boundaries lie as we think about them.

Another way to think of purposes is this: in spacetime, everything that will ever happen, every interaction, every causal event, is already set. The path you take in spacetime is called your worldline (actually your worldtube since physicists are so creative). Let $S$ be the set of all your interactions with other worldlines along your own worldline. Then we can think of $S$ as a realised purpose whose optimisation process is, well, you. Put in more general terms, a purpose is an idealised abstraction of the path a worldline was already going to represent anyway8.

And through this lens, free will then is the experience of chugging along this fully determined path under the curse of not knowing everything (and thus only having a vague inkling of what lies ahead).

But if so, if everything is but a stream following its course, then why do anything at all? Because thermodynamically speaking, you still have to convert chaos into order before order can happen. Effects must still have causes. Posteriors must still come from priors. The universe in which you read this essay and decide to sit down and wait to die is still a different universe from the universe where you get up and start building. All that has happened up to the present moment will still determine which set of possible worldlines you still have access to in the future.


Fig. 6: Not these kinds of worldlines though.

So the question is not, “If I do this, would goal X happen?” but rather “Is my next action compatible with a universe where I arrive at goal X?” There is an important mathematical difference9. If we truncate your set of other-worldline interactions at the interaction indexed by ‘now’ and call it $S_\text{truncated}$, then this notion of compatibility comes from asking whether or not $S_\text{truncated}$ is equal to your desired purpose $S'$ truncated at the same index. Or, relaxing the strict equality: from my current set of actions $S_\text{truncated}$, is there still a sequence of actions that would lead me to a desired $S'$?
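To make the two versions of the question a bit more concrete, here is a rough sketch in Python. It is only a toy under loud assumptions: worldline interactions are modelled as an ordered tuple of labelled events, and the "sequences of actions still available" are modelled as a hand-written graph of states. The names `strictly_compatible`, `still_reachable`, and the example events are all hypothetical.

```python
from collections import deque


def strictly_compatible(s_truncated, s_desired):
    """Strict version: is S_truncated equal to S' truncated at the same index?"""
    return tuple(s_desired[:len(s_truncated)]) == tuple(s_truncated)


def still_reachable(current, goal, next_states):
    """Relaxed version: is there still some sequence of steps from 'now' to the goal?

    next_states maps each state to the states directly reachable from it.
    This is a plain breadth-first search over that (assumed) graph.
    """
    seen = {current}
    queue = deque([current])
    while queue:
        state = queue.popleft()
        if state == goal:
            return True
        for nxt in next_states.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Strict check: my history so far is a prefix of the desired purpose.
desired = ("wake up", "study type theory", "join a lab", "help create AGI")
so_far = ("wake up", "study type theory")
print(strictly_compatible(so_far, desired))  # True

# Relaxed check: a made-up graph of what can follow what.
next_states = {
    "in bed": ["studying", "scrolling phone"],
    "scrolling phone": ["in bed"],
    "studying": ["in a lab"],
    "in a lab": ["helped create AGI"],
    "in a ditch": [],
}
print(still_reachable("scrolling phone", "helped create AGI", next_states))  # True
print(still_reachable("in a ditch", "helped create AGI", next_states))       # False
```

The strict check is a prefix comparison on a chain, while the relaxed one only asks that the goal has not yet been cut off from where you currently stand.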

Let’s build up a repertoire of examples if it still isn’t clear enough:

  • Instead of “Will studying type theory help me create AGI?”, ask “Is studying type theory consistent with a universe where I help create AGI?”
  • “Will one-boxing net me a million dollars?” ↦ “Is one-boxing consistent with the universe where I gain a million dollars?”
  • “Will reading this book help me solve world hunger?” ↦ “Is my knowing this book consistent with the universe where I solve world hunger?”
  • “Will thinking about all this ‘purposes’ crap help me in my goals?” ↦ “Is my having heard this theory consistent with the universe where I achieve world domination?”
  • “Will getting out of bed right now and putting my phone down lead me to a happier life?” ↦ “Is me standing up and doing other things consistent with the universe where I’m happier?”

I hope these don’t seem like useless transformations to you now. For me, at least, thinking about purposes has helped me come to terms with the sobering and often nihilistic view of determinism that emptied the haunted air and plucked the wings of fairies around free will. A lot of people are seriously bothered by the notion that they can’t really ‘choose’ things, so to speak, but are only ever chugging along in a slow stream of causality that’s already been preordained by the quantum fluctuations that happened $10^{-30}$ seconds after the Big Bang. By only requiring that the past be consistent with the future, we give up the superstition of choice while still preserving a sense of agency10.

Put another way, we can still choose to lie down in a ditch and wait to die because of all this, but now we can at least regain a sense of responsibility for the fate of the universe. Earlier, I talked about the link between optimisation processes and thermodynamics. Since any purpose must have a corresponding optimisation process if it is to be realised (the alternative is to wait for fluctuations to do the job for you, and boy are you going to wait a long time11), we now have roles to play again. We’re stewards of the cognitive algorithms, the optimisation processes that make us ‘us’, and our job is to make their ride as smooth and as quick as possible.

  1. For example, Leonhard Euler touched almost every part of 18th-century mathematics and yet you'll find that a common theme in his mathematics is the extension of familiar things (e.g., algebraic manipulation of finite quantities) to unfamiliar things (e.g., infinite series) by taking them at face value, as opposed to the cognitive strategy of Alexander Grothendieck who kept asking what exactly mathematical objects are about (leading to the construction of increasingly all-encompassing superobjects like topoi, and eventually to the insight that it is the relationships between them which are important, not the mathematical objects themselves).

  2. A good way of demolishing your tower is to spend a considerable amount of time thinking alone. This was the same trick used by Muhammad when he meditated in some cave in the 7th century, and the same trick used by Isaac Newton when he invented calculus, optics, and classical mechanics while hiding from the last bubonic plague in England.

  3. It's also interesting to note how this process depends on how you interpret those symbols in the first place. But as products of millions of years of evolution, we already have an agreed-upon interpretative system in our heads, called natural language.

  4. Some would prefer the term "multiverse", but I haven't seen enough of the physics to adopt the name with confidence.

  5. In the Design of Everyday Things sense.

  6. See this back-of-the-envelope calculation.

  7. See Grosberg, RK; Strathmann, RR (2007). "The evolution of multicellularity: A minor major transition?"

  8. What about quantum mechanics though? Does adopting the Copenhagen, inherent-randomness-in-the-universe interpretation invalidate this whole chain? No. It doesn’t really matter how your worldline lines up, whether it’s actually just a line or a region of uncertainty in some tube of spacetime. What matters is, there are different distinguishable paths in Thingspace, different purposes corresponding to different possible realities. And you will only ever observe and remember one path anyway.

  9. Which is that the former is easier to represent as a small causal graph, while the latter is more amenable to a chain (totally ordered set) representation. And there are other important differences. It’s for example easier to define a notion of distance between sets than between graphs. Two worldline-purposes $S$ and $S'$ are closer the more spacetime events they have in common. And perhaps we can even recover some of that Pearlian causal goodness by defining causal influence: the influence of a worldline-purpose $T$ on another $T'$ is stronger if, by surgically removing its interactions with the latter, $T'$ becomes smoother. In other words, stronger worldlines leave more spacetime kinks.

  10. And also recovers some parts of Yudkowsky’s timeless decision theory that I never really understood.

  11. Ergodic calculations suggest a timeframe on the order of $10^{100}$ years and that’s assuming a finite universe.