Locality and Decidability
The cerebral cortex is composed of hundreds of thousands of cortical columns. Each column is a general-purpose learning circuit, and there are minimal differences between the columns found in the visual cortex, the auditory cortex, the motor cortex, the prefrontal cortex, or any other part of the cortex. With very few and largely insignificant exceptions, the six layers and the circuit they form are applied universally across all functions of the cortex. If something like the grid cells the brain uses as a coordinate system can arise in the motor cortex, for example, that implies that every part of the cortex, regardless of function, has circuitry capable of computing coordinates under the right conditions.
The most biologically plausible and detailed model of how this circuit works is the one being developed by Jeff Hawkins and his team at Numenta. The model is far too complex and nuanced to describe in detail here, but one of the foundational ideas is that the cortex is fundamentally built around prediction. A basic observation driving this is that surprising events, even ones as simple as the ground being unexpectedly uneven while walking, are often extremely noticeable. If something unexpected is so noticeable, the brain must be at least implicitly computing a range of what is expected.

However, if we start asking what information is required to generate certain kinds of predictions, this begins to imply many other forms of data that the cortex must be tracking. Say you have a mug. There is a handle, there is an embossed image on the side, there is a rough, unglazed ring around the bottom. If you place your fingers on the mug and feel around, your cortex must know where each finger is relative to the mug, as well as the mug’s position and orientation, in order to predict where each particular feature is. If your knowledge of the mug is wrong, if something feels off or out of place, you’ll notice quickly.
The current iteration of Numenta’s theory suggests that the location and orientation of various object and sensor reference frames must be encoded in each cortical column, with layer 6 being by far the most plausible candidate for where this occurs. For every patch of your somatosensory cortex that receives tactile information from a patch of skin, there is a layer of neurons tracking the coordinates of that sensor, relative to the body and relative to objects in its vicinity. The same goes for other sensory domains as well: patches of the visual cortex presumably track the 3D location in space of the objects they’re observing.
Column structure is very consistent across the cortex, yet many columns are not connected to sensors at all. These regions nevertheless contain the same circuitry that others use to track location. While it may currently be difficult to reason about how abstract thought makes use of reference frames and coordinate spaces, the extremely consistent structure of cortical columns implies that effectively all cortical functions rely on some notion of locality and space.
Even when faced with a seemingly alien form of computer such as the brain, we should assume that the ultimate limits are generally the same even if the building blocks are different. If our mathematics leads us to conclude that computing a particular algorithm has certain fundamental limits, or that it should scale in accordance with a particular function, we should expect this to hold regardless of whether the computation is implemented with proteins, neurons, transistors, or something dug out of a UFO wreckage.
Nevertheless, we don’t seem to ask many such questions, and often dismiss such systems as largely impossible to understand. The brain clearly is not operating at full capacity at all times, yet it’s rare for anyone to ask what the brain’s equivalents of, say, malloc() and free() might be.
Locality in Conventional Computing
Any computation can be unrolled into a computation graph. Each instruction is a node, which receives data from previous nodes and performs a transformation on them to produce a new value, which can be used by other nodes downstream.
If a loop has a hard upper bound on the number of iterations, the unrolled loop forms a finite compute graph. If there is no upper bound, the loop forms an infinite graph. Unless it is a genuine infinite loop for all possible inputs, you can expect at least one, and often multiple, exit points that spill out into other parts of the program, or perhaps even into the program’s final endpoint.
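As a minimal sketch of the difference (the functions here are hypothetical, chosen only for illustration), compare a loop whose iteration count is fixed against one whose exit depends entirely on its input:

```c
#include <stdint.h>

/* A loop with a hard upper bound: unrolling it yields a finite
 * compute graph containing exactly eight copies of the loop body. */
uint32_t sum_fixed(const uint32_t values[8]) {
    uint32_t total = 0;
    for (int i = 0; i < 8; i++) {
        total += values[i];
    }
    return total;
}

/* The Collatz iteration (for n >= 1): no known upper bound on the
 * number of steps, so the unrolled graph is potentially infinite,
 * with an exit edge (n == 1) hanging off every iteration. */
uint64_t collatz_steps(uint64_t n) {
    uint64_t steps = 0;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        steps++;
    }
    return steps;
}
```

Whether collatz_steps terminates for every input is famously open, which is exactly the kind of question that becomes hard once the graph has no bound.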
All computation can be encoded as such a vast and potentially infinite graph. Encoding the graph explicitly in full detail is infeasible for nontrivial programs, but these graphs generally contain so much repetitive structure (for example, in loops) that they are highly compressible. Many constructs of programming languages, as well as the concrete computations that they compile down into, amount to methods of compressing and decompressing these graphs.
The linear sequence of instructions that a CPU performs is a serialization of this graph. Functions may consist of fragments of this graph, and the CPU’s instruction pointer encodes a coordinate for which part of which fragment the computer is currently executing. Of course, this graph is usually far too large to encode with a single 64-bit pointer, so we employ call stacks and nested functions to extend this address space arbitrarily, with code fragments embedded inside code fragments, embedded inside still other code fragments.
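A rough, glibc-specific sketch of this (the function names are made up, and backtrace() is not portable): the chain of return addresses sitting on the call stack can be read back directly, and together with the instruction pointer it pinpoints which nested fragment of the graph is currently executing.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

/* Innermost function: dump the stack of return addresses. Each
 * address names the code fragment to resume next, so the array as
 * a whole acts as a compound coordinate into the program. */
static void innermost(void) {
    void *frames[16];
    int depth = backtrace(frames, 16);               /* glibc */
    char **names = backtrace_symbols(frames, depth);
    if (names) {
        for (int i = 0; i < depth; i++)
            printf("frame %d: %s\n", i, names[i]);
        free(names);
    }
}

static void middle(void) { innermost(); }
static void outer(void)  { middle(); }

int main(void) {
    outer();   /* roughly: innermost <- middle <- outer <- main */
    return 0;
}
```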
If we have a for loop, this loop will generally include a variable that tracks which iteration we are currently executing. If we have nested for loops, our position in this computation graph will be two-dimensional, which makes sense as such nested loops are often used to iterate over things like two-dimensional arrays. Trees are effectively just a discrete form of hyperbolic space; stacks and similar data structures often serve as coordinates for computations embedded in hyperbolic and tree-like spaces.
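Returning to the nested-loop case, here is a minimal sketch (a hypothetical function assuming a row-major 2D array) in which the index pair (i, j) is literally a coordinate into both the data and the grid of unrolled loop-body instances:

```c
/* The indices (i, j) locate us both in the matrix and in the
 * two-dimensional grid of unrolled loop-body instances. */
double sum_matrix(const double *m, int rows, int cols) {
    double total = 0.0;
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            total += m[i * cols + j];
        }
    }
    return total;
}
```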
All of this ultimately is a way of procedurally generating this vast compute graph, determining which instruction exists at each coordinate point, and which point to land on next to perform the next computation. Many of the instructions the CPU performs don’t even directly perform the real computations, but rather serve to navigate this scaffolding.
Even in parallel computing, if we wish to perform some computation for every element of an array, we’ll generally assign each thread a coordinate or range of coordinates to iterate over. CUDA, the software framework for computing on Nvidia GPUs, is built around launching large multidimensional grids of threads.
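As a sketch of the idea in plain C (this is not actual CUDA code; it just mimics the blockIdx * blockDim + threadIdx indexing convention), each simulated thread derives its own global coordinate and touches exactly the element at that coordinate:

```c
#define BLOCK_X 16
#define BLOCK_Y 16

/* Simulate a 2D grid launch: each (block, thread) pair derives a
 * global (x, y) coordinate, much as a CUDA kernel would from
 * blockIdx, blockDim, and threadIdx. */
void scale_grid(float *data, int width, int height, float factor) {
    int blocks_x = (width  + BLOCK_X - 1) / BLOCK_X;
    int blocks_y = (height + BLOCK_Y - 1) / BLOCK_Y;
    for (int by = 0; by < blocks_y; by++)
    for (int bx = 0; bx < blocks_x; bx++)
    for (int ty = 0; ty < BLOCK_Y; ty++)
    for (int tx = 0; tx < BLOCK_X; tx++) {
        int x = bx * BLOCK_X + tx;   /* global coordinate */
        int y = by * BLOCK_Y + ty;
        if (x < width && y < height)
            data[y * width + x] *= factor;
    }
}
```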
Conventional computation is littered with tools whose sole purpose is to encode positions that track our location on computation graphs; however, there are some places where their presence is less obvious. The primary exception can be found in constructs like while loops and recursive functions, which also happen to be the primary places where undecidable and otherwise ill-behaved, infinitely-looping program behavior tends to arise. This isn’t to say that such constructs always lack such coordinates; a while loop can certainly emulate a for loop with an iteration variable, it just doesn’t have to, or it can encode another, more exotic coordinate space. A recursive function that performs a binary search encodes a range of possible locations for the result it’s searching for, and progressively narrows that range with each call in a way that is guaranteed to terminate quickly.
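A sketch of that binary search (a hypothetical function over a sorted array): the half-open range [lo, hi) is the coordinate, and every call strictly shrinks it, so the recursion terminates after O(log n) steps.

```c
/* Recursive binary search over a sorted array. The half-open range
 * [lo, hi) is an explicit coordinate in the search space; each call
 * halves it, so the recursion is guaranteed to bottom out. */
int binary_search(const int *a, int lo, int hi, int target) {
    if (lo >= hi)
        return -1;                        /* empty range: not found */
    int mid = lo + (hi - lo) / 2;
    if (a[mid] == target)
        return mid;
    if (a[mid] < target)
        return binary_search(a, mid + 1, hi, target);
    return binary_search(a, lo, mid, target);
}
```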
Such language constructs are in a sense an admission that the various coordinate systems around which we have built many of our computing tools are not an exhaustive list, and that there is sometimes a need to break free from them and build something bespoke or exotic. We can certainly fail at this, of course, and build something which simply wanders aimlessly through some vast abstract space, with no obvious force driving it toward any exit point in any reasonable amount of time.
Decidability and Geometry
A powerful tool often used in static code analysis is the abstract domain, a mathematical object used to place bounds on the space of possible values that a variable or collection of variables can encode. Abstract domains are based on lattices, which as I’ve discussed before are very closely related to topological spaces. Fundamentally, our best tools for modeling and bounding the behavior of complex programs are geometric in nature.
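As a minimal sketch of one such domain (the classic interval domain; the types and functions here are hypothetical rather than taken from any particular analyzer), every variable is over-approximated by a [lo, hi] range, and operations on variables become operations on those ranges:

```c
/* Interval abstract domain: each variable is over-approximated by
 * a closed range [lo, hi] (overflow ignored for simplicity). */
typedef struct { long lo, hi; } Interval;

/* Abstract addition: any x in a plus any y in b lands in this range. */
Interval interval_add(Interval a, Interval b) {
    return (Interval){ a.lo + b.lo, a.hi + b.hi };
}

/* Join (lattice least upper bound): the smallest interval covering
 * both inputs, used where two control-flow paths merge. */
Interval interval_join(Interval a, Interval b) {
    Interval r;
    r.lo = a.lo < b.lo ? a.lo : b.lo;
    r.hi = a.hi > b.hi ? a.hi : b.hi;
    return r;
}
```

The join is what makes the domain a lattice, and it is what gives the analysis its geometric flavor: program states become regions, and analysis becomes a search for a stable enclosing region.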
One of the most universally used tools for understanding the behavior of algorithms and data structures is Big-O notation, where we represent our computation’s cost with a polynomial and then, by convention, drop all constants and everything but the most significant term. This practice of dropping terms has its motivations, but it is merely a convention; if we choose to ignore it, we are left reasoning about computations in terms of polynomial expressions, which closely resemble the Taylor series and similar sums used constantly in physics. The presence of Taylor-series-like expressions also implies that we can perform calculus on the properties of our compute graphs, which further implies the potential for geometric tools.
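As a small worked example (the cost function here is made up), the full polynomial is exactly what the convention throws away:

```latex
% Hypothetical cost of a nested scan over n elements:
T(n) = 4n^2 + 12n + 3
% The Big-O convention keeps only the leading term:
T(n) \in O(n^2)
```

Keeping all the terms leaves us with the kind of power series that calculus is built to manipulate.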
Physics, with its vast and chaotic swarms of particles, has absolutely no obligation whatsoever to be legible to humans, and yet once again our best tools for understanding it are consistently geometric in nature. The invention of calculus revolutionized physics, allowing us to better understand complex curves and geometries, and revealing their presence not only in the static shapes of objects but in their complex and chaotic dynamics.
I’ll make a conjecture here: if all of our tools for making computation predictable and well-behaved can be interpreted as operating on geometric coordinate spaces, if our static analysis tools for automating the understanding of programs are built on geometric foundations, if our most powerful tools for understanding even the most chaotic systems in the world around us are consistently geometric in nature, if the most obvious ways in which programs lose decidability properties involve removing obvious coordinate spaces, and if even billions of years of evolution seem to have converged on computation that tracks coordinate spaces for everything, then this suggests that there may in fact be a deep connection between what is and isn’t decidable about a dynamic program or other complex system and the extent to which its behavior can be mapped onto some geometric structure.