A flower is a computer

I received a DM from Ed. We chatted about an idea that had stuck with him for half a decade. There were concrete designs and technical discussions, but what stuck with him, and what stuck with me, was a simple saying: a flower is a computer.

A flower is a computer that, through complex biological processes, receives, transforms, and transmits information, over space and time. How can we access that information? I would like to ask the flower some questions: Are you thirsty? Who watered you last? Would you like to be moved to the shade? We do ask these questions, but we ask them through our screens. The flower does not speak for itself.

A computer is a flower when I take care of it, when I watch it grow. A computer that is a flower is an ecosystem with internal dynamics worth observing. A computer that is a flower is not a tool. It does not exist for my purpose. The computer today is a crop, planted serially, predestined to be ripped out of the ground and consumed. The computer today is not among other varieties of computer, but in a field of homogeneous production.

Ed had a hunch that some new technologies might allow this approximation to become concrete. Lacking some intimacy, our exchange ended, but I got to thinking about what he might have meant. Now, a few weeks later, I can imagine talking to a flower, backed by an AI chat interface. My mind had gone in another direction, using the same underlying technology but gesturing towards a simpler, and perhaps more interesting, idea.

Asking a question, or asking for an image, is an instrumental use of AI. It is akin to using a camera to film a theater performance for reproduction. It would take some time for cinema to develop montage. Underneath the language model is latent space, a concept truer to its nature than concerns of facts and representation, or dreams of sentience.

Ever since encountering word2vec in college, I've been compelled by latent space. It allowed thinking of language spatially—a king plus a woman minus a man is a queen. Words, represented as vectors in space, could undergo geometric transformations. A word was "embedded" in space, placed at certain coordinates of a detailed map that could be browsed like an atlas at a school library. Word2vec turned this intuition plastic.
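
To make that arithmetic tangible, here is a minimal sketch in Python. It assumes hand-made two-dimensional vectors, not anything a trained word2vec model would produce: the "royalty" and "gender" axes, the five-word vocabulary, and the helper names are all invented for illustration. Real embeddings have hundreds of dimensions, and none of them is labeled.

```python
import math

# Hand-made toy vectors, NOT real word2vec output.
# Axis 0 is a hypothetical "royalty" direction, axis 1 a hypothetical "gender" direction.
EMBEDDINGS = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "flower": (-1.0, 0.0),
}


def add(a, b):
    """Component-wise sum of two vectors."""
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    """Component-wise difference of two vectors."""
    return tuple(x - y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def nearest(vector, exclude=()):
    """Return the vocabulary word whose vector is most similar to `vector`."""
    candidates = [word for word in EMBEDDINGS if word not in exclude]
    return max(candidates, key=lambda word: cosine(EMBEDDINGS[word], vector))


# king + woman - man
target = sub(add(EMBEDDINGS["king"], EMBEDDINGS["woman"]), EMBEDDINGS["man"])
answer = nearest(target, exclude={"king", "woman", "man"})
print(answer)
```

In this toy map the sum lands exactly on the coordinates of "queen". With learned embeddings the result is only approximately there, which is why the lookup is a nearest-neighbor search rather than an equality check, and why the words used in the query are conventionally excluded from the candidates.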

The idea of perceptual embeddings staked out a similar path for images. Much as we build up complex visual stimuli from cones and rods, a layered network of computation could go from an image to a vector in space. Shallower layers would encode textural features, whereas deeper ones would encode a structural understanding. CLIP was the breakthrough moment when co-located images and captions were embedded into a shared latent space.

Cristóbal Sciutto, June 2024.