Solving decentralized discovery

We won't have meaningful decentralization until we solve decentralized discovery. When I say "discovery," I'm talking about the challenge of efficiently surfacing relevant data to users when they're not quite sure what they're looking for.

As the volume of data grows, the opportunity lies in helping people find things. And there are an awful lot of software applications in which how you find the data is just as important as where you store it, which means that distributing the data doesn't necessarily distribute the power. When you use a marketplace application like Uber, what is the functionality that Uber provides? Is Uber merely storing the contact information of cab drivers? Of course not. The primary function of Uber isn't storage, it's discovery. As a passenger, you need an introduction to a cab driver who's close to your geographic location. So you must be able to search the database of cab drivers, quickly, over some notion of physical proximity. But those cabs are moving around your neighborhood at 40 miles per hour. Their physical proximity to you is constantly changing.

That's a hard search problem to solve at scale! Uber holds power because they have solved it, not because they have centralized the data. If all the cab drivers in the world stored their contact information on IPFS, the result would be useless when you need a ride to the airport. How would you find the drivers that are whizzing around your neighborhood? Uber is powerful because they've centralized discovery.
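To make the shape of that query concrete, here's a minimal sketch of a proximity index over moving points. Everything in it is an assumption for illustration: the `DriverIndex` class, the fixed grid of cells, the cell size, and the crude distance-in-degrees ranking are all made up, and none of it reflects how Uber actually does it. It's also a single in-memory structure on one machine, which is exactly the centralized part. The open problem is keeping something like this up to date across peers that don't trust each other.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # hypothetical cell size in degrees (about 1 km of latitude)


class DriverIndex:
    """Toy in-memory grid index over moving points. All names are hypothetical."""

    def __init__(self, cell_deg=CELL_DEG):
        self.cell_deg = cell_deg
        self.cells = defaultdict(set)  # (row, col) -> ids of drivers in that cell
        self.positions = {}            # driver id -> last reported (lat, lon)

    def _cell(self, lat, lon):
        # Bucket a coordinate into a fixed-size grid cell.
        return (math.floor(lat / self.cell_deg), math.floor(lon / self.cell_deg))

    def update(self, driver_id, lat, lon):
        # Drivers move constantly, so every report is a delete plus an insert.
        old = self.positions.get(driver_id)
        if old is not None:
            self.cells[self._cell(*old)].discard(driver_id)
        self.positions[driver_id] = (lat, lon)
        self.cells[self._cell(lat, lon)].add(driver_id)

    def nearby(self, lat, lon, rings=1):
        # Scan the passenger's cell plus `rings` layers of neighboring cells,
        # then rank by straight-line distance in degrees (a crude assumption).
        row, col = self._cell(lat, lon)
        found = []
        for dr in range(-rings, rings + 1):
            for dc in range(-rings, rings + 1):
                for driver_id in self.cells.get((row + dr, col + dc), ()):
                    dlat, dlon = self.positions[driver_id]
                    found.append((math.hypot(dlat - lat, dlon - lon), driver_id))
        return [driver_id for _, driver_id in sorted(found)]


index = DriverIndex()
index.update("driver-a", 40.7128, -74.0060)
index.update("driver-b", 40.7130, -74.0065)
print(index.nearby(40.7128, -74.0060))

index.update("driver-b", 40.9000, -74.0000)  # driver-b leaves the neighborhood
print(index.nearby(40.7128, -74.0060))
```

Note that every position report mutates the index. That write-heavy, always-changing workload is what makes the problem hard once no single party owns the structure.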

The path to a technologically decentralized driver cooperative is paved with advances in decentralized discovery protocols, not advances in blockchain transaction speed. And if your driver cooperative isn't 100% technologically decentralized, then it's just Uber with a "trust me bro" ethics statement.

The decentralization hype machine is largely ignorant of this idea. A staggering number of Twitter celebs who opine about DAOs and "dAPPs" (lol) have described the marketplace problem as one that can be solved by smart contracts alone. They imagine that Uber's only role as middleman is making sure that both passenger and driver agree on the fare, and they completely ignore the question of how the two parties get introduced in the first place.

Fixating solely on where the data lives is midwit-tier stuff. If you decentralize the data but you don't decentralize discovery, then for every application in which discovery provides the primary value, some startup will swoop in and re-centralize the whole stack around a service that provides the best discovery function.

And most of the apps you care about are based on discovery. YouTube recommends videos and finds an audience for creators. Twitter can tell you who's interesting and worth following. Yelp, DoorDash, and Tinder all provide a common value: they introduce you to things you're likely to be interested in (restaurants, people to go on dates with) that you didn't know existed in the first place. These companies acquired tremendous power and now behave like assholes not because they own the data, but because they own discovery.

Since discovery overlaps with search, the underlying computer science trends in the opposite direction of blockchains. (If you want to make data searchable, you can't really pick a worse solution than a blockchain.) The path forward begins with the literature on distributed multidimensional indexing structures, like the skip graph. The research tapers off around 2009, right when P2P hackers began to flee from classical consensus. I'm working on a solution. Are you?