Hi Folks, Dorian again. You may remember me from such poasts as the one a few weeks ago about the protocol-agnostic PDS I plan to cook up. Well I’m back as I’ve been summoned by Christian to tease you all with another one: I know there’s been recent chatter about creating an app directory (or several?), so what might that look like?
Even better: what if the metadata that drove such an app directory was on the protocol—so anybody who wanted to make their own app directory could make one? What if we used that same metadata to give the same treatment to lexicons, so ATProto app developers could browse by theme or vertical?
App Listing Lexicon
So in the first instance, I’m imagining a lexicon that can express the data that goes into something like an app store page, but—and this is the part that’s got me vibing—also a lexicon for describing the concepts that make up the app directory’s taxonomy, along with how those concepts relate to each other. The result would be a set of concepts—represented as addressable data objects on the network—with organic consensus around what those concepts mean. These would form the elementary building blocks of a shared public asset which could be used in all sorts of applications.
Standard dot site all the things
What I’m proposing here, at root, is a set of lexicons following the example of standard.site: lexicons that primarily ladder up to real-world user goals and are intentionally designed to span across apps. There are levels of scale to consider beyond a single taxonomy for a single app directory.
What’s really great about standard.site is that it’s an ad-hoc group of companies who’ve gotten together to create a very simple pair of lexicons (so far) that express what amounts, to a first approximation, to the container and individual entries of an RSS feed, except in ATProto-ese. What this affords is that anything reasonably document-shaped (blog, newsletter, forum post, or even old-fashioned book or news article) can be made addressable on the protocol. They list the benefits thusly:
- People writing indexers don’t have to wrangle a whole ménagerie of formats,
- End users can move their content between providers,
- No single entity owns the standard,
- Governance (I paraphrase) by rough consensus and running code.
In that vein, I am imagining, in the first instance, an app listing lexicon. Maybe crib schema.org for inspiration. App authors could publish conforming data objects, and indexers could pick them up and read them. It’s easy to imagine an ecosystem (ratings, commentary, release notes, playthroughs, et cetera…) popping up around this core entity. A feature of this, though—and we can debate the merits of this—is that the app listing would also list what lexicons the app uses.
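To make that a little more concrete, here is a rough sketch of what a published app listing record might look like. No such lexicon exists yet, so the NSID `example.directory.appListing`, every field name other than `$type`, and the helper `make_app_listing` are all assumptions made up for illustration.

```python
# Hypothetical sketch: the NSID and all field names except "$type" are invented.
APP_LISTING_NSID = "example.directory.appListing"


def make_app_listing(name, description, url, lexicons):
    """Build a record an app author might publish to their own repo."""
    if not name or not url:
        raise ValueError("name and url are required")
    return {
        "$type": APP_LISTING_NSID,
        "name": name,
        "description": description,
        "url": url,
        # NSIDs of the lexicons the app uses, deduplicated and sorted
        "lexicons": sorted(set(lexicons)),
    }


listing = make_app_listing(
    name="Example Tracker",
    description="Records runs, rides and paddles.",
    url="https://tracker.example",
    lexicons=["example.tracker.waypoint", "example.tracker.session"],
)
```

An indexer would only need to read `lexicons` to know which schemas to go look up, which is the hook the rest of this proposal hangs on.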
This is all well and good, but if you wanted to get into the app indexing business, you might ask why you should care about what lexicons the app uses. Well, for one, developers will care what lexicons an app is introducing into the ecosystem, but I’m also going to argue that it’s actually the lexicons that are the optimal place to attach a very important piece of information: the thematic category(ies) to which the lexicon—and by extension, the app—belongs.
Let’s say you have an app like Strava, but it publishes its waypoint data to the protocol. First off, that lexicon (say, latitude, longitude, timestamp, previous waypoint) would be eminently reusable for any other app that traced a path through physical space, or otherwise dealt with that data. As such, we could imagine this lexicon being categorized under “geodata” or something. Now, let’s say this app also had some kind of “session” lexicon that maybe recorded the calories you burned or something. Well, that’s unambiguously a “health and fitness” lexicon. But, if you put the knowledge that the app uses these two lexicons together, you can pinpoint a much more specific category of app: a run/bike/paddle/ski/etc tracker.
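Here is one way an indexer might do that pinpointing, as a minimal sketch. The lexicon NSIDs, the category names, and the rule table are all made up; a real directory would presumably pull the lexicon-to-category mapping from records on the network rather than a hard-coded dict.

```python
# Hypothetical data: NSIDs and category names are invented for illustration.
LEXICON_CATEGORIES = {
    "example.tracker.waypoint": {"geodata"},
    "example.tracker.session": {"health-and-fitness"},
}

# An indexer's own rules: a set of lexicon-level categories that,
# taken together, imply a more specific app-level category.
APP_CATEGORY_RULES = [
    (frozenset({"geodata", "health-and-fitness"}), "activity-tracker"),
]


def categorize_app(lexicons):
    """Return categories inherited from lexicons plus any the rules infer."""
    inherited = set()
    for nsid in lexicons:
        inherited |= LEXICON_CATEGORIES.get(nsid, set())
    inferred = {
        label for required, label in APP_CATEGORY_RULES if required <= inherited
    }
    return inherited | inferred


print(sorted(categorize_app(["example.tracker.waypoint", "example.tracker.session"])))
# ['activity-tracker', 'geodata', 'health-and-fitness']
```

An app that only used the waypoint lexicon would come back as plain `geodata`; it is the combination that yields the narrower category.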
Protocol-wide tagging taxonomy
The remaining component of this proposal to discuss is the thematic categories themselves. This is an opportunity to create a public good that could have wide-ranging ramifications throughout the entire ecosystem: the concepts that make up the categorization scheme should themselves be addressable data objects that live on the protocol. For one, there’s the mundane matter of synonyms and different representations (such as different languages) of the same concept. For another, we have the opportunity to establish consensus on what a given concept means, simply through people referencing it in use. We can do the same for collections of concepts and the ways (broader, narrower, otherwise-related, not-to-be-confused-with…) they relate to one another. In a sense what I’m proposing is a sort of protocol-wide, wikified dictionary/thesaurus, with the added benefit that the objects in question are all addressable, reusable, structured data.
Reality, of course, has a heck of a lot of detail, but this is something I have some experience with, namely SKOS (the Simple Knowledge Organization System, a W3C standard). SKOS is a way to encode, publish, and share concept schemes. Unlike a conventional hierarchical taxonomy, SKOS is roughly set-theoretic, and its progenitors have already gamed out how concepts, and concept schemes, interact. You see it used in things like enterprise CMS products, and representing all sorts of taxonomies in libraries, ecommerce, and beyond. Note: I’m not suggesting, per se, that we use SKOS (this is ATProto after all); I’m just underscoring that there’s a mountain of prior art to draw from.
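For a feel of what concept records could look like, here is a small sketch that borrows SKOS’s ideas of a preferred label, alternate labels, and broader relations. The `Concept` class, the AT URIs, and the `example.taxonomy.concept` collection name are all hypothetical; this is not a proposed schema, just the shape of the thing.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    uri: str                                         # address of the concept record
    pref_label: dict                                 # language -> preferred label
    alt_labels: dict = field(default_factory=dict)   # language -> list of synonyms
    broader: list = field(default_factory=list)      # URIs of broader concepts


# Hypothetical AT URIs under a made-up collection.
BASE = "at://did:example:alice/example.taxonomy.concept/"
ACTIVITY, SPORT, RUNNING = BASE + "activity", BASE + "sport", BASE + "running"

CONCEPTS = {
    c.uri: c
    for c in [
        Concept(ACTIVITY, {"en": "Activity", "fr": "Activité"}),
        Concept(SPORT, {"en": "Sport", "fr": "Sport"}, broader=[ACTIVITY]),
        Concept(
            RUNNING,
            {"en": "Running", "fr": "Course à pied"},
            alt_labels={"en": ["Jogging"]},
            broader=[SPORT],
        ),
    ]
}


def find_by_label(text, lang="en"):
    """Return URIs whose preferred or alternate label matches, ignoring case."""
    wanted = text.casefold()
    hits = []
    for concept in CONCEPTS.values():
        labels = [concept.pref_label.get(lang, "")] + concept.alt_labels.get(lang, [])
        if wanted in (label.casefold() for label in labels if label):
            hits.append(concept.uri)
    return hits


def ancestors(uri):
    """Walk 'broader' links transitively, skipping unknown or repeated URIs."""
    seen = []
    stack = list(CONCEPTS[uri].broader)
    while stack:
        current = stack.pop()
        if current in seen or current not in CONCEPTS:
            continue
        seen.append(current)
        stack.extend(CONCEPTS[current].broader)
    return seen
```

With this, `find_by_label("jogging")` and `find_by_label("course à pied", "fr")` both land on the same running concept, and `ancestors(RUNNING)` walks up through sport to activity, so something tagged “running” can also surface under the broader headings.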
So to recap the outcomes: at the surface level we have something that satisfies both user and developer needs (finding apps and lexicons by thematic category), and at the deeper level we have a durable public good (a consensus-driven, structured dictionary/thesaurus) that can have all sorts of applications beyond organizing apps and lexicons. Underpinning both is a third, deeper level still: a lexicon for concept schemes.
This multi-tier project has a lot of ways to engage, from designing the specs themselves to authoring/curating individual concepts. I know it’s probably a little hairy so I’ll follow up with a diagram once I decide how such a thing ought to look.