Early permissioned data proposal draft feedback

dholms.xyz · June 23, 2026, 8:48pm

This is a forum thread for giving feedback on the early permissioned data protocol proposal that I recently posted.

github.com/bluesky-social/proposals

Permissioned data (#94): main ← permissioned-data, opened 08:47PM - 23 Jun 26 UTC by dholms (+555 -0)

This is an early draft of the proposal for permissioned data. **Please** do not over-index on this. Details, terminology, and behaviors are all likely to change. For a friendly introduction to the protocol, check out [my Leaflets](https://dholms.leaflet.pub/). For discussion, feel free to hop in on this PR with specific feedback, or have larger discussions over on the community forum.

bmann.ca · June 23, 2026, 9:21pm

From the PR:

0016 Permissioned Data

This is a draft proposal, not the final specification. Details, terminology, and behaviors are all likely to change.

Introduction

The AT Protocol is a protocol for public broadcast data. Users publish records into a repository on their PDS, and applications crawl those repositories to build views. Authority rests in the DID that publishes a record, and records are signed, redistributable, and universally addressable.

This document specifies a permissioned data protocol for data that is not public, data with an access perimeter. It runs alongside the public protocol and serves modalities such as:

Personal data: bookmarks, mutes, drafts
Gated content: paid newsletters, subscriber-only posts
Socially shared: private posts, stories
Groups: private forums, communities, group chats

The permissioned protocol shares the abstract shape of public atproto. It retains identity-based authority, per-user repositories, lexicon-typed records, and the general flow of applications crawling PDSes to build views. However, it is a distinct protocol rather than an extension of the public one. It has its own repository format, sync mechanism, addressing scheme, and resolution path. Public atproto is built for public broadcast (signed, archival, rebroadcastable) while the permissioned protocol is built for party-to-party transmission within an access boundary.

This protocol provides access control, not confidentiality. It is not end-to-end encrypted. Services (both PDSes and authorized applications) can read the data they handle, which is required for server-side features such as search, indexing, notifications, aggregation, and moderation. E2EE is a separate concern that may be layered on top by an application and is out of scope in this proposal.

Relationship to public atproto

| | Public atproto | Permissioned protocol |
|---|---|---|
| Unit of data | Record in a repo | Record in a permissioned repo |
| Repo scope | One repo per user | One permissioned repo per (user, space) |
| Record authority | User DID | User DID |
| URI authority | User DID | Space authority DID |
| Commit | Merkle Search Tree root | LtHash set-hash digest |
| Signature | Rebroadcastable, archival | Deniable on rebroadcast |
| Addressing | `at://` URI | `ats://` URI |
| Access | Public | Gated by space credential |
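To make the "LtHash set-hash digest" row more concrete, here is a minimal sketch of an additive set hash in the spirit of LtHash. The general idea is that each item is expanded into a vector of small integers and the vectors are summed lane by lane, so the digest does not depend on insertion order and can be updated incrementally. The lane count, the use of SHAKE-256 as the expansion function, and the byte strings standing in for records are all assumptions made for illustration. The excerpt quoted here does not specify the proposal's actual parameters or record encoding.

```python
import hashlib
import struct

LANES = 1024      # number of 16-bit lanes in the digest (assumed parameter)
MOD = 1 << 16     # each lane wraps modulo 2^16


def expand(item: bytes) -> list[int]:
    """Expand one item into LANES pseudo-random 16-bit values.

    SHAKE-256 is used here only because it ships with Python; it is a
    stand-in, not necessarily the function the proposal uses.
    """
    raw = hashlib.shake_256(item).digest(LANES * 2)
    return list(struct.unpack(f"<{LANES}H", raw))


class SetHash:
    """Order-independent digest of a set of byte strings."""

    def __init__(self) -> None:
        self.lanes = [0] * LANES

    def add(self, item: bytes) -> None:
        # Lane-wise addition: adding items in any order gives the same sums.
        for i, v in enumerate(expand(item)):
            self.lanes[i] = (self.lanes[i] + v) % MOD

    def remove(self, item: bytes) -> None:
        # Lane-wise subtraction undoes a previous add of the same item.
        for i, v in enumerate(expand(item)):
            self.lanes[i] = (self.lanes[i] - v) % MOD

    def digest(self) -> bytes:
        return struct.pack(f"<{LANES}H", *self.lanes)


# Hypothetical record encodings, used only to exercise the sketch.
a = SetHash()
a.add(b"record-1")
a.add(b"record-2")

b = SetHash()
b.add(b"record-2")
b.add(b"record-1")

same_regardless_of_order = a.digest() == b.digest()
```

Unlike a Merkle Search Tree root, a digest of this kind carries no tree structure: it only lets two parties check that they hold the same set of items, and it can be updated in constant time per added or removed item.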

Terminology

Space: an authorization and sync boundary for a set of permissioned records, identified by an (authority, type, skey) triple.
Permissioned repo: one user’s records within one space, with a cryptographic commit, hosted on that user’s PDS.
Repo host: a service that stores and serves users’ permissioned repos.
Space host: a service that answers for a space as a whole, issuing credentials, enumerating writers, and routing notifications.
Space authority: the DID at the root of a space, which resolves to the space host and the key material for issuing credentials.
Space credential: a token issued by the space authority that grants read access to a space.
Delegation token: a token issued by a user’s PDS that an application exchanges with a space authority for a space credential.
Client attestation: a token signed by an application’s own client authentication key, proving the application’s identity to a space authority. Required only when a space gates on app identity.
Syncer: an application that keeps its own copy of a space in sync by pulling from repo hosts.

A PDS fulfills the roles of both a repo host and a space host. However, these roles are discussed separately because they need not be filled by a PDS. A permissioned repo or a space may be hosted by any service that implements the required APIs.

Spaces

A space is an authorization and sync boundary representing a shared social context. A space may include many different types of records from many users. The space does not colocate records. Instead, each user stores their own records for a given space in a permissioned repo on their own repo host. A space is the union of these per-user repos across the network: an application presenting a space pulls each member’s repo from its host, assembles the view, and applies access control to requesting users.
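The "union of per-user repos" idea can be sketched in a few lines. This is only an illustration under stated assumptions: `assemble_space`, the `fetch_repo` callable, and the in-memory repos are hypothetical names, records are left as opaque dictionaries, and the access-control step applied to requesting users is omitted.

```python
from typing import Callable, Iterable

Record = dict  # a lexicon-typed record, left opaque in this sketch


def assemble_space(
    writers: Iterable[str],
    fetch_repo: Callable[[str], dict[str, Record]],
) -> dict[tuple[str, str], Record]:
    """Union the per-user permissioned repos of one space into a single view.

    Keys are (author DID, "collection/rkey"), so records from different
    authors never collide even when their paths are identical.
    """
    view: dict[tuple[str, str], Record] = {}
    for did in writers:
        # In a real syncer this would be a network call to the member's repo host.
        for path, record in fetch_repo(did).items():
            view[(did, path)] = record
    return view


# Hypothetical in-memory stand-ins for two members' repo hosts.
repos = {
    "did:plc:alice": {"com.example.forum.post/1": {"text": "hi"}},
    "did:plc:bob": {"com.example.forum.post/1": {"text": "yo"}},
}
view = assemble_space(repos.keys(), lambda did: repos[did])
```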

Each space is identified by three values:

authority: a DID, the root of authority for the space
type: an NSID describing the modality of the space
space key (skey): a string distinguishing spaces of the same type under the same authority

Reading or syncing a space requires a space credential signed by the declared signing key of the space authority. The space authority decides whether to issue one based on the requesting user and client application. The protocol does not define how that decision is made and carries no member list (see Access Control). Spaces scale from a single user’s personal data (e.g. bookmarks) to communities of millions of users.

Addressing

A permissioned record is addressed by an ats:// URI of six components:

ats://{spaceDid}/{spaceType}/{skey}/{authorDid}/{collection}/{rkey}

| Component | Type | Description |
|---|---|---|
| `spaceDid` | DID | Space authority DID |
| `spaceType` | NSID | Space type |
| `skey` | string | Space key |
| `authorDid` | DID | DID of the record’s author |
| `collection` | NSID | Record collection |
| `rkey` | string | Record key |

All six URI segments are necessary to identify a permissioned record. The first three components may be used to reference a space:

Space:  ats://{spaceDid}/{spaceType}/{skey}
Record: ats://{spaceDid}/{spaceType}/{skey}/{authorDid}/{collection}/{rkey}
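A minimal parsing sketch for the two URI shapes above. It assumes that no segment contains a `/` (DIDs and NSIDs do not; for `skey` and `rkey` that is an assumption of this sketch) and performs no DID or NSID syntax validation. `AtsUri` and `parse_ats_uri` are hypothetical names, and the DIDs and NSIDs in the example are made up.

```python
from dataclasses import dataclass
from typing import Optional

SCHEME = "ats://"


@dataclass(frozen=True)
class AtsUri:
    space_did: str
    space_type: str
    skey: str
    author_did: Optional[str] = None
    collection: Optional[str] = None
    rkey: Optional[str] = None

    @property
    def is_record(self) -> bool:
        # A six-segment URI names a record; a three-segment URI names a space.
        return self.author_did is not None

    def space(self) -> "AtsUri":
        """Return the URI of the space this URI belongs to."""
        return AtsUri(self.space_did, self.space_type, self.skey)

    def __str__(self) -> str:
        parts = [self.space_did, self.space_type, self.skey]
        if self.is_record:
            parts += [self.author_did, self.collection, self.rkey]
        return SCHEME + "/".join(parts)


def parse_ats_uri(uri: str) -> AtsUri:
    if not uri.startswith(SCHEME):
        raise ValueError("not an ats:// URI")
    parts = uri[len(SCHEME):].split("/")
    if len(parts) not in (3, 6) or not all(parts):
        raise ValueError("expected 3 (space) or 6 (record) non-empty segments")
    return AtsUri(*parts)


# Made-up identifiers, for illustration only.
record_uri = parse_ats_uri(
    "ats://did:plc:space123/com.example.forum/general/"
    "did:plc:alice456/com.example.forum.post/3kabc"
)
space_uri = record_uri.space()
```

Here `str(space_uri)` gives back the three-segment form, `ats://did:plc:space123/com.example.forum/general`.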

Space authority

A space’s authority is the DID at the root of the space and the issuer of its credentials. It may be a user’s own DID, as for personal data such as bookmarks or mutes, or it may be a dedicated DID, which lets a shared space transfer between users independently of any individual account.

The authority DID MUST expose two entries in its DID document:

a verification method with id #atproto_space: the public key used to verify the space’s credentials
a service entry with id #atproto_space_host: the endpoint of the space host

Both values MAY resolve to the same values as #atproto and #atproto_pds from the public data protocol.
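A rough sketch of how a client might pull those two entries out of an already-resolved DID document. The `verificationMethod`, `service`, `publicKeyMultibase`, and `serviceEndpoint` field names come from the general DID document format. The `type` values, key, and endpoint in the example document are placeholders, and `resolve_space_authority` is a hypothetical helper rather than anything the proposal names.

```python
def resolve_space_authority(did_doc: dict) -> tuple[str, str]:
    """Return (public key, space host endpoint) for a space authority.

    Entry ids may appear as a bare fragment ("#atproto_space") or as a full
    DID URL ("did:plc:...#atproto_space"), so ids are matched by suffix.
    """
    key = next(
        (
            vm.get("publicKeyMultibase")
            for vm in did_doc.get("verificationMethod", [])
            if vm.get("id", "").endswith("#atproto_space")
        ),
        None,
    )
    host = next(
        (
            svc.get("serviceEndpoint")
            for svc in did_doc.get("service", [])
            if svc.get("id", "").endswith("#atproto_space_host")
        ),
        None,
    )
    if key is None or host is None:
        # The draft says the authority DID MUST expose both entries.
        raise ValueError("DID document is missing a space key or space host entry")
    return key, host


# Placeholder document: the type strings, key, and endpoint are invented.
example_doc = {
    "id": "did:plc:space123",
    "verificationMethod": [
        {
            "id": "did:plc:space123#atproto_space",
            "type": "Multikey",  # assumed, mirroring public atproto keys
            "controller": "did:plc:space123",
            "publicKeyMultibase": "zExamplePublicKey",
        }
    ],
    "service": [
        {
            "id": "#atproto_space_host",
            "type": "ExampleSpaceHost",  # hypothetical type name
            "serviceEndpoint": "https://spaces.example.com",
        }
    ],
}
space_key, space_host = resolve_space_authority(example_doc)
```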

Space type

A space’s type is an NSID that names its modality and resolves to a space declaration. It identifies the kind of data a space holds before any network resolution, much as a collection NSID does in public atproto. Because a type names a concrete modality, every space is some specific kind of space rather than a generic container.

The type is also the OAuth consent boundary. Access is granted to a user by type, e.g. “access to your AtmoBoards forums” (see OAuth scopes).

Snipped because too long, read the full PR

tamme.schichler.de · June 23, 2026, 10:43pm

Frankly beautiful, and I don’t say that often.

This covers all uses I have in mind myself so far quite nicely and, more importantly, appears to be a user-safe design that doesn’t leak attestations.

The simplespace feature requirement for PDSs is an important access-to-technology measure, so I’m very glad you included that here. (Its listMembers endpoint is private to the authority of the space, right?)
I think keeping that simple and not including authority ownership in the API there is probably fine, or at least I think client-only apps can still manage a form of transferable space authority ownership by storing credentials privately under this model. (Though, I suspect the owned-authority OAuth login can’t be fully automated in a normal browser with just this. Hm… But I think that’s a separate problem.)

In this line:

Space permissions can also be bundled, unlike with more user-friendly verbiage, into a permission set.

Is “unlike” a typo for “usually”?

dholms.xyz · June 23, 2026, 11:24pm

Yes it is, thanks for flagging!

Yes, as described, it is. Although if a space expresses the members as records in the space, then they will be viewable by anyone with read access to the space.

bmann.ca · June 24, 2026, 2:31am

…which would be the only thing in that space – e.g. a space of type member records – is that correct?

baldemo.to · June 24, 2026, 7:14am

Awesome!

Since the proposal right now says terminology is still likely to change, I’d like to flag a potential naming issue early.

In everyday usage, “space” almost always refers to a place you’re in. However, the PD primitive is a permissioning boundary, a far more abstract concept. That conflict between a “place” and a “boundary” is, in my opinion, strong enough to count as a misnomer: it mismodels the concept for anyone meeting it for the first time.

I think that @zicklag.dev’s initial Arbiter design leaflet illustrates this well, especially considering that this seems to be where the group-permissioning layer is heading:

From now on, I’m going to use the terms “group”, “role”, and “space” somewhat interchangeably, based on how a specific space is meant to be used, even though they’re all just spaces.

If a role (a set of people holding a permission) is a “space” (colloquially, a place), then the word has lost its intuitive meaning; readers have to actively fight their intuition to follow the architecture.

I’m usually not one for nomenclature wars, but I do think that this one is particularly important: it’s a core concept that all of PD hangs from, and the cost of the mismodeling lands squarely on newcomers, who are a key part of the adoption boundary.

I wouldn’t be against a wholesale rename if that becomes consensus, but I think a fair compromise might be adopting “Permission Space” as the canonical term, with a capitalized “Space” as shorthand once context is set. This way, we could use the full “Permission Space” on first reference and anywhere the ambiguity is a real problem, and use the shorthand elsewhere. Compound terms already supply the needed context and could stay as they are (“Space credential,” “Space host,” “Space authority,” “Space type,” etc.), but the standalone “Space” in prose that does the most damage could be disambiguated by using its fully qualified form.

I know it’s a small thing against a draft this early, but I do think naming is load-bearing for legibility and adoption, especially for a primitive this central, so I wanted to get it on the record.

erlend.sh · June 24, 2026, 10:58am

I started a sidebar topic to test our appetite for further terminology bikeshedding: Terminology of permissioned data

holobrine.bsky.social · June 24, 2026, 6:55pm

My main thoughts are that access control is actually naturally stateful and data driven. “Does this user have permission to do this?” is a stateful question asked at call time somewhere in the call stack and the permission itself is data.

I also don’t think terminology concerns are necessarily bikeshedding. Counterintuitive terms are pedagogically expensive, naming things well does matter, and it’s harder to change the names of things when the protocol is already in place.

holobrine.bsky.social · June 25, 2026, 6:07am

Another thought - the space owner’s PDS is a single point of failure for the whole space. That PDS must stay online to keep delivering tokens, or else eventually they expire and everyone loses access. I would like some distributed or redundant authority.

zicklag.dev · June 25, 2026, 3:35pm

I think that can be done with any of the methods used to make distributed / redundant web services.

You can have the DNS for your space host contain multiple IP addresses, use a reverse proxy that routes to multiple backends, use a distributed database for membership lists if you need it, etc.

I don’t think that needs to be built into the protocol, because all of that is available to us through DNS, HTTP, etc.

For some projects access control is stateful and based on a member list that is stored on the space host, but in other projects it will be “stateless” and based on the result of a call to a 3rd party API like GitHub (imagine allowing access to the space to any contributor to a GitHub repo with a keytrace link to their ATProto account).

The proposal allows different space hosts to implement the credential check however they need to.

In these “stateless” use-cases it’s actually very easy to make it highly available / redundant because the service itself can be made out of as many servers as you want running around the world and the rest of the reliability is built on top of GitHub’s API.
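The stateful and “stateless” cases described above can be pictured as two interchangeable policies behind the same space host. This is only a sketch: `AccessPolicy`, `MemberListPolicy`, `ExternalCheckPolicy`, and `SpaceHost` are hypothetical names, the external check is a stub rather than a real GitHub API call, and the client-application side of the decision is left out.

```python
from typing import Callable, Protocol


class AccessPolicy(Protocol):
    def allows(self, user_did: str) -> bool: ...


class MemberListPolicy:
    """Stateful: the space host stores its own member list."""

    def __init__(self, members: set[str]) -> None:
        self.members = set(members)

    def allows(self, user_did: str) -> bool:
        return user_did in self.members


class ExternalCheckPolicy:
    """'Stateless': every decision is deferred to an outside source of truth."""

    def __init__(self, check: Callable[[str], bool]) -> None:
        self.check = check

    def allows(self, user_did: str) -> bool:
        return self.check(user_did)


class SpaceHost:
    """Decides whether a space credential should be issued to a user."""

    def __init__(self, policy: AccessPolicy) -> None:
        self.policy = policy

    def should_issue_credential(self, user_did: str) -> bool:
        return self.policy.allows(user_did)


# Stub standing in for a third-party lookup such as "is this DID linked to a
# contributor of some repository?". A real check would call an external API.
linked_contributors = {"did:plc:carol"}

forum = SpaceHost(MemberListPolicy({"did:plc:alice", "did:plc:bob"}))
project = SpaceHost(ExternalCheckPolicy(lambda did: did in linked_contributors))
```

Because `ExternalCheckPolicy` holds no state of its own, any number of identical space host instances could answer for the same space, which is the redundancy argument made in the post.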

holobrine.bsky.social · June 25, 2026, 4:12pm

Fair point about a single DID having redundancy behind it.

Also, fair point that sometimes the permission will come from a 3rd party API. I still think a layer between this protocol and applications would be useful, but I suppose lexicons are already that layer, in principle. I think I am then interested in lexicons working the same for both public and permissioned data, which they should because it would be very unlike atproto if they didn’t.

dholms.xyz · June 25, 2026, 8:44pm

Yes, Lexicons work the same for public & permissioned data.

I actually expect 3 layers (at least for the use case of communities/groups): the protocol, a cross-app community governance/structure standard, and then application Lexicons.

We’re discussing what an atmospheric community is over here: What is an atmospheric group/community?

And I shared some early thoughts around some data modeling questions in this blog post: Modeling communities on permissioned data - Daniel's Leaflets

bnewbold.net · June 26, 2026, 9:56pm

I took some high-level notes while giving PR review feedback; I’m going to dump them here publicly.

There have been some mixed feelings about the design process (eg, bsky team driving a proposal), and some projects are already building/operating alternative architectures. But as a general vibe there seems to be growing ecosystem consensus around this branch of development.
Nothing is set in stone until we have multiple interoperating implementations and a real application built around it. This includes both small details and the overall architecture.
App services having possession of multiple space credentials from multiple accounts for the same space does feel weird, and will lead to weird/arbitrary implementation decisions (as was raised early on by divy). I don’t think it is a blocking issue, but I hope we can bring structure to this. Eg, provide simple implementation patterns/guidelines.
This design enables interoperation, and retains some of the principles from public data. But I think the centralizing forces and anti-competitive behaviors are different for spaces (vs public “shared heap”), and we don’t have as deep an analysis or narrative around that yet.
Could use a careful security and privacy review from somebody with fresh eyes.
The moderation and anti-abuse story is not very fleshed out yet.
- may want generic space blob pre-scanning at the PDS?
- relationship between spaces, app services, and labelers needs to be worked out
- human moderator access to reported content needs to be worked out
- rate limits and resource quotas will be needed

Here is a sketch set of “design goals” and “architecture properties” that have come up through the process:

scales to millions of readers (eg, for newsletter use case)
personal data (eg, preferences) works with multiple devices, client apps, and app services
the authority for space records is rooted in the account’s DID (keys in DID document), not the account’s PDS hostname. eg, hijacking the PDS DNS should not allow an attacker to publish fake data. this is the same as public data repos.
newly authorized services can “backfill” an entire space (all space records from all members). this is critical for competition and credible exit from service providers.
support for multiple concurrent client apps, and services (eg, moderation services), in the same space. but at the same time, may want the space authority to be able to limit or exclude client apps and services.
ability to migrate spaces between hosts and services
an account’s individual permissioned data can be bulk exported, and migrates between PDS hosts the same as public data
services can verify the authenticity of data they fetch from PDS hosts; but leaked data is refutable (not strongly authenticated at rest). end users would mostly trust their client app software and app services to have verified authenticity, but in most cases should be able to re-verify individual records by re-fetching (eg, using goat, unless the space has locked down client app list)