Thoughts On Advanced Read / Write Access Control & Permissioned Data

Continuing the discussion from Permissioned Data PDS Lexicons:

@rochebit.net, starting a new topic since the previous one was specifically about lexicons.

The permissioned data proposal does make space membership include full read / write access, but I think that is as it should be.

That said, I also want to enable all the different access patterns you’re talking about here. They are extremely important for the app my team and I are making, roomy.space, which is similar to Discord.

This has been pretty much my whole focus for the last couple of months, so I’ll try to make the case that the permissioned space proposal is OK as is, and that we can still have the extra features we want with it.

This is something I need to turn into a proper, focused Leaflet at some point, but I’ll try to get the ideas out here first.


Write Access Is Way More Complicated

First of all, the issue with adding write access to the spec is that as soon as you need write access control, things get way more complicated. There are all kinds of different ways you could decide to authorize writes, and putting any one of them in the protocol would be very opinionated and would rule out changing it later, which is a problem.

It also works against the core ATProto idea of users having their data on their own PDS.

There’s no way that we could stop users from writing data to their own PDS, even if we wanted to. So if we still want to let people put private data on their own PDS, we really can’t stop them from writing anything to it.

The proposal still allows write access to be “controlled” by AppViews by ignoring data that is not allowed / invalid somehow. This is similar to how AppViews work today on public data. They can’t just trust the user input, but they also can’t stop the user from writing to their PDS.
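As a rough illustration of that "ignore what isn't allowed" pattern, here is a minimal Python sketch. The `Record` shape, the `space.example.message` collection, and the membership rule are all made up for the example; none of them come from the proposal or from a real AppView.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    author_did: str  # DID of the repo the record was written to
    collection: str  # hypothetical lexicon NSID
    body: dict


def is_allowed(record: Record, members: set[str]) -> bool:
    """Hypothetical app rule: only members may post, and a post needs text."""
    if record.author_did not in members:
        return False
    text = record.body.get("text")
    return isinstance(text, str) and text != ""


def index_records(incoming: list[Record], members: set[str]) -> list[Record]:
    """Keep what the app accepts. Everything else still exists on the
    author's PDS; the AppView just never indexes it."""
    return [r for r in incoming if is_allowed(r, members)]


members = {"did:example:alice"}
incoming = [
    Record("did:example:alice", "space.example.message", {"text": "hi"}),
    Record("did:example:mallory", "space.example.message", {"text": "spam"}),
    Record("did:example:alice", "space.example.message", {}),
]
indexed = index_records(incoming, members)
```

The point is that nothing here blocks a write. Mallory's record and Alice's empty record both land on their PDSes; the app just doesn't show them.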

That’s OK for some things, but not for everything, like you point out. I’ll get into a way to handle strict access control like you are talking about without changing the permissioned data proposal further down.

The Proposal Adds Just Enough For Read Access Control ( And That’s Good )

The permissioned space proposal adds just enough to let us sync private data that is hosted on our PDSes. I think that’s good. It’s simple and doesn’t restrict the kinds of solutions that we can build on top.

It lets us make custom space hosts that can use whatever logic we want to create spaces and control who can read them, while still allowing the actual data to be hosted on the PDSes.

It fits the ATProto model really well.

Write Access Control Is Missing From Public and Permissioned Data

What we realized while evaluating our needs for Roomy is that write access control isn’t just missing from permissioned data, it’s missing from public data, too! There’s no way to make an “organization” or “community” account where you can control what gets written by whom.

Any time an app runs into this need, they have to put the data on the AppView and start treating it as authoritative, which causes difficulty with letting people host their own AppViews and generally goes against having all of our data on protocol.

I haven’t read through the whole thing yet, but this leaflet by Tangled touches on the challenges that exist because there’s no such thing as community-owned / access controlled data:

What if Bob nukes his account? Suddenly all bob’s work to that repo will disappear from entire network. This should not happen and obviously not an expected behavior in collaboration platform. For example, we don’t expect old contributor to be able to nuke all their codes and commits from the project without permission just because they want to. Current Tangled appview is working fine because we use our own DB as a source of truth collecting all events in real-time since the very beginning, validating them just as we received them.

So in general we have this issue where ATProto only has a concept of personal data, and there is no facility for community / organization data. Personal data is simple, because you never need to do write access control beyond checking whether you are the PDS owner. It is hard to make an access control mechanism that works for every application, especially when every app can make its own lexicon, which has its own semantics and needs its own access control rules.

Experimental Community Arbiter Service

So the solution that we’re working on is a generic authorization service that can sit in front of a PDS and provide access control for public ATProto repos, as well as act as a space host and provide access control for permissioned repos.

I’ve almost got a working prototype that uses the Rego policy language to let you create completely custom access control rules.

It works almost like an XRPC proxy that can be associated with a community / org DID and presents a standard ( work-in-progress ) API for managing group membership, compatible with the permissioned space proposal.

It can then use that group membership to make policy decisions about who can write what to the public repo and permissioned spaces, under that DID’s authority.

The cool part about this is that it lets us move authoritative data back on protocol, while not sacrificing on access control. The Rego policy is extremely flexible, and can even make XRPC requests to remote servers in order to determine whether access is granted for a particular request.
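To make the shape of that decision concrete, here is a small sketch in Python. To be clear about the assumptions: this is not Rego and not the arbiter's actual API. The roles, the `space.example.message` collection, and the function names are hypothetical stand-ins for whatever a real policy would define.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WriteRequest:
    caller_did: str  # who is asking to write
    collection: str  # hypothetical lexicon NSID being written
    action: str      # "create", "update" or "delete"


@dataclass
class Community:
    did: str
    # member DID -> role; stands in for the group membership list
    roles: dict[str, str] = field(default_factory=dict)


def allow_write(req: WriteRequest, community: Community) -> bool:
    """Hypothetical policy: admins may do anything, members may only
    create messages, and non-members may not write at all."""
    role = community.roles.get(req.caller_did)
    if role is None:
        return False
    if role == "admin":
        return True
    return req.collection == "space.example.message" and req.action == "create"


def handle_write(req: WriteRequest, community: Community) -> str:
    """Proxy step: check the policy first, then pass the write along."""
    if not allow_write(req, community):
        return "denied"
    # A real proxy would forward the XRPC call to the community's PDS here.
    return "forwarded"


community = Community(
    did="did:example:community",
    roles={"did:example:alice": "admin", "did:example:bob": "member"},
)
```

Since the policy is just a function of the request and the membership, swapping in a different rule set doesn't touch the proxy step, which is the flexibility a policy language is meant to give.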

We’ve been calling the service the “arbiter” and I’ve been writing about our plans for it as it has developed, but I don’t have a good writeup on the latest plan with the Rego policies. You can see how the idea has been evolving by following the chain of Leaflets if you want:

The prototype I’m close to finishing will let you create new community / org accounts and control access to the account’s public ATProto repo with a customizable policy. It should also be able to act as an alternative to https://opensocial.community.

There’s a lot to figure out, but we think we’ve got a good base and are ready to start experimenting with actual use as soon as possible.


A great part about ATProto and the permissioned data proposal, in my opinion, is that it leaves the door wide open for us to create things like a custom arbiter for community data.

If this were to be specified in the protocol itself, then people would be forced to use a particular form of access control that may not work for everybody. But because it is left unspecified, we are able to come up with our own solutions.

Having to come up with our own solutions isn’t always the nicest, but I appreciate that we aren’t forced into a particular model. And even if some people use our arbiter, it is built on top of the existing protocol, so other apps can be minimally interoperable with it even though they may not know about the specifics of our extensions.

Thanks

I probably shouldn’t have used the term “write permission” so much because, like you said, the writes all go to your own personal PDS. So instead of write permission, it’s more of a “to be read” list: which PDSes to include at read time.

I like the Arbiter path and how it supports much more nuanced management, roles and permissions around a space or community. Am I understanding it correctly that while a “standard” PDS implementation will be able to host a space, for a space to have this advanced functionality it would need to be run on a special space host like this Arbiter?

Yeah, the arbiter would work as a special space host, but it’d still be compatible with standard PDSes, as far as letting people put their permissioned data on their own PDSes.

I think that it makes sense to be able to have this Arbiter on a special space host. It definitely fits nicely with the atproto architecture. I even expect this will be the most common way that communities are managed, to take advantage of these advanced control options. But I feel we shouldn’t make using a customised space host a requirement to get these additional controls.

One of the things I find great about atproto is that the PDS’s are simple and everything extra is built separately. We could have some PDS host offer great content creation tools, or delayed content posting, etc. But the atproto architecture means these features aren’t restricted to the PDS host. An AppView could offer this same functionality, or even if a PDS host did offer it, there’s no reason those tools couldn’t work for someone with their PDS hosted elsewhere. This really locks in the concept that you truly own your data, you can take it anywhere, you can host it on any software stack, as long as it meets this limited set of requirements.

If the Arbiter requires these special space hosts that do things with the APIs beyond the spec, then communities lose that ability to be hosted anywhere and move as freely.

When I look through your leaflets on the Arbiter concept, it would seem that almost all of it could be done with a set of permissioned spaces on a standard PDS implementation. The only piece I see missing is separating out the “can read” and “to be read” lists. If the permissioned space spec required this, then the Arbiter could be completely detached from where the space is hosted. I could set up any PDS to host this community (ideally not under my personal identity, but that’s also allowed even if not advised), then sign into some app that helps me manage community spaces. That app could create all the spaces needed to control membership, keep things in sync, etc., all without additional control over this PDS. Then we wouldn’t have to have the Arbiter in the path when getting the tokens to access the data (although we still could when we want to); that could be handled completely on the PDS side.

I agree that the protocol shouldn’t be so specific that it locks in all the complex ways of managing access, and solutions like the Arbiter are a great example of what could be built on top of the limited PDS. I just think breaking out those two use cases (“can read” and “to be read”) allows a lot more freedom for innovation like this. Just those two; it shouldn’t need to handle complex roles and permissions. And as a side benefit, minimally compliant PDSes will also be able to handle a little more complexity than under the current spec.
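To show what I mean by splitting the two lists, here is a tiny Python sketch. The `Space` class and its field names are my own illustration and are not taken from the permissioned space proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Space:
    # DIDs allowed to fetch data from the space
    can_read: set[str] = field(default_factory=set)
    # DIDs whose PDSes should be included when the space is read
    to_be_read: set[str] = field(default_factory=set)


def may_read(space: Space, did: str) -> bool:
    """Access check: is this DID allowed to read the space at all?"""
    return did in space.can_read


def repos_to_sync(space: Space) -> list[str]:
    """Which repos a reader should pull from, independent of who may read."""
    return sorted(space.to_be_read)


# Hypothetical announcements space: everyone reads, only one account is read from.
announcements = Space(
    can_read={"did:example:alice", "did:example:bob", "did:example:carol"},
    to_be_read={"did:example:alice"},
)
```

With a single member list, Bob and Carol would automatically be sources as well as readers. With two lists, any app that manages the space can change one without touching the other.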

This is basically what we’re doing with Roundabout; we have a ‘community-bot’ service that does two things:

  1. Handles membership. We model membership slightly differently than @dholms.xyz’s proposal; a user who wants to join a community writes a join request (kind of like a follow!) into their pds, and then the community-bot adds the user to the community’s follows (i.e., membership).
  2. We have some data that is explicitly “community data”, mostly managed by community stewards. That is to say that it’s not data that should live on the stewards’ pdses, but it should come from them. The community bot has a set of rules that watch the stewards’ repos (the list of stewards is defined by a manually built collection on the community pds), and when one of them writes to a community collection, it clones the data into the community repo.
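The two jobs above can be sketched roughly like this in Python. This is a simplified illustration and not Roundabout's actual code: the class, the method names, and the `community.example.rules` collection are hypothetical, and a real bot would read from and write to PDSes instead of in-memory structures.

```python
class CommunityBot:
    def __init__(self, stewards, community_collections):
        self.stewards = set(stewards)
        self.community_collections = set(community_collections)
        self.members = set()      # stands in for the community's follows
        self.community_repo = {}  # (collection, rkey) -> record

    def on_join_request(self, user_did):
        """Job 1: a user wrote a join request, so add them as a member."""
        self.members.add(user_did)

    def on_repo_write(self, author_did, collection, rkey, record):
        """Job 2: clone steward writes to community collections.
        Returns True if the record was cloned, False if it was ignored."""
        if author_did not in self.stewards:
            return False
        if collection not in self.community_collections:
            return False
        self.community_repo[(collection, rkey)] = dict(record)
        return True


bot = CommunityBot(
    stewards=["did:example:steward"],
    community_collections=["community.example.rules"],
)
bot.on_join_request("did:example:newbie")
cloned = bot.on_repo_write(
    "did:example:steward", "community.example.rules", "1", {"text": "be kind"}
)
ignored = bot.on_repo_write(
    "did:example:newbie", "community.example.rules", "2", {"text": "no rules"}
)
```

In this sketch a non-steward's write is left where it is on their own repo; it just never gets copied into the community repo.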

It’s not perfect, but it was extremely simple to build and has served us well for over six months now. Nice to see that roomy is approaching things in a similar way! I don’t know what meaningful interop looks like for this sort of pattern, but it would be cool if we could get there!

Absolutely! I was experimenting with some lexicons that we might be able to share between our arbiter and Habitat’s, but I think we definitely need some more real experimentation before we can really figure out what a common layer might be.

Super excited about the possibilities, though! And we’re really close to having an arbiter implementation we can play with so hopefully that will give us more concrete stuff to work off of.

I think I had this rather wrong, and I worked out where I led myself astray. I had read the part of the spec about notifying apps when writes happen, which mentioned it would contain the DID of the member. I then assumed that meant that only members could write, and that the space was doing some check against this list before forwarding on these notifications. In this way it was essentially dictating what content should be read, rather than just leaving it up to the apps or other Arbiters. But if instead that was not meant to be member specific, and in fact the space will forward on all writes irrespective of any eligibility, then all my concerns about having that member list conflated for multiple purposes go away.

And all this may be moot if the member list ends up being removed entirely.
