What is Private Data?

I’m catching up so it seems a good time to summarise some of the different things I’ve read on this and take a moment to abstract… apologies if this synthesis isn’t useful.

I’m going to be bold and say that what we are talking about here is content visibility, although you will see that broaden quickly even within this post. Let’s start simply; for single-author content there are several types of audience:

  • Self - keeping visibility closed while drafting prior to publication, for example, or writing journal entries and notes which will never be seen by another user. Some are explicitly assuming that this ought to be managed by an app, but that solution raises all sorts of questions about data resilience and backup, about which users will make their own incorrect assumptions.
  • One other - the classic DM, where a post is sent to one other trusted user, is not currently dealt with on proto. An on-proto solution should be able to handle all the things people are used to with public posts. Reactions and replies are easy to envisage. Reposts and quote posts are harder to think about. This cannot be handled by an app alone… there needs to be a way to route the content. This immediately exposes us to metadata leakage and throws minds towards e2ee, about which @ianopolous.bsky.social has written, so we need a way to scope this requirement such that it remains useful while meeting the expectations of general users. @davenash.com has opened the dialogue.
  • A group of homogeneous others - at this point we introduce the need to manage the list of others, and different people have very different assumptions about what this means. We also introduce reply-all on top of reply, for example.
  • Everyone - this seems to work pretty well… 🙂
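
To make the four audiences concrete, here is a minimal sketch of them as a single visibility check. It is only an illustration: `Audience`, `Record` and `can_view` are hypothetical names, not part of any atproto lexicon or API, and the check deliberately ignores routing, metadata leakage and encryption.

```python
from dataclasses import dataclass, field
from enum import Enum


class Audience(Enum):
    SELF = "self"          # drafts, journal entries, notes
    ONE_OTHER = "one"      # the classic DM
    GROUP = "group"        # a managed list of others
    EVERYONE = "everyone"  # public posts


@dataclass(frozen=True)
class Record:
    author: str            # hypothetical: the author's DID or handle
    audience: Audience
    recipients: frozenset = field(default_factory=frozenset)


def can_view(record: Record, viewer: str) -> bool:
    """Return True if `viewer` falls inside the record's audience."""
    if viewer == record.author or record.audience is Audience.EVERYONE:
        return True
    if record.audience is Audience.SELF:
        return False
    # ONE_OTHER and GROUP share the same check here; they differ in how
    # many recipients are listed and in who may manage that list.
    return viewer in record.recipients


draft = Record("alice", Audience.SELF)
dm = Record("alice", Audience.ONE_OTHER, frozenset({"bob"}))
print(can_view(draft, "bob"), can_view(dm, "bob"), can_view(dm, "carol"))
```

Even in this toy form, the recipient list sits inside the record, which is exactly the kind of metadata that people may expect to stay non-public.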

For multi-author content there are a bunch of other considerations, some of which are identified by @btrs.co in their post. One of the interesting things to consider here is that we add the concept of roles to the groups. As a consequence we’ve bought ourselves even more first-order primitives which need to be handled: the content, the roles, the groups (e.g. author, editor, subscriber), the users, the metadata. Each of these needs its own visibility consideration, and we’ve suddenly introduced workflow onto some of them.
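
As a rough sketch of what adding roles implies, the snippet below treats "seeing who else is in the group" as a permission in its own right, alongside reading and editing the content. The role names come from the examples above; the `Action` values and the permission table are assumptions made purely for illustration, not a proposal.

```python
from enum import Enum, auto


class Role(Enum):
    AUTHOR = auto()
    EDITOR = auto()
    SUBSCRIBER = auto()


class Action(Enum):
    READ = auto()
    EDIT = auto()
    PUBLISH = auto()
    SEE_MEMBERS = auto()  # visibility of the group itself


# Hypothetical permission table: which role may do what.
PERMISSIONS = {
    Role.AUTHOR: {Action.READ, Action.EDIT, Action.PUBLISH, Action.SEE_MEMBERS},
    Role.EDITOR: {Action.READ, Action.EDIT, Action.SEE_MEMBERS},
    Role.SUBSCRIBER: {Action.READ},
}


def allowed(roles: dict, user: str, action: Action) -> bool:
    """Check whether `user` may perform `action`, given a user -> Role map."""
    role = roles.get(user)
    return role is not None and action in PERMISSIONS[role]


team = {"ann": Role.AUTHOR, "ed": Role.EDITOR, "sam": Role.SUBSCRIBER}
print(allowed(team, "sam", Action.READ), allowed(team, "sam", Action.SEE_MEMBERS))
```

The `PUBLISH` action is where workflow creeps in: content now has states, and each state may need a different audience.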

These things are important because there are wide expectations that, for non-public content, how it is organised and who has rights to it also need to be non-public. Either those expectations need to be reset or thorny questions about metadata visibility and e2ee need to be addressed.

There are questions about legal risk and who carries it. If a PDS owner can be identified then they can be coerced. If a PDS chimney is smoking you can be certain that the owner’s door will be knocked on. Maybe that’s OK. Maybe it’s not. Does discoverability outweigh anonymity, or should that permeability be a PDS owner’s choice?

We have to answer questions about whether a new addition to a group can see content posted prior to their addition. The mail example from Paul’s leaflet, for instance, comes with assumptions about state both for the post and the group itself - valid assumptions which make some use cases impossible.
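
One way to see why this matters is to write the rule down. The sketch below assumes each group carries a history policy and each membership records when it started. The names (`HistoryPolicy`, `Group`, `visible_to`) and the integer timestamps are hypothetical, chosen only to show that the answer depends on state kept for both the post and the group.

```python
from dataclasses import dataclass, field
from enum import Enum


class HistoryPolicy(Enum):
    FROM_JOIN = "from_join"  # members see only what was posted after they joined
    FULL = "full"            # new members can read back through earlier posts


@dataclass
class Group:
    policy: HistoryPolicy
    joined_at: dict = field(default_factory=dict)  # member -> join time


def visible_to(group: Group, member: str, posted_at: int) -> bool:
    """Can `member` see a post made to `group` at time `posted_at`?"""
    if member not in group.joined_at:
        return False
    if group.policy is HistoryPolicy.FULL:
        return True
    return posted_at >= group.joined_at[member]


quiz = Group(HistoryPolicy.FROM_JOIN, {"alice": 0, "bob": 10})
print(visible_to(quiz, "bob", 5), visible_to(quiz, "bob", 12))
```

Under `FROM_JOIN` the system has to remember when each member joined; under `FULL` it has to be able to deliver old content to someone who was not a recipient when it was written. Neither is free, and a design that bakes in one makes the other hard.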

There are also questions about who can manage the groups and how big they can be - someone wanting to build a subscription service handling 1b users with 10,000 content flavours is coming from a very different set of assumptions than someone trying to coordinate their pub quiz team.
