Hey! I’m the person who wrote the proposal up above ^^^. I wanted to drop in and leave a few thoughts.
First, I’m very pleased that we over at furryli.st and Northsky arrived at broadly similar solutions to our common problem. I absolutely agree that shadow lexicons are the way forward for private/semi-private community data in the Atmosphere.
I do have a few thoughts on the specifics of the proposed architecture, mostly regarding the initial prioritization of “true” data privacy over system portability and data sovereignty. I would like to propose that simple obfuscation can provide significant additional benefits for community health, resilience, and portability while still serving as a bulwark against malicious actors.
(Before I get deep into this, I want to make clear that I am not a programmer/computer scientist by trade. My professional background is in biological systems/regulatory networks. This has certainly helped me understand distributed systems better, but it’s very possible I’ll miss key technical details that make a big difference. Please do correct me if this is the case!)
My main concern is with the initial proposal of a PDS as a shared relationship. Of course, as the proposal points out, such a boundary definition is not set in stone. But it is my opinion that (with the exception of extreme cases where privacy is absolutely paramount for immediate user safety) a PDS boundary is not the best solution, and, moreover, that consensus towards an alternate boundary system with DID collections/lists could have significant benefits for interoperability, portability, and UX.
A few issues come to mind with the PDS gating architecture (some of which have already been mentioned):
- PDS gating introduces the PDS as a single point of failure for the community. It lacks true portability, which, in my opinion, is a crucial aspect of the longevity of a community within the protocol. Although it’s obviously not 1:1, such a structure is reminiscent of Mastodon/ActivityPub servers, where accounts are at the mercy of the good will and competence of the server host. Loss of the PDS could lead to a catastrophic loss of community integrity and history.
- Related to this: requiring users to move to a PDS in order to interact with a community is a significant ask. It is essentially a request to entrust a user’s account hosting to a third party that the user might or might not consider trustworthy. This is not necessarily a deal-breaker, especially since account backups are technically possible, but I think it would rightfully give many users pause, particularly with newer and/or smaller projects.
  - If I understand @bmann.ca’s suggestion correctly, this could potentially be mitigated by using a record to define an arbitrary number of new service endpoints. However, this would still result in records being siloed off within individual PDSes.
- PDS gating creates a significant moderation load for the PDS host. Most significantly, it potentially removes the first line of defense against revolting content such as gore/CSAM. As I understand it, all content within user repositories in Bluesky PDSes is moderated by Bluesky, regardless of whether it’s published under an app.bsky lexicon or not (since it’s still content hosted on their servers). I think that for the long-term sustainability and mental health of the moderation team, it’s in a community’s best interest to offload at least initial moderation screening to a third party if possible, and perhaps to create an override system for bad calls.
  - Since I understand that moving away from Bluesky moderation can be a big motivator for private community spaces in ATProto, an alternative might be implementing Hive content moderation tools before serving posts to users. However, I’m not entirely sure about the specifics of how that would work.
- A monolithic PDS-centric design limits the number of communities a user can join. Imagine a user who is Black, queer, a furry, and a game developer. Each of these identities could plausibly have a need for pseudo-private community spaces. Yet, if all of these communities are bounded through PDS gating, it forces users to choose: which identity do you value the most? I don’t think this choice should be necessary.
If the disadvantages I’ve listed are indeed accurate, I think that exploring other alternatives within the solution space is worth our time.
One such alternative solution @evelyn.northsky.team pointed out is a simple collection of DIDs. This is essentially what furryli.st is: a collection of DIDs based on @furryli.st’s follow list which can be used dynamically in any way we see fit. This collection is manually curated by screening users that request to join the list using an internal set of rules and basic visual screening to approve or reject requests. We internally call this a “curated cluster”.
I’ve described our proposed design in my leaflet post, but I’ll summarize it here. Our proposed design is two-pronged:
- To create a shadow lexicon mirroring app.bsky or any arbitrary social app, creating a boundary between records meant to be kept within the cluster and records meant to be broadcast to the wider network.
- To use a curated cluster’s DID collection (such as furryli.st) to exclude shadow lexicon records from users who are not part of the cluster.
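
To make the two prongs above a bit more concrete, here is a minimal sketch in Python of how an app might combine them. Everything named here is an assumption for illustration only: the `com.example.shadow.` NSID prefix, the `Record` shape, the `serve_cluster_feed` function, and the `did:example:` identifiers are all hypothetical, and gating on the viewer as well as the author is a choice made for this sketch rather than something the design prescribes.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

# Hypothetical NSID prefix for a shadow lexicon (illustrative, not a published lexicon).
SHADOW_PREFIX = "com.example.shadow."


@dataclass(frozen=True)
class Record:
    author_did: str  # DID of the repo that holds the record
    collection: str  # NSID of the record's collection
    text: str


def is_shadow(record: Record) -> bool:
    """Prong 1: the lexicon namespace marks a record as cluster-internal."""
    return record.collection.startswith(SHADOW_PREFIX)


def serve_cluster_feed(
    records: Iterable[Record], cluster_dids: Set[str], viewer_did: str
) -> List[Record]:
    """Prong 2: use the curated DID collection as the boundary.

    Shadow records are served only if their author is in the cluster.
    Refusing non-member viewers is an extra assumption of this sketch.
    """
    if viewer_did not in cluster_dids:
        return []
    return [r for r in records if is_shadow(r) and r.author_did in cluster_dids]
```

Note that nothing in this filter looks at where a record is hosted, which is the point: the records themselves remain publicly readable in each user's repo, and the boundary is applied only by the app that chooses what to serve.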
Using curated DID collections could potentially solve the issues I described:
- There is no personal investment into the community ecosystem through PDS migration. Users are able to remain on Bluesky’s PDS, or migrate to their own, without losing access to whichever communities they are a part of.
- Bluesky T&S can be used to lower moderation load. While this may not be ideal for some communities, for others who do not have the manpower to consistently maintain active moderation systems, this could make the operation of these community spaces much less burdensome. Cluster operators could act more as bouncers, which we’ve found to be an easier and more forgiving role.
- Users are not limited in the number of curated clusters they can join. A user can be a member of a Black cluster, a queer cluster, a furry cluster, and a game dev cluster, without needing to sacrifice any particular community or identity.
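
As a rough illustration of the last point, membership in a curated cluster can be modeled as nothing more than a DID appearing in a set, so one account can sit in any number of clusters at once. The cluster names, the `memberships` helper, and the `did:example:` identifiers below are hypothetical placeholders, not part of any existing tool.

```python
from typing import Dict, FrozenSet, List

# Hypothetical clusters: each one is just a curated set of member DIDs.
CLUSTERS: Dict[str, FrozenSet[str]] = {
    "furry": frozenset({"did:example:alice", "did:example:bob"}),
    "gamedev": frozenset({"did:example:alice"}),
    "queer": frozenset({"did:example:alice", "did:example:carol"}),
}


def memberships(did: str, clusters: Dict[str, FrozenSet[str]]) -> List[str]:
    """Return the names of every cluster that lists this DID, sorted by name."""
    return sorted(name for name, members in clusters.items() if did in members)
```

The lookup never consults the user's PDS host, so moving between PDSes would leave every membership untouched under this model.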
This design can also open up a world of new possibilities that I would describe as Atmospheric:
- Curated clusters are portable: in theory, anyone could make a carbon copy of furryli.st and run their own cluster using the open-source tools we’ve built. In the event we go rogue, users could switch to infrastructure run by a separate party. Since the vast majority of them would likely still be members of the carbon copy, the experience would be broadly identical, and, since the design is PDS-agnostic, no records would be lost.
  - This might require an alternate design from the “boundary” property initially proposed by Evelyn. If the boundary DID collection is baked into records, alternate clusters and apps might need to use workarounds to serve records using the previous cluster boundary to new users.
- The portability of the clusters could encourage users to create peripheral infrastructure (e.g. feeds and labelers) custom-made for their respective communities, which could be promoted and even integrated into the user experience by community runners if they benefit the cluster. Different services could even create different experiences while still gating using the same curated cluster. This opens up all sorts of possibilities for tailor-made experiences for a particular community.
- Given that the design is only semi-private, there could be exciting opportunities for interoperability between different communities. If communities organize using a similar schema, users could, rather than accessing separate apps for separate clusters, use a single, portable client to interact with multiple clusters at once and switch between different social “lenses” depending on the cluster they’d like to view or interact with (perhaps with cluster operator accounts defining parameters for the social experience using a standardized record).
  - These “lenses” don’t have to be limited to showing the shadow lexicon either. They could also filter app.bsky posts to show public posts from members of the cluster (or even from boolean combinations of different clusters). This would leverage curated clusters and the already existing, popular feeds using the app.bsky lexicon to narrow the field of view through which the user experiences the network.
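
The “lens” idea, including boolean combinations of clusters, could be sketched roughly as follows. This is only one possible shape, assuming a lens is simply a yes/no test on a post author's DID; the `Lens` type, the combinator names (`member_of`, `all_of`, `any_of`, `negate`), the post dictionaries, and the `did:example:` identifiers are all made up for illustration.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List

# A lens decides whether posts by a given author DID are in view.
Lens = Callable[[str], bool]


def member_of(cluster: FrozenSet[str]) -> Lens:
    """Lens that shows only authors listed in one curated cluster."""
    return lambda did: did in cluster


def all_of(*lenses: Lens) -> Lens:
    """Boolean AND: the author must pass every lens."""
    return lambda did: all(lens(did) for lens in lenses)


def any_of(*lenses: Lens) -> Lens:
    """Boolean OR: the author must pass at least one lens."""
    return lambda did: any(lens(did) for lens in lenses)


def negate(lens: Lens) -> Lens:
    """Boolean NOT: invert a lens."""
    return lambda did: not lens(did)


def apply_lens(posts: Iterable[Dict[str, str]], lens: Lens) -> List[Dict[str, str]]:
    """Filter posts (public or shadow) down to the authors the lens admits."""
    return [post for post in posts if lens(post["author"])]
```

For example, `apply_lens(posts, all_of(member_of(furry), member_of(gamedev)))` would narrow a public app.bsky feed to authors who are in both hypothetical clusters, without either cluster operator needing to coordinate with the other.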
These benefits I’ve listed, of course, ignore the elephant in the room regarding this proposed collection-centric design: since records are not confined to users within a PDS, the records are not fully private. Although they would be obfuscated by not being served by major clients, and a collection boundary would largely prevent bad actors from participating in discussions where they don’t belong, records using the shadow lexicon would still be publicly findable and accessible just like any other record. This is, indeed, the biggest advantage of Evelyn’s proposed design, and strictly meets this WG’s goal of designing a scheme for private records within the AT Protocol.
I did, however, want to bring up the benefits of our alternate collection-centric design which, while not fully meeting the goals of this WG, nonetheless approximates a similar result while opening up several possibilities for a more modular, portable, resilient, and overall atmospheric ecosystem design.
As I mentioned before, I don’t think collection-gating would be suitable for every community. I think Evelyn’s proposal is particularly valuable for users at especially high risk of persecution or harassment: political dissidents, persecuted minorities, private collectives and organizations. However, I do want to at least pose the question: At what point do the benefits of a semi-public yet interoperable design outweigh the drawbacks of a truly private yet monolithic design?
I think that there are several kinds of communities that could happily reap the benefits I listed of a semi-private interoperable design while experiencing very few of the potential negatives from lack of true data privacy. This is not limited just to the Black community, furries, game devs, queer people, etc. This could create curated, gated community spaces for, say, universities, research communities, fandoms, really anything you could make a subreddit about. Yet, unlike subreddits, which are monolithically owned by moderators and site admins, communities would be a commons within the ATmosphere that can be tailored to each community’s individual needs and wants, without the risk of centralization, data loss, or a single point of failure.
Obviously, it would be important to let users know, in no uncertain terms, that posts using the shadow lexicon are not truly private. It would be a mistake to give users a false sense of security. But it’s my sincere belief that many communities would be okay with this, so long as the community space they interact with is truly gated from outsiders and malicious actors.
Edits: Wording/formatting