For context: I’m building Sifa, a professional identity layer on ATProto, i.e. a “LinkedIn for ATProto”. One of the features I’m working on is an activity feed on your personal Sifa profile that shows your ATProto activity across apps: your Bluesky posts, Tangled commits, Whitewind articles, Flashes photos, etc.
The data is public, sitting right there in the user’s PDS, and any AppView can read it by design. But I keep going back and forth on whether aggregating someone’s cross-app activity into a professional profile is the same thing as displaying records within their original app… or something qualitatively different.
(This also very much ties in with Trust Infrastructure on ATproto.)
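To make “sitting right there” concrete, here is a minimal sketch of how any client can list a user’s public records without authentication. `com.atproto.repo.listRecords` is the standard XRPC endpoint and `app.bsky.feed.post` is the Bluesky post collection; the function names, the PDS host and the DID are placeholders I made up for illustration, and other apps’ collection NSIDs would have to be looked up per app.

```typescript
// Build the public XRPC URL for listing one collection in a user's repo.
// Hypothetical helper; no credentials are involved anywhere.
function listRecordsUrl(
  pdsHost: string,
  did: string,
  collection: string,
  limit = 20,
): string {
  const url = new URL("/xrpc/com.atproto.repo.listRecords", pdsHost);
  url.searchParams.set("repo", did);
  url.searchParams.set("collection", collection);
  url.searchParams.set("limit", String(limit));
  return url.toString();
}

// Fetch the most recent records of one collection from the user's PDS.
// Assumes a runtime with a global fetch (modern browsers, Node 18+).
async function listRecords(
  pdsHost: string,
  did: string,
  collection: string,
): Promise<unknown[]> {
  const res = await fetch(listRecordsUrl(pdsHost, did, collection));
  if (!res.ok) {
    throw new Error(`listRecords failed with HTTP ${res.status}`);
  }
  const body = (await res.json()) as { records?: unknown[] };
  return body.records ?? [];
}

// Example call (placeholder host and DID, not real accounts):
// const posts = await listRecords("https://pds.example.com", "did:plc:abc123", "app.bsky.feed.post");
```

Repeat that call for each app’s collections and you have the raw material for a cross-app activity feed, which is exactly why the question below is about policy rather than capability.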
The 2 extremes
I keep bouncing between these two arguments:
A) Show all cross-app activity from all ATProto users.
This is literally what ATProto lexicons enable. If AppViews can’t read other apps’ data, what’s the point of a shared data layer? @flo-bit.dev started a new event RSVP app and it immediately benefits from previous apps creating events. You don’t have to start from scratch anymore, the data is already there and available to everyone.
Also: some users might find it uncomfortable to see all their public posts aggregated in one public place, but making them aware of this, even if uncomfortable, could very much be a good thing. Not because we should violate privacy/data rights, but because the user already made these things public, and everyone should be aware of the potential risks of that (especially when it is connected to your real name/identity). Sifa won’t be showing anything that your current employer/colleagues/friends/recruiters can’t already find today. Sifa can provide an opt-out if you want to “hide” certain apps, but that doesn’t change the fact that the data is still out there in your PDS, which is your responsibility.
B) Only do opt-in, don’t aggregate anything from non-users.
There’s a difference between “Tangled shows your Tangled commits” and “Sifa shows your Tangled commits AND your Bluesky posts AND your Whitewind articles, AND your Flashes photos etc. etc., all merged into a professional profile.” The individual pieces aren’t surprising. The aggregation however could very well be, especially if the one doing the aggregating is unknown to you. Feels very Palantir.
That ATProto makes everything public is an underlying technical detail. Do users of an ATProto app have a reasonable expectation that other apps will read their public records? Us nerds know it, but if we get to a point of adoption by normies, is that still an assumption we can make about our users? The user might not even know what ATProto is beyond “oh neat I can sign in with my Bluesky account”.
Technically, Sifa doesn’t store any PDS post data from users at all, we “just show it”. But to a user that difference probably doesn’t exist. Aggregation into a professional context is a different kind of processing than display within the original app context. Your shitposts on Bluesky and weekend pictures on Flashes look different next to your job title (and you probably know this if you’ve ever googled a job candidate). “The data was already public” has been the defense of every data broker ever, and we should probably aim higher than that.
I feel “don’t aggregate because people might not like it” and “do aggregate it: people should know what everyone can already see about them, be aware of it, and act on it if needed” both at the same time.
Legal angle
IANAL, but this is how I understand the current GDPR (I’m EU-based, so this isn’t optional):
- The Art. 6(1)(f) legitimate interest balancing test gets way harder when there’s zero relationship between the person and the service doing the aggregation.
- Meaningfully fulfilling Art. 14 transparency obligations is hard when you have no way to reach those people. There’s a “disproportionate effort” exemption in Art. 14(5)(b), but leaning on that for a core feature feels like the wrong kind of creative.
- Art. 9: political opinions and health data are still special category data under GDPR even if publicly posted. There’s an exemption for data someone “manifestly made public” (Art. 9(2)(e)), but the bar for that is high.
Where I’m currently at
Considering the above and (my interpretation of) the legal side of things, I’ve currently landed on this:
Case A: you’ve signed up to Sifa.
If you claimed your profile: we can show cross-app activity by default, with per-app opt-out. The user has a relationship with Sifa, was told at signup what would be displayed, and can hide specific apps they don’t want on their professional profile. This feels defensible both ethically and under GDPR. The data is public, ATProto is built for multi-AppView consumption, and the user actively chose to be here.
Case B: you’ve never signed up to Sifa
If your profile is unclaimed: You do have a basic page with some basic info (handle, avatar, display name), but we don’t show cross-app activity at all. Even if all the things are public, this does not give another app the right to pull it in. No scanning, no badges, no activity aggregation. No trust/reputation assessment.
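As a minimal sketch, the claimed/unclaimed split boils down to a rule like the one below. All type and function names are hypothetical (this is not Sifa’s actual code), and the app labels are just illustrative strings.

```typescript
// Hypothetical model of the two cases described above.
type ProfileState =
  | { kind: "claimed"; hiddenApps: ReadonlySet<string> } // signed up, per-app opt-out
  | { kind: "unclaimed" }; // never signed up

interface ActivityItem {
  app: string; // illustrative label, e.g. "bluesky" or "tangled"
  summary: string;
}

// Case B means "no scanning": for unclaimed profiles we should not even
// fetch cross-app records, so gate the fetch itself, not just the display.
function mayFetchActivity(profile: ProfileState): boolean {
  return profile.kind === "claimed";
}

// Case A: show everything by default, minus the apps the user has hidden.
function visibleActivity(
  profile: ProfileState,
  items: readonly ActivityItem[],
): ActivityItem[] {
  if (profile.kind === "unclaimed") {
    return [];
  }
  return items.filter((item) => !profile.hiddenApps.has(item.app));
}
```

Gating the fetch as well as the rendering matters here: if unclaimed profiles are only filtered at display time, the aggregation has still happened behind the scenes.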
But that feels SO against the ATProto spirit… aaaaaargh
So… yeah. Help?
- Are other AppView builders thinking about this? If you’re reading cross-app data, how do you handle the “user never heard of your app but we still have all this data about them” case?
- Should there be an ecosystem-level convention for this? Something like a PDS-level preference that says “don’t aggregate my data into contexts I haven’t opted into”? Or would that be against the spirit of ATProto?
- Where do you draw the line between “rendering public data” and “profiling”? A feed generator that reads your posts to rank them seems fine. An app that scans all your collections across all apps to build an activity profile… maybe less fine?
- Is the claimed/unclaimed split the right model, or are there better patterns?
No strong conclusions from my side. The model I described works for Sifa (and of course we can still adjust where needed, especially in this Alpha stage) but I’d rather figure out good patterns for the ecosystem than just for one product.
