@erlend.sh reminded me of glue today, so let’s get this idea out there.
Thanks to the good people in the EU, incumbent platforms are required to provide some kind of data export for their users. I was able to bring my entire Google Photos library into Immich because of Google Takeout. Facebook, Instagram, Twitter, all of the big platforms offer some version of this. As we grow the Atmosphere, people moving over their social media accounts will want to bring this data with them. Starting a new account with your full post archive is a much better experience than starting with a blank slate. How might we make this easy for users?
Atomizer, a Rosetta Stone for takeout zips and lexicon
I propose a community project to build tooling and services which convert between lexicon and takeout archives.
Data Import
Accept these data takeout archives and convert their contents into records in the appropriate lexicon. Atomizer would contain a set of libraries and functions for converting common, compatible data, such as:
- Instagram → Flashes | Bluesky | Spark
- Twitter → Bluesky | Anisota
- Substack → Leaflet | Whitewind | PiPup
Libraries would provide standard logic for interpreting these archives and writing the resulting records to the user's PDS.
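To make the import idea concrete, here is a minimal sketch of one such converter, turning entries from a Twitter archive into `app.bsky.feed.post` records. It assumes the archive stores tweets in a `tweets.js` file that is a JavaScript assignment wrapping a JSON array, with each entry holding a `tweet` object that has `full_text` and `created_at` fields; check that against a real export before relying on it. The function names `parse_tweets_js` and `tweet_to_post` are made up for illustration.

```python
import json
from datetime import datetime, timezone


def parse_tweets_js(raw: str) -> list[dict]:
    """Strip the leading JavaScript assignment and parse the JSON array.

    Assumes the file looks like: window.YTD.tweets.part0 = [ ... ]
    """
    _, _, payload = raw.partition("=")  # split on the first "=" only
    return json.loads(payload)


def tweet_to_post(entry: dict) -> dict:
    """Map one archive entry to a minimal app.bsky.feed.post record."""
    tweet = entry["tweet"]
    # Assumed timestamp format: "Wed Oct 10 20:19:24 +0000 2018"
    created = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
    return {
        "$type": "app.bsky.feed.post",
        "text": tweet["full_text"],
        "createdAt": created.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


if __name__ == "__main__":
    sample = (
        'window.YTD.tweets.part0 = [{"tweet": {"id_str": "1", '
        '"full_text": "hello world", '
        '"created_at": "Wed Oct 10 20:19:24 +0000 2018"}}]'
    )
    for entry in parse_tweets_js(sample):
        print(tweet_to_post(entry))
```

This only covers plain text and timestamps. A real importer would also need to handle media, replies, links and mentions, and text longer than the 300 graphemes a Bluesky post allows, and it would then have to write each record to the PDS.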
Data Conversion
The second component is for converting between similar lexicon. ATproto facilitates competition, and we have seen many emerging lexicon for the same kinds of applications, each with different trade-offs and design decisions. Luckily, public JSON files of similar schemas are not hard to translate. I propose the second aspect of Atomizer provide tooling for conversion between similar lexicon to facilitate data import and compatibility between competing apps on the Atmosphere, such as:
- Bluesky ↔ Anisota
- Bluesky ↔ Flashes ↔ Spark
- Leaflet ↔ Whitewind ↔ PiPup ↔ Pckt
- Community Bookmarks ↔ Monomark ↔ Semble
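As a rough sketch of what a lexicon-to-lexicon converter could look like, the example below renames fields from one record shape to another and keeps a list of any fields it could not carry over. The two lexicon names (`com.example.blog.entry`, `org.example.journal.post`), the field names, and the `convert_record` helper are all hypothetical; they do not describe the real schemas of any app listed above.

```python
from dataclasses import dataclass, field

# Hypothetical mapping: source field name -> target field name.
FIELD_MAP = {"title": "title", "content": "body", "createdAt": "publishedAt"}


@dataclass
class ConversionResult:
    record: dict                                       # the converted record
    dropped: list[str] = field(default_factory=list)   # source fields with no target


def convert_record(source: dict, target_type: str, field_map: dict[str, str]) -> ConversionResult:
    """Rename mapped fields and report every field that has no equivalent."""
    record = {"$type": target_type}
    dropped = []
    for key, value in source.items():
        if key == "$type":
            continue  # the type is replaced, not copied
        if key in field_map:
            record[field_map[key]] = value
        else:
            dropped.append(key)
    return ConversionResult(record, sorted(dropped))


if __name__ == "__main__":
    entry = {
        "$type": "com.example.blog.entry",
        "title": "Hi",
        "content": "Text",
        "createdAt": "2024-01-01T00:00:00Z",
        "theme": "dark",
    }
    result = convert_record(entry, "org.example.journal.post", FIELD_MAP)
    print(result.record)
    print("Fields that would be lost:", result.dropped)
```

Returning the dropped fields alongside the record lets a service show the user what a conversion would lose before anything is written.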
Not all conversions between similar lexicon will preserve every field, so any conversion that loses data needs to warn the user clearly. Which brings me to the third component.
Services
These public libraries should enable the creation of data import and conversion services. The project itself can have a flagship service on a website, allowing users to submit their takeout file and upload the converted records to their PDS. Or they could choose a collection on their PDS and convert it to another lexicon. Want to move from Whitewind to Leaflet? Go to this website, select that conversion, click go.
Beyond a community-hosted service, the open libraries would allow any Atmosphere developer to integrate import and conversion services into their own apps. What if you could give your Instagram takeout to Flashes and they imported all your posts for you? What if Pckt’s onboarding had an option to convert all your old PiPup posts? What if Blacksky let you import your Twitter archive? What if you want to set up your own online conversion service, separate from the community instance, with different technical features? You might offer a paid service which syncs between your Substack and Leaflet accounts using a cloud service. The tooling is shared infrastructure for the whole community.
What do we need?
I would love feedback on this concept overall. I am not a technical person, so thoughts on how to architect such a collection of tooling would be especially appreciated. After something like this is built, would there be community funding to operate it? Can we fund its development? Can it be built in a way that minimizes costs, such as with client-side logic (thinking like PDSMoover)? Would such a community resource be useful to you as a developer? What user stories would this enable that you wish were easier?