Announcing "AT Scholarly Lexicons": A blueprint for decentralised scholarly communication using the AT Protocol

Hi everyone :waving_hand:

Today, I’m releasing a draft set of AT Scholarly Lexicons, designed as a proof-of-concept, to demonstrate how scholarly communication and academic publishing can be transformed through decentralisation on ATproto.

What is this?

These lexicons are schema definitions that describe how academic content, actors and processes could be structured in a decentralised system, built using the AT Protocol specification.

I have attempted to give examples from across the entire research lifecycle: journals, conferences, manuscripts, peer reviews, editorial decisions, authors, editors, reviewers and more.

Think of them as a blueprint—a technical specification that demonstrates how academic publishing could work with portable identities, immutable provenance, and interoperable standards.

Sample Lexicon: Publishing a Journal Article

{
  "$type": "at.scholarly.manuscript",
  "title": "CRISPR-Cas9 Gene Editing in Human Embryos",
  "abstract": "We demonstrate...",
  "authors": [
    {
      "did": "did:plc:abc123xyz456def789",
      "name": "Dr. Jane Smith",
      "orcid": "0000-0001-2345-6789",
      "affiliation": "Department of Molecular Biology, Stanford University",
      "email": "jane.smith@stanford.edu"
    },
    {
      "did": "did:plc:xyz789def456abc123",
      "name": "Prof. John Doe",
      "orcid": "0000-0002-3456-7890",
      "affiliation": "Institute of Genetics, MIT",
      "email": "john.doe@mit.edu"
    }
  ],
  "publication": "at://did:plc:abc123/at.scholarly.publication/nature",
  "doi": "10.1038/nature12373",
  "submittedDate": "2024-01-15T00:00:00Z",
  "status": "published",
  "manuscriptFile": { /* blob reference */ }
}
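To make the sample concrete, here is a minimal Python sketch of how a client might sanity-check such a record before writing it. The field names come from the sample above. The helper names (`parse_at_uri`, `check_manuscript`), the choice of required fields, and the regular expressions are assumptions for illustration and are not taken from the published lexicons. The ORCID check only tests the shape of the identifier, not its checksum.

```python
import re

# Assumed patterns for illustration only; the real lexicons may constrain these differently.
DID_RE = re.compile(r"^did:[a-z0-9]+:[A-Za-z0-9._:%-]+$")
ORCID_RE = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")
REQUIRED_FIELDS = ("$type", "title", "authors")  # hypothetical choice


def parse_at_uri(uri: str) -> dict:
    """Split an at:// URI of the form at://authority/collection/rkey."""
    if not uri.startswith("at://"):
        raise ValueError(f"not an AT URI: {uri!r}")
    parts = uri[len("at://"):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected authority/collection/rkey: {uri!r}")
    authority, collection, rkey = parts
    return {"authority": authority, "collection": collection, "rkey": rkey}


def check_manuscript(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in record]
    for i, author in enumerate(record.get("authors", [])):
        if "did" in author and not DID_RE.match(author["did"]):
            problems.append(f"authors[{i}].did is not a DID")
        if "orcid" in author and not ORCID_RE.match(author["orcid"]):
            problems.append(f"authors[{i}].orcid has the wrong shape")
    if "publication" in record:
        try:
            parse_at_uri(record["publication"])
        except ValueError as err:
            problems.append(f"publication: {err}")
    return problems


# A trimmed copy of the sample record above.
sample = {
    "$type": "at.scholarly.manuscript",
    "title": "CRISPR-Cas9 Gene Editing in Human Embryos",
    "authors": [
        {"did": "did:plc:abc123xyz456def789", "orcid": "0000-0001-2345-6789"},
        {"did": "did:plc:xyz789def456abc123", "orcid": "0000-0002-3456-7890"},
    ],
    "publication": "at://did:plc:abc123/at.scholarly.publication/nature",
}

print(check_manuscript(sample))
print(parse_at_uri(sample["publication"]))
```

In a real deployment this kind of check would presumably be driven by the lexicon schema itself rather than hand-written rules.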

[Figure: Mapping Lexicon Relationships]

Why I Made It

Academic publishing’s problems are well-known: platform lock-in, proprietary formats, fragmented systems, opaque processes, and top-down implementations of enshittified platforms.

In a recent blog post, I explained how AT Protocol could transform scholarly communication and academic publishing. Big claims need some substantial proof!

These draft lexicons are my attempt to materialise that vision and show concretely how decentralised academic infrastructures might actually work.

They explore practical solutions using AT Protocol’s core features including decentralised identifiers for persistent verifiable researcher identities, content addressing for provenance, and shared schemas for interoperability.
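As a rough illustration of why content addressing helps with provenance, here is a simplified Python sketch. AT Protocol itself identifies record content with CIDs computed over a DAG-CBOR encoding; this sketch swaps in canonical JSON and a plain SHA-256 digest from the standard library, so the digests it prints are not real CIDs. The function name `content_digest` and the example records are made up for illustration.

```python
import hashlib
import json


def content_digest(record: dict) -> str:
    """Hash a canonical serialisation so identical content always yields the same digest."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


original = {"title": "CRISPR-Cas9 Gene Editing in Human Embryos", "abstract": "We demonstrate..."}
reordered = {"abstract": "We demonstrate...", "title": "CRISPR-Cas9 Gene Editing in Human Embryos"}
revised = dict(original, abstract="We demonstrate, with corrections...")

print(content_digest(original) == content_digest(reordered))  # key order does not matter
print(content_digest(original) == content_digest(revised))    # any edit changes the digest
```

Because any change to the content changes its address, a reference that includes the digest pins exactly one version of a manuscript or review.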

Instead of researchers adapting to closed systems and proprietary platforms, open standards should allow scholarly communities to define how they communicate without compromising the integrity of the scientific record.

An Invitation to Collaborate

This is deliberately a proof-of-concept. I’m not claiming these lexicons are perfect or complete or that I fully understand the breadth of the problems or the opportunities. Also, my technical skills are pretty limited.

Effective standards require broad input. Early design decisions matter, and scholarly communication needs governance that serves researchers, institutions, funders, and the public—not just publishers and vendors.

So I’m inviting the community to engage with me and with these lexicons:

  • Fork and adapt them for your own projects

  • Suggest improvements to the data models

  • Build implementations using these schemas

  • Propose additions that I haven’t considered

I’ve published the lexicons on https://tangled.org/@renderg.host/at-scholarly-lexicons under the GNU GPLv3 license, a copyleft license that lets people do almost anything they want with them except distribute closed-source versions.



Thanks for sharing @renderg.host

I hadn’t really considered licensing for lexicons before. They aren’t really code per se, so CC licenses are likely more appropriate.

Because they are data definitions, not code, and the data lives in user repos, we actually want everyone using them, including closed-source programs.

This is obviously a side point, but I think it's an interesting one that I should tag @ngerakines.me and some other lexicon folks on.

The Community Specification (GitHub - CommunitySpecification/Community_Specification: Community Specification 1.0) is something I’ve used for more standards- and protocol-oriented work, and it might be appropriate if you form a consortium around this.


Thanks for the tip @bmann.ca .

I’ll happily change the licenses to make them more permissive. Let me review each option and see which is best.

I would happily take advice from the folks with more experience. My goal is to make them as available as possible for the community.

I also posted this same content in the Lexicon community btw.

btw @bmann.ca I’ll be at Eurosky in 2 weeks. I would be so happy if you could introduce me to anyone in attendance who can collaborate on this.

Thanks


yeah, I wouldn’t worry too much about it yet. I have a long background in licensing and at this point I think it matters very little: governance and support are the levers around shared projects.

You know @tgoerke.bsky.social and there are likely some other folks around. Growing things here, some feeds, and general community bootstrapping is likely the direction to go, as well as getting people out to ATProto for Science in Vancouver in March alongside ATmosphereConf.

I think a lot of the work here is about recruiting others and going to where academic communities already are and providing small product loops we can all learn from.

My intuition is that less formal processes and innovative social graph exploration could be a draw. For Cosmik Network, we’ve talked about creating academic profiles (in contrast to bsky profile records), around which a cloud of apps and data can be run.


Thanks for starting this discussion @renderg.host !

Just wanted to mention some interesting parallel work in the Fediverse to keep in sight for potential interop: Julian Fietkau: "@bonfire@indieweb.social @mayel@sunbeam.city @jon…" - fietkau.social


This is fantastically useful. Thanks @ronentk.me :sign_of_the_horns:


No solutions here tonight but some thoughts on things that I didn’t notice in the lexicons or docs that likely need consideration:

  • accounting for preprint → publication pipeline with clear links between documents; haven’t read the leaflet about versioning yet but could be relevant here
  • similarly, handling corrections
  • accounting for authors without DIDs - pseudo DID? special DID that’s actually a placeholder that can be claimed? leave blank - could it be added later?
  • who owns the record, i.e. whose PDS is it stored on? → don’t recall if there’s actual momentum on shared-data governance
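To make the first and third points a bit more tangible, here is a small Python sketch, not a proposal for the actual lexicons. It assumes two things that are not in the current drafts as far as this thread shows: an author entry whose `did` can be left blank and filled in later, and a hypothetical `supersedes` field that points a published version back at its preprint. The record keys in the example URIs are invented.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Author:
    name: str
    orcid: Optional[str] = None
    did: Optional[str] = None  # blank until the author claims the entry (assumption)


@dataclass(frozen=True)
class Manuscript:
    uri: str
    title: str
    authors: tuple
    status: str
    supersedes: Optional[str] = None  # hypothetical link to the earlier version


def claim_author(ms: Manuscript, orcid: str, did: str) -> Manuscript:
    """Return a new record value where the unclaimed author with this ORCID gets a DID."""
    authors = tuple(
        replace(a, did=did) if a.orcid == orcid and a.did is None else a
        for a in ms.authors
    )
    return replace(ms, authors=authors)


def version_chain(records: dict, uri: str) -> list:
    """Follow supersedes links from the newest version back to the oldest."""
    chain, seen = [], set()
    while uri is not None and uri not in seen:
        seen.add(uri)
        chain.append(uri)
        rec = records.get(uri)
        uri = rec.supersedes if rec else None
    return chain


preprint = Manuscript(
    uri="at://did:plc:abc123/at.scholarly.manuscript/preprint1",
    title="CRISPR-Cas9 Gene Editing in Human Embryos",
    authors=(Author("Dr. Jane Smith", orcid="0000-0001-2345-6789"),),
    status="preprint",
)
published = replace(
    preprint,
    uri="at://did:plc:abc123/at.scholarly.manuscript/published1",
    status="published",
    supersedes=preprint.uri,
)
records = {m.uri: m for m in (preprint, published)}

print(version_chain(records, published.uri))
print(claim_author(preprint, "0000-0001-2345-6789", "did:plc:abc123xyz456def789").authors[0].did)
```

Corrections could plausibly reuse the same kind of link, and the ownership question stays open either way: the sketch says nothing about whose PDS holds each record.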

Looking toward more open reviewing (I can’t recall the site, but there’s a place researchers can kinda leave reviews on papers they’ve read): where does that fit in lexicon vs established peer-review?


(I can’t recall the site, but there’s a place researchers can kinda leave reviews on papers they’ve read)

Maybe you were thinking of alphaxiv?

Either way - we’re working on adding open reviewing with Semble! And yes having different lexicons for different review types seems important.

On a different topic, Bryan posted some thoughts on versioning that seem relevant for science publishing applications as well: Record Versioning - at:// pizza thoughts