Barazo - forum AppView lexicons for topics, replies, and reactions

Hi all :waving_hand: I’m Guido, a longtime Bluesky user. I’ve built some smaller apps for Bluesky/ATproto, but now it’s time for a real challenge… :sweat_smile:

I’ve built communities and implemented community platforms for 20+ years. Looking at community platforms in light of recent trends in data ownership and sovereign software, combined with the AT Protocol, I see an opportunity to build something cool.

So I’m building Barazo, a forum AppView on AT Protocol. Think Discourse/Flarum alternative, but with portable identity and user-owned data at its core. The ease of use of signing in once (like Discord or Slack), but decentralized and under your own control.

The forum should of course be able to stand completely on its own. But if/when it gets a decent amount of traction, you could really start to benefit from the network effect that ATproto offers. For example, you can use cross-forum reputation, benefit from global labels, etc. In theory, you could even start a new forum with content that already exists on the AT Protocol.

I’m still early in development and this is my first time designing lexicons, so I wanted to share what I have before I paint myself into a corner. Reading the Skylights/Bookhive situation was enough to convince me that coordinating early beats discovering incompatibilities later.

Current lexicons

5 records under forum.barazo.*. Full schemas in barazo-lexicons.

forum.barazo.topic.post

Forum threads. Title, markdown body, community DID, category, up to 5 tags, optional self-labels.

{
  "required": ["title", "content", "community", "category", "createdAt"],
  "properties": {
    "title":         { "type": "string", "maxGraphemes": 200 },
    "content":       { "type": "string", "maxLength": 100000, "description": "Markdown body" },
    "contentFormat": { "type": "string", "enum": ["markdown"] },
    "community":     { "type": "string", "format": "did" },
    "category":      { "type": "string", "maxGraphemes": 64 },
    "tags":          { "type": "array", "maxLength": 5, "items": { "maxGraphemes": 30 } },
    "labels":        { "refs": ["com.atproto.label.defs#selfLabels"] },
    "createdAt":     { "type": "string", "format": "datetime" }
  }
}

forum.barazo.topic.reply

Threaded replies. Root and parent are both strong refs – for top-level replies, parent == root.

{
  "required": ["content", "root", "parent", "community", "createdAt"],
  "properties": {
    "content":   { "type": "string", "maxLength": 50000 },
    "root":      { "ref": "com.atproto.repo.strongRef", "description": "Original topic" },
    "parent":    { "ref": "com.atproto.repo.strongRef", "description": "Direct parent (topic or reply)" },
    "community": { "type": "string", "format": "did" },
    "labels":    { "refs": ["com.atproto.label.defs#selfLabels"] },
    "createdAt": { "type": "string", "format": "datetime" }
  }
}
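To make the root/parent rule concrete, here is a minimal Python sketch. The helper names (`strong_ref`, `build_reply`, `is_top_level`) and the example URIs and CIDs are made up for illustration; only the record field names come from the schema above.

```python
from datetime import datetime, timezone


def strong_ref(uri: str, cid: str) -> dict:
    """Shape of com.atproto.repo.strongRef: an AT-URI plus the record's CID."""
    return {"uri": uri, "cid": cid}


def build_reply(content: str, community: str, root: dict, parent: dict = None) -> dict:
    """Build a forum.barazo.topic.reply record (hypothetical helper).

    When no parent is passed, the reply is top-level and parent == root.
    """
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "$type": "forum.barazo.topic.reply",
        "content": content,
        "root": root,
        "parent": parent if parent is not None else root,
        "community": community,
        "createdAt": now.replace("+00:00", "Z"),
    }


def is_top_level(reply: dict) -> bool:
    """A reply is top-level when its parent points at the topic itself."""
    return reply["parent"]["uri"] == reply["root"]["uri"]


# Placeholder identifiers, not real records.
topic = strong_ref("at://did:plc:exampleauthor/forum.barazo.topic.post/topic1", "cid-topic")
first = build_reply("Top-level answer", "did:plc:examplecommunity", topic)
nested = build_reply(
    "Reply to the answer",
    "did:plc:examplecommunity",
    topic,
    parent=strong_ref("at://did:plc:otheruser/forum.barazo.topic.reply/reply1", "cid-reply"),
)
```

Keeping `root` on every reply means an indexer can collect a whole thread with one equality check on `root.uri`, while `parent` is only needed to rebuild the nesting.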

forum.barazo.interaction.reaction

This is the one I’m most unsure about. Reactions are not hardcoded: the type field is a free string, and community admins configure which types are available. One community might only have “like”; another might have “agree”, “disagree”, and “insightful”.

{
  "required": ["subject", "type", "community", "createdAt"],
  "properties": {
    "subject":   { "ref": "com.atproto.repo.strongRef" },
    "type":      { "type": "string", "maxGraphemes": 30, "description": "e.g. 'like', 'heart', 'helpful'" },
    "community": { "type": "string", "format": "did" },
    "createdAt": { "type": "string", "format": "datetime" }
  }
}
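Because `type` is a free string, the AppView has to decide which reaction records to accept for each community. A rough Python sketch of that check follows. The `ALLOWED_REACTIONS` table, the function name, and the community DIDs are assumptions for illustration, not part of Barazo.

```python
# Hypothetical per-community configuration, as an admin might set it.
ALLOWED_REACTIONS = {
    "did:plc:examplecommunity1": {"like"},
    "did:plc:examplecommunity2": {"agree", "disagree", "insightful"},
}


def accept_reaction(record: dict, allowed: dict = ALLOWED_REACTIONS) -> bool:
    """Return True if the AppView should index this reaction record."""
    reaction_type = record.get("type") or ""
    # len() counts code points, which only approximates the schema's
    # maxGraphemes limit; a real check would count grapheme clusters.
    if not reaction_type or len(reaction_type) > 30:
        return False
    # Unknown communities have no allowed types, so everything is rejected.
    return reaction_type in allowed.get(record.get("community"), set())
```

Records that fail the check still exist on the user's PDS; the AppView would simply not index them.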

forum.barazo.actor.preferences

User settings: maturity level, muted words, blocked/muted DIDs, cross-post defaults for Bluesky and Frontpage. Key is literal:self, so one record per user. Full schema here.

forum.barazo.defs

Reserved, currently empty.

A few design decisions worth explaining

Every record carries a community field (a DID). An AppView can use this to aggregate and filter across communities without guessing from context.
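As a small illustration of why the field helps, the Python sketch below buckets a mixed stream of records by community without resolving any parent topic. The function name and the sample DIDs are placeholders.

```python
from collections import defaultdict


def group_by_community(records):
    """Bucket records by their own `community` DID.

    No lookup of the parent topic is needed, because every record
    (topic, reply, reaction) carries the field itself.
    """
    buckets = defaultdict(list)
    for record in records:
        community = record.get("community")
        if community:  # skip records that lack the field
            buckets[community].append(record)
    return dict(buckets)


# Placeholder records standing in for firehose events.
sample = [
    {"$type": "forum.barazo.topic.post", "community": "did:plc:examplecommunity1"},
    {"$type": "forum.barazo.topic.reply", "community": "did:plc:examplecommunity1"},
    {"$type": "forum.barazo.interaction.reaction", "community": "did:plc:examplecommunity2"},
    {"$type": "app.example.unrelated"},
]
groups = group_by_community(sample)
```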

Reputation is computed AppView-side from reaction records. Barazo doesn’t store reputation on the PDS at all, because users control their own repo and could fabricate it. The thinking: different AppViews should be free to compute reputation differently from the same underlying data. A “helpful” reaction in a programming community probably shouldn’t carry the same weight as one in a meme or gaming community (the context-collapse problem).
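A minimal sketch of what per-community weighting could look like, in Python. The weight table, the default weight, and the DIDs are invented for illustration. It also assumes the subject URI's authority is the author's DID, which is the usual case for strong refs but is not guaranteed by the schema.

```python
from collections import defaultdict

# Hypothetical weights keyed by (community DID, reaction type).
WEIGHTS = {
    ("did:plc:programming", "helpful"): 3.0,
    ("did:plc:memes", "helpful"): 0.5,
}
DEFAULT_WEIGHT = 1.0


def author_of(at_uri: str) -> str:
    """Return the authority part of an AT-URI (at://<authority>/<collection>/<rkey>)."""
    return at_uri[len("at://"):].split("/", 1)[0]


def compute_reputation(reactions, weights=WEIGHTS) -> dict:
    """Sum weighted reactions per author of the reacted-to record."""
    scores = defaultdict(float)
    for reaction in reactions:
        author = author_of(reaction["subject"]["uri"])
        key = (reaction["community"], reaction["type"])
        scores[author] += weights.get(key, DEFAULT_WEIGHT)
    return dict(scores)


reactions = [
    {"subject": {"uri": "at://did:plc:alice/forum.barazo.topic.reply/r1"},
     "type": "helpful", "community": "did:plc:programming"},
    {"subject": {"uri": "at://did:plc:alice/forum.barazo.topic.post/t1"},
     "type": "helpful", "community": "did:plc:memes"},
    {"subject": {"uri": "at://did:plc:bob/forum.barazo.topic.post/t2"},
     "type": "like", "community": "did:plc:programming"},
]
scores = compute_reputation(reactions)
```

Another AppView could run the same records through a different weight table and arrive at different scores, which is the point of keeping the computation off the PDS.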

Where I see a challenge: interoperability

Frontpage has fyi.unravel.frontpage.vote: just subject and createdAt. Mine adds type and community.

I keep going back and forth on whether this divergence is a problem. A Frontpage upvote and a forum “helpful” reaction (which could be one of a range of emoticons, as on LinkedIn, or broader) feel like different actions to me. Different intent, different context. Forcing both into one schema loses something. But then I think about the case where you want to say “this user has N positive signals across multiple apps” without having to understand each app’s reaction vocabulary, and a shared base starts looking attractive.
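Without a shared base, a consumer that wants to count “positive signals” needs a mapping per app. The Python sketch below shows roughly what that looks like. The set of Barazo types treated as positive is an assumption, as is treating every Frontpage vote record as an upvote.

```python
# Assumption: which free-string reaction types count as positive.
POSITIVE_BARAZO_TYPES = {"like", "heart", "helpful", "agree", "insightful"}


def is_positive_signal(record: dict) -> bool:
    """Classify a record from either app as a positive signal or not."""
    record_type = record.get("$type")
    if record_type == "fyi.unravel.frontpage.vote":
        # Only subject and createdAt exist, so the record itself is the upvote.
        return True
    if record_type == "forum.barazo.interaction.reaction":
        return record.get("type") in POSITIVE_BARAZO_TYPES
    return False


def count_positive(records) -> int:
    return sum(1 for record in records if is_positive_signal(record))


mixed = [
    {"$type": "fyi.unravel.frontpage.vote"},
    {"$type": "forum.barazo.interaction.reaction", "type": "helpful"},
    {"$type": "forum.barazo.interaction.reaction", "type": "disagree"},
]
```

Every new app or new reaction type means another branch or table entry here, which is the maintenance cost a shared base lexicon would remove.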

I don’t have a strong opinion here. Curious what others think.

Reputation labels (future idea)

I’m already subscribing to Ozone for spam/bot detection in moderation. Further out, I’d want to go the other direction: publish our own labels based on aggregated forum activity. Not content warnings, but positive trust signals: “trusted-contributor”, “domain-expert-rust”, that kind of thing, derived from sustained quality participation across communities.

PMsky is the closest thing I’ve found, with peer-voted labels for account authenticity. But that’s moderation-flavored. I haven’t seen anyone publishing reputation labels through the labeler system yet. If there’s a reason that’s a bad idea, I’d rather hear it now. :sweat_smile:

Questions

  1. Does forum.barazo.interaction.reaction overlap with anything in progress?
  2. Is anyone working on a shared endorsement or quality-signal lexicon?
  3. Should forum reactions aim for compatibility with Frontpage’s vote record, or is the semantic gap large enough that divergence is fine?
  4. Any red flags in the schemas themselves, things I’ll regret later?

Full schemas: https://github.com/singi-labs/barazo-lexicons

Welcome @gui.do, looks like an interesting project.

For 1-3, unless you actively want to steward a shared schema, you should go ahead and experiment with what works for you and your product.

You’ll learn most after you launch, and of course there is an increasing number of other products to learn from.

There are a couple of nascent endorsement / quality things. @ngerakines.me has endorsements in atwork.place; the setup there is reciprocal records between two people. See bmann.ca - at://work for a visual example.

I don’t know that labelers are the right way to do this. I can actually imagine this as records / aggregation, plus the ability for people to deploy their own reputation algorithms (with your product choosing a default).

I’ll let others weigh in on schema design.

Thanks Boris, much appreciated!

Re: “records/aggregation, let people deploy their own reputation algorithms”: yeah I agree this would be ideal. The architecture already works this way (raw interaction records on user PDS, AppView-side computation, per-community weighting), but I had a loose plan to eventually publish computed “trusted-contributor” labels through our own labeler.

Your point makes me rethink that. I’ll export reputation components (badges, memberships, endorsement records) and then give forum admins controls to configure how those signals map to local trust levels. I still need to figure out how admins should configure this in the backend. Fun challenge ;).

If anyone is still following: I’m still looking for feedback on the schemas themselves, particularly the community DID field on interaction records (reaction, vote). The firehose self-containment argument feels solid to me but I haven’t seen it discussed as a general convention anywhere. Is that a pattern others would find useful, or am I overloading the record?

Why max 5 tags? Generally a length limit should be conservatively high to allow for unexpected circumstances; 5 is pretty low.

It’s been pretty common for me to run out of tag space on other platforms. It’s as if the programmers are assuming that no honest person would be considerate enough to apply more than that, or that there couldn’t be enough usefully specific or well-populated tags in the system to necessitate the use of more (which becomes a self-fulfilling prophecy).

But another reading is that tag systems where the author has to set all of the tags or else the tag won’t be added are kinda bad. A really good tag system, imo, would allow people other than the OP, or the result of a crowdvote, to decide which tags get applied. So maybe if I were to implement a forum under this schema I would use the labels list for tagging.

Good point, I bumped it up to 25. The original limit of 5 was conservative for no strong reason. The UI can control how many to display; the lexicon shouldn’t be the gatekeeper :+1:

Community/crowdsourced tagging is architecturally different from author-applied tags though. In AT Protocol, the post record lives in the author’s PDS, so only the author can write to it. Community-applied tags would need a separate layer: either an AppView-managed “tag overlay” (moderators apply tags stored in the forum’s database alongside the post index) or AT Protocol labels attached to the record by a labeler service.

Labels are probably the right path here. A community runs a labeler that tags content based on moderator decisions or community votes, and those labels travel across any AppView that subscribes to that labeler. It’s the AT Protocol-native way to let people other than the OP annotate content.

I’m shipping with author-applied tags for now and can add community tagging as a separate feature. The two aren’t mutually exclusive.
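To show how author tags and a community layer could coexist at read time, here is a rough Python sketch. The overlay structure (a dict keyed by post URI) and the function name are assumptions. The same merge would apply whether the community tags come from the forum's database or from labels.

```python
def effective_tags(post_uri: str, post: dict, overlay: dict) -> list:
    """Merge author-applied tags with community-applied ones.

    `post` is the author's record (read-only for everyone else);
    `overlay` maps post URIs to tags added by moderators or votes.
    Author tags come first, and duplicates are dropped case-insensitively.
    """
    merged, seen = [], set()
    for tag in [*post.get("tags", []), *overlay.get(post_uri, [])]:
        key = tag.lower()
        if key not in seen:
            seen.add(key)
            merged.append(tag)
    return merged


# Placeholder data for illustration.
uri = "at://did:plc:exampleauthor/forum.barazo.topic.post/topic1"
post = {"title": "Container won't start", "tags": ["docker", "help"]}
overlay = {uri: ["Docker", "networking"]}
tags = effective_tags(uri, post, overlay)
```

The author's record is never modified; the merge happens only in the AppView's view of the post.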

I’ve been looking at standard.site and whether their shared publishing lexicons could work for Barazo’s thread starters.

I did a field-by-field compatibility check and our forum.barazo.topic.post schema could be restructured to be standard.site-compatible without losing anything. We’d use site, title, publishedAt, content (open union with our own $type), textContent, tags, etc. We’d create site.standard.publication records for forum communities. Forum-specific fields (community DID, category, moderation labels, threading) stay in our namespace.

But before going down that path, there’s a more fundamental question: if we publish forum posts as standard.site-compatible documents, how do consuming apps like Leaflet and Pckt decide what to show?

Forum threads are a different kind of content than blog posts and essays. “Help, my Docker container won’t start” showing up in someone’s Leaflet reading feed would be annoying for everyone.

So I’m wondering:

  • Do apps filter by publication subscription only, or do they also have global discovery that would pick up all site.standard.document records from the firehose?
  • Is there a mechanism for content-type filtering? (Our content union entry would be forum.barazo.content.markdown, which other apps wouldn’t understand, so they’d fall back to textContent… but would they still display it?)
  • Has anyone thought about this for other non-blog content types?

If there’s no good way to filter, standard.site compatibility might cause more problems than it solves.

LMK your thoughts! :folded_hands:

I’ve been wondering the same thing with regards to documentation pages, which should fit in perfectly alongside blog posts lexically speaking, but don’t necessarily fit into the discovery feeds of the blogging platforms.

I think this could be a viable use case. Indexers are encouraged to respect the showInDiscover property. You could set this property to false and tack on an extra field if needed for your own discovery preferences.

Thx for the pointer to showInDiscover :folded_hands: I think it solves a different problem than what we’re dealing with though.

For Barazo, discovery is one of the reasons to align with standard.site in the first place. If I set showInDiscover: false, I’ve done all the work of aligning schemas and then opted out of the benefit. I want forum content to be findable across the ATmosphere (within the appropriate fields if these exist).

The problem I raised in my earlier post is that forum threads (help requests, bug reports, heated debates) are a different kind of content than blog posts/articles (which seem to be what the content field on site.standard.document is currently mainly used for). Putting forum discussions (the first post, that is) in a general reading feed might be unhelpful, or outright annoying, if that is indeed the intention behind that content field.

So my ask is basically how generic this content field really is.

content field definition from standard.site:
Open union used to define the record’s content. Each entry must specify a $type and may be extended with other lexicons to support additional content formats.

That leaves it open for pretty much everything? :sweat_smile:

If the intention is indeed that we can use it for whatever, what probably needs to happen is some kind of content-type awareness on the consuming side. If an indexer could tell “this is a forum thread” from “this is a blog post” (based on the content union’s $type, or a field on the publication, or some other signal), then discovery works for everyone. Blog apps show blog content, forum apps show forum content.
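As a sketch of what content-type awareness on the consuming side might look like, the Python below filters documents by the `$type` of their content union entry. The blog content type `com.example.blog.richtext` is a made-up placeholder, and the function is not an existing API of any standard.site consumer.

```python
# Hypothetical: the content types a blog-oriented reader knows how to render.
SUPPORTED_CONTENT_TYPES = {"com.example.blog.richtext"}


def should_show_in_feed(document: dict, supported=SUPPORTED_CONTENT_TYPES) -> bool:
    """Show a document only if its content union entry is a supported type."""
    content = document.get("content") or {}
    return content.get("$type") in supported


blog_doc = {
    "title": "Notes on lexicon design",
    "content": {"$type": "com.example.blog.richtext"},
}
forum_doc = {
    "title": "Help, my Docker container won't start",
    "content": {"$type": "forum.barazo.content.markdown"},
    "textContent": "Help, my Docker container won't start",
}
```

With this rule a forum thread is skipped even though `textContent` would let the reader display it. Whether consumers should filter this strictly is exactly the open question.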

I think it makes the most sense for me to restructure forum.barazo.topic.post to align with site.standard.document where possible:

  • Switching content to an open union (useful for future richtext support regardless)
  • Adding a site field pointing to a site.standard.publication record
  • Renaming createdAt to standard.site’s publishedAt
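
A rough Python sketch of the three changes applied to an existing record. The inner `markdown` key of the content entry, the helper name, and the example publication URI are assumptions for illustration; the final shape would follow whatever the published schemas define.

```python
def align_topic(old: dict, publication_uri: str) -> dict:
    """Convert a current topic.post record to the restructured shape."""
    new = {
        "$type": "forum.barazo.topic.post",
        # New: reference to a site.standard.publication record.
        "site": publication_uri,
        "title": old["title"],
        # Changed: plain string becomes an open-union entry with its own $type.
        "content": {
            "$type": "forum.barazo.content.markdown",
            "markdown": old["content"],  # hypothetical inner field name
        },
        # Renamed from createdAt.
        "publishedAt": old["createdAt"],
        # Forum-specific fields stay as they are.
        "community": old["community"],
        "category": old["category"],
    }
    if "tags" in old:
        new["tags"] = list(old["tags"])
    return new


old_record = {
    "title": "Container won't start",
    "content": "My container exits with **code 1**.",
    "contentFormat": "markdown",
    "community": "did:plc:examplecommunity",
    "category": "support",
    "tags": ["docker"],
    "createdAt": "2025-01-01T12:00:00.000Z",
}
aligned = align_topic(old_record, "at://did:plc:examplecommunity/site.standard.publication/self")
```

The separate `contentFormat` field is dropped in this sketch because the union entry's `$type` already carries the format.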

These changes are worth doing on their own. But I should probably hold off on dual-writing (creating an actual site.standard.document per thread) until there’s a way for consuming apps to filter by content type. Otherwise I’d be creating exactly the feed pollution problem I just described.
