Mapping web pages to canonical AT records

Two quick things

  1. Our bad. This incompatibility was an oversight when we put together the at:// url spec. It’s really a frustrating one too, because the DID scheme has provided very marginal value overall. But I digress. We should’ve caught this and I apologize for the trouble it’s causing now.
  2. Whatever the community chooses here is fine to me. There’s a lot of good understanding of the problem and standard site (and the surrounding decisions) has always been community driven.

That said, I do want to ask how pressing this issue is? As has been stated elsewhere, we do expect to address the core issue – likely through the IETF WG. If the pain here is low enough that it could sit for, say, a year, then we could just eat the short-term pain of writing workarounds in some of our tools.

If there’s energy to just move to the meta tags and people are good with it, then go for it. Just curious if it goes beyond a wart that got jammed up on svelte.

4 Likes

I think the <link> validity was more the spark to the conversation for me. The more interesting questions are: can we reach consensus on a shared format that can work for any AT record, how do we make this common practice across the atmosphere, and what are the downstream benefits for the ecosystem? I’m not sure that the way that standard.site is currently doing it, with an NSID in the rel attribute, would be the right approach to generalize.

Even if <link> validity wasn’t a concern I think we would need to be careful not to step on established web practices, even if they’re conceptually similar. <meta> personally feels like the compelling direction atm.

<link rel="canonical" href="at://…" />
<link rel="me"  href="did:plc:…" />

vs.

<meta property="at:uri" content="at://…" />
<meta property="at:did"  content="did:plc:…" />
2 Likes

Is that not what the <link> tag is designed for?

1 Like

Element tagging is out of scope in this thread, but all else being equal, I’d be interested in <meta itemprop> as it both allows non-URI content (at:// is valid) and it’s also allowed in flow content. Meaning using it would resolve our standard site issue:

<head>
  <meta itemprop="renders-at-uri" content="at://did:plc:whoever/site.standard.publication/abc123">
</head>

And leave the door open for discussing element tagging in a separate thread:

<blockquote>
  <meta itemprop="renders-at-uri" content="at://did:plc:vc7f4oafdgxsihk4cry2xpze/app.bsky.feed.post/3l6n7c6cyx42i">
  people will think anything on bluesky. "he's got thousands of followers! he must have good posts." no they aren't. that isn't true.
</blockquote>

I really don’t want to restart the closed conversation about element tagging, but I wanted to highlight that—unlike <link rel="canonical"> tags (see here)—<meta itemprop> specifically is allowed in flow content and can hold any text we want (avoiding the URI issue).

(I picked renders-at-uri as I think it’s semantically descriptive of what the link is doing—rather than eg. at:uri which, like the at-uri itself, is only descriptive of what it is—but I have no attachment to it specifically)

1 Like

Throwing in my strong support for stronger semantics in this representation, since it’s being rediscussed (I had meant to bring it up with the original design too, so it makes me happy to see :P)

The original issue being that:

<link rel="site.standard.document" href="at://did:plc:xyz789/site.standard.document/rkey">

is duplicating the same information twice, and tells me nothing(ish) about what that content actually is.

It also forces standardization on site.standard.publication, even if a different lexicon were to come along or I wanted to use mine.

Compare to (syntax slightly apocryphal):

<meta rel="alternate" content="at://did:plc:xyz789/my.own.lexicon/rkey" />
<meta rel="alternate" content="at://did:plc:xyz789/site.standard.document/rkey" />
<meta rel="author" content="at://did:plc:xyz789" />

which tells me both records are an alternate representation of this page, and who the author is.

Proposal

For an actual proposal, I think this could work by literally taking a page out of OpenGraph:

<meta property="at:author" content="at://did:plc:xyz789" />
<meta property="at:alternate" content="at://did:plc:xyz789/my.own.lexicon/rkey" />
<meta property="at:alternate" content="at://did:plc:xyz789/site.standard.document/rkey" />

With this syntax, the part after at: in property encodes the semantic meaning of the link, and not its lexicon or data type.
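
To make the consumer side concrete, here is a minimal sketch of collecting these tags with Python's standard-library HTML parser. The names `AtMetaParser` and `parse_at_properties` are made up for illustration, and the choice to keep every value in a list (so a repeated `at:alternate` is preserved) is an assumption borrowed from how OpenGraph treats repeated properties, not something settled in this thread.

```python
from collections import defaultdict
from html.parser import HTMLParser


class AtMetaParser(HTMLParser):
    """Collects <meta property="at:..." content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        # property name -> list of content values, in document order
        self.properties = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags also end up here via handle_startendtag.
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        content = attributes.get("content")
        if prop.startswith("at:") and content:
            self.properties[prop].append(content)


def parse_at_properties(html):
    """Return {property: [content, ...]} for every at:* meta tag in html."""
    parser = AtMetaParser()
    parser.feed(html)
    return dict(parser.properties)


page = """
<head>
  <meta property="og:title" content="Not an AT property" />
  <meta property="at:author" content="at://did:plc:xyz789" />
  <meta property="at:alternate" content="at://did:plc:xyz789/my.own.lexicon/rkey" />
  <meta property="at:alternate" content="at://did:plc:xyz789/site.standard.document/rkey" />
</head>
"""
found = parse_at_properties(page)
for name, values in found.items():
    print(name, values)
```

Note that nothing here needs to know about any particular lexicon: the parser only looks at the property name and hands back whatever AT URIs it finds.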

Encoding URI type

I’m not sure why it seems people want to encode at:did vs at:uri as part of the property itself. There may be a reason I’m not seeing, but it seems fairly easy to tell at parse time whether you’ve been handed a bare did or a full uri.
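
As a rough illustration of telling the two apart at parse time, a sketch could look like the following. `classify_at_value` is a hypothetical helper, and it assumes the authority is always a DID (handles in the authority position are deliberately not handled here).

```python
def classify_at_value(value):
    """Classify a content value as "did", "collection", or "record".

    Accepts a bare DID ("did:plc:...") or an at:// URI whose authority is a
    DID. Anything else raises ValueError. Handle authorities are out of
    scope for this sketch.
    """
    if value.startswith("at://"):
        parts = value[len("at://"):].split("/")
    elif "/" not in value:
        parts = [value]  # bare DID, no path allowed
    else:
        raise ValueError(f"path segments require the at:// prefix: {value!r}")

    if not parts[0].startswith("did:") or any(p == "" for p in parts):
        raise ValueError(f"not a DID or DID-based AT URI: {value!r}")

    kinds = {1: "did", 2: "collection", 3: "record"}
    if len(parts) not in kinds:
        raise ValueError(f"too many path segments: {value!r}")
    return kinds[len(parts)]


for example in (
    "did:plc:xyz789",
    "at://did:plc:xyz789",
    "at://did:plc:xyz789/site.standard.document/rkey",
):
    print(example, "->", classify_at_value(example))
```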

The one case I can think of where that’d be convenient is styling with CSS: selectors cannot distinguish an AtUri and a Did, and thanks to :has() you can now style body elements according to meta tags in the head. It’s very cursed, but you can.

Anyway, if we do want to carry that distinction explicitly, it could be its own attribute without getting baked into the property name:

<meta data-at-type="did" property="at:author" content="at://did:plc:xyz789" />

Extending the data types

The above keeps the property slot meaning what the record represents, the way og:title / og:description / article:author do, rather than encoding the value’s shape.

This opens up a whole lexicon-agnostic layer of standardization on ATproto. Other things that could be standardized with the same system:

  • at:event => the event the page represents ([edit:] an event mentioned on the page, see later reply)
  • at:replies-to => what this blogpost is written in reply to (e.g. another standard.site document)
  • at:tag => A list of tags (represented via AtUris, not as strings, which will happen as soon as fanfiction people start building tagging systems)
  • at:copyright => Who owns the copyright of this content (their did)

There’s a lot of vocabularies one could take inspiration from, like IANA link relations, schema.org, dcterms… people building will make up their own regardless, but we could settle on an obvious few to carry over (in a different thread).

Most important: with this, my application doesn’t need to hard-code every possible at:alternate lexicon just to parse a page and check whether it carries a given property. This is unlike site.standard.document, where I’d have to know the lexicon up front. It also still hands me more information than a bare at:did or at:uri would, and allows for the same AtUri to appear more than once carrying different semantic meanings.

4 Likes

Strong support towards at:alternate and at:author. Small notes:

  1. at:event seems like a slightly redundant specialization of at:alternate; both links carry the same base semantics (there is a record that corresponds to this page) and at:event would simply serve as a shortcut for consumers that only care about events, regardless of the actual collection being used. I think this shortcut, along with ones for other broad types of records, should be kept but placed in the at:alternate namespace to make this specialization explicit:

    <meta property="at:alternate:event" content="at://did:plc:…/community.lexicon.calendar.event/…" />
    <meta property="at:alternate:profile" content="at://did:plc:…/app.bsky.actor.profile/self" />
    <meta property="at:alternate:track" content="at://did:plc:…/fm.plyr.track/…" />
    

    Generic consumers can process each link as if they were at:alternate links without a specific type, and consumers focused on a specific app modality can skip past links that aren’t useful to them.

  2. I’d use a similar kind of specialization for at:replies-to, using the generic namespace at:ref:

    <meta property="at:ref:bookmark_of" content="at://did:plc:…/site.standard.document/…" />
    <meta property="at:ref:in_reply_to" content="at://did:plc:…/app.bsky.feed.post/…" />
    <meta property="at:ref:review_of" content="at://did:plc:…/sh.tangled.repo.pull/…" />
    <meta property="at:ref:rsvp_to" content="at://did:plc:…/community.lexicon.calendar.event/…" />
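
A sketch of how both kinds of consumer could treat these namespaced properties, assuming the tags have already been extracted into (property, content) pairs. The helper names and the placeholder DIDs and rkeys are hypothetical; the only rule implemented is "a property matches a base if it equals the base or extends it with a colon".

```python
def is_specialization(prop, base):
    """True if prop is base itself or a narrower form such as base + ":event"."""
    return prop == base or prop.startswith(base + ":")


def select(pairs, base):
    """Return the (property, content) pairs that fall under base."""
    return [(prop, content) for prop, content in pairs if is_specialization(prop, base)]


# Placeholder values for illustration only.
pairs = [
    ("at:alternate", "at://did:plc:example/site.standard.document/rkey1"),
    ("at:alternate:event", "at://did:plc:example/community.lexicon.calendar.event/rkey2"),
    ("at:ref:in_reply_to", "at://did:plc:example/app.bsky.feed.post/rkey3"),
    ("at:alternates", "at://did:plc:example/some.other.thing/rkey4"),  # not a match
]

# A generic consumer takes every alternate, typed or not.
generic = select(pairs, "at:alternate")

# An events-only consumer skips everything except the event shortcut.
events_only = select(pairs, "at:alternate:event")

print(len(generic), len(events_only))
```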
    
1 Like

To clarify at:alternate vs at:event (I did not do a good job in the original post) the difference to me would be:

  • at:alternate with event record ⇒ the page represents the event itself
  • at:event ⇒ the event is mentioned in the page, but is not the subject of the page (think a blogpost that’s promoting an event or was made as part of it).

That said, I would (personally) leave honing in on the full semantic to a side thread once people agree on the general direction. That seems an even longer discussion to have :stuck_out_tongue:

3 Likes

I’ve been lurking but I just want to chime in and put support behind the at:alternate OpenGraph-style proposal.

Now that @essentialrandom.bsky.social points it out, it seems obvious that the property attribute has been mistakenly discussed as a shape-defining field rather than a semantic-defining field.

I think this is important to highlight, reiterating the semantic vs shape difference and rebutting the proposed at:{nsid}-type format, that here the general principle is that the NSID of whatever lexicon is involved should not be in the property but in the content.

Tangentially, how much would break if you ripped off the bandaid and got rid of the //?

thanks for articulating this @essentialrandom.bsky.social, @mfzx.net, and @uncenter.dev, I’m on board with this direction! Some questions on my mind are:

  • What is the smallest set of semantic properties for the ecosystem to reach consensus on?
    • I propose we start with three properties. Similar to Open Graph, I think they should follow array semantics.
<!-- The DID of the web page's author -->
<meta property="at:author" content="at://<did>" />

<!-- The canonical AT URI of the web page -->
<meta property="at:uri" content="at://<did>/<nsid>/<rkey>" />

<!-- An open set of non-standard relationships of the web page. -->
<meta property="at:rel:{...}" content="..." />
  • How do we make the language extensible without harming future naming?
    • Namespacing all other relationships lets us keep the consensus small to start with, and allows more non-standard/domain-specific relationships to naturally emerge out of the community. If a new ecosystem-wide property arises, governance over this standard should be able to easily amend it. For example, standard.site defining the publication of the web page.
<meta property="at:rel:publication" content="at://did:plc:abc123/site.standard.publication/rkey" />
  • What does ecosystem governance of this look like?
    This is TBD, but maybe we can follow existing governance structures of standard.site or lexicon.community?

  • Are any of those properties required? (like Open Graph requires og:title, og:type, og:image, og:url)

    • I don’t think so. In some cases there might only be an author, in other cases there might only be a canonical URI.
1 Like

at:uri and at:rel would be the smallest set I’m comfortable with. at:author may be redundant if the page also includes a canonical at:uri link, and I’m not sure how a page that doesn’t include a canonical at:uri link would be making any useful claims. In my understanding, linking to a record with at:uri implies the page was authored by the referenced DID, a claim that can be verified by fetching the record and checking for lexicon-specific fields that link in the reverse direction; just linking to the author doesn’t allow for verification of the claim of authorship in the same way. Not entirely opposed to at:author, just wondering when it’s intended to be used, and what you make of the overlap between it and at:uri.

Personally, I think at:uri does not convey enough meaning. Since it just sounds like the broad concept of an AtUri, people are likely to assume it means any generic AtUri, rather than look up whether it has stronger semantic.

I’d prefer at:canonical and at:alternate, modeled after the rel attribute:

  • rel="canonical" => Preferred URL for the current document.
    at:canonical => preferred AtUri for this page.
  • rel="alternate" => Alternate representations of the current document.
    at:alternate => Other AtUris this page may represent.

I’m ambivalent about prefixing things with at:rel:, but I’m not strongly opposed. I think some allowance for namespacing is good in general. For example, I could see myself prototyping with e.g. at:com.fujocoded:my-cool-idea…even if it is a bit of a mouthful.

The semantic could be:

  • at:[name] => something there’s some agreement on, honor system
  • at:[namespace]:[name] => experimentation, or something based on an existing system (e.g. rel or org.schema).
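
For tooling, the two forms can be separated with a plain split on the colon. This is only a sketch: `split_property` is a made-up name, and treating anything with more than two segments after at: as invalid is an assumption, since the thread has not decided how deep namespaces may nest.

```python
def split_property(prop):
    """Split an at:* property into (namespace, name).

    "at:canonical"                  -> (None, "canonical")
    "at:com.fujocoded:my-cool-idea" -> ("com.fujocoded", "my-cool-idea")
    Returns None for anything that is not one of those two shapes.
    """
    parts = prop.split(":")
    if parts[0] != "at" or any(p == "" for p in parts):
        return None
    if len(parts) == 2:
        return (None, parts[1])
    if len(parts) == 3:
        return (parts[1], parts[2])
    return None


for prop in ("at:canonical", "at:com.fujocoded:my-cool-idea", "og:title"):
    print(prop, "->", split_property(prop))
```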

Agreed.

Hard to tell, but that’s also where we can look at the current, concrete cases without trying to solve further out than what we can see now. Eventually, the collection in the AtUri does provide further meaning, so you can do a lot even if we start with a simple at:alternate.

at:uri is not enough to imply at:author: presence of the entry in a PDS does not mean the authorship is with the PDS’s owner. This is going to quickly become common with communities, but even now publications can have different authors that publish the content in the publication’s PDS. Sometimes there’s even multiple authors.

Getting concrete

Since we have some examples of what caused issues, I would start with a couple at:[name]s that match those needs.

On standard.site I see these two:

<link
  rel="site.standard.publication"
  href="at://did:plc:abc123/site.standard.publication/rkey"
/>
<link
  rel="site.standard.document"
  href="at://did:plc:xyz789/site.standard.document/rkey"
/>

And I also have a concrete need for a way to represent events, and the people associated with a page.

Which means my concrete choices would be at:canonical, at:alternate, at:author, at:me.

This would allow for:

Publication main page

<meta property="at:canonical" content="at://did:plc:abc123/site.standard.publication/rkey" />

Article page

<meta property="at:canonical" content="at://did:plc:xyz789/site.standard.document/rkey" />
<meta property="at:alternate" content="at://did:plc:abc123/site.standard.publication/rkey" />
<meta property="at:author" content="at://did:plc:author" />

Event page

<meta property="at:canonical" content="at://did:plc:xyz789/community.lexicon.calendar.event/rkey" />

Personal site

<meta property="at:me" content="at://did:plc:my-did" />

or even

<meta property="at:me" content="at://did:plc:my-did/com.atprotofans.profile/self" />

which also indicates my favorite profile to display me as.

Extra notes

As a last thing, we could encourage people to do:

<meta property="..." data-at-type="record" data-at-collection="site.standard.publication" content="at://did:plc:abc123/site.standard.publication/rkey" />

That way, people who asked for this info to be surfaced more easily have a way to avoid parsing the whole thing. However, we should still treat “content” as the ultimate source of truth, and these attributes as a convenience.
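
One way tooling could honor that rule is to always derive the collection from content and only use the data attribute as a cross-check. A sketch, assuming the tag's attributes are already available as a dict and that `read_collection` is a hypothetical helper:

```python
def read_collection(attrs):
    """Return (collection, hint_agrees) for one meta tag's attributes.

    The collection is always parsed from "content"; the optional
    data-at-collection hint is never trusted, only compared against it.
    """
    content = attrs.get("content", "")
    collection = None
    if content.startswith("at://"):
        parts = content[len("at://"):].split("/")
        if len(parts) >= 2 and parts[1]:
            collection = parts[1]
    hint = attrs.get("data-at-collection")
    return collection, hint is None or hint == collection


consistent = {
    "data-at-type": "record",
    "data-at-collection": "site.standard.publication",
    "content": "at://did:plc:abc123/site.standard.publication/rkey",
}
stale_hint = dict(consistent, **{"data-at-collection": "site.standard.document"})

print(read_collection(consistent))
print(read_collection(stale_hint))
```

A consumer that sees a disagreeing hint can log it or ignore the tag; either way the value it acts on comes from content.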

In the meantime, to give people space to experiment, at:[namespace]:[name] should also be considered valid by any tooling built around this.

3 Likes

@chrisshank.com governance is a place where these things are documented.

Moving it into a git repo is probably a good move as iterations slow / stop.

Maybe Ms Boba’s post here is pretty close to final?

And realistically — adoption. Does this solve Standard Site’s problem? Will that change be implemented quickly? What other apps (including a PR to the Bluesky app) might adopt this now?

Is this helpful for Sill or Semble?

And then we see what happens and what other needs come up.

2 Likes

Thanks for sharing your thoughts @essentialrandom.bsky.social! I really like the core properties being at:canonical, at:alternate, at:author, at:me, and everything else being a custom namespace.

  • Wondering if at:author should have the same profile semantics as at:me? Or maybe we should hold off on linking to a profile record until the community has more consensus on profile lexicons?
  • I do worry that at:alternate is a little too abstract. Based on how you described it, I’m sort of thinking about it as “AT URIs that contextualize the canonical AT record”. That results in an implicit relationship between the lexicons of at:canonical and at:alternate. For example, if I’m making a standard.site reader, I would first find the at:canonical; if its NSID is site.standard.document, I would then try to find the publication by looking up the at:alternate with a site.standard.publication NSID. As opposed to just looking up the at:standard.site:publication property (or whatever it ends up being called).
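
To show what that two-step lookup looks like for a reader, here is a small sketch. It assumes the page's tags have already been parsed into a dict of property name to list of values; `collection_of` and `find_publication` are hypothetical names, and the example values mirror the article-page example from earlier in the thread.

```python
def collection_of(uri):
    """Return the collection NSID of an at:// URI, or None if it has none."""
    if not uri.startswith("at://"):
        return None
    parts = uri[len("at://"):].split("/")
    return parts[1] if len(parts) >= 2 and parts[1] else None


def find_publication(props):
    """Implicit lookup: canonical must be a document, then scan the alternates."""
    canonical = props.get("at:canonical", [])
    if not canonical or collection_of(canonical[0]) != "site.standard.document":
        return None
    for uri in props.get("at:alternate", []):
        if collection_of(uri) == "site.standard.publication":
            return uri
    return None


article_page = {
    "at:canonical": ["at://did:plc:xyz789/site.standard.document/rkey"],
    "at:alternate": ["at://did:plc:abc123/site.standard.publication/rkey"],
    "at:author": ["at://did:plc:author"],
}
publication_uri = find_publication(article_page)
print(publication_uri)
```

With a dedicated property, the same reader would be a single dictionary lookup, which is the trade-off being raised here.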

I haven’t done a ton of research, but I like the direction of @essentialrandom.bsky.social’s proposal here.

at:canonical and at:alternate make sense to me. Separate at:author (meaning the author of the content in this specific page/URL) and at:me (meaning an account associated with the overall website or section of website) make sense to me. The at:me mechanism somewhat overlaps with the handle resolution mechanism when it comes to domain names, but folks often have multiple or different domains associated with an overall online persona. E.g., you might have username.gitforge.io as a technical blog and want to associate that with @userna.me as a shorter social handle.

2 Likes

just a quick footnote here that there is a whole schwackadoo of prior art here going back like two decades, eg the open graph schema looks like this; dublin core (which looks like this) is even older, and of course there is schema dot org. link relations in general are a wheel that does not need reinventing.

If you insist, though, the polite thing to do would be to yoke the terms to other schemas via owl:equivalentProperty and rdfs:subPropertyOf.

As far as I know, all prior work on link relations assumes links are well-formed URIs, which is incorrect for AT-URIs as explained previously. It would be nice to be able to naively use <link rel="canonical" href="at://did:plc:…/site.standard.document/…" /> and so forth, but defining our own mechanisms for link relations allows us to define ATP-specific semantics that prior work has no concept of.

For at:author and at:me, will the recommended syntax be a bare DID or an AT-URI with no path? My personal preference would be a bare DID. Should both forms be allowed, or should it be strictly one or the other?

link relations with a prefix are just a prefix mapping.

<link prefix="at: https://authority.lol/ns/atproto#" rel="at:potato" href="did:plc:hlguahglaugahgl"/>

if there’s a problem with the reference not being a valid URI then I suppose you could do

<meta property="at:potato" datatype="at:AtProtoNotAURIWoopsie" content="at://whatever.lol"/>

(I mean, I would just rip the bandaid off and change that before it gets really, reeeeallly baked in there)