Thinking globally, acting locally, and AT Protocol.
What Is Localization?
Localization is the process of and technology for providing content in a locally relevant language.
Localization may be one of the most important sets of standards on the internet.
In the early days of the web, localization was pretty problematic. Most sites were simply assumed to be in English, and this gave English a structural advantage in computing (at a cost to the rest of the world).
This advantage lasted from 1991 to 1999, and those eight years have perpetuated a fair amount of structural inequality in the world.
In 1999, RFC 2616 was standardized, and this provided a way to canonically describe languages and negotiate with a server to get the right language.
The Accept-Language header helps negotiate the language.
The Content-Language header tells the client which language the returned content is in.
Accept-Language allows a comma-delimited list of acceptable languages, each optionally followed by a semicolon and a desired quality value:
Accept-Language: da, en-gb;q=0.8, en;q=0.7
Per the RFC's example, this means I prefer Danish, but will also accept British English and then any other English. Danish gets the default quality of 1, British English is weighted at 0.8, and any other English at 0.7.
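To make the weighting concrete, here is a minimal sketch of how a server might read that header. The function name parseAcceptLanguage is my own invention for illustration, and the parsing is deliberately lenient rather than a strict implementation of the RFC grammar.

```javascript
// Hypothetical helper: parse an Accept-Language value into entries
// sorted by preference. Entries are comma-separated; an optional
// ";q=" parameter carries the weight.
function parseAcceptLanguage(header) {
  return header
    .split(",")
    .map((part, index) => {
      const [tag, ...params] = part.trim().split(";");
      // A missing q parameter means the default weight of 1.
      let q = 1;
      for (const param of params) {
        const [name, value] = param.trim().split("=");
        if (name.trim().toLowerCase() === "q") {
          const parsed = Number.parseFloat(value);
          if (!Number.isNaN(parsed)) q = parsed;
        }
      }
      return { tag: tag.trim(), q, index };
    })
    // Drop empty entries and anything marked q=0 (not acceptable).
    .filter((entry) => entry.tag !== "" && entry.q > 0)
    // Highest weight first; ties keep the order they had in the header.
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map(({ tag, q }) => ({ tag, q }));
}

console.log(parseAcceptLanguage("da, en-gb;q=0.8, en;q=0.7"));
```

For the header above, this yields Danish first (weight 1), then en-gb (0.8), then en (0.7).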
All of this has been standard for a very long time.
Unfortunately, at present, AT Protocol does not generally provide a locale in its lexicons.
The few lexicons that currently provide a locale allow only a slim portion of this standard:
They allow a two-letter language code.
This is quite a bit less than ideal.
Why?
Because dialects vary wildly.
For example, there are 48 French dialects, 28 Arabic dialects, and 9 Chinese dialects. A native speaker of Mandarin may not be fluent in Cantonese.
Locales also provide key information about local relevancy. For example, if I am looking for news content in en-US, I'm likely to care more about that than about content in en-GB.
In short, **locale really matters to billions of people.**
Unfortunately, AT Protocol does not currently support those billions of people well.
What Can We Do?
app.bsky.feed.post
The two-letter languages field in an app.bsky.feed.post is not sufficient to solve many of these problems. IMO, it could be, just by expanding the list of acceptable options: allow any valid locale string. This is supported cleanly by JavaScript, .NET, and most programming languages worth their salt.
Ideally, it would be named "locale", not languages, but that ship has likely sailed.
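As a rough illustration of that JavaScript support, the built-in Intl.Locale class can check and break apart a full locale string. The helper name describeLocale is hypothetical. Note that Intl.Locale only checks that a tag is well formed, not that it names a real language or region.

```javascript
// Hypothetical helper: return the parts of a locale tag,
// or null when the tag is not well formed.
function describeLocale(tag) {
  try {
    const locale = new Intl.Locale(tag);
    return {
      canonical: locale.toString(), // e.g. "en-gb" becomes "en-GB"
      language: locale.language,
      script: locale.script ?? null,
      region: locale.region ?? null,
    };
  } catch (error) {
    // Intl.Locale throws a RangeError for malformed tags.
    if (error instanceof RangeError) return null;
    throw error;
  }
}

console.log(describeLocale("en-gb"));
console.log(describeLocale("zh-Hant-HK"));
console.log(describeLocale("not a locale"));
```

A two-letter code such as "da" passes through the same helper unchanged, so accepting full locale strings would not break posts that only carry a language.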
newer lexicons
Newer lexicons can easily fix this by providing a standard field.
Again, I believe this should probably be called Locale, but Language(s) is acceptable.
What's important is that this accepts the full list of language formats.
The list of language codes is available on Wikipedia: List of ISO 639 language codes.
You can also get a local copy of the list of cultures in PowerShell by using the .NET CultureInfo class:
[CultureInfo]::GetCultures('allcultures')
app views
IMO, app views should honor the Accept-Language request header and send a Content-Language response header that matches the content currently being viewed.
In the browser, you can check the user's preferred languages with the navigator.languages property (or navigator.language for the top preference) in JavaScript.
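Here is one way an app view could pick the language to serve, sketched under some assumptions: negotiateLanguage is a made-up name, the matching is a simplified take on the lookup scheme described in RFC 4647 (drop trailing subtags until something matches), and the list of available locales is whatever the app view actually has content for.

```javascript
// Hypothetical helper: choose which available locale to serve for a
// given Accept-Language header, falling back when nothing matches.
function negotiateLanguage(acceptLanguage, available, fallback) {
  const ranges = acceptLanguage
    .split(",")
    .map((part, index) => {
      const [range, ...params] = part.trim().split(";");
      const qParam = params
        .map((p) => p.trim())
        .find((p) => p.toLowerCase().startsWith("q="));
      const q = qParam === undefined ? 1 : Number.parseFloat(qParam.slice(2));
      return { range: range.trim().toLowerCase(), q: Number.isNaN(q) ? 1 : q, index };
    })
    .filter((entry) => entry.range !== "" && entry.q > 0)
    .sort((a, b) => b.q - a.q || a.index - b.index);

  // Compare case-insensitively, but return the tag as the app view spells it.
  const byLowerCase = new Map(available.map((tag) => [tag.toLowerCase(), tag]));

  for (const { range } of ranges) {
    if (range === "*") return fallback;
    // Try "en-gb", then "en": drop the last subtag until something matches.
    let candidate = range;
    while (candidate !== "") {
      if (byLowerCase.has(candidate)) return byLowerCase.get(candidate);
      const cut = candidate.lastIndexOf("-");
      candidate = cut === -1 ? "" : candidate.slice(0, cut);
    }
  }
  return fallback;
}

const chosen = negotiateLanguage("da, en-gb;q=0.8, en;q=0.7", ["en-US", "en-GB", "fr"], "en-US");
console.log(`Content-Language: ${chosen}`);
```

One limitation of this simplification: a plain "en" request will not match an app view that only has "en-US", so a real implementation would likely want a more forgiving matching step.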
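A minimal sketch of that check follows. The helper name preferredLanguages is hypothetical, and it takes the navigator object as a parameter only so it can also run outside a browser, where navigator may not exist.

```javascript
// Hypothetical helper: list the user's preferred languages, most
// preferred first, from a navigator-like object.
function preferredLanguages(nav = globalThis.navigator) {
  // navigator.languages is the full ordered list of preferences.
  if (nav && Array.isArray(nav.languages) && nav.languages.length > 0) {
    return [...nav.languages];
  }
  // navigator.language is only the single top preference.
  if (nav && typeof nav.language === "string" && nav.language !== "") {
    return [nav.language];
  }
  return [];
}

console.log(preferredLanguages());
```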
Why We Should Care
Overall, AT Protocol is trying to build a brighter, better tomorrow for the internet.
The first time the internet got this wrong, it slowed the rate of internet adoption across the world, and hurt billions of people by making them wait for prosperity to reach them.
This structural inequality still shapes the mix of programmers and internet users to this day.
In my opinion, if we want to build a brighter tomorrow atop AT Protocol, it is critical that we get this as right as we can, as soon as we can.
The sooner we do this right, the sooner the rest of the world can enjoy a bright blue sky!
Please do your part to make a better web for everyone (not just everyone who speaks English).