The Missing Bit

Notes about localization

Living in a multilingual country, nearly all apps I make are localized.

At least in French and English, and often in German too. Dealing with three languages in every app wasn’t always easy.

In this article I share some experience and thoughts about localization.

Phrase length

It’s no secret that different languages need different amounts of space to display the “same” sentence.

For example, take the English phrase “Friendly Rabbits”. In French it could be “Les lapins sont amicaux”, in German “Kaninchen sind freundlich”, and in Japanese “優しい兎”. Those are respectively 16, 23, 25 and 4 characters long. (Note: I know that the French and German translations have a slightly different meaning, but that doesn’t change my point.)

I won’t linger over Asian writing systems, as I only have limited Japanese skills and there are plenty of writing systems in that part of the world. Still, a Japanese translator could just as well write “やさしいうさぎ”, for example if the text is aimed at children or to render a certain style. Asian characters might also require a different font size.

Our initial phrase is nearly 60% longer in German than it is in English. In many UIs this can be a problem, which is why it is important to ensure the following points:

It’s also important to give the translators as much information about the phrase as possible, so they can keep the translated phrase close in length to the original, rewriting it if needed. For example, the French version above could be written “Gentils Lapins”. It has a slightly different meaning (literally “kind rabbits”), but it would work, and it might even be better depending on the context. It’s 14 characters long, very close to the 16 of the original. This leads us to the next section.
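The length comparison above can be checked with a few lines of Ruby. This is only a rough sketch: the `PHRASES` table and the `expansion` helper are names made up for this illustration, and a character count is only a proxy for the width a phrase takes on screen, which also depends on the font and the script.

```ruby
# Example phrases from this article, keyed by language.
PHRASES = {
  en: "Friendly Rabbits",
  fr: "Les lapins sont amicaux",
  de: "Kaninchen sind freundlich",
  ja: "優しい兎"
}.freeze

# Hypothetical helper: length change of each phrase relative to the
# base language, in percent (rounded to the nearest integer).
def expansion(phrases, base: :en)
  base_length = phrases.fetch(base).length
  phrases.transform_values do |text|
    ((text.length - base_length) * 100.0 / base_length).round
  end
end

expansion(PHRASES).each do |locale, percent|
  puts format("%s: %2d chars (%+d%%)", locale, PHRASES[locale].length, percent)
end
```

Running a check like this over a whole translation file is a cheap way to spot the phrases most likely to overflow their UI element before a designer or tester does.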

Translation context

Translation is not about taking some text and rewriting it with words from another language. Above I already mentioned the alternative “Gentils lapins” in place of “Les lapins sont amicaux”, which literally means “The rabbits are friendly” and might not be the best translation here. “Gentils lapins” might be better, but this depends highly on the context. There is also one detail left to address: why does the original English phrase use a capital “R” for “Rabbits”?

The translation context is at minimum a location, like “main title of the chat panel”, but it should be more. Usage notes, comments, screenshots: all of these provide important information for translating the phrase.

Now imagine the phrase comes with proper context, like “This is the main title of the chat panel on the side. We use the British word rabbit for conversation, as that panel gives us a peek at the latest conversations.” With this single comment your world crumbles, as you thought we were speaking about the animal (I did too, but I thought it would be an amusing twist). The capital “R” makes more sense too: it’s a sidebar title, and the UI designer thought a capital “R” would look more balanced. That particular detail is easy to handle with the CSS declaration text-transform: capitalize, but the result might look weird in some languages.

Don’t use keys

Systems like Rails i18n require the developer to come up with translation keys.

In your code, you write:

link_to(t('views.login.confirm'), login_path)

Coming up with a key has the following problem:

A better solution is to use something like gettext, which lets you write your code in a more natural fashion:

link_to(_('Login'), login_path)

You then run a script (which one depends on the platform you use) that extracts the phrases into a translation file.

The gettext version is easier to write, provides a default value and is easier to read.
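To make the “default value” point concrete, here is a minimal sketch of the idea in plain Ruby. It is not the real gettext library: the `CATALOG` hash and this `_` method are hypothetical stand-ins, and an actual setup would load translations from the extracted translation files.

```ruby
# Hypothetical in-memory catalog: locale => { source phrase => translation }.
CATALOG = {
  fr: { "Login" => "Connexion" },
  de: { "Login" => "Anmelden" }
}.freeze

# Hypothetical stand-in for gettext's `_`: the source phrase itself is
# the lookup key, and it is returned unchanged when no translation exists.
def _(phrase, locale: :en)
  CATALOG.dig(locale, phrase) || phrase
end

puts _("Login", locale: :fr) # Connexion
puts _("Login", locale: :it) # Login (no Italian catalog, so the source phrase is shown)
```

With a key-based system the second call would have nothing readable to fall back on, only the key itself.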

Do format or do not, but there is no try

Number formatting is tricky. For example, “4000” might be written “4,000” in English, “4’000” in Swiss French, “4 000” in French from France, “4.000” in Italian and “4千” in Japanese. It can also depend on what the number represents: a money amount, for example, can be written differently (“4’000.-” in Switzerland).

My advice is simple: either format your numbers perfectly and have this particular part triple-checked by QA and reviewers, or, simpler, don’t format numbers at all. Just write “4000” and be done with it. To a French reader, “4,000” means [4, "decimal separator", 0, 0, 0], which an English reader would write “4.000”, that is, four. You don’t want to confuse the user like that. “4000”, on the other hand, is universal.

Depending on your use case, you might write “4 thousand” too.
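As a rough sketch of the “format perfectly or not at all” rule, the helper below groups digits only for locales it explicitly knows and otherwise prints the plain number. The `GROUP_SEPARATORS` table and `format_integer` are hypothetical names; the separators are simply the ones quoted in this article, not authoritative locale data, and a production app should rely on a maintained locale database instead.

```ruby
# Thousands separators taken from the examples in this article.
# Illustrative only, not authoritative locale data.
GROUP_SEPARATORS = {
  "en"    => ",",
  "fr-CH" => "’",
  "fr-FR" => " ",
  "it"    => "."
}.freeze

# Hypothetical helper: groups the digits of an integer by thousands when
# the locale is known, and returns the plain digits otherwise.
def format_integer(number, locale)
  digits = number.to_s
  separator = GROUP_SEPARATORS[locale]
  return digits if separator.nil?

  # Insert the separator after every digit that is followed by a
  # multiple of three digits up to the end of the string.
  digits.gsub(/(\d)(?=(\d{3})+\z)/) { "#{$1}#{separator}" }
end

puts format_integer(4000, "en")    # 4,000
puts format_integer(4000, "fr-CH") # 4’000
puts format_integer(4000, "ja")    # 4000 (unknown locale, left unformatted)
```

Falling back to the unformatted number for unknown locales is a deliberate choice here: a plain “4000” is never wrong, while a guessed separator can be.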

Also, be wary of number input: you will get “3,14” instead of “3.14” if someone with a French keyboard submits a form.
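One way to cope with such input, sketched below, is to accept either decimal separator and reject anything more complicated instead of guessing. The `parse_decimal` helper is a hypothetical name, and the rule it applies (a single “.” or “,” is always the decimal separator) is an assumption that you should validate with your own users.

```ruby
# Hypothetical helper: accepts "3.14" or "3,14" and returns a Float.
# Input with more than one separator ("4,000.50") is rejected rather
# than guessed at.
def parse_decimal(input)
  text = input.strip
  unless text.match?(/\A-?\d+([.,]\d+)?\z/)
    raise ArgumentError, "unrecognized number: #{input.inspect}"
  end

  # Normalize the decimal comma to a dot before converting.
  Float(text.tr(",", "."))
end

puts parse_decimal("3,14") # 3.14
puts parse_decimal("3.14") # 3.14
```

Note that under this rule “4,000” is read as 4.0, not four thousand, which is one more reason to avoid thousands separators in forms.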

I won’t speak of dates as the problem is similar: just use ISO dates like “2016-06-01” for June 1st, 2016.
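In Ruby, for instance, the standard library can already produce and read ISO 8601 dates, so no custom format string is needed. A short sketch:

```ruby
require "date"

date = Date.new(2016, 6, 1)
puts date.iso8601 # 2016-06-01

# The same format can be parsed back without ambiguity.
parsed = Date.iso8601("2016-06-01")
puts parsed == date # true
```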

In the end, ask your users: they know what to expect.

As an additional note, be sure not to let relative times go stale. For example, if I leave a webpage open showing “last online, 5 minutes ago” and come back 1 hour later, it should read “last online, about 1 hour ago”. If you can’t update the time, don’t use relative times.
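The usual way to avoid stale labels is to keep the absolute timestamp and derive the label from it every time it is displayed. Below is a minimal sketch of that idea; `time_ago` is a hypothetical helper, its wording is English-only, and on a web page the recomputation would have to happen in the browser on a timer.

```ruby
# Hypothetical helper: builds the relative label from a stored timestamp
# each time it is called, so the label is never cached.
def time_ago(past, now: Time.now)
  seconds = (now - past).to_i
  minutes = seconds / 60
  hours = minutes / 60

  if seconds < 60
    "just now"
  elsif minutes < 60
    "#{minutes} minute#{'s' unless minutes == 1} ago"
  else
    "about #{hours} hour#{'s' unless hours == 1} ago"
  end
end

last_online = Time.at(0)
puts time_ago(last_online, now: last_online + 5 * 60)  # 5 minutes ago
puts time_ago(last_online, now: last_online + 65 * 60) # about 1 hour ago
```

The `now:` parameter is there only to make the example deterministic; in normal use the default `Time.now` applies.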