$localize - message-id support

Background

The original design doc for $localize avoids message-ids for Angular v9 since the template compiler does not need to generate messages based on ids.

Therefore the original implementation for $localize relies upon the source-message as the "key" for looking up translations. For example, given the following template:





<head>
  <div i18n>
    Hello, {{title}}!
  </div>
</head>

The Angular compiler generates the following source-message, which is then used as a translation-key.

" Hello, {$interpolation}! "

Problems with source-message as translation-key

Multiple meanings

A source message can have more than one translation depending on its context, which requires a custom/manual id:

"right" (correct)          => vrai
"right" (opposite of left) => droit
"right" (fair)             => juste

In the current version of Angular the message "meaning" is combined with the source-message to compute the id.

Whitespace removal

To support consistent rendering of expandable ICU messages (containing markup), the source-message has its whitespace collapsed when preserveWhitespaces is false. This means that the source-message in the template differs from the one extracted into translation files. In the example given in Background, the source-message in the template is:

" Hello, {$interpolation}! "

while in the translation file it is:

"\n    Hello, {$interpolation}!\n  "

This prevents accurate translation lookup based on the source-message string.
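
The mismatch can be reproduced with a minimal sketch. The collapseWhitespace() helper is hypothetical and only approximates what the template compiler does when whitespace is not preserved; the French target string is illustrative.

```typescript
// Hypothetical helper: replaces each run of whitespace with a single space.
// This only approximates the compiler's behaviour; it is not Angular code.
function collapseWhitespace(message: string): string {
  return message.replace(/\s+/g, " ");
}

// Source-message as extracted into the translation file (whitespace intact).
const extracted = "\n    Hello, {$interpolation}!\n  ";
// Source-message as it appears in the compiled template (whitespace collapsed).
const inTemplate = collapseWhitespace(extracted);

// Translations keyed by the extracted source-message.
const bySourceMessage: Record<string, string> = {
  [extracted]: "Bonjour, {$interpolation} !",
};

// The lookup with the template string misses, although both strings
// represent the same message.
const found = inTemplate in bySourceMessage;
```

Keying the table by a message-id that both sides agree on avoids this miss, which is what the proposal below sets out to do.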

Proposal

Localized string syntax

Extend the format of localized strings to allow "meaning", "description" and "message-id" to be provided. This metadata is provided at the start of the string, delimited by colons. Each of these is optional. For example:

$localize`:meaning|description@@message-id:source message text`;

$localize`:meaning|:source message text`;
$localize`:description:source message text`;
$localize`:@@message-id:source message text`;

The delimiters within the colons are chosen to match those already used in Angular templates. For example:

<h1
  i18n="site header|An introduction header@@introductionHeader">
  Hello i18n!
</h1>

If a source-message actually starts with a colon then it must be escaped with a backslash. For example:

$localize`\:this message actually starts with a colon (:)`;

This approach is similar to that already implemented for named placeholders: the substitution is followed by a colon-delimited placeholder name. For example:

$localize`Hello ${item.name}:title:`;  // placeholder name is 'title'
$localize`${hours}\:${mins}\:${secs}`; // colons are part of the message
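
To make the metadata syntax concrete, here is a minimal sketch of how the leading block of a localized string could be parsed. The parseMetadataBlock() function and the MessageMetadata shape are hypothetical names used for illustration, not part of the proposal. The sketch works on the raw first message part (where an escaped colon is still visible as a backslash followed by a colon) and assumes the metadata itself contains no colons.

```typescript
// Hypothetical result shape; the field names are assumptions.
interface MessageMetadata {
  meaning?: string;
  description?: string;
  id?: string;
  text: string;
}

// Parses the raw first message part of a localized string.
function parseMetadataBlock(raw: string): MessageMetadata {
  if (raw.indexOf("\\:") === 0) {
    // Escaped colon: no metadata, the message itself starts with ":".
    return { text: raw.slice(1) };
  }
  if (raw.charAt(0) !== ":") {
    // No metadata block at all.
    return { text: raw };
  }
  const end = raw.indexOf(":", 1);
  if (end === -1) {
    throw new Error("Unterminated metadata block in: " + raw);
  }
  const block = raw.slice(1, end);
  const result: MessageMetadata = { text: raw.slice(end + 1) };

  // "@@" separates the optional custom message-id from the rest.
  const at = block.indexOf("@@");
  const meaningAndDescription = at === -1 ? block : block.slice(0, at);
  const id = at === -1 ? "" : block.slice(at + 2);

  // "|" separates the optional meaning from the description.
  const bar = meaningAndDescription.indexOf("|");
  const meaning = bar === -1 ? "" : meaningAndDescription.slice(0, bar);
  const description =
    bar === -1 ? meaningAndDescription : meaningAndDescription.slice(bar + 1);

  // Empty pieces are treated as "not provided".
  if (meaning) { result.meaning = meaning; }
  if (description) { result.description = description; }
  if (id) { result.id = id; }
  return result;
}
```

For instance, passing ":@@message-id:source message text" yields only an id and the text, while ":meaning|:source message text" yields only a meaning and the text.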

Template generation

The template source-message strings are not guaranteed to be identical to those in translation files (see Whitespace removal above). Therefore we must provide the computed id when generating $localize tagged strings in template generation. For example:

$localize `:@@123456789: Hello, {$interpolation}! `;

This ensures that translation of this message is based on the message-id and not the source-message.

Translation matching

Use the message-id as the key when looking up translations rather than the source-message.

If a source-message does not provide a custom message-id then compute one.

Computed message-ids are generated using the same algorithm as XLIFF2 and XMB/XTB formats. This is implemented in the decimalDigest() function.
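
The choice between a custom and a computed message-id can be sketched as follows. Note that standInDigest() is NOT the decimalDigest() algorithm: it is a simple placeholder hash used only to keep the example self-contained, and the way it mixes in the meaning is an assumption. The resolveMessageId() name is also hypothetical.

```typescript
type MessageId = string;

// Stand-in only: a simple 32-bit multiplicative string hash rendered as a
// decimal string. The real decimalDigest() uses a different algorithm and
// produces different ids.
function standInDigest(sourceMessage: string, meaning: string): MessageId {
  const input = meaning + "\u0000" + sourceMessage;
  let hash = 0;
  for (let i = 0; i < input.length; i++) {
    hash = (hash * 31 + input.charCodeAt(i)) >>> 0;
  }
  return hash.toString(10);
}

// Hypothetical helper: a custom message-id wins; otherwise one is computed.
function resolveMessageId(
  sourceMessage: string,
  meaning: string,
  customId?: string,
): MessageId {
  if (customId !== undefined && customId !== "") {
    return customId;
  }
  return standInDigest(sourceMessage, meaning);
}

// Same source-message, different meanings: the computed ids differ.
const idCorrect = resolveMessageId("right", "correct");
const idFair = resolveMessageId("right", "fair");
// A custom id is used unchanged.
const idCustom = resolveMessageId("right", "correct", "rightAnswer");
```

Because the meaning takes part in the computed id, the "right" examples from Multiple meanings resolve to distinct translation keys without needing custom ids.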

$localize.translate() implementation

The $localize() function passes the messageParts and expressions to the $localize.translate() run-time translation function. The current implementation computes the source-message and uses that as the translation-key.

function translate(messageParts: TemplateStringsArray,
   expressions: readonly any[]): [TemplateStringsArray, readonly any[]];

Modify this function to compute the message-id instead and use that as the translation-key.
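
A minimal sketch of an id-based lookup is shown below. The names translateById() and messageIdOf() are hypothetical, messageParts is simplified to a plain string array, the digest function is passed in rather than being decimalDigest() itself, and the sketch assumes the translation keeps the placeholders in their original order.

```typescript
type MessageId = string;

interface ParsedTranslation {
  // Simplified to a plain string array for this sketch.
  messageParts: readonly string[];
  placeholderNames: string[];
}

type InternalTranslations = Record<MessageId, ParsedTranslation>;

// Hypothetical helper: uses a custom id from a leading ":...@@id:" block if
// present, otherwise asks the supplied digest function to compute one.
function messageIdOf(
  messageParts: readonly string[],
  digest: (parts: readonly string[]) => MessageId,
): MessageId {
  const match = /^:[^:]*@@([^:]+):/.exec(messageParts[0]);
  return match ? match[1] : digest(messageParts);
}

function translateById(
  translations: InternalTranslations,
  messageParts: readonly string[],
  expressions: readonly unknown[],
  digest: (parts: readonly string[]) => MessageId,
): [readonly string[], readonly unknown[]] {
  const id = messageIdOf(messageParts, digest);
  if (!Object.prototype.hasOwnProperty.call(translations, id)) {
    // No translation: fall back to the original message. A fuller
    // implementation would also strip the metadata block here.
    return [messageParts, expressions];
  }
  // Assumes the translation keeps the placeholders in the original order.
  return [translations[id].messageParts, expressions];
}
```

The lookup never inspects the source-message text itself, so whitespace differences between template and translation file no longer matter once both sides agree on the id.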

Translation storage

Internally each translation is stored as a ParsedTranslation object, which is kept in a lookup table.

export interface ParsedTranslation {
  messageParts: TemplateStringsArray;
  placeholderNames: string[];
}
type SourceMessage = string;
type InternalTranslations = Record<SourceMessage, ParsedTranslation>;

Modify this lookup table so that the key is the message-id.

type MessageId = string;
type InternalTranslations = Record<MessageId, ParsedTranslation>;

Loading translations

The loadTranslations() function accepts a translations argument:

export type Translations = Record<SourceMessage, TargetMessage>;
export function loadTranslations(translations: Translations): void;

Change this function to accept a translations argument that is a map of message-ids instead of source-messages:

export type Translations = Record<MessageId, TargetMessage>;
export function loadTranslations(translations: Translations): void;

When calling loadTranslations(), the caller is now responsible for providing the message-id of each translation: either custom message-ids or computed via decimalDigest().

If the translations are loaded from a formatted translation file that uses the same algorithm as decimalDigest(), e.g. XLIFF2 or XTB, then the loader can just use the message-ids directly.
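
As an illustration of the id-keyed input, the sketch below loads a table that mixes a custom message-id with a computed one. loadTranslationsSketch() and parseTarget() are hypothetical stand-ins written for this example, not the actual loadTranslations() implementation, and the id "123456789" is an illustrative value rather than a real digest.

```typescript
type MessageId = string;
type TargetMessage = string;
type Translations = Record<MessageId, TargetMessage>;

interface ParsedTranslation {
  messageParts: string[];
  placeholderNames: string[];
}

const internalTranslations: Record<MessageId, ParsedTranslation> = {};

// Splits a target message such as "Bonjour, {$name} !" into its static
// parts and placeholder names. The capture group makes split() keep the names.
function parseTarget(target: TargetMessage): ParsedTranslation {
  const pieces = target.split(/\{\$(.*?)\}/);
  const messageParts: string[] = [];
  const placeholderNames: string[] = [];
  for (let i = 0; i < pieces.length; i++) {
    if (i % 2 === 0) {
      messageParts.push(pieces[i]);
    } else {
      placeholderNames.push(pieces[i]);
    }
  }
  return { messageParts, placeholderNames };
}

// Stores each translation under its message-id.
function loadTranslationsSketch(translations: Translations): void {
  const ids = Object.keys(translations);
  for (let i = 0; i < ids.length; i++) {
    internalTranslations[ids[i]] = parseTarget(translations[ids[i]]);
  }
}

loadTranslationsSketch({
  introductionHeader: "Bonjour i18n !",        // custom message-id
  "123456789": "Bonjour, {$interpolation} !", // computed message-id (illustrative)
});
```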

If the translations are loaded from a formatted translation file that uses a different digest algorithm, e.g. XLIFF 1.2, then the loader must convert the given message-id before calling loadTranslations(). This is done as follows:

1. Compute the message-id from the source-message in the file, using the digest algorithm of that file format.
2. Compare this computed message-id with the one stored in the translation file.
3. If the computed message-id differs from the one in the file, then the stored one must be a custom message-id:
   - use the custom message-id unchanged.
4. Otherwise the message-ids are identical, so the stored one must be a computed message-id; compute a new message-id using the decimalDigest() algorithm:
   - use the new computed message-id.
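
The steps above can be sketched as a single function. convertMessageId() is a hypothetical name, and both digest algorithms are passed in as parameters because their implementations are outside the scope of this example; the two digests in the usage lines are fakes that exist only to exercise the logic.

```typescript
type MessageId = string;
type Digest = (sourceMessage: string) => MessageId;

// Hypothetical helper that converts an id read from a translation file
// into the id expected by loadTranslations().
function convertMessageId(
  sourceMessage: string,
  idInFile: MessageId,
  fileDigest: Digest,    // digest algorithm used by the file format
  decimalDigest: Digest, // digest algorithm expected at run-time
): MessageId {
  const recomputed = fileDigest(sourceMessage);
  if (recomputed !== idInFile) {
    // The id was not derived from the message, so it is a custom id.
    return idInFile;
  }
  // The id was computed by the file format; recompute it with the
  // run-time algorithm.
  return decimalDigest(sourceMessage);
}

// Fake digests, for demonstration only.
const fakeFileDigest: Digest = (message) => "file-" + message.length;
const fakeDecimalDigest: Digest = (message) => "decimal-" + message.length;

const converted = convertMessageId("Hello", "file-5", fakeFileDigest, fakeDecimalDigest);
const keptCustom = convertMessageId("Hello", "introductionHeader", fakeFileDigest, fakeDecimalDigest);
```

One limitation of this approach is that a custom id which happens to equal the file format's digest of the message cannot be told apart from a computed one.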

Pete Bacon Darwin

2019/09/05 14:25:55

This is actually buggy because we ought to be including all the whitespace in the source-message: "\n Hello, {{title}}!\n ". Currently the Angular template lexer is collapsing the whitespace around the message.

2019/09/06 07:35:26

Actually 2) is probably a bug as well. We ought to ensure our tools always generate the same source-message for the same input (i.e. Angular component template).

2019/09/09 07:13:56

Actually, this is by design: when support was added for ICU messages, it was determined that the ICU message should remove whitespace in line with the rest of the template. This means that the message string in the template is not the same as in the extracted translation files.

2019/09/06 14:29:16

This is fixed by https://github.com/angular/angular/pull/32509

2019/09/09 07:17:33

Instead the message-id is computed based on the full raw message string, before the whitespace is removed.
