In Devanagari (and other Indian scripts), candrabindu can nasalise the first consonant of a cluster or the vowel. In the first role, it is written at the left of the cluster, while in the second role it tends to be written centrally or at the right. On Windows 10, the font Nirmala UI distinguishes the two when rendered with Uniscribe, displaying the sequence as a single akshara (i.e. no internal halant) in both cases, but not when rendered with HarfBuzz, which displays the sequence as two aksharas. When the arrangement of the consonants in the cluster is to be left to the renderer, nasalisation of the consonant is encoded on Windows by the sequence <consonant, virama, candrabindu, consonant>.
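To make the two encodings concrete, here is a minimal Python sketch that builds both strings and lists their code points. The helper names `nasalised_cluster` and `describe` are hypothetical, and the choice of LA for both consonants is only an illustration; nothing here calls a shaping engine.

```python
import unicodedata

LA = "\u0932"           # DEVANAGARI LETTER LA
VIRAMA = "\u094D"       # DEVANAGARI SIGN VIRAMA
CANDRABINDU = "\u0901"  # DEVANAGARI SIGN CANDRABINDU


def nasalised_cluster(first: str, second: str) -> str:
    """Hypothetical helper: <consonant, virama, candrabindu, consonant>,
    i.e. candrabindu nasalising the first consonant of the cluster."""
    return first + VIRAMA + CANDRABINDU + second


def describe(text: str) -> list[str]:
    """List each code point with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# Candrabindu on the first consonant of the cluster.
consonant_nasalised = nasalised_cluster(LA, LA)

# Candrabindu on the vowel: it follows the whole cluster instead.
vowel_nasalised = LA + VIRAMA + LA + CANDRABINDU

for label, text in [("consonant", consonant_nasalised), ("vowel", vowel_nasalised)]:
    print(label, text)
    for line in describe(text):
        print("   ", line)
```

Both strings contain the same four characters; only the position of U+0901 differs, which is what the renderer has to act on.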
Contrasting displays on Windows 10 (generated on the same machine and cut down from the same screen dump) are:
Internet Explorer 11 Version 11.1158.17763.0 Update Versions 11.0.185, which uses Uniscribe:
Chrome: Version 81.0.4044.129 (Official Build) (64-bit), which uses HarfBuzz:
The encoding of the various sequences row by row, left to right is
The rendering issue was confirmed to be present in the latest development from HarfBuzz Version 2.6.5. The Uniscribe rendering is linguistically correct and matches readers' expectations.
There was discussion on the Unicode list of how to encode the difference in the thread containing post https://www.unicode.org/mail-arch/unicode-ml/y2011-m06/0144.html . There is some more recent discussion in the threads entitled "Devanagari ल्लाँ ambiguous?" and starting 5 May 2020. (No URL, as the Unicode Consortium's web server/site is still being repaired.)
The Uniscribe behaviour conflicts with the published specification, which agrees with TUS 12.1 R10 that U+0901 DEVANAGARI SIGN CANDRABINDU must follow the consonants and vowels of an akshara; this rule leads to the 2-akshara rendering produced by HarfBuzz.
This issue also impacts the porting of Devanagari to the USE.
There remains a case for treating U+0901 immediately before a vowel as a string encoding error.
Andrew Glass has undertaken to update the Microsoft Devanagari specification (MicrosoftDocs/typography-issues#416) to add the current Uniscribe/DirectWrite behaviour as correct.
I checked all the Indic-shaper scripts in Notepad with strings analogous to ⟨ल्ँल⟩. Only Devanagari supports it, and only for certain marks:
U+0900 DEVANAGARI SIGN INVERTED CANDRABINDU
U+0901 DEVANAGARI SIGN CANDRABINDU
U+0902 DEVANAGARI SIGN ANUSVARA
U+0953 DEVANAGARI GRAVE ACCENT
U+0954 DEVANAGARI ACUTE ACCENT
These belong to a new class which I’ll call CB. CB behaves like SM but can also appear between H and C. A dotted circle is inserted between two adjacent CBs. CB blocks 'rphf' in a context like <Ra, H, CB, C>.
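The CB rules described above can be sketched in Python as follows. This is only an illustration of the stated behaviour, not HarfBuzz or Uniscribe code: the function names are hypothetical, and the consonant test is a simplifying assumption that covers only the basic range U+0915..U+0939.

```python
# The five marks listed above as belonging to the CB class.
CB = frozenset("\u0900\u0901\u0902\u0953\u0954")
RA = "\u0930"             # DEVANAGARI LETTER RA
HALANT = "\u094D"         # DEVANAGARI SIGN VIRAMA (H)
DOTTED_CIRCLE = "\u25CC"  # DOTTED CIRCLE


def is_consonant(ch: str) -> bool:
    """Assumption: treat only U+0915..U+0939 as consonants (C)."""
    return "\u0915" <= ch <= "\u0939"


def insert_dotted_circles(text: str) -> str:
    """Insert a dotted circle between two adjacent CB characters."""
    out = []
    for ch in text:
        if ch in CB and out and out[-1] in CB:
            out.append(DOTTED_CIRCLE)
        out.append(ch)
    return "".join(out)


def reph_blocked(text: str) -> bool:
    """True when the text starts with <Ra, H, CB, C>, the context in
    which CB is described as blocking 'rphf'."""
    return (
        len(text) >= 4
        and text[0] == RA
        and text[1] == HALANT
        and text[2] in CB
        and is_consonant(text[3])
    )
```

For example, `reph_blocked("\u0930\u094D\u0901\u0932")` is true, while plain `"\u0930\u094D\u0932"` has no CB after the halant and so is not matched by this check.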