Devanagari Candrabindu as a Consonant Modifier #2392

Richard57 · 2020-05-08T09:28:26Z

In Devanagari (and other Indian scripts), candrabindu can nasalise either the first consonant of a cluster or the vowel. In the first role it is written at the left of the cluster, while in the second role it tends to be written centrally or at the right. On Windows 10, the font Nirmala UI distinguishes the two when rendered with Uniscribe, displaying the sequence as a single akshara (i.e. no internal halant) in both cases. It does not distinguish them when rendered with HarfBuzz, which displays the sequence as two aksharas. When the arrangement of the consonants in the cluster is to be left to the renderer, nasalisation of the consonant is encoded on Windows by the sequence <consonant, virama, candrabindu, consonant>.

Contrasting displays on Windows 10 (generated on the same machine and cut down from the same screen dump) are:

Internet Explorer 11 Version 11.1158.17763.0 Update Versions 11.0.185, which uses Uniscribe:

Chrome: Version 81.0.4044.129 (Official Build) (64-bit), which uses HarfBuzz:

The encodings of the sequences, row by row and left to right, are:

  0932 094d 0901 0932
  0932 094d 0932 0901

  0932 094d 0901 0932 093e
  0932 094d 0932 093e 0901

  0932 094d 0901 0932 0947
  0932 094d 0932 0947 0901

  0932 094d 0901 0932 093f
  0932 094d 0932 093f 0901
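For anyone wanting to reproduce the test strings, the code point lists above can be turned into text with a few lines of Python. This is only a convenience sketch: the helper names `decode` and `describe` are made up for illustration, and it assumes each pair of lines above is one row (left sequence, then right sequence).

```python
import unicodedata

# Code point sequences copied from the rows above (left, right).
ROWS = [
    ("0932 094d 0901 0932", "0932 094d 0932 0901"),
    ("0932 094d 0901 0932 093e", "0932 094d 0932 093e 0901"),
    ("0932 094d 0901 0932 0947", "0932 094d 0932 0947 0901"),
    ("0932 094d 0901 0932 093f", "0932 094d 0932 093f 0901"),
]


def decode(hex_seq: str) -> str:
    """Turn a space-separated list of hex code points into a string."""
    return "".join(chr(int(cp, 16)) for cp in hex_seq.split())


def describe(text: str) -> list[str]:
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


for left, right in ROWS:
    # Paste the printed strings into the renderer under test.
    print(decode(left), decode(right))
```

In the left column candrabindu sits between the virama and the second consonant; in the right column it comes at the end of the sequence.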

The rendering issue was confirmed to be present in the latest development from HarfBuzz Version 2.6.5. The Uniscribe rendering is linguistically correct and matches readers' expectations.

There was discussion on the Unicode list of how to encode the difference in the thread containing the post
https://www.unicode.org/mail-arch/unicode-ml/y2011-m06/0144.html . There is some more recent discussion in the threads entitled "Devanagari ल्लाँ ambiguous?", starting 5 May 2020. (No URL, as the Unicode Consortium's web site is still being repaired.)

The Uniscribe behaviour conflicts with the published specification, which agrees with TUS 12.1 R10 that U+0901 DEVANAGARI SIGN CANDRABINDU must follow the consonants and vowels of an akshara; this rule leads to the 2-akshara rendering produced by HarfBuzz.

This issue also impacts the porting of Devanagari to the USE.

There remains a case for treating U+0901 immediately before a vowel as a string encoding error.

behdad · 2020-05-19T02:14:09Z

cc @jfkthame

Richard57 · 2020-05-19T06:29:10Z

Andrew Glass has undertaken to update the Microsoft Devanagari specification (MicrosoftDocs/typography-issues#416) to add the current Uniscribe/DirectWrite behaviour as correct.

behdad · 2022-07-15T21:41:10Z

@dscorbett can you triage this please?

dscorbett · 2022-07-16T16:55:49Z

We should make this change.

I checked all the Indic-shaper scripts in Notepad with strings analogous to ⟨ल्ँल⟩. Only Devanagari supports it, and only for certain marks:

U+0900 DEVANAGARI SIGN INVERTED CANDRABINDU
U+0901 DEVANAGARI SIGN CANDRABINDU
U+0902 DEVANAGARI SIGN ANUSVARA
U+0953 DEVANAGARI GRAVE ACCENT
U+0954 DEVANAGARI ACUTE ACCENT

These belong to a new class which I’ll call CB. CB behaves like SM but can also appear between H and C. A dotted circle is inserted between two adjacent CBs. CB blocks 'rphf' in a context like <Ra, H, CB, C>.
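To make the proposed CB rules concrete, here is a rough Python sketch of the three behaviours described above. It is not how HarfBuzz implements or would implement this; the function names are hypothetical, and the consonant test is a deliberate simplification that only covers U+0915..U+0939.

```python
# The five marks listed above as members of the proposed CB class.
CB = {"\u0900", "\u0901", "\u0902", "\u0953", "\u0954"}
HALANT = "\u094d"         # H: DEVANAGARI SIGN VIRAMA
RA = "\u0930"             # DEVANAGARI LETTER RA
DOTTED_CIRCLE = "\u25cc"


def is_consonant(ch: str) -> bool:
    """Simplified C test (assumption): basic consonants KA..HA only."""
    return "\u0915" <= ch <= "\u0939"


def cb_between_h_and_c(text: str) -> list[int]:
    """Indices of CB marks that sit between H and C, e.g. <C, H, CB, C>."""
    return [
        i
        for i in range(1, len(text) - 1)
        if text[i] in CB and text[i - 1] == HALANT and is_consonant(text[i + 1])
    ]


def insert_dotted_circles(text: str) -> str:
    """Insert a dotted circle between two adjacent CB marks."""
    out = []
    prev_is_cb = False
    for ch in text:
        if ch in CB and prev_is_cb:
            out.append(DOTTED_CIRCLE)
        out.append(ch)
        prev_is_cb = ch in CB
    return "".join(out)


def has_ra_h_cb_c(text: str) -> bool:
    """True if text contains <Ra, H, CB, C>, the context where CB blocks 'rphf'."""
    return any(
        text[i] == RA
        and text[i + 1] == HALANT
        and text[i + 2] in CB
        and is_consonant(text[i + 3])
        for i in range(len(text) - 3)
    )
```

For example, `cb_between_h_and_c("\u0932\u094d\u0901\u0932")` finds the candrabindu at index 2, whereas the same call on `"\u0932\u094d\u0932\u0901"` returns an empty list because the mark follows the second consonant.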

khaledhosny added the Indic label May 15, 2020
