[Unicode Announcement] Unicode Collation Algorithm Version 5.2 Released

Date: 2009-10-22 02:05:34 | Standard | Source: The Unicode Blog | Author: Unicode, Inc.
Version 5.2 of the Unicode Collation Algorithm has been released.

See http://www.unicode.org/reports/tr10/.

This version resynchronizes the Unicode Collation Algorithm with all
of the updates for the Unicode Standard, Version 5.2. Please note
the following changes and issues for implementations:

* The text of UTS #10 has been updated. Among other changes, the
  revised text for UTS #10 makes it clear that the BASE for
  implicit generation of weights for Han characters does not
  include unassigned code points.

* There are small changes in Gujarati, Telugu, Malayalam
  (including weighting for chillus), Tamil, and Sinhala. While
  these changes move in the direction of expected behavior, good
  results will only come from tailoring for particular languages,
  such as with CLDR.

* There have been significant changes to the ordering of many
  combining marks. Many combining marks that are not in customary
  use in modern languages now have the same secondary weight, and
  will only be distinguished on a fourth level, by code point
  ordering. This can be seen by looking at the Unicode Collation
  Charts (http://unicode.org/charts/collation/). In 5.2, many
  characters now have a white background, indicating that they
  sort exactly the same as the previous character, unless a fourth
  (code point) level is used.

* Implementations of UCA should take note that the increased
  number of characters may cause overflows if the implementing
  code makes certain assumptions or optimizations. This can result
  either from the new character additions (which increase the
  number of distinct weights in the table) or because of changes
  in the way the weights, particularly for secondary weight
  values, are assigned in the table. The latter change may result
  in unexpected numbers of characters having the same weight.
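
As an illustration of the first item above, the following is a minimal
sketch of implicit primary weight generation. It assumes the BASE values
and the bit arithmetic as recalled from UTS #10 and should be checked
against the specification before use. The function name and the two
boolean parameters are hypothetical: a real implementation would derive
them from the Unicode Character Database (the Unified_Ideograph property
and the block assignment) rather than take them as arguments.

```python
# Sketch only: BASE values and formula are recalled from UTS #10 and
# should be verified against the specification text.
BASE_CORE_HAN = 0xFB40   # Unified_Ideograph in the core CJK blocks
BASE_OTHER_HAN = 0xFB80  # Unified_Ideograph in other blocks (extensions)
BASE_OTHER = 0xFBC0      # any other code point, including unassigned ones


def implicit_primary_weights(cp, is_unified_ideograph, in_core_han_block):
    """Return the pair (AAAA, BBBB) of implicit primary weights for cp.

    Both flags are hypothetical inputs supplied by the caller. An
    unassigned code point has is_unified_ideograph == False, so it never
    receives a Han BASE, even if it lies inside the range of a Han block.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %r" % (cp,))

    if is_unified_ideograph and in_core_han_block:
        base = BASE_CORE_HAN
    elif is_unified_ideograph:
        base = BASE_OTHER_HAN
    else:
        base = BASE_OTHER

    aaaa = base + (cp >> 15)          # high bits of the code point
    bbbb = (cp & 0x7FFF) | 0x8000     # low 15 bits, forced non-zero
    return aaaa, bbbb


if __name__ == "__main__":
    # An assigned ideograph in the core block versus a code point that is
    # treated as unassigned (flag set to False by the caller).
    for cp, unified, core in [(0x4E00, True, True), (0x9FFF, False, True)]:
        aaaa, bbbb = implicit_primary_weights(cp, unified, core)
        print("U+%04X -> %04X %04X" % (cp, aaaa, bbbb))
```

The point of the sketch is the branch order: the Han BASE values are
selected by the character property, not by the code point range alone,
so unassigned code points fall through to the general BASE.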

