
Building UTF-32 CMap Resources

Date: 2012-03-07 13:21:05 | Author: fontke

When using AFDKO to develop CID-keyed OpenType/CFF fonts, the most important CMap resources are the UTF-32 ones, for the following reasons:

• Unicode has become the de facto character encoding for today's OSes and applications.
• When the font includes mappings outside the BMP (Basic Multilingual Plane), the Format 12 (UTF-32) 'cmap' subtable is included. When a font includes only BMP mappings, the AFDKO makeotf tool is smart enough not to create a Format 12 'cmap' subtable, and instead creates only a Format 4 (BMP-only UTF-16) one.
• UTF-32 is arguably the most human-readable of the Unicode Encoding Forms, because its big-endian hexadecimal representation is simply the Unicode Scalar Value without the "U+" prefix and zero-padded to eight digits.
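To make the last two points concrete, here is a minimal Python sketch. The function names utf32_hex and needs_format12 are hypothetical helpers invented for this illustration, and the second one only mirrors the rule described above; it is not makeotf's actual code.

```python
def utf32_hex(code_point: int) -> str:
    """Return the eight-digit hexadecimal UTF-32 (big-endian) form of a scalar value."""
    # Surrogates (U+D800..U+DFFF) are code points but not Unicode Scalar Values.
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {code_point:#x}")
    return f"{code_point:08X}"


def needs_format12(code_points) -> bool:
    """True if any code point lies outside the BMP, that is, above U+FFFF."""
    return any(cp > 0xFFFF for cp in code_points)


print(utf32_hex(0x304B))   # 0000304B
print(utf32_hex(0x20BB7))  # 00020BB7

# The same digits come out of Python's own UTF-32BE encoder.
print(chr(0x304B).encode("utf-32-be").hex().upper() == utf32_hex(0x304B))  # True

print(needs_format12([0x0020, 0x304B, 0x304C]))  # False
print(needs_format12([0x0020, 0x20BB7]))         # True
```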

The AFDKO makeotf tool is used to build a fully-functional font, and a UTF-32 CMap resource is specified as the argument of its "-ch" command-line option.
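As a rough illustration, the sketch below assembles such a makeotf invocation in Python without running it. The file names (cidfont.ps, features, MyFont-UTF32-H, MyFont.otf) are placeholders invented for this example, and makeotf_command is a hypothetical helper; a real build would typically pass further options as well.

```python
import shlex


def makeotf_command(font, features, cmap, output):
    """Build an argument list for makeotf; nothing is executed here."""
    return [
        "makeotf",
        "-f", font,       # input CIDFont resource
        "-ff", features,  # "features" file
        "-ch", cmap,      # horizontal (UTF-32) CMap resource
        "-o", output,     # output font file
    ]


# All four file names are placeholders.
cmd = makeotf_command("cidfont.ps", "features", "MyFont-UTF32-H", "MyFont.otf")
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run once AFDKO is installed and the files exist.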

When developing fonts that are based on one of the public ROSes, such as Adobe-CNS1-6, Adobe-GB1-5, Adobe-Japan1-6, or Adobe-Korea1-2, you simply use the appropriate UTF-32 CMap resources that are made available in the CMap Resources open source project that is hosted at Open @ Adobe. AFDKO includes the UTF-32 CMap resources for the public ROSes, and the makeotf tool invokes them automatically, but the latest versions are always available at Open @ Adobe.

Most of the public ROSes include only one UTF-32 CMap resource, so the choice is always clear, because there is no choice. And, the makeotf tool makes this non-choice choice for you. ☺

There are, however, several UTF-32 CMap resources associated with the Adobe-Japan1-6 ROS, so the appropriate choice depends on the purpose of the font:

• The UniJIS-UTF32-H CMap resource is recommended for JIS90-savvy fonts.
• The UniJIS2004-UTF32-H CMap resource is recommended for JIS2004-savvy fonts.
• The UniJISX0213-UTF32-H and UniJISX02132004-UTF32-H CMap resources correspond to what is used for the Hiragino (ヒラギノ) fonts that are bundled with Mac OS X: these differ from UniJIS-UTF32-H and UniJIS2004-UTF32-H in that the code points for 65 symbols map to proportional glyphs instead of full-width ones.

When developing CID-keyed OpenType/CFF fonts that are based on an ROS other than a public one, including the special-purpose Adobe-Identity-0 ROS, you must build your own UTF-32 CMap resource. That is the topic of this particular CJK Type Blog article.

When building your own UTF-32 CMap resource, the most important data is a mapping from Unicode code points to CIDs. As long as you have that, the process is relatively simple. As a very simple and minimal example, let's assume that the font includes the following four glyphs:

CID    Unicode Scalar Value
0      n/a (.notdef)
1      U+0020 (space)
2      U+304B (か)
3      U+304C (が)

Thus, U+0020 maps to CID+1, U+304B maps to CID+2, and U+304C maps to CID+3. The mappings are specified between the begincidchar and endcidchar operators, as shown in the complete CMap resource below (the non-boilerplate portions are in bold):

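%!PS-Adobe-3.0 Resource-CMap
%%DocumentNeededResources: ProcSet (CIDInit)
%%IncludeResource: ProcSet (CIDInit)
%%BeginResource: CMap (CJKTypeBlogTest-UTF32-H)
%%Title: (CJKTypeBlogTest-UTF32-H Adobe Identity 0)
%%Version: 1.000
%%EndComments

/CIDInit /ProcSet findresource begin

12 dict begin

begincmap

/CIDSystemInfo 3 dict dup begin
  /Registry (Adobe) def
  /Ordering (Identity) def
  /Supplement 0 def
end def

/CMapName /CJKTypeBlogTest-UTF32-H def
/CMapVersion 1.000 def
/CMapType 1 def

/WMode 0 def

1 begincodespacerange
  <00000000> <0010FFFF>
endcodespacerange

% Control codes map to CID+1, following the convention of Adobe's public UTF-32 CMap resources
1 beginnotdefrange
<00000000> <0000001f> 1
endnotdefrange

3 begincidchar
<00000020> 1
<0000304b> 2
<0000304c> 3
endcidchar

endcmap
CMapName currentdict /CMap defineresource pop
end
end

%%EndResource
%%EOF
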
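When the Unicode-to-CID mapping lives in a spreadsheet or script, the resource can also be generated rather than typed. The following Python sketch does that for the four-glyph table above. build_utf32_cmap is a hypothetical helper written for this illustration; the not-defined range (control codes to CID+1) follows the convention of Adobe's public UTF-32 CMap resources and is an assumption here, as is splitting the mappings into blocks of at most 100 entries, the per-block limit given in Adobe's CMap specification.

```python
def build_utf32_cmap(name, registry, ordering, supplement, mappings):
    """Return the text of a UTF-32 CMap resource.

    mappings: dict of Unicode scalar value (int) -> CID (int).
    """
    header = [
        "%!PS-Adobe-3.0 Resource-CMap",
        "%%DocumentNeededResources: ProcSet (CIDInit)",
        "%%IncludeResource: ProcSet (CIDInit)",
        f"%%BeginResource: CMap ({name})",
        f"%%Title: ({name} {registry} {ordering} {supplement})",
        "%%Version: 1.000",
        "%%EndComments",
        "",
        "/CIDInit /ProcSet findresource begin",
        "12 dict begin",
        "begincmap",
        "",
        "/CIDSystemInfo 3 dict dup begin",
        f"  /Registry ({registry}) def",
        f"  /Ordering ({ordering}) def",
        f"  /Supplement {supplement} def",
        "end def",
        "",
        f"/CMapName /{name} def",
        "/CMapVersion 1.000 def",
        "/CMapType 1 def",
        "/WMode 0 def",
        "",
        "1 begincodespacerange",
        "  <00000000> <0010FFFF>",
        "endcodespacerange",
        "",
        # Assumption: control codes map to CID+1, as in Adobe's public CMaps.
        "1 beginnotdefrange",
        "<00000000> <0000001f> 1",
        "endnotdefrange",
        "",
    ]

    body = []
    items = sorted(mappings.items())
    # Emit the mappings in blocks of at most 100 entries each.
    for start in range(0, len(items), 100):
        chunk = items[start:start + 100]
        body.append(f"{len(chunk)} begincidchar")
        body.extend(f"<{cp:08x}> {cid}" for cp, cid in chunk)
        body.append("endcidchar")
        body.append("")

    footer = [
        "endcmap",
        "CMapName currentdict /CMap defineresource pop",
        "end",
        "end",
        "",
        "%%EndResource",
        "%%EOF",
    ]
    return "\n".join(header + body + footer) + "\n"


cmap_text = build_utf32_cmap(
    "CJKTypeBlogTest-UTF32-H", "Adobe", "Identity", 0,
    {0x0020: 1, 0x304B: 2, 0x304C: 3},
)
print(cmap_text)
```

Writing cmap_text to a file named after the CMap resource gives something that can be passed to makeotf's "-ch" option.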

Of course, the mappings could be made more efficient by using the begincidrange and endcidrange operators for the contiguous Unicode code points whose CIDs are also contiguous:

1 begincidchar
<00000020> 1
endcidchar

1 begincidrange
<0000304b> <0000304c> 2
endcidrange

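For anyone who does want the compact form, the grouping can be sketched in Python as follows. to_cid_ranges and format_ranges are hypothetical helpers for this illustration. The sketch assumes that a range should not cross a change in any byte other than the last one, which matches how ranges appear in Adobe's published CMap resources, and for brevity it ignores the limit of 100 entries per block.

```python
def to_cid_ranges(mappings):
    """Group code point -> CID mappings into (first, last, first_cid) runs."""
    ranges = []
    for cp, cid in sorted(mappings.items()):
        if ranges:
            first, last, first_cid = ranges[-1]
            # Both the code points and the CIDs must advance in step.
            contiguous = cp == last + 1 and cid == first_cid + (cp - first)
            # Assumption: a range keeps every byte but the last unchanged.
            same_row = (cp >> 8) == (first >> 8)
            if contiguous and same_row:
                ranges[-1] = (first, cp, first_cid)
                continue
        ranges.append((cp, cp, cid))
    return ranges


def format_ranges(ranges):
    """Render single mappings as cidchar entries and runs as cidrange entries."""
    singles = [(first, cid) for first, last, cid in ranges if first == last]
    runs = [r for r in ranges if r[0] != r[1]]
    lines = []
    if singles:
        lines.append(f"{len(singles)} begincidchar")
        lines += [f"<{cp:08x}> {cid}" for cp, cid in singles]
        lines.append("endcidchar")
    if runs:
        if lines:
            lines.append("")
        lines.append(f"{len(runs)} begincidrange")
        lines += [f"<{first:08x}> <{last:08x}> {cid}" for first, last, cid in runs]
        lines.append("endcidrange")
    return "\n".join(lines)


ranges = to_cid_ranges({0x0020: 1, 0x304B: 2, 0x304C: 3})
print(format_ranges(ranges))
```

For the four-glyph example this yields one begincidchar entry for U+0020 and one begincidrange entry covering U+304B through U+304C.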
However, the makeotf tool does not require the mappings in the CMap resource to be efficient, because it makes them efficient itself in the resulting 'cmap' table. Therefore, there is no reason to go through this effort when building a UTF-32 CMap resource.

I always recommend using a UTF-32 CMap resource regardless of whether it includes mappings outside the BMP. The Adobe-Korea1-2 UTF-32 CMap resource, UniKS-UTF32-H, includes only BMP mappings, and is used as the basis for all of our OpenType Korean fonts.

As a final note, not all CIDs must be mapped from UTF-32 code points. For example, vertical variants are accessed through the use of the 'vert' or 'vrt2' GSUB (Glyph SUBstitution) feature. It is possible to define these mappings in a corresponding vertical CMap resource whose final designator is "V" instead of "H," and to specify it as the argument for the makeotf "-cv" command-line option, but it is considered better practice to define the appropriate GSUB features in the "features" file.
