Double-Byte Character Set Support

Many of the world's languages use sets of characters that run into the thousands. Most computers use 8-bit bytes, and assign a different 8-bit code to represent each character; this scheme can represent no more than 256 different characters.

Ideally a COBOL programmer should not need to be aware of the internal code used to represent characters. In practice, however, some features of the internal code can affect the source programmer, and the limit of 256 different characters is one of the most restrictive of these.

For this reason the Double-Byte Character Set (DBCS) is provided. In this scheme each character is represented by a 16-bit code and occupies a pair of adjacent bytes. A 16-bit code has 65,536 possible values, so this scheme can represent thousands of different characters.
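The difference between the two schemes can be illustrated with a short sketch. It is written in Python rather than COBOL, purely to show the arithmetic and the byte-pair layout. It assumes Shift-JIS as an example double-byte encoding; the encoding your system uses may differ, and the name byte_pairs is hypothetical.

```python
# Illustrative sketch only. Shift-JIS is an assumed example encoding;
# it is not necessarily the DBCS code used by any particular COBOL system.

SBCS_CODES = 2 ** 8    # number of distinct values of an 8-bit code
DBCS_CODES = 2 ** 16   # number of distinct values of a 16-bit code


def byte_pairs(text, encoding="shift_jis"):
    """Return the encoded bytes of text as a list of 2-byte pairs.

    Raises ValueError if any character is not encoded as exactly
    two bytes in the chosen encoding.
    """
    raw = text.encode(encoding)
    if len(raw) != 2 * len(text):
        raise ValueError("text contains characters that are not double-byte")
    # Each character occupies a pair of adjacent bytes.
    return [raw[i:i + 2] for i in range(0, len(raw), 2)]


pairs = byte_pairs("日本語")
print(SBCS_CODES, DBCS_CODES, len(pairs))
```

Each of the three characters in the sample string occupies one pair of adjacent bytes, so a field holding n such characters needs 2n bytes of storage.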

The assignment of DBCS character codes to characters varies from country to country.

The 8-bit code used by your COBOL system is the American Standard Code for Information Interchange (ASCII). In this chapter it is referred to as the Single-Byte Character Set (SBCS).

Double-Byte Character Set support is sensitive to the DBCS Compiler directive.

See also the chapter Micro Focus Extensions for Double-Byte Character Support, which is concerned primarily with Japanese language support.