UTF-16 Byte Ordering

Each 16-bit UTF-16 code unit can be stored in one of two byte orders. By default, the machine’s native byte order is used. For example, on Intel-based machines, the least significant byte of each byte pair is stored at the lowest memory address (lo-hi byte ordering). On big-endian machines, the most significant byte is stored first (hi-lo byte ordering).

When a COBOL program contains Unicode data that must be accessed by machines of either byte order, you must override the default on a lo-hi machine so that the data is stored in hi-lo order. You do this with the UNICODE directive, which lets you specify NATIVE (the machine’s default byte order) or PORTABLE (always hi-lo) byte ordering. Use the following syntax:

unicode({native | portable})

For example, the character “A” has the UTF-16 code unit 0x0041, which is the byte pair (0x00, 0x41) in hi-lo order. On an Intel machine using native byte ordering, this is stored as:

0x41 0x00

However, if you specify the UNICODE(PORTABLE) directive, it is stored as:

0x00 0x41
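
The difference between the two orderings can be reproduced outside COBOL. The following Python sketch is an illustration only and does not show how the COBOL compiler implements the directive. The helper names utf16_bytes and hex_dump are hypothetical, and the "native" and "portable" arguments simply mirror the directive's two options.

```python
import sys


def utf16_bytes(text: str, ordering: str = "native") -> bytes:
    """Encode text as UTF-16 (no byte order mark) in the requested ordering.

    "portable" always yields hi-lo (big-endian) bytes; "native" follows
    the byte order of the machine running this script. These names are
    assumptions that mirror the directive's options.
    """
    if ordering == "portable":
        codec = "utf-16-be"
    elif ordering == "native":
        codec = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
    else:
        raise ValueError(f"unknown ordering: {ordering!r}")
    return text.encode(codec)


def hex_dump(data: bytes) -> str:
    """Format bytes in the same style as the examples above."""
    return " ".join(f"0x{b:02X}" for b in data)


# Lo-hi order, as on an Intel machine with native ordering.
print(hex_dump("A".encode("utf-16-le")))       # 0x41 0x00

# Hi-lo order, corresponding to the portable ordering.
print(hex_dump(utf16_bytes("A", "portable")))  # 0x00 0x41
```

Calling utf16_bytes("A", "native") returns whichever of the two byte sequences matches the machine the script runs on.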