Working with Different Character Sets

Attention: This topic applies to a feature that is in Early Adopter Program (EAP) release status. We intend to provide the finalized feature in a future release. Please contact Micro Focus SupportLine if you require further clarification.

The Data File Editor enables you to view and edit data in a variety of character sets, including those for DBCS-encoded languages such as Japanese, Chinese, and Korean.

To set the environment on Windows, set the system locale to the country associated with the character encoding you plan to use. Alternatively, if you run the Data File Editor from the command line, set the MFCODESET variable to the appropriate country code; refer to Supported Country Codes for a list of available codes. Then set the Windows display language to the appropriate language.

To set the environment on UNIX, set the system locale (the LANG variable) to the country associated with the character encoding you plan to use, and then set the MFCODESET variable to the appropriate country code; refer to Supported Country Codes for a list of available codes.
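The environment settings described above can be sketched in Python as follows. This is a minimal illustration, not part of the product: the helper name build_editor_env is hypothetical, the locale name ja_JP.UTF-8 is only an example, and the value 81 is assumed to be the country code for Japan, so confirm the correct value in Supported Country Codes before using it.

```python
import os


def build_editor_env(codeset, lang=None, base_env=None):
    """Return a copy of an environment with MFCODESET (and optionally LANG) set.

    codeset  -- country code for MFCODESET (see Supported Country Codes)
    lang     -- UNIX locale name for LANG; omit on Windows, where the
                system locale is set through the operating system settings
    base_env -- environment to copy; defaults to the current process environment
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MFCODESET"] = str(codeset)
    if lang is not None:
        env["LANG"] = lang
    return env


# Example values only: "81" and "ja_JP.UTF-8" are assumptions for illustration.
env = build_editor_env("81", lang="ja_JP.UTF-8")
```

The resulting dictionary could then be passed as the env argument of subprocess.run when starting the Data File Editor from the command line; the launch command itself depends on your installation and is not shown here.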

If you are dealing with DBCS-encoded languages, you also need to install and use an input method editor (IME) in order to edit in those languages. Microsoft provides one as part of Windows: use the Region and Language settings to install the required language pack(s), and then select the keyboard input method shown in the system tray. You can also use other input methods, but the Microsoft offering is the recommended one for the Data File Editor.

On UNIX, the available input methods vary between UNIX and Linux platforms, so consult your system documentation to find a suitable solution.

After you set the environment, characters from that encoding scheme should display correctly in the editor; previously, they would have been displayed as ?. For DBCS character sets, data input is carried out using the input method currently enabled.

If you are working with EBCDIC data, DBCS characters require shift-out and shift-in (SOSI) control characters at the start and end of the DBCS data. When you edit a formatted record, the editor automatically adds these characters to any PIC X fields, as long as you are editing in insert mode. You can see the added characters (0E for shift-out and 0F for shift-in) when the Hex pane is displayed. These characters are not added automatically if you are editing an unformatted record or editing directly in the Hex pane.
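To make the SOSI layout concrete, the following Python sketch shows how a run of DBCS bytes sits between the 0E and 0F control characters inside a field. It is an illustration of the byte layout only, not the editor's implementation: the function names are hypothetical, and the double-byte values used are placeholders rather than specific characters.

```python
SO = b"\x0e"  # shift-out: marks the start of DBCS data
SI = b"\x0f"  # shift-in: marks the end of DBCS data


def wrap_dbcs(dbcs_bytes):
    """Surround a run of double-byte data with SO and SI control characters."""
    if len(dbcs_bytes) % 2 != 0:
        raise ValueError("DBCS data must contain an even number of bytes")
    return SO + dbcs_bytes + SI


def dbcs_segments(field):
    """Return the DBCS byte runs found between each SO/SI pair in a field."""
    segments = []
    pos = 0
    while True:
        start = field.find(SO, pos)
        if start == -1:
            break
        end = field.find(SI, start + 1)
        if end == -1:
            raise ValueError("shift-out without a matching shift-in")
        segments.append(field[start + 1:end])
        pos = end + 1
    return segments


# Two single-byte EBCDIC characters, a DBCS run of two placeholder
# double-byte values, then one more single-byte character.
field = b"\xc1\xc2" + wrap_dbcs(b"\x44\x81\x44\x82") + b"\xc3"
print(field.hex())           # c1c20e448144820fc3
print(dbcs_segments(field))  # [b'D\x81D\x82']
```

Note that each DBCS run occupies two extra bytes for the control characters, so a field needs enough room for them in addition to the data itself.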