XML GENERATE Statement

The XML GENERATE statement generates an XML document from existing COBOL data (that is, it translates COBOL data to XML format). It is an implementation of the IBM Enterprise COBOL verb of the same name and is provided to simplify IBM migrations; however, any customer wishing to write XML data can use this verb. See Working with Non-Vision Data in A Guide to Interoperating with ACUCOBOL-GT for additional information on working with XML data.

General Format

XML GENERATE identifier-1 FROM identifier-2 [COUNT [IN] identifier-3]
  [[ON] EXCEPTION imperative-statement-1]
  [NOT [ON] EXCEPTION imperative-statement-2]
  [END-XML]

Syntax Rules

  1. identifier-1 is the receiving area for the XML document. It must be an alphanumeric data item, and it must not overlap identifier-2 or identifier-3.
  2. identifier-1 must be large enough to contain the generated XML document. Typically, it should be from five to eight times the size of identifier-2, depending on the length of the data-name or data-names within identifier-2. If identifier-1 is not large enough, an error condition exists at the end of the XML GENERATE statement.
  3. identifier-2 is the source of the data to be converted to an XML document. It must not overlap with identifier-1 or identifier-3.
  4. identifier-3 is a numeric data item. It may not overlap with identifier-1 or identifier-2.
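
The following is a minimal sketch of how the three identifiers might be declared and used together. All data-names (Customer-Rec, Xml-Doc, Xml-Len) are hypothetical, and the receiver size simply applies the upper end of the five-to-eight-times guideline from Syntax Rule 2.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLGEN1.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * identifier-2: the source group (hypothetical names).
       01  Customer-Rec.
           05  Cust-Id        PIC 9(5)  VALUE 42.
           05  Cust-Name      PIC X(20) VALUE "Jane Doe".
      * identifier-1: receiver, 8 x the 25 source bytes.
       01  Xml-Doc            PIC X(200).
      * identifier-3: integer count field, no P in its picture.
       01  Xml-Len            PIC 9(5).
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML GENERATE Xml-Doc FROM Customer-Rec
               COUNT IN Xml-Len
               ON EXCEPTION
                   DISPLAY "XML GENERATE failed, XML-CODE=" XML-CODE
               NOT ON EXCEPTION
                   DISPLAY "Generated " Xml-Len " bytes"
           END-XML
           STOP RUN.
```

The three items are separate 01-level entries, so none of them can overlap another.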

General Rules

  1. XML GENERATE ignores certain data items when they are specified by identifier-2. These include:
    • Unnamed elementary data items or elementary FILLER items.
    • Slack bytes inserted for SYNCHRONIZED items.
    • Data items subordinate to identifier-2 that are described by the REDEFINES clause, or that are subordinate to such a redefining item.
    • Data items subordinate to identifier-2 that are described with the RENAMES clause.
    • Group data items whose subordinate data items are all ignored.
  2. All data items specified by identifier-2 that are not ignored as defined above must satisfy the following conditions:
    • Each elementary data item must either have class alphabetic, alphanumeric, or numeric, or be an index data item.
    • There must be at least one such elementary item.
    • Each non-FILLER data name must be unique within any immediately superordinate group data item.
  3. The COUNT IN phrase indicates that the count of generated XML characters (in bytes) should be stored in identifier-3, the data count field. identifier-3 must be an integer data item without the symbol "P" in its picture string. identifier-3 must not overlap identifier-1 or identifier-2.
  4. ON EXCEPTION phrase. When an error occurs during XML document generation, an exception condition exists. An example of this is when identifier-1 is not large enough to contain the generated XML document. In this case, XML generation stops and the content of the receiver, identifier-1, is undefined. If the COUNT IN phrase was specified, identifier-3 contains the number of character positions that were generated. This can range from zero to the length of identifier-1.

    If the ON EXCEPTION phrase is specified, control is transferred to imperative-statement-1. If it is not specified, any NOT ON EXCEPTION phrase is ignored, and control is transferred to the end of the XML GENERATE statement.

    At termination of an XML GENERATE statement, special register XML-CODE contains either 0, indicating successful completion of XML generation, or a non-zero error code, indicating that an exception occurred during XML generation. Following are the possible exception codes that you may encounter:

    Code         Description
    400          The receiver was too small to contain the generated XML document. The COUNT IN data item, if specified, contains the count of character positions that were actually generated.
    600 – 699    Internal error. Please report the error to your customer support analyst.
  5. NOT ON EXCEPTION phrase. If no exception conditions arise during generation of the XML document, control is passed to imperative-statement-2, if specified, or to the end of the XML GENERATE statement. If an ON EXCEPTION phrase is specified, it is ignored. Special register XML-CODE contains a zero after the XML GENERATE statement has finished executing.
  6. The END-XML phrase is an explicit scope terminator that delimits the scope of both XML GENERATE and XML PARSE statements. With END-XML, conditional XML GENERATE or XML PARSE statements can be nested in other conditional statements. Conditional XML GENERATE or XML PARSE statements specify the ON EXCEPTION or NOT ON EXCEPTION phrase.

    The scope of a conditional XML GENERATE or XML PARSE statement is terminated by:

    • An END-XML phrase at the same level of nesting
    • A separator period

Operation of XML GENERATE

Eligible elementary data items in identifier-2 are converted to character format. (See Data conversion and Data trimming for details.) Only the first definition of each storage area is processed. Redefinitions are not included, nor are data items that are effectively defined by the RENAMES clause.

Once the data content is converted, it is inserted as element character content in XML markup. The XML element names are derived from the data-names in identifier-2. (See Element naming for more information.) The names of group items that contain the selected elementary items are retained as parent elements. No extra white space is inserted to make the generated XML more readable. An XML declaration is not generated.
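
The sketch below shows how the selection rules might play out for a small record. The record layout is hypothetical; the comments mark which items the rules in this section would skip.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLGEN3.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Order-Rec.
           05  Order-Id           PIC 9(4) VALUE 17.
      * Elementary FILLER: ignored.
           05  FILLER             PIC X(2) VALUE SPACES.
           05  Ship-Date          PIC X(8) VALUE "20240131".
      * Redefinition and its subordinates: ignored.
           05  Ship-Parts REDEFINES Ship-Date.
               10  Ship-Year      PIC 9(4).
               10  Ship-Month     PIC 9(2).
               10  Ship-Day       PIC 9(2).
      * Group name kept as the parent element of Line-Count.
           05  Totals.
               10  Line-Count     PIC 9(3) VALUE 2.
       01  Xml-Doc                PIC X(300).
       01  Xml-Len                PIC 9(5).
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML GENERATE Xml-Doc FROM Order-Rec COUNT IN Xml-Len
               ON EXCEPTION
                   DISPLAY "Error, XML-CODE=" XML-CODE
               NOT ON EXCEPTION
                   DISPLAY Xml-Doc(1:Xml-Len)
           END-XML
           STOP RUN.
```

Read against the rules above, the generated document should contain elements only for Order-Rec, Order-Id, Ship-Date, Totals, and Line-Count, written as one unbroken string with no XML declaration.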

If the receiving area specified by identifier-1 is not large enough to contain the resulting XML document, an error condition arises. See the ON EXCEPTION phrase, General Rule #4, for details.
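
To make the overflow case concrete, the sketch below uses a receiver that is deliberately too small and tests for exception code 400. The data-names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLGEN4.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Note-Rec.
           05  Note-Text      PIC X(40) VALUE "Overflow demo".
      * Deliberately too small for the tags plus the text.
       01  Tiny-Doc           PIC X(10).
       01  Xml-Len            PIC 9(5).
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML GENERATE Tiny-Doc FROM Note-Rec COUNT IN Xml-Len
               ON EXCEPTION
                   IF XML-CODE = 400
                       DISPLAY "Receiver too small after "
                               Xml-Len " bytes"
                   ELSE
                       DISPLAY "Other error, XML-CODE=" XML-CODE
                   END-IF
               NOT ON EXCEPTION
                   DISPLAY "Unexpected success"
           END-XML
           STOP RUN.
```

The example never displays Tiny-Doc, because the content of the receiver is undefined after an exception.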

CAUTION:
If identifier-1 is longer than the generated XML document, only the initial part of identifier-1 changes. The rest of identifier-1 contains the data that was present before this execution of the XML GENERATE statement. To avoid referring to that data, either initialize identifier-1 to spaces before the XML GENERATE statement or specify the COUNT IN phrase.

Use the COUNT IN phrase to determine the total number of character positions, in bytes, that were generated. After XML GENERATE executes, identifier-3 contains this count. You can use identifier-3 as a reference modification length field to refer to the part of identifier-1 that contains the generated XML document.
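
A minimal sketch of this technique follows. The receiver is pre-filled with "#" characters to stand in for leftover data from an earlier use; the data-names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLGEN5.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  City-Rec.
           05  City-Name      PIC X(15) VALUE "Lisbon".
       01  Xml-Doc            PIC X(120).
       01  Xml-Len            PIC 9(5).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Simulate stale content left in the receiver.
           MOVE ALL "#" TO Xml-Doc
           XML GENERATE Xml-Doc FROM City-Rec COUNT IN Xml-Len
               ON EXCEPTION
                   DISPLAY "Error, XML-CODE=" XML-CODE
               NOT ON EXCEPTION
      * Reference modification selects only the generated bytes,
      * so the trailing "#" characters are never shown.
                   DISPLAY "[" Xml-Doc(1:Xml-Len) "]"
           END-XML
           STOP RUN.
```

Replacing the MOVE ALL "#" with MOVE SPACES TO Xml-Doc gives the other safeguard mentioned in the caution above.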

After execution of the XML GENERATE statement, special register XML-CODE contains either zero, indicating successful completion, or a non-zero exception code.

Please note that the XML PARSE statement also uses special register XML-CODE. Therefore, if you code an XML GENERATE statement in the processing procedure of an XML PARSE statement, save the value of XML-CODE before that XML GENERATE statement executes and restore the saved value after the XML GENERATE statement terminates.
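
The save-and-restore pattern could be sketched as follows. This page does not describe XML PARSE, so the PROCESSING PROCEDURE phrase and the XML-EVENT special register are assumptions taken from IBM Enterprise COBOL conventions; the data-names and the paragraph name HANDLE-EVENT are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLGEN6.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  In-Doc             PIC X(15) VALUE "<a><b>1</b></a>".
       01  Log-Rec.
           05  Event-Name     PIC X(30).
       01  Log-Xml            PIC X(200).
       01  Log-Len            PIC 9(5).
       01  Saved-Xml-Code     PIC S9(9).
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML PARSE In-Doc PROCESSING PROCEDURE HANDLE-EVENT
               ON EXCEPTION
                   DISPLAY "Parse error, XML-CODE=" XML-CODE
           END-XML
           STOP RUN.
       HANDLE-EVENT.
      * Save the parser's XML-CODE before XML GENERATE overwrites it.
           MOVE XML-CODE TO Saved-Xml-Code
           MOVE XML-EVENT TO Event-Name
           XML GENERATE Log-Xml FROM Log-Rec COUNT IN Log-Len
               NOT ON EXCEPTION
                   DISPLAY Log-Xml(1:Log-Len)
           END-XML
      * Restore it so the parser sees its own value on return.
           MOVE Saved-Xml-Code TO XML-CODE.
```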

Data Conversion

How elementary data items are converted to character format depends on the type of data item:

  • Alphabetic, alphanumeric, alphanumeric-edited, external floating-point, and numeric-edited items are not converted.
  • Fixed-point numeric data items are converted as if they were moved to a numeric-edited item that has:
    • An explicit decimal point, if the numeric item has at least one decimal position
    • The same number of decimal positions as the numeric item
    • A leading '-' picture symbol, if the data item is signed (that is, it has an S in its PICTURE clause)

    For COMPUTATIONAL-5 (COMP-5) binary data items, the number of integer positions depends on the number of '9' symbols in the picture character string. If the data item has one to four '9' picture symbols, the number of integer positions is five minus the number of decimal places. If the data item has five to nine '9' picture symbols, the number of integer positions is ten minus the number of decimal places. If the data item has 10 to 18 '9' picture symbols, the number of integer positions is 20 minus the number of decimal places.

    All other fixed-point numeric data items will have as many integer positions as the numeric item, but with at least one integer position.

  • Internal floating-point data items are converted as if they were moved to a data item as follows:
    • For COMP-1: an external floating-point data item with PICTURE -9.9(8)E+99
    • For COMP-2: an external floating-point data item with PICTURE -9.9(17)E+99 (a picture that cannot actually be coded, because it has too many digit positions)
  • Index data items are converted as if they were declared USAGE COMP-5 PICTURE S9(9). After conversion, leading and trailing spaces and leading zeroes are removed, as described under Data trimming.

After conversion, if a data item contains characters that are illegal in XML, the value in the data item before conversion or trimming is represented in hexadecimal, and an element tag name with the prefix "hex." is substituted for the regular tag name. For example, if data item Customer-Name is found at run time to contain LOW-VALUES, the XML element tag name 'hex.Customer-Name' is used instead of the normal 'Customer-Name', and the content is represented as a string of pairs of zero digits.

Any remaining instances of the five characters & (ampersand), ' (apostrophe), > (greater-than sign), < (less-than sign), and " (quotation mark) are converted into the equivalent XML references '&amp;', '&apos;', '&gt;', '&lt;', and '&quot;', respectively.
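
The conversion rules can be read against a small record such as the one sketched below. The data-names are hypothetical, and the comments restate what the rules in this section imply for each item rather than reporting observed output.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLGEN7.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Acct-Rec.
      * Alphanumeric: not converted; "&" is emitted as "&amp;".
           05  Acct-Name      PIC X(12)    VALUE "Smith & Sons".
      * Signed with two decimals: treated like PICTURE -999.99.
           05  Balance        PIC S9(3)V99 VALUE -12.34.
      * Unsigned integer: no sign, no decimal point.
           05  Units          PIC 9(4)     VALUE 13.
       01  Xml-Doc            PIC X(300).
       01  Xml-Len            PIC 9(5).
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML GENERATE Xml-Doc FROM Acct-Rec COUNT IN Xml-Len
               ON EXCEPTION
                   DISPLAY "Error, XML-CODE=" XML-CODE
               NOT ON EXCEPTION
                   DISPLAY Xml-Doc(1:Xml-Len)
           END-XML
           STOP RUN.
```

Once trimming is applied as well, the element contents implied by these rules are Smith &amp; Sons for Acct-Name, -12.34 for Balance, and 13 for Units.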

Data Trimming

Data values are trimmed after they are converted to character format. (Conversion is described under Data conversion.) Values converted from signed numeric values have their leading space removed if the value is positive. Values converted from numeric items have leading zeroes eliminated (after any initial minus sign), up to but not including the digit immediately before the actual or implied decimal point. Trailing zeroes after a decimal point are retained. For example:

  • -012.340 becomes -12.340.
  • 0000.45 becomes 0.45.
  • 0013 becomes 13.
  • 0000 becomes 0.

Character values from alphabetic and alphanumeric data items have either trailing or leading spaces removed, depending on justification. Trailing spaces are removed from values whose data items do not specify the JUSTIFIED clause (left justification, the default). Leading spaces are removed from values whose data items do specify the JUSTIFIED clause. If a character value consists solely of spaces, all spaces but one are removed.
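
The sketch below collects one item for each trimming case described in this section. The data-names are hypothetical, and the comments give the values that the rules above imply, not captured output.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLGEN8.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Trim-Rec.
      * Holds "AB" plus 6 trailing spaces; trimmed to "AB".
           05  Left-Text      PIC X(8).
      * Holds 6 leading spaces plus "AB"; trimmed to "AB".
           05  Right-Text     PIC X(8) JUSTIFIED RIGHT.
      * All spaces; a single space remains.
           05  Blank-Text     PIC X(8) VALUE SPACES.
      * -012.340 is trimmed to -12.340 (trailing zero kept).
           05  Amount         PIC S9(3)V9(3) VALUE -12.34.
      * 0000 is trimmed to 0.
           05  Zero-Count     PIC 9(4) VALUE 0.
       01  Xml-Doc            PIC X(400).
       01  Xml-Len            PIC 9(5).
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE "AB" TO Left-Text
           MOVE "AB" TO Right-Text
           XML GENERATE Xml-Doc FROM Trim-Rec COUNT IN Xml-Len
               ON EXCEPTION
                   DISPLAY "Error, XML-CODE=" XML-CODE
               NOT ON EXCEPTION
                   DISPLAY Xml-Doc(1:Xml-Len)
           END-XML
           STOP RUN.
```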

Element Naming

The element tag names in the XML documents generated from identifier-2 are derived from the name of the data item specified by identifier-2 and from any eligible data-names that are subordinate to identifier-2. The following rules apply:

  • The exact mixed-case spelling of data-names from the data description entry is retained. The spellings from any references to that data item (for example, in an OCCURS DEPENDING ON clause) are not used.
  • Data-names beginning with a digit are prefixed by an underscore. For example, the data-name 4C becomes the XML tag name _4C.
  • Names of data items whose content includes characters that are illegal in XML version 1.0 are prefixed by hex., and the content itself is expressed in hexadecimal.
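
A short sketch of the first two naming rules follows. The data-names are hypothetical; 4C is borrowed from the example in the list above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLGEN9.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Tag names follow the spelling used in these entries.
       01  Part-Info.
           05  partNumber     PIC X(6) VALUE "P-0001".
      * Starts with a digit, so the tag name is _4C.
           05  4C             PIC X(2) VALUE "OK".
       01  Xml-Doc            PIC X(200).
       01  Xml-Len            PIC 9(5).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Upper case here does not change the generated tag names.
           XML GENERATE Xml-Doc FROM PART-INFO COUNT IN Xml-Len
               ON EXCEPTION
                   DISPLAY "Error, XML-CODE=" XML-CODE
               NOT ON EXCEPTION
                   DISPLAY Xml-Doc(1:Xml-Len)
           END-XML
           STOP RUN.
```

Under the rules above, the expected tag names are Part-Info, partNumber, and _4C, regardless of how the procedure division spells its references.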

Nested XML GENERATE Statements

An XML GENERATE statement that appears as imperative-statement-1 or imperative-statement-2 of another XML GENERATE statement, or as part of either, is a nested XML GENERATE statement.

Nested XML GENERATE statements are considered to be matched XML GENERATE and END-XML combinations proceeding from left to right. For this reason, when END-XML phrases are encountered, they are matched with the nearest preceding XML GENERATE statements that have not already been terminated.
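
The matching rule can be seen in the sketch below, where a second XML GENERATE statement is nested in the NOT ON EXCEPTION phrase of the first. The data-names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLGEN10.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  Hdr-Rec.
           05  Hdr-Title      PIC X(10) VALUE "Report".
       01  Body-Rec.
           05  Body-Text      PIC X(10) VALUE "Details".
       01  Hdr-Xml            PIC X(100).
       01  Body-Xml           PIC X(100).
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML GENERATE Hdr-Xml FROM Hdr-Rec
               ON EXCEPTION
                   DISPLAY "Header failed, XML-CODE=" XML-CODE
               NOT ON EXCEPTION
      * Nested statement: runs only if the header succeeded.
                   XML GENERATE Body-Xml FROM Body-Rec
                       ON EXCEPTION
                           DISPLAY "Body failed, XML-CODE=" XML-CODE
                       NOT ON EXCEPTION
                           DISPLAY "Both documents generated"
      * First END-XML pairs with the nearest open statement,
      * the inner one; the second closes the outer one.
                   END-XML
           END-XML
           STOP RUN.
```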