Chapter 1: Writing Efficient Programs

The information in this chapter is designed to help you produce programs that run as efficiently as possible. It includes information on the following:

Producing fast, compact code
Implementation of floating-point arithmetic
Handling large programs

The information in this chapter applies to programs compiled to native code, not intermediate code, unless otherwise stated.

1.1 Performance Programming

This section gives some guidelines which, if followed, enable your Server Express system to optimize fully the native code produced for your programs. This results in smaller and faster applications. Do remember that these are only guidelines; programs that do not conform to these guidelines still run correctly, just less efficiently.

Programs using Server Express can be moved to other systems provided certain portability rules are followed.

1.1.1 Data Types

Using the correct data type is important to get the greatest efficiency from operations, particularly arithmetic operations. For details of how each data type is stored, refer to your Language Reference. The following list gives information about data types:

Use unsigned COMP-5 or COMP-X numeric data items; preferably COMP-5 (where possible).
COMP-5 usage is defined for binary storage with the value stored in the native byte sequence of the processor. This makes it ideal for arithmetic efficiency. However, it is not suitable for data stored in files or passed to other machines over a network as the data is not accessible by programs running in an environment in which COMP-5 is stored in a different sequence. In these cases use COMP-X.
In native code programs, it is more efficient to move integer data items that are not COMP-X or COMP-5 to COMP-5 data items before doing arithmetic operations on them.
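
The last guideline above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the program name and data names (sumqty, qty-display, qty, total, loop-count) are hypothetical, and the loop simply stands in for any arithmetic that is repeated many times.

```cobol
       identification division.
       program-id. sumqty.
       data division.
       working-storage section.
      * Quantity as it might arrive from a file or screen (DISPLAY).
       01 qty-display      pic 9(5) value 12.
      * Four-byte unsigned COMP-5 items used for the arithmetic.
       01 qty              pic 9(9) comp-5.
       01 total            pic 9(9) comp-5 value 0.
       01 loop-count       pic 9(9) comp-5.
       procedure division.
       main section.
      * Convert once, outside the loop.
           move qty-display to qty
           perform varying loop-count from 1000 by -1
                   until loop-count = 0
               add qty to total
           end-perform
           display total
           stop run.
```

The conversion from DISPLAY to COMP-5 happens once, so every ADD inside the loop works on two items of the same usage and size.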

The following list shows the order of speed of processing data of different types. This applies to any size of numeric data item. The list is ordered fastest to slowest.

Data-type	Speed/Description
COMP-5	Operations on COMP-5 data items are performed as true native binary operations. Fastest performance of COMP-5 is given using the Compiler directive COMP-5"2", the default. See the chapter Directives for Compiler in your *Server Express User's Guide* for more details of this directive.
COMP-X	Operations on COMP-X data items are performed as binary operations. The data items are stored in COBOL order, which might not be the same as the native byte order. For example, on Intel 80x86 and Digital Alpha systems byte ordering is different to that on other processors. In this case, arithmetic on items longer than one byte involves operations to change the byte order before the arithmetic can be carried out, resulting in slightly slower arithmetic than on COMP-5 items.
COMP	Operations on COMP data items, by default, operate as defined in the ANSI standard. This results in truncation of the result of operations before it is stored in a COMP item. This generally results in slower arithmetic than on COMP-X data items. However, if the directives COMP and NOTRUNC are used when the program is compiled, operations on COMP items behave exactly like operations on COMP-X items.
COMP-3	Arithmetic on COMP-3 data items is performed in packed decimal and is much slower than arithmetic on COMP items. It should be avoided. In intermediate code, arithmetic on COMP-3 data items is slower than on DISPLAY items.
DISPLAY	Arithmetic on DISPLAY items is much slower than arithmetic on COMP items, and should be avoided. However, arithmetic on DISPLAY items in intermediate code is faster than on COMP-3 items, and in generated code arithmetic on DISPLAY items can be optimized if the numeric value can be stored in a four-byte COMP-5 item on 32-bit systems, or an eight-byte COMP-5 item on 64-bit systems. A four-byte COMP-5 item is also very efficient on 64-bit systems.

Addition and subtraction on non-integer data items is fastest if the items are COMP-5 and the decimal alignment is the same in both.
Mixing different usage types in the same statement is less efficient than using the same usage throughout. The main exception to this is mixing COMP-5 and COMP-X items, where there is very little impact on performance.
Use only numeric items that occupy one, two, four or eight bytes of storage and, furthermore, use the smallest of those that you can.
32-bit:
In 32-bit COBOL systems, the fastest and smallest code is produced for operations on items that contain up to nine digits, or four bytes for binary items.

Operations on numeric items containing more than nine digits, or more than four bytes for binary items, produce the slowest and largest code.

64-bit:
In 64-bit COBOL systems, the fastest and smallest code is produced for operations on items that contain up to eighteen digits, or eight bytes for binary items.
Align items on even byte boundaries. Align numeric items greater than two bytes on four-byte boundaries. Use the ALIGN"4" directive to ensure level-01 items are always aligned on such boundaries. Use ALIGN"8" for compatibility with 64-bit systems.
To ensure all items in a table are correctly aligned, check that the size of an occurrence in the table (the stride of the table) is a multiple of two, four or eight bytes as required. Pad the table as necessary. For example:
```
 01 a occurs 10.
     03 b       pic x(4) comp-5.
     03 c       pic x.
     03 filler  pic x(3).
```
A three-byte filler has been added to each occurrence in the table to ensure that the numeric data item is always aligned on a four-byte boundary.
Do not redefine COMP-5 items to access individual bytes; if access to individual bytes is required, use COMP-X.
Use edited items only when necessary and use only simple ones such as ZZ9. If possible use them in a subroutine so the total number of edited moves in your program is kept as small as possible.

1.1.2 Procedure Division Considerations

This section identifies items in the Procedure Division which affect the size and performance of your program, and suggests the most efficient ways of using them.

As a general rule, the simpler the operation, the faster it executes and the smaller the compiled code. To get the best performance it is often better to use a number of simple operations rather than one complex operation. The following are general guidelines that result in the fastest and smallest possible code.

1.1.2.1 Arithmetic Statements

To get the best performance from arithmetic statements always use the simplest forms:

Operations:
Use simple two-operand arithmetic statements wherever possible.
The following operations are optimized for COMP-5 and COMP-X data items up to four bytes long.
```
    move a to b 
    add a to b 
    subtract a from b 
    multiply a by b 
    divide a into b 
    if a condition b
```
where:

a is a numeric literal or data item up to four bytes long
b is a numeric data item up to four bytes long

On other data items, these simple operations result in faster code than more complex instructions, but the benefits are not so great as with COMP-5 or COMP-X items.

More complex forms of these instructions, involving more than two operands, might not produce code as efficient as the simple form.
Do not use the REMAINDER, ROUNDED, ON SIZE ERROR or CORRESPONDING phrases if you want the fastest performance. The CORRESPONDING phrase should be as efficient as a set of separate statements.
No optimization is done on arithmetic statements if the ON SIZE ERROR phrase is used. For this reason, we recommend you do not use this phrase if high performance is required.
The ROUNDED phrase impacts performance, but it is generally faster to use ROUNDED than try to round the result using your own routine. The only exception to this is when using the simple operations described above on COMP-5 or COMP-X items.
Do not mix items of different sizes in an arithmetic statement (for example, try to use all two-byte items or all four byte items).
COMPUTE:
COMPUTE statements are optimized, but should be kept simple where possible, especially where multiplication or division is involved.
Always use operands of the same type, preferably four-byte COMP-5. The following examples are fully optimized:
```
compute a = b * c + d
compute a = b * 4 + 2
compute a = b + c - d + e
```
In comparisons, it is often better to use a temporary item rather than a computed IF such as:
```
if a + b > c
```
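
One way to apply this advice is sketched below, using a four-byte COMP-5 temporary item. The names (tempcmp, temp-sum) and the sample values are assumptions made for illustration only.

```cobol
       identification division.
       program-id. tempcmp.
       data division.
       working-storage section.
       01 a         pic 9(9) comp-5 value 3.
       01 b         pic 9(9) comp-5 value 4.
       01 c         pic 9(9) comp-5 value 5.
       01 temp-sum  pic 9(9) comp-5.
       procedure division.
       main section.
      * Two simple two-operand statements replace "if a + b > c".
           move a to temp-sum
           add b to temp-sum
           if temp-sum > c
               display "greater"
           end-if
           stop run.
```

Each statement here is one of the simple two-operand forms listed earlier, and all operands share the same usage and size.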
Decimal Alignment:
Operations are fastest on integers. Operations on non-integer numbers are most efficient if operands have matching decimal point alignment (that is, the number of decimal positions to the right of the implied decimal point). For example:
- ADD and SUBTRACT operations are fastest if the source and target have the same decimal alignment.
- MULTIPLY operations are fastest if the decimal alignment of the target is the sum of the alignments of the sources.
- DIVIDE operations are fastest if the divisor has no fractional part and if the dividend and the target have the same decimal alignment.
Exponentiation:
Most exponentiation operations are relatively slow. No optimization is done for them. MULTIPLY and DIVIDE operations should be used instead where integer powers are involved.
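
As a sketch of that substitution, a cube can be built from two MULTIPLY statements instead of an exponentiation. The names (cubedemo, x, x-cubed) are hypothetical, and the receiving item is assumed to be large enough to hold the result.

```cobol
       identification division.
       program-id. cubedemo.
       data division.
       working-storage section.
       01 x        pic 9(9) comp-5 value 7.
       01 x-cubed  pic 9(9) comp-5.
       procedure division.
       main section.
      * Equivalent to "compute x-cubed = x ** 3".
           move x to x-cubed
           multiply x by x-cubed
           multiply x by x-cubed
           display x-cubed
           stop run.
```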
Initialization:
By default, COBOL initializes all data items in the Working-Storage Section to spaces if no VALUE clause is specified. This includes all numeric items. The effect of doing arithmetic on such an item depends on the usage of the item:
- Usage DISPLAY and COMP-3:
  Intermediate code reports run-time error 163 ("Illegal character in numeric field"). Generated code gives unpredictable results.
- Any other usage:
  Results are unpredictable.
To avoid these problems, all numeric items should be initialized to numeric values before use.
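
A minimal sketch of this practice, assuming VALUE clauses are an acceptable way to set the starting values in your program; the item names are invented for the example.

```cobol
       identification division.
       program-id. initdemo.
       data division.
       working-storage section.
      * Every numeric item is given a numeric starting value.
       01 hits        pic 9(9) comp-5 value 0.
       01 misses      pic 9(9) comp-5 value 0.
       01 unit-price  pic 9(5)v99 comp-3 value 0.
       procedure division.
       main section.
      * Safe: both items hold valid numeric data before first use.
           add 1 to hits
           add 2.50 to unit-price
           display hits
           stop run.
```

Items that cannot take a VALUE clause (for example, those in the Linkage Section) would need an explicit MOVE of a numeric value before their first arithmetic use.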

1.1.2.2 Alphanumeric Data Manipulation

The following list suggests the most efficient ways of using alphanumeric data manipulation:

Reference modified fields are optimized if coded in one of the following forms:

    item (literal:)
    item (literal:literal)
    item (literal:variable)
    item (variable:variable)
    item (variable + literal:literal)
    item (variable - literal:literal)
    item (variable + literal:variable)
    item (variable - literal:variable)

Other forms of reference modification are inefficient.

If the offset or length of the reference modification is a data item, use a COMP-5 item of the smallest optimum size (one, two or four bytes) that accommodates the range of values involved. Define it in the Working-Storage Section. For 32-bit and 64-bit COBOL systems use a four byte COMP-5 item.
In a MOVE statement, the source item should be the same size as, or larger than, the target. This prevents space-padding code being generated.
Do not use the INITIALIZE verb.
Do not use the CORRESPONDING option of the MOVE verb.
Do not use the STRING or UNSTRING verbs - they create a lot of code. For manipulating filenames use the COBOL System Library Routines CBL_SPLIT_FILENAME and CBL_JOIN_FILENAME.
If you attempt a MOVE between two numeric-edited items, the result is undefined although no error status is returned.
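
The following sketch combines two of the points above: a reference modification written in one of the optimized forms, item (variable + literal:literal), with a four-byte COMP-5 offset, and a MOVE whose source and target are the same size. All names are hypothetical.

```cobol
       identification division.
       program-id. refmod.
       data division.
       working-storage section.
       01 text-line  pic x(20) value "ABCDEFGHIJKLMNOPQRST".
       01 chunk      pic x(4).
      * Four-byte COMP-5 offset, defined in Working-Storage.
       01 start-pos  pic 9(9) comp-5 value 5.
       procedure division.
       main section.
      * Source is four bytes and so is the target: no padding code.
           move text-line (start-pos + 1:4) to chunk
           display chunk
           stop run.
```

With start-pos set to 5, the reference modification selects the four characters starting at position 6, so the program displays FGHI.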

1.1.2.3 Table Handling

The following list suggests the most efficient ways of table handling:

A subscript should be a COMP-5 item of the smallest optimum size that accommodates the range of values involved. The optimal size for a subscript is four bytes.
Subscripts for items that have the same stride and are used in consecutive statements are optimized so that they are only evaluated once. For example:
```
 01 a pic xx occurs 10.
 01 b pic xx occurs 10.
 01 c pic xx occurs 10.
 01 d pic xx occurs 10.
      . . .
     move a(i) to b(i)
     if c(i) = d(i)
         display "pass"
     end-if
```
would result in the subscript i being evaluated only once, although it is used four times in two statements. The stride of each of these tables is the same: two.
When compiling your program for use in production, use the NOBOUND directive. Use BOUND only when debugging. It causes code to be generated, every time a subscript or index is used, to check that it is within the defined bounds of the table.
If you are using USAGE DISPLAY subscripts, the BOUNDOPT directive (also switched on by NOBOUND) can give performance improvements. For example:
```
      . . .
 01 array              pic x occurs 20.
 01 array-index        pic 9(5) value 2.
      . . .
     move "a" to array(array-index).
      . . .
```
If the program is compiled without BOUNDOPT, all five digits of array-index are used to evaluate the subscript. If BOUNDOPT is specified, only the last two digits of array-index are used, as only two digits are needed to access all elements of a 20 element table.
Access to tables defined with OCCURS ... DEPENDING is less efficient than access to tables of fixed size, and so should be avoided where high performance is needed.
Bound checking on a variable-length table checks only whether the subscript or index points outside the maximum length of the table; it does not take account of the table's current length (that is, the value of the item specified in the DEPENDING phrase).

1.1.2.4 Conditional Statements

The following list suggests the most efficient ways of using conditional statements:

In IF statements, conditions within combined conditions are evaluated in the order that they occur. Therefore, you should put the conditions that are most likely to produce a false result before others. Similarly, you should put those conditions that can be evaluated fastest before slower conditions.
Comparisons using EQUALS (=) or NOT EQUAL are faster than comparisons using GREATER (>) or LESS (<). In some systems, comparisons against binary zero are more efficient than against other literals.
In both alphanumeric and numeric comparisons, have the source and target items the same size.
The selection-subject of an EVALUATE should not be an expression.
Order an EVALUATE statement so that the most commonly satisfied condition is tried first.
Use a GO TO ... DEPENDING statement if the range of possible values is fairly close. Although this construct has the disadvantage of not being particularly suited to structured programming, it is efficient.
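
The GO TO ... DEPENDING construct mentioned above might be used as in this sketch, where a code of 1, 2 or 3 selects a paragraph within the same section. The paragraph and item names are assumptions for illustration.

```cobol
       identification division.
       program-id. dispatch.
       data division.
       working-storage section.
       01 action-code  pic 9(9) comp-5 value 2.
       procedure division.
       main section.
       choose-action.
           go to do-add do-change do-delete
               depending on action-code
      * Reached only when action-code is outside 1 through 3.
           display "unknown action"
           go to finish.
       do-add.
           display "add"
           go to finish.
       do-change.
           display "change"
           go to finish.
       do-delete.
           display "delete".
       finish.
           stop run.
```

If the value of action-code does not correspond to one of the listed paragraphs, control passes to the statement following the GO TO, which gives a natural place for a default case.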

1.1.2.5 Logical Operations

A number of COBOL System Library Routines (call-by-name) are available to perform bit-wise logical operations on data items. These are described in the chapter Advanced Language Features. They perform operations such as bit-wise AND, OR and XOR.

The Compiler recognizes calls to these routines and, if possible, optimizes them to produce in-line code rather than calls to the run-time system. The calls are optimized if the length is specified as a literal. In-line code is native code which performs the function directly without making any calls. The alternative is a call to a generic run-time routine which must allow for many cases.

Logical AND and logical OR operations can also be carried out using the VALUE clause. See your Language Reference for details.
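
As a hedged sketch of such a call, the following uses CBL_AND with the length given as a literal, which is the form described above as eligible for in-line code. The item names and values are hypothetical, and the program assumes a run-time system that provides the CBL_ library routines.

```cobol
       identification division.
       program-id. anddemo.
       data division.
       working-storage section.
       01 mask   pic x(4) comp-x value 15.
       01 flags  pic x(4) comp-x value 255.
       procedure division.
       main section.
      * flags = mask AND flags, over four bytes (literal length).
           call "CBL_AND" using mask flags by value 4
           display flags
           stop run.
```

The first parameter is the source, the second is the target that receives the result, and the third is the number of bytes to process.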

1.1.2.6 The PERFORM Statement

Using PERFORM is generally very efficient, and is a very good way of keeping the size of your program down as well as giving it an easy-to-maintain structure. The following rules enable you to use it in the most efficient ways.

Put commonly used pieces of code in sections and perform them.
Apart from being good coding practice, this saves space. It is often beneficial even on single statements; for example, edited moves, subprogram calls or file I/O.
When incrementing or decrementing a counter, terminate it with a literal value rather than a value held in a data item. For example, to execute a loop n times, set the counter to n and then decrement the counter until it becomes zero, rather than incrementing the counter from zero to n.
Perform sections, not paragraphs. Put EXIT PROGRAM and STOP RUN statements at the end of the first (main) section.
The range of an out-of-line PERFORM statement should not contain the end of another perform range. If it does, the program is said to trickle; that is, execution is allowed to go past the end of a perform range. Applying the rule above ensures that this does not occur.
Only use GO TO to paragraphs within the same section.
Do not use PERFORM ... THRU statements.
For example, the following produces very inefficient code for the first PERFORM because the end of the range of the second PERFORM lies within the range of the first PERFORM.
```
     perform a thru e
     perform b thru d
     stop run
 a.
     . . .
 b.
     . . .
 c.
     . . .
 d.
     . . .
 e.
     . . .
```
Do not use ALTER statements.
The presence of an ALTER statement in a program prevents optimization of PERFORM statements.
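
By contrast, a structure that follows the rules above might look like this sketch: sections are performed rather than paragraphs, the loop counter counts down to the literal zero, and STOP RUN closes the main section. The names are invented for the example.

```cobol
       identification division.
       program-id. perfdemo.
       data division.
       working-storage section.
       01 remaining  pic 9(9) comp-5.
       01 total      pic 9(9) comp-5 value 0.
       procedure division.
       main section.
      * Count down to the literal 0 instead of up to a data item.
           move 10 to remaining
           perform add-step until remaining = 0
           display total
           stop run.
       add-step section.
           add remaining to total
           subtract 1 from remaining.
```

No perform range here contains the end of another, so the program cannot trickle.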

1.1.2.7 CALL Statements

The following list suggests the most efficient ways of using CALL statements:

If you are not using nested programs, ensure that the NONESTCALL Compiler directive is specified.
Some operating systems share the code portion of statically linked applications, resulting in greater system efficiency between multiple processes.
Try to limit the number of CALL statements a program makes, if necessary by avoiding splitting it into too many subprograms.
CALL statements that do not alter the RETURN-CODE special register, or whose effect on RETURN-CODE is of no interest, should use a calling convention of 4. The Compiler directive DEFAULTCALLS can be used to set this globally.
Calls to the COBOL system library routines that carry out logical operations, such as CBL_AND, are optimized by the Generator to actual machine logic operators, providing the parameters are eight bytes long or less and the length parameter is a literal. These too should use a calling convention of 4.
Ensure that parameters appear in the same order in the CALL statement as they do in the Procedure Division header.
Ensure that the order of parameters is the same as the order of their descriptions in the called program's Linkage Section. Ensure that any Linkage Section items which are not referenced in the Procedure Division header appear after those that do.
Ensure that all parameters are level-01 or level-77 items.

1.1.2.8 Parameters

Avoid making many references to linkage items. These include items defined in the Linkage Section, items set to CBL_ALLOC_MEM allocated memory, and items defined as EXTERNAL.

Accessing linkage items is always slower than accessing Working-Storage Section items. If a Linkage Section item is used frequently, it is faster to move it into a Working-Storage Section item when the program is entered and move it back to the Linkage Section if necessary before exiting to the calling program. The Working-Storage Section item should then be accessed throughout the program rather than the item in the Linkage Section.
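
The copy-in, copy-out approach described above can be sketched as a small subprogram. The program name and item names (tally, lk-total, ws-total, ws-count) are hypothetical, and the loop stands in for any heavy use of the parameter.

```cobol
       identification division.
       program-id. tally.
       data division.
       working-storage section.
       01 ws-total  pic 9(9) comp-5.
       01 ws-count  pic 9(9) comp-5.
       linkage section.
       01 lk-total  pic 9(9) comp-5.
       procedure division using lk-total.
       main section.
      * Copy the parameter in once, on entry.
           move lk-total to ws-total
           move 1000 to ws-count
           perform add-one until ws-count = 0
      * Copy the result back once, before returning.
           move ws-total to lk-total
           exit program.
       add-one section.
      * The loop touches only Working-Storage items.
           add 1 to ws-total
           subtract 1 from ws-count.
```

The Linkage Section item is referenced twice in total, however many times the loop runs.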

1.1.2.9 Sorting Files

If you use input and output procedures with a file sort in your program it is important that you write them efficiently as they are executed once for each record you are sorting. Inefficient input and output procedures can make the sort process appear to be very slow.

1.1.3 Compiler Directives

A number of Compiler directives can be used to make the native code for a program better optimized. Some of these directives must be used with care; ensure that the behavior you get with them is acceptable.

In general, always use the following directives when compiling your programs to native code:

NOALTER
ALIGN"4" (32-bit systems)
ALIGN"8" (64-bit systems)
COMP
NOBOUND
NOCHECK
NOCHECKDIV
NONESTCALL
NOODOSLIDE
NOQUAL
NOSEG
NOTRUNC

Other suggestions (to help prevent inefficient coding)

REMOVE "ROUNDED"
REMOVE "ERROR"
REMOVE "INITIALIZE"
REMOVE "CORRESPONDING"
REMOVE "THRU"
REMOVE "THROUGH"

By removing these reserved words you prevent the possibility that code using these inefficient constructs will be added to the program.

1.1.3.1 Using Directives to Optimize for Speed

There are many directives you can use to optimize for speed. In some cases, the defaults for Compiler directives are the ones that provide speed optimization. This section points out directives that need to be changed from their default values to provide for speed optimization.

ALTER Directive

For efficiency reasons, you should not use ALTER statements in programs. It is recommended that you avoid them altogether, and compile with NOALTER, to prevent the Compiler from having to produce code to look for them.

BOUND Directive

The BOUND directive does boundary checking on table items.

Your applications can be made faster (and smaller) by compiling with NOBOUND. Otherwise, the Compiler inserts extra code to do boundary checking on all references to table items.

During testing, you should use the BOUND directive until you are satisfied that your program is not referencing data beyond your table limits. For production, NOBOUND gives you the desired efficiency.

BOUNDOPT Directive

The BOUNDOPT directive can be used to optimize your code if the following apply:

You are using USAGE DISPLAY subscripts
You are using NOBOUND (see the BOUND directive above)

When BOUNDOPT is specified, digits in a USAGE DISPLAY subscript above the size of the table are ignored. For example, a PIC 9(3) subscript would be treated as PIC 9(2) for a table with fewer than 100 entries.

We recommend that you do not use USAGE DISPLAY subscripts.

COMP Directive

The COMP directive prevents code checking for numeric overflow. This produces highly compact and efficient code.

COMP changes the behavior of arithmetic on data items defined as USAGE COMP. It produces more efficient code, but the behavior does not conform to the ANSI standard.

If used with the proper care, COMP can improve the speed of your programs.

NESTCALL Directive

The NESTCALL directive enables nested programs to appear in your program. If you know you have no nested calls in your program, specifying NONESTCALL enables the Compiler to generate slightly more efficient code.

TRUNC Directive

The TRUNC directive causes the Compiler to create code to determine whether USAGE COMP data items need to be truncated or not.

If you are certain that you do not need truncation of USAGE COMP data items, NOTRUNC causes the creation of more efficient code.

1.1.3.2 Using Directives to Optimize for Size

There are many directives you can use to optimize for size. In some cases, the defaults for Compiler directives are the choices that provide size optimization. This section points out directives that need to be changed from their default values to provide for size optimization.

ALIGN Directive

This directive specifies the boundary on which level-01 and level-77 items start. This boundary should always be a power of two, such as two, four or eight. For 32-bit COBOL systems use a minimum of ALIGN"4". Use ALIGN"8" if you are using a 64-bit operating system. Higher powers of two retain efficiency, but increase the amount of unused space between data records.

1.2 Avoiding Data Inaccuracies

You might get unexpected results from arithmetical operations involving floating point calculations, as COBOL does not by default round numbers. For example, say WS01 is defined as a fixed-point item with two decimal places and WS02 is defined as a COMP-2 data item, and the following operations are performed:

accept ws02
compute ws01 = ws02
display ws01

If you enter a value of 2.3 when requested by the program, the value displayed will be 2.29, because 2.3 cannot be represented exactly in binary floating point and the stored value, which is slightly less than 2.3, is truncated.

If you want values to be rounded you must specify the ROUNDED phrase. The example above would be rewritten as:

accept ws02
compute ws01 rounded = ws02
display ws01
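
A self-contained sketch that contrasts the two forms is given below. It assumes that ws01 is the fixed-point receiving item and ws02 the COMP-2 item; the picture clauses and the edited item ws01-shown are assumptions added so the implied decimal point is visible when displayed. A MOVE of the literal 2.3 stands in for the ACCEPT.

```cobol
       identification division.
       program-id. rounding.
       data division.
       working-storage section.
       01 ws01        pic 9v99.
       01 ws02        comp-2.
      * Edited copy, used only to show the decimal point.
       01 ws01-shown  pic 9.99.
       procedure division.
       main section.
           move 2.3 to ws02
      * Without ROUNDED the excess digits are truncated.
           compute ws01 = ws02
           move ws01 to ws01-shown
           display ws01-shown
      * With ROUNDED the result is rounded to two decimals.
           compute ws01 rounded = ws02
           move ws01 to ws01-shown
           display ws01-shown
           stop run.
```

Comparing the two displayed values shows whether truncation affects a given input on your system.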

You can specify that all data-items in a program are rounded or truncated using the Compiler directive FP-ROUNDING. See the chapter Directives for Compiler in your Server Express User's Guide for more details of this directive.

1.3 Implementation of Floating-point on 32-bit and 64-bit COBOL Systems

Server Express provides IEEE floating-point support. The following sections describe the range of values and the accuracy available using floating-point support.

1.3.1 Range

The range of values available with each of the two COBOL binary floating-point data types is as follows:

COMP-1 from	8.43E-37	through	3.37E38
	-8.43E-37	through	-3.37E38
COMP-2 from	4.19E-307	through	1.67E308
	-4.19E-307	through	-1.67E308

Note: Although the underlying floating point support used by this COBOL system does support the above ranges, this COBOL system supports only two-digit exponents.

1.3.2 Accuracy

The following table shows the relationship between storage size and significance.

Type	Size	Significant Digits
COMP-1	4 bytes	6-7
COMP-2	8 bytes	15-16

1.3.3 External Items and Literals

The compiler validates floating point literals to ensure their values are compatible with the mainframe environment. Literal values that do not lie within the following range will cause a compile-time error:

from	0.54 E -78	through	0.72 E +76
and from	-0.54 E -78	through	-0.72 E +76

1.3.4 Inaccuracies in Floating-point

All IEEE format binary floating-point values consist of:

A sign (1 bit)
An exponent (8 or 11 bits)
A mantissa (23 or 52 bits)

If the value is non-zero, the exponent has a constant subtracted to give the power of 2 at which the mantissa starts. Thus, the value of a double precision (COMP-2) item, in terms of that starting power (m), would be:

1*(2**m) + mb0*(2**(m-1)) + mb1*(2**(m-2)) + mb2*(2**(m-3)) + ... + mb51*(2**(m-52))

where each mantissa bit (in this example, mb0 through mb51) can be 0 or 1. A value has to be approximated when the difference between the maximum and the minimum powers of 2 needed to express it is greater than the mantissa can hold: for COMP-1 (single precision) fields, this happens when the difference is greater than 23; for COMP-2 fields, when the difference is greater than 52.

For example, if a COMP-2 field is expressed with one mantissa bit multiplied by 2**56, and another mantissa bit multiplied by 2**2, then the value in the field would need to be approximated, because 56 - 2 = 54, which is greater than 52. This would be true in any floating point implementation on any hardware platform. Additionally, the internal storage format of floating point numbers can differ from operating system to operating system, which can cause results to differ between platforms; see your Language Reference for information.

Binary floating point can only represent powers of 2 (both positive and negative), and combinations of these powers of 2. For example, a fractional decimal number is represented by adding together the combination of values from the sequence 1/2, 1/4, 1/8, 1/16, 1/32, and so on, that comes closest to the required value. Thus, a value such as 0.625 (i.e., 1/2 + 1/8) can be represented exactly, whereas other values are represented by an approximation (which might be slightly above or slightly below the true value). Similarly, an integer is represented by adding together the combination of values from the sequence 1, 2, 4, 8, 16, and so on, that comes closest to the required value. Thus, a value such as 625 (which can be made up from 1 + 16 + 32 + 64 + 512) can be represented exactly, whereas other values are represented by an approximation.

1.4 Handling Large Programs

The Server Express system enables you to execute statically linked or dynamically loaded code. Statically linked code is embedded within the executable file. Dynamically loadable code is a COBOL callable shared object, or .int or .gnt file that is loaded and run only when the file is called. The code is held in a separate file to the executable file.

When designing a COBOL application program that is to be dynamically loadable, you want it to make efficient use of the available memory. It is possible, using this COBOL system, to create and run programs that use more memory than is physically present in your computer. You can do this by separating the program into smaller programs and using the COBOL call mechanism (see the chapter Calling Programs).
