
Chapter 5: Handling Protection Violation Errors

This chapter explains how your programs can produce illegal code, describes the different types of protection violation you could encounter when using this COBOL system, and shows you how to identify and eliminate them.

5.1 Introduction

This COBOL system attempts to prevent or trap potentially illegal code. However, it is still possible for your programs to generate code that is illegal in some way, and when this happens the error might be trapped as a general protection violation by the operating system, or as a COBOL protection violation by this COBOL system.

5.2 What are Protection Violations?

A protection violation occurs when a machine instruction tries to execute and is found to be illegal in the context of the currently running process. This might be because:

There are two kinds of protection violation: general protection violations and COBOL protection violations.

5.2.1 General Protection Violations

Operating systems typically allocate only a portion of a computer's resources to each process and monitor each process to see that it does not exceed its allocation. If the operating system detects a process exceeding its allocation, it takes some error action. Depending on the operating system, that action is to raise a general protection violation (also known as an exception) or to respond in some other way.

When the operating system detects a general protection violation, it runs a special program called an exception handler before abandoning the process that caused the exception. The exception handler might be one specifically associated with the program that caused the exception. Alternatively, if there is no specific handler, the operating system runs its own default exception handler.

This COBOL system provides a COBOL exception handler for UNIX. See the next section for information on COBOL protection violations.

5.2.2 COBOL Protection Violations

Some operating systems are less vigilant than others in trapping protection violations. Where appropriate, this COBOL system provides facilities to monitor a COBOL application (or non-COBOL program running under the control of the COBOL run-time system) to ensure that it does not exceed the resources allocated to it by the COBOL run-time system. A COBOL protection violation occurs when your COBOL run-time system sees a program behaving in this way.

Whether the COBOL system detects a protection violation depends on which resources are monitored, how they are monitored, and how the resources are allocated. The hardware and operating system determine how effectively the COBOL system can monitor COBOL protection violations. A COBOL program might appear to run correctly in one environment, but might fail if available resources change or the operating system changes.

If a COBOL program causes a protection violation, the information provided about the failure and the tools available to supplement this information vary between operating systems. For this reason, if you find difficulty debugging a failing program in one environment you should consider debugging it under an environment that might provide more information.

5.3 Typical Effects of Protection Violations

UNIX traps general protection violations, enabling the run-time system to give run-time system error 114 ("Attempt to access item beyond bounds of memory") or run-time system error 115 ("Unexpected signal") depending on the type of trapped protection violation.

Protection violations typically occur when a process tries to access a memory location that lies outside the memory allocated to that process for the type of access it is attempting. Clearly the process miscalculated the location it wants to access.

However, if the same process moves, is rearranged in memory, or grows in size, the memory location might eventually lie in the range of the process. In such a case, the process would not give rise to a protection violation even though we would like it to. If this is the case:

5.4 Correcting Protection Violations

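Common causes of protection violations (and run-time system errors associated with COBOL protection violations) are errors in non-COBOL portions of an application, parameter mismatches in a CALL statement, stack overflow, illegal reference modification, illegal values for pointers, subscripts out of range, and incorrect linking options or procedures.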

The following sections describe each of these problems and how to fix them.

5.4.1 Errors in Non-COBOL Portions of an Application

Because of the success of this COBOL system at detecting COBOL protection violations, it is uncommon for COBOL applications to result in general protection violations. In contrast, other programming languages, such as C, typically make heavy use of pointers, which increases the probability of causing general protection violations.

If your application fails with a general protection violation and contains a part that is written in a language other than COBOL, we recommend that you closely investigate the part of the application that is not written in COBOL. In most cases, that is where the general protection violation is caused.

5.4.2 Parameter Mismatches in a CALL Statement

The CALL statement provides a number of areas where general protection violations can occur if you make a mistake in specifying the call.

If you suspect that a CALL statement might be causing a general protection violation, you should check:

It becomes more difficult to manage such factors when one of the calling or called programs is not COBOL. As a result, protection violations are more likely in such applications. You must ensure that the non-COBOL language conforms with the default calling convention for COBOL, or COBOL conforms to the calling convention of the non-COBOL language (by using the CALL-CONVENTION clause). See the chapter The COBOL Interfacing Environment for more information on the CALL-CONVENTION clause.

The effects of a mismatch in the calling interface between two programs can be extremely varied depending on the interaction between the mismatch in question and the calling conventions used.
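
As an illustration of one kind of mismatch, the following sketch shows a calling program and a called program that disagree about the size of a parameter. The program names caller and callee and the data names short-item and long-item are hypothetical and are used only for this illustration. Because the parameter is passed by reference, the MOVE in callee writes beyond the end of the 4-byte item that caller owns. Depending on what lies beyond that item, this might corrupt other data or cause a protection violation.

```cobol
 program-id. "caller".
 working-storage section.
 01 short-item   pic x(4).
 procedure division.
*> passes a 4-byte item by reference
     call "callee" using short-item
     stop run.
 end program "caller".
 program-id. "callee".
 linkage section.
*> mismatch: describes the same parameter as 20 bytes
 01 long-item    pic x(20).
 procedure division using long-item.
*> writes 16 bytes beyond the end of the caller's short-item
     move all "*" to long-item
     exit program.
 end program "callee".
```

Describing the parameter with the same size in both programs removes this particular mismatch.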

5.4.3 Stack Overflow

A less obvious cause for a general protection violation is stack overflow. The stack is the system area in which the parameters passed to a called program and the calling program's return address are temporarily stored for a CALL statement. Stack overflow occurs when the level of nesting of called subprograms, or the size or number of passed parameters, is greater than the stack can manage.

Stack overflow can occur due to excessive use of the stack, or because the default or requested stack size is small. You should bear in mind that operating system support routines might also make use of the stack and reduce its effective size, and that different versions of the same operating system can make different use of the stack, thus varying its effective size.

If one of your programs causes a general protection violation when you run it on a different version of an operating system, you could consider increasing the default stack size. Stack overflow is very rarely a problem on UNIX environments. If you suspect you need to change the size of your stack on UNIX, see your UNIX system documentation for details of how you can do this.

5.4.4 Illegal Reference Modification

Using reference modification you can easily reference an area of memory that lies outside the named data item. The results of illegal reference modification depend on the location of the area of memory that your program accesses. If the area of memory lies:

Example:

In the following example, the reference modification results in run-time system error 114 ("Attempt to access item beyond bounds of memory") when run, because a is set to a value greater than the length of b. The first use of the reference-modified item is as a source field, so it would result in incorrect data being moved into c. This problem is detected by this COBOL system.

The second use of the reference-modified item is potentially more dangerous, as the reference-modified item is a target field. Again, this COBOL system detects this problem and produces run-time system error 114 ("Attempt to access item beyond bounds of memory").

This example is used in the following sections to illustrate how to debug the program in various environments.

 program-id. "buggy".
 working-storage section.
 01 a   pic 9(6).
 procedure division.
     move 999999 to a
     call "bug" using a
     stop run.
 end program "buggy".
 program-id. "bug".
 working-storage section.
 01 b        pic x(20).
 01 c        pic x.
 linkage section.
 01 a    pic 9(6).
 procedure division using a.
     move b(a:1) to c
     move "1" to b(a:1)
     exit program.
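
One way to avoid the error, sketched here rather than prescribed, is to test the start position before it is used for reference modification. The program name bug-checked and the message text are hypothetical. The test uses LENGTH OF, on the assumption that this extension is enabled in your dialect; the literal 20 could be used instead.

```cobol
 program-id. "bug-checked".
 working-storage section.
 01 b        pic x(20).
 01 c        pic x.
 linkage section.
 01 a    pic 9(6).
 procedure division using a.
*> use a as a start position only when it lies inside b
     if a >= 1 and a <= length of b
         move b(a:1) to c
         move "1" to b(a:1)
     else
         display "reference modification skipped, a = " a
     end-if
     exit program.
```

With the value 999999 passed from buggy, the test fails and neither MOVE statement is executed.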

5.4.5 Illegal Values for Pointers

Data items that you define with usage POINTER or PROCEDURE-POINTER must contain valid addresses when they are used. Using a pointer that does not contain a valid address results in run-time system error 114 ("Attempt to access item beyond bounds of memory"), because this COBOL system traps the error that would otherwise result in a general protection violation.

Example:

In the following example, data item undefined-pointer is defined as a pointer to a procedure, but is not given a value before it is used. As a result, the CALL statement fails, giving run-time system error 114 ("Attempt to access item beyond bounds of memory").

 working-storage section.
 01 undefined-pointer    usage procedure-pointer.
 procedure division.
     call undefined-pointer
     stop run.
         

5.4.6 Subscript Out of Range

The values of subscripts and index items are checked by this COBOL system, with run-time system error 153 "Subscript out of range" given if they are out of range. However, checking subscripts and index items decreases the performance of your application to some extent. To avoid this decrease in performance you can use the NOBOUND directive to disable such checking. If the NOBOUND directive is used and a subscript or index item is out of range when referencing a table, a protection violation might occur.

If you get unexpected results or a protection violation error in an application that has been compiled or generated with the NOBOUND directive, you should recompile the program without the directive and rerun it to determine if the fault is caused by a subscript error.

Example:

The following programs show how you can experience unexpected results as a result of inadvertently using an out of range subscript when the NOBOUND directive is specified. Because of the value passed from Bound-main, Bound-sub moves "1" to the 21st element of an array that is only 20 elements in size. Because the NOBOUND directive is specified, this COBOL system permits this operation, which overwrites the next data item, c. If the NOBOUND directive were not set, the COBOL system would have given run-time system error 153 ("Subscript out of range") when the MOVE was performed.

 program-id. "bound-main".
 working-storage section.
 01 a   pic 99.
 procedure division.
     move 21 to a
     call "bound-sub" using a
     stop run.
 end program "bound-main".

$set align(2) nobound
 program-id. "bound-sub".
 working-storage section.
 01 b    pic x occurs 20.
 01 c    pic xx.
 linkage section.
 01 a    pic 99.
 procedure division using a.
     move "1" to b(a)
     exit program.
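
If you need to keep the NOBOUND directive for performance reasons, one option is to test the subscript explicitly wherever its value comes from outside the program. The following is a sketch only. The program name bound-sub-checked and the message text are hypothetical.

```cobol
$set align(2) nobound
 program-id. "bound-sub-checked".
 working-storage section.
 01 b    pic x occurs 20.
 01 c    pic xx.
 linkage section.
 01 a    pic 99.
 procedure division using a.
*> explicit range test, because NOBOUND removes the run-time check
     if a >= 1 and a <= 20
         move "1" to b(a)
     else
         display "subscript out of range: " a
     end-if
     exit program.
```

With the value 21 passed from bound-main, the MOVE is skipped and c is not overwritten.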

5.4.7 Incorrect Linking Options or Procedures

A number of linking options can indirectly cause a general protection violation if used incorrectly. Problems commonly occur when the wrong objects or libraries are linked, or when the correct objects or libraries are linked but in incorrect versions.

If you suspect that this might be the case, check that old versions of objects or libraries have not been left in search paths and that new versions have been installed correctly. See the chapter Linking to System Executables in your Server Express User's Guide for information on linking with this COBOL system.

5.5 Debugging Techniques

The techniques you use to debug a program depend on how the program is loaded or built.

You can use Animator to determine at which line of the source code a protection violation occurred in intermediate and generated code files, callable shared objects and system executables. However, the Animator information (.idy) files and source files must be available.

If you need to debug an application for which the source and Animator information (.idy) files are not available, you need to use FaultFinder. For information on using FaultFinder, see the chapter FaultFinder in your Debugging Handbook.

For detailed information on debugging, see your Debugging Handbook.

5.5.1 Debugging During Development

The easiest method of locating the source line that is causing a protection violation is to use just-in-time debugging. To enable just-in-time debugging you set the run-time tunable debug_on_error.

When you have set the debug_on_error tunable you execute your program. When the protection violation occurs, Animator is loaded and the line of source code containing the error is highlighted. You have full control of your application within Animator and can use any Animator functions to determine the problem.

You do not have to use just-in-time debugging; if you know at which line in your source code the protection violation occurs, you could load your program directly into Animator and zoom to that line.

5.5.2 Debugging a Production Application

If an application in a production environment or at a user's site is producing protection violation errors, you need to use core-dump debugging. You enable this using the run-time tunable core_on_error.

When core-dump debugging has been enabled in the application at the production or user's site, you execute the application. When the protection violation occurs, the run-time system creates a core-dump file (rather than giving run-time system error 114 or 115). You then obtain a copy of the core-dump file and use Animator on your development system to view it.

You use Animator with a core-dump file by specifying it as the input file; for example:

anim corefile

Animator loads the core-dump file and source file and highlights the line at which the protection violation occurred. When you use Animator with core-dump files, not all Animator functions are available. This is because the core-dump file is a snapshot of the application at the point at which the protection violation occurred; you cannot, therefore, execute any of the application. You can use query functions to determine the value of data items, and you can also use the Perform/Call stack view to change to a different stack context. This enables you to backtrack through the execution path of your application and query any data items that control the application flow.


Note: When a core-dump file is produced, the run-time system itself cannot clean up COBOL file buffers or free system resources. This could lead to data file corruption that might not have occurred if no core-dump file was produced.



Copyright © 2000 MERANT International Limited. All rights reserved.
This document and the proprietary marks and names used herein are protected by international law.
