Chapter 9: Performance and Troubleshooting

This chapter provides guidelines for improving system performance. Also included is an alphabetical listing of the error messages that can occur during the compilation of your program. Each message is linked to an explanation and a recommended recovery procedure.

Performance Issues

Adding a relational database management system (RDBMS) brings significant complexity to any computer system. A few key guidelines may help to improve the performance of the system and prevent problems as the database grows.

Guidelines

The following sections describe some areas for you to consider as you prepare your database and your COBOL program.

Guideline 1 — Database Administrator

A site with a database management system needs a database administrator (DBA).

The administrator is critical in any large database installation. The DBA checks performance statistics and memory usage, performs general maintenance and backup for the database, initiates traces, executes database utilities, sets buffer sizes, determines how often buffers are flushed, and, in general, understands how database settings and events affect overall performance.

The DBA also performs periodic tuning of the database.

If a site experiences a performance problem, this usually means there is a bottleneck somewhere in the system. The DBA can help to isolate the bottleneck. Once the bottleneck is identified, the site can determine whether to apply more resources or correct the situation that is stressing the existing resources.

Guideline 2 — Understand COBOL Operations and Database Operations

Some COBOL operations are particularly stressful to a database. The more an application uses these operations, the more likely it is to slow the performance of the RDBMS.

The more you understand about your RDBMS and how it operates, the more you can help your COBOL applications to work efficiently with it.

File Input and Output

Consider these standard COBOL I/O operations:

OPEN
CLOSE
READ
WRITE
REWRITE
DELETE
START

Each has an associated cost in terms of database performance. Each asks the database to do something that takes time. So if any of the I/O operations in your COBOL application are optional, you may want to remove them.

For instance, let's examine file OPENs.

Developers sometimes OPEN and CLOSE files unnecessarily, using the OPEN–CLOSE pairing as a way to demarcate each new procedure:

OPEN I-O file-A
    *> procedural code
CLOSE file-A

But it's important to know that file OPENs are expensive in terms of performance. If you can eliminate non-essential OPENs from portions of your application, you can probably make an improvement in processing speed.
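
For example, you might open the file once at the start of the run and close it once at the end, instead of once per procedure. The following is a minimal sketch only; the file, paragraph, and flag names are illustrative:

MAIN-LOGIC.
    OPEN I-O file-A
    PERFORM PROCESS-ORDERS
    PERFORM PROCESS-RETURNS
    CLOSE file-A
    STOP RUN.

PROCESS-ORDERS.
    *> procedural code that reads and rewrites file-A records;
    *> no OPEN or CLOSE is needed here
    PERFORM READ-NEXT-ORDER UNTIL orders-done.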

READ operations can also affect performance. COBOL indexed I/O is based on record keys. Examining the output of your query optimizer lets you determine whether the most efficient execution path is being used for READs. The execution path for a given query can change as your data changes and as the tables grow. It is also affected by database initialization parameters and by any statistical information you have gathered on your tables. It might be helpful to know that, typically, the index path is the most efficient for Database Connectors™ applications.
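
For example, with the Oracle interface you could examine the execution path of a generated query using Oracle's EXPLAIN PLAN facility. This is a sketch: it assumes the DBMS_XPLAN package available in recent Oracle releases, the table name matches the warehouse example later in this chapter, and other RDBMSs provide equivalent commands:

EXPLAIN PLAN FOR
SELECT * FROM warehouse_items WHERE aisle = 17;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

If the plan shows a full table scan where you expected an index range scan, gathering fresh statistics or restructuring an index may restore the index path.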

Transactions

Large transactions are also very expensive. The main problem here is that the database will hold locks on indexes and rows throughout an entire transaction. Thus, the database is creating additional overhead for an extended period of time if the transaction is lengthy.

In addition, complex information tracking must take place to ensure that transactions can be successfully rolled back.

Often application designers decide to err on the side of safety when applying transaction management to a mature application. Which operations should be included in a single transaction? The safe approach is to group everything that is related into one transaction. But this scheme is expensive — and even more so when a database is involved. The lengthier the transaction, the longer the locks are held and system resources are tied up. The extensive data verification in COBOL programs only prolongs this.

If performance is an issue, give some thought to dividing each transaction into smaller and more efficient subgroups.
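
If your runtime supports explicit transaction verbs, the change can be as simple as committing after each small group of related updates instead of at the end of the run. A minimal sketch, assuming ACUCOBOL-GT-style START TRANSACTION and COMMIT verbs and an illustrative batch size of 100:

PERFORM UNTIL end-of-input
    START TRANSACTION
    *> apply one small, self-contained group of related updates
    PERFORM APPLY-ONE-UPDATE
        VARYING update-count FROM 1 BY 1
        UNTIL update-count > 100 OR end-of-input
    COMMIT
END-PERFORM

Shorter transactions mean locks are held for less time, at the cost of designing the recovery logic so that each subgroup can be safely re-run or rolled back on its own.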

Tables with Multiple Indexes

If you use tables with multiple indexes, keep in mind that when a record is written or updated, locks are placed on all of the indexes, and all of them are essentially rewritten during the course of the write or update. This is a costly process: there may be multiple columns per index and multiple indexes per table, and each rewrite implies a certain amount of wait time. Tables with a large number of indexes can therefore be slow on writes and updates, and the extra indexes can also mislead the database query optimizer.

There are two things you can do in this circumstance:

Restructure the data so that the indexes are less costly to maintain.
Move some of the data into the reUZE Developer indexed file system.

Restructuring the Data

The benefits of data restructuring may be significant. For example, if you have any situations in which two indexes start out with the same column or set of columns, you may be able to improve performance appreciably by changing your data definition.

Suppose two indexes both start with MONTH, DAY, YEAR. These identical first three columns can cause the RDBMS's query optimizer to choose the wrong index, in which case you will generate a significant amount of unnecessary and unproductive overhead. Restructuring one of the indexes can make a difference.
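
For example, if the indexes come from ACUCOBOL-style split keys, the change might look like this (a sketch; the key and field names are illustrative):

*> Before: both alternate keys begin with the same three columns
ALTERNATE RECORD KEY IS order-key-1 = ord-month, ord-day, ord-year, region
ALTERNATE RECORD KEY IS order-key-2 = ord-month, ord-day, ord-year, product

*> After: reordering one key removes the ambiguity, so the
*> optimizer can tell the two indexes apart
ALTERNATE RECORD KEY IS order-key-2 = product, ord-month, ord-day, ord-year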

Using reUZE Developer Files

If you cannot restructure your data but are finding certain operations to be too expensive, you might want to consider moving some data into the reUZE Developer indexed file system.

Guiding the Data Searches

You can guide the data searches that result from database queries, and thus improve performance, by making use of an external variable called A4GL_WHERE_CONSTRAINT. This process is explained in the section "The WHERE Constraint."

Related Topic:

The WHERE Constraint

Guideline 3 — Program and Database Interaction

A database can interact with your COBOL program in surprising ways. When you introduce a database to a mature COBOL application, your application may experience errors you have not seen before. Being aware of this possibility helps you to prepare your response.

For example, your existing application without the database may rarely exceed the limit on the number of data files that can be open simultaneously. When you add a database, however, you increase that likelihood significantly, because query processing often requires temporary tables. Also, the ORDER BY clause (used in SQL statements to sequence retrieved data) opens temporary work files, so you may find that these files cause you to reach the limit sooner. (Note that proper tuning of the query optimizer can reduce the number of temporary files required.)

When you upgrade to a new version of the RDBMS, be careful. The new software components may interact with your applications differently than their predecessors did. This is to be expected. It's important to rerun your pre-installation simulations (see Guideline 4 — Plan for Growth) to determine whether your system resources are still adequate. Investigate any differences between the two simulations; you may have to make adjustments to compensate for the differences between the RDBMS versions.

Upgrading or switching to a new database may also mean that you need to modify your makefile to coordinate with the settings of the new database. You can edit your makefile manually, or use one of the editing tools distributed with Database Connectors. For more information, see the appendix specific to the Connector interface you are using.

If you notice a change in performance with a new release of your RDBMS, keep in mind that certain database settings can have a significant effect on speed. Fine-tuning your settings can make a difference.

Other options that can help you improve performance, such as the WHERE constraint, are described later in this chapter.

Guideline 4 — Plan for Growth

We cannot emphasize enough the importance of planning ahead for growth. You need to be able to predict the system resources that your application will require when it reaches a full load, both in terms of users and database size.

Before you choose and install hardware, it is best to run a simulation. Your hardware vendor or RDBMS vendor can help you to simulate a large number of users on a given platform with a given RDBMS.

Setting up and running a simulation that includes your own application does cost money. But if you are moving into an installation of any size, the consequences of not knowing what to expect can be far greater than the cost of the simulation.

A potentially costly mistake is to test in your office with a small database and a small number of users. Databases respond differently to small amounts of data and a small number of users than they do to large amounts of data and a large number of users. Functionally, the database works differently as the load increases. You might think of this adjustment as switching into a different gear. Significant changes in the load on your database can lead to large increases in overhead. The behaviors and loads you will encounter as the database expands cannot be determined from a linear projection based on the smaller scenario.

The WHERE Constraint

The Database Connectors WHERE constraint is an external variable that gives the developer some control over the data searches that result from database queries. It can help to improve performance in some situations. This section describes the purpose of the WHERE constraint and shows how it is implemented.

COBOL data processing is based on keyed READ operations following a positioning operation. Records are read until the key value being processed changes. Because traditional COBOL data processing is based on a B+ tree file system, the overhead for such operations is relatively minor.

RDBMS data processing introduces a new level of complexity. The database's query optimizer receives the SQL query for the COBOL operation being performed and then builds a working set of data that satisfies that query. Because the optimizer has many different possible execution methods, performance can suffer if it chooses a query execution path that is less than optimal.

Performance degradation may also occur because the queries generated by COBOL operations are unbounded index queries. Unbounded queries are generated because COBOL positioning operations (Start Not Less Than and Start Not Greater Than) provide only one of the bounding conditions for the working set, instead of both an upper and a lower boundary.
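
For instance, in the warehouse example that follows, the positioning operation might look like this (a sketch; the file, key, and field names are illustrative). Notice that the START supplies a lower bound only; nothing in the statement tells the database where the working set should end:

MOVE 17 TO aisle
MOVE 8 TO shelf
MOVE 2 TO bin
START warehouse-file KEY IS NOT LESS THAN item-key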

As an example, consider the case in which an application needs to process all items in a warehouse on aisle 17, shelf 8, and bin 2. If each of these items is a field in a key, the COBOL program might generate the following query for a READ operation:

Note: The following example applies to the Database Connector for Oracle product. The SQL generated will be different for the different interfaces.

SELECT * FROM warehouse_items WHERE
   aisle = 17 and
   shelf = 8 and
   bin = 2
ORDER BY aisle ASC, shelf ASC, bin ASC;

This query achieves the desired result but has one problem. For the COBOL program to end its processing, it must read a record that has a new value for "bin". The COBOL application has no way of specifying an upper boundary for the read operation, so when all rows of data from bin 2 have been read, the Connector will attempt to read the next record by generating the following query:

SELECT * FROM warehouse_items WHERE 
   aisle = 17 and 
   shelf = 8 and 
   bin > 2 
ORDER BY aisle ASC, shelf ASC, bin ASC;

This query will cause the database query optimizer to gather all records pertaining to items on the remainder of shelf 8 to build its working set. This is excessive from the COBOL application's point of view, because the COBOL program needs only the first record of the working set to determine that it has finished processing.

This problem can be even more serious if the application is processing the last bin on a shelf. Because there are no more bins on that shelf, the query would drop down a level and generate the following:

SELECT * FROM warehouse_items WHERE 
   aisle = 17 and 
   shelf > 8 
ORDER BY aisle ASC, shelf ASC, bin ASC;

This would select all items on the remainder of that aisle of the warehouse, which could be a very large working set if each aisle had 130 shelves!

In reality, most of the time the database query optimizer will not build the entire working set if it has been properly tuned, but will instead perform INDEXED READS to process the query. This means that the query optimizer will traverse an index tree to retrieve the records, much as COBOL index files do, as opposed to using combinations of indexes and sort and merge operations.

It can be helpful for the COBOL developer to influence precisely which information is to be returned. If the application developer knows at compile time (or before the query is executed) the precise scope of the record processing that needs to be performed by the read operations, the developer can more finely tune the information being retrieved.

Database Connectors provides a method by which the COBOL programmer can provide additional information to the database query optimizer by providing more specific selection information to the Connector. This selection information is added to the WHERE clause of the SQL queries generated by the Connector. This can be particularly useful in providing upper boundaries to queries being generated, with the result that the working set is smaller.

The developer may provide upper boundaries on the key segments for a select, or any other selection criteria needed to constrain the working set to just the desired subset of rows. This additional information is added to generated queries with the AND condition. It is not possible for the application developer to specify a larger working set than would otherwise have resulted. The developer may only constrain the working set to a smaller subset.
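
Returning to the warehouse example, the program could supply the missing upper bound before its reads begin. The following is a minimal sketch only: it assumes that A4GL_WHERE_CONSTRAINT is declared as an EXTERNAL alphanumeric data item, and the size shown is illustrative; see your Database Connectors Help for the authoritative declaration:

WORKING-STORAGE SECTION.
*> assumed declaration; check your Help for the exact picture
77  A4GL_WHERE_CONSTRAINT  PIC X(256)  IS EXTERNAL.

PROCEDURE DIVISION.
PROCESS-BIN-2.
    *> constrain the generated queries to bin 2 only
    MOVE "bin <= 2" TO A4GL_WHERE_CONSTRAINT
    PERFORM READ-BIN-ITEMS UNTIL bin-done
    *> clear the constraint so that later queries are unaffected
    MOVE SPACES TO A4GL_WHERE_CONSTRAINT.

With the constraint in place, the trailing query for bin 2 ("bin > 2" ANDed with "bin <= 2") returns no rows instead of gathering the rest of shelf 8, so the working set stays small.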


Troubleshooting

The remainder of this chapter lists the error messages that can occur during compilation. You can find the recommended recovery procedure for each situation by following the link for that error message.

Note: It is possible that you will experience slower performance simply because the Database Connectors application is constrained by the rules of COBOL record processing. If you want to keep track of performance levels, many third-party tools are available to help you monitor performance.

Compiler Errors

The errors listed below could occur when you compile with the CREATEXFD directive. In some cases, the eXtended File Descriptor (XFD) cannot be built until you remove the error condition. (XFD errors do not, however, prevent the object code from being generated.) For information about XFDs, see the chapter XFDs.

For detailed information about a particular error message, go to Reference > Error Messages in your Database Connectors documentation.

Bad picture for DATE: keyname
Data missing from key segment keyname
Directive word too long: keyname
GROUP expected after USE
Missing '=' in XFD directive
Missing field name after WHEN
xxx: unknown XFD directive
Value should be a name: xxx
Value should be numeric: xxx
Value should be a literal: xxx
Variable file name requires File directive
WHEN variable xxx not found in record


Compiler Warnings

The Compiler may also generate the following warning messages.

For detailed information about a particular error message, go to Reference > Error Messages in your Database Connectors documentation.

xxx not unique in first 30 characters
Field xxx causes duplicate database data

Run-time System Errors

You can determine the meanings of your database error codes by referring to the database documentation.

You can see run-time system errors by using the FILE_TRACE ACUFH configuration variable with the Consolidated Tracing Facility (CTF).

Unless noted otherwise, this method applies to all RDBMSs supported by the Database Connectors family of interfaces. For a listing of run-time system errors, refer to the Troubleshooting section of the appendix that pertains to your product.

Related Topic:

FILE_TRACE configuration variable

Using the FILE_TRACE ACUFH Configuration Variable

At run time, if you specify a CTF trace file and set the FILE_TRACE ACUFH configuration variable, the run-time system puts the extended error code and some text associated with the error into the error file.
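
A minimal setup might look like the following sketch. The trace level and CTF settings shown are assumptions, not authoritative values; see Introduction to the Consolidated Tracing Facility in your Help for the exact syntax:

# In the ACUFH configuration file: enable file tracing
# (higher values produce more detail)
FILE_TRACE 9

# In the CTF configuration file (named by the MFTRACE_CONFIG
# environment variable): route trace output to a text file
mftrace.dest = textfile
mftrace.level = info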

The text of the error would then have this format in the file:

*** File system error value = 3 ***
*** Dictionary (.xfd) file not found ***
File error 9D,03 on filename
Dictionary (.xfd) file not found

Occasionally you may receive an error message that indicates a syntax error. If you receive this error code, you can examine the CTF trace file to determine the cause of the problem.


Retrieving Messages Using the CTF

Use the CTF to trace diagnostic information in ACUFH. (CTF is described in detail in the reference topic Introduction to the Consolidated Tracing Facility under General Reference in your Help.)



Copyright © 2009 Micro Focus (IP) Ltd. All rights reserved.