IMS Syncpoint Coordination

This chapter describes the syncpoint coordination that IMS Option provides for its own databases and for SQL systems. It focuses on:

Commit protocol used
Resources that IMS Option manages
Failures that may occur
Information available to identify and help recover from potential problems

4.1 Overview

Syncpoint control is a complicated task when accessing resources or databases which reside on multiple machines. It is further complicated by the variety of networks, communications protocols, operating systems and products available. Many of the issues described in this chapter are inherent in managing resources distributed on multiple machines and are not unique to IMS Option.

When using IMS Option for database support with the Micro Focus CICS Option, the CICS system provides syncpoint coordination. See your CICS documentation for a description of its processing.

4.2 Terminology

Term	Description
Rollback and Backout	These two terms are commonly used to describe the same processing. This chapter follows the convention of the IMS/ESA and IBM SAA Resource Recovery manuals and uses the term Backout; other chapters use Rollback.
Syncpoint	A point when recoverable resources are synchronized. They can be synchronized by either commit or backout processing.
Recoverable resource	A resource which can perform dynamic backout. It may provide forward recovery and/or recovery from syncpoint failures. A database system is a common example of a recoverable resource.
Unit-of-work	Work which is performed against a recoverable resource between syncpoints. "Work" is anything which affects the status of the resource. For example, setting a lock by issuing a retrieval call to a database system would be work.
Syncpoint Coordinator	A function which controls syncpoint processing. It sends syncpoint requests to recoverable resources and manages the responses to these requests. It may provide recovery for some or all syncpoint failures. It is responsible for notifying the application program or user of results of syncpoint processing.
In-flight and in-doubt	A unit-of-work is in-flight until syncpoint processing begins. Once syncpoint processing begins, the unit-of-work is in-doubt until syncpoint processing completes and the results are returned to the application program or user.
Mixed outcome	The result when a syncpoint involving multiple recoverable resources within one unit-of-work does not complete, so that some resources might have been committed and others backed out.

4.3 Recoverable Resources

IMS Option supports several different types of processing for its databases. The Fileshare Database subsystem is one recoverable resource. Remote IMS Databases are another recoverable resource. The single-user, exclusive use databases are not recoverable resources. The shared read-only databases are not, strictly speaking, recoverable resources. Recoverability is not required for shared read-only databases since they cannot be updated by anyone and record locks are not required.

Exclusive use databases are committed when the database calls are issued and are not dynamically backed out. You may mix Exclusive use databases with other databases in the same unit-of-work. However, any updates are not synchronized. You need to consider this when selecting the type of processing for a database.

IMS Option notifies a supported SQL/DB2 system of explicit and implicit syncpoints. An SQL system is a separate recoverable resource. IMS Option only notifies the SQL system of the syncpoint; the SQL system must process the request. See the SQL product documentation for any steps required to enable support for commit and dynamic backout.

4.4 Syncpoint Coordinator

A Syncpoint Coordinator is required for managing syncpoints with recoverable resources. Other vendors may use different terms, but the function is still required. It may be simple or extremely complex. It is always complex when accessing multiple resources or when accessing resources across networks. When your application runs under a transaction processing system, such as IMS Option or CICS Option, that system coordinates syncpointing for your application. Otherwise, your program has to perform this function.

Your application program can be its own Syncpoint Coordinator. For example, you can add a COMMIT statement in an SQL program to cause SQL commits. The application program must report the success or failure of the commit to another program or the end user. Also, your program may have to determine the difference between a failed commit which means "backout was successful" and one which means "commit status unknown". The possible results for a commit depend on the actual SQL system or other product you are using. You need to understand the processing that the system uses to determine possible outcomes and corrective actions.
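The idea of a program acting as its own Syncpoint Coordinator can be sketched as follows. This is only an illustration: it uses Python's standard sqlite3 module as a stand-in for an SQL system (it is not one of the SQL products supported by IMS Option), and the function name and the three outcome strings are assumptions made for the example. The possible outcomes for a real SQL system depend on that product.

```python
import sqlite3

def commit_unit_of_work(conn, statements):
    """Run the statements as one unit-of-work and report the outcome.

    Returns "committed", "backed out" or "unknown" (hypothetical labels).
    """
    try:
        for sql, params in statements:
            conn.execute(sql, params)
        conn.commit()
        return "committed"
    except sqlite3.Error:
        # The work or the commit failed: try to back out.
        try:
            conn.rollback()
            return "backed out"
        except sqlite3.Error:
            # Neither commit nor backout could be confirmed.
            return "unknown"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.commit()

ok = commit_unit_of_work(conn, [("INSERT INTO t VALUES (?)", (1,))])
bad = commit_unit_of_work(conn, [
    ("INSERT INTO t VALUES (?)", (2,)),
    ("INSERT INTO missing VALUES (?)", (3,)),  # table does not exist
])
print(ok, "/", bad)
```

The caller still has to report the returned status to another program or to the end user, and decide what corrective action an "unknown" result requires.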

4.5 Commit and Backout

An implicit backout can occur in a number of specific situations. The general rule is that an implicit backout occurs whenever a program ends without reaching a syncpoint. Some examples of the situations which cause implicit backout are:

When a program causes an abend, such as a 261 Abend indicating an invalid parameter count, IMS Option displays a popup window describing the error. You are given the choice of ignoring the abend or terminating. Selecting termination causes a backout and termination. Ignoring the error causes a return to your application as if the call succeeded.

For other types of program abends, such as a Micro Focus RTS error 163 (numeric validation failure), whether or not the program abends depends on whether you are debugging the program which caused the error. If you are debugging the program, the debugger stops on the statement in error so that you can correct it; this does not cause a backout or an abend. If the failing program is a GNT program or you are running the program instead of debugging, an abend and backout occur.

4.6 IMS Option Two-step Commit

The IMS Option Syncpoint Coordinator uses a two-step commit protocol. All resources registered with the Coordinator are sent commit requests. The sequence for commit depends on whether the resource supports a two-step sequence or not. For two-step resources, the Coordinator issues a "Prepare" to each resource prior to sending the second step "Commit". If all Prepares succeed, commit processing begins. If any prepare fails, backout starts. When a resource responds negatively to the prepare, it is stating that it cannot complete a commit but it can complete backout or has already backed out. These prepare and commit steps are basically the same steps which occur with a two-phase commit protocol.
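The sequence described above can be sketched as a small model. This is not the IMS Option implementation: the Resource class, its fail_on switch and the result strings are hypothetical and exist only to show the order of the Prepare, Commit and Backout requests. The sketch also assumes the Coordinator stops sending Prepares at the first negative response.

```python
class Resource:
    """Hypothetical recoverable resource, for illustration only."""

    def __init__(self, name, two_step=True, fail_on=None):
        self.name = name
        self.two_step = two_step
        self.fail_on = fail_on  # "prepare", "commit", "backout" or None
        self.state = "in-flight"

    def _do(self, step, new_state):
        if self.fail_on == step:
            return False  # negative response to this request
        self.state = new_state
        return True

    def prepare(self):
        return self._do("prepare", "prepared")

    def commit(self):
        return self._do("commit", "committed")

    def backout(self):
        return self._do("backout", "backed out")


def syncpoint(resources):
    """Two-step commit: Prepare the two-step resources, then Commit all."""
    # Step 1: only resources that support two-step receive a Prepare.
    for r in resources:
        if r.two_step and not r.prepare():
            # A failed Prepare starts backout for every resource.
            backed_out = all([x.backout() for x in resources])
            return "backed out" if backed_out else "mixed outcome"
    # Step 2: every resource, single-phase or two-step, receives a Commit.
    committed = all([r.commit() for r in resources])
    return "committed" if committed else "mixed outcome"


fileshare = Resource("fileshare")
sql = Resource("sql", two_step=False)
print(syncpoint([fileshare, sql]))
```

Setting fail_on="prepare" on one resource makes syncpoint() return "backed out", and fail_on="commit" makes it return "mixed outcome", which mirrors the cases described in the sections that follow.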

This is called two-step to distinguish it from the two-phase commit support in IMS/ESA. Two-step shares some of the same benefits as the IMS/ESA two-phase support but does not provide recovery from failures which may occur during backout processing, during the second step of commit processing or failures to the Syncpoint Coordinator itself.

The main benefit of two-step over a "single-phase" commit protocol is that it eliminates the integrity window between the time the application last accessed a resource and the time commit processing begins. The Prepare step "catches" any failures which occur during this interval and begins backout processing. Contrast this with two separate resources using a single-phase protocol. In single-phase there is no Prepare phase, only a commit or backout. If the first commit succeeded but the commit to the second resource failed, you could have a mixed outcome.
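The contrast can be shown with a deliberately simplified model. Each resource is reduced to a flag saying whether it is still healthy when syncpoint processing starts; the function names and result strings are assumptions made for this sketch, and it assumes that a resource which cannot commit can still back out, as described for the Prepare step.

```python
def single_phase(resources):
    """Commit each resource in turn; there is no Prepare step."""
    return {name: ("committed" if healthy else "failed")
            for name, healthy in resources.items()}


def two_step(resources):
    """Prepare every resource first; commit only if all Prepares succeed."""
    if all(resources.values()):
        return {name: "committed" for name in resources}
    # A failed Prepare causes backout of every resource.
    return {name: "backed out" for name in resources}


# The second resource failed after the application last accessed it.
resources = {"first": True, "second": False}
print(single_phase(resources))  # first is committed, second is not
print(two_step(resources))      # both resources are backed out
```

With single-phase, the first resource is already committed by the time the failure of the second is discovered. With two-step, the failure is discovered before anything is committed, so the resources stay synchronized.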

The Remote IMS Requester product and the IMS Option Fileshare Database subsystem both register as two-step resources with the IMS Option Syncpoint Coordinator. The SQL products supported by IMS Option are only sent the second step commit or backout request, as only a single-phase process is supported for SQL.

4.6.1 Failure Reporting

The Syncpoint Coordinator displays a popup window for any syncpoint failure it detects. The popup window describes the general result of the syncpoint. A syncpoint failure report is written to the IMS System Log. This report gives the detailed return codes and results for each step of the commit or backout for each managed resource. See the section IMS System Log in the chapter IMS Option Administration Information in your Administrator's Guide.

Explanatory text in the popup window identifies the general category of the failure: a successful backout for a commit, or a mixed or unknown outcome. The popups also contain a MSG ID field on the screen. The MSG ID for a successful backout for a commit is SYNC01. The MSG ID for the mixed or unknown outcome is SYNC02.

A successful backout for a commit can occur when a two-step resource responds negatively to the prepare step. This causes the Syncpoint Coordinator to issue backout to all resources. If they all back out successfully, the popup window indicates that the commit failed but that all resources were successfully backed out. The syncpoint status report in the imsmto.log file identifies the resource which caused backout processing. Although the commit failed, at least your resources are still synchronized - but not at the point you expected. To recover from this failure, you should review the syncpoint failure report. This gives the detailed return codes you can use to determine which resource failed and any steps which may be required to re-establish access to that resource. You must re-run your application to apply any updates which were not committed.

The Coordinator does not report a hard failure that occurs during a syncpoint. An example of this type of failure is switching off power to your workstation during syncpoint. For these failures, the state of any recoverable resources is not known. The Coordinator cannot display a popup or produce a syncpoint failure report as it was not allowed to complete (or perform) syncpoint processing. Generally, a recoverable resource performs backout when it detects a failed partner program. However, there is not an easy way to determine which resources may have been committed and which backed out. This kind of failure is not unique to the IMS Option Syncpoint Coordinator. Any program or system performing syncpoints is exposed to possible hardware or operating system failures and would need recovery services to address this.

4.6.2 Syncpoint Mixed Outcomes

A failed backout or failure during second step commit is really more of an unknown outcome. A failed commit may not mean that a backout was successful. A failed backout would not likely be a successful commit. The resource itself may not know what occurred to inform the Coordinator of the result. If the resource is communicating to another machine through a network, it may just have detected a network failure. Depending on the failure, it may mean that the commit or backout was successful on the target machine but the response could not travel back to the Coordinator. It may mean that the commit or backout request never got to the target machine for processing. In either case, it is more of an issue when performing updates than when reading data. All negative responses from a resource during second step commit or backout are reported by the Coordinator as a Mixed Outcome to highlight this situation.

In practical terms, a mixed outcome can only occur during commit processing when updating more than one recoverable resource in a unit-of-work. Once all of the two-step resources have responded positively to the Prepare, the commit step begins. If any resource fails commit, the Syncpoint Coordinator reports this as a mixed outcome. This error does not necessarily mean your data is corrupt. If no updates were made in this unit-of-work to the resource which failed commit, your data is still intact. If a mixed outcome is reported but all the updated resources committed successfully, it was effectively a successful commit.

The Coordinator reports this as a mixed outcome as it may not know whether updates were made. For Fileshare Databases, although IMS Option knows whether it made any update requests, you may have used Fileshare for updates using other products or your own files. The syncpoint failure report in the imsmto.log indicates whether or not IMS Option made any updates to Fileshare during this unit-of-work. For SQL systems, IMS Option does not know what was performed. Remote IMS provides a configuration setting to enable "heuristic completions". This enables syncpoint failures to be reported as successful if no updates were sent to the Remote IMS Server. With heuristic completions, mixed outcomes are not reported by the Coordinator unless necessary.
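The reasoning you apply when reading a syncpoint failure report can be summarized in a short sketch. The tuple layout, the function name and the result strings are hypothetical and do not reflect the actual format of the report in imsmto.log. The updated flag is True or False when it is known whether updates were made to the resource in this unit-of-work, and None when it is not known (as for an SQL system).

```python
def assess(results):
    """Interpret commit results for one unit-of-work.

    results: list of (resource_name, commit_ok, updated) tuples, where
    updated is True, False or None (unknown).
    """
    failed = [r for r in results if not r[1]]
    if not failed:
        return "committed"
    # If no failing resource was updated, the data is still intact.
    if all(updated is False for _, _, updated in failed):
        return "effectively committed"
    return "outcome unknown"


print(assess([("fileshare", True, True), ("sql", False, False)]))
print(assess([("fileshare", True, True), ("sql", False, None)]))
```

The first call models a unit-of-work that only issued queries to the SQL system; the second models the case where the Coordinator cannot tell whether the failing resource was updated, which is why it reports a mixed outcome.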

For mixed outcomes with multiple updated resources, you may not have loss of data even if the syncpoint failure report indicates one of the updated resources failed commit. In an environment where a resource communicates across a network, it may not know whether the commit succeeded or whether it resulted in a successful backout. It may just know that the network failed while attempting the commit. The commit request may not have been received by the remote system or the remote system may have processed the request but could not return the status. Thus, it may be possible for a resource to respond negatively to a backout or second step commit even though it backed out or committed successfully.

A simple way to ensure you do not get a mixed outcome is to update only one resource within a single unit-of-work. The Syncpoint Coordinator may still report the result of the syncpoint as a mixed outcome (actually unknown), but you would know from your application design that the result was not mixed. For example, suppose that with Fileshare Databases and an SQL system, the prepare and commit to Fileshare succeed but the commit to SQL fails. If your application only issued queries to SQL, the result might be the releasing of the SQL locks, which is what would have occurred if the commit had been successful.

4.6.3 Remote IMS Syncpoint Failures

The Remote IMS product provides a detailed list of its possible in-flight and in-doubt failures. The Remote IMS scenarios are fairly specific because it communicates with your IMS/ESA system using the SNA LU 6.2 protocol. These are very reliable and predictable systems. Fileshare runs on many operating systems with a variety of communications protocols, which makes it difficult to describe failure scenarios exactly. See your Fileshare User Guide for details. See your SQL system documentation for details of its support.

4.7 Other Considerations

4.7.1 Message Queues

In IMS Option, the DC message queues are not part of commit and backout processing. Printer output is written to a data set named ims327op.dat. This file is a line sequential file and is unique for each Application Region of IMS Option. Output messages are committed as soon as the insert call completes. It is up to you to determine what is appropriate to view or print if a transaction has been backed out.

A more obscure aspect of not having the message queues as part of the transaction processing concerns transactions which insert messages to background tasks, that is, NRMP or QBMP type programs. When an implicit backout occurs due to software failure, IMS Option terminates and does not schedule any other program to run. Thus, if a message processing program (MPP) had scheduled an NRMP or a QBMP and then abended with a 261 abend, IMS Option terminates and the NRMP or QBMP would not be executed, even if the PCB indicated EXPRESS=YES. Also, for example, if a message processing program had scheduled three different NRMPs and completed normally, but the second NRMP abended, you would have a complete transaction from the message processing program and the first NRMP, but the second would be backed out and the third would not be run.

4.7.2 GSAM and Checkpoint/Restart

GSAM database inserts are always committed when the insert call completes. This may cause duplicate records in a GSAM output file if a batch program abends and is restarted. For more information about GSAM databases, see the sections GSAM Considerations in the chapter For the Database Administrator (DBA) and Unsupported DB Features in the chapter Product Summary.
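The duplicate-record effect can be illustrated with a small simulation. This is a model only: the function, its parameters and the checkpoint interval are assumptions, and a Python list stands in for the GSAM output file. Each append represents an insert that is committed as soon as the call completes.

```python
def run_batch(records, output, start=0, checkpoint_every=2, abend_at=None):
    """Write records from position start; return the restart position."""
    checkpoint = start
    for i in range(start, len(records)):
        if abend_at is not None and i == abend_at:
            # Abend: restart resumes from the last checkpoint, but the
            # records already appended to output stay there.
            return checkpoint
        output.append(records[i])  # committed immediately
        if (i + 1) % checkpoint_every == 0:
            checkpoint = i + 1
    return len(records)


records = ["r1", "r2", "r3", "r4"]
output = []
restart_from = run_batch(records, output, abend_at=3)  # abends before r4
run_batch(records, output, start=restart_from)         # restarted run
print(output)
```

In this run the last checkpoint was taken after r2, so the restarted program writes r3 a second time.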

4.7.3 Fast Path

Fast Path MSDB and DEDB databases are committed and backed out the same as full function databases. For other details concerning Fast Path, see the chapter For the Database Administrator (DBA).

4.7.4 SETS and ROLS

The SETS and ROLS calls have no effect on syncpoint processing. They can be issued by an application but do not set or back out to intermediate points.

4.7.5 Dynamic Database Attach

The IMS Option Syncpoint Coordinator provides the same control of recoverable resources when using the Dynamic Database Attach (DDBA) API as it does for batch and online programs. The only difference is that it does not issue syncpoints to any SQL system, even if one is defined in the IMS System Configuration. See the section System Exits in the chapter Advanced Customization in this guide for more information on the DDBA interface.