HLI: Model 204 recovery and checkpoints

From m204wiki
Revision as of 23:34, 29 January 2016 by ELowell (talk | contribs)

Overview

This topic describes Model 204 recovery, and the checkpoints that support it, for application programmers who use the Host Language Interface (HLI) facility.

Refer to the descriptions of particular HLI calls and their use in transaction processing, and code your HLI application to minimize the amount of work required for recovery.

Read the information about checkpoints on a multiple cursor IFSTRT thread if you are using multiple cursor functionality in your HLI application for the first time.

For more information

Follow the guidelines in this topic for using the Model 204 recovery facilities for HLI jobs. Note that you might need to ask your Model 204 system administrator for additional assistance when running recovery for HLI jobs.

See HLI: Transactions for information about managing HLI transactions.

For more information about recovery and checkpointing, see:

Model 204 recovery facilities

IFAM1 Roll Back recovery

In IFAM1, only Roll Back recovery is available. You cannot do roll forward or media recovery in IFAM1.

For roll back recovery, you must include the CHKPOINT file in the IFAM1 job. To run the recovery step, run a BATCH204 job with the CHKPOINT file from the IFAM1 job as input and issue the RESTART command.

On a single cursor IFSTRT thread, Model 204 automatically marks the start and end points on the CHKPOINT file. You can request additional checkpoints throughout the IFAM1 job run by specifying either or both of the CPTIME and CPSORT User 0 parameters.

On a multiple cursor IFSTRT thread, you can request checkpoints throughout the IFAM1 job by specifying the CPTIME User 0 parameter.

Note that you cannot issue the IFCHKPT call in your IFAM1 application.

See Enabling the checkpoint facility for more information about IFAM1 checkpointing.

Recovery Logging

The Model 204 journals (CCAJRNL and the optional CCAJLOG) and the audit trail (CCAAUDIT) provide a log of information about a Model 204 run. A single execution of Model 204 can log information in one or both journals, in an audit trail, or in both the journals and the audit trail.

These logs for HLI use are described in more detail in the following sections. For complete information about the journals and audit trail, see Tracking system activity (CCAJRNL, CCAAUDIT, CCAJLOG).

Journals

The journals are sequential files that maintain run information and can be used to analyze the functional operation of Model 204. Except in IFAM1, CCAJRNL alone contains the following information, which is recorded during execution of a Model 204 job:

  • User input
  • System messages
  • Roll forward entries

If both CCAJRNL and CCAJLOG are defined, CCAJRNL collects the recovery records and CCAJLOG collects the messages and statistics.

In IFAM1, the journal does not contain roll forward information.

Note: The CCAJRNL is required if system RESTART recovery or media recovery is being used. The CCAJRNL is produced in a nonprintable format that is used during system recovery and media recovery.

To use the journal in IFAM1, include a CCAJRNL DD, DLBL, or FILEDEF statement in the job setup. In IFAM4, include a CCAJRNL DD statement.

You can use the Audit204 utility to print the journals, to format billing information, and to analyze some types of system statistics. The Audit204 utility accepts CCAJRNL, if you define only one journal. If you define both journals, the Audit204 utility accepts only CCAJLOG.
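The journal rules above can be summarized in a small lookup. This is only an illustrative sketch of the text, not part of Model 204: the function name `journal_layout` and the dictionary keys are hypothetical, and any combination the text does not describe returns `None`.

```python
from typing import Optional


def journal_layout(defined: set) -> Optional[dict]:
    """Summarize where records go and which journal Audit204 accepts.

    `defined` is the set of journal names defined for the run.
    Hypothetical helper that restates the rules in the text above.
    """
    if defined == {"CCAJRNL", "CCAJLOG"}:
        # Both journals: recovery records and messages are split
        return {
            "recovery records": "CCAJRNL",
            "messages and statistics": "CCAJLOG",
            "Audit204 input": "CCAJLOG",
        }
    if defined == {"CCAJRNL"}:
        # Single journal: everything is in CCAJRNL
        return {
            "recovery records": "CCAJRNL",
            "messages and statistics": "CCAJRNL",
            "Audit204 input": "CCAJRNL",
        }
    # Combinations not covered by the text are not modeled here
    return None


print(journal_layout({"CCAJRNL", "CCAJLOG"}))
```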

Audit trail

The audit trail is a formatted form of the journal. It contains the user input and system messages that are kept in the journal, but not the roll forward information.

The audit trail can be printed directly by a Model 204 run without requiring a separate job step. It is invaluable for debugging application programs and for analyzing system performance.

Technical Support recommends that you use an audit trail under the following conditions:

  • For HLI batch jobs, to avoid running an extra job step to print the journal
  • In jobs that require a printed log

To use the audit trail in IFAM1, include a CCAAUDIT DD, DLBL, or FILEDEF statement in the job setup. In IFAM4, include a CCAAUDIT DD statement.

Checkpoints

The Model 204 checkpoint facility consists of pseudo subtasks, which work together to take checkpoints, and the CHKPOINT file.

The CHKPOINT file is a sequential file that contains copies of file pages before updates are applied (called before-images or preimages) and marker records (called checkpoints) that record the date and time when the system is quiescent, that is, when no updating activity is occurring.

Each checkpoint taken during a Model 204 run marks a time when no updates are in progress. When a checkpoint is taken, Model 204 writes a record containing a date and time stamp to CHKPOINT.

Using the checkpoint facility in conjunction with the recovery facilities, a valid copy of the Model 204 database can be recovered after a system failure. See Model 204 recovery facilities for a description of roll back, roll forward, and media recovery.
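The idea of before-images and checkpoint markers can be illustrated with a toy model. The sketch below is not the Model 204 implementation: the record layout (`"CHECKPOINT"` and `"PREIMAGE"` tuples), the function name `roll_back`, and the page contents are all assumptions made for illustration. It shows only the general technique of undoing updates back to the most recent checkpoint by reapplying before-images in reverse order.

```python
def roll_back(pages: dict, chkpoint_log: list) -> dict:
    """Restore pages to their state at the most recent checkpoint marker.

    pages        -- current page contents, keyed by page number
    chkpoint_log -- records in the order written:
                    ("CHECKPOINT", timestamp) or ("PREIMAGE", page_no, image)
    """
    restored = dict(pages)
    # Walk the log backward, undoing updates until a checkpoint is reached
    for record in reversed(chkpoint_log):
        if record[0] == "CHECKPOINT":
            break
        _, page_no, before_image = record
        restored[page_no] = before_image
    return restored


log = [
    ("CHECKPOINT", "2016-01-29 23:00:00"),  # system was quiescent here
    ("PREIMAGE", 7, "A"),                   # page 7 held "A" before its first update
    ("PREIMAGE", 9, "X"),
    ("PREIMAGE", 7, "B"),                   # page 7 held "B" before its second update
]
current = {7: "C", 9: "Y"}
print(roll_back(current, log))  # {7: 'A', 9: 'X'}
```

Because the before-images are applied newest first, a page that was updated more than once ends with the image it had at the checkpoint.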

For more information about recovery and checkpointing, see:

Enabling the checkpoint facility

In IFAM1 and IFAM4, to enable checkpointing, include the following in the HLI job setup:

  • A job control CHKPOINT DD, DLBL, or FILEDEF statement (which defines the CHKPOINT file).
  • The RCVOPT (recovery options) system parameter, which indicates that checkpoints are to be taken. Set RCVOPT to 1, or to 9 if roll forward logging is also used. Specify it either on the User 0 parameter line in the CCAIN file or in the JCL EXEC parameter.
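The two requirements above can be expressed as a quick check. This sketch assumes that RCVOPT is a bit mask in which 1 enables checkpoints and 8 enables roll forward logging; that reading is inferred from the values 1 and 9 given above and should be confirmed against the RCVOPT parameter description. The function name `check_setup` is hypothetical.

```python
CHECKPOINT_BIT = 1    # assumption: inferred from RCVOPT=1
ROLL_FORWARD_BIT = 8  # assumption: inferred from RCVOPT=9 (1 + 8)


def check_setup(rcvopt: int, chkpoint_defined: bool) -> dict:
    """Report which facilities a job setup would enable.

    rcvopt           -- value of the RCVOPT parameter
    chkpoint_defined -- True if a CHKPOINT DD, DLBL, or FILEDEF is present
    """
    # Checkpointing needs both the CHKPOINT file and the RCVOPT setting
    checkpointing = chkpoint_defined and bool(rcvopt & CHECKPOINT_BIT)
    roll_forward = bool(rcvopt & ROLL_FORWARD_BIT)
    return {"checkpointing": checkpointing, "roll_forward_logging": roll_forward}


print(check_setup(9, chkpoint_defined=True))
```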

In IFAM2, the CHKPOINT file and RCVOPT parameter are specified in the Model 204 online run. Refer to the Rocket Model 204 Host Language Interface Reference Manual for information about HLI jobs.

For detailed information about the CHKPOINT file and RCVOPT parameter, see Checkpoints: Storing before-images of changed pages.

Four different checkpointing mechanisms

Model 204 provides the HLI user with four different mechanisms by which checkpointing can be activated, either automatically or explicitly at the user's request.

When used, the following User 0 parameters cause Model 204 to automatically initiate an attempt to take a checkpoint:

  • CPTIME, which performs a checkpoint attempt at timed intervals
  • CPSORT, for which a checkpoint attempt is initiated by HLI calls to IFSTRT and IFFNSH on a single cursor IFSTRT thread

The following functions provide the Model 204 user with a means to explicitly initiate checkpointing:

  • HLI call to IFCHKPT in IFAM2 or IFAM4
  • CHECKPOINT command, which can be issued only on an IFDIAL thread

Checkpointing specific to HLI processing for CPTIME, CPSORT, and IFCHKPT is described in more detail later in this topic.

Automatic checkpointing: CPTIME

When timed checkpointing is activated, Model 204 attempts to take a checkpoint at regular timed intervals. This process is controlled by the following User 0 parameters:

  • CPTIME, which is the number of minutes between attempts to take a checkpoint
  • CPTQ, which is the number of seconds to wait for IFSTRT threads to quiesce before timing out a checkpoint
  • CPTO, which is the number of seconds to wait for IFDIAL threads, or SOUL threads in IFAM2, to quiesce before timing out a checkpoint.

When the CPTIME interval expires

When the CPTIME interval expires, Model 204 disallows new update units. Model 204 performs other actions depending on the status of update units and the CPTQ and CPTO checkpoint parameters.

Model 204 performs the following actions:

  • If the system is quiescent, that is, no updates are currently in progress, Model 204 first takes a checkpoint and then allows new updates to begin.
  • If any IFSTRT threads have updates in progress, Model 204 starts the CPTQ timer.

    Whenever an update unit ends, Model 204 checks for any other IFSTRT updates in progress and repeats the process until no more IFSTRT threads are updating or until CPTQ expires.

  • If any IFDIAL threads, or SOUL threads in the IFAM2 environment, have update units in progress, and if the CPTQ interval has not expired, Model 204 starts the CPTO timer.

    Whenever an update unit ends, Model 204 checks for any other IFDIAL updates (or SOUL thread updates in the IFAM2 environment) in progress and repeats the process until no more IFDIAL threads, or SOUL threads in the IFAM2 environment, are updating or until CPTO expires.
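The sequence above can be modeled as a small simulation. This is a simplified sketch, not the actual Model 204 logic: it assumes that the CPTQ wait for IFSTRT threads completes before the CPTO timer starts, that no new update units begin during the attempt, and that a time-out means the checkpoint is not taken. The function name `cptime_attempt` and its result strings are hypothetical.

```python
def cptime_attempt(ifstrt_remaining, ifdial_remaining, cptq, cpto):
    """Simulate one checkpoint attempt after the CPTIME interval expires.

    ifstrt_remaining -- seconds until each in-progress IFSTRT update unit ends
    ifdial_remaining -- same for IFDIAL (or SOUL) threads
    cptq, cpto       -- time-out values in seconds
    """
    elapsed = 0.0
    if ifstrt_remaining:
        # CPTQ phase: wait for the last IFSTRT update unit to end
        wait = max(ifstrt_remaining)
        if wait > cptq:
            return "timed out waiting for IFSTRT threads (CPTQ)"
        elapsed = wait
    # Update units that are still running once the CPTQ phase is over
    still_updating = [t - elapsed for t in ifdial_remaining if t > elapsed]
    if still_updating and max(still_updating) > cpto:
        # CPTO phase expired before the last IFDIAL update unit ended
        return "timed out waiting for IFDIAL threads (CPTO)"
    return "checkpoint taken"


print(cptime_attempt([], [], cptq=30, cpto=60))        # quiescent system
print(cptime_attempt([10, 40], [], cptq=30, cpto=60))  # IFSTRT update too long
print(cptime_attempt([10], [50], cptq=30, cpto=60))    # both phases finish in time
```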

Specifying a CPTIME value

In IFAM2, CPTIME is set in the Model 204 Online run. In IFAM1 and IFAM4, you can specify the CPTIME User 0 parameter in the CCAIN input file. If you do not specify a value, CPTIME defaults to a value of 0, and timed checkpoints are disallowed for the HLI/Model 204 run.

A user having system manager privileges can reset the CPTIME parameter only if CPTIME is set to a nonzero value on the User 0 parameter line.

CPTIME processing steps

For a detailed description of the steps involved in CPTIME checkpointing, see the processing flow charts in Checkpoint processing steps.

For more information about the CPTIME parameter, see Checkpoints: Storing before-images of changed pages.

Automatic checkpointing: CPSORT

CPSORT checkpointing attempts to take a checkpoint when the initial IFSTRT call in an HLI job is executed and when an IFFNSH call is executed, in both cases on a single cursor IFSTRT thread. This applies to every IFAM2 job that uses single cursor IFSTRT threads.

Note: CPSORT can be used only on a single cursor IFSTRT thread. It is not available on a multiple cursor IFSTRT thread.

CPSORT checkpointing is controlled by the following User 0 parameters:

  • CPSORT (Checkpoint Sign-On Retry), which is the number of times that Model 204 attempts to take a checkpoint at the beginning (the initial IFSTRT call) and end (an IFFNSH call) of HLI jobs
  • CPTQ, which is the number of seconds to wait for IFSTRT threads to quiesce before timing out a checkpoint
  • CPTO, which is the number of seconds to wait for IFDIAL threads, or SOUL threads in IFAM2, to quiesce before timing out a checkpoint

CPSORT operates separately from CPTIME and has no effect on the timing of CPTIME checkpoints.

Specifying a CPSORT value

In IFAM2, CPSORT is set in the Model 204 Online run. The CPSORT parameter can be reset by a user having system manager privileges.

Note: CPSORT defaults to a value of 1. You can specify a higher value. However, a high CPSORT value can inhibit new update units for long periods of time and affect overall system throughput.

If you specify a value of 0 for CPSORT, Model 204 does not attempt to take a checkpoint at the beginning and end of HLI jobs.
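The effect of the CPSORT value can be sketched as a bounded retry loop. This is an illustration only, under the assumption that Model 204 stops retrying as soon as one attempt succeeds. The names `cpsort_checkpoint` and `try_checkpoint` are hypothetical, and `try_checkpoint` stands in for a single checkpoint attempt that either succeeds or times out.

```python
def cpsort_checkpoint(cpsort, try_checkpoint):
    """Attempt a sign-on or sign-off checkpoint up to `cpsort` times.

    Returns (attempts_made, checkpoint_taken).
    """
    for attempt in range(1, cpsort + 1):
        if try_checkpoint():
            return attempt, True
    # CPSORT=0 makes no attempt; otherwise every attempt timed out
    return cpsort, False


# Two attempts time out, the third succeeds
outcomes = iter([False, False, True])
print(cpsort_checkpoint(3, lambda: next(outcomes)))  # (3, True)

# CPSORT=0: no checkpoint attempt at all
print(cpsort_checkpoint(0, lambda: True))            # (0, False)
```

A larger CPSORT value gives more chances to reach a quiescent point, at the cost of holding off new update units for longer, which is the trade-off described in the note above.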

CPSORT is useful for roll back recovery in the IFAM2 multiuser environment, because it marks the beginning and end of each HLI job.

CPSORT processing steps

For a detailed description of the steps involved in CPSORT checkpointing, see the processing flow charts in Checkpoint processing steps.

For more information about the CPSORT parameter, see Checkpoints: Storing before-images of changed pages.

IFCHKPT checkpointing

The IFCHKPT call provides a mechanism for initiating attempts to take checkpoints from within an HLI application.

You can use IFCHKPT in IFAM2 and IFAM4 applications. You cannot issue a call to IFCHKPT in an IFAM1 application.

Differences in checkpointing procedure

There are differences in the procedure that is used for checkpointing depending on whether IFCHKPT is issued on a multiple cursor IFSTRT thread or on single cursor IFSTRT threads.

For example, in a multithreaded IFSTRT application, each single cursor IFSTRT thread that is updating must indicate to Model 204 that it is quiescing in preparation for an attempt to take a checkpoint. A single cursor IFSTRT thread performing update processing prevents checkpoints from occurring unless the thread specifically requests a checkpoint by issuing an IFCHKPT call.

Refer to the Rocket Model 204 Host Language Interface Reference Manual for a description of IFCHKPT and detailed information about using the IFCHKPT call on different types of IFSTRT threads.

IFCHKPT processing steps

For a detailed description of the steps involved in IFCHKPT checkpointing, see the processing flow charts in Checkpoint processing steps.

Refer to the Rocket Model 204 Host Language Interface Reference Manual for more information about IFCHKPT.

Checkpoint processing steps

The processing flow charts in this section provide details on the steps involved in CPTIME, CPSORT, and IFCHKPT checkpointing.

CPTIME main flow step

CPTIME checkpointing: main processing flow

CPTQ timer step

CPTQ checkpointing timer processing flow

CPTO timer step

CPTO checkpointing timer processing flow

CPTIME time-out step

CPTIME checkpointing: time-out processing flow

CPSORT main flow step

The following figure shows the main flow of CPSORT checkpointing. The CPSORT parameter must be set to a value that is not equal to 0 for CPSORT processing to be enabled. A call to IFSTRT or IFFNSH, as described in Automatic checkpointing: CPSORT, initiates the CPSORT process shown below.

CPSORT checkpointing: main processing flow

CPSORT time-out step

CPSORT checkpointing: time-out processing flow

IFCHKPT main flow step

IFCHKPT checkpointing: main processing flow

IFCHKPT time-out step

IFCHKPT checkpointing: time-out processing flow

See also