Media recovery: Difference between revisions
Revision as of 20:40, 17 January 2014
Overview
Media recovery is the process of restoring a Model 204 file from a backup copy of that file, and then using the Roll Forward facility to reapply the updates that were made to the file since the time of the dump. These updates are obtained from the Model 204 journals written by all jobs in which updates were performed since the backup copy was made. The media recovery facility automatically performs Roll Forward and can (optionally) automatically perform the restoration; you do not need to invoke the RESTORE and RESTART commands.
The Rocket Model 204 System Manager's Guide describes the procedures to follow to recover entire databases in case of a system crash or some other systemwide problem. This chapter presents procedures for recovering individual files after a media failure such as a disk head crash.
Media recovery example
Although this chapter focuses on recovery from media failures, media recovery can be used in circumstances other than media failure as well. The facility can be helpful in any situation where you need to return a file to its state at a given point in time. For example, consider the following case. Suppose that you:
- Dumped a file.
- Updated the file in Job A (with journaling active).
- Started to update the file in Job B (with journaling not active). This might be a File Load run.
Now, suppose that Model 204 crashed during Job B. At this point media recovery can be run, which optionally restores the file and applies Job A's updates. The file is returned to the state it was in before you ran Job B. Then you can rerun Job B.
Process of using media recovery to restore files
To use media recovery effectively, an installation must establish procedures for dumping files on a regular basis, and for maintaining journal files that contain a complete history of file update activity.
The Model 204 file manager and system manager share the responsibility for successful media recovery. At different installations, file managers and system managers allocate media recovery tasks in different ways. These tasks include the following:
- Installing the MERGEJ utility, which is used to merge into a single file the journals produced by different Model 204 runs. This task is described in the Rocket Model 204 installation guide for your operating system.
- Dumping Model 204 files on a regular basis. This can be done either with the DUMP command, described in File dumping and restoring, or with any standard DASD management software such as FDR. The installation must establish procedures for backing up all Model 204 files periodically, so that backups are available for recovery runs.
- Producing and archiving journals from all Model 204 update runs. If journaling is active for a run, Model 204 writes to the journal data set all updates made to the file during the run. Model 204 also logs such events as checkpoints, discontinuities, and other run information.
Note: If media recovery is to be run, an installation must maintain journals containing a complete record of update activity in all kinds of runs: Online, batch, and recovery.
- Running the MERGEJ utility to merge together any overlapping journals needed for a media recovery run.
- Creating the file, if it has to be moved to a different physical location.
- Restoring the backup copy of the file if it is something other than a Model 204 dump.
- Running the media recovery run to regenerate files from restored versions.
These tasks are described in detail in this chapter.
Backing up Model 204 files
Because the success of media recovery depends upon the availability of backup copies of all Model 204 files, be sure to back up files on a regular basis. The Model 204 DUMP/RESTORE utility allows you to copy a Model 204 file to a sequential data set. The DUMP utility can be run either as a batch job, or in an Online while other users are accessing the file. See Using DUMP and RESTORE with media recovery for a description of the DUMP/RESTORE utility.
Backups can also be taken using other DASD management software (such as FDR or DFDSS). However, if you need to back up a file while an Online that accesses it is still running, take the backup with the Model 204 DUMP command issued from the running Online. This ensures that the backup copy of the database is physically consistent.
Producing and archiving journals
The journal produced by a Model 204 run contains a record of file updates, as well as information about run characteristics, error messages, and utilization statistics. The journal data set (described in the Rocket Model 204 System Manager's Guide) is vital to system and media recovery.
For a summary of statistics written to the journal, see File statistics summary.
Creating a journal
If you want to produce a journal for a run and use it in media recovery, the file manager or system manager must take the following actions for each job that updates the file:
- Include a CCAJRNL data set or file definition statement in the JCL or EXEC for the run. (If multiple journal data sets are generated, additional data set definition statements can be specified.) See the Rocket Model 204 System Manager's Guide for details.
- Set the Model 204 SYSOPT (system options) parameter to include the 128 option (produce a journal or audit trail).
- In the EXEC parameter or on User 0's parameter line, set the RCVOPT (recovery options) parameter to include the X'08' option (log Roll Forward information).
- During file creation or with the RESET command, set the FRCVOPT (file recovery options) parameter for each file to be regenerated. Ensure that FRCVOPT does not include the X'04' (suppress Roll Forward logging) option.
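The option bits named in the list above can be checked mechanically before a run. The following Python sketch is only an illustration: the function name and the idea of passing the three parameter values as integers are assumptions, not part of Model 204.

```python
# Option bits taken from the list above; the helper itself is hypothetical.
SYSOPT_JOURNAL = 128             # SYSOPT: produce a journal or audit trail
RCVOPT_ROLL_FORWARD = 0x08       # RCVOPT: log Roll Forward information
FRCVOPT_NO_ROLL_FORWARD = 0x04   # FRCVOPT: suppress Roll Forward logging


def journaling_problems(sysopt: int, rcvopt: int, frcvopt: int) -> list[str]:
    """Return the reasons why a run would not log updates usable by media recovery."""
    problems = []
    if not sysopt & SYSOPT_JOURNAL:
        problems.append("SYSOPT does not include 128")
    if not rcvopt & RCVOPT_ROLL_FORWARD:
        problems.append("RCVOPT does not include X'08'")
    if frcvopt & FRCVOPT_NO_ROLL_FORWARD:
        problems.append("FRCVOPT includes X'04'")
    return problems


print(journaling_problems(sysopt=136, rcvopt=0x08, frcvopt=0x00))  # prints []
print(journaling_problems(sysopt=8, rcvopt=0x01, frcvopt=0x04))    # lists three problems
```

An empty list means that all three settings match the requirements stated above; it says nothing about whether the CCAJRNL data set itself is defined.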
Usage
Unlike a system recovery run, a media recovery run does not require a CHKPOINT data set definition statement.
Be sure that journals are produced for all Online update runs. Archive the journals so that they are available in case of a media failure.
Concatenated journals
Journals used in a media recovery run can be either merged or concatenated. However, if they are concatenated and any one of the journals does not contain a valid EOF marker (because the job that produced it ended abnormally), media recovery assumes EOF for that journal and does not read any of the journals that follow it in the concatenation.
To solve this problem, choose one of the following methods:
- Merge the journals together using the MERGEJ utility, as discussed in Merging journals.
- Write an EOF marker to any journal missing its EOF marker with the UTILJ utility. The UTILJ utility is described in detail in the Rocket Model 204 System Manager's Guide.
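The effect of a missing EOF marker on a concatenation can be pictured with a small Python model. This is a simplified sketch of the behavior described above, not Model 204 code; the tuple layout and the journal names are made up for illustration.

```python
def journals_read(concatenation):
    """Return the journals that media recovery would read.

    concatenation: list of (name, has_eof) tuples in concatenation order.
    """
    read = []
    for name, has_eof in concatenation:
        read.append(name)
        if not has_eof:
            # EOF is assumed for this journal, so the journals after it are never read.
            break
    return read


# Hypothetical journals: the job that wrote JRNL02 ended abnormally.
print(journals_read([("JRNL01", True), ("JRNL02", False), ("JRNL03", True)]))
# prints ['JRNL01', 'JRNL02']
```

In this model JRNL03 is silently skipped, which is why the journals must be merged or the missing marker must be written first.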
Merging journals
During media recovery, Model 204 regenerates files from restored versions, applying updates to those files. The updates are obtained from the journals associated with the runs that originally updated the files.
When multiple Model 204 jobs overlap in time (that is, some portions of them run concurrently), the journal files produced by those runs also overlap. Overlapping journals must be merged before they can be used as input to media recovery. The MERGEJ utility is provided to perform this function.
MERGEJ utility
Before you can run MERGEJ, the system manager must have installed the MERGEJ utility. See the Rocket Model 204 installation guide for your operating system for details.
The MERGEJ utility allows as many as 32 journal files (9 for z/VSE systems) to be combined into one file. The number of journal files that can be merged with one MERGEJ command is a function of the capacity of the SORT/MERGE utility that is available for use.
The journal files can be individual journal files or the result of a previous merge operation. If a previously merged file is being combined with individual journal files, the merged file must be the first input file specified. Only one merged file can be input into a MERGEJ run.
When to use MERGEJ
MERGEJ not needed
The figure "MERGEJ utility not required" illustrates a case in which media failure occurs but you do not need to run the MERGEJ utility. S is the starting time of each job; F is the finishing time.
[Figure: MERGEJ utility not required]
MERGEJ needed
The figure "MERGEJ utility required" illustrates a case in which media failure occurs and MERGEJ does need to be run.
[Figure: MERGEJ utility required]
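Whether MERGEJ is needed comes down to whether the jobs overlapped in time. The following Python sketch groups jobs by their start (S) and finish (F) times; it assumes that "overlap" means the two time intervals intersect, and the job names and numeric times are invented for the example.

```python
def merge_groups(jobs):
    """Group jobs whose run times overlap.

    jobs: dict mapping a job name to a (start, finish) pair.
    Returns lists of job names in chronological order; any list with more
    than one name identifies journals that must be merged with MERGEJ.
    """
    ordered = sorted(jobs.items(), key=lambda item: item[1][0])
    groups = []
    group_end = None
    for name, (start, finish) in ordered:
        if groups and start < group_end:
            groups[-1].append(name)          # overlaps the current group
            group_end = max(group_end, finish)
        else:
            groups.append([name])            # starts after everything so far finished
            group_end = finish
    return groups


print(merge_groups({"A": (1, 5), "B": (6, 9)}))                # prints [['A'], ['B']]
print(merge_groups({"A": (1, 5), "B": (4, 9), "C": (10, 12)}))  # prints [['A', 'B'], ['C']]
```

In the first call no job overlaps another, so the journals could simply be concatenated in chronological order. In the second call the journals of jobs A and B would have to be merged.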
SORTOUT and SORTIN data sets with MERGEJ
For z/OS and z/VM, the MERGEJ utility allows duplicate SORTOUT data sets. You can define up to nine additional SORTOUT data sets. In sequence after the first SORTOUT ddname, you can specify SORTOUT1 through SORTOUT9. The SORTOUT data sets are identical.
Also, the default number of buffers for each SORTIN data set and SORTOUT data set is 99. You can override this default on the DD card through the JCL parameter BUFNO. The valid range of values for BUFNO is 1 to 255.
All buffers for the SORTIN and SORTOUT data sets are obtained from below-the-line storage. Therefore, if too many buffers are allocated, an ABEND S878 RC=10 may occur for lack of storage. If this happens, either reduce the number of SORTIN and/or SORTOUT data sets or reduce the number of buffers per SORTIN and/or SORTOUT data sets through the use of the BUFNO JCL parameter.
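A rough count of the buffers that a MERGEJ step requests can help when tuning BUFNO. The Python sketch below is a back-of-the-envelope aid only: it assumes that each buffer holds one block, and the block size passed in is a hypothetical value that you would replace with the block size of your own journals.

```python
def total_buffers(n_sortin, n_sortout, bufno_in=99, bufno_out=99):
    """Count buffers for a MERGEJ step; 99 is the default per data set."""
    for bufno in (bufno_in, bufno_out):
        if not 1 <= bufno <= 255:
            raise ValueError("BUFNO must be between 1 and 255")
    return n_sortin * bufno_in + n_sortout * bufno_out


def estimated_bytes(n_sortin, n_sortout, blksize, bufno_in=99, bufno_out=99):
    """Estimate buffer storage, ASSUMING one block of blksize bytes per buffer."""
    return total_buffers(n_sortin, n_sortout, bufno_in, bufno_out) * blksize


print(total_buffers(6, 3))            # prints 891 (defaults)
print(total_buffers(6, 3, 50, 75))    # prints 525
print(estimated_bytes(6, 3, 6000))    # prints 5346000 (6000 is a hypothetical block size)
```

Comparing the estimate with the below-the-line storage available to the job gives an early hint of whether BUFNO or the number of data sets should be reduced.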
Running MERGEJ in a z/OS environment
To run the MERGEJ utility under z/OS, the following data sets are required:
Data set(s) | Function |
---|---|
SORTIN01 through SORTINnn (where nn cannot exceed 32) | Define the input to MERGEJ. These are the journal files from the Model 204 runs that originally updated the files that are being recovered. Input files must be numbered sequentially without gaps. If any of these files is itself a merged journal, it must be numbered SORTIN01. Two merged journals cannot be merged. |
SORTOUT | Required to define the merged journal output file. |
SORTOUT1 through SORTOUT9 | Optional, identical SORTOUT files. |
SORTLIB and SORTMSGS | Standard SORTLIB and SORTMSGS data sets that you must specify. These are described in the documentation for the sort package that is invoked by MERGEJ. Your sort package must be an IBM, SYNCSORT, or compatible package. |
Considerations when using the MERGEJ utility
- The SORTIN files must be numbered sequentially without gaps, beginning with SORTIN01.
- If any of the input files is a merged journal, it must be numbered SORTIN01.
- You cannot merge two merged journals.
- The SORTOUT files must be numbered sequentially without gaps, beginning with SORTOUT.
- You cannot use streams as input to MERGEJ.
Use the COPY command to copy journal streams into individual data sets before using them as input to the MERGEJ utility.
- Although UTILJ can handle a CCAJLOG data set, it finds only message-type records. If you ask for record types 1-6, the output is empty, because those record types do not exist on CCAJLOG.
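The naming rules in the list above lend themselves to a pre-submission check. The Python sketch below covers the z/OS names only; the function, its arguments, and the wording of its messages are assumptions made for illustration.

```python
def check_mergej_ddnames(sortin, sortout, merged=(), max_inputs=32):
    """Return a list of rule violations (empty if the ddnames look valid).

    sortin, sortout: ddnames as coded in the JCL.
    merged: the SORTIN ddnames that refer to previously merged journals.
    """
    errors = []
    expected_in = [f"SORTIN{i:02d}" for i in range(1, len(sortin) + 1)]
    if len(sortin) > max_inputs:
        errors.append(f"more than {max_inputs} input journals")
    if not sortin or sorted(sortin) != expected_in:
        errors.append("SORTIN ddnames must run SORTIN01, SORTIN02, ... without gaps")
    if len(merged) > 1:
        errors.append("only one merged journal can be input")
    elif list(merged) not in ([], ["SORTIN01"]):
        errors.append("a merged journal must be SORTIN01")
    expected_out = ["SORTOUT"] + [f"SORTOUT{i}" for i in range(1, len(sortout))]
    if len(sortout) > 10 or sorted(sortout) != expected_out:
        errors.append("SORTOUT ddnames must run SORTOUT, SORTOUT1, ... SORTOUT9 without gaps")
    return errors


print(check_mergej_ddnames(["SORTIN01", "SORTIN02"], ["SORTOUT", "SORTOUT1"]))  # prints []
print(check_mergej_ddnames(["SORTIN01", "SORTIN03"], ["SORTOUT"]))              # one violation
```

The check looks only at names. It cannot tell whether a data set really is a merged journal, so the `merged` argument must be supplied by whoever builds the job.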
z/OS example
Except for the SORTOUT1 and SORTOUT2 lines, the following z/OS JCL example is required to run the MERGEJ utility. The optional SORTOUT1 and SORTOUT2 lines are provided to illustrate duplicate copies of the output.
    //MERGEJ   EXEC PGM=MERGEJ,REGION=0M
    //STEPLIB  DD DSN=YOUR.Vnnn.LOADLIB,DISP=SHR
    //SORTLIB  DD DSN=SYS1.SORTLIB,DISP=SHR
    //SORTIN01 DD DSN=YOUR.JRNL01,DISP=SHR,DCB=BUFNO=50
    //SORTIN02 DD DSN=YOUR.JRNL02,DISP=SHR,DCB=BUFNO=50
    //SORTIN03 DD DSN=YOUR.JRNL03,DISP=SHR,DCB=BUFNO=50
    //SORTIN04 DD DSN=YOUR.JRNL04,DISP=SHR,DCB=BUFNO=50
    //SORTIN05 DD DSN=YOUR.JRNL05,DISP=SHR,DCB=BUFNO=50
    //SORTIN06 DD DSN=YOUR.JRNL06,DISP=SHR,DCB=BUFNO=50
    //SORTOUT  DD DSN=YOUR.MERGED.OUT1,DISP=OLD,DCB=BUFNO=75
    //SORTOUT1 DD DSN=YOUR.MERGED.OUT2,DISP=OLD,DCB=BUFNO=75
    //SORTOUT2 DD DSN=YOUR.MERGED.OUT3,DISP=OLD,DCB=BUFNO=75
    //CCAPRINT DD SYSOUT=*
    //CCASNAP  DD SYSOUT=*
    //SYSOUT   DD SYSOUT=*
    //SYSUDUMP DD SYSOUT=*
Running MERGEJ in a z/VM environment
To run the MERGEJ utility under z/VM, the following data sets are required:
Data set(s) | Function |
---|---|
SORTIN01 through SORTINnn (where nn cannot exceed 32) | Define the input to MERGEJ. These are the journal files from the Model 204 runs that originally updated the files that are being recovered. Input files must be numbered sequentially without gaps. If any of these files is itself a merged journal, it must be numbered SORTIN01. Two merged journals cannot be merged. |
SORTOUT | Defines the single output file. |
Availability of SORT program
Under z/VM, MERGEJ requires the availability of a SORT program that can be invoked dynamically, such as SYNCSORT/CMS.
z/VM example
The MERGEJ EXEC is invoked as in the following example:
MERGEJ M204 JOURNAL 910301 M / M204 JOURNAL 910302 M / M204 MERGED JOURNAL 910302 M
where:
M204 JOURNAL 910301 M and M204 JOURNAL 910302 M are the journal files merged to M204 MERGED JOURNAL 910302 M.
Running MERGEJ in a z/VSE environment
To run the MERGEJ utility under z/VSE, the following data sets are required:
Data set(s) | Function |
---|---|
SORTIN1 through SORTINn (where n cannot exceed 9) | Define the input to MERGEJ. These are the journal files from the Model 204 runs that originally updated the files that are being recovered. Input files must be numbered sequentially without gaps. If any of these files is itself a merged journal, it must be numbered SORTIN1. Two merged journals cannot be merged. |
SORTOUT | Defines the single output file. |
z/VSE example
The following z/VSE JCL is required to run the MERGEJ utility:
    // JOB MERGEJ
    // DLBL M204LIB,'M204.PROD.LIBRARY'
    // EXTENT ,volser
    // LIBDEF PHASE,SEARCH=M204LIB.R210
    // TLBL SORTOUT
    // ASSGN SYS020,cuu
    // DLBL SORTIN1,'M204.CCAJRNL.1',,SD
    // EXTENT SYSnnn,balance of extent information
    // DLBL SORTIN2,'M204.CCAJRNL.2',,SD
    // EXTENT SYSnnn,balance of extent information
    // TLBL SORTIN3
    // ASSGN SYS023,cuu
    // DLBL SORTIN4,'M204.CCAJRNL.4',,SD
    // EXTENT SYSnnn,balance of extent information
    // TLBL SORTIN5
    // ASSGN SYS025,cuu
    // TLBL SORTIN6
    // ASSGN SYS026,cuu
    // EXEC MERGEJ,SIZE=(AUTO,xxK)
    /*
    /&
z/VSE Notes:
- If using tape for the SORTOUT or any of the SORTINn data sets, the SYS number is fixed. The SYS number must be set to 020 for the SORTOUT data set (for example, // ASSGN SYS020,cuu), and to 020 plus the number of the SORTIN data set (for example, SORTIN1 = SYS021 and SORTIN5 = SYS025) for the SORTIN data sets. Disk data sets can use any SYS number.
- The SIZE parameter on the // EXEC JCL card is required. AUTO reserves storage for the MERGEJ program, and xxK reserves storage for the SORT/MERGE program. To determine the amount of additional storage needed for the SORT/MERGE program, refer to the IBM SORT/MERGE Programmer's Guide, or to the equivalent manual if you use a sort package other than IBM's.
- The return code is set on program termination so that conditional job control can be used.
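The fixed SYS numbers for tape data sets follow a simple rule, which the Python sketch below spells out. The function name is hypothetical; the mapping itself is the one given in the first note above.

```python
def tape_sys_number(ddname: str) -> str:
    """Return the fixed SYS number for a MERGEJ tape data set under z/VSE."""
    if ddname == "SORTOUT":
        return "SYS020"
    if ddname.startswith("SORTIN") and ddname[6:].isdigit():
        n = int(ddname[6:])
        if 1 <= n <= 9:
            return f"SYS{20 + n:03d}"   # 020 plus the SORTIN number
    raise ValueError(f"not a MERGEJ tape data set name: {ddname!r}")


print(tape_sys_number("SORTOUT"))   # prints SYS020
print(tape_sys_number("SORTIN1"))   # prints SYS021
print(tape_sys_number("SORTIN5"))   # prints SYS025
```

Disk data sets are not subject to this rule and can use any SYS number.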
Handling MERGEJ errors
Model 204 can detect a number of error conditions while the MERGEJ utility is running. These include the following situations:
Error condition | MERGEJ response to error |
---|---|
Missing input journal file(s) | Terminates without performing a merge. |
Missing output journal file | Terminates without performing a merge. |
I/O error on input or output data sets | Terminates without performing a merge. |
Merged journal used as input for other than the SORTIN01 data set | Terminates without performing a merge. |
Sequence error: a journal file was not closed properly and the EOF marker is missing | Assumes the presence of an EOF marker and proceeds with processing. |
Regenerating files in a new physical location
If you must re-create a file on another physical device or in a different location on the same device, you can either:
- Restore the file in the new location in a separate job or job step and then run media recovery using REGEN without RESTORE (see Media recovery run examples).
- Allocate the data set to contain the file and use a Model 204 CREATE command in a batch job to format that data set with the name of the file being recovered. Then, use the REGEN WITH RESTORE option.
Note: This requires that the file backup have been taken with the Model 204 DUMP command.
Media recovery using the REGENERATE command
You can use the REGENERATE command (abbreviation: REGEN) against a file that has been restored already to an earlier version in a separate job or job step.
REGEN without RESTORE
You can use the REGENERATE command without the RESTORE option to recover files that have been backed up with anything other than the Model 204 DUMP command.
REGEN WITH RESTORE
You can use the REGENERATE command to perform the restore automatically from a backup created with the Model 204 DUMP command. This option is known as REGEN WITH RESTORE.
Using the RESTORE option
To use the RESTORE option, specify a FROM dumpname clause on the REGENERATE command for the file. If the FROM clause is omitted, REGENERATE assumes that the file has been restored already. See Media recovery run examples for the syntax of the REGENERATE command.
For REGEN without the RESTORE option, additional database validity checking occurs. When REGENERATE opens a file for which it is not performing a restore, it discontinues processing for that file if:
- File is marked physically inconsistent.
- File was created prior to Release 9.0 of Model 204.
Running media recovery
A media recovery job regenerates files within a single-user (batch) Model 204 run. The job contains a Model 204 REGENERATE command, which specifies the name(s) of the file(s) to be regenerated. The syntax of the REGENERATE command is shown in the Model 204 Parameter and Command Reference.
Data sets required to run media recovery
In addition to the standard STEPLIB, CCAIN, CCAAUDIT, and CCAPRINT data sets, the following data sets must be defined. Other data sets also might be needed as shown in the example. These data sets are described in the Rocket Model 204 System Manager's Guide.
Note: A CHKPOINT data set is not required in a media recovery run.
CCAGEN data set
CCAGEN defines the journal file used as input to the run. This data set can be:
- Single journal file.
- z/OS and z/VM only: A series of concatenated journal files. As many as 255 (or 16 partitioned) data sets can be concatenated. The journal files must be concatenated through JCL (z/OS) or M204APND (z/VM). Do not use copy utilities.
Concatenated journals can be individual data sets, merged journals, or a combination of the two. The journals must be concatenated in chronological order with the oldest journal first. All journals from jobs that overlapped in time must be merged. If overlapping journals are concatenated instead, media recovery assumes EOF prematurely on CCAGEN and does not read all journal data.
For z/VSE: Concatenated data sets are not supported.
- Combined journal file created by merging individual journal files using the MERGEJ utility (as many as 32 journal files can be merged for z/OS and z/VM; as many as 9 for z/VSE).
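The ordering rules for a concatenated CCAGEN can be verified before the run. The Python sketch below assumes that you know the start and finish time of the job behind each journal; the tuple layout and the problem messages are invented for illustration.

```python
def check_concatenation(journals, max_datasets=255):
    """Check a CCAGEN concatenation.

    journals: list of (name, start, finish) in the order they are concatenated.
    Returns a list of problems; an empty list means the order looks usable.
    """
    problems = []
    if len(journals) > max_datasets:
        problems.append(f"more than {max_datasets} data sets concatenated")
    # Comparing neighbors is enough: if every journal starts no earlier than
    # the previous one finished, the whole list is chronological and disjoint.
    for (prev, _, prev_finish), (cur, cur_start, _) in zip(journals, journals[1:]):
        if cur_start < prev_finish:
            problems.append(f"{cur} starts before {prev} finishes: reorder or merge")
    return problems


print(check_concatenation([("J1", 1, 5), ("J2", 5, 9)]))  # prints []
print(check_concatenation([("J1", 1, 5), ("J2", 4, 9)]))  # one problem: overlap
```

A reported problem means either that the journals are out of chronological order or that they overlap and must be merged with MERGEJ before they are used.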
File data sets
You must define the file data sets for all files being recovered. In the examples that follow, these are shown as DB1 through DB5.
Dump data sets
If the REGEN WITH RESTORE option is used, the dump data set(s) that are input to the automatic restore must also be specified. In the example that follows, these are specified as DUMPDBn.
Note: If an INCREASE DATASETS command was issued during the recovered time interval, the data sets added to the file by the INCREASE command also must be specified.
Media recovery run examples
z/OS example
The following z/OS JCL example invokes a media recovery run:
    //REGEN    JOB REGEN,MSGLEVEL=(1,1)
    //M204     EXEC PGM=BATCH204,PARM='SYSOPT=136'
    //STEPLIB  DD DSN=M204.PROGMS,DISP=SHR
    //CCAJRNL  DD DSN=REGEN.CCAJRNL,DISP=(NEW,CATLG,CATLG),
    //            VOL=SER=WORK,UNIT=DISK,SPACE=(TRK,(5))
    //CCAGEN   DD DSN=M204.JOURNAL.910301,DISP=SHR
    //         DD DSN=M204.JOURNAL.910302,DISP=SHR
    //         DD DSN=M204.MERGED.JOURNAL.910302,DISP=SHR
    //CCAAUDIT DD SYSOUT=A
    //CCAPRINT DD SYSOUT=A
    //CCASNAP  DD SYSOUT=A
    //CCATEMP  DD DISP=NEW,UNIT=WORK,SPACE=(CYL,(5,3))
    //CCASTAT  DD DSN=M204.CCASTAT,DISP=SHR
    //CCAGRP   DD DSN=M204.CCAGRP,DISP=SHR
    //TAPE2    DD DSN=M204.DB1.DEFERF,DISP=SHR
    //TAPE3    DD DSN=M204.DB1.DEFERV,DISP=SHR
    //DB1      DD DSN=M204.DB1,DISP=SHR
    //DB2      DD DSN=M204.DB2,DISP=SHR
    //DB3      DD DSN=M204.DB3,DISP=SHR
    //DB4      DD DSN=M204.DB4,DISP=SHR
    //DB5      DD DSN=M204.DB5,DISP=SHR
    //DUMPDB1  DD DSN=DUMP.DB1,DISP=SHR
    //DUMPDB4  DD DSN=DUMP.DB4,DISP=SHR
    //CCAIN    DD *
    NFILES=10,NDIR=10,LAUDIT=1,SPCORE=10000
    *
    * Open the file in deferred update mode
    * before the REGENERATE command
    *
    OPEN DB1,TAPE2,TAPE3
    REGENERATE
    FILE DB1 FROM DUMPDB1
    FILE DB2 TO LAST CHECKPOINT
    FILE DB3 TO UPDATE 3 OF 91.063 06:28:42.98
    FILE DB4 FROM DUMPDB4 TO CHECKPOINT 91.063 01:05:22.99
    FILE DB5 TO LAST UPDATE BEFORE 91.063 12:00:00.00
    END
    EOJ
If the media recovery run completes successfully, Model 204 displays the message:
*** M204.1437: REGENERATE IS NOW COMPLETE
Possible error conditions are discussed in Media recovery error conditions.
z/VSE example
The following z/VSE JCL example invokes a media recovery run:
    // JOB REGEN RUN MEDIA RECOVERY
    // DLBL M204LIB,'M204.PROD.LIBRARY'
    // EXTENT ,volser
    // LIBDEF PHASE,SEARCH=M204LIB.R210
    // DLBL CCAJRNL,'M204.CCAJRNL',,SD
    // EXTENT SYSnnn,balance of extent information
    // DLBL CCAGEN,'M204.CCAJRNL.MERGED',,SD
    // EXTENT SYSnnn,balance of extent information
    // DLBL CCATEMP,'M204.CCATEMP',,DA
    // EXTENT SYSnnn,balance of extent information
    // DLBL CCASTAT,'M204.CCASTAT',,SD
    // EXTENT SYSnnn,balance of extent information
    // DLBL CCAGRP,'M204.CCAGRP',,DA
    // EXTENT SYSnnn,balance of extent information
    // TLBL SYSjjj,'M204.DB1.DEFERF'
    // ASSGN SYSjjj,TAPE
    // TLBL SYSkkk,'M204.DB1.DEFERV'
    // ASSGN SYSkkk,TAPE
    // DLBL DB1,'M204.DB1',,DA
    // EXTENT SYSnnn,balance of extent information
    // DLBL DB2,'M204.DB2',,DA
    // EXTENT SYSnnn,balance of extent information
    // DLBL DB3,'M204.DB3',,DA
    // EXTENT SYSnnn,balance of extent information
    // DLBL DB4,'M204.DB4',,DA
    // EXTENT SYSnnn,balance of extent information
    // DLBL DB5,'M204.DB5',,DA
    // EXTENT SYSnnn,balance of extent information
    // DLBL DUMPDB1,'M204.DUMP.DB1',,SD
    // EXTENT SYSnnn,balance of extent information
    // DLBL DUMPDB4,'M204.DUMP.DB4',,SD
    // EXTENT SYSnnn,balance of extent information
    // EXEC BATCH204,SIZE=AUTO
    NFILES=10,NDIR=10,LAUDIT=1,MAXBUF=20
    DEFINE DATASET TAPEDB1F WITH SCOPE=SYSTEM -
       DDNAME=SYSjjj RECFM=FB LRECL=24 BLKSIZE=6000
    DEFINE DATASET TAPEDB1V WITH SCOPE=SYSTEM -
       DDNAME=SYSkkk RECFM=VB LRECL=270 BLKSIZE=6000
    *
    * Open the file in deferred update mode
    * before the REGENERATE command
    *
    OPEN DB1,TAPEDB1F,TAPEDB1V
    REGENERATE
    FILE DB1 FROM DUMPDB1
    FILE DB2 TO LAST CHECKPOINT
    FILE DB3 TO UPDATE 3 OF 91.063 06:28:42.98
    FILE DB4 FROM DUMPDB4 TO CHECKPOINT 91.063 01:05:22.99
    FILE DB5 TO LAST UPDATE BEFORE 91.063 12:00:00.00
    END
    EOJ
z/VM example
For z/VM, invoke the REGEN EXEC by entering the following command:
ONLINE BYPASS REGEN
The REGEN EXEC follows:
    &CONTROL OFF
    FILEDEF CCAPRINT DISK REGEN CCAPRINT A
    FILEDEF CCAAUDIT DISK REGEN CCAAUDIT A
    FILEDEF CCATEMP M DSN WORK CCATEMP
    FILEDEF CCASNAP PRINTER
    FILEDEF CCAJRNL M DSN M204 JOURNAL
    FILEDEF CCAGEN M DSN M204 MERGED JOURNAL
    FILEDEF TAPE2 M DSN M204 DB1 DEFERF
    FILEDEF TAPE3 M DSN M204 DB1 DEFERV
    FILEDEF DB1 M DSN M204 DB1
    FILEDEF DB2 M DSN M204 DB2
    FILEDEF DB3 M DSN M204 DB3
    FILEDEF DB4 M DSN M204 DB4
    FILEDEF DB5 M DSN M204 DB5
    FILEDEF DUMPDB1 M DSN M204 DUMP DB1
    FILEDEF DUMPDB4 M DSN M204 DUMP DB4
    FILEDEF CCAIN DISK REGEN CCAIN A
    &STACK SYSOPT 128 LIBUFF 600
where the CCAIN file, REGEN CCAIN, is:
    NFILES=10,NDIR=10,LAUDIT=1,SPCORE=10000
    *
    * Open the file in deferred update mode
    * before the REGENERATE command
    *
    OPEN DB1,TAPE2,TAPE3
    REGENERATE
    FILE DB1 FROM DUMPDB1
    FILE DB2 TO LAST CHECKPOINT
    FILE DB3 TO UPDATE 3 OF 91.063 06:28:42.98
    FILE DB4 FROM DUMPDB4 TO CHECKPOINT 91.063 01:05:22.85
    FILE DB5 TO LAST UPDATE BEFORE 91.063 12:00:00.00
    END
    EOJ
REGENERATE command processing
REGENERATE command processing consists of the following steps:
- Model 204 parses the REGENERATE command and detects any syntax errors. If an error is encountered, the command is rejected and no further processing is performed.
- Model 204 performs a restore from the specified dump data set for any files where a FROM clause is specified on the REGENERATE command.
- Model 204 processes the input journal file. REGENERATE scans the input journal and reapplies all updates that were originally made to the files between the starting and stopping points. If any errors are detected, Model 204 terminates processing for the file(s) affected by the error.
REGENERATE examples
The starting point is the first start of an update unit that occurred after the "last updated" time stamp in the restored version of the database.
For example, Site A merges the daily journals into a Saturday-to-Saturday weekly collection and also performs full pack backups periodically during the week. A disk failure occurs on Thursday afternoon. To recover (after fixing the drive or moving the database to another location):
- Restore the volume from any one of the full volume backups (presumably the most recent; if that one does not work for some reason, use another backup).
- Run REGENERATE with CCAGEN as either the weekly merged journal or the weekly merged journal with the same Thursday's Online journal concatenated or merged into it.
- Model 204 examines the database and finds the date and time of the last update unit that completed before the backup was taken. This is the time that appears in the following message when the file is opened with update privileges:
M204.1203: FILE WAS LAST UPDATED ON ...
- Model 204 then scans the input journals (CCAGEN), and looks for the first update unit that began after this time. This becomes the "switch" that turns on the application of updates.
The starting point does not depend on a specific entry in the journal, and the database can be restored from any backup taken since the previous Saturday without affecting the media recovery run.
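The way the starting point acts as a "switch" can be modeled in a few lines of Python. This is a deliberately simplified sketch: update units are reduced to (begin, end) pairs with numeric times, and the real journal record format is not modeled.

```python
def units_to_apply(update_units, last_updated):
    """Select the update units that a roll forward would reapply.

    update_units: list of (begin, end) pairs in journal order.
    last_updated: the "last updated" time stamp of the restored file.
    """
    applying = False
    applied = []
    for unit in update_units:
        begin, _ = unit
        if not applying and begin > last_updated:
            applying = True   # the "switch": first unit that began after the time stamp
        if applying:
            applied.append(unit)
    return applied


# Hypothetical weekly journal with four update units.
week = [(10, 12), (40, 41), (55, 58), (70, 75)]
print(units_to_apply(week, last_updated=50))  # prints [(55, 58), (70, 75)]
print(units_to_apply(week, last_updated=20))  # prints [(40, 41), (55, 58), (70, 75)]
```

Both calls end with the same final set of updates applied to the file, which is why any backup taken within the period covered by the journal can be used.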
As another example, suppose that a file manager at Site A accidentally ran a suite of month-end updating procedures yesterday afternoon. The system manager needs to return the database to the state it was in before the month-end procedures were run.
The system manager can use the site's standard media recovery JCL, which restores the database from the last backup before yesterday's Online, and find the approximate time that the file manager made the mistake. This time is then used in the REGENERATE command. For example:
REGEN FILE X TO LAST UPDATE BEFORE 95.088 12:00:00.00
Model 204 reapplies all updates that were part of update units completed before the time specified in this command.
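The time stamps in these examples appear to use a two-digit year, a three-digit day of the year, and a time to hundredths of a second (for example, 95.088 is day 88 of 1995). Under that assumption, which is inferred from the examples rather than from the command reference, a stopping point can be formatted from a calendar date with a short Python helper. The helper names are hypothetical.

```python
from datetime import datetime


def regen_timestamp(moment: datetime) -> str:
    """Format a datetime as yy.ddd hh:mm:ss.th (layout inferred from the examples)."""
    hundredths = moment.microsecond // 10000
    return moment.strftime("%y.%j %H:%M:%S") + f".{hundredths:02d}"


def regen_before(filename: str, moment: datetime) -> str:
    """Build a REGEN command of the form shown in the example above."""
    return f"REGEN FILE {filename} TO LAST UPDATE BEFORE {regen_timestamp(moment)}"


print(regen_before("X", datetime(1995, 3, 29, 12, 0, 0)))
# prints REGEN FILE X TO LAST UPDATE BEFORE 95.088 12:00:00.00
```

Check the generated command against the REGENERATE syntax in the Model 204 Parameter and Command Reference before using it in a recovery job.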
Parameter settings
The following parameters must normally be set on User 0's parameter line for a media recovery job:
Parameter | Specifies the number of... |
---|---|
NFILES | File save areas to allocate. NFILES must be at least as large as the number of files to be regenerated in a single media recovery run. |
NDIR | File directory entries. NDIR must be at least as large as the number of files to be regenerated in a single media recovery run. |
Remember that additional files might be needed if the run performs other functions in addition to the REGENERATE.
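A quick check of a User 0 parameter line against the number of files to be regenerated might look like the following Python sketch. It assumes a simple comma-separated NAME=value line with numeric values, as in the examples in this chapter; a parameter that is not coded is reported, because the sketch does not know its default.

```python
def parse_user0(line):
    """Parse a comma-separated NAME=value parameter line into a dict of ints."""
    params = {}
    for item in line.split(","):
        name, _, value = item.partition("=")
        params[name.strip().upper()] = int(value)
    return params


def too_small(line, files_to_regenerate):
    """Return the parameters (NFILES, NDIR) that are too small for the run."""
    params = parse_user0(line)
    return [p for p in ("NFILES", "NDIR") if params.get(p, 0) < files_to_regenerate]


print(too_small("NFILES=10,NDIR=10,LAUDIT=1,SPCORE=10000", 5))  # prints []
print(too_small("NFILES=3,NDIR=10", 5))                         # prints ['NFILES']
```

The count passed in should include any additional files that the run opens for purposes other than the REGENERATE.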
Database consistency
Files recovered through the media recovery feature might not be logically consistent with other files. Each file specified in a REGENERATE command is recovered independently of other files specified in the same command. Therefore, Rocket Software recommends that you recover files that are logically connected (such as the DICTIONARY METADATA and DATALINK files) at the same time, and that you specify the same stopping point for all connected files whenever one file requires media recovery.
File discontinuities
The handling of discontinuities depends on their type and location. During media recovery, if Model 204 detects a discontinuity between the starting and stopping points for a file, that file is not recovered and an error message is displayed as follows:
*** M204.1407: discontinuity type DISCONTINUITY OCCURRED AT dd mmm yyyy hh:mm:ss.th FOR FILE filename
RESTART recovery discontinuity logging
RESTART (ROLL BACK/ROLL FORWARD) recovery logs discontinuities on completion for each file that it processes. These discontinuity logs record what was done to the file during the recovery, such as:
- File was rolled back to a checkpoint
- File was rolled back to a discontinuity
- File was rolled forward to an update unit
REGENERATE uses this information to allow it to cross RESTART recovery discontinuities.
Discontinuities that stop roll forward
The following discontinuities, if they appear between a file's starting and stopping points, cause the file to be deactivated. If you must perform any of the following functions, take a new backup immediately after the discontinuity is created, to ensure that no period of time elapses for which file updates cannot be recovered through media recovery:
- CREATE FILE command
- CREATEG command
- INITIALIZE command
- File is updated by a second job
- REGENERATE command (media recovery)
- RESTORE command
- RESTOREG command
- RESET FISTAT from physically broken
- RESET FRCVOPT if recovery logging is changed
- System initialization
Discontinuities that do not stop roll forward
The following discontinuities do not cause media recovery to fail for that file:
- Discontinuities that appear before a file's starting point or after its final stopping point.
- Discontinuities logged because two jobs updating the same file overlapped in time (provided that both jobs' journals are merged and present in CCAGEN). If journals from different jobs contain overlapping records, they must be merged using the MERGEJ utility. For information about using MERGEJ, see Merging journals.
- Restart recovery discontinuities.
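The rules in the two lists above can be summarized as a small decision function. This is an illustrative Python sketch only: the `Discontinuity` class, the kind labels `"RESTART"` and `"SECOND JOB"`, and the function `stops_roll_forward` are hypothetical names chosen for the example, not Model 204 identifiers, and the logic reflects only what is stated in this section.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Discontinuity:
    kind: str      # hypothetical label, e.g. "RESTORE", "RESTART", "SECOND JOB"
    at: datetime   # time at which the discontinuity was logged

def stops_roll_forward(d, start, stop,
                       overlapping_journals_merged=False,
                       restart_journals_in_ccagen=True):
    """Return True if discontinuity d prevents media recovery of the file."""
    # Discontinuities before the starting point or after the final
    # stopping point do not matter.
    if d.at < start or d.at > stop:
        return False
    # RESTART recovery discontinuities can be crossed, provided the
    # journals written by those recovery runs are included in CCAGEN.
    if d.kind == "RESTART":
        return not restart_journals_in_ccagen
    # Overlapping jobs are tolerated only when both journals were merged.
    if d.kind == "SECOND JOB":
        return not overlapping_journals_merged
    # Every other listed type (CREATE FILE, INITIALIZE, RESTORE, ...) stops recovery.
    return True
```

For example, a restore logged between the starting and stopping points returns `True`, which is the situation in which a new backup should have been taken immediately after the restore.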
Deferred update mode
You can run media recovery for files that were updated in deferred update mode during part or all of the time period you specify in the REGENERATE command. Open such files in deferred update mode before issuing the REGENERATE command; Model 204 then defers the index updates as it would in any other batch job.
For more information about deferring updates see Deferred update feature.
Additional operation considerations
Because some file discontinuities cannot be "crossed" by media recovery (that is, media recovery fails for the file if such a discontinuity appears between the file's starting and stopping points), take new file backups whenever one of these discontinuities is created. The list of actions that create these discontinuities appears in File discontinuities.
Take new file backups whenever a file is updated by the Model 204 File Load Utility (FLOD and FILELOAD commands), or by IFAM1, because neither writes a log of file updates to the journal. For information about recovery under IFAM1, see the Rocket Model 204 Host Language Interface Programming Guide.
Media recovery cannot "cross" RESTART recovery discontinuities unless journals created by those recovery runs are included in CCAGEN. Archive journals created by recovery jobs just as you would any other updating job's journal.
Media recovery error conditions
During media recovery, Model 204 detects and reports a variety of error conditions, which include:
- Syntax errors
- Individual file errors
- Critical errors
These major types of error conditions are described in the following sections.
Syntax errors
If Model 204 detects a syntax error in a REGENERATE command, it rejects the command and does not proceed with the media recovery run. In addition to the specific syntax message, the following message is displayed:
*** M204.1410: REGENERATE COMMAND REJECTED
Correct the command syntax and rerun the media recovery job.
Model 204 reports on syntax errors such as missing or invalid keywords or phrases. For example:
REGENERATE DB1 FROM DUMPDB1
(The FILE keyword is missing.)
or:
REGENERATE FILE DB1 TO LST UPDATE BEFORE 90.296 12:00:00.00
(The LAST keyword is misspelled.)
Syntax errors in multiple file recovery runs
A syntax error is reported if improper syntax is specified for a multiple-file recovery run. For example:
REGEN FILE DB1 FROM DUMPDB1 FILE DB2 FROM DUMPDB2 END
Individual file errors
If the media recovery run is being performed for multiple files and one or more individual files cannot be recovered, Model 204 still attempts to recover the remaining files. Some of the most likely file errors are listed here.
If an individual file error occurs, attempt to correct the file condition and rerun media recovery for any files not recovered by the run.
Note
User Language requests, procedures, or system commands that follow the REGENERATE command are run despite individual file errors. You might want to verify that a file has been regenerated correctly before performing any updates.
Errors in DD statements or FILEDEFs
During the process of restoring a dumped version of a file, errors can occur if the DD statements or FILEDEFs provided in the JCL or EXEC for the run do not correspond to the names of the dump data sets specified in the REGENERATE command. Be sure to provide definitions for all dump data sets specified in the media recovery job. If definitions do not correspond to DD statements or FILEDEFs, locate the missing dump data sets, correct the JCL or EXEC, and rerun the job for the affected files.
Setting NDIR and NFILES parameters too low
Errors also can occur if the NDIR or NFILES parameters are set too low on User 0's parameter line in the media recovery run:
- Set NDIR (number of file directory entries) to greater than or equal to the total number of files to be regenerated during the run. If additional files are to be opened during the run, increase the value of NDIR by the number of additional files.
- Set NFILES (number of file save areas) to greater than or equal to the total number of files to be regenerated by one REGENERATE command. If additional files are to be opened during the run, increase the value of NFILES to be greater than or equal to the largest number of files to be open simultaneously.
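The two sizing rules above amount to simple arithmetic. The following Python sketch shows one way to compute the minimum values before building the User 0 parameter line. The function names and arguments are hypothetical, and the sketch assumes that each file is named in only one REGENERATE command during the run.

```python
def min_ndir(files_per_regenerate, additional_files=0):
    """Minimum NDIR: every file regenerated in the run, plus any other files opened."""
    return sum(files_per_regenerate) + additional_files

def min_nfiles(files_per_regenerate, max_open_simultaneously=0):
    """Minimum NFILES: the largest single REGENERATE command, or the peak
    number of files open at once if that is larger."""
    largest_command = max(files_per_regenerate, default=0)
    return max(largest_command, max_open_simultaneously)

# A run with two REGENERATE commands (3 files, then 2 files) that also
# opens 1 additional file, with at most 4 files open at the same time.
ndir = min_ndir([3, 2], additional_files=1)
nfiles = min_nfiles([3, 2], max_open_simultaneously=4)
```

In this example the run needs NDIR of at least 6 and NFILES of at least 4.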
Errors specifying stopping points
Model 204 might not be able to find the stopping point for a specified file. In general, this means that a journal was omitted from the run, or the stopping point was specified incorrectly. Check the journal and audit trails to determine the correct stopping point. The AUDIT204 utility can be run for the journals being used to help determine the correct ID for the stopping point.
Missing journal errors
Model 204 might detect gaps in the journal. This occurs if the file was updated but no journal file was included for the specified time period. When this happens, Model 204 issues the following error message:
*** M204.1406: MISSING JOURNAL WAS DETECTED BETWEEN dd mmm yyyy hh:mm:ss.th AND dd mmm yyyy hh:mm:ss.th FOR FILE filename
Try to locate the missing journal(s) and rerun the job. If the missing journals are unavailable, you can partially recover the affected file by specifying a stopping point for that file that is prior to the gap in the journal.
Missing deferred update data set errors
Model 204 might detect that a file was in deferred update mode during part or all of the time period for which you are regenerating updates. If this happens, Model 204 issues the following error message:
*** M204.0169: BUG .. WHILE REAPPLYING TYPE 06 RF ENTRY FROM UPDATE UNIT nn TO FILE filename
Modify the media recovery job or EXEC to open the file in deferred update mode before the REGENERATE command.
For more information about deferring updates, see Deferred update feature.
Critical errors
Critical error conditions cause the entire media recovery run to terminate abnormally. The major error conditions in this category are discussed in the following sections.
EOF marker errors
During media recovery, Model 204 verifies that the journal blocks in the input journal data set (CCAGEN) are in chronological order. A sequence error indicates that the end of a journal was reached but that an EOF marker was not written. If a series of journals was concatenated, you can run the UTILJ utility to write an EOF marker on any journals that were not closed properly, and then rerun media recovery.
The UTILJ utility is described in the Rocket Model 204 System Manager's Guide.
Concatenation errors
If journals are concatenated in the wrong order, Model 204 cannot process the CCAGEN data set. Ensure that the concatenated input journals are in chronological order. If journals from different jobs contain overlapping records, they must be merged using the MERGEJ utility.
For information about using MERGEJ, see Merging journals.
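Before submitting the job, the order of the concatenated journals can be checked against the time span that each journal covers. The Python sketch below is a rough pre-check under the assumption that the start and end times of each journal are already known (for example, from site records); it does not read journal data sets, and the function name `check_concatenation` is hypothetical.

```python
from datetime import datetime

def check_concatenation(ranges):
    """ranges: (start, end) datetimes for each journal, in concatenation order.
    Returns a list of (index, problem) tuples; an empty list means no problem found."""
    problems = []
    for i in range(1, len(ranges)):
        prev_start, prev_end = ranges[i - 1]
        start, _end = ranges[i]
        if start < prev_start:
            # This journal begins before the one ahead of it: reorder the input.
            problems.append((i, "out of order"))
        elif start < prev_end:
            # The journals cover overlapping periods: merge them with MERGEJ.
            problems.append((i, "overlap"))
    return problems

day = lambda d, h: datetime(1990, 10, d, h)
ok = check_concatenation([(day(1, 8), day(1, 18)), (day(2, 8), day(2, 18))])
overlap = check_concatenation([(day(1, 8), day(1, 18)), (day(1, 12), day(1, 20))])
swapped = check_concatenation([(day(2, 8), day(2, 18)), (day(1, 8), day(1, 18))])
```

An "overlap" result points to the MERGEJ case described above, while "out of order" points to a concatenation sequence that should be corrected in the JCL or EXEC.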
Other sequencing errors
Similarly, if a sequence error is encountered during the first read of a volume in a multi-volume journal, the operator is notified and is allowed either to remount the correct volume and try again or to cancel the run.