File Load utility

From m204wiki
Revision as of 02:27, 12 December 2014 by JAL (talk | contribs) (more conversion cleanup)

File Load utility overview

When a sequential data set contains the raw data for records to be stored in a Model 204 file, the File Load utility generally provides the most efficient means of loading that data into Model 204.

File Load language

The heart of this utility is a file load program, built from file load statements that describe the raw data to Model 204. The file load statements are elements of a separate file load language and are described in detail in their own section.

The file load language provides limited programming capability. If complicated edits or large amounts of data manipulation are required, consider using FLOD exits, a pre-processing program, or a host language or SOUL program to load the file.

Input data for the File Load utility

The input data for a File Load run can consist of fixed- or variable-length records in a sequential data set on magnetic tape or direct access storage. The data must be, for the most part, in EBCDIC character string format or in floating-point format. Data in binary, packed-decimal, and zoned-decimal formats can be loaded directly into a Model 204 file by using special file load statements, which convert this data to character string format.

Attention: Because the File Load utility is intended for situations where efficient use of system resources is a primary concern, it does not log updates or any other information to the journal. As a result, you cannot use system recovery or media recovery to reapply updates from a File Load run. As a rule, dump files just before a File Load run.

Building a Large Object descriptor

If you need to build the Large Object descriptor yourself (for example, for an initial load of a file that contains Large Objects) the descriptor must be built correctly. It is exactly 32 bytes long, as follows.

Bytes | Contains
0-3 | Value X'1B800000'
4-7 | Value X'00000000'
8-15 | An 8-byte field containing a floating-point value that is the number of bytes of Large Object data that follow; that is, the actual size of the Large Object value.
16-23 | An 8-byte field containing a floating-point value that is the number of bytes reserved for this Large Object; that is, either the RESERVE value, or zero.
24-27 | Value X'000000FF'
28-31 | A binary value X'000000FF'
32-35 | A binary value that indicates the number of bytes of Large Object data on the following input lines; that is, all lines except possibly the last will contain this much data.
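
As an illustration only, the layout above could be packed with a short Python sketch like the following. Several things here are assumptions rather than statements from this page: the fields are taken to be contiguous (4, 4, 8, 8, 4, 4, and 4 bytes), the 8-byte floating-point fields are taken to be IBM hexadecimal long floats, the binary fields are taken to be big-endian, and the function names are hypothetical. The sketch emits all fields listed through offset 35; how that relates to the stated descriptor length is not settled here.

```python
import struct

def to_hfp_long(n: int) -> bytes:
    """Encode a non-negative integer as an 8-byte IBM hexadecimal float.

    Assumption: the descriptor's floating-point fields use this format.
    """
    if n < 0 or n >= 1 << 56:
        raise ValueError("value must be non-negative and fit in 56 bits")
    if n == 0:
        return bytes(8)
    hex_digits = (n.bit_length() + 3) // 4     # base-16 digits needed for n
    fraction = n << (4 * (14 - hex_digits))    # left-justify in the 56-bit fraction
    exponent = 64 + hex_digits                 # excess-64, base-16 exponent, sign bit 0
    return bytes([exponent]) + fraction.to_bytes(7, "big")

def build_lob_descriptor(data_len: int, reserve: int, line_len: int) -> bytes:
    """Concatenate the fields in the order the table lists them (hypothetical helper)."""
    parts = [
        bytes.fromhex("1B800000"),     # bytes 0-3
        bytes.fromhex("00000000"),     # bytes 4-7
        to_hfp_long(data_len),         # bytes 8-15: actual Large Object size
        to_hfp_long(reserve),          # bytes 16-23: RESERVE value or zero
        bytes.fromhex("000000FF"),     # bytes 24-27, value as listed in the table
        bytes.fromhex("000000FF"),     # bytes 28-31
        struct.pack(">I", line_len),   # bytes 32-35: data bytes per input line
    ]
    return b"".join(parts)

descriptor = build_lob_descriptor(data_len=256, reserve=0, line_len=80)
print(descriptor.hex().upper())
```

Verify any descriptor built this way against a known-good load before relying on it, since the number format and byte order above are not confirmed by this page.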

File load methods

The File Load utility is a two-phase process of Model 204 file table updating. The two phases are:

  1. Loading Table A and Table B and deferring the Table C and Table D updates to the deferred update data set or data sets.
  2. Sorting and then applying the deferred index updates in Tables C and D.
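
The two phases can be pictured with a toy Python model. This is a conceptual sketch only: the lists and dictionaries merely stand in for the record tables, the deferred update data set, and the index tables, and none of the names below correspond to real Model 204 structures or commands.

```python
from collections import defaultdict

def phase_one(raw_records, indexed_fields):
    """Store each record immediately, but only write down its index updates."""
    stored = []     # stands in for the loaded record data
    deferred = []   # stands in for the deferred update data set
    for rec in raw_records:
        recno = len(stored)
        stored.append(rec)
        for field in indexed_fields:
            if field in rec:
                deferred.append((field, rec[field], recno))
    return stored, deferred

def phase_two(deferred):
    """Sort the deferred updates, then apply them to the index in one pass."""
    index = defaultdict(list)   # stands in for the index tables
    for field, value, recno in sorted(deferred):
        index[(field, value)].append(recno)
    return dict(index)

records = [
    {"NAME": "SMITH", "CITY": "BOSTON"},
    {"NAME": "JONES", "CITY": "BOSTON"},
    {"NAME": "SMITH", "CITY": "ALBANY"},
]
stored, deferred = phase_one(records, ["CITY", "NAME"])
index = phase_two(deferred)
print(index[("CITY", "BOSTON")])
```

Because the updates are sorted before they are applied, all entries for the same field and value arrive together, so each index entry is built once rather than revisited for every record.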

The two phases of the File Load utility can be run either as a multistep job or as a single-step job:

  • In a multistep File Load, the two phases are run in three to seven separate job steps (under z/OS and z/VSE). The exact number of job steps in the multistep File Load depends on the attributes of the fields being updated and the number of deferred index update data sets specified in the JCL.
  • In a one-step File Load, the two phases are run as one job step in which the second phase is invoked automatically.

Operating system dependencies

The one-step File Load is not available under z/VSE.

Under z/VM, both the multistep File Load and the one-step File Load are run by a single command (FASTLOAD) that drives a user-written EXEC through the desired number of steps.

Comparing File Load methods

The table below compares the multistep File Load with the one-step File Load:

  • The multistep File Load has between three and seven separate job steps and all the index updates are deferred to a variable-length deferred update data set.
  • The one-step File Load performs the same work as the multistep File Load, but it requires only one job step.

For purposes of comparison, this work is divided into five tasks that correspond to the five job steps of the multistep File Load.

Multistep file load vs. one-step file load
Phase | Job step or task | Multistep method task | One-step method task
1 | 1 | File load program loads and/or updates records and creates a variable-length deferred update data set for all index entries. | As the file load program creates a variable-length deferred update record, that record is passed directly to an automatically invoked sort utility.
2 | 2 | Sort package is invoked to sort the variable-length deferred update data set. | Deferred update records are sorted automatically.
2 | 3 | Sorted deferred update data set is read by the Z command, which updates the index and writes any FRV entries into a fixed-length deferred update data set. | Output from the sort is passed to the Z command. The Z command is issued automatically.
2 | 4 | Sort package is invoked to sort the fixed-length FRV deferred update data set. | Z command accepts each deferred update record and applies the index entries to the file. If the Z command encounters an index entry for an FRV field, the Z command builds an FRV deferred update record and passes the record to a second sort process.
2 | 5 | Second Z command is issued. It adds the sorted FRV deferred updates to the index. | Second Z command is automatically issued after the first Z command completes. The second Z command adds the deferred FRV entries to the index.

Using one-step File Load

For small amounts of data, the one-step File Load can be much more efficient than the multistep procedure. The major reason for this efficiency is that each deferred update record has four fewer I/O operations performed on it in the one-step File Load. The one-step File Load bypasses:

  • Output from the File Load program
  • Input to the sort utility
  • Output from the sort utility
  • Input to the Z commands
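
The saving scales linearly with the number of deferred update records. The short Python sketch below works through the arithmetic; the record count is a hypothetical figure chosen for illustration, not a measurement.

```python
# The four per-record operations that the one-step File Load bypasses.
BYPASSED_OPERATIONS = [
    "output from the File Load program",
    "input to the sort utility",
    "output from the sort utility",
    "input to the Z commands",
]

def io_operations_saved(deferred_records: int) -> int:
    """Return how many per-record I/O operations the one-step method avoids."""
    return len(BYPASSED_OPERATIONS) * deferred_records

# Hypothetical load that generates one million deferred update records.
print(io_operations_saved(1_000_000))
```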

A second reason is that the one-step File Load uses a shorter sort key when the file load program performs no deletions. (This is true only if no ORDERED fields are updated in the one-step File Load.) The one-step method is also operationally easier, because only one job step is involved. However, a one-step File Load can be less efficient for large amounts of data, because less memory is available for sorting.

Using multistep File Load

The advantages of the multistep procedure are:

  • If a partially completed load aborts, there are at least two and as many as six points from which processing can be resumed, if dumps are taken after each file updating step. When using the one-step File Load, the load cannot be restarted at the point of failure. The load must be repeated from the beginning.
  • A multistep load can be run in a much smaller memory region, because the first phase does not need to reserve spare memory for two copies of the sort program and the sort work areas for each.
  • CPU usage is considerably less than for the one-step load.
  • Because the sort(s) run in separate steps, much more memory can be made available to the sort. This can dramatically improve sort performance.