File Load utility overview
When a sequential data set contains the raw data for records to be stored in a Model 204 file, the File Load utility generally provides the most efficient means of loading that data into Model 204.
File Load language
The heart of this utility is a file load program, constructed out of file load statements, that describes the raw data to Model 204. The file load statements are elements of a separate file load language and are described in detail in later sections.
The file load language provides limited programming capability. If complicated edits or large amounts of data manipulation are required, consider using FLOD Exits, a preprocessing program, or a host language or User Language program to load the file.
Input data for the File Load utility
The input data for a File Load run can consist of fixed- or variable-length records in a sequential data set on magnetic tape or direct access storage. The data must be, for the most part, in EBCDIC character string format or in floating-point format. Data in binary, packed-decimal, and zoned-decimal formats can be loaded directly into a Model 204 file by using special file load statements, which convert this data to character string format.
Warning
Because the File Load utility is intended to be used when efficiency of system resources is a primary concern, this utility does not log updates or any other information to the journal. Thus, you cannot use either system recovery or media recovery to reapply updates from a File Load run. In general, always back up your files just before a File Load run.
Building a Large Object descriptor
If you need to build the Large Object descriptor yourself (for example, for an initial load of a file that contains Large Objects), the descriptor must be built correctly. It is exactly 32 bytes long, as follows.
Bytes | Contains |
---|---|
0-3 | Value X'1B800000' |
4-7 | Value X'00000000' |
8-15 | An 8-byte field containing a floating-point value that is the number of bytes of Large Object data that follow; that is, the actual size of the Large Object value. |
16-23 | An 8-byte field containing a floating-point value that is the number of bytes reserved for this Large Object; that is, either the RESERVE value, or zero. |
24-27 | Value X'000000FF' |
28-31 | A binary value that indicates the number of bytes of Large Object data on the following input lines; that is, all lines except possibly the last will contain this much data. |
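As a concrete illustration of the layout above, the following minimal sketch packs a descriptor in Python. The function name is hypothetical and this is an illustration only: in particular, it packs the two 8-byte length fields as big-endian IEEE doubles, whereas a real load on z/OS must supply them in the floating-point representation Model 204 actually expects.

```python
import struct

def build_lob_descriptor(lob_size, reserve=0, bytes_per_line=0):
    """Hypothetical helper: pack the 32-byte Large Object descriptor layout.

    Illustration only: the two 8-byte length fields are packed here as
    big-endian IEEE doubles; a real z/OS load would have to supply them
    in the floating-point format Model 204 expects.
    """
    descriptor = (
        bytes.fromhex("1B800000")             # bytes 0-3:   value X'1B800000'
        + bytes.fromhex("00000000")           # bytes 4-7:   value X'00000000'
        + struct.pack(">d", float(lob_size))  # bytes 8-15:  actual size of the Large Object value
        + struct.pack(">d", float(reserve))   # bytes 16-23: RESERVE value, or zero
        + bytes.fromhex("000000FF")           # bytes 24-27: value X'000000FF'
        + struct.pack(">I", bytes_per_line)   # bytes 28-31: bytes of Large Object data per input line
    )
    assert len(descriptor) == 32              # the descriptor is exactly 32 bytes
    return descriptor

# Example: a 1 MB Large Object, no RESERVE, 32,000 bytes of data per input line
desc = build_lob_descriptor(lob_size=1048576, bytes_per_line=32000)
```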
File load methods
The File Load utility updates the Model 204 file tables in a two-phase process. The two phases, illustrated by the sketch after the following list, are:
- Loading Table A and Table B and deferring the Table C and Table D updates to the deferred update data set or data sets.
- Sorting and then applying the deferred index updates in Tables C and D.
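For readers who want a concrete picture, here is a minimal toy model of the two phases, assuming nothing about Model 204 internals: dictionaries stand in for Table B and the index, and a plain Python list stands in for the deferred update data set.

```python
# Conceptual sketch of deferred index updating; the names are illustrative,
# not Model 204 structures.
def phase1_load(records):
    """Phase 1: store record data and defer every index entry."""
    table_b = {}              # stands in for Table B (record data)
    deferred = []             # stands in for the deferred update data set
    for recno, fields in enumerate(records):
        table_b[recno] = fields
        for name, value in fields.items():
            deferred.append((name, value, recno))   # deferred index entry
    return table_b, deferred

def phase2_apply(deferred):
    """Phase 2: sort the deferred entries, then apply them to the index."""
    index = {}                # stands in for Tables C and D (the index)
    for name, value, recno in sorted(deferred):
        index.setdefault((name, value), []).append(recno)
    return index

records = [{"NAME": "SMITH", "CITY": "BOSTON"},
           {"NAME": "JONES", "CITY": "BOSTON"}]
table_b, deferred = phase1_load(records)
index = phase2_apply(deferred)
```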
The comparison under "Two ways to run File Load" below shows the relationships between these tasks.
Two ways to run File Load
The File Load utility's two phases can be run in two ways: as either a multistep or single-step job.
Multistep File Load
In a multistep File Load, the two phases are run in three to seven separate job steps (under z/OS and z/VSE). The exact number of job steps in the multistep File Load depends on the attributes of the fields being updated and the number of deferred index update data sets specified in the JCL.
One-step File Load
In a one-step File Load, the two phases are run as one job step in which the second phase is invoked automatically.
Operating system dependencies
The one-step File Load is not available under z/VSE.
Under z/VM, both the multistep File Load and the one-step File Load are run by a single command (FASTLOAD) that drives a user-written EXEC through the desired number of steps.
Comparing File Load methods
Two ways to run File Load compares the multistep File Load with the one-step File Load:
- Multistep File Load has between three and seven separate job steps, and all the index updates are deferred to a variable-length deferred update data set.
- One-step File Load performs the same work as the multistep File Load, but it requires only one job step.
For purposes of comparison, this work is divided into five tasks that correspond to the five job steps of the multistep File Load.
Phase | Job step or task | Multistep method task | One-step method task |
---|---|---|---|
1 | 1 | File load program loads and/or updates records and creates a variable-length deferred update data set for all index entries | As the file load program creates a variable-length deferred update record, that record is passed directly to an automatically invoked sort utility. |
2 | 2 | Sort package is invoked to sort the variable-length deferred update data set. | Deferred update records are sorted automatically. |
2 | 3 | Sorted deferred update data set is read by the Z command (see "Z command"), which updates the index and writes any FRV entries into a fixed-length deferred update data set. | Output from the sort is passed to the Z command. The Z command is issued automatically. |
2 | 4 | Sort package is invoked to sort the fixed-length FRV deferred update data set. | Z command accepts each deferred update record and applies the index entries to the file. If the Z command encounters an index entry for an FRV (for-each-value) field, the Z command builds an FRV deferred update record and passes the record to a second sort process. |
2 | 5 | Second Z command is issued. It adds the sorted FRV deferred updates to the index. | Second Z command is automatically issued after the first Z command completes. The second Z command adds the deferred FRV entries to the index. |
Using one-step File Load
For small amounts of data, the one-step File Load can be much more efficient than the multistep procedure. The major reason for this efficiency is that each deferred update record has four fewer I/O operations performed on it in the one-step File Load. The one-step File Load bypasses:
- Output from the file load program
- Input to the sort utility
- Output from the sort utility
- Input to the Z commands
Another reason is that the sort key is shorter in the one-step File Load if the file load program performs no deletions. (This is true only if no ORDERED fields are updated in the one-step File Load.) In addition, the one-step method is operationally easier, because only one job step is involved. However, a one-step File Load can be less efficient for large amounts of data, because there is less memory available for sorting.
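Assuming nothing beyond what the list above states, the following sketch (hypothetical names, plain Python) contrasts the two data flows: the multistep flow materializes the deferred update records in intermediate work files, so each record is written and read several times, while the one-step flow hands the records directly to an in-process sort.

```python
import tempfile

def multistep_flow(deferred_records):
    """Multistep flow: deferred records pass through intermediate work files.

    Each record is written by the load step, read back and rewritten by the
    sort step, and read again by the apply step (the four extra I/O
    operations listed above).
    """
    with tempfile.TemporaryFile(mode="w+") as load_output:
        for rec in deferred_records:
            load_output.write(rec + "\n")          # output from the file load program
        load_output.seek(0)
        to_sort = load_output.read().splitlines()  # input to the sort utility
    with tempfile.TemporaryFile(mode="w+") as sort_output:
        for rec in sorted(to_sort):
            sort_output.write(rec + "\n")          # output from the sort utility
        sort_output.seek(0)
        return sort_output.read().splitlines()     # input to the apply (Z command) step

def one_step_flow(deferred_records):
    """One-step flow: each record goes straight to an in-process sort."""
    return sorted(deferred_records)

# Both flows produce the same sorted deferred updates; the one-step
# version simply skips the intermediate reads and writes.
assert multistep_flow(["B 1", "A 2"]) == one_step_flow(["B 1", "A 2"])
```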
Using multistep File Load
The advantages of the multistep procedure are:
- If a partially completed load aborts, there are at least two and as many as six points from which processing can be resumed, if dumps are taken after each file updating step. When using the one-step File Load, the load cannot be restarted at the point of failure. The load must be repeated from the beginning.
- Multistep load can be run in a much smaller memory region, because the first phase does not need to reserve spare core for two copies of the sort program and the sort work areas for each.
- CPU usage is considerably less than for the one-step load.
- Because the sort(s) run in separate steps, much more memory can be made available to the sort. This can dramatically improve sort performance.