File Load utility
Revision as of 01:16, 12 December 2014
File Load utility overview
When a sequential data set contains the raw data for records to be stored in a Model 204 file, the File Load utility generally provides the most efficient means of loading that data into Model 204.
File Load language
The heart of this utility is a file load program that is constructed out of file load statements to describe the raw data to Model 204. The file load statements are elements of a separate file load language and are described in detail starting with the section, .
The file load language provides limited programming capability. If complicated edits or large amounts of data manipulation are required, consider using FLOD exits, a pre-processing program, or a host language or SOUL program to load the file.
Input data for the File Load utility
The input data for a File Load run can consist of fixed- or variable-length records in a sequential data set on magnetic tape or direct access storage. The data must be, for the most part, in EBCDIC character string format or in floating-point format. Data in binary, packed-decimal, and zoned-decimal formats can be loaded directly into a Model 204 file by using special file load statements, which convert this data to character string format.
Attention: Because the File Load utility is intended for situations where efficient use of system resources is a primary concern, it does not log updates or any other information to the journal. As a result, you cannot use either system recovery or media recovery to reapply updates from a File Load run. In general, always dump files just before a File Load run.
Building a Large Object descriptor
If you need to build the Large Object descriptor yourself (for example, for an initial load of a file that contains Large Objects), the descriptor must be built correctly. It is exactly 32 bytes long, as follows:
Bytes | Contains |
---|---|
0-3 | Value X'1B800000' |
4-7 | Value X'00000000' |
8-15 | An 8-byte field containing a floating point value that is the number of bytes of Large Object data that follow; that is, the actual size of the Large Object value. |
16-23 | An 8-byte field containing a floating point value that is the number of bytes reserved for this Large Object; that is, either the RESERVE value, or zero. |
24-27 | Value X'000000FF' |
28-31 | A binary value X'000000FF' |
32-35 | A binary value that indicates the number of bytes of Large Object data on the following input lines; that is, all lines except possibly the last will contain this much data. |
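The two 8-byte floating-point fields are the least obvious part of the descriptor to build outside the mainframe. The following is a minimal Python sketch, assuming those fields use the IBM hexadecimal floating-point (long) format native to the platforms Model 204 runs on. That format assumption and the helper name `hfp8` are illustrative and do not come from this page, so verify them against your Model 204 documentation before relying on the output.

```python
def hfp8(n: int) -> bytes:
    """Encode a non-negative integer as an 8-byte IBM hexadecimal float.

    Assumed layout: 1 sign bit, 7-bit base-16 exponent (excess 64),
    56-bit fraction. Hypothetical helper, not part of Model 204.
    """
    if n < 0 or n >= 1 << 56:
        raise ValueError("value must be non-negative and fit in 56 bits")
    if n == 0:
        return bytes(8)  # true zero is eight zero bytes
    exponent = 0
    while n >= 16 ** exponent:  # find the smallest power of 16 greater than n
        exponent += 1
    # Left-align the integer within the 14 hex digits of the fraction.
    fraction = n * 16 ** (14 - exponent)
    return bytes([64 + exponent]) + fraction.to_bytes(7, "big")


# Size and reserve fields for a 1000-byte Large Object with no RESERVE value.
size_field = hfp8(1000)
reserve_field = hfp8(0)
print(size_field.hex().upper())     # 433E800000000000
print(reserve_field.hex().upper())  # 0000000000000000
```

The sketch handles only whole, non-negative byte counts, which is all the two descriptor fields need.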
File load methods
The File Load utility is a two-phase process of Model 204 file table updating. The two phases are:
- Loading Table A and Table B and deferring the Table C and Table D updates to the deferred update data set or data sets.
- Sorting, and then applying the deferred index updates in Tables C and D.
The section "Two ways to run File Load" shows the relationships between these tasks.
The File Load utility's two phases can be run in two ways: as a multistep job or as a single-step job.
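As a rough illustration of the two phases described above, the following Python sketch models them with in-memory structures. It is a toy model only: the dictionaries stand in for Table B and for the Table C and Table D index, the list stands in for the deferred update data set, Table A is not modeled, and all function and field names are hypothetical.

```python
from collections import defaultdict


def phase_one(raw_records, indexed_fields):
    """Store each record and collect its index updates instead of applying them."""
    table_b = {}   # record number -> stored record (stand-in for Table B)
    deferred = []  # stand-in for the deferred update data set
    for recno, record in enumerate(raw_records):
        table_b[recno] = record
        for field in indexed_fields:
            if field in record:
                deferred.append((field, record[field], recno))
    return table_b, deferred


def phase_two(deferred):
    """Sort the deferred updates, then apply them to the index in one pass."""
    index = defaultdict(list)  # stand-in for the Table C / Table D index
    for field, value, recno in sorted(deferred):
        index[(field, value)].append(recno)
    return dict(index)


records = [
    {"NAME": "SMITH", "CITY": "BOSTON"},
    {"NAME": "JONES", "CITY": "BOSTON"},
    {"NAME": "SMITH", "CITY": "AUSTIN"},
]
table_b, deferred = phase_one(records, ["NAME", "CITY"])
index = phase_two(deferred)
print(index[("CITY", "BOSTON")])  # [0, 1]
```

Sorting before applying means that all updates for the same field and value arrive together, which is the point of deferring them.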
Two ways to run File Load
The File Load utility's two phases can be run in two ways: as either a multistep or single-step job.
Multistep File Load
In a multistep File Load, the two phases are run in three to seven separate job steps (under z/OS and z/VSE). The exact number of job steps in the multistep File Load depends on the attributes of the fields being updated and the number of deferred index update data sets specified in the JCL.
One-step File Load
In a one-step File Load, the two phases are run as one job step in which the second phase is invoked automatically.
Operating system dependencies
The one-step File Load is not available under z/VSE.
Under z/VM, both the multistep File Load and the one-step File Load are run by a single command (FASTLOAD) that drives a user-written EXEC through the desired number of steps.
Comparing File Load methods
The following list and table compare the multistep File Load with the one-step File Load:
- Multistep File Load has between three and seven separate job steps and all the index updates are deferred to a variable-length deferred update data set.
- One-step File Load performs the same work as the multistep File Load, but it requires only one job step.
For purposes of comparison, this work is divided into five tasks that correspond to the five job steps of the multistep File Load.
Phase | Job step or task | Multistep method task | One-step method task |
---|---|---|---|
1 | 1 | File load program loads and/or updates records and creates a variable-length deferred update data set for all index entries. | As the file load program creates a variable-length deferred update record, that record is passed directly to an automatically invoked sort utility. |
2 | 2 | Sort package is invoked to sort the variable-length deferred update data set. | Deferred update records are sorted automatically. |
2 | 3 | Sorted deferred update data set is read by the Z command (see "Z command"), which updates the index and writes any FRV entries into a fixed-length deferred update data set. | Output from the sort is passed to the Z command. The Z command is issued automatically. |
2 | 4 | Sort package is invoked to sort the fixed-length FRV deferred update data set. | Z command accepts each deferred update record and applies the index entries to the file. If the Z command encounters an index entry for an FRV (for-each-value) field, the Z command builds an FRV deferred update record and passes the record to a second sort process. |
2 | 5 | Second Z command is issued. It adds the sorted FRV deferred updates to the index. | Second Z command is automatically issued after the first Z command completes. The second Z command adds the deferred FRV entries to the index. |
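Tasks 2 through 5 in the table can be sketched as two sort-and-apply passes. The Python below is a simplified model under stated assumptions: deferred update records are represented as `(field, value, record number)` tuples, and `first_z` and `second_z` are hypothetical stand-ins for the two Z commands, not actual Model 204 interfaces.

```python
def first_z(sorted_updates, frv_fields):
    """Apply sorted index updates; emit an FRV record for each FRV field entry."""
    index = {}
    frv_deferred = []
    for field, value, recno in sorted_updates:
        index.setdefault((field, value), []).append(recno)
        if field in frv_fields:
            frv_deferred.append((field, value))
    return index, frv_deferred


def second_z(sorted_frv):
    """Add each distinct FRV value once, relying on the sorted input order."""
    frv_values = {}
    for field, value in sorted_frv:
        values = frv_values.setdefault(field, [])
        if not values or values[-1] != value:
            values.append(value)
    return frv_values


# Output of task 1: unsorted deferred index updates.
deferred = [
    ("CITY", "BOSTON", 0),
    ("NAME", "SMITH", 0),
    ("CITY", "BOSTON", 1),
    ("CITY", "AUSTIN", 2),
]
index, frv_deferred = first_z(sorted(deferred), frv_fields={"CITY"})  # tasks 2-3
frv_values = second_z(sorted(frv_deferred))                           # tasks 4-5
print(frv_values)  # {'CITY': ['AUSTIN', 'BOSTON']}
```

In the multistep method, each call to `sorted` and each Z function would correspond to a separate job step reading and writing a data set. In the one-step method, the same records are handed from one stage to the next without intermediate data sets.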
Using one-step File Load
For small amounts of data, the one-step File Load can be much more efficient than the multistep procedure. The major reason for this efficiency is that each deferred update record has four fewer I/O operations performed on it in the one-step File Load. The one-step File Load bypasses:
- Output from the file load program
- Input to the sort utility
- Output from the sort utility
- Input to the Z commands
Another reason is that the one-step File Load uses a shorter sort key when the file load program performs no deletions. (This is true only if no ORDERED fields are updated in the one-step File Load.) In addition, the one-step method is operationally easier, because only one job step is involved. However, a one-step File Load can be less efficient for large amounts of data because there is less memory available for sorting.
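To give a sense of scale for the I/O saving, the following Python sketch multiplies the four bypassed operations by the number of deferred update records. It counts one logical operation per record per bypassed stage, as the text does. Real physical I/O counts depend on blocking and buffering, so treat the result as an upper-level estimate only. The names used are illustrative.

```python
# The four per-record I/O operations the one-step File Load bypasses.
BYPASSED_OPERATIONS = (
    "output from the file load program",
    "input to the sort utility",
    "output from the sort utility",
    "input to the Z commands",
)


def logical_io_saved(deferred_record_count: int) -> int:
    """Return the logical I/O operations avoided by the one-step method."""
    return len(BYPASSED_OPERATIONS) * deferred_record_count


print(logical_io_saved(2_000_000))  # 8000000
```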
Using multistep File Load
The advantages of the multistep procedure are:
- If a partially completed load aborts, there are at least two and as many as six points from which processing can be resumed, if dumps are taken after each file updating step. When using the one-step File Load, the load cannot be restarted at the point of failure. The load must be repeated from the beginning.
- Multistep load can be run in a much smaller memory region, because the first phase does not need to reserve spare core for two copies of the sort program and the sort work areas for each.
- CPU usage is considerably less than for the one-step load.
- Because the sort(s) run in separate steps, much more memory can be made available to the sort. This can dramatically improve sort performance.