Difference between revisions of "File Load utility"

From m204wiki
 
(9 intermediate revisions by 5 users not shown)
Line 1: Line 1:
 
==File Load utility overview==
 
==File Load utility overview==
<p>When a sequential data set contains the raw data for records to be stored in a <var class="product">Model&nbsp;204</var> file, the File Load utility generally provides the most efficient means of loading that data into <var class="product">Model&nbsp;204</var>.</p>
+
<p>
 +
When a sequential data set contains the raw data for records to be stored in a <var class="product">Model&nbsp;204</var> file, the File Load utility generally provides the most efficient means of loading that data into <var class="product">Model&nbsp;204</var>.</p>
 +
 
 
===File Load language===
 
===File Load language===
<p>The heart of this utility is a file load program that is constructed out of file load statements to describe the raw data to <var class="product">Model&nbsp;204</var>. The file load statements are elements of a separate file load language and are described in detail starting with the section, .</p>
+
<p>
<p>The file load language provides limited programming capability. If complicated edits or large amounts of data manipulation are required, consider using FLOD Exits, a preprocessing program, or a host language or User Language program to load the file. </p>
+
The heart of this utility is a file load program that is constructed out of file load statements to describe the raw data to <var class="product">Model&nbsp;204</var>. The file load statements are elements of a separate file load language and are described in detail starting with the section, .</p>
 +
<p>
 +
The file load language provides limited programming capability. If complicated edits or large amounts of data manipulation are required, consider using <var>FLOD</var> exits, a pre-processing program, or a host language or [[SOUL]] program to load the file. </p>
 +
 
 
===Input data for the File Load utility===
 
===Input data for the File Load utility===
<p>The input data for a File Load run can consist of fixed- or variable-length records in a sequential data set on magnetic tape or direct access storage. The data must be, for the most part, in EBCDIC character string format or in floating-point format. Data in binary, packed-decimal, and zoned-decimal formats can be loaded directly into a <var class="product">Model&nbsp;204</var> file by using special file load statements, which convert this data to character string format.  Because the File Load utility is intended to be used when efficiency of system resources is a primary concern, this utility does not log updates or any other information to the journal. Thus, you cannot use either system recovery nor media recovery to reapply updates from a File Load run. In general, always dump files just before a File Load run.  </p>
+
<p>
<p>Warning</p>
+
The input data for a File Load run can consist of fixed- or variable-length records in a sequential data set on magnetic tape or direct access storage. The data must be, for the most part, in EBCDIC character string format or in floating-point format. Data in binary, packed-decimal, and zoned-decimal formats can be loaded directly into a <var class="product">Model&nbsp;204</var> file by using special file load statements, which convert this data to character string format.  </p>
<p>Because the File Load utility is intended to be used when efficiency of system resources is a primary concern, this utility does not log updates or any other information to the journal. Thus, you cannot use either system recovery nor media recovery to reapply updates from a File Load run. In general, always backup the files just before a File Load run.  </p>
+
 
 +
<p class="warn"><b>Attention:</b>
 +
Because the File Load utility is intended to be used when efficiency of system resources is a primary concern, this utility does not log updates or any other information to the journal. Thus, you cannot use either system recovery or media recovery to reapply updates from a File Load run. In general, always dump files just before a File Load run.  </p>
 +
 
 
===Building a Large Object descriptor===
 
===Building a Large Object descriptor===
<p>If you need to build the Large Object descriptor yourself-for example, for an initial load of a file that contains Large Objects-the descriptor must be built correctly. It is exactly 32 bytes long, as follows.</p>
+
<p>
 +
If you need to build the [[Field design#BLOB, CLOB, and MINLOBE attributes|Large Object]] descriptor yourself (for example, for an initial load of a file that contains Large Objects), the descriptor must be built correctly. It is exactly 32 bytes long, as follows.</p>
 
<table>
 
<table>
 
<tr class="head">
 
<tr class="head">
Line 15: Line 24:
 
<th>Contains</th>
 
<th>Contains</th>
 
</tr>
 
</tr>
 +
 
<tr>
 
<tr>
 
<td align="right">0-3 </td>
 
<td align="right">0-3 </td>
 
<td>Value X'1B800000'</td>
 
<td>Value X'1B800000'</td>
 
</tr>
 
</tr>
 +
 
<tr>
 
<tr>
 
<td align="right">4-7</td>
 
<td align="right">4-7</td>
 
<td>Value X'00000000'</td>
 
<td>Value X'00000000'</td>
 
</tr>
 
</tr>
 +
 
<tr>
 
<tr>
 
<td align="right">8-15 </td>
 
<td align="right">8-15 </td>
 
<td>An 8-byte field containing a floating point value that is the number of bytes of Large Object data that follow; that is, the actual size of the Large Object value.</td>
 
<td>An 8-byte field containing a floating point value that is the number of bytes of Large Object data that follow; that is, the actual size of the Large Object value.</td>
 
</tr>
 
</tr>
 +
 
<tr>
 
<tr>
 
<td align="right">16-23 </td>
 
<td align="right">16-23 </td>
<td>An 8-byte field containing a floating point value that is the number of bytes reserved for this Large Object; that is, either the RESERVE value, or zero.</td>
+
<td>An 8-byte field containing a floating point value that is the number of bytes reserved for this Large Object; that is, either the <var>[[Field design#Storing and updating LOBs|RESERVE]]</var> value, or zero.</td>
 
</tr>
 
</tr>
 +
 
<tr>
 
<tr>
 
<td align="right">24-27 </td>
 
<td align="right">24-27 </td>
 
<td>Value X'000000FF'</td>
 
<td>Value X'000000FF'</td>
 
</tr>
 
</tr>
 +
 
<tr>
 
<tr>
 
<td align="right">28-31 </td>
 
<td align="right">28-31 </td>
 
<td>A binary value X'000000FF'</td>
 
<td>A binary value X'000000FF'</td>
 
</tr>
 
</tr>
 +
 
<tr>
 
<tr>
 
<td align="right">32-35</td>
 
<td align="right">32-35</td>
Line 44: Line 60:
 
</tr>
 
</tr>
 
</table>
 
</table>
 +
 
==File load methods==
 
==File load methods==
<p>The File Load utility is a two-phase process of <var class="product">Model&nbsp;204</var> file table updating. The two phases are:</p>
+
<p>
 +
The File Load utility is a two-phase process of <var class="product">Model&nbsp;204</var> file table updating. The two phases are:</p>
 
<ol>
 
<ol>
 
<li>Loading Table A and Table B and deferring the Table C and Table D updates to the deferred update data set or data sets.</li>
 
<li>Loading Table A and Table B and deferring the Table C and Table D updates to the deferred update data set or data sets.</li>
 +
 +
<li>Sorting and then applying the deferred index updates in Tables C and D.</li>
 
</ol>
 
</ol>
<b>Sorting, and then applying the deferred index updates in Tables C and D.</b>
+
<p>
<p>[[#Two ways to run File Load|Two ways to run File Load]] shows the relationships between these tasks.</p>
+
The File Load utility's two phases can be run in two ways: as a multistep job or as a single-step job:</p>
<p>The File Load utility's two phases can be run in two ways; as either a multistep or single-step job.</p>
 
===Two ways to run File Load===
 
<p>The File Load utility's two phases can be run in two ways: as either a multistep or single-step job.</p>
 
<b>Multistep File Load</b>
 
<p>In a multistep File Load, the two phases are run in three to seven separate job steps (under z/OS and z/VSE). The exact number of job steps in the multistep File Load depends on the attributes of the fields being updated and the number of deferred index update data sets specified in the JCL.</p>
 
<b>One-step File Load</b>
 
<p>In a one-step File Load, the two phases are run as one job step in which the second phase is invoked automatically.</p>
 
<b>Operating system dependencies</b>
 
<p>The one-step File Load is not available under z/VSE. </p>
 
<p>Under z/VM, both the multistep File Load and the one-step File Load are run by a single command (FASTLOAD) that drives a user-written EXEC through the desired number of steps.      </p>
 
<b>Comparing File Load methods</b>
 
<p>[[#Two ways to run File Load|Two ways to run File Load]] compares the multistep File Load with the one-step File Load: </p>
 
 
<ul>
 
<ul>
<li>Multistep File Load has between three and seven separate job steps and all the index updates are deferred to a variable-length deferred update data set. </li>
+
<li>In a <b>multistep File Load</b>, the two phases are run in three to seven separate job steps (under z/OS and z/VSE). The exact number of job steps in the multistep File Load depends on the attributes of the fields being updated and the number of deferred index update data sets specified in the JCL.</li>
<li>One-step File Load performs the same work as the multistep File Load, but it requires only one job step. </li>
+
 
 +
<li>In a <b>one-step File Load</b>, the two phases are run as one job step in which the second phase is invoked automatically.</li>
 
</ul>
 
</ul>
<p>For purposes of comparison, this work is divided into five tasks that correspond to the five job steps of the multistep File Load.  </p>
+
 
 +
====Operating system dependencies====
 +
<p>
 +
The one-step File Load is not available under z/VSE. </p>
 +
<p>
 +
Under z/VM, both the multistep File Load and the one-step File Load are run by a single command (<var>FASTLOAD</var>) that drives a user-written EXEC through the desired number of steps. </p>
 +
 
 +
====Comparing File Load methods====
 +
<p>
 +
The table below compares the multistep File Load with the one-step File Load: </p>
 +
<ul>
 +
<li>The multistep File Load has between three and seven separate job steps and all the index updates are deferred to a variable-length deferred update data set. </li>
 +
 
 +
<li>The one-step File Load performs the same work as the multistep File Load, but it requires only one job step. </li>
 +
</ul>
 +
<p>
 +
For purposes of comparison, this work is divided into five tasks that correspond to the five job steps of the multistep File Load.  </p>
 
<table>
 
<table>
 
<caption>Multistep file load vs. one-step file load</caption>
 
<caption>Multistep file load vs. one-step file load</caption>
Line 76: Line 101:
 
<th>One-step method task</th>
 
<th>One-step method task</th>
 
</tr>
 
</tr>
 +
 
<tr>
 
<tr>
 
<td align="right">1</td>
 
<td align="right">1</td>
Line 82: Line 108:
 
<td>As the file load program creates a variable-length deferred update record, that record is passed directly to an automatically invoked sort utility.</td>
 
<td>As the file load program creates a variable-length deferred update record, that record is passed directly to an automatically invoked sort utility.</td>
 
</tr>
 
</tr>
 +
 
<tr>
 
<tr>
 
<td align="right">2</td>
 
<td align="right">2</td>
Line 88: Line 115:
 
<td>Deferred update records are sorted automatically.</td>
 
<td>Deferred update records are sorted automatically.</td>
 
</tr>
 
</tr>
 +
 
<tr>
 
<tr>
 
<td align="right">2</td>
 
<td align="right">2</td>
 
<td align="right">3</td>
 
<td align="right">3</td>
<td>Sorted deferred update data set is read by the Z command (see "Z command"), which updates the index and writes any FRV entries into a fixed-length deferred update data set.</td>
+
<td>Sorted deferred update data set is read by the <var>[[Z command|Z]]</var> command, which updates the index and writes any <var>FRV</var> entries into a fixed-length deferred update data set.</td>
<td>Output from the sort is passed to the Z command. The Z command is issued automatically.</td>
+
<td>Output from the sort is passed to the <var>Z</var> command. The <var>Z</var> command is issued automatically.</td>
 
</tr>
 
</tr>
 +
 
<tr>
 
<tr>
 
<td align="right">2</td>
 
<td align="right">2</td>
 
<td align="right">4</td>
 
<td align="right">4</td>
<td>Sort package is invoked to sort the fixed-length FRV deferred update data set.</td>
+
<td>Sort package is invoked to sort the fixed-length <var>FRV</var> deferred update data set.</td>
<td>Z command accepts each deferred update record and applies the index entries to the file. If the Z command encounters an index entry for an FRV (for-each-value) field, the Z command builds an FRV deferred update record and passes the record to a second sort process.</td>
+
<td><var>Z</var> command accepts each deferred update record and applies the index entries to the file. If the <var>Z</var> command encounters an index entry for an <var>FRV</var> field, the Z command builds an <var>FRV</var> deferred update record and passes the record to a second sort process.</td>
 
</tr>
 
</tr>
 +
 
<tr>
 
<tr>
 
<td align="right">2</td>
 
<td align="right">2</td>
 
<td align="right">5</td>
 
<td align="right">5</td>
<td>Second Z command is issued. It adds the sorted FRV deferred updates to the index.</td>
+
<td>Second <var>Z</var> command is issued. It adds the sorted <var>FRV</var> deferred updates to the index.</td>
<td>Second Z command is automatically issued after the first Z command completes. The second Z command adds the deferred FRV entries to the index.</td>
+
<td>Second <var>Z</var> command is automatically issued after the first <var>Z</var> command completes. The second <var>Z</var> command adds the deferred <var>FRV</var> entries to the index.</td>
 
</tr>
 
</tr>
 
</table>
 
</table>
 +
 
===Using one-step File Load===
 
===Using one-step File Load===
<p>For small amounts of data, the one-step File Load can be much more efficient than the multistep procedure. The major reason for this efficiency is that each deferred update record has four fewer I/O operations performed on it in the one-step File Load. The one-step File Load bypasses:</p>
+
<p>
 +
For small amounts of data, the one-step File Load can be much more efficient than the multistep procedure. The major reason for this efficiency is that each deferred update record has four fewer I/O operations performed on it in the one-step File Load. The one-step File Load bypasses:</p>
 
<ul>
 
<ul>
<li>Output from the file load program</li>
+
<li>Output from the File Load program</li>
 
<li>Input to the sort utility</li>
 
<li>Input to the sort utility</li>
 
<li>Output from the sort utility</li>
 
<li>Output from the sort utility</li>
<li>Input to the Z commands </li>
+
<li>Input to the <var>Z</var> commands </li>
 
</ul>
 
</ul>
<p>Another reason is that by using the one-step File Load, the sort key is shorter if the file load program performs no deletions. (This is true only if no ORDERED fields are updated in the one-step File Load.) In addition, the one-step method is operationally easier, because only one job step is involved. However, a one-step File Load can be less efficient for large amounts of data because there is less memory available for sorting.</p>
+
<p>
 +
Another reason is that by using the one-step File Load, the sort key is shorter if the file load program performs no deletions. (This is true only if no <var>ORDERED</var> fields are updated in the one-step File Load.) In addition, the one-step method is operationally easier, because only one job step is involved. However, a one-step File Load can be less efficient for large amounts of data because there is less memory available for sorting.</p>
 +
 
 
===Using multistep File Load===
 
===Using multistep File Load===
<p>The advantages of the multistep procedure are:</p>
+
<p>
 +
The advantages of the multistep procedure are:</p>
 
<ul>
 
<ul>
 
<li>If a partially completed load aborts, there are at least two and as many as six points from which processing can be resumed, if dumps are taken after each file updating step. When using the one-step File Load, the load cannot be restarted at the point of failure. The load must be repeated from the beginning.</li>
 
<li>If a partially completed load aborts, there are at least two and as many as six points from which processing can be resumed, if dumps are taken after each file updating step. When using the one-step File Load, the load cannot be restarted at the point of failure. The load must be repeated from the beginning.</li>
 +
 
<li>Multistep load can be run in a much smaller memory region, because the first phase does not need to reserve spare core for two copies of the sort program and the sort work areas for each.</li>
 
<li>Multistep load can be run in a much smaller memory region, because the first phase does not need to reserve spare core for two copies of the sort program and the sort work areas for each.</li>
 +
 
<li>CPU usage is considerably less than for the one-step load.</li>
 
<li>CPU usage is considerably less than for the one-step load.</li>
<li>Because the sort(s) run in separate steps, much more memory can be made available to the sort. This can dramatically improve sort performance.   </li>
+
 
 +
<li>Because the sort(s) run in separate steps, much more memory can be made available to the sort. This can dramatically improve sort performance. </li>
 +
</ul>
 +
 
 +
==See also==
 +
These topics further describe how to use the File Load utility:
 +
<ul>
 +
<li>[[Multistep File Load utility]] </li>
 +
<li>[[Seven-Step File Load examples]] </li>
 +
<li>[[One-Step File Load utility]] </li>
 +
<li>[[File Load utility: FLOD and FILELOAD commands]] </li>
 +
<li>[[FLOD exits]] </li>
 +
<li>[[File loading techniques]] </li>
 
</ul>
 
</ul>
<p>&nbsp;</p>
+
 
[[Category:File manager]]
+
[[Category:File loading and reorganization]]
[[Category:File management]]
 

Latest revision as of 20:38, 15 December 2014

File Load utility overview

When a sequential data set contains the raw data for records to be stored in a Model 204 file, the File Load utility generally provides the most efficient means of loading that data into Model 204.

File Load language

The heart of this utility is a file load program that is constructed out of file load statements to describe the raw data to Model 204. The file load statements are elements of a separate file load language and are described in detail starting with the section, .

The file load language provides limited programming capability. If complicated edits or large amounts of data manipulation are required, consider using FLOD exits, a pre-processing program, or a host language or SOUL program to load the file.

Input data for the File Load utility

The input data for a File Load run can consist of fixed- or variable-length records in a sequential data set on magnetic tape or direct access storage. The data must be, for the most part, in EBCDIC character string format or in floating-point format. Data in binary, packed-decimal, and zoned-decimal formats can be loaded directly into a Model 204 file by using special file load statements, which convert this data to character string format.

Attention: Because the File Load utility is intended to be used when efficiency of system resources is a primary concern, this utility does not log updates or any other information to the journal. Thus, you cannot use either system recovery or media recovery to reapply updates from a File Load run. In general, always dump files just before a File Load run.

Building a Large Object descriptor

If you need to build the Large Object descriptor yourself (for example, for an initial load of a file that contains Large Objects), the descriptor must be built correctly. It is exactly 32 bytes long, as follows.

Bytes Contains
0-3 Value X'1B800000'
4-7 Value X'00000000'
8-15 An 8-byte field containing a floating point value that is the number of bytes of Large Object data that follow; that is, the actual size of the Large Object value.
16-23 An 8-byte field containing a floating point value that is the number of bytes reserved for this Large Object; that is, either the RESERVE value, or zero.
24-27 Value X'000000FF'
28-31 A binary value X'000000FF'
32-35 A binary value that indicates the number of bytes of Large Object data on the following input lines; that is, all lines except possibly the last will contain this much data.
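
The two 8-byte floating point fields in the descriptor hold byte counts. The sketch below shows one way such a count could be encoded, assuming the fields use the IBM hexadecimal long floating-point format (one sign bit, a 7-bit excess-64 base-16 exponent, and a 56-bit fraction). The source does not state the floating-point format, and the function name hfp_long is purely illustrative, so treat this as an assumption to verify against your own data before relying on it.

```python
def hfp_long(n: int) -> bytes:
    """Encode a non-negative integer as an 8-byte IBM hexadecimal
    long floating-point value (assumed format, see the lead-in)."""
    if n < 0 or n >= 1 << 56:
        raise ValueError("value must fit exactly in a 56-bit fraction")
    if n == 0:
        return bytes(8)                      # true zero is all zero bytes
    digits = (n.bit_length() + 3) // 4       # number of hex digits in n
    fraction = n << (4 * (14 - digits))      # left-justify in 14 hex digits
    exponent = 64 + digits                   # excess-64, base-16 exponent
    return bytes([exponent]) + fraction.to_bytes(7, "big")


# Example: a Large Object of 1000 bytes with nothing reserved.
size_field = hfp_long(1000)
reserve_field = hfp_long(0)
print(size_field.hex().upper(), reserve_field.hex().upper())
```

Keeping the encoding in a small helper makes it easy to swap in a different representation if the fields turn out to use another format on your system.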

File load methods

The File Load utility is a two-phase process of Model 204 file table updating. The two phases are:

  1. Loading Table A and Table B and deferring the Table C and Table D updates to the deferred update data set or data sets.
  2. Sorting and then applying the deferred index updates in Tables C and D.
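
The two phases above can be pictured with a small conceptual model. This is only a toy illustration of "load now, index later": the lists and dictionary stand in for Table B, the deferred update data set, and the index, and none of the names correspond to real Model 204 structures or APIs.

```python
from collections import defaultdict


def phase_one(raw_records, indexed_fields):
    """Store each record and defer its index entries instead of applying them."""
    table_b = []    # stands in for the stored record data
    deferred = []   # stands in for the deferred update data set
    for recno, record in enumerate(raw_records):
        table_b.append(record)
        for field in indexed_fields:
            if field in record:
                deferred.append((field, record[field], recno))
    return table_b, deferred


def phase_two(deferred):
    """Sort the deferred entries, then apply them to the index in sorted order."""
    index = defaultdict(list)
    for field, value, recno in sorted(deferred):
        index[(field, value)].append(recno)
    return dict(index)


raw = [
    {"NAME": "SMITH", "CITY": "BOSTON"},
    {"NAME": "JONES", "CITY": "BOSTON"},
    {"NAME": "SMITH", "CITY": "AUSTIN"},
]
table_b, deferred = phase_one(raw, ["NAME", "CITY"])
index = phase_two(deferred)
print(index[("CITY", "BOSTON")])
```

Sorting before applying means all entries for the same field and value arrive together, which is the reason the index updates are deferred rather than applied record by record.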

The File Load utility's two phases can be run in two ways: as a multistep job or as a single-step job:

  • In a multistep File Load, the two phases are run in three to seven separate job steps (under z/OS and z/VSE). The exact number of job steps in the multistep File Load depends on the attributes of the fields being updated and the number of deferred index update data sets specified in the JCL.
  • In a one-step File Load, the two phases are run as one job step in which the second phase is invoked automatically.

Operating system dependencies

The one-step File Load is not available under z/VSE.

Under z/VM, both the multistep File Load and the one-step File Load are run by a single command (FASTLOAD) that drives a user-written EXEC through the desired number of steps.

Comparing File Load methods

The table below compares the multistep File Load with the one-step File Load:

  • The multistep File Load has between three and seven separate job steps and all the index updates are deferred to a variable-length deferred update data set.
  • The one-step File Load performs the same work as the multistep File Load, but it requires only one job step.

For purposes of comparison, this work is divided into five tasks that correspond to the five job steps of the multistep File Load.

Multistep file load vs. one-step file load
Phase | Job step or task | Multistep method task | One-step method task
1 | 1 | File load program loads and/or updates records and creates a variable-length deferred update data set for all index entries. | As the file load program creates a variable-length deferred update record, that record is passed directly to an automatically invoked sort utility.
2 | 2 | Sort package is invoked to sort the variable-length deferred update data set. | Deferred update records are sorted automatically.
2 | 3 | Sorted deferred update data set is read by the Z command, which updates the index and writes any FRV entries into a fixed-length deferred update data set. | Output from the sort is passed to the Z command. The Z command is issued automatically.
2 | 4 | Sort package is invoked to sort the fixed-length FRV deferred update data set. | Z command accepts each deferred update record and applies the index entries to the file. If the Z command encounters an index entry for an FRV field, the Z command builds an FRV deferred update record and passes the record to a second sort process.
2 | 5 | Second Z command is issued. It adds the sorted FRV deferred updates to the index. | Second Z command is automatically issued after the first Z command completes. The second Z command adds the deferred FRV entries to the index.

Using one-step File Load

For small amounts of data, the one-step File Load can be much more efficient than the multistep procedure. The major reason for this efficiency is that each deferred update record has four fewer I/O operations performed on it in the one-step File Load. The one-step File Load bypasses:

  • Output from the File Load program
  • Input to the sort utility
  • Output from the sort utility
  • Input to the Z commands

Another reason is that by using the one-step File Load, the sort key is shorter if the file load program performs no deletions. (This is true only if no ORDERED fields are updated in the one-step File Load.) In addition, the one-step method is operationally easier, because only one job step is involved. However, a one-step File Load can be less efficient for large amounts of data because there is less memory available for sorting.
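
The saving from the four bypassed transfers listed above grows linearly with the number of deferred update records. A minimal sketch of that arithmetic follows; the function name and the sample record count are illustrative and do not come from the Model 204 documentation, and actual physical I/O counts also depend on blocking and buffering.

```python
# The four record transfers that the one-step File Load bypasses.
BYPASSED_TRANSFERS = (
    "output from the file load program",
    "input to the sort utility",
    "output from the sort utility",
    "input to the Z commands",
)


def record_transfers_saved(deferred_records: int) -> int:
    """Record-level transfers avoided by the one-step method."""
    return deferred_records * len(BYPASSED_TRANSFERS)


# Hypothetical load that generates one million deferred update records.
print(record_transfers_saved(1_000_000))
```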

Using multistep File Load

The advantages of the multistep procedure are:

  • If a partially completed load aborts, there are at least two and as many as six points from which processing can be resumed, if dumps are taken after each file updating step. When using the one-step File Load, the load cannot be restarted at the point of failure. The load must be repeated from the beginning.
  • Multistep load can be run in a much smaller memory region, because the first phase does not need to reserve spare core for two copies of the sort program and the sort work areas for each.
  • CPU usage is considerably less than for the one-step load.
  • Because the sort(s) run in separate steps, much more memory can be made available to the sort. This can dramatically improve sort performance.

See also

These topics further describe how to use the File Load utility: