File design: Difference between revisions
(57 intermediate revisions by 6 users not shown) | |||
Line 1: | Line 1: | ||
<div class="toclimit-4"> | |||
Model 204 provides many features that allow you, as the file designer, a great deal of flexibility in creating files that optimize the efficiency of Model 204 at your site. | Model 204 provides many features that allow you, as the file designer, a great deal of flexibility in creating files that optimize the efficiency of Model 204 at your site. | ||
This topic covers the options available to you when designing files, along with the associated performance issues. Understanding these options, and understanding the database and application needs of your site, helps you select the correct options for file design, and the associated data structures | This topic covers the options available to you when designing files, along with the associated performance issues. Understanding these options, and understanding the database and application needs of your site, helps you select the correct options for file design, and the associated data structures described in: | ||
:[[ | :[[Record design|Record design]] | ||
:[[Field | :[[Field group design]] | ||
:[[Field design]] | |||
As the file designer, your responsibility is to make informed decisions concerning these issues based on knowledge of the data and required results. In so doing, you can have a dramatic effect on the efficiency with which Model 204 runs. | As the file designer, your responsibility is to make informed decisions concerning these issues based on knowledge of the data and required results. In so doing, you can have a dramatic effect on the efficiency with which Model 204 runs. | ||
==Design and planning overview== | |||
==Design and | |||
<p>When creating a <var class="product">Model 204</var> file, you can:</p> | <p>When creating a <var class="product">Model 204</var> file, you can:</p> | ||
<ul> | <ul> | ||
<li>Use the default file settings. (See [[ Creating | <li>Use the default file settings. (See the <var>[[Creating a file#CREATE command|CREATE]]</var> command for information about its default parameter values.) <var class="product">Model 204</var> has been designed so that the default options meet the needs of most users; however it is useful to be aware of what you can customize.</li> | ||
<li>Customize the file for your environment with settings for organization, recovery and security.</li> | <li>Customize the file for your environment with settings for organization, recovery and security.</li> | ||
</ul> | </ul> | ||
<p>If you customize your file, consider the following questions, discussed in this | <p> | ||
If you customize your file, consider the following questions, discussed in this topic:</p> | |||
<ul> | <ul> | ||
<li>Which file organization type provides the most efficient data retrieval for my needs?</li> | <li>Which file organization type provides the most efficient data retrieval for my needs?</li> | ||
<li>What file recovery options are needed for this file?</li> | <li>What file recovery options are needed for this file?</li> | ||
<li>What level of security does this file need?</li> | <li>What level of security does this file need?</li> | ||
<li>What file sizing options do I need?</li> | <li>What file sizing options do I need?</li> | ||
<li>How do I want to set the non-resettable parameters?</li> | <li>How do I want to set the non-resettable parameters?</li> | ||
</ul> | </ul> | ||
==Specifying a file organization== | |||
During the design process, once you determine the requirement for a new file, you can look at the file design. The <var>[[FILEORG parameter|FILEORG]]</var> parameter lets you choose a file organization. | |||
This parameter controls two things: | |||
<ul> | |||
< | <li>The physical organization of the files [[Table B (File architecture)|Table B]].</li> | ||
<li>A variety of options concerning how record numbers are managed and the physical data storage.</li> | |||
</ul> | |||
===File types=== | |||
Entry order is the default. See the <var>FILEORG</var> parameter description for more information. | |||
The file organization you choose determines the organization of the file's Table B. The following types of file organization are available: | |||
<ul> | <ul> | ||
<li>Entry order</li> | <li>Entry order</li> | ||
<li>Unordered</li> | <li>Unordered</li> | ||
<li>Sorted</li> | <li>Sorted</li> | ||
<li>Hash key | <li>Hash key </li> | ||
</ul> | </ul> | ||
<p>Entry order, unordered, sorted, and hash key files can be used in any combination to make up a database. Any <var class="product">Model 204</var> file can be cross-referenced to any other <var class="product">Model 204</var> file regardless of the type of organization. </p> | <p> | ||
=== | Entry order, unordered, sorted, and hash key files can be used in any combination to make up a database. Any <var class="product">Model 204</var> file can be cross-referenced to any other <var class="product">Model 204</var> file regardless of the type of organization. </p> | ||
===Choosing a file organization=== | |||
<p>Choose Table B organization based upon these factors:</p> | <p>Choose Table B organization based upon these factors:</p> | ||
<ul> | <ul> | ||
Line 62: | Line 64: | ||
</ul> | </ul> | ||
==== Entry | ====Entry order files==== | ||
<p> | |||
In ordinary <var class="product">Model 204</var> files, each new record is simply stored in the available space after any other records in Table B. When a set of records is retrieved from such files, the records are processed in the order in which they were stored. This order is <var class="term">chronological</var>. These files are known as <b>entry order files</b>. </p> | |||
<p> | |||
Entry order files are the most widely used <var class="product">Model 204</var> files. They provide full inverted file capabilities but do not support the unordered file, sorted file, and hash key file facilities described later in this section. If you require sorted output, you can include the <var>SORT</var> statement in a SOUL request, use the <var>IFSORT</var> call in a <var class="product">[[Media:M204 HLIReference V75.pdf|Host Language Interface]]</var> program, or specify the <var>ORDERED</var> attribute.</p> | |||
<p> | |||
In an entry order file, new records are added only after other records in the file. A fixed number of pages is assigned to Table B of the file by the <var>CREATE</var> command. You can increase the number of pages any time, subject only to the limits of the total disk space available (or 16M pages). You can decrease them at any time to the value of the <var>[[BHIGHPG parameter|BHIGHP]]</var> parameter plus one. When an attempt is made to store another record in a file that has insufficient space available at the end of the file, the transaction is backed out if possible, and an error message is issued; the file might also be marked full. This occurs even though numerous records in the file might have been deleted. </p> | |||
In an entry order file, space made available by deleting records can be reused for the expansion of existing Table B records, but cannot be reused for adding new records to the file. These records must be added in whatever space is available after all the other records in the file. | In an entry order file, space made available by deleting records can be reused for the expansion of existing Table B records, but cannot be reused for adding new records to the file. These records must be added in whatever space is available after all the other records in the file. | ||
==== Unordered | ====Unordered files (X'20')==== | ||
<p> | |||
<p>The unordered file organization makes most deleted Table B space available for reuse for new records. A queue of pages formed from deleted Table B space is maintained and called the reuse queue. Most of the available space on these pages is reused before the file is considered full. </p> | The unordered file organization makes most deleted Table B space available for reuse for new records. A queue of pages formed from deleted Table B space is maintained and called the reuse queue. Most of the available space on these pages is reused before the file is considered full. </p> | ||
<p>Assuming random deletions, the longer a page has been on the reuse queue, the more likely it is that additional space has become available on it. When a record is to be added, <var class="product">Model 204</var> first tries the current Table B appends page. The appends page is the page to which records are currently being added; its page number is equal to the value of the BHIGHPG parameter. Page space is evaluated in the following sequence:</p> | <p> | ||
Assuming random deletions, the longer a page has been on the reuse queue, the more likely it is that additional space has become available on it. When a record is to be added, <var class="product">Model 204</var> first tries the current Table B appends page. The appends page is the page to which records are currently being added; its page number is equal to the value of the <var>BHIGHPG</var> parameter. Page space is evaluated in the following sequence:</p> | |||
<ol> | <ol> | ||
<li>If the appends page has insufficient space, the oldest page from the head of the reuse queue is examined.</li> | <li>If the appends page has insufficient space, the oldest page from the head of the reuse queue is examined.</li> | ||
<li>If that page is full, <var class="product">Model 204</var> removes it from the reuse queue and tries the next page. </li> | <li>If that page is full, <var class="product">Model 204</var> removes it from the reuse queue and tries the next page. </li> | ||
<li>If the reuse queue is empty, <var class="product">Model 204</var> starts a new page (BHIGHPG+1). </li> | <li>If the reuse queue is empty, <var class="product">Model 204</var> starts a new page (BHIGHPG+1). </li> | ||
<li>If no more pages are available, <var class="product">Model 204</var> looks for available space on as many as sixteen randomly selected already-used pages</li> | <li>If no more pages are available, <var class="product">Model 204</var> looks for available space on as many as sixteen randomly selected already-used pages</li> | ||
<li>If this fails the file is considered full. </li> | <li>If this fails the file is considered full. </li> | ||
</ol> | </ol> | ||
<p>In order to use space on any page for a new record, a record number must be available on that page. To be useful, an unordered file must either be designed with many extra record numbers per page, or must use the reuse record number (RRN) option. See [[# | =====Reusing record numbers===== | ||
<p> | |||
<p>For unordered files the BREUSE parameter specifies how much space a page must have available before being placed on the reuse queue. BREUSE is expressed as a percentage of the space available over and above BRESERVE. </p> | In order to use space on any page for a new record, a record number must be available on that page. To be useful, an unordered file must either be designed with many extra record numbers per page, or must use the reuse record number (RRN) option. See [[#Reuse Record Number (X'04')|Reuse Record Number (X'04')]].</p> | ||
<p>The BRESERVE parameter specifies the size of an area at the end of each Table B page in which new records cannot be started. This area is reserved for the expansion of existing records on the page. When deletions occur, if there are enough additional bytes of free space available beyond BRESERVE to provide the free space required for reuse (parameter BREUSE), and at least one record number is available, the page is added to the reuse queue. </p> | |||
<p>You can set BREUSE for files with any file organization, but the reuse queue is built for unordered files only. Files, other than unordered, may have BREUSE set but no reuse queue present.</p> | =====Planning open space===== | ||
<p> | |||
<p>The reuse queue is a single queue of pages chained together by a 4-byte pointer at the bottom of each page. The total number of pages on the queue, along with pointers to the top and bottom of the queue, are maintained on the FPL page. </p> | For unordered files, the <var>[[BREUSE parameter|BREUSE]]</var> parameter specifies how much space a page must have available before being placed on the reuse queue. <var>BREUSE</var> is expressed as a percentage of the space available over and above <var>[[BRESERVE parameter|BRESERVE]]</var>. </p> | ||
<p>Searching for pages with reusable record numbers begins at the top of the queue. As pages qualify for the reuse queue due to deleted records, those pages are added to the bottom of the queue. Starting a search at the top of the queue, where the pages are the oldest, allows additional time for constrained space to be released and helps guarantee that a record can be reused on that page. </p> | <p> | ||
<p>A page is added to the reuse queue if free space on the page is larger than:</p> | The <var>BRESERVE</var> parameter specifies the size of an area at the end of each Table B page in which new records cannot be started. This area is reserved for the expansion of existing records on the page. When deletions occur, if there are enough additional bytes of free space available beyond <var>BRESERVE</var> to provide the free space required for reuse (parameter <var>BREUSE</var>), and at least one record number is available, the page is added to the reuse queue. </p> | ||
<p> | |||
You can set <var>BREUSE</var> for files with any file organization, but the reuse queue is built for unordered files only. Files, other than unordered, may have <var>BREUSE</var> set but no reuse queue present.</p> | |||
=====Maintaining the reuse queue===== | |||
<p> | |||
The reuse queue is a single queue of pages chained together by a 4-byte pointer at the bottom of each page. The total number of pages on the queue, along with pointers to the top and bottom of the queue, are maintained on the FPL page. </p> | |||
<p> | |||
Searching for pages with reusable record numbers begins at the top of the queue. As pages qualify for the reuse queue due to deleted records, those pages are added to the bottom of the queue. Starting a search at the top of the queue, where the pages are the oldest, allows additional time for constrained space to be released and helps guarantee that a record can be reused on that page. </p> | |||
<p> | |||
A page is added to the reuse queue if free space on the page is larger than:</p> | |||
<p class="code">(PAGE SIZE - 8 - BRESERVE) * BREUSE/100 | <p class="code">(PAGE SIZE - 8 - BRESERVE) * BREUSE/100 | ||
</p> | </p> | ||
<p>When a page is removed from the queue, indicating that it no longer contains reusable record numbers, the reuse queue chain pointer on that page is set to zero and the pointer on the previous page in the queue is adjusted accordingly.</p> | <p> | ||
When a page is removed from the queue, indicating that it no longer contains reusable record numbers, the reuse queue chain pointer on that page is set to zero and the pointer on the previous page in the queue is adjusted accordingly.</p> | |||
<p>A page is added to the reuse queue when deletions occur on a Table B page and the page has BREUSE space available over and above BRESERVE, which is the space reserved only for expanding existing records. New records cannot be started in BRESERVE space. Also, there must be at least one available record number on the page. A page that meets this criteria as an eligible page.</p> | |||
<p>Once an eligible page is on the reuse queue it remains there even if it becomes ineligible, until one of the following events occurs:</p> | =====Understanding eligible and ineligible reuse queue pages===== | ||
<p> | |||
A page is added to the reuse queue when deletions occur on a Table B page and the page has <var>BREUSE</var> space available over and above <var>BRESERVE</var>, which is the space reserved only for expanding existing records. New records cannot be started in <var>BRESERVE</var> space. Also, there must be at least one available record number on the page. A page that meets this criteria as an eligible page.</p> | |||
<p> | |||
Once an eligible page is on the reuse queue it remains there even if it becomes ineligible, until one of the following events occurs:</p> | |||
<ul> | <ul> | ||
<li>During updating, if a page is selected from the reuse queue that is ineligible, it is removed from the queue. </li> | <li>During updating, if a page is selected from the reuse queue that is ineligible, it is removed from the queue. </li> | ||
<li>You issue a BLDREUSE command | |||
<li>You issue a <var>[[BLDREUSE command|BLDREUSE]]</var> command. </li> | |||
<li>The file is reorganized.</li> | <li>The file is reorganized.</li> | ||
</ul> | </ul> | ||
==== Sorted | =====Generating ineligible reuse queue pages===== | ||
<p> | |||
<p>If most processing or output is required in order by a certain key, consider a sorted file organization. Sorted files store and can retrieve records in order by a sort key field. Thus, when processing records in order by the sort key, a large amount of Online sorting can be eliminated. Because a sort key can be alphabetic, it also provides a convenient method of doing alphabetical ranging on a single key without doing a direct search on the data.</p> | Adjustment to file parameters <var>BREUSE</var> and/or <var>BRESERVE</var> does not immediately remove a page from the reuse queue although the new values may make the page ineligible.</p> | ||
<p>The sorted organization in Table B is similar to ISAM (Indexed Sequential Access Method), where there are master areas and overflow areas. Because of this, use a sorted organization only when at least half of the total records can be loaded in sorted order and the data is not volatile. | <p> | ||
<p>See [[Sorted | If the page is eligible with room to add the entire new record (physical base or extension record) to the page, the addition is made and the page remains on the reuse queue, even if it is now ineligible. </p> | ||
====Sorted files (X'01')==== | |||
<p> | |||
If most processing or output is required in order by a certain key, consider a sorted file organization. Sorted files store and can retrieve records in order by a sort key field. Thus, when processing records in order by the sort key, a large amount of Online sorting can be eliminated. Because a sort key can be alphabetic, it also provides a convenient method of doing alphabetical ranging on a single key without doing a direct search on the data.</p> | |||
<p> | |||
The sorted organization in Table B is similar to ISAM (Indexed Sequential Access Method), where there are master areas and overflow areas. Because of this, use a sorted organization only when at least half of the total records can be loaded in sorted order and the data is not volatile. </p> | |||
<ul> | |||
<li>In a sorted file, the X'01' bit of the <var>FILEORG</var> parameter is <b>on</b>. | |||
</ul> | |||
<p> | |||
See [[Sorted files]] for more information.</p> | |||
====Hash key files (X'08')==== | |||
<p> | |||
If all records contain a unique or fairly unique key and if most retrievals are done on the basis of that key alone, consider a hash key file organization. In a hash key file, <var class="product">Model 204</var> hashes directly to the Table B record, which can drastically reduce disk I/O. However, because Table B is ordered randomly in a hash key file, Table B must be made large enough to allow for the expected growth of the file; it cannot be expanded without reloading the entire file. </p> | |||
<ul> | |||
<li>In a hash key file, the X'08' bit of the <var>FILEORG</var> parameter is <b>on</b>. | |||
</ul> | |||
See [[Hash key files]] for more information. | |||
=== | === Fine tuning a file type === | ||
In addition to the basic structure of the file's Table B, there are four other settings which can change the way in which Model 204 stores and processed records. | In addition to the basic structure of the file's Table B, there are four other settings which can change the way in which Model 204 stores and processed records. | ||
<ul> | <ul> | ||
<li> | <li>X'04' — Reuse Record Numbers</li> | ||
<li> | <li>X'80' — Optimized Field Extraction Files</li> | ||
<li> | <li>X'100' — Enhanced Data Handling Files</li> | ||
<li> | <li>X'200' — Large files (up to 48 million records)</li> | ||
</ul> | </ul> | ||
<p>The | ====Reuse Record Number (X'04')==== | ||
<p> | |||
The Reuse Record Number (RRN) feature allows <var class="product">Model 204</var> to reuse the record number of a deleted record for a new record, if a record number is available on the Table B page to which the new record is added. Consider the following points during the design of applications that allow record number reuse: </p> | |||
<ul> | <ul> | ||
<li>Record number order might not correspond to chronological order. If record A is added to a file, and record B is added later, record A might have a record number higher than that of record B.</li> | <li>Record number order might not correspond to chronological order. If record A is added to a file, and record B is added later, record A might have a record number higher than that of record B.</li> | ||
<li>RRN is optional for unordered, sorted, and hash key files. It is activated by turning on the X'04' bit of the FILEORG parameter. Setting the RRN bit without setting the sorted or hash bits automatically sets the unordered file bit of FILEORG. | |||
<li>RRN is optional for unordered, sorted, and hash key files. It is activated by turning on the X'04' bit of the <var>FILEORG</var> parameter. Setting the RRN bit without setting the sorted or hash bits automatically sets the unordered file bit of <var>FILEORG</var>. </li> | |||
</ul> | </ul> | ||
Line 137: | Line 176: | ||
===== Using INVISIBLE fields in RRN files ===== | ===== Using INVISIBLE fields in RRN files ===== | ||
<p>Special care must be taken in the use of INVISIBLE fields within files that reuse record numbers. Because INVISIBLE fields reside only in the index portion of a file, they are not automatically deleted from <var class="product">Model 204</var> files when the records with which they are associated have been deleted. </p> | <p> | ||
<p>To avoid the possibility of new records inheriting the INVISIBLE fields of records previously deleted, it is necessary to delete explicitly a record's INVISIBLE fields at the time that the record itself is deleted. Use the | Special care must be taken in the use of <var>INVISIBLE</var> fields within files that reuse record numbers. Because <var>INVISIBLE</var> fields reside only in the index portion of a file, they are not automatically deleted from <var class="product">Model 204</var> files when the records with which they are associated have been deleted. </p> | ||
<p> | |||
To avoid the possibility of new records inheriting the <var>INVISIBLE</var> fields of records previously deleted, it is necessary to delete explicitly a record's INVISIBLE fields at the time that the record itself is deleted. Use the SOUL statement <code>DELETE <i>fieldname</i> = <i>value</i></code> or the Host Language Interface <var>IFDVAL</var> function to perform this task.</p> | |||
=====Deleting records properly in RRN files===== | =====Deleting records properly in RRN files===== | ||
<p>The | <p> | ||
The SOUL statement <var>DELETE ALL RECORDS IN</var> and the Host Language Interface <var>IFDSET</var> function make record sets inaccessible (that is, they logically delete the records), but neither physically remove the records from the file nor release the related record numbers. Therefore, neither the space that contains the deleted records nor the record numbers can be reused. Only record numbers freed by the <var>DELETE RECORD</var> statement or the <var>IFDREC</var> function can be reused. </p> | |||
<p>When deleting records in a found set, you must use COMMIT RELEASE after deleting the records to remove records from the found set and to make the record numbers available to be reused. The COMMIT RELEASE (rather than a simple COMMIT) is required to prevent future record-locking conflicts between the deleted records and any new records being stored.</p> | |||
=====Committing deleted records===== | |||
<p> | |||
When deleting records in a found set, you must use <var>COMMIT RELEASE</var> after deleting the records to remove records from the found set and to make the record numbers available to be reused. The <var>COMMIT RELEASE</var> (rather than a simple <var>COMMIT</var>) is required to prevent future record-locking conflicts between the deleted records and any new records being stored.</p> | |||
=====Using the RRN option in hash key files===== | =====Using the RRN option in hash key files===== | ||
<p>The efficiency of the RRN option with hash key files is that there is no reuse queue, because one is not built | <p> | ||
The efficiency of the RRN option with hash key files is that there is no reuse queue, because one is not built and therefore not maintained for hash key files. In a hash key file, when <var class="product">Model 204</var> hashes to a page and there is an available record number, the record is stored there. If the RRN option is defined for the file, <var class="product">Model 204</var> can reuse record numbers from previously deleted records on the hashed to page. </p> | |||
=====RRN files and deferred update mode===== | =====RRN files and deferred update mode===== | ||
<p><var class="product">Model 204</var> customer support suggests that you do not place RRN files in deferred update mode. Using deferred update mode might cause your site to lose updates without any warning. FLOD jobs are particularly susceptible, because they automatically open files for deferred update and require additional steps to override this behavior. However, any RRN file used in deferred update mode is in danger.</p> | <p> | ||
<p>The following scenario demonstrates what can happen in a FLOD job:</p> | <var class="product">Model 204</var> customer support suggests that you do not place RRN files in deferred update mode. Using deferred update mode might cause your site to lose updates without any warning. <var>FLOD</var> jobs are particularly susceptible, because they automatically open files for deferred update and require additional steps to override this behavior. However, any RRN file used in deferred update mode is in danger.</p> | ||
<p> | |||
The following scenario demonstrates what can happen in a <var>FLOD</var> job:</p> | |||
<table> | <table> | ||
<tr> | <tr class="head"> | ||
<th>Input record</th> | <th>Input record</th> | ||
<th>Key</th> | <th>Key</th> | ||
<th>Action </th> | <th>Action </th> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td rowspan="2">A</td> | <td rowspan="2">A</td> | ||
Line 166: | Line 215: | ||
</td> | </td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td colspan="3">many records later....</td> | <td colspan="3">many records later....</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td rowspan="2">J</td> | <td rowspan="2">J</td> | ||
Line 181: | Line 233: | ||
</td> | </td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td>K</td> | <td>K</td> | ||
Line 192: | Line 246: | ||
</td> | </td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td>L</td> | <td>L</td> | ||
Line 201: | Line 256: | ||
</td> | </td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td colspan="3">many records later....</td> | <td colspan="3">many records later....</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td>X</td> | <td>X</td> | ||
Line 214: | Line 271: | ||
</tr> | </tr> | ||
</table> | </table> | ||
<p>When records are deleted with Key = 1, <var class="product">Model 204</var> immediately updates the Existence Bit Pattern (EBP) in Table D. However, because the file is in deferred update mode, the index is not updated </p> | <p> | ||
<p>When input record J is added, it finds space on the Table B page previously occupied by records with Key = 1. So input records J, K, and L are added in the available slots. Again, index updates are deferred.</p> | When records are deleted with Key = 1, <var class="product">Model 204</var> immediately updates the Existence Bit Pattern (EBP) in Table D. However, because the file is in deferred update mode, the index is not updated </p> | ||
<p>Later, input record X deletes all records with key = 1. It finds the records that originally had key = 1 (the index has not been updated). It also finds that these records do exist (because the EBP was updated when records J, K, and L were added), so they are deleted. When the file is checked after the FLOD job, you find that input records J, K, and L were never added to the file.</p> | <p> | ||
When input record J is added, it finds space on the Table B page previously occupied by records with Key = 1. So input records J, K, and L are added in the available slots. Again, index updates are deferred.</p> | |||
<p> | |||
Later, input record X deletes all records with key = 1. It finds the records that originally had key = 1 (the index has not been updated). It also finds that these records do exist (because the EBP was updated when records J, K, and L were added), so they are deleted. When the file is checked after the <var>FLOD</var> job, you find that input records J, K, and L were never added to the file.</p> | |||
<p class="note"><b>Exceptions:</b> You can use deferred update mode for RRN files if you are doing simple record adds, such as a file reorganization, or if record adds and deletes are performed as separate jobs.</p> | <p class="note"><b>Exceptions:</b> You can use deferred update mode for RRN files if you are doing simple record adds, such as a file reorganization, or if record adds and deletes are performed as separate jobs.</p> | ||
=====Locating RRN records===== | =====Locating RRN records===== | ||
<p>Some <var class="product">Model 204</var> features locate records through explicit use of record numbers, notably the | <p> | ||
Some <var class="product">Model 204</var> features locate records through explicit use of record numbers, notably the SOUL <var>POINT$</var> retrieval condition and the Host Language Interface <var>IFPOINT</var> function. Avoid dependence upon record numbers wherever possible, because a record number that at one time pointed to a record that was deleted can, at some later time, again point to a valid record, which is not the same record originally stored with that record number. </p> | |||
==== | ====Optimized field extraction files (X'80')==== | ||
When this option is set, all non-preallocated (non-<var>OCCURS</var>) fields are preceded by a field-value length byte. | |||
With a length byte on every field, even <var>FLOAT</var>, <var>CODED</var> or <var>BINARY</var> fields, several instructions and one IF test are eliminated from the internal field scan loop. Having a length byte also allows some simple compression of <var>BINARY</var>, <var>CODED</var>, and <var>FLOAT</var> values which offsets, to some extent, the possible increase in space that this feature entails. | |||
If, for example, you have a large number of 'amount' fields defined as <var>BIN</var>, and they are predominantly small numbers, you might not only improve performance but save space by taking this option. Alternately, if you are storing mostly dates, for example, each field value pair will take an additional byte. | |||
====Enhanced data handling files (X'100')==== | |||
Setting this bit enables a number of enhancements to the file structure, including: | |||
<ul> | |||
<li>support for [[Field group design|Field groups]]</li> | |||
<li>the definition of up to 32000 fields in a file</li> | |||
<li>system maintained [[Field design#Automatic fields|Automatic fields]] and additional [[Field design#Field constraints|Field constraints]] providing content validation</li> | |||
<li>improved space management of fields containing [[Table E and FILEORG X'100' files (File architecture)|Large Objects]]</li> | |||
</ul> | |||
Setting the X'100' bit requires additional overhead in Table B, as described in <var>[[FILEORG parameter|FILEORG]]</var>. | |||
If X'100' is set, the X'80' bit is automatically also set. | |||
<p class="note"><b>Note:</b> <var>FILEORG</var> X'100' files cannot be loaded via <var>[[FLOD_command|FLOD]]</var> or <var>[[FILELOAD_command|FILELOAD]]</var> except with a <var class="product">[[Fast/Reload]]</var> Load All Information (<var>[[Fast/Reload statements#The LAI statement|LAI]]</var>) <var>FLOD/FILELOAD</var> program.</p> | |||
Enhanced data handling files are available as of version 7.5 of Model 204. | |||
====<b id="x200"></b>Large file support (X'200')==== | |||
Setting this <var>FILEORG</var> bit allows for a file to hold up to 48 million records (versus 16M records without this setting). | |||
<ul> | |||
<li>Space on file pages is used just as efficiently as a non-<var>FILEORG</var> X'200' file. There is one extremely small overhead: the extension record pointers are 4 bytes rather than 3, just as in a file that contains Table X. In fact, if there is any but the very smallest number of extension records, using Table X is recommended, so that all of the (up to 48M) record numbers are available for base records. </li> | |||
< | <li>May not be used in sorted (<code>FILEORG X'01'</code>) nor hashed (<code>FILEORG X'08'</code>) files.</li> | ||
</ | |||
<li>A few considerations are required for [[Deferred update feature#defUpdtX200|deferred updates]]. </li> | |||
</ul> | |||
Available as of version 7.5 of Model 204. | |||
Available | |||
==File recovery options== | ==File recovery options== | ||
<p>If a file transaction does not complete and the data is not fully updated, do you want the data transactions backed out? How much logging of file transactions do you want done in case recovery is needed? </p> | <p> | ||
<p>When creating each of your files, you can specify whether you want the file to use recovery features such as Transaction Back out (TBO) and logging. These features are enabled by default. In an online environment, it is recommended that you use the defaults so that the file you are creating can participate in recovery if the system crashes. </p> | If a file transaction does not complete and the data is not fully updated, do you want the data transactions backed out? How much logging of file transactions do you want done in case recovery is needed? </p> | ||
<p>File recovery options are determined by the FOPT and FRCVOPT parameters.</p> | <p> | ||
<p>Transaction back out is a file integrity facility that can logically undo the effects of incomplete transactions on file data. You can use checkpoints and TBO to restore file data. In the event of a system crash, <var class="product">Model 204</var> crash, or hard restart, you can use the RESTART command to roll back to the last valid checkpoint. If roll forward logging to the journal is also active during the Online processing, you can use the roll forward feature to restore as many file changes as possible.</p> | When creating each of your files, you can specify whether you want the file to use recovery features such as Transaction Back out (TBO) and logging. These features are enabled by default. In an online environment, it is recommended that you use the defaults so that the file you are creating can participate in recovery if the system crashes. </p> | ||
<p>See [[ File integrity and recovery]] and [[ Transaction back out]] for more information on TBO, logging, and other recovery features.</p> | <p> | ||
<p>To use file recovery, you will want to design your files to take best advantage of the recovery functionality and minimize any performance disadvantages of logging. See [[ | File recovery options are determined by the <var>[[FOPT parameter|FOPT]]</var> and <var>[[FRCVOPT parameter|FRCVOPT]]</var> parameters.</p> | ||
<p> | |||
Transaction back out is a file integrity facility that can logically undo the effects of incomplete transactions on file data. You can use checkpoints and TBO to restore file data. In the event of a system crash, <var class="product">Model 204</var> crash, or hard restart, you can use the <var>[[RESTART command|RESTART]]</var> command to roll back to the last valid checkpoint. If roll forward logging to the journal is also active during the Online processing, you can use the roll forward feature to restore as many file changes as possible.</p> | |||
<p> | |||
See [[ File integrity and recovery]] and [[ Transaction back out]] for more information on TBO, logging, and other recovery features.</p> | |||
<p> | |||
To use file recovery, you will want to design your files to take best advantage of the recovery functionality and minimize any performance disadvantages of logging. See [[ Transaction back out]] for more information on file considerations when using the TBO feature. </p> | |||
==File security levels== | ==File security levels== | ||
<p> | <p> | ||
These options of the <var>[[OPENCTL parameter|OPENCTL]]</var> parameter define file security:</p> | |||
<ul> | <ul> | ||
<li> | <li>Public file (<var>OPENCTL</var> X'80') | ||
< | <p> | ||
This option allows anyone to open the file without a password and everyone gets the default privileges depending on the value of | This option allows anyone to open the file without a password and everyone gets the default privileges depending on the value of <var>[[PRIVDEF parameter|PRIVDEF]]</var>. </p></li> | ||
<li> | |||
< | <li>Semi-public file (<var>OPENCTL</var> X'40) | ||
This option allows anyone to open the file and if they do not provide a password they get privileges associated with PRIVDEF. If they do provide a password, the privileges are determined by the password.</li> | <p> | ||
<li> | This option allows anyone to open the file and if they do not provide a password they get privileges associated with <var>PRIVDEF</var>. If they do provide a password, the privileges are determined by the password.</p></li> | ||
< | |||
The user must provide a password and privileges are determined from the password.</li> | <li>Private file (<var>OPENCTL</var> X'00') | ||
<p> | |||
The user must provide a password, and privileges are determined from the password.</p></li> | |||
</ul> | </ul> | ||
<p>You can also specify record security for individual records in a file and supply a record security key. See [[ | <p> | ||
<p class="note"><b>Note:</b> Most of the file access will likely be done through [[Application Subsystem]] | You can also specify record security for individual records in a file and supply a record security key. See [[Model 204 security features| Security]] for details on Security and <var>PRIVDEF</var> options.</p> | ||
<p class="note"><b>Note:</b> Most of the file access will likely be done through [[Application Subsystem development|Application Subsystems]] (APSY). The security options for APSY access are set in its definition.</p> | |||
==File sizing options== | ==File sizing options== | ||
<p>Decide on file sizing and whether you want to add pages dynamically to tables as needed. This is done with Auto Increase parameters for tables B, D, and X ([[BAUTOINC parameter|BAUTOINC]], [[DAUTOINC parameter|DAUTOINC]], and [[XAUTOINC parameter]]) as described in [[Managing file and table sizes | <p> | ||
Decide on file sizing and whether you want to add pages dynamically to tables as needed. This is done with Auto Increase parameters for tables B, D, and X (<var>[[BAUTOINC parameter|BAUTOINC]]</var>, <var>[[DAUTOINC parameter|DAUTOINC]]</var>, and <var>[[XAUTOINC parameter|XAUTOINC]]</var> parameters) as described in [[Managing file and table sizes]], which also describes how this maintenance may be done manually.</p> | |||
==Which tables do you need?== | ==Which tables do you need?== | ||
===Data files=== | ===Data files=== | ||
<p> | |||
File sizing is described in [[File sizing introduction]], but first consider a more basic question: which tables are needed in your file?</p> | |||
<p> | |||
Table A is required for the internal filed directory; Table B for the Base records (at least) of data; and Table D for the list and bit map pages; and the B-tree for any ordered indices.</p> | |||
<p> | |||
Table C can be left to its default (of a single page) if you do not have any <var>KEY</var> fields.</p> | |||
<p> | <p>Rocket software recommends that for almost any file, Table X should be turned on. See [[Table X (File architecture)#Advantages of using Table X|Advantages of using Table X]] for more information.</p> | ||
<p>If you expect to store data in Large Objects, then you should implement Table E. Because, once enabled (by setting Table E to a size greater than 0), Table E can be enlarged at any time, Rocket Software recommends that you err on the side of caution: set Table E to a single page (<var>FILEORG</var> X'100 files) in any file where there is even the possibility of wanting to store data in LOBs.</p> | |||
<p>If you expect to store data in Large Objects, then you should implement Table E. Because, once enabled (by setting Table E to a size greater than 0), Table E can be enlarged at any time, Rocket Software recommends that you err on the side of caution: set Table E to a single page (FILEORG X'100 files) in any file where there is even the possibility of wanting to store data in LOBs.</p> | |||
===Procedure files=== | ===Procedure files=== | ||
<p> | |||
Because procedure code is stored in [[Table D (File architecture)#Procedures|Table D]] this is the only Table that is required for files which will only contain procedures. (Note that Tables A, B, and C have minimums, so they will exist in procedure files.)</p> | |||
<p> | |||
For more information about storing procedures, see [[File size calculation in detail#Sizing the procedure dictionary|Sizing the procedure dictionary]]. </p> | |||
==Non-resettable parameters== | ==Non-resettable parameters== | ||
<p>Of the parameters that can be set by the file manager at file creation and file initialization, some can later be reset. The parameters below cannot be reset. </p> | <p> | ||
<p>Decide how you want to set the following parameters when you create and initialize your file.</p> | Of the parameters that can be set by the file manager at file creation and file initialization, some can later be reset. The parameters below cannot be reset. </p> | ||
<p> | |||
Decide how you want to set the following parameters when you create and initialize your file.</p> | |||
===FPARMS set during file creation=== | ===FPARMS set during file creation=== | ||
<p>The following parameters can be set by the file manager during file creation and cannot be reset:</p> | <p> | ||
< | The following parameters can be set by the file manager during file creation and cannot be reset:</p> | ||
< | <ul> | ||
< | <li><var>[[FILEORG parameter|FILEORG]]</var></li> | ||
</ | |||
< | <li><var>[[IVERIFY parameter|IVERIFY]]</var></li> | ||
< | |||
</ | <li><var>[[LANGFILE parameter|LANGFILE]]</var></li> | ||
< | </ul> | ||
< | |||
</ | |||
</ | |||
===Nonresettable FPARMS set during file initialization=== | ===Nonresettable FPARMS set during file initialization=== | ||
<p>The following parameters are set during file initialization (the INITIALIZE command), if applicable, and cannot be reset:</p> | <p> | ||
< | The following parameters are set during file initialization (the INITIALIZE command), if applicable, and cannot be reset:</p> | ||
< | <ul> | ||
< | <li><var>[[HASHKEY parameter|HASHKEY]]</var></li> | ||
</ | |||
< | <li><var>[[SORTKEY parameter|SORTKEY]]</var></li> | ||
< | |||
</ | |||
<li><var>[[RECSCTY parameter|RECSCTY]]</var></li> | |||
</ul> | |||
For more information on this topic, refer to [[FPARMS and TABLES file parameters]] | |||
<p>The next decision that the file designer must make is whether to store all data in one physical file or to separate data into several files. This section discusses the merits of storage in a single physical file, in multiple files, or in a file group.</p> | == Designing for scalability == | ||
<p> | |||
The next decision that the file designer must make is whether to store all data in one physical file or to separate data into several files. This section discusses the merits of storage in a single physical file, in multiple files, or in a file group.</p> | |||
===Combined logical files=== | ===Combined logical files=== | ||
<p>Because records in a <var class="product">Model 204</var> file are completely variable, a single physical file can contain more than one logical record type (for example, personnel and payroll records). Each record type can have its own distinct collection of fields. Each record might or might not contain a field such as RECTYPE, which identifies the type of record. </p> | <p> | ||
<p>If records of different types are to be related, each record should contain fields whose values form the relationship. </p> | Because records in a <var class="product">Model 204</var> file are completely variable, a single physical file can contain more than one logical record type (for example, personnel and payroll records). Each record type can have its own distinct collection of fields. Each record might or might not contain a field such as <code>RECTYPE</code>, which identifies the type of record. </p> | ||
<p> | |||
If records of different types are to be related, each record should contain fields whose values form the relationship. </p> | |||
<b>Example</b> | <b>Example</b> | ||
<p>For example, suppose that the payroll and personnel records are related by employee number. You can write the following request to display personnel information for employees who earn between $20,000 and $30,000:</p> | <p>For example, suppose that the payroll and personnel records are related by employee number. You can write the following request to display personnel information for employees who earn between $20,000 and $30,000:</p> | ||
Line 356: | Line 439: | ||
END | END | ||
</p> | </p> | ||
<p>The advantages of incorporating several logical files in one physical file include the following:</p> | <p> | ||
The advantages of incorporating several logical files in one physical file include the following:</p> | |||
<ul> | <ul> | ||
<li>Many requests can be done without explicit cross-referencing, thereby saving disk accesses. For example: | <li>Many requests can be done without explicit cross-referencing, thereby saving disk accesses. For example: | ||
Line 368: | Line 452: | ||
END | END | ||
</p> | </p> | ||
<p>The preceding request displays information from both the personnel and payroll records with only one FIND statement.</p> | <p>The preceding request displays information from both the personnel and payroll records with only one <var>FIND</var> statement.</p> | ||
</li> | </li> | ||
<li>Duplication of data can sometimes be avoided. For example, the payroll and personnel records both contain the field EMPL NO. Instead of having index entries for all values of EMPL NO in two files, one index entry for each value has pointers to all the records where it is found. Also, suppose that STATE, a CODED field, is part of the address in the personnel records and is carried for tax purposes on the payroll records. Maintaining these records in a single file allows coded values to be stored only once in Table A, but enables STATE to be decoded for both record types. </li> | |||
<li>Duplication of data can sometimes be avoided. For example, the payroll and personnel records both contain the field <code>EMPL NO</code>. Instead of having index entries for all values of <code>EMPL NO</code> in two files, one index entry for each value has pointers to all the records where it is found. Also, suppose that <code>STATE</code>, a <var>CODED</var> field, is part of the address in the personnel records and is carried for tax purposes on the payroll records. Maintaining these records in a single file allows coded values to be stored only once in Table A, but enables <code>STATE</code> to be decoded for both record types. </li> | |||
<li>When files are small (that is, less than 50,000 records), maintaining different kinds of data in a single file can reduce disk storage overhead. </li> | <li>When files are small (that is, less than 50,000 records), maintaining different kinds of data in a single file can reduce disk storage overhead. </li> | ||
</ul> | </ul> | ||
===Separate logical files=== | ===Separate logical files=== | ||
<p>If related data is organized into separate physical files, you can easily cross-reference the data according to values of fields in each of the files. Some of the advantages to organizing related data into separate physical files are:</p> | <p> | ||
If related data is organized into separate physical files, you can easily cross-reference the data according to values of fields in each of the files. Some of the advantages to organizing related data into separate physical files are:</p> | |||
<ul> | <ul> | ||
<li>Retrievals are usually faster if the RECTYPE condition is eliminated.</li> | <li>Retrievals are usually faster if the <var>RECTYPE</var> condition is eliminated.</li> | ||
<li>If online storage space is scarce, some files can be left off-line if the data they contain is not required.</li> | <li>If online storage space is scarce, some files can be left off-line if the data they contain is not required.</li> | ||
<li>Heavily updated data can be separated from data that is rarely or never updated, thus simplifying checkpointing and backup procedures.</li> | <li>Heavily updated data can be separated from data that is rarely or never updated, thus simplifying checkpointing and backup procedures.</li> | ||
<li>User access to logically separate data can be controlled easily by means of file security ( | |||
<li>User access to logically separate data can be controlled easily by means of file security (see [[Model 204 security features#File security|File security]]).</li> | |||
</ul> | </ul> | ||
===File groups=== | ===File groups=== | ||
<p>The file group feature provides a compromise between a single physical file and many separate files, and incorporates some of the advantages of both. File groups offer a separation of the physical aspects from the logical aspects of operating with complex databases. </p> | <p> | ||
<p>A < | The file group feature provides a compromise between a single physical file and many separate files, and incorporates some of the advantages of both. File groups offer a separation of the physical aspects from the logical aspects of operating with complex databases. </p> | ||
<p>File groups can be created either locally, with all files in the group owned by one copy of <var class="product">Model 204</var> or, with the Parallel Query Option/204 product, your file groups can be spread, or scattered, across multiple copies of <var class="product">Model 204</var>. Scattered file groups are discussed | <p> | ||
<p>Groups provide several important benefits:</p> | A <b>group</b> is a collection of physically distinct files that appear to the end user as a single logical entity. For example, the file group PEOPLE can be defined to contain the files PERSONEL and PAYROLL. The personnel and payroll files are accessible directly by their own names, and, in addition, the union of all of the data in both files is available under the name PEOPLE. The data are not duplicated; a special table relates the group PEOPLE to its member files. </p> | ||
<p> | |||
File groups can be created either locally, with all files in the group owned by one copy of <var class="product">Model 204</var> or, with the Parallel Query Option/204 product, your file groups can be spread, or scattered, across multiple copies of <var class="product">Model 204</var>. Scattered file groups are discussed in [[Files, groups, and reference context#Scattered groups|Scattered groups]].</p> | |||
<p> | |||
Groups provide several important benefits:</p> | |||
<ul> | <ul> | ||
<li>File group facility is ideal for data aging applications. Each member file is one aging unit that can be replaced easily.</li> | <li>File group facility is ideal for data aging applications. Each member file is one aging unit that can be replaced easily.</li> | ||
<li>Data can be combined into a number of different and possibly overlapping categories. For instance, files for each state in the United States can be members of multiple regional groups.</li> | <li>Data can be combined into a number of different and possibly overlapping categories. For instance, files for each state in the United States can be members of multiple regional groups.</li> | ||
<li>With Parallel Query Option/204 (PQO) you can create file groups that include remote files. See the <var class="book">Parallel Query Option/204 User's Guide</var> for more information.</li> | <li>With Parallel Query Option/204 (PQO) you can create file groups that include remote files. See the <var class="book">Parallel Query Option/204 User's Guide</var> for more information.</li> | ||
<li>File groups can provide an alias facility, allowing users to address a file or collection of files by different names. Because of the alias facility, file groups can provide a simple environment for testing and conversion. </li> | <li>File groups can provide an alias facility, allowing users to address a file or collection of files by different names. Because of the alias facility, file groups can provide a simple environment for testing and conversion. </li> | ||
<li>You can define a group of files, each with the same fields. After the group is formed, you can add new fields to one of the files, and the group continues to function as before without error messages. Files containing newly defined fields can be grouped with others that do not share the same field definitions. A search for a field does not return an error, if the field for which you are searching is in at least one file in the group. If none of the files contains a specified field, any statement referring to that field is rejected.</li> | <li>You can define a group of files, each with the same fields. After the group is formed, you can add new fields to one of the files, and the group continues to function as before without error messages. Files containing newly defined fields can be grouped with others that do not share the same field definitions. A search for a field does not return an error, if the field for which you are searching is in at least one file in the group. If none of the files contains a specified field, any statement referring to that field is rejected.</li> | ||
</ul> | </ul> | ||
<p>For information about creating a file group, see [[ | <p>For information about creating a file group, see [[Managing file groups|Managing file groups]]. For information about storing and using file group definitions, see [[Storing and using file group definitions (CCAGRP)]].</p> | ||
</div> <!-- end of toc limit div --> | |||
[[Category: | [[Category:Model 204 files]] |
Latest revision as of 15:01, 6 February 2019
Model 204 provides many features that allow you, as the file designer, a great deal of flexibility in creating files that optimize the efficiency of Model 204 at your site.
This topic covers the options available to you when designing files, along with the associated performance issues. Understanding these options, and understanding the database and application needs of your site, helps you select the correct options for file design, and the associated data structures described in:
As the file designer, your responsibility is to make informed decisions concerning these issues based on knowledge of the data and required results. In so doing, you can have a dramatic effect on the efficiency with which Model 204 runs.
Design and planning overview
When creating a Model 204 file, you can:
- Use the default file settings. (See the CREATE command for information about its default parameter values.) Model 204 has been designed so that the default options meet the needs of most users; however it is useful to be aware of what you can customize.
- Customize the file for your environment with settings for organization, recovery and security.
If you customize your file, consider the following questions, discussed in this topic:
- Which file organization type provides the most efficient data retrieval for my needs?
- What file recovery options are needed for this file?
- What level of security does this file need?
- What file sizing options do I need?
- How do I want to set the non-resettable parameters?
Specifying a file organization
During the design process, once you determine the requirement for a new file, you can look at the file design. The FILEORG parameter lets you choose a file organization.
This parameter controls two things:
- The physical organization of the files Table B.
- A variety of options concerning how record numbers are managed and the physical data storage.
File types
Entry order is the default. See the FILEORG parameter description for more information.
The file organization you choose determines the organization of the file's Table B. The following types of file organization are available:
- Entry order
- Unordered
- Sorted
- Hash key
Entry order, unordered, sorted, and hash key files can be used in any combination to make up a database. Any Model 204 file can be cross-referenced to any other Model 204 file regardless of the type of organization.
Choosing a file organization
Choose Table B organization based upon these factors:
- Frequency of retrievals
- Ordering of reports
- Volatility or stability of the data
- Available disk space
Entry order files
In ordinary Model 204 files, each new record is simply stored in the available space after any other records in Table B. When a set of records is retrieved from such files, the records are processed in the order in which they were stored. This order is chronological. These files are known as entry order files.
Entry order files are the most widely used Model 204 files. They provide full inverted file capabilities but do not support the unordered file, sorted file, and hash key file facilities described later in this section. If you require sorted output, you can include the SORT statement in a SOUL request, use the IFSORT call in a Host Language Interface program, or specify the ORDERED attribute.
In an entry order file, new records are added only after other records in the file. A fixed number of pages is assigned to Table B of the file by the CREATE command. You can increase the number of pages any time, subject only to the limits of the total disk space available (or 16M pages). You can decrease them at any time to the value of the BHIGHP parameter plus one. When an attempt is made to store another record in a file that has insufficient space available at the end of the file, the transaction is backed out if possible, and an error message is issued; the file might also be marked full. This occurs even though numerous records in the file might have been deleted.
In an entry order file, space made available by deleting records can be reused for the expansion of existing Table B records, but cannot be reused for adding new records to the file. These records must be added in whatever space is available after all the other records in the file.
Unordered files (X'20')
The unordered file organization makes most deleted Table B space available for reuse for new records. A queue of pages formed from deleted Table B space is maintained and called the reuse queue. Most of the available space on these pages is reused before the file is considered full.
Assuming random deletions, the longer a page has been on the reuse queue, the more likely it is that additional space has become available on it. When a record is to be added, Model 204 first tries the current Table B appends page. The appends page is the page to which records are currently being added; its page number is equal to the value of the BHIGHPG parameter. Page space is evaluated in the following sequence:
- If the appends page has insufficient space, the oldest page from the head of the reuse queue is examined.
- If that page is full, Model 204 removes it from the reuse queue and tries the next page.
- If the reuse queue is empty, Model 204 starts a new page (BHIGHPG+1).
- If no more pages are available, Model 204 looks for available space on as many as sixteen randomly selected already-used pages
- If this fails the file is considered full.
Reusing record numbers
In order to use space on any page for a new record, a record number must be available on that page. To be useful, an unordered file must either be designed with many extra record numbers per page, or must use the reuse record number (RRN) option. See Reuse Record Number (X'04').
Planning open space
For unordered files, the BREUSE parameter specifies how much space a page must have available before being placed on the reuse queue. BREUSE is expressed as a percentage of the space available over and above BRESERVE.
The BRESERVE parameter specifies the size of an area at the end of each Table B page in which new records cannot be started. This area is reserved for the expansion of existing records on the page. When deletions occur, if there are enough additional bytes of free space available beyond BRESERVE to provide the free space required for reuse (parameter BREUSE), and at least one record number is available, the page is added to the reuse queue.
You can set BREUSE for files with any file organization, but the reuse queue is built for unordered files only. Files, other than unordered, may have BREUSE set but no reuse queue present.
Maintaining the reuse queue
The reuse queue is a single queue of pages chained together by a 4-byte pointer at the bottom of each page. The total number of pages on the queue, along with pointers to the top and bottom of the queue, are maintained on the FPL page.
Searching for pages with reusable record numbers begins at the top of the queue. As pages qualify for the reuse queue due to deleted records, those pages are added to the bottom of the queue. Starting a search at the top of the queue, where the pages are the oldest, allows additional time for constrained space to be released and helps guarantee that a record can be reused on that page.
A page is added to the reuse queue if free space on the page is larger than:
(PAGE SIZE - 8 - BRESERVE) * BREUSE/100
When a page is removed from the queue, indicating that it no longer contains reusable record numbers, the reuse queue chain pointer on that page is set to zero and the pointer on the previous page in the queue is adjusted accordingly.
Understanding eligible and ineligible reuse queue pages
A page is added to the reuse queue when deletions occur on a Table B page and the page has BREUSE space available over and above BRESERVE, which is the space reserved only for expanding existing records. New records cannot be started in BRESERVE space. Also, there must be at least one available record number on the page. A page that meets this criteria as an eligible page.
Once an eligible page is on the reuse queue it remains there even if it becomes ineligible, until one of the following events occurs:
- During updating, if a page is selected from the reuse queue that is ineligible, it is removed from the queue.
- You issue a BLDREUSE command.
- The file is reorganized.
Generating ineligible reuse queue pages
Adjustment to file parameters BREUSE and/or BRESERVE does not immediately remove a page from the reuse queue although the new values may make the page ineligible.
If the page is eligible with room to add the entire new record (physical base or extension record) to the page, the addition is made and the page remains on the reuse queue, even if it is now ineligible.
Sorted files (X'01')
If most processing or output is required in order by a certain key, consider a sorted file organization. Sorted files store and can retrieve records in order by a sort key field. Thus, when processing records in order by the sort key, a large amount of Online sorting can be eliminated. Because a sort key can be alphabetic, it also provides a convenient method of doing alphabetical ranging on a single key without doing a direct search on the data.
The sorted organization in Table B is similar to ISAM (Indexed Sequential Access Method), where there are master areas and overflow areas. Because of this, use a sorted organization only when at least half of the total records can be loaded in sorted order and the data is not volatile.
- In a sorted file, the X'01' bit of the FILEORG parameter is on.
See Sorted files for more information.
Hash key files (X'08')
If all records contain a unique or fairly unique key and if most retrievals are done on the basis of that key alone, consider a hash key file organization. In a hash key file, Model 204 hashes directly to the Table B record, which can drastically reduce disk I/O. However, because Table B is ordered randomly in a hash key file, Table B must be made large enough to allow for the expected growth of the file; it cannot be expanded without reloading the entire file.
- In a hash key file, the X'08' bit of the FILEORG parameter is on.
See Hash key files for more information.
Fine tuning a file type
In addition to the basic structure of the file's Table B, there are four other settings which can change the way in which Model 204 stores and processed records.
- X'04' — Reuse Record Numbers
- X'80' — Optimized Field Extraction Files
- X'100' — Enhanced Data Handling Files
- X'200' — Large files (up to 48 million records)
Reuse Record Number (X'04')
The Reuse Record Number (RRN) feature allows Model 204 to reuse the record number of a deleted record for a new record, if a record number is available on the Table B page to which the new record is added. Consider the following points during the design of applications that allow record number reuse:
- Record number order might not correspond to chronological order. If record A is added to a file, and record B is added later, record A might have a record number higher than that of record B.
- RRN is optional for unordered, sorted, and hash key files. It is activated by turning on the X'04' bit of the FILEORG parameter. Setting the RRN bit without setting the sorted or hash bits automatically sets the unordered file bit of FILEORG.
The Reuse Record Number option may be used with unordered, sort, or hash files.
Using INVISIBLE fields in RRN files
Special care must be taken in the use of INVISIBLE fields within files that reuse record numbers. Because INVISIBLE fields reside only in the index portion of a file, they are not automatically deleted from Model 204 files when the records with which they are associated have been deleted.
To avoid the possibility of new records inheriting the INVISIBLE fields of records previously deleted, it is necessary to delete explicitly a record's INVISIBLE fields at the time that the record itself is deleted. Use the SOUL statement DELETE fieldname = value
or the Host Language Interface IFDVAL function to perform this task.
Deleting records properly in RRN files
The SOUL statement DELETE ALL RECORDS IN and the Host Language Interface IFDSET function make record sets inaccessible (that is, they logically delete the records), but neither physically remove the records from the file nor release the related record numbers. Therefore, neither the space that contains the deleted records nor the record numbers can be reused. Only record numbers freed by the DELETE RECORD statement or the IFDREC function can be reused.
Committing deleted records
When deleting records in a found set, you must use COMMIT RELEASE after deleting the records to remove records from the found set and to make the record numbers available to be reused. The COMMIT RELEASE (rather than a simple COMMIT) is required to prevent future record-locking conflicts between the deleted records and any new records being stored.
Using the RRN option in hash key files
The efficiency of the RRN option with hash key files is that there is no reuse queue, because one is not built and therefore not maintained for hash key files. In a hash key file, when Model 204 hashes to a page and there is an available record number, the record is stored there. If the RRN option is defined for the file, Model 204 can reuse record numbers from previously deleted records on the hashed to page.
RRN files and deferred update mode
Model 204 customer support suggests that you do not place RRN files in deferred update mode. Using deferred update mode might cause your site to lose updates without any warning. FLOD jobs are particularly susceptible, because they automatically open files for deferred update and require additional steps to override this behavior. However, any RRN file used in deferred update mode is in danger.
The following scenario demonstrates what can happen in a FLOD job:
Input record | Key | Action |
---|---|---|
A | 1 |
|
many records later.... | ||
J | 90 |
|
K | 91 |
|
L | 92 |
|
many records later.... | ||
X | 1 |
|
When records are deleted with Key = 1, Model 204 immediately updates the Existence Bit Pattern (EBP) in Table D. However, because the file is in deferred update mode, the index is not updated
When input record J is added, it finds space on the Table B page previously occupied by records with Key = 1. So input records J, K, and L are added in the available slots. Again, index updates are deferred.
Later, input record X deletes all records with key = 1. It finds the records that originally had key = 1 (the index has not been updated). It also finds that these records do exist (because the EBP was updated when records J, K, and L were added), so they are deleted. When the file is checked after the FLOD job, you find that input records J, K, and L were never added to the file.
Exceptions: You can use deferred update mode for RRN files if you are doing simple record adds, such as a file reorganization, or if record adds and deletes are performed as separate jobs.
Locating RRN records
Some Model 204 features locate records through explicit use of record numbers, notably the SOUL POINT$ retrieval condition and the Host Language Interface IFPOINT function. Avoid dependence upon record numbers wherever possible, because a record number that at one time pointed to a record that was deleted can, at some later time, again point to a valid record, which is not the same record originally stored with that record number.
Optimized field extraction files (X'80')
When this option is set, all non-preallocated (non-OCCURS) fields are preceded by a field-value length byte.
With a length byte on every field, even FLOAT, CODED or BINARY fields, several instructions and one IF test are eliminated from the internal field scan loop. Having a length byte also allows some simple compression of BINARY, CODED, and FLOAT values which offsets, to some extent, the possible increase in space that this feature entails.
If, for example, you have a large number of 'amount' fields defined as BIN, and they are predominantly small numbers, you might not only improve performance but save space by taking this option. Alternately, if you are storing mostly dates, for example, each field value pair will take an additional byte.
Enhanced data handling files (X'100')
Setting this bit enables a number of enhancements to the file structure, including:
- support for Field groups
- the definition of up to 32000 fields in a file
- system maintained Automatic fields and additional Field constraints providing content validation
- improved space management of fields containing Large Objects
Setting the X'100' bit requires additional overhead in Table B, as described in FILEORG.
If X'100' is set, the X'80' bit is automatically also set.
Note: FILEORG X'100' files cannot be loaded via FLOD or FILELOAD except with a Fast/Reload Load All Information (LAI) FLOD/FILELOAD program.
Enhanced data handling files are available as of version 7.5 of Model 204.
Large file support (X'200')
Setting this FILEORG bit allows for a file to hold up to 48 million records (versus 16M records without this setting).
- Space on file pages is used just as efficiently as a non-FILEORG X'200' file. There is one extremely small overhead: the extension record pointers are 4 bytes rather than 3, just as in a file that contains Table X. In fact, if there is any but the very smallest number of extension records, using Table X is recommended, so that all of the (up to 48M) record numbers are available for base records.
- May not be used in sorted (
FILEORG X'01'
) nor hashed (FILEORG X'08'
) files. - A few considerations are required for deferred updates.
Available as of version 7.5 of Model 204.
File recovery options
If a file transaction does not complete and the data is not fully updated, do you want the data transactions backed out? How much logging of file transactions do you want done in case recovery is needed?
When creating each of your files, you can specify whether you want the file to use recovery features such as Transaction Back out (TBO) and logging. These features are enabled by default. In an online environment, it is recommended that you use the defaults so that the file you are creating can participate in recovery if the system crashes.
File recovery options are determined by the FOPT and FRCVOPT parameters.
Transaction back out is a file integrity facility that can logically undo the effects of incomplete transactions on file data. You can use checkpoints and TBO to restore file data. In the event of a system crash, Model 204 crash, or hard restart, you can use the RESTART command to roll back to the last valid checkpoint. If roll forward logging to the journal is also active during the Online processing, you can use the roll forward feature to restore as many file changes as possible.
See File integrity and recovery and Transaction back out for more information on TBO, logging, and other recovery features.
To use file recovery, you will want to design your files to take best advantage of the recovery functionality and minimize any performance disadvantages of logging. See Transaction back out for more information on file considerations when using the TBO feature.
File security levels
These options of the OPENCTL parameter define file security:
- Public file (OPENCTL X'80')
This option allows anyone to open the file without a password and everyone gets the default privileges depending on the value of PRIVDEF.
- Semi-public file (OPENCTL X'40)
This option allows anyone to open the file and if they do not provide a password they get privileges associated with PRIVDEF. If they do provide a password, the privileges are determined by the password.
- Private file (OPENCTL X'00')
The user must provide a password, and privileges are determined from the password.
You can also specify record security for individual records in a file and supply a record security key. See Security for details on Security and PRIVDEF options.
Note: Most of the file access will likely be done through Application Subsystems (APSY). The security options for APSY access are set in its definition.
File sizing options
Decide on file sizing and whether you want to add pages dynamically to tables as needed. This is done with Auto Increase parameters for tables B, D, and X (BAUTOINC, DAUTOINC, and XAUTOINC parameters) as described in Managing file and table sizes, which also describes how this maintenance may be done manually.
Which tables do you need?
Data files
File sizing is described in File sizing introduction, but first consider a more basic question: which tables are needed in your file?
Table A is required for the internal filed directory; Table B for the Base records (at least) of data; and Table D for the list and bit map pages; and the B-tree for any ordered indices.
Table C can be left to its default (of a single page) if you do not have any KEY fields.
Rocket software recommends that for almost any file, Table X should be turned on. See Advantages of using Table X for more information.
If you expect to store data in Large Objects, then you should implement Table E. Because, once enabled (by setting Table E to a size greater than 0), Table E can be enlarged at any time, Rocket Software recommends that you err on the side of caution: set Table E to a single page (FILEORG X'100 files) in any file where there is even the possibility of wanting to store data in LOBs.
Procedure files
Because procedure code is stored in Table D this is the only Table that is required for files which will only contain procedures. (Note that Tables A, B, and C have minimums, so they will exist in procedure files.)
For more information about storing procedures, see Sizing the procedure dictionary.
Non-resettable parameters
Of the parameters that can be set by the file manager at file creation and file initialization, some can later be reset. The parameters below cannot be reset.
Decide how you want to set the following parameters when you create and initialize your file.
FPARMS set during file creation
The following parameters can be set by the file manager during file creation and cannot be reset:
Nonresettable FPARMS set during file initialization
The following parameters are set during file initialization (the INITIALIZE command), if applicable, and cannot be reset:
For more information on this topic, refer to FPARMS and TABLES file parameters
Designing for scalability
The next decision that the file designer must make is whether to store all data in one physical file or to separate data into several files. This section discusses the merits of storage in a single physical file, in multiple files, or in a file group.
Combined logical files
Because records in a Model 204 file are completely variable, a single physical file can contain more than one logical record type (for example, personnel and payroll records). Each record type can have its own distinct collection of fields. Each record might or might not contain a field such as RECTYPE
, which identifies the type of record.
If records of different types are to be related, each record should contain fields whose values form the relationship.
Example
For example, suppose that the payroll and personnel records are related by employee number. You can write the following request to display personnel information for employees who earn between $20,000 and $30,000:
BEGIN FIND.PAY: FIND ALL RECORDS FOR WHICH RECTYPE = PAYROLL SALARY IS BETWEEN 19999 AND 30001 END FIND PAY.LOOP: FOR EACH RECORD IN FIND.PAY EMPL.NO: NOTE EMPL NO FIND.PRSNL: FIND ALL RECORDS FOR WHICH RECTYPE = PERSONNEL EMPL NO = VALUE IN EMPL.NO END FIND INFO.LOOP: FOR EACH RECORD IN FIND.PERSNL PRINT: PRINT ALL INFORMATION END FOR END FOR END
The advantages of incorporating several logical files in one physical file include the following:
- Many requests can be done without explicit cross-referencing, thereby saving disk accesses. For example:
BEGIN ALL: FIND ALL RECORDS FOR WHICH EMPL NO = 1234 END FIND FOR.LOOP: FOR EACH RECORD IN ALL PRINT: PRINT ALL INFORMATION END FOR END
The preceding request displays information from both the personnel and payroll records with only one FIND statement.
- Duplication of data can sometimes be avoided. For example, the payroll and personnel records both contain the field
EMPL NO
. Instead of having index entries for all values ofEMPL NO
in two files, one index entry for each value has pointers to all the records where it is found. Also, suppose thatSTATE
, a CODED field, is part of the address in the personnel records and is carried for tax purposes on the payroll records. Maintaining these records in a single file allows coded values to be stored only once in Table A, but enablesSTATE
to be decoded for both record types. - When files are small (that is, less than 50,000 records), maintaining different kinds of data in a single file can reduce disk storage overhead.
Separate logical files
If related data is organized into separate physical files, you can easily cross-reference the data according to values of fields in each of the files. Some of the advantages to organizing related data into separate physical files are:
- Retrievals are usually faster if the RECTYPE condition is eliminated.
- If online storage space is scarce, some files can be left off-line if the data they contain is not required.
- Heavily updated data can be separated from data that is rarely or never updated, thus simplifying checkpointing and backup procedures.
- User access to logically separate data can be controlled easily by means of file security (see File security).
File groups
The file group feature provides a compromise between a single physical file and many separate files, and incorporates some of the advantages of both. File groups offer a separation of the physical aspects from the logical aspects of operating with complex databases.
A group is a collection of physically distinct files that appear to the end user as a single logical entity. For example, the file group PEOPLE can be defined to contain the files PERSONEL and PAYROLL. The personnel and payroll files are accessible directly by their own names, and, in addition, the union of all of the data in both files is available under the name PEOPLE. The data are not duplicated; a special table relates the group PEOPLE to its member files.
File groups can be created either locally, with all files in the group owned by one copy of Model 204 or, with the Parallel Query Option/204 product, your file groups can be spread, or scattered, across multiple copies of Model 204. Scattered file groups are discussed in Scattered groups.
Groups provide several important benefits:
- File group facility is ideal for data aging applications. Each member file is one aging unit that can be replaced easily.
- Data can be combined into a number of different and possibly overlapping categories. For instance, files for each state in the United States can be members of multiple regional groups.
- With Parallel Query Option/204 (PQO) you can create file groups that include remote files. See the Parallel Query Option/204 User's Guide for more information.
- File groups can provide an alias facility, allowing users to address a file or collection of files by different names. Because of the alias facility, file groups can provide a simple environment for testing and conversion.
- You can define a group of files, each with the same fields. After the group is formed, you can add new fields to one of the files, and the group continues to function as before without error messages. Files containing newly defined fields can be grouped with others that do not share the same field definitions. A search for a field does not return an error, if the field for which you are searching is in at least one file in the group. If none of the files contains a specified field, any statement referring to that field is rejected.
For information about creating a file group, see Managing file groups. For information about storing and using file group definitions, see Storing and using file group definitions (CCAGRP).