File sizing introduction: Difference between revisions

From m204wiki
m (Admin moved page File size calculation to File sizing introduction without leaving a redirect: Better reflection of the content of the page and less confusion with other file sizing page)
 
(12 intermediate revisions by 5 users not shown)
Line 1: Line 1:
==Overview - Two approaches to File Sizing==
==Overview: two approaches to file sizing==
<p>After designing the data structures you are implementing (see [[Field Design (File Management)|field]] and [[Repeating Field Group Design (File Management)|Repeating Field Group design]]) there are two ways for a file manager to approach the calculation of file sizes:</p>
<p>
After designing the data structures you are implementing (see [[Field design]] and [[Field group design]]) there are two ways for a file manager to approach the calculation of file sizes:</p>
<p>
You can take the ad-hoc approach, by making sizing estimates and doing one or a combination of these:</p>
<ul>
<li>Iteratively load a sampling of records to verify</li>
<li>Use the development and testing process to make final sizing decisions</li>
</ul>


<p>You can take the ad-hoc approach, by making sizing estimates and either:</p>
<p>Alternatively, you can do a detailed analysis of the data you expect the file to contain, and try to derive precise sizes for the Model 204 tables, a laborious process.</p>


* iteratively load a sampling of records to verify
==Choosing an approach==
* use the development and testing process to make final sizing decisions
<p>
 
Most Model 204 file managers use the ad-hoc design approach. Often there is already a production file with similar characteristics to the new file you are creating. Simply copying its parameters as a starting point is a quick way to get a file ready for development and testing.</p>
<p>or, of course, a combination of both.</p> 
 
<p>Alternatively, you can do a detailed analysis of the data you expect the file to contain, and try and derive precise sizes for the Model 204 tables, a laborious process.</p>
 
==Choosing an Approach==
 
<p>Most Model 204 file managers use the ad-hoc design approach. Often there is already a production file with similar characteristics to the new file you are creating. Simply copying its parameters as a starting point is a quick way to get a file ready for development and testing.</p>


<p>Moreover, most of the sizes can be easily changed dynamically, so an extreme level of precision is not overly important.</p>
<p>Moreover, most of the sizes can be easily changed dynamically, so an extreme level of precision is not overly important.</p>


<p>However, the detailed Model 204 file sizing rules do provide a level of knowledge that the file manager should be grounded in. So, to use (or just understand) the sizing rules, see [[File Size Calculation in Detail]]. </p>
<p>However, it is valuable for a file manager to be grounded in the principles of Model 204 file size calculation. To use (or just understand) the sizing rules, see [[File size calculation in detail]]. </p>
 
==Critical, Up Front Decisions==
 
<p>There are, however, certain decisions which are more difficult to fix and so should be as 'correct' as possible, as early as possible:</p>
 
 
<b>[[ATRPG parameter|ATRPG]] and [[ASTRPPG parameter]]s</b>
 
<p>Because the [[Table A (File Architecture)#Internal File Dictionary|Internal File Dictionary]] is hashed, it can only be resized by reorganizing / recreating the file. This involves an outage of the online, and so should be avoided.</p>
 
<p>You can look at the detailed calculation rules at [[File Size Calculation in Detail#Sizing Table A|Sizing Table A]] or, given how small Table A is compared to the other tables, round up on your rough estimate for [[ATRPG parameter|ATRPG]] and round down (to try and fit fewer field definitions on each Table A page) for [[ASTRPPG parameter|ASTRPPG]].</p>
 
 
<b>[[CSIZE parameter]]</b>


<p>Like the internal file dictionary, Table C is hashed and so can not be dynamically changed.</p>
==Critical, up-front decisions==
<p>
Because certain decisions are more difficult to fix, they must be as "correct" as possible, as early as possible. These are described in the sections that follow.</p>  


<p>The easiest way to make sure that this is an issue is not to define any KEY or NUMERIC RANGE fields: make them ordered instead. This has the associated advantage of, if you use [[FILEORG parameter|FILEORG]] x'100' files, of permitting up to 32000 field names in a file.</p>  
===ATRPG and ASTRPPG parameters===
<p>
Because the [[Table A (File architecture)#Internal File Dictionary|Internal File Dictionary]] is hashed, it can only be resized by reorganizing/recreating the file. This involves an outage of the Online, so it should be avoided.</p>  


<p>You can look at the detailed calculation rules at [[File size calculation in detail#Sizing Table A|Sizing Table A]] or, given how small Table A is compared to the other tables, round up on your rough estimate for <var>[[ATRPG parameter|ATRPG]]</var> and round down (to try and fit fewer field definitions on each Table A page) for <var>[[ASTRPPG parameter|ASTRPPG]]</var>.</p>


<b>Number of Datasets</b>
===CSIZE parameter===
<p>
Like the internal file dictionary, Table C is hashed and so cannot be dynamically changed.</p>


<p>You can dynamically [[INCREASE command|add datasets]] to a Model 204 file.</p>
<p>The easiest way to make sure that this is not an issue is to avoid defining any <var>KEY</var> or <var>NUMERIC RANGE</var> fields: make them ordered instead. This has the associated advantage, if you use <var>[[FILEORG parameter|FILEORG]]</var> X'100' files, of permitting as many as 32,000 field names in a file.</p>


<p>The reason that it is better not have to is that there may be JCL containing file references which would need to be updated at the same time.</p>
===Number of data sets===
<p>
You can dynamically [[INCREASE command|add datasets]] to a Model 204 file.</p>


<p>Unless space is at a premium, it is a good idea to define dataset(s) larger than you need, which gives you the ability to automatically, or manually, increase the Tables without issue.</p>  
<p>It is better not to have to do this, because there may be JCL containing file references that would need to be updated at the same time.</p>


<p>Unless space is at a premium, it is a good idea to define larger data set(s) than you need, which gives you the ability to automatically, or manually, increase the Tables without issue.</p>   


[[Category:File management]]
[[Category:Model 204 files]]

Latest revision as of 18:45, 13 April 2015

Overview: two approaches to file sizing

After designing the data structures you are implementing (see Field design and Field group design), there are two ways for a file manager to approach the calculation of file sizes:

You can take the ad-hoc approach, by making sizing estimates and doing one or a combination of these:

  • Iteratively load a sampling of records to verify
  • Use the development and testing process to make final sizing decisions
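
The sample-load idea above can be sketched as a simple linear extrapolation. This is only an illustration, not a Model 204 sizing rule: the function name, its parameters, and the 25 percent headroom default are all hypothetical, and it assumes page usage grows in proportion to record count.

```python
def estimate_pages(sample_records, sample_pages_used, expected_records,
                   headroom_pct=25):
    """Extrapolate page usage from a sample load and add headroom.

    Hypothetical helper: assumes pages used scale linearly with the
    number of records, which is only a rough approximation.
    """
    if sample_records <= 0 or sample_pages_used <= 0:
        raise ValueError("sample counts must be positive")
    if expected_records < 0 or headroom_pct < 0:
        raise ValueError("expected_records and headroom_pct must be non-negative")

    # Integer arithmetic avoids floating-point rounding surprises.
    numerator = sample_pages_used * expected_records * (100 + headroom_pct)
    denominator = sample_records * 100

    # Ceiling division: always round the page count up.
    return -(-numerator // denominator)


if __name__ == "__main__":
    # A 10,000-record sample that used 400 pages, scaled to 1,000,000 records.
    print(estimate_pages(10_000, 400, 1_000_000))
```

Loading a second, larger sample and comparing the two estimates is one way to check whether the linear assumption holds for your data.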

Alternatively, you can do a detailed analysis of the data you expect the file to contain, and try to derive precise sizes for the Model 204 tables, a laborious process.

Choosing an approach

Most Model 204 file managers use the ad-hoc design approach. Often there is already a production file with similar characteristics to the new file you are creating. Simply copying its parameters as a starting point is a quick way to get a file ready for development and testing.

Moreover, most of the sizes can easily be changed dynamically, so extreme precision is not important.

However, it is valuable for a file manager to be grounded in the principles of Model 204 file size calculation. To use (or just understand) the sizing rules, see File size calculation in detail.

Critical, up-front decisions

Because certain decisions are more difficult to fix, they must be as "correct" as possible, as early as possible. These are described in the sections that follow.

ATRPG and ASTRPPG parameters

Because the Internal File Dictionary is hashed, it can only be resized by reorganizing or recreating the file. Doing so requires an outage of the Online, so it is best avoided.

You can look at the detailed calculation rules at Sizing Table A or, given how small Table A is compared to the other tables, round up your rough estimate for ATRPG and round down your estimate for ASTRPPG (to try to fit fewer field definitions on each Table A page).

CSIZE parameter

Like the Internal File Dictionary, Table C is hashed and so cannot be resized dynamically.

The easiest way to make sure that this is not an issue is to avoid defining any KEY or NUMERIC RANGE fields: make them ordered instead. This has the associated advantage, if you use FILEORG X'100' files, of permitting as many as 32,000 field names in a file.

Number of data sets

You can dynamically add data sets to a Model 204 file.

It is better not to have to do this, because there may be JCL containing file references that would need to be updated at the same time.

Unless space is at a premium, it is a good idea to define larger data sets than you need, which gives you room to increase the tables, automatically or manually, without issue.
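
As a rough illustration of why oversized data sets help, the sketch below checks whether a planned table increase fits in the pages not yet assigned to any table. It is a simplified model under stated assumptions: the function names and the dictionary of table sizes are hypothetical, and it ignores any pages the file uses for its own control information.

```python
def free_pages(dataset_pages, table_pages):
    """Return pages in the data sets not yet assigned to any table.

    dataset_pages: list of page counts, one per data set (hypothetical input).
    table_pages:   dict mapping a table name to its current size in pages.
    """
    total = sum(dataset_pages)
    used = sum(table_pages.values())
    if used > total:
        raise ValueError("tables cannot occupy more pages than the data sets hold")
    return total - used


def fits_in_free_space(dataset_pages, table_pages, extra_pages):
    """True if a table increase of extra_pages fits without adding a data set."""
    return extra_pages <= free_pages(dataset_pages, table_pages)


if __name__ == "__main__":
    datasets = [6000, 4000]  # two data sets, sizes in pages
    tables = {"A": 50, "B": 5000, "C": 1, "D": 2000}
    print(free_pages(datasets, tables))
    print(fits_in_free_space(datasets, tables, 2000))
```

When the check fails, the choice is between adding a data set (with the JCL changes noted above) and having allocated more space up front.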