File architecture overview: Difference between revisions

From m204wiki
mNo edit summary
No edit summary
 
(8 intermediate revisions by 5 users not shown)
Line 1: Line 1:
== Summary ==
==Summary==
 
<p>
<p>Model 204 files provide a highly flexible environment for handling large or small amounts of data. The system supports all types of data structures: flat structures, relations, hierarchies, and networks.</p>
Model 204 files provide a highly flexible environment for handling large or small amounts of data. The system supports all types of data structures: flat structures, relations, hierarchies, and networks.</p>
 


<p>Model 204 uses inverted file retrieval techniques. These techniques facilitate rapid retrieval of data without requiring expensive scanning of the database itself.</p>
<p>Model 204 uses inverted file retrieval techniques. These techniques facilitate rapid retrieval of data without requiring expensive scanning of the database itself.</p>


<p>Data within a Model 204 file are kept in an arbitrary collection of [[Record (File architecture)|records]].</p>
<p>Data within a Model 204 file is kept in an arbitrary collection of [[Record (File architecture)|records]].</p>


<p>Records in a file normally are related to each other, although they need not be. Records within a given file may have the same format (same collection of field names) or any number of different formats (containing any mixture of the fields defined to Table A, below). A file can contain as many as 16.7 million records. With the new FILEORG x&#x2019;200&#x2019; setting, available from Model 204 V7R5 onwards, this upper limit will increase to 48 million.</p>
<p>Records in a file normally are related to each other, although they need not be. Records within a given file may have the same format (same collection of field names) or any number of different formats (containing any mixture of the fields defined to Table A, below). Prior to version 7.5 of Model 204, a file could contain as many as 16.7 million records. With the <var>[[FILEORG parameter|FILEORG]]</var> X'200' setting, available as of Model 204 V7.5, this upper limit increases to 48 million.</p>


<p>Files can be linked together logically according to field values within the files. Any number of files can be linked in this way. As many as 32,767 files can be accessed in one Model 204 job.</p>  
<p>Files can be linked together logically according to field values within the files. Any number of files can be linked in this way. As many as 32,767 files can be accessed in one Model 204 job.</p>  
Line 14: Line 13:
<p>Model 204 files consist of one or more data sets. Each data set is formatted into fixed-length physical records called [[Page (File architecture)|pages]], and all pages are the same size: 6184 bytes.</p>  
<p>Model 204 files consist of one or more data sets. Each data set is formatted into fixed-length physical records called [[Page (File architecture)|pages]], and all pages are the same size: 6184 bytes.</p>  


<p>Internally, these pages are organized into tables: the [[File Control Table (File architecture)|File Control Table]] (FCT); [[Table A (File architecture)|Table A]]; [[Table B (File architecture)|Table B]]; [[Table C (File architecture)|Table C]]; [[Table D (File architecture)|Table D]]; [[Table E (File architecture)|Table E]]; and [[Table X (File architecture)|Table X]]. Any pages in the file not allocated to one of these tables are considered free space. See [[Managing file and table sizes]] for how this free space can be applied to the other tables.</p>
<p>Internally, these pages are organized into tables: the [[File Control Table (File architecture)|File Control Table]] (FCT), [[Table A (File architecture)|Table A]], [[Table B (File architecture)|Table B]], [[Table C (File architecture)|Table C]], [[Table D (File architecture)|Table D]], [[Table E (File architecture)|Table E]], and [[Table X (File architecture)|Table X]]. Any pages in the file not allocated to one of these tables are considered free space. See [[Managing file and table sizes]] for how this free space can be applied to the other tables.</p>
 
==The components of a Model 204 File==
 
 
[[File:Table Structure (File Architecture).jpg]]
 
 
 
 
=== File Control Table ===
 
The [[File Control Table (File architecture)|File Control Table (FCT)]] Contains file parameter settings, data set or file definition names of all data sets in the file, the status of the file, and other control information. The FCT is always 8 pages.


<div id="fileComponents"></div>
==The components of a Model 204 file==


=== The Internal File Directory ===
[[File:Table Structure (File Architecture).jpg|border]]


[[Table A (File architecture)|Table A]] contains three structures:
===File Control Table===
The [[File Control Table (File architecture)|File Control Table (FCT)]] contains file parameter settings, data set or file definition names of all data sets in the file, the status of the file, and other control information. The FCT is always 8 pages.


A dictionary of the fieldgroup / field names and their attributes.
===Internal file directory===
[[Table A (File architecture)|Table A]] contains three structures: a file dictionary, a set of <var>MANY-VALUED</var> pages, and a set of <var>FEW-VALUED</var> pages.


Some attributes (notably [[Field design (File management)#CODED and NON-CODED attributes|CODED]]) require lists of values to be maintained. These lists are stored either in the [[Field design (File management)#FEW-VALUED and MANY-VALUED attributes|FEW-VALUED or MANY-VALUED]] structures.  
The file dictionary contains the file's field group and field names and their attributes. Some field attributes (notably <var>[[Field design#CODED and NON-CODED attributes|CODED]]</var>) require lists of values to be maintained. These lists are stored either in the [[Field design#FEW-VALUED and MANY-VALUED attributes|FEW-VALUED or MANY-VALUED]] structures.  


Table A usually is small in relation to the rest of the file. The field name section in particular should be as small as possible to aid efficient access, especially if your site uses field name variables
Table A usually is small in relation to the rest of the file. The field name section in particular should be as small as possible to aid efficient access, especially if your site uses field name variables.


=== Data ===
===Data===


==== Data Structures ====
====Data structures====


All data held in a Model 204 file are held in [[Record (File architecture)|records]].
All data held in a Model 204 file is held in [[Record (File architecture)|records]].


Model 204 records are best thought of as very loosely defined containers, with almost no fixed structure.  
Model 204 records are best thought of as very loosely defined containers, with almost no fixed structure.  


 
The fundamental elements of Model 204&#x2019;s logical data structure are:
::{| class="wikitable";style="width="80%
<table style="width:80%">
 
<tr><td>Record</td>
|-
<td>Collection of fields, either individual or in physical field groups (see below).
|-
<p>
! scope="row"| [[Record (File architecture)|Record]]
| Collection of fields, either individual or in physical field groups (see below).
 
Each record is variable in length and need contain only the fields that pertain to it. The limit of the number of field value
Each record is variable in length and need contain only the fields that pertain to it. The limit of the number of field value
pairs in a record is in the tens of millions.
pairs in a record is in the tens of millions.</p>
 
<p>
There is only a limited fixed format for a record ([[Field design (File management)#Preallocated fields|preallocated fields]]). Almost any number of fields
There is only a limited fixed format for a record ([[Field design#Preallocated fields (OCCURS attribute)|preallocated fields]]). Almost any number of fields
can appear almost any number of times in almost any order.
can appear almost any number of times in almost any order.
Each record is automatically assigned a unique internal record
Each record is automatically assigned a unique internal record
number that is used by the system to build index entries for the
number that is used by the system to build index entries for the
record.
record.</p></td></tr>


|-
<tr><td>Field</td>
! scope="row"| [[Field group (File architecture)|Field group]]  
<td>Elementary data item.
| Available as of Model 204 V7R5.
<p>
Fields are [[DEFINE FIELD command|defined]] with attributes that control storage formats and indexing options. Up to 4000 (32000 as of Model 204 V7.5, when using FILEORG X'100') different field names can be defined in a single Model 204 file.</p></td></tr>


A field group is, as the name implies, a set of fields which are retrieved and updated in a single operation.  
<tr><td nowrap>[[Field group (File architecture)|Field group]] </td>
<td>A field group is, as the name implies, a set of fields that are retrieved and updated in a single operation.  
<p>
Available as of Model 204 V7.5.</p></td></tr>


|-
<tr><td>File</td>
! scope="row"| Field
<td>An arbitrary collection of records.</td></tr>
| Elementary data item.


Fields are [[DEFINE FIELD command| defined]] with attributes that control storage formats and indexing options. Up to 4000 (32000 as of Model 204 V7R5, when using FILEORG x'100') different field names can be defined in a single Model 204 file.
<tr><td>File group</td>
<td>A collection of files logically grouped together that can be treated as a single file.</td></tr>
</table>


|-
====Data tables====
! scope="row"| File
[[Table B (File architecture)|Table B]] contains (at least) the base data of all the records in the file. These base records contain pointers to any extensions that exist (whether they are elsewhere in Table B or in Table X).
| An arbitrary collection of records.
 
[[Table X (File architecture)|Table X]], when enabled, holds all extension records. Table B is then used
to store only base records, thus maximizing the possible number of records that can be stored regardless of the number of extension records.


|-
[[Table E (File architecture)|Table E]] (when enabled) contains Large Object (LOB) data for the file. The pointer to the starting point of the LOB is contained in Table B or X.  
! scope="row"| File group
| A collection of files logically grouped together that can be treated as a single file.


|}
===Indexing===


==== Data Tables ====
====Index structures====
<p>
The indexing structures necessary for direct retrieval of
records are contained in [[Table C (File architecture)|Table C]] and [[Table D (File architecture)|Table D]].</p>


[[Table B (File architecture)|Table B]] contains (at least) the base data of all the records in the file. These base records contain pointers to any extensions which exist (whether they are elsewhere in Table B or in Table X).
<p>There are two types of indexes, both of which utilize Table D [[Table D (File architecture)#List and bitmap pages|list and bitmap pages]]:</p>
 
[[Table X (File Architecture)|Table X]] when enabled, holds all extension records. Table B is then used
to store only base records, thus, maximizing the possible number of records can be stored regardless of the number of
extension records.


[[Table E (File Architecture)|Table E]] (when enabled) contains Large Object data for the file. The pointer to the starting point of the LOB is contained in Table B or X.
<table style="width:80%">
<tr class="head">
<th>Index type</th>
<th>Description</th></tr>


=== Indexing ===
<tr><td nowrap>Ordered Index</td>
<td>Consists of the Ordered Index B-tree, contained in Table D, for <var>[[Table D (File architecture)#Ordered_Index|ORDERED]]</var> fields, along with either a number of IMMEDiate pointers to the record in Table B or to a secondary index (of list or bitmap pages) located in Table D. </td></tr>


==== Index Structures ====
<tr><td>Hashed Index
<td>Consists of Table C, which indexes <var>[[Field design#KEY and NON-KEY attributes|KEY]]</var> and <var>[[Field design#NUMERIC RANGE and NON-RANGE attributes|NUMERIC RANGE]]</var> values in fields, along with either a single direct pointer (per segment) to a base record in Table B or to a secondary index (of list or bitmap pages) located in Table D.</td></tr>
</table>


<p>The indexing structures necessary for direct retrieval of
====Index tables====
records are contained in Tables C and D.</p>  
Table C contains a series of entries for every field/value combination that occurs in the file for fields defined as <var>KEY</var>. There is also a series of entries for every field value pair that occurs for fields that have the <var>NUMERIC RANGE</var> attribute.


<p>There are two types of indexes, both of which utilize Table D [[Table D (File architecture)#List and bitmap pages|List and bitmap pages]]:</p>
The series consists of an entry cell, chained to cells with an entry for any segment that contains one or more occurrences of the [[Field value pairs (File architecture)|field value pair]]. If a field value pair is unique within a segment, the Table C cell contains the internal record number of the Table B record in which the field occurs. If the field name = value pair is not unique in the segment, the Table C cell contains a pointer to a bitmap or list page in Table D (as described above).


::{| class="wikitable";style="width="80%
Table D contains:  
|-
<ul>
! align="center" scope="col" | Index Type
<li>The Ordered Index B-tree node pages, which contain the values and record accessing information for all <var>ORDERED</var> fields.</li>
! align="center" scope="col" | Description
|-
! scope="row"| Ordered Index
| Is composed of the Ordered Index B-tree, contained in Table D, for [[Table D (File architecture)#Ordered_Index|ORDERED]] fields along with either a number of IMMEDiate pointers to the record in Table B, or to a secondary index (of list or bitmap pages) located in Table D.


|-
<li>The bitmap and list pages described above. This includes the [[Table D (File architecture)#Existence Bit Map|Existence Bit Map]].</li>
! scope="row"| Hashed Index
</ul>
| Is composed of Table C, which indexes [[Field design (File management)#KEY and NON-KEY attributes|KEY]] and [[Field design (File management)#NUMERIC RANGE and NON-RANGE attributes|NUMERIC RANGE]] values in fields, along with either a single direct pointer (per segment) to a base record in Table B or or a secondary index (of list or bitmap pages) located in Table D.
 
|}
 
==== Index Tables ====
 
[[Table C (File architecture)|Table C]] contains a series of entries for every field / value combination that occurs in the file for fields defined as [[Field design (File management)#KEY and NON-KEY attributes|KEY]]. There are also a series of entries for every Field Value Pair that occurs for fields that have the [[Field design (File management)#NUMERIC RANGE and NON-RANGE attributes|NUMERIC RANGE]] attribute.


The series consists of an entry cell, chained to cells with an entry for any segment that contains one or more occurrences of the [[Field value pairs (File architecture)|field value pair]]. If a Field Value Pair  is unique within a segment, the Table C cell contains the internal record number of the Table B record in which the field occurs. If the field name = value pair is not unique in the segment, the Table C cell contains a pointer to a bitmap or list page in Table D (as described above).
Table D can be expanded with the <var>[[INCREASE command|INCREASE]]</var> command.


[[Table D (File architecture)|Table D]] contains:  
===Procedures and miscellaneous structures===
Table D also contains a few other structures:  
<ul>
<ul>
<li>the Ordered Index B-tree node pages, which contain the values and record accessing information for all [[Table D (File architecture)#Ordered Index|ORDERED]] fields</li>
<li>Procedures (code) and structures used to manage them</li>  
<li>the bitmap and list pages described above. This includes the [[Table D (File architecture)#Existence Bit Map|Existence Bit Map]]</li>
</ul>  


Table D can be expanded with the [[INCREASE command]].
<li>A procedure dictionary (used to store procedure names and aliases)</li>


=== Procedures and Miscellaneous Structures===
<li>An Access Control Table (used to map user and procedure classes)</li>


[[Table D (File architecture)|Table D]] also contains a few other structures:
<li>A reserved area (see <var>[[DPGSRES parameter|DPGSRES]]</var>) that primarily provides additional space for the transaction back out facility</li>  
<ul>
<li>
procedures (code) and structures used to manage them</li>  
<li>a procedure dictionary (used to store procedure names and aliases)</li>
<li>an Access Control Table (used to map user and procedure classes)</li>
<li>a reserved area ([[DPGSRES parameter|DPGSRES]]) that primarily provides additional space for the transaction back out facility</li>  
</ul>  
</ul>  


Table D also contains a map of the Preallocated Fields (used whenever a new record is stored). Use the [[DISPLAY RECORD command]] to view this map.   
Table D also contains a map of the preallocated fields (used whenever a new record is stored). Use the <var>[[DISPLAY RECORD command|DISPLAY RECORD]]</var> command to view this map.   
 


[[Category:File architecture and management]]
[[Category:File architecture]]
[[Category:File architecture]]

Latest revision as of 16:54, 12 May 2014

Summary

Model 204 files provide a highly flexible environment for handling large or small amounts of data. The system supports all types of data structures: flat structures, relations, hierarchies, and networks.

Model 204 uses inverted file retrieval techniques. These techniques facilitate rapid retrieval of data without requiring expensive scanning of the database itself.
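As a rough illustration of the inverted file idea, the following Python sketch maps each field/value pair to the set of record numbers that contain it, so a retrieval can be answered from the index alone. The function name, the sample fields, and the in-memory dictionary are assumptions made for illustration; they are not Model 204's actual structures.

```python
from collections import defaultdict

def build_inverted_index(records):
    """Map each (field, value) pair to the set of record numbers holding it."""
    index = defaultdict(set)
    for recno, pairs in enumerate(records):
        for field_name, value in pairs:
            index[(field_name, value)].add(recno)
    return index

# Hypothetical sample data: each record is a list of (field, value) pairs.
records = [
    [("NAME", "SMITH"), ("CITY", "BOSTON")],
    [("NAME", "JONES"), ("CITY", "BOSTON")],
    [("NAME", "SMITH"), ("CITY", "AUSTIN")],
]
index = build_inverted_index(records)

# Retrieval intersects index entries; the records themselves are not scanned.
matches = index[("NAME", "SMITH")] & index[("CITY", "BOSTON")]
```

Here `matches` holds only record number 0, found without reading any record.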

Data within a Model 204 file is kept in an arbitrary collection of records.

Records in a file normally are related to each other, although they need not be. Records within a given file may have the same format (same collection of field names) or any number of different formats (containing any mixture of the fields defined to Table A, below). Prior to version 7.5 of Model 204, a file could contain as many as 16.7 million records. With the FILEORG X'200' setting, available as of Model 204 V7.5, this upper limit increases to 48 million.

Files can be linked together logically according to field values within the files. Any number of files can be linked in this way. As many as 32,767 files can be accessed in one Model 204 job.

Model 204 files consist of one or more data sets. Each data set is formatted into fixed-length physical records called pages, and all pages are the same size: 6184 bytes.

Internally, these pages are organized into tables: the File Control Table (FCT), Table A, Table B, Table C, Table D, Table E, and Table X. Any pages in the file not allocated to one of these tables are considered free space. See Managing file and table sizes for how this free space can be applied to the other tables.

The components of a Model 204 file

File Control Table

The File Control Table (FCT) contains file parameter settings, data set or file definition names of all data sets in the file, the status of the file, and other control information. The FCT is always 8 pages.
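The page and table figures above lend themselves to simple arithmetic. The Python sketch below converts page counts to bytes and computes free space as the pages not allocated to any table. Only the 6184-byte page size and the 8-page FCT come from the text; the other table sizes and the 1000-page file are invented for illustration.

```python
PAGE_SIZE = 6184  # bytes per page, from the text
FCT_PAGES = 8     # the FCT is always 8 pages, from the text

def pages_to_bytes(pages: int) -> int:
    """Convert a page count to bytes."""
    return pages * PAGE_SIZE

def free_pages(total_pages: int, table_pages: dict) -> int:
    """Pages not allocated to any table are considered free space."""
    allocated = sum(table_pages.values())
    if allocated > total_pages:
        raise ValueError("tables exceed the file's page count")
    return total_pages - allocated

# Hypothetical allocation for a 1000-page file.
tables = {"FCT": FCT_PAGES, "A": 20, "B": 500, "C": 50, "D": 300}
fct_bytes = pages_to_bytes(FCT_PAGES)   # 8 pages of 6184 bytes
spare = free_pages(1000, tables)        # pages left unallocated
spare_bytes = pages_to_bytes(spare)
```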

Internal file directory

Table A contains three structures: a file dictionary, a set of MANY-VALUED pages, and a set of FEW-VALUED pages.

The file dictionary contains the file's field group and field names and their attributes. Some field attributes (notably CODED) require lists of values to be maintained. These lists are stored either in the FEW-VALUED or MANY-VALUED structures.

Table A usually is small in relation to the rest of the file. The field name section in particular should be as small as possible to aid efficient access, especially if your site uses field name variables.

Data

Data structures

All data held in a Model 204 file is held in records.

Model 204 records are best thought of as very loosely defined containers, with almost no fixed structure.

The fundamental elements of Model 204’s logical data structure are:

Record: Collection of fields, either individual or in physical field groups (see below).

Each record is variable in length and need contain only the fields that pertain to it. The limit on the number of field value pairs in a record is in the tens of millions.

There is only a limited fixed format for a record (preallocated fields). Almost any number of fields can appear almost any number of times in almost any order. Each record is automatically assigned a unique internal record number that is used by the system to build index entries for the record.

Field: Elementary data item.

Fields are defined with attributes that control storage formats and indexing options. Up to 4000 (32000 as of Model 204 V7.5, when using FILEORG X'100') different field names can be defined in a single Model 204 file.

Field group: A set of fields that are retrieved and updated in a single operation.

Available as of Model 204 V7.5.

File: An arbitrary collection of records.

File group: A collection of files logically grouped together that can be treated as a single file.
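The loose record structure described above can be pictured with a small Python sketch: a record is an ordered list of field value pairs, a field may repeat, and two records in the same file need not share any fields. The class, its methods, and the sample field names are hypothetical and say nothing about how Model 204 stores records physically.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    recno: int                                  # internal record number
    pairs: list = field(default_factory=list)   # ordered (name, value) pairs

    def add(self, name, value):
        """Append one field occurrence; repeats are allowed."""
        self.pairs.append((name, value))

    def values(self, name):
        """Return every occurrence of a field, in stored order."""
        return [v for n, v in self.pairs if n == name]

# Two records in the same file with different "formats".
r1 = Record(recno=0)
r1.add("NAME", "SMITH")
r1.add("PHONE", "555-0100")
r1.add("PHONE", "555-0199")   # the same field appears twice

r2 = Record(recno=1)
r2.add("POLICY", "A-1234")    # shares no fields with r1
```

A lookup such as `r1.values("PHONE")` returns both occurrences, while `r2.values("PHONE")` returns an empty list because the field simply does not appear in that record.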

Data tables

Table B contains (at least) the base data of all the records in the file. These base records contain pointers to any extensions that exist (whether they are elsewhere in Table B or in Table X).

Table X, when enabled, holds all extension records. Table B is then used to store only base records, thus maximizing the possible number of records that can be stored regardless of the number of extension records.

Table E (when enabled) contains Large Object (LOB) data for the file. The pointer to the starting point of the LOB is contained in Table B or X.
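The relationship between base records and extension records can be sketched in Python as follows. This is a simplified model under stated assumptions: the tables are plain lists, a record is split into fixed-size chunks (the hypothetical `base_capacity`), and each piece points to the next one. The real page layout and pointer format are not described in the text and are not modeled here.

```python
def store(pairs, base_capacity, table_b, table_x=None):
    """Store a record as a base record plus chained extension records.

    Extensions go to table_x when it is enabled, otherwise to table_b.
    Returns the position of the base record in table_b.
    """
    chunks = [pairs[i:i + base_capacity]
              for i in range(0, len(pairs), base_capacity)] or [[]]
    ext_table = table_x if table_x is not None else table_b
    ext_name = "X" if table_x is not None else "B"
    next_ptr = None
    # Store extensions last-to-first so each piece knows its successor.
    for chunk in reversed(chunks[1:]):
        ext_table.append({"pairs": chunk, "next": next_ptr})
        next_ptr = (ext_name, len(ext_table) - 1)
    table_b.append({"pairs": chunks[0], "next": next_ptr})
    return len(table_b) - 1

def read(base_idx, table_b, table_x=None):
    """Follow the pointers from a base record and reassemble the record."""
    tables = {"B": table_b, "X": table_x}
    node = table_b[base_idx]
    out = list(node["pairs"])
    while node["next"] is not None:
        name, idx = node["next"]
        node = tables[name][idx]
        out.extend(node["pairs"])
    return out

pairs = [("F", n) for n in range(5)]

# Without Table X, extensions consume Table B slots.
b_only = []
base1 = store(pairs, 2, b_only)

# With Table X, Table B holds only the base record.
b_tab, x_tab = [], []
base2 = store(pairs, 2, b_tab, x_tab)
```

In this toy model the five-pair record occupies three Table B slots when Table X is absent, but only one Table B slot when Table X is enabled, which mirrors the capacity benefit described above.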

Indexing

Index structures

The indexing structures necessary for direct retrieval of records are contained in Table C and Table D.

There are two types of indexes, both of which utilize Table D list and bitmap pages:

Ordered Index: Consists of the Ordered Index B-tree, contained in Table D, for ORDERED fields, along with either a number of IMMEDiate pointers to records in Table B or a pointer to a secondary index (of list or bitmap pages) located in Table D.

Hashed Index: Consists of Table C, which indexes KEY and NUMERIC RANGE values in fields, along with either a single direct pointer (per segment) to a base record in Table B or a pointer to a secondary index (of list or bitmap pages) located in Table D.
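To give a feel for the difference between the two index styles, the Python sketch below keeps one hashed structure for direct equality lookup and one ordered structure that is walked in value sequence. A dictionary stands in for the hashed index and a sorted list stands in for the B-tree; both stand-ins, the function names, and the sample values are assumptions for illustration, not the Model 204 implementation.

```python
import bisect

hashed = {}        # value -> set of record numbers (stand-in for a hashed index)
ordered_keys = []  # values kept in sorted order (stand-in for a B-tree)
ordered_map = {}   # value -> set of record numbers

def index_value(value, recno):
    """Add one value occurrence to both illustrative indexes."""
    hashed.setdefault(value, set()).add(recno)
    if value not in ordered_map:
        bisect.insort(ordered_keys, value)
        ordered_map[value] = set()
    ordered_map[value].add(recno)

def find_equal(value):
    """Direct lookup: one probe on the exact value."""
    return set(hashed.get(value, ()))

def find_range(lo, hi):
    """Ordered lookup: locate the bounds, then walk the values in sequence."""
    start = bisect.bisect_left(ordered_keys, lo)
    end = bisect.bisect_right(ordered_keys, hi)
    result = set()
    for value in ordered_keys[start:end]:
        result |= ordered_map[value]
    return result

# Hypothetical AGE values for records 0 to 3.
for recno, age in enumerate([30, 45, 30, 52]):
    index_value(age, recno)

same_age = find_equal(30)
older = find_range(40, 60)
```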

Index tables

Table C contains a series of entries for every field/value combination that occurs in the file for fields defined as KEY. There is also a series of entries for every field value pair that occurs for fields that have the NUMERIC RANGE attribute.

The series consists of an entry cell, chained to cells with an entry for any segment that contains one or more occurrences of the field value pair. If a field value pair is unique within a segment, the Table C cell contains the internal record number of the Table B record in which the field occurs. If the field name = value pair is not unique in the segment, the Table C cell contains a pointer to a bitmap or list page in Table D (as described above).
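The per-segment rule just described can be sketched in Python. For one field value pair, the record numbers are grouped by segment; a segment with a single matching record gets a cell holding that record number, and a segment with several matches gets a cell that refers to a list (standing in for a Table D list or bitmap page). The segment size of 49,152 records is an assumption not stated in this text, and the cell tags `RECNO` and `TABLE_D` are hypothetical labels.

```python
SEGMENT_SIZE = 49_152  # assumed records per segment; not stated in the text

def build_cells(recnos, segment_size=SEGMENT_SIZE):
    """Build one illustrative Table C cell per segment for a field value pair."""
    by_segment = {}
    for recno in sorted(set(recnos)):
        by_segment.setdefault(recno // segment_size, []).append(recno)

    cells = {}
    for segment, members in by_segment.items():
        if len(members) == 1:
            # Unique within the segment: the cell holds the record number.
            cells[segment] = ("RECNO", members[0])
        else:
            # Not unique: the cell refers to a list/bitmap structure instead.
            cells[segment] = ("TABLE_D", members)
    return cells

# Hypothetical record numbers that all contain the same field value pair.
cells = build_cells([5, 49_162, 49_172, 98_305])
```

With these sample numbers, segments 0 and 2 each get a direct record number, while segment 1 gets a reference to a two-entry list.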

Table D contains:

  • The Ordered Index B-tree node pages, which contain the values and record accessing information for all ORDERED fields.
  • The bitmap and list pages described above. This includes the Existence Bit Map.

Table D can be expanded with the INCREASE command.

Procedures and miscellaneous structures

Table D also contains a few other structures:

  • Procedures (code) and structures used to manage them
  • A procedure dictionary (used to store procedure names and aliases)
  • An Access Control Table (used to map user and procedure classes)
  • A reserved area (see DPGSRES) that primarily provides additional space for the transaction back out facility

Table D also contains a map of the preallocated fields (used whenever a new record is stored). Use the DISPLAY RECORD command to view this map.