Sorting

From m204wiki
Revision as of 18:06, 8 January 2014 by JAL (talk | contribs)
Jump to navigation Jump to search

Overview

IN ORDER clause

Record retrieval normally creates a found set of records in which the records are held in the order in which they are retrieved.

To produce output in sorted order, specify the IN ORDER clause with:

To use the IN ORDER clause with these statements requires that one of the following conditions exists:

  • Sort key fields are defined with the ORDERED attribute
  • The file is defined as a Sorted file.

SORT RECORDS statement

You can sort a set of records using the SORT RECORDS statement. The SORT RECORDS statement copies only the portions of records needed to complete the request to a temporary work file and sorts the set of temporary records in the user-specified order. The ordered, temporary set of records is kept only for the duration of the request.

You can use the SORT RECORDS statement with a sorted or an unsorted file. If the file is sorted, the key specified in the SORT RECORDS statement need not be the sort field for the file.

Using sort utilities

If you need to perform a sort on several found sets from the same file, you might gain a performance improvement by presorting the file with a sort utility, such as the SYNCSORT utility.

See Sorted files for more information about sorting files.

Syntax for the SORT RECORDS statement

This section describes the syntax of the SORT RECORDS statement, including a discussion of several of the options. See also Usage guidelines for the SORT RECORDS statement.

Syntax

The format of the SORT RECORDS statement is:

SORT [k] RECORDS {IN label | ON [LIST] listname} BY key [AND key] ...

Where:

key = [EACH] fieldname [VALUE {[ASCENDING | DESCENDING] [CHARACTER | NUMERICAL] | [RIGHT-ADJUSTED]} ...]

The SORT RECORDS statement is supported in remote file and scattered group contexts.

k option

The k option specifies a number of records to sort. If a number is specified, only the first k referenced records are processed, and only k temporary records are created.

DESCENDING option

The DESCENDING option specifies a reversal of the sort sequence. For example, serial numbers A30, A301, and B95 sort in that order. But when DESCENDING is specified, the order is reversed: B95, A301, A30. Groups of equal records are not reversed. Records for which no value is available are listed first.

NUMERICAL option

The NUMERICAL option specifies a sort with the usual numeric order relationships. This is possible only if the field value has the proper form.

Proper form required

A field value in the proper form is a rational number with optional sign or a number in exponent notation. Leading or trailing blanks, or blanks between the sign and the number, are ignored.

Numeric fields must have fewer than 64 significant digits. If you specify both the NUMERICAL and RIGHT-ADJUSTED options, the alphabetic fields must be shorter than128 characters.

Before sorting, Model 204 determines whether the value of the field in each record has the proper form. Fields whose values do not have the proper form are sorted first. A field that does have the proper form is sorted in the proper numeric order whether or not its field type is numeric.

Usefulness of the NUMERIC option

Suppose a field contains temperatures, some of which contain leading plus and minus signs, and some of which are assumed to be a positive temperature when the plus sign is omitted. Because plus and minus signs are special characters, the fields containing them are sorted prior to unsigned fields.

SORT RECORDS IN FIND.RECS BY TEMP

would produce this order of sample values:

+65 -65 63 65 66

However,

SORT RECORDS IN FIND.RECS BY TEMP VALUE NUMERICAL

would result in:

-65 63 +65 65 66

RIGHT-ADJUSTED option

The RIGHT-ADJUSTED option specifies that Model 204 should temporarily right-justify the field values before sorting, so that the shorter fields sort first. For example:

EATON EASTON EARNEST

If you do not specify any sort order options

If you do not specify any sort order options (DESCENDING, NUMERICAL, or RIGHT-ADJUSTED), the SORT RECORDS statement creates a sorted copy of the records to which it refers, ordered as follows:

  1. If no value is present, a character string of zero length, called a null string, is assumed, and those records are listed first. The sort sequence is EBCDIC (punctuation, then letters, then numbers). Ordinarily, fields of unequal lengths are left-justified before being compared. For example, the names Earnest, Eaton, and Easton would be ordered:

    EARNEST EASTON EATON

  2. Each set of equal records remains in its retrieved order. If additional sort keys are specified, they are used, one after the other, to order otherwise equal records.

Usage guidelines for the SORT RECORDS statement

This section provides guidelines for using the SORT RECORDS statement. See also Syntax for the SORT RECORDS statement.

Limits of the SORT RECORDS statement

The limits of the SORT RECORDS statement are shown in the following table.

SORT RECORDS statement limits
Limit User Language sort
Number of keys none
Size of key field (bytes) 764
Total record size (bytes) 4,294,963,104
Number of records 1,073,741,568
Number of CCATEMP pages 4,194,303

Copy produced

The records in the sorted copy contain only the fields required to complete the original request. Therefore only those fields are available if the sorted copy is referred to by a later request continuation (see Request continuation).

Prior to Model 204 V4R2.0

The SORT RECORDS statement produces a sorted copy of the original records in the CCATEMP file; file maintenance cannot be performed directly on the sorted copy. Attempts to add, change, or delete values results in errors.

If the total size of all records to be sorted exceeds 30K, only the record keys are sorted. The remaining data remains in place in CCATEMP, and the key has a pointer to the data, so the data appears to be sorted.

To update the original record, a FOR RECORD NUMBER statement could be used within a FOR EACH RECORD loop. Refer to Processing a single record and $CURREC for more information.

Beginning with Model 204 V4R2.0

User Language requests can update Table B records from within a FOR EACH RECORD loop that references a sorted record set. The SORT RECORDS statement produces a sorted copy of the original record in CCATEMP; file maintenance is not performed directly on the sorted copy. Any update statement found inside a FOR EACH RECORD loop for the sorted record set updates the original record in Table B of the file.

Within a FOR EACH RECORD loop for a sorted record set, data extraction takes place from the sorted copy while data update affects the original record in Table B. Copy records in the CCATEMP sorted set are never updated. In addition, once the update takes place against the Table B record, the sorted copy might no longer match the Table B record.

This discrepancy does not exist, however, when updating records through a sorted record set that was produced with the SORT RECORD KEYS statement. For the SORT RECORD KEYS statement, only the sort keys and the Table B record number are copied. Inside the FOR EACH RECORD loop, data extraction takes place from the original Table B record.

Using FOR EACH RECORD loops

You can refer to the SORT RECORDS statement only by record loop statements. The record loop statement used must not contain the IN ORDER option. Refer to FOR EACH RECORD statement and Processing records in order for more information on record loop statements and the IN ORDER option.

Example

In the following example, a selected group of records is sorted by record key. Each record in the original file is then updated inside the FOR EACH RECORD loop that refers to the sorted record set. The original file is updated through the use of the FOR EACH RECORD statement.

BEGIN FORDS: FIND ALL RECORDS FOR WHICH MAKE=FORD END FIND SRTCLR: SORT 30 RECORD KEYS IN FORDS BY COLOR AND MODEL FOR EACH RECORD IN SRTCLR PRINT COLOR AND YEAR AT COLUMN 10 - WITH OWNER AT COLUMN 30 CHANGE REPORT PRINTED TO 'YES' END FOR END

Use with groups

The SORT RECORDS statement can refer to the results of a group FIND or to records placed on a list as a result of a group FIND. The group SORT functions as a single file SORT, producing a sorted copy of the records to which it refers.

The records from all files of the group are sorted together by the sortkey value. The order of sorted group records bears no relationship to the order of the files from which the records came.

When the sortkey field is defined with different field types in different members of the group, the order of the sorted records is affected. The records with different sortkey field types are sorted in the following sequence:

  1. Nonnumeric strings, and also, numeric strings with more than 63 significant digits
  2. Numeric strings
  3. Float values

For example, -5 stored in a float field sorts after +5 stored in a string field.

Use with lists

Sorting a list creates a temporary set of records. To retrieve the records in sorted order, you must refer to the statement label of the SORT statement, not the list name. Refer to Performing loops on lists of records for more information on using lists.

Data types and sorting

BINARY fields and sort work records

BINARY values are not converted to string values during SORT but are left as BINARY in sort work records. Leaving fields as BINARY reduces the size of sort work records and also reduces CPU and CCATEMP I/O during SORT processing.

CODED fields and sort work records

CODED values are not converted to string values during SORT but are left as CODED in sort work records. Leaving fields as CODED reduces the size of sort work records and also reduces CPU and CCATEMP I/O during SORT processing.

Use with multiply occurring fields

Special forms of the SORT RECORDS statement exist for use with multiply occurring fields. See Subscripts and SORT RECORDS statement for details.

Field group SORT support

As of Model 204 version 7.5, you can use field groups in sorted sets. SORT statement support for field groups not only allows records with field groups in them to be sorted, it lets you reference field groups in the sorted sets. For example, you can issue a FOR FIELDGROUP statement or FEO FIELDGROUP statement against the sorted set.

Restrictions sorting an individual field in a field group

Individual fields within field groups can cause syntax errors. In the following example INCIDENT_TYPE is a field in a field group:

*** M204.0542: EDIT COMPLETE - GO *** 1 M204.2845: INVALID FIELDGROUP CONTEXT FOR INCIDENT_TYPE 2 SORT RECORDS IN 1 BY INCIDENT_TYPE (FILE = CCATEMP , PROCEDURE = 0, LINE = 3) *** M204.1042: COMPILATION ERRORS

The previous example is invalid because the field group context for the SORT statement is the record context. In the record context, you can reference fields inside a field group if, and only if, the field is both:

  • In an outer level field group (which is true in the previous example)
  • Defined as EXACTLY-ONE or AT-MOST-ONE

Without these restrictions it is ambiguous as to which occurrence a field group field reference is referring. In the previous case, INCIDENT_TYPE is in field group DRIVER, but is defined REPEATABLE. The REPEATABLE attribute makes INCIDENT_TYPE ineligible to be a sort key.

Usage notes for SORT processing and field groups

The EACH clause for fields inside field groups is not allowed. A compile error or run-time error, depending on whether a field name variable is used or not, will be returned.

An unsubscripted reference to a field in a field group in a SORT statement is allowed.

Sorting by record key

To reduce the overhead of sorting and referencing sorted records, the data sorted can be limited to the sort key and the record number. To perform this type of sort, use the SORT RECORD KEYS statement.

This type of sort is particularly useful in any of the following conditions:

  • If the quantity of data to be copied from each record to the temporary work file (CCATEMP) is large
  • If processing the sorted records involves significant terminal interaction
  • If the area used for the temporary work file is limited

Syntax

The following form of the SORT RECORDS statement sorts by key and record number:

SORT k RECORD KEYS {IN label | ON [LIST] listname} BY key [AND key] ...

The SORT RECORD KEYS statement is supported in remote file and scattered group contexts.

Do not use SORT RECORD KEYS with multiply occurring fields

All sort options and considerations are the same as those for the SORT RECORDS statement, except that SORT RECORD KEYS cannot be used for multiply occurring fields.

Special forms of the SORT RECORDS statement exist for use with multiply occurring fields. See Operations on multiply occurring fields for details.

Examples

Using the SORT RECORDS statement

The following example illustrates a SORT RECORDS statement.

BEGIN POL.HOLDERS: FIND ALL RECORDS FOR WHICH RECTYPE = POLICYHOLDER STATE = VIRGINIA END FIND CITY.SORT: SORT RECORDS IN POL.HOLDERS BY CITY FOR EACH RECORD IN CITY.SORT PRINT FULLNAME WITH CITY TO COLUMN 30 END FOR END

In this example, the SORT RECORDS statement creates a set of temporary records containing portions of the permanent records found by the POL.HOLDERS statement.

Printing the temporary records produces this output:

BISHOP, PEGGY E ALEXANDRIA DAVIS, SHARON M ALEXANDRIA EDWARDS, RITA U ALEXANDRIA . . . . . . BAKER, ELSIE Y ARLINGTON BRIGGS, CLIFTON G ARLINGTON . . . . . .

Using the SORT RECORD KEYS statement

In the following example, a selected group of records is sorted by record key. Each record in the original file is then updated by using the FOR RECORD NUMBER statement.

This example uses a %variable, the CHANGE statement, and the $CURREC function.

BEGIN FORDS: FIND ALL RECORDS FOR WHICH MAKE = FORD END FIND SRTCLR: SORT 30 RECORD KEYS IN FORDS BY COLOR AND MODEL FOR EACH RECORD IN SRTCLR PRINT COLOR AND YEAR AT COLUMN 10 - WITH OWNER AT COLUMN 30 %POINTER = $CURREC FOR RECORD NUMBER %POINTER CHANGE REPORT PRINTED TO 'YES' END FOR END FOR END