Record retrievals: Difference between revisions

From m204wiki
Jump to navigation Jump to search
Line 1,151: Line 1,151:
<p class="code">M204.2864: FULL SEARCH OF ORDERED INDEX PERFORMED
<p class="code">M204.2864: FULL SEARCH OF ORDERED INDEX PERFORMED
</p>
</p>
<p class="note"><b>Note:</b> Since message M204.2864 is issued at pattern evaluation time and will be issued each time the pattern is evaluated for each segment of the file, you may wish to remove the overhead of issuing the message entirely. Issue the following command to minimize the amount of CPU used:</p>
<blockquote class="note"><b>Note:</b> Since message M204.2864 is issued at pattern evaluation time and will be issued each time the pattern is evaluated for each segment of the file, you may want to remove the overhead of issuing the message entirely. Issue the following command to minimize the amount of CPU used:
<p class="code">MSGCTL M204.2864 NOACTION
<p class="code">MSGCTL M204.2864 NOACTION
</p>
</p>
</blockquote >
<b>Examples</b>
<b>Examples</b>
<p>
<p>
Line 1,175: Line 1,176:
<p>
<p>
Pattern retrieval performance also is affected by the complexity of the pattern specified. The more complex the pattern, the more processing is required.</p>
Pattern retrieval performance also is affected by the complexity of the pattern specified. The more complex the pattern, the more processing is required.</p>
 
===Use of full key for pattern matching===
===Use of full key for pattern matching===
<p>
<p>

Revision as of 18:19, 19 August 2013

Overview

Basic User Language Statements and Commands introduced basic FIND statements that could be used for record retrieval. This page discusses more advanced ways to locate records.

Reference context

The FIND statement is supported in remote file and scattered group contexts. See Files, Groups, and Reference Context for additional information on the reference context.

Retrieval conditions

In addition to retrieval conditions that perform searches based on a specified value, User Language supports more complex retrieval conditions that allow more than one value to be used as a criterion for selecting records.

This page discusses the following types of record retrievals:

Record retrieval type Locates records...
Range With field values within a specified numeric or character value range.
Pattern With field values matching a specified pattern (for example, a c* pattern searches for all patterns beginning with the letter c).
IS PRESENT Based on whether or not the specified field exists in a record.
FIND$ From a found or deleted set of records.
IN label From a found set of records.
SFL$ and SFGE$ In sorted files.
POINT$ Based on internal record number.
LIST$ From a list. Lists discusses lists and the LIST$ condition in detail.
IN MEMBER and FILE$ In file groups.
LOCATION$ Restricted to a retrieval in scattered group context to a specified node or nodes.

Handling a foundset

When a FIND statement refers back to a previous foundset and the foundset is empty, then the loop for the FIND statement is skipped. Performance for the FIND is enhanced because no retrieval is performed for the criteria specified in the FIND statement. Each FIND statement, however, for this particular case, requires one more word (four bytes) of QTBL space.

Numeric range retrievals

When a field contains mostly numerical values, the magnitude of the numbers can be tested by using this statement:

Syntax

FIND ALL RECORDS FOR WHICH fieldname IS [NUMERICALLY] [operator] value

Where

  • The NUMERICALLY keyword is optional and specifies that a numeric range retrieval should be performed. If this keyword is omitted, the type of retrieval performed is based upon the default type of the operator. For more information about the default type for each operator, refer to Interpretation of values in retrievals.
  • operator is one of the range retrieval operators in Range retrieval operators.

Range retrieval operators

You can use the range retrieval operators described in the following table for numeric retrievals.

IS operator Requires the field value to be...
EQ
=
Equal to the value specified.
NE
¬=
Unequal to the value specified.
LESS THAN
LT
<
Less than the value specified.
LE
<=
Less than or equal to the value specified.
GREATER THAN
GT
>
Greater than the value specified.
GE
>=
Greater than or equal to the value specified.
BETWEEN value1 AND value2

Between value1 and value2.

If the field value is equal to value1 or value2, the record is not retrieved.

If value1 is greater than value2, no records are retrieved.

Numeric retrieval examples

FIND ALL RECORDS FOR WHICH SETTLEMENT DATE IS 20030305 END FIND FIND ALL RECORDS FOR WHICH SETTLEMENT DATE IS NUMERICALLY < 20030305 END FIND FIND ALL RECORDS FOR WHICH SETTLEMENT DATE IS BETWEEN 20030300 AND 20030332 END FIND

How numeric retrievals are processed

When the relation IS is used, a character-by-character match is not performed. Leading zeros to the left of the decimal point and trailing zeros to the right are ignored. Thus, SETTLEMENT AMOUNT IS 50 retrieves records that were stored with a SETTLEMENT AMOUNT field value of 050, 50.0, +50, and so on, whereas the statement SETTLEMENT AMOUNT = 50 does not.

If the value of the named field in a given record has the stated relation to the number in the retrieval condition, then the record is retrieved. For example, AGE IS BETWEEN 21 AND 35 does retrieve a record with the pair AGE = 27.

When a numeric retrieval is performed on a multiply occurring field, all records with at least one occurrence of the field that meets the specified condition are retrieved. See Operations on Multiply Occurring Fields for more information about retrieval of multiply occurring fields.

NUMERIC RANGE and ORDERED NUMERIC attributes

In a manner similar to the effect of the IS relation (see above), field values for NUMERIC RANGE and ORDERED NUMERIC fields are indexed by their numeric equivalents; leading plus signs and leading and trailing zeros are ignored. The field values 1, 1.0, 01.0, +1, and +0001, for example, are considered identical when indexed numerically. When leading zeros are meaningful to your application, the field should be ORDERED CHARACTER. See also:

Records selected

If a numeric range retrieval is performed on a field that is defined with the NUMERIC RANGE or ORDERED NUMERIC attribute, the retrieval conditions find only those records where the numerical fields contain:

  • Optional leading plus sign (+) or minus sign (-).
  • Digits 0-9 and an optional decimal point.
  • No more than 10 digits on either side of the decimal point (20 digits maximum) for NUMERIC RANGE fields.
  • No more than 10 digits on either side of the decimal point for ORDERED NUMERIC fields; only the first 15 significant digits are used. (Exponent notation can be used for ORDERED NUMERIC fields; see the discussion in Exponent notation.)

Examples

Examples of valid NUMERIC RANGE or ORDERED NUMERIC fieldname = value pairs are provided here:

PRICE = 3317 SETTLEMENT AMOUNT = 050 VOLUME = 517.6473 WAVE LENGTH = +.0072 LOSS = 0 TEMP = -40

Restrictions

Note these restrictions on adding NUMERIC RANGE or ORDERED fields to a file:

  • If you try to add a NUMERIC RANGE field to a record in which that field name already appears, an error occurs. The field is not added and a message is displayed during execution of the request.
  • If you try to add an illegal value to a NUMERIC RANGE or ORDERED NUMERIC field, the value is nevertheless stored as entered. For example, if AGE is defined as a NUMERIC RANGE retrieval field type and AGE = TWO is added to the record, it is stored. In order to retrieve a field with an illegal value, the following methods must be used.

Locating and printing illegal values

Illegal values in a NUMERIC RANGE field can be located by using the following fieldname = value pair:

NUMERICAL FIELD++ = NON NUMERICAL

For example, a record that contains AGE = A3 or AGE = TWO can be found by the statement:

AGE++ = NON NUMERICAL

To print an illegal value in a NUMERIC RANGE field, the regular name of the field should be used. PRINT AGE outputs all the values of the AGE field in a record, including A3 and TWO. Illegal values in an ORDERED NUMERIC field can be found by locating a value alphabetically greater than nulls and less than zero.

Negated numeric range retrievals

Any of the numeric retrieval conditions can be negated by using the word NOT after the IS. For example:

AGE IS NOT 3 AGE IS NOT NUMERICALLY LT 21 AGE IS NOT NUMERICALLY > 65 AGE IS NOT BETWEEN -18 AND 10

In the last example above, the condition IS NOT BETWEEN retrieves any record that has AGE less than or equal to -18 and any record that has AGE greater than or equal to 10. Notice that in this example, if a record has AGE equal to -18 or 10, it is retrieved.

Negating a negated operator

Negating a negated operator results in a positive operator. For example:

FIND ALL RECORDS FOR WHICH AGE IS NOT ¬= 35 END FIND

results in the same found set as:

FIND ALL RECORDS FOR WHICH AGE IS 35 END FIND

How negated conditions are processed

The result of a numeric retrieval on a field defined by the file manager as NUMERIC RANGE differs from the result on a field defined as NON-RANGE or ORDERED when the condition is negated. Suppose this condition is specified:

AGE IS NOT LESS THAN 10

The set of records retrieved depends on whether AGE is defined with the NUMERIC RANGE, ORDERED, or NON-RANGE attribute, as follows:

If AGE is defined as... Then records are...
NUMERIC RANGE Retrieved by this statement are those with AGE fields whose values are numerical and greater than or equal to 10.
NON-RANGE or ORDERED Retrieved by this condition are those that do not contain AGE fields less than 10. These records include those containing nonnumerical AGE fields and those containing no AGE field.
NON-RANGE and INVISIBLE Not retrieved.

Exponent notation

A numerical value in a retrieval statement can be defined in exponent notation.

Syntax

Exponent notation has this format:

[+ | -] {whole-number | whole-number.fractional-number | fractional-number} E [+ | -]

Where

  • The value of the whole number and/or the fractional number should be a maximum of 15 significant decimal digits. If the number of significant digits exceeds 15, the remaining precision is lost.
  • The exponent expression must be between 75 and -74.

Embedded spaces are not allowed within the exponent string.

Example

These values are legal:

-1322.444E14 15E-47 +99233.0332E-66 222E+11

Character string retrievals

When a field contains string values, for example, FULLNAME or ADDRESS, the field can be tested to determine whether a record containing values within a particular range should be retrieved.

Syntax

In addition to the equality condition, NAME = SMITH that is described in FIND statement, the following statement is used for the retrieval of string values:

FIND ALL RECORDS FOR WHICH fieldname IS [ALPHABETICALLY] {BEFORE | AFTER | operator} value

Where

  • The ALPHABETICALLY keyword is optional and specifies that a character range retrieval should be performed. If this keyword is omitted, the type of retrieval performed is based upon the default type of the operator. For more information about the default type for each operator, refer to Interpretation of values in retrievals.
  • BEFORE is an operator that specifies records in which the field precedes the given value in EBCDIC collating sequence.
  • AFTER is an operator that specifies records in which the field follows the given value in EBCDIC collating sequence.
  • operator is one of the range retrieval operators listed earlier in this section in Range retrieval operators. The standard EBCDIC collating sequence is used when performing string comparisons.

Examples

FIND ALL RECORDS FOR WHICH FULLNAME IS BEFORE 'M' END FIND FIND ALL RECORDS FOR WHICH FULLNAME IS AFTER BAKER END FIND FIND ALL RECORDS FOR WHICH FULLNAME IS ALPHABETICALLY GE JONES END FIND

If the value of the named field in a given record has the stated relation to the string in the retrieval condition, then the record is retrieved. Thus, NAME IS AFTER 'M' retrieves all records containing names that begin with M through Z.

Multiply occurring fields

When a string retrieval is performed on a multiply occurring field, all records with at least one occurrence of the field meeting the specified condition are retrieved. See Operations on Multiply Occurring Fields for more information about retrieving multiply occurring fields.

Negated string retrieval conditions

Any of the string retrieval conditions can be negated by using the word NOT after the IS. For example:

FULLNAME IS NOT AFTER 'Q' FULLNAME IS NOT LT JOHNSTON

Combining retrieval conditions

Beginning and ending values in retrievals can be specified or retrievals can be combined in the following two ways:

  • By using the IN RANGE clause with the FIND statement
  • By using Boolean operators or parentheses

IN RANGE clause

The following form of the FIND statement can be used if a beginning and ending range of values is desired:

Syntax

FIND ALL RECORDS FOR WHICH fieldname IS [NOT] [NUMERICALLY | ALPHABETICALLY] IN RANGE [FROM | AFTER] value1 {TO | [AND] BEFORE} value2

Where

  • The NUMERICALLY or ALPHABETICALLY keyword specifies the type of retrieval to be performed. NUMERICALLY indicates a numeric range retrieval; ALPHABETICALLY indicates a character range retrieval. NUMERICALLY is the default.
  • The FROM or AFTER clause specify whether a field containing value1 should be included in the retrieval. FROM indicates that the retrieval includes value1. AFTER indicates that the retrieval begins at, but does not include, value1. FROM is the default if the FROM or AFTER keyword is not specified.
  • value1 specifies the beginning value.
  • The TO or BEFORE clause specifies whether a field containing value2 should be included in the retrieval. TO indicates that the retrieval includes value2. BEFORE indicates that the retrieval ends at, but does not include, value2.
  • value2 specifies the ending value.

Usage

  • If the IN RANGE clause is used for ORDERED fields, the retrieval is optimized because the search on the Ordered Index is restricted between two values. For more information see Summary of field attributes and retrieval optimization.
  • The IN RANGE clause is particularly useful when a beginning and ending range of values must be specified for a multiply occurring field. See Operations on Multiply Occurring Fields for more information about multiply occurring fields.

Boolean operators in retrieval statements

You can use the AND, OR, NOT, NOR, and parentheses in FIND statements to combine retrieval conditions.

Boolean operators and parentheses are described in more detail on the following pages.

Some examples are listed below:

  • fieldname IS BETWEEN value1 AND value2 OR IS BETWEEN value3 AND value4
  • fieldname IS value1 OR IS LESS THAN value2
  • fieldname IS GREATER THAN value1 OR IS NOT GREATER THAN value2 OR IS value3
  • fieldname = value1 OR IS LESS THAN value2
  • fieldname1 IS LESS THAN value2 OR fieldname2 IS BETWEEN value3 AND value4
  • fieldname1 IS LESS THAN value1 OR fieldname2 = value2
  • fieldname IS BEFORE value1 OR = value2
  • fieldname IS AFTER value1 AND BEFORE value2 OR IS AFTER value3
  • fieldname = value1 OR IS BEFORE value2 AND AFTER value3

Interpretation of Boolean operators in retrievals

When a retrieval involves several Boolean operations, Model 204 performs them in the following sequence:

Order of precedence Operator
First NOT
Second NOR
Third AND (explicitly stated)
Fourth OR
Fifth AND (implied by a new line)

You can use parentheses to change the order of precedence: conditions within parentheses take the highest precedence.

Example

Consider this example: search a CENSUS file to find Boston residents whose mother and father were both American-born. The FIND statement to do this can be written:

FIND.RECS: FIND ALL RECORDS FOR WHICH CITY EQ BOSTON (MOTHERS BIRTHPLACE = U.S. FATHERS BIRTHPLACE = U.S.) END FIND

Line continuation and implied ANDs

Do not use a hyphen for line continuation when you are using the implied AND.

Only use a hyphen to indicate line continuation when you have coded an explicit AND or OR at the end of a line.

No continuation hyphen is required in the previous example. Since there is no Boolean operator at the end of the second line, Model 204 processes the statement with the implied AND.

Implied ANDs are at the end of lines 2 and 3.

Note: Technical Support strongly recommends that you do not use parentheses as a line continuation method in a FIND statement; use parentheses only where you need to change the normal order of precedence of Boolean operators.

Order in which expressions are interpreted

Interpretation of a Boolean expression proceeds from left to right, except where the order of precedence (shown in Interpretation of Boolean operators in retrievals) dictates otherwise. An operation of higher precedence following an operation of lower precedence is performed first.

Thus, from the example in the previous page, the following code shows:

(MOTHERS BIRTHPLACE = U.S. FATHERS BIRTHPLACE = U.S.)

(describing people who have American-born mothers and fathers) is performed first and is then combined with the set of records describing Boston residents. A record that meets both criteria is included in the found set.

The following example indicates how a somewhat more complex statement would be processed:

BEGIN FIND.RECS: FIND AND PRINT COUNT DECILE IS ALPHA IN RANGE FROM 3320 TO 3390 STATE EQ MA OR - STATE EQ CT END FIND END

The first action Model 204 takes is to perform the OR (STATE EQ MA OR STATE EQ CT), since the OR has higher precedence than the implied AND generated by the first selection criterion. Then the result of that OR is combined with the first selection criterion; again only records that meet both criteria are included in the found set. In other words, this example selects only those residents of Massachusetts or Connecticut whose decile value is in the specified range.

Interpretation of values in retrievals

The manner in which values (numbers, strings, and exponent notation) are interpreted depends on whether the retrieval is an equality retrieval or a range retrieval.

Equality retrievals

In an equality retrieval condition (fieldname = value), the interpretation of the value is based on whether the field has been defined with the KEY and FLOAT field attributes or the ORDERED NUMERIC attribute, as follows:

  • If the field has been defined with the KEY and FLOAT field attributes or the ORDERED NUMERIC attribute, the value is examined to determine whether it is a number, exponent notation, or a character string. Comparisons are then performed as follows:
  • If the value is a number or in exponent notation, a numerical comparison is performed.

    If the value is a character string, a character string comparison is performed.

  • If the field has not been defined with the KEY and FLOAT field attributes or the ORDERED NUMERIC attribute, the value is treated as a string and a character string comparison is performed. Exponent notation is not converted to a numerical form. For example, if a field contains a value of .1234E-3, the comparison .1234E-3 = .0001234 is not true if the field has not been defined with the KEY and FLOAT field attributes or the ORDERED NUMERIC attribute.

Range retrieval

In a range retrieval condition (in the form fieldname IS [operator] value) or a condition that uses the IN RANGE clause, the type of comparison performed is based on whether the NUMERICALLY or ALPHABETICALLY keyword is specified, as follows:

If the keyword is... The value is interpreted as...
NUMERICALLY Number and a numerical comparison is performed.
ALPHABETICALLY String and a character string comparison is performed.
Neither NUMERICALLY nor ALPHABETICALLY Default comparison type of the operator and the corresponding type of comparison is performed. See the Range retrieval comparisons table.

The default comparison type for each operator is provided in the following table:

Range retrieval comparisons
IS operator Default type of comparison performed
IN RANGE Numeric
EQ, =, or none Numeric
NE or ¬= Numeric
LESS THAN, LT, or < Numeric
LE or <= Numeric
GREATER THAN, GT, or > Numeric
GE or >= Numeric
BETWEEN Numeric
AFTER Character string
BEFORE Character string

Results when operator and value type are not matched

The following results occur when an operator and value type are not matched:

  • If a numeric operator is used with a nonnumeric value, as in:
  • FIND ALL RECORDS FOR WHICH LAST NAME IS GREATER THAN ANDREWS

    no records are retrieved. If this operator is negated, as follows:

    FIND ALL RECORDS FOR WHICH LAST NAME IS NOT GREATER THAN ANDREWS

    every record in the file or group is retrieved.

  • If a character string operator is used with a numeric value, as in:
  • FIND ALL RECORDS FOR WHICH YEAR IS BEFORE 1986

    all numeric values in the field are converted to character strings and compared character-by-character to the retrieval criteria. Therefore, numbers such as 942, 700, or 2 would not be retrieved.

Range retrieval optimization

The attributes defined for a field can impact how quickly a retrieval is performed.

Indexed searches

Retrieval based on KEY, NUMERIC RANGE, or ORDERED fields is particularly fast and efficient because Model 204 goes directly to the appropriate index entry to find the records that satisfy the selection criteria. The effect of these attributes on retrieval is discussed in detail in the following sections.

Indexed searches for equality retrievals

When an equality retrieval (fieldname = value) is performed on a field that has the KEY attribute, or when a numeric retrieval (fieldname IS [operator] value) is performed on a field that has the NUMERIC RANGE attribute, Model 204 locates the records that meet the conditions by doing a search of the Table C index.

When an equality retrieval (fieldname = value) is performed on a field that has the ORDERED attribute but does not have the KEY attribute, the Ordered Index is searched.

Indexed searches for range retrievals

Range retrievals for a field defined with the ORDERED attribute are dependent on the type of retrieval to be performed and whether the NUMERIC or CHARACTER option has been selected for the ORDERED attribute. The Ordered Index is searched if one of the following conditions exists:

  • If a numeric range retrieval is performed and the field has the ORDERED NUMERIC attribute.
  • If a character range retrieval is performed and the field has the ORDERED CHARACTER attribute.

Direct (Table B) searches

In all other cases, Model 204 performs a direct search of the data (Table B) to locate the records. For example, if the field has the ORDERED NUMERIC attribute but the type of retrieval is nonnumeric, a direct search of the data is performed.

Note: The user-resettable parameter, MBSCAN, can be used to warn the user when a direct search of more than a specified number of records is required. The user then has the option of proceeding with the search or cancelling the request.

Summary of field attributes and retrieval optimization

The following table summarizes the field attribute and retrieval combinations for retrieval optimization.

Evaluation chart for the FIND statement
FIND condition
 
Field attribute
 
KEY
 
NUMERIC RANGE
 
ORDERED NUMERIC
 
ORDERED CHARACTER
KEY and  
Other
ORDERED NUMERIC ORDERED CHARACTER
Equality Table C Index Table C Index Ordered Index Ordered Index Table C Index Table C Index Data
Character range Data Data Data Ordered Index Data Ordered Index Data
Numeric range Data Table C Index Ordered Index Data Ordered Index Data Data
Character and
numeric range
Data Data Data Ordered Index Data Ordered Index Data

Pattern matching

Retrieval can be performed using a field value that is specified in the form of a pattern.

Pattern retrieval (pattern matching) can be performed on fields of any type. However, pattern retrieval does not process records for which the specified retrieval field is not present.

The pattern is evaluated for each value in the specified field to determine the selection of records. Those values that match the criteria in the pattern are selected. Retrieved values must match patterns character by character, including blanks, except when special characters are used.

Patterns also can be specified in the FOR EACH RECORD, FOR EACH VALUE, and IF statements, which are described later in this manual.

Syntax

The format of the FIND statement used to perform pattern matching is:

FIND [ALL] RECORDS {FOR WHICH | WITH} fieldname IS [NOT] LIKE 'pattern'

where the keyword LIKE indicates that whatever follows is a pattern. The pattern must be enclosed in quotation marks.

Note: The syntax "fieldname IS NOT LIKE pattern" operates only on records for which fieldname actually exists in the record. For example, a file contains 99983 records. If fieldname ORDCHAR has the value 'A' in 9 records and the value 'B' in 7 different records and does not exist in any other records, the following requests will provide the results indicated:

  • Input 1:

    BEGIN ALL: FPC A: FPC ORDCHAR IS NOT LIKE 'B' END

    Output 1:

    99983 9

  • Input 2:

    BEGIN ALL: FPC B: FPC ORDCHAR IS NOT LIKE 'A' END

    Output 2:

    99983 7

  • Input 3:

    BEGIN ALL: FPC AB: FPC ORDCHAR IS NOT LIKE '*' END

    Output 3:

    99983 0

Moving the NOT operator before the field name provides the complement or NOT of the found set and operates on all records, even those not containing the field:

  • Input 4:

    BEGIN ALL: FPC NOTA: FPC NOT ORDCHAR IS LIKE 'A' END

    Output 4:

    99983 99974

  • Input 5:

    BEGIN ALL: FPC NOTB: FPC NOT ORDCHAR IS LIKE 'B' END

    Output 5:

    99983 99976

  • Input 6:

    BEGIN ALL: FPC NOTA: FPC NOT ORDCHAR IS LIKE '*' END

    Output 6:

    99983 99967

Providing sufficient stack space for complex patterns

If a pattern retrieval produces one of the following error messages, you might need to increase the length of the Model 204 pushdown list area, a stack area which contains dynamic storage used by both the pattern matcher and the Model 204 assembler:

M204.2104 THE STACK IS TOO SMALL TO PROCESS YOUR REQUEST M204.2106 STACK OVERFLOW. RECURSION TOO DEEP. SIMPLIFY PATTERN

This stack area is controlled by the LPDLST parameter, a SYSTEM parameter that can be set (or reset) on the User 0 line or with the UTABLE command. For extremely complex patterns (for example, patterns involving nested repeated patterns or nested wildcards) you might need to reset LPDLST to 32K.

For sites using MP/204 (the multiprocessor feature), LPDLST is set once, but is allocated for each processor of the CPU.

Note: The LCPDLST parameter, which controlled the stack area used by the pattern matcher in earlier releases, should be kept at its default value.

Pattern characters in the Is Like clause

The characters used for pattern retrieval are summarized in the following table, and discussed in detail in subsequent sections. Because a pattern is treated like a character value, it needs beginning and ending quotes to conform to the format requirements of User Language. See Quotation marks for a detailed discussion about the use of quotes in User Language.

Pattern matching characters
Character Description
* Wildcard character
+ Placeholder character
, Or character
( ) Set begin and end characters
- Range character
/ Repeat character
! Escape character
= Hexadecimal character
# Numeric digit character
@ Alphabetic character

Beginning in Model 204 Version 5.1, all characters X'00' through X'FF' are valid in a pattern presented to the User Language pattern matcher and the $CHKPAT function.

As a result all characters are treated as valid, literal characters. The following error messages are no longer invoked for these characters.

M204.1688: errortype IN PATTERN 'pattern' AT CHARACTER char M204.1689: errortype IN PATTERN 'pattern' AT CHARACTER char

Note: Patterns containing embedded escape characters preceding special characters function as they did in earlier releases of Model 204.

Wildcard character

The wildcard character serves as a placeholder for 0 to 255 characters. You can use wildcard characters as follows:

  • Embed wildcard characters within a repeat pattern.
  • Use a repeat character (/) immediately after a wildcard character.
  • Include wildcard characters, either alone or in combination with other characters, in a set.

For example:

If pattern is... Sample matching values might be...
A*SON ANDERSON ABLESON ASON

Placeholder character

The placeholder character serves as a placeholder for one character only. For example:

If pattern is... Sample matching values might be...
C+RY CARY CORY
J+++SON JOHNSON JACKSON J%&$SON.

Or character

The Or character (a comma) allows the choice of more than one matching value. The Or character also can be used to define a null (,,) character. The values separated by the Or character are called members of the pattern. For example:

If pattern is... Sample matching values might be...
JONES,J+++SON JONES JOHNSON JACKSON

Note: The following type of pattern is illegal:

/A,B(XYZ)

Set begin and set end characters

The set begin and end characters allow for a segregation of choices. Set characters are required when a repeat character or range characters are used (see below). A comma (the Or character) is used to separate the choices. For example:

If pattern is... Sample matching values might be...
(JACK,JOHN)SON JACKSON JOHNSON

Note: You can include additional set begin and set end characters in a repeat pattern. The following type of pattern is illegal:

/A,B(XYZ)

Range character

The range character allows a single EBCDIC character range (including unprintable characters) to be specified. The range must either be enclosed with set begin and end characters or be a set member with a comma preceding and/or following it. In addition, the ending range character must be greater than the beginning range character. For example:

If pattern is... Sample matching values might be...
(0-9) 0 3 5 8 9
(A,N-T,X)*SON OLSON TILSON XSON

Repeat character

The repeat character lets you specify one or more repetitions of a set. The repeat character must be followed by a single number or a range of two numbers. The number must be between 0 and 255. The second number, if specified, must be greater than the first. In addition, a set must begin immediately following the repeat number(s). Supported uses for repeat characters include the following:

  • Including more than one repeat character within a pattern.
  • Including wildcard characters within a repeat pattern.
  • Using a repeat character (/) immediately after a wildcard character.
  • Including additional set begin and set end characters in a repeat pattern.

For example:

If pattern is... Sample matching values might be...
/3(COPY) COPYCOPYCOPY
/1-2(COPY) COPY COPYCOPY
/2(0-9) 01 98 23
/1-4(+) ONE 7X FOUR A 42

Note: The following type of pattern is illegal:

/A,B(XYZ)

Escape character

The escape character (!) allows a character to be interpreted as a literal character rather than as a pattern matching character. The escape character affects only the next character and is needed when you want to treat as a valid literal character one of the eleven pattern matching characters: * + ' ( ) - / ! = # @.

The escape character works as shown in the following example.

If pattern is... Sample matching values

!(800!)-*

(800)-244-3344

J+++.

JUMP. JILT. J%&$.

The eleven special pattern characters, listed in Pattern characters, invoke special pattern operations, unless preceded by the escape character (!).

Note: Patterns containing embedded escape characters preceding special characters function as they did in earlier releases of Model 204.

Hexadecimal character

The hexadecimal character allows for the hexadecimal representation of one or a range of unprintable EBCDIC characters. The character must be followed by two digits (0-9,A-F) that represent the hexadecimal value of the character to be retrieved. For example:

If pattern is... Sample matching values might be...
=6B,=4D , (
(=7C-=7F) @ ' = "

Numeric digit character

The numeric digit character allows a numeric range (0-9) to be specified. For example:

If pattern is... Sample matching values might be...
### 099 378 1111
/1-4(#) 2 9834 23

Note: If the value of the FLUSH parameter is #, FLUSH must be reset before the numeric digit pattern character can be used during line editing.

Alphabetic character

The alphabetic character allows an alphabetic range (a-z, A-Z) to be specified. Unprintable characters are not included. For example:

If pattern is... Sample matching values might be...
@@@@@ JONES SARAH Frand
/2-5(@) TO Harry JAN

Note: If the value of the ERASE parameter is @, ERASE must be reset before the alphabetic pattern character can be used during line editing.

Pattern retrieval optimization

For faster retrievals you can take advantage of the ORDERED CHARACTER attribute on fields:

  • Without the ORDERED CHARACTER attribute the data (Table B) is searched directly.
  • With the ORDERED CHARACTER attribute the Ordered Index is searched first.

Pattern retrieval optimization cannot be performed if the pattern and any of its members begin with:

  • set characters (parentheses)
  • range characters (hyphens)
  • hexadecimal value 'FF'
  • a wildcard character (*)
  • a placeholder character (+)
  • a numeric digit character (#)
  • an alphabetic pattern character (@)

When pattern retrieval optimization cannot be performed, an informational message is issued to the audit trail to notify you that the entire Ordered Index will be searched:

M204.2864: FULL SEARCH OF ORDERED INDEX PERFORMED

Note: Since message M204.2864 is issued at pattern evaluation time and will be issued each time the pattern is evaluated for each segment of the file, you may want to remove the overhead of issuing the message entirely. Issue the following command to minimize the amount of CPU used:

MSGCTL M204.2864 NOACTION

Examples

These patterns would result in an optimized retrieval:

SMITH* ABBY,DAVID,DANNY A*,ERNIE,BERT,OS* 213,433,99 SP/2(E)D,VAC/2(U)M ALPHAB(ET,ITS) BOS(TON,CO),NEW( YORK,ARK)

while these patterns would search the entire Ordered Index:

/1-5(A) +++SON @@@@NATHA (A-N)+++ *SMITH

Pattern retrieval performance also is affected by the complexity of the pattern specified. The more complex the pattern, the more processing is required.

Use of full key for pattern matching

The full key is used when you acquire an Ordered Index found set by pattern matching. For example, in a FIND statement, your selection criteria could use the pattern matcher in this way:

FIND ALL RECORDS FOR WHICH fieldname IS LIKE 'COMPANY01DEPT05ACCT02*' END FIND

The pattern matcher now uses the full key COMPANY01DEPT05ACCT02 to search for the initial found set, then sees what fields in the found set match the pattern selection criteria. Performance is improved by restricting how much of the Ordered Index is searched to acquire the initial found set.

Prior to Version 3.2, the pattern matcher would truncate COMPANY01DEPT05ACCT02 to COMPANY01* for searching the Ordered Index for the initial found set. If COMPANY01 is the entire corporation, but only accounts (02) from one department (05) are wanted, the found set would have included the entire corporation. The pattern matcher would then look at the found set and return only those records whose fields actually matched the pattern selection criteria.

Using expressions in FIND statements

EQ VALUE retrieval condition

Purpose

The EQ VALUE clause provides support for the general use of expressions in FIND statements.

Syntax

The expression is enclosed in parentheses after the EQ VALUE clause.

FIELDNAME EQ VALUE (expression)

Where

Any expression can be used with the EQ VALUE clause.

Example

FD: IN FILE PEOPLE FD LAST EQ VALUE($READ('LAST NAME?')) END FIND

EQ WITH retrieval condition for concatenated fields

The EQ WITH clause is available as of Model 204 version 7.5. To use the EQ WITH clause, FILEORG=x'100' must be set.

Purpose

The EQ WITH clause retrieves CONCATENATION-OF fields. Model 204 automatically builds the concatenated value.

Syntax

The FIND statement for retrieving data based on concatenated fields is:

label: FIND ALL RECORDS CONCAT_FIELD EQ WITH ('field1 value',...,' - field8 value')

Usage

Although you must specify a minimum of two fields in the WITH clause, you can concatenate as many as eight. The number specified must match the concatenated field contents. For example, if the field is composed of four fields, the retrieval condition must include four fields. Otherwise, the following error message is produced:

M204.2877: WITH CLAUSE INVALID: CONCATENATION-OF FIELD HAS WRONG NUMBER OF COMPONENT FIELDS

Consider an example of a concatenated field FULL_NAME:

DEFINE FULL_NAME WITH CONCATENATION-OF FIRST WITH - MIDDLE WITH LAST AND ORD SEP C'$'

and a sample PAI of a record:

FIRST = SAM MIDDLE = Q LAST = SMITH FULL_NAME = SAM$Q$SMITH

The record could be retrieved with:

FD: IN PEOPLE FD FULL_NAME = 'SAM$Q$SMITH' END FIND

The separator, a dollar sign ($) in this case, must be included in the string.

Alternatively, each part of the concatenated field could be used with the new EQ WITH syntax that does not require knowledge of the separator character:

FD: IN PEOPLE FD FULL_NAME EQ WITH ('SAM','Q','SMITH') END FIND

The values specified in the EQ WITH clause can be the results of expressions:

%PARENTS_NAME = '...SMITH' FD: IN PEOPLE FD FULL_NAME EQ WITH ('SAM','Q',$SUBSTR(%PARENTS_NAME,4)) END FIND

Limitations of CONCATENATION-OF fields

Limitations when using SEPARATOR NONE

If you use SEPARATOR NONE in the CONCATENATION-OF field definition, your find set results might not be what you wanted.

For example:

FD: IN FILE PITCHERS FD FOOBAR EQ WITH(’ABC’, ’XYZ’) END FIND

Would match a record with:

FOO = AB BAR = CXYZ

as well as records with

FOO = ABC BAR = XYZ

Limitations when using ESCAPE CANCEL

If you define ESCAPE CANCEL on the CONCATENATION-OF field declaration, an attempt to store a field value with the separator character in a component field for a CONCATENATION-OF field results in request cancellation.

For example, if FULL_NAME is defined with

(ORD CAT FIRST WITH MIDDLE WITH LAST SEP C'$' ESC CANCEL)

Then the following request will cancel:

BEGIN IN PEOPLE STORE RECORD FIRST='SAM' MIDDLE='Q$' LAST='SMITH' END STORE END *** 1 CANCELLING REQUEST: M204.2873: CONCATENATION FIELD FULL_NAME COMPONENT FIELD MIDDLE CONTAINS SEPARATOR CHARACTER

Limitations when using EQ WITH and file groups

An explicit CONCATENATION-OF field must be a CONCATENATION-OF field in any file where it is defined, although it does not have to be defined in every file in the group, just as with any other field reference in a group find. The number of component fields must also be the same in every file, however, the separator and escape characters can be different as can the actual component field names. For example, if one:

Defined FULL_NAME in file PEOPLE with component fields FIRST, MIDDLE, and LAST and $ for the separator character:

IN PEOPLE DEFINE FULL_NAME WITH CONCATENATION-OF - FIRST WITH MIDDLE WITH LAST AND ORD SEP C'$' ESC CANCEL

Defined FULL_NAME in file PEOPLE2 with component fields FIRST_NAME, MIDDLE_INITIAL, and LAST_NAME and the asciicircum, (^) for the separator character:

IN PEOPLE2 DEFINE FULL_NAME WITH CONCATENATION-OF - FIRST_NAME WITH MIDDLE_INITIAL WITH LAST_NAME AND - ORD SEP C'^' ESC CANCEL

Without the EQ WITH retrieval condition, in order to find SAM Q SMITH in both files, the FIND would require two separate strings and knowledge of the separation character:

FD: IN PEOPLE, PEOPLE2 FD FULL_NAME = 'SAM$Q$SMITH' OR 'SAM^Q^SMITH' END FIND

Using the EQ WITH allows the much simpler syntax of:

FD: IN PEOPLE, PEOPLE2 FD FULL_NAME EQ WITH ('SAM','Q','SMITH') END FIND

Miscellaneous limitations when using EQ WITH

If an EQ WITH clause produces a value longer than 255 bytes, it truncates the value.

The EQ WITH clause usually issues any errors at compile time. For example, if you issue the following in file context:

FD: IN PEOPLE FD FULL_NAME EQ WITH ('SAM','Q') END FIND

And FULL_NAME is a concatenation of three fields, a compile time error is issued.

In contrast, if you use a field name variable in your query:

%FIELD = 'FULL_NAME' FD: IN PEOPLE FD  %%FIELD EQ WITH ('SAM','SMITH') END FIND

This type of error would only be detected at evaluation time and the request would be canceled.

Transporting Large Object data using the Universal Buffer

The Universal Buffer is used to transport Large Object data to and from a client. The following is a sample request that moves Large Object data from a client to the Model 204 server.

  • Transporting data from a client to the Model 204 server:

    BEGIN ... FD: retrieval-criteria ... /* a statement populates BUFFER with data */ %POSITION=23 %LENGTH=8000 FOR 1 RECORD IN FD ADD NOVEL=BUFFER,%POSITION,%LENGTH END FOR END

  • Transporting Large Object data from the server to a client application:

    BEGIN FD: IN FILE xx FD retrieval criteria END FIND FR FD BUFFER,position,length=NOVEL,position,length END FOR END /* A statement to send the BUFFER to a client*/

Note: Lob-name NOVEL in the first of these examples has a position and length parameter, which you must supply. You can extract a section of the Large Object field by supplying a position and a length within the data. If you want all the data, position is set to 1 and length is the actual size of the Large Object data obtained using $LOBLEN.

IS PRESENT condition

The phrase, IS PRESENT, can be used as a condition of a FIND statement to specify whether or not a field exists in a record. IS PRESENT retrieves only records that contain at least one occurrence of the specified field, regardless of its value.

Syntax

The format of the condition is:

fieldname IS [NOT] PRESENT

Performance considerations

A retrieval using the IS PRESENT condition is performed by directly searching the data records (Table B). To increase the efficiency of the search, the IS PRESENT condition should be preceded by a condition that uses a field with the KEY attribute.

For example, the following statement locates all records that contain one or more occurrences of the INCIDENT field. Because the RECTYPE field has the KEY attribute, the Table B search initiated by the IS PRESENT condition is performed only on the records retrieved by RECTYPE = DRIVER.

IS NOT PRESENT condition

The condition IS NOT PRESENT retrieves only records in which the field is missing. For example, this statement locates all records that do not contain an INCIDENT field:

FIND.RECS: FIND ALL RECORDS FOR WHICH RECTYPE = DRIVER INCIDENT IS NOT PRESENT END FIND

FIND$ condition

The FIND$ condition performs a FIND on a set of records that has been retrieved by a previous FIND statement.

Syntax

The format of FIND$ is:

FIND$ label

When to use FIND$

The FIND$ condition is typically used when you want to AND, NOT, or OR a condition with an already found set. FIND$ also can be used when some of the records, fields, or values in the original found set have been deleted and you wish to refer to the set containing the deletions.

For example:

BEGIN GROUP1: FIND ALL RECORDS FOR WHICH DATE OF BIRTH IS BETWEEN 19200101 AND - 19650101 STATE = CALIFORNIA END FIND GROUP1.CT: COUNT RECORDS IN GROUP1 GROUP1A: FIND ALL RECORDS FOR WHICH SEX = M AND FIND$ GROUP1 END FIND FOR EACH RECORD IN GROUP1A PRINT FULLNAME SKIP 1 LINE END FOR GROUP2A: FIND ALL RECORDS FOR WHICH SEX = M AND NOT FIND$ GROUP1 END FIND GROUP2A.CT: COUNT RECORDS IN GROUP2A PRINT COUNT IN GROUP1.CT PRINT COUNT IN GROUP2A.CT END

The GROUP1A statements retrieves records for SEX = M, DATE OF BIRTH between 200101 and 650101, and STATE = CALIFORNIA. The GROUP2A statements retrieves records for which SEX = MALE and either STATE not equal CALIFORNIA or DATE OF BIRTH not between 200101 and 650101.

SFL$ and SFGE$ conditions

The SFL$ condition

For sorted files, the condition:

SFL$ value

retrieves all records for which the sort field is less than the stated value.

The SFGE$ condition

For sorted files, the condition:

SFGE$ value

retrieves all records for which the sort field is greater than or equal to the stated value.

Example using SFL$ and SFGE$

In a sorted file in which the sort field is FULLNAME, this statement:

FIND.RECS: FIND ALL RECORDS FOR WHICH CITY = SAN DIEGO SFGE$ R AND SFL$ RZZZ END FIND

retrieves all records of people who live in San Diego and whose full name starts with R.

POINT$ condition

When a record is stored it is assigned an internal record number. In a FIND statement, the POINT$ condition retrieves all records that have internal record numbers greater than or equal to the stated value. If the given value is not a number, no records are retrieved.

When to use POINT$

The POINT$ condition should be used when ranges of record numbers must be retrieved. If individual record numbers need to be retrieved, the FOR RECORD NUMBER statement should be used. See Record Loops for a detailed discussion of the FOR RECORD NUMBER statement.

Syntax

The format of POINT$ is:

POINT$ n

Example

This statement:

FIND.RECS: FIND ALL RECORDS FOR WHICH POINT$ 1500 AND NOT POINT$ 2500 END FIND

retrieves all records that have internal record numbers between 1500 and 2499, inclusive.

FILE$ condition

The FILE$ condition can be used as a condition in a FIND statement to restrict the files from which records are selected.

The restriction can apply to a group FIND or to a FIND that refers to the record set of an earlier group FIND. FILE$ can also be used to limit the reference context of a FIND statement when using Parallel Query Option/204. See Files, Groups, and Reference Context.

Syntax

The format of FILE$ is:

FILE$ {filename | =}

Where

  • filename is one of the following:
    • A literal group member name, either without quotation marks or enclosed in single quotes
    • A literal remote file specification or file synonym

    filename cannot be represented by:

    • A %variable
    • A remote file specification enclosed entirely in quotes (for example, 'CLIENTS AT BOSTON')
  • The equal sign (=) restricts the search to the node on which the request is running (the local node). The equal sign cannot be used in a FIND statement that uses a complex FILE$ clause (for example, FILE$ CLAIMS89 OR FILE$ CLAIMS90).

The FILE$ condition is valid only in group context, and is supported in scattered group context.

Example

If a group consisting of the CLIENTS and CLAIMS90 files is open, the following statement retrieves records from CLIENTS but not CLAIMS90:

FIND.RECS: FIND ALL RECORDS FOR WHICH FILE$ CLIENTS POLICY NO = 100340 END FIND

LOCATION$ condition

The LOCATION$ condition is valid only when you are using Parallel Query Option/204. It is used to restrict the results of a group context FIND statement to a particular node or nodes.

Syntax

The format of LOCATION$ is:

LOCATION$ {location | =}

Where

  • location is the node name used in file synonyms and remote file specifications. location can be coded as a literal location name, or accepted as input in response to a dummy string prompt. location cannot be represented by a %variable.
  • The equal sign (=) indicates the node on which the request is running (the local node).

The LOCATION$ condition is supported in scattered group context, and is valid only in group context.

If an optional member of a scattered group is unavailable, no records are found for that file and processing continues.

Examples

Suppose you have opened a scattered group consisting of the files CLIENTS AT BOSTON, CLIENTS AT WASHINGTON, and CLAIMS90 AT BOSTON. The following FIND statement retrieves records from CLIENTS AT BOSTON but not CLIENTS AT WASHINGTON:

FIND.RECS: FIND ALL RECORDS FOR WHICH FILE$ CLIENTS LOCATION$ BOSTON POLICY NO = 100340 END FIND

You can use the LOCATION$ condition to specify, in a single FIND statement, multiple files which can be located at different nodes. The following FIND statement retrieves records from locations NORTHEAST and SOUTHWEST:

FIND.RECS: FIND ALL RECORDS FOR WHICH FILE$ VEHICLES LOCATION$ NORTHEAST OR LOCATION$ SOUTHWEST MAKE = 'FORD' END FIND

LIST$ condition

You can refer to the collection of records on a list by entering the LIST$ condition as a condition in a FIND statement. See Lists for a detailed discussion of lists and the LIST$ condition.

IN label retrieval

The IN label clause with the FIND statement can be used to refer to a set of records that has been retrieved by a previous FIND statement.

Syntax

The format for the IN label clause used with a FIND statement is:

label.b: FIND ALL RECORDS IN label.a FOR WHICH retrieval conditions

where label.a is the label of a preceding FIND statement.

Example

BEGIN N.Y.DRIVERS: FIND ALL RECORDS FOR WHICH RECTYPE = DRIVER STATE = NEW YORK END FIND CT.1: COUNT RECORDS IN N.Y.DRIVERS PRINT COUNT IN CT.1 - WITH ' NEW YORK DRIVERS' DATE: FIND ALL RECORDS IN N.Y.DRIVERS - FOR WHICH INCIDENT DATE IS > 19901231 END FIND CT.2: COUNT RECORDS IN DATE PRINT COUNT IN CT.2 - WITH ' WITH INCIDENTS AFTER 12/31/1990' END

In this request, the FIND statement labeled DATE finds only the records in the previous found set that also contain an incident date later than December 31, 1990.