Home > Expression Operators > Relational Operators > Pattern

Pattern

A pattern is a string of text surrounded by single quotes (called a character string), e.g., 'JONES', and is used with operators #IN and #NOTIN.

A pattern can be a string of specific characters or can include one or more asterisks (*) to represent any number (including zero) of arbitrary characters. The * can occur at the beginning of, at the end of, or within the pattern. For example, the following are all valid patterns in ProVal:

‘SMI*’ ‘*ERS’ ‘*TH*’ ‘SMI*ERS’ ‘SM*TH*RS’

‘SMI*’ can represent ‘SMI’, ‘SMIT’, ‘SMITH’, ‘SMITE’, or ‘SMITHERS’, but not ‘SMOTE’ or ‘SMOTHERS’. ‘*TH*’, like ‘SMI*ERS’ and ‘SM*TH*RS’, can represent ‘SMITHERS’ or ‘SMOTHERS’, but can also represent ‘SMITH’, ‘SMITHTOWN’, and ‘THEODORE’, which ‘SMI*ERS’ and ‘SM*TH*RS’ cannot.

In addition to the *, a question mark (?) can be used to represent exactly one arbitrary character. For example, ‘HM??’ matches ‘HM12’ but not ‘HM123’ or ‘HM1’; ‘?BC‘ matches ‘ABC’ and ‘BBC’, but not ‘XYZBC’. The possible values of ‘A?? ‘ are all three-character values that start with A.

A pattern can include both * and ?. A pattern need not, however, include either * or ?, as in the pattern ‘JONES’.

When searching for character values, the items in the right argument to #IN and #NOTIN are patterns that ProVal will compare to the value of the database field in the left argument. For example, when ProVal processes the expression “NAME #IN ‘SMI*’ “, the pattern ‘SMI*‘ represents any value of NAME that starts with SMI (such as the names Smith, Smiley, and Smithers, but not Smuckers). In this particular example, the pattern contained one asterisk placed at the end of the character string, but the asterisk could have been placed elsewhere or the pattern could have contained more (or no) asterisks. Here are some examples of possible values for patterns with other placements of the asterisk or a different number of asterisks:

*ers Values ending in “ers”, including ers
*ch* Values containing “ch”, including ch
Jo*s Values beginning with “Jo” and ending with “s”
J*son* Values beginning with “J” and containing “son”
Jason One value, “Jason”

Spaces in a pattern are treated the same way as non-blank characters—for the pattern to match a field value, that value must contain spaces as specified (number and placement) in the pattern. For example, a database field value containing the character string “A <space> B” matches the pattern ‘*A B*‘; a value containing “A <space> <space> B” or “AB” or “AB <space>” does not match (unless, of course, the value also contains “A <space> B” among the characters represented by an asterisk).

If a pattern in the right argument contains a single quote, you must double it in the argument to #IN or #NOTIN, as in ‘That”s the ticket’.

Advanced patterns:

When a pattern is a right argument of #IN, it can be prefixed with a tilde (~) to indicate that database field values matching the pattern should be excluded (i.e., only values that don’t match the pattern should be selected) and 1 returned for values not matching the pattern. For example, the following two expressions produce the same result:

HMO #IN ('HM*','~HM10','~HM11')

(HMO #IN'HM*') #AND (HMO #NOTIN ('HM10','HM11')).

Patterns in the right argument need not be restricted to a single value (string of specific characters), but can represent several values by including asterisks and question marks. For example, because ‘A*‘ represents values that begin with A, ‘~A*‘ selects values that don't begin with A.

The patterns in a list are processed sequentially, from first to last, and each pattern adds to or removes from the previous selection. This is significant for patterns that start with a tilde. If the first pattern in a list starts with a tilde, an implicit * pattern is added to the front of the list, so that the search selects everything and then removes items matching the pattern starting with the tilde. Thus, HMO #IN '~HM10' matches everything but ‘HM10’ and is the same as HMO #IN ('*', '~HM10') or HMO #NOTIN 'HM10'. Similarly, NAME #IN ('AB*’, '~??C*') first selects all values for the field NAME that begin with AB and then removes values that have C as the third character. This selects ‘ABDEF’, ‘AB’, and ‘ABZ’, but not ‘ABC’ or ‘ABCDE’.

To match a pattern value containing the symbols *, ?, ~, or " (called syntactic characters), you must surround the symbol with double quotes ("). The quotes can go around the entire pattern or around the syntactic character itself. Here are some examples:

"*ABC" Values equal to *ABC
"*"ABC Same as above
"X?Y"* Values that start with X?Y
*"*"* Values containing *

If a syntactic character is outside double-quotes, it has its usual syntactic meaning; if it is within double-quotes, it’s treated as an ordinary character.

To search for a double-quote character, you must double the double-quote within the surrounding double-quotes, as in:

“D””Q” Values equal to D"Q
D””””Q Same as above