Description

This function searches a string to find the first occurrence of a sub-string that matches a supplied template string. Alternatively, an option checks for a match between the template and the whole string; used this way, McSearchString acts as a test for equivalent strings.

Return Type

A McObject object.  

A Long array of length 2 or longer. If no match is found, both values are negative. If a sub-string is found, the first value is the non-negative index into StringToSearch of the character that matched the beginning of the template; the second value is the number of characters in the matching sub-string (for certain regular expressions, the number of matching characters can be zero; e.g., ^$ matches zero-length lines). If the regular expression contained groups (i.e., \( and \) pairs), then for each group found, the starting offset and count (which can be zero) of the grouped characters are appended.
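The layout of the returned array can be illustrated with a short Python sketch. It does not call McSearchString; it only decodes a plain list of integers laid out as described above. The names Match and decode_result are hypothetical helpers introduced for this illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Match(NamedTuple):
    start: int                     # index of the first matched character
    count: int                     # number of matched characters (may be 0)
    groups: List[Tuple[int, int]]  # (start, count) for each \( ... \) group


def decode_result(values: List[int]) -> Optional[Match]:
    """Interpret a flat result array as described under Return Type."""
    if len(values) < 2:
        raise ValueError("expected at least two values")
    start, count = values[0], values[1]
    # A no-match result has both leading values negative.
    if start < 0 and count < 0:
        return None
    # Any remaining values are (start, count) pairs, one pair per group.
    rest = values[2:]
    if len(rest) % 2 != 0:
        raise ValueError("group values must come in (start, count) pairs")
    groups = [(rest[i], rest[i + 1]) for i in range(0, len(rest), 2)]
    return Match(start, count, groups)
```

For example, decode_result([0, 5, 0, 2, 3, 2]) describes a 5-character match at offset 0 with two groups, the first covering 2 characters from offset 0 and the second covering 2 characters from offset 3.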

Syntax

object.McSearchString (Template, StringToSearch, [OptionFlags], [CharLimits])

The McSearchString Method syntax has these parts:

Part            Description
object          An expression evaluating to an object of type McOMGlobal.
Template        Required. A String value.
StringToSearch  Required. A String value.

The string being searched.

OptionFlags     Optional. A mcobjTextMatchFlags enumeration, as described in Settings.

Flags that control whether Template is interpreted as a regular expression and whether case is ignored in the lookup. Flags also control whether the whole string must be matched, whether a regular expression template must match whole words, and whether wild-card characters in a regular expression may match across an end-of-line in StringToSearch.

CharLimits      Optional. A Variant value.

If given, a length-1 or length-2 array. The first element is an offset into StringToSearch giving the starting point of the search. If the array is length-2, the second element is the maximum number of characters to search (if this element is not given, StringToSearch may be searched up to its end). A negative offset or count results in a no-match return value. If CharLimits is not given, no offset or character limits are used; searching starts at offset 0, the beginning of StringToSearch, and can cover the whole string if necessary.
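The effect of CharLimits can be sketched in Python for the simplest case, a literal, case-sensitive search. This is an emulation under stated assumptions, not the actual implementation: the function name search_literal is hypothetical, the no-match value is assumed to be [-1, -1] (the text above only says the values are negative), and the match is assumed to lie entirely inside the search window.

```python
from typing import List, Optional, Sequence


def search_literal(template: str, text: str,
                   char_limits: Optional[Sequence[int]] = None) -> List[int]:
    """Emulate a literal, case-sensitive search restricted by CharLimits."""
    no_match = [-1, -1]  # assumed values; the source only says "negative"
    offset, count = 0, None
    if char_limits:
        offset = char_limits[0]
        if len(char_limits) > 1:
            count = char_limits[1]
    # A negative offset or count gives a no-match result.
    if offset < 0 or (count is not None and count < 0):
        return no_match
    # Without a count, the search may run to the end of the string.
    end = len(text) if count is None else min(len(text), offset + count)
    index = text.find(template, offset, end)
    if index < 0:
        return no_match
    # The index is reported relative to the start of the whole string.
    return [index, len(template)]
```

With the text "a cat, a cat", searching for "cat" with no limits finds the occurrence at index 2, a starting offset of 3 skips ahead to the occurrence at index 9, and limits of (3, 5) find nothing because the 5-character window ends before the second occurrence.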

Settings

The settings for OptionFlags are:

Constant                      Value  Description
mcobjNoRegExprAndMatchCase    0      No regular expression; match case
mcobjAllowRegularExpression   1      Allow regular expression match
mcobjIgnoreCase               2      Ignore case when doing match
mcobjMatchWholeWords          4      Match whole words
mcobjMatchAcrossEndOfLine     8      The . wild-card character can match across an end-of-line
mcobjMatchEntireString        16     Match template to entire string rather than the first match
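Because the values are distinct powers of two, they look like bit flags that can be combined by adding or OR-ing them. The Python sketch below mirrors the values from the table to show the arithmetic. The class TextMatchFlags is a hypothetical stand-in for the mcobjTextMatchFlags enumeration, and the assumption that OptionFlags accepts such combinations comes from the values and the description of OptionFlags, not from an explicit statement.

```python
from enum import IntFlag


class TextMatchFlags(IntFlag):
    """Python mirror of the mcobjTextMatchFlags values listed above."""
    NoRegExprAndMatchCase = 0
    AllowRegularExpression = 1
    IgnoreCase = 2
    MatchWholeWords = 4
    MatchAcrossEndOfLine = 8
    MatchEntireString = 16


# Case-insensitive regular expression search: 1 | 2 == 3.
flags = TextMatchFlags.AllowRegularExpression | TextMatchFlags.IgnoreCase
print(int(flags))                                  # 3
print(TextMatchFlags.IgnoreCase in flags)          # True
print(TextMatchFlags.MatchEntireString in flags)   # False
```

Under this assumption, a case-insensitive test for equivalent strings would use mcobjIgnoreCase together with mcobjMatchEntireString, giving a combined value of 18.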

Remarks

If a match is found, a length-2 results vector is returned indicating where the matched sub-string starts and how long it is.

The template string can optionally contain a “regular-expression” similar to that used by the UNIX grep utility. If the “enable regular expression” option is selected, the template string can contain special wild card characters according to the following rules:

. A dot (period) matches any single character.

[abc] A group of characters enclosed in brackets matches any character within the brackets. The brackets may contain ranges; for example, [a-z] matches any letter from a through z (A through Z will also match if the “ignore case” option is selected).

[^abc] A group of characters enclosed in brackets and beginning with the ^ character, matches any character not listed within the brackets. The brackets may contain ranges; for example, [^a-z] matches any character other than a lower case letter (upper case letters will also fail to match if the “ignore case” option is selected).

* The character, or any one of a bracketed group of characters, preceding a * character is matched zero or more times.

^ and $ match the beginning of a line (following a line-feed character, \n) or end-of-line (a carriage-return \r, a \n, or end of row), respectively.

< and > match the beginning or end of a word, respectively (a word consists of letters, numbers or underscore only).

A group may be identified by surrounding the group with \( and \).

The backslash, \, is the escape character; it undoes the special meanings of the above characters. Since the backslash is also ALI's escape character, you will need to put in \\ pairs to get one backslash within a quoted string.

Any other character matches only that character.
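As a rough illustration of these rules, the following Python sketch translates a template into the syntax of Python's re module, where groups use plain parentheses and word boundaries are written \b. It is an approximation for experimenting with templates outside the application, not the matching engine itself. The function template_to_python_regex is hypothetical, < and > are both mapped to \b (which does not distinguish the beginning of a word from the end), and bracketed sets are copied unchanged.

```python
import re


def template_to_python_regex(template: str) -> str:
    """Translate a template using the rules above into Python re syntax."""
    out = []
    i, n = 0, len(template)
    while i < n:
        ch = template[i]
        if ch == "\\" and i + 1 < n:
            nxt = template[i + 1]
            if nxt in "()":
                out.append(nxt)             # \( and \) delimit a group
            else:
                out.append(re.escape(nxt))  # escaped character is literal
            i += 2
            continue
        if ch == "[":
            # Copy a bracketed set, through its closing bracket, unchanged.
            j = template.find("]", i + 2)
            if j != -1:
                out.append(template[i:j + 1])
                i = j + 1
                continue
        if ch in ".*^$":
            out.append(ch)                  # same meaning in Python
        elif ch in "<>":
            out.append(r"\b")               # word boundary (approximation)
        else:
            out.append(re.escape(ch))       # any other character is literal
        i += 1
    return "".join(out)


# Grouped template: zero or more "a" characters as a group, then "b".
pattern = template_to_python_regex(r"\(a*\)b")
m = re.search(pattern, "xaab")
print(m.start(), m.end() - m.start())               # 1 3
print(m.start(1), m.end(1) - m.start(1))            # 1 2
```

When the template uses ^ or $, pass re.MULTILINE so that they match at line boundaries; re.IGNORECASE and re.DOTALL play roughly the roles of the “ignore case” and “match across end-of-line” options.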

Notes

The UNIX grep utility uses \< and \> instead of < and > for begin/end word in regular expressions. Currently, the grep group-reference construct \digit is not supported within regular expression templates.