Regex class

From m204wiki
Revision as of 14:21, 12 March 2022 by Alex

The Regex class provides a facility for reusing regular expressions that are compiled at runtime. It was first available in Model 204 version 7.9.


Regex usage

Before Model 204 7.9, every regular expression function recompiled the regular expression specified as a string parameter on each call. So, in a program like the following:

   %num is float
   %stringlist is object stringlist
   ...
   %num = 1
   repeat forever
      %num = %stringlist:regexLocate("(%[a-z_][a-z_.0-9]*) *=.*?\1", options="i")
      if %num eq 0 then loop end; end if
      ...
   end repeat

the regular expression in the RegexLocate call would be parsed and compiled in every iteration of the loop. As one might guess, this adds a lot of unnecessary overhead. In Model 204 7.9 and later, the regular expression is compiled when the surrounding SOUL code is compiled, so no compilation is necessary at runtime. The Model 204 regular expression engine was also rewritten in 7.9 to be significantly more efficient, so code like that above gains from both changes in Model 204 7.9 and later.

For most purposes, these improvements can be taken advantage of without any effort. However, a regular expression can only be compiled at SOUL compile-time if it is a literal. Otherwise, how could the compiler know what to compile? There are cases where a regular expression is only available at runtime. A classic example is a program that allows end users to type in arbitrary regular expressions for matching against a set of values. In such a case, it is useful to be able to compile the user-specified expression once and then apply it to a variety of values.

It might also be useful to use the same regular expression in multiple places, or even to create a system or subsystem global that contains a regular expression. Or one might simply want to separate the creation of the regular expression from the code that uses it, for readability or maintenance reasons.

It is for cases like these that the Regex class was created. To illustrate how the Regex class might be used, consider the above example, but with a user-specified regular expression held in the string variable %userRegex. In that case, the above code can be written as:

   %num is float
   %regex is object regex
   %stringlist is object stringlist
   %userRegex is string len 255
   ...
   %num = 1
   %regex = new(%userRegex, options="i")
   repeat forever
      %num = %regex:locate(%stringlist, %num)
      if %num eq 0 then loop end; end if
      ...
   end repeat

In this case, the user-specified regular expression is only compiled once in the Regex constructor call (new) and reused multiple times after that, for optimal efficiency.
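The other case mentioned earlier, using one regular expression in multiple places, might look something like the following. This is only a sketch: the Stringlists %names and %addresses and the variables %nameHit and %addrHit are hypothetical, and the only Regex calls used are the constructor and Locate in the forms shown above. The second Locate argument is assumed to be the starting item number, as the loop example suggests.

```
%regex is object regex
%names is object stringlist
%addresses is object stringlist
%userRegex is string len 255
%nameHit is float
%addrHit is float
...
* Compile the expression once, in one place
%regex = new(%userRegex, options="i")
...
* Reuse the same compiled object against different lists
* (%names and %addresses are hypothetical, populated elsewhere)
%nameHit = %regex:locate(%names, 1)
%addrHit = %regex:locate(%addresses, 1)
```

Keeping the constructor call separate from the Locate calls also means that a change to the expression or its options is made in a single statement.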

List of Regex methods

The individual Regex methods are summarized in List of Regex methods.