Latest revision as of 15:06, 24 March 2022
Replace regex match(es) (Regex class)
This function replaces the parts of a string that match the regular expression in the Regex object and returns the string with the replacements. This provides similar functionality to RegexReplace and UnicodeRegexReplace.
Syntax
%string = regex:Replace( string, [replacement], [Options= string])
Syntax terms
%string | A copy of the input string after matches are replaced using the appropriate replacement string.
---|---
regex | The Regex object.
string | The string to test against the Regex object.
replacement | The string that replaces the substrings of string that regex matches. Except when the A option is specified (as described at Common regex options), you can include markers in the replacement value to indicate where to insert the corresponding captured strings, that is, the strings matched by capturing groups (parenthesized subexpressions) in the regular expression, if any.
These markers are in the form $n, where n is the number of the capture group, and 1 is the number of the first capture group. n must not be 0 or contain more than 9 digits. If the regular expression has no nth capture group for a $n marker, the literal text $n is used in the replacement string rather than the empty string. xxx$1 is an example of a valid replacement string, and $0yyy is an example of an invalid one. Markers can also take the form $mn, where m is a modifier.
The only characters you can escape in a replacement string are dollar sign ($), backslash (\), and the digits 0 through 9. So only these escapes are respected: \\, \$, and \0 through \9. No other escapes are allowed in a replacement string, including "shorthand" escapes like \d, and an "unaccompanied" backslash (\) is an error. Because the scan for the number that follows the $ stops at the first non-digit, you use 1$1\2 to indicate that the first captured string should go between the digits 1 and 2 in the replacement string.
An invalid replacement string results in request cancellation.
Options | A string of single-letter options, which may be specified in uppercase or lowercase, in any combination, with or without separating blanks. These options are a subset of Common regex options. The only acceptable options (case-independent) are A for "as-is", G for "global" (replace all occurrences), and T for trace.
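The marker and escape rules for replacement can be sketched as follows. This is an illustrative fragment only: the pattern, the input strings, and the constructor call with a single argument are assumptions made for the example, and no output is shown because the fragment has not been run.

```
b
%regex is object regex
* Assumed pattern: one lowercase letter between two hyphens
%regex = new("-([a-z])-")
* $1 inserts the first captured string; \2 is an escaped, literal 2,
* so the marker is read as $1 and not as $12
print %regex:replace("x-b-y", "1$1\2")
* \$ is an escaped, literal dollar sign placed before the captured string
print %regex:replace("x-b-y", "\$$1")
* With the A option the replacement is used as is, so $1 is not a marker
print %regex:replace("x-b-y", "$1", options="A")
end
```

Writing the digit as \2 rather than 2 matters because an unescaped 1$12 would be scanned as a reference to capture group 12.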
Usage notes
- If the regular expression specified in the constructor call was Unicode, this method causes request cancellation. To test whether a Regex object was created with a Unicode regular expression, check the IsUnicode property.
- There is no way to undo the A, G, and T options if they were specified on the constructor, so if a Regex object sometimes needs these options and sometimes does not, they should be specified on each Replace call.
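As a sketch of the second note, options can be passed on individual Replace calls so that one Regex object serves both cases. The pattern and input string here are assumptions chosen for illustration, and the output is omitted because the fragment has not been run.

```
b
%regex is object regex
* No options on the constructor, so each call chooses its own
%regex = new("[0-9]+")
* No G option on this call: not a global replace
print %regex:replace("10 20 30", "#")
* G option on this call only: replace all occurrences
print %regex:replace("10 20 30", "#", options="G")
end
```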
Examples
For example:
    b
    %regex is object regex
    %regex = new("([A-Z]{3,})(\d{3,})", replace="$2-$1")
    print %regex:replace("My license plate says EYE2020")
    print %regex:replace("My license plate says EYE2020", "nothing")
    end
displays:
    My license plate says 2020-EYE
    My license plate says nothing