Monday, October 26, 2009

Regex pattern modifiers - PHP/Perl

i
When this modifier is used, letters in the pattern match both uppercase and lowercase characters; for example, "/sgi/i" matches both "sgi" and "SGI". This is equivalent to Perl's /i modifier.
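As a quick illustration of the i modifier, here is a minimal sketch using preg_match(); the variable names are only for illustration.

```php
<?php
// Without the i modifier the pattern is case-sensitive.
$caseSensitive = preg_match('/sgi/', 'SGI');    // 0: no match

// With the i modifier both spellings match.
$upper = preg_match('/sgi/i', 'SGI');           // 1
$lower = preg_match('/sgi/i', 'sgi');           // 1

var_dump($caseSensitive, $upper, $lower);
```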


m

By default, PCRE treats the subject string as consisting of a single "line" of characters (even if it actually contains several newlines). The "start of line" metacharacter (^) matches only at the start of the string, while the "end of line" metacharacter ($) matches only at the end of the string, or before a terminating newline (unless the D modifier is also set). This is the same as in Perl.

When this modifier is used, the "start of line" and "end of line" constructs match immediately following or immediately before any newline in the subject string, respectively, as well as at the very start and end. This is equivalent to Perl's /m modifier. If there are no "\n" characters in a subject string, or no occurrences of ^ or $ in a pattern, setting this modifier has no effect.
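The difference between the default and the m modifier can be sketched with preg_match_all() on a three-line subject; the subject string and variable names here are made up for the example.

```php
<?php
$subject = "first\nsecond\nthird";

// Default: ^ and $ only anchor at the start and end of the whole string,
// so a pattern that must cover one "word" from ^ to $ cannot match.
$defaultCount = preg_match_all('/^\w+$/', $subject, $defaultMatches);

// With m: ^ and $ also match after and before each newline,
// so every line is matched separately.
$multilineCount = preg_match_all('/^\w+$/m', $subject, $multilineMatches);

var_dump($defaultCount);          // int(0)
var_dump($multilineCount);        // int(3)
print_r($multilineMatches[0]);    // first, second, third
```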


s
When this modifier is used, a dot metacharacter (.) in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negated character class such as [^a] always matches a newline character, regardless of whether this modifier is set.
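A small sketch of the s modifier, including the negated-class case mentioned above; the two-line subject is just an example.

```php
<?php
$subject = "a\nb";

// Without s, the dot does not match the newline between "a" and "b".
$withoutS = preg_match('/a.b/', $subject);      // 0

// With s, the dot matches any character, newline included.
$withS = preg_match('/a.b/s', $subject);        // 1

// A negated class matches the newline even without s.
$negatedClass = preg_match('/a[^x]b/', $subject); // 1

var_dump($withoutS, $withS, $negatedClass);
```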


x
When this modifier is used, whitespace data characters in the pattern are ignored unless they are escaped or appear inside a character class. Characters from an unescaped # outside a character class up to and including the next newline are also ignored. This is equivalent to Perl's /x modifier, and makes it possible to include comments inside complicated patterns. Note, however, that this applies only to data characters. Whitespace characters cannot appear within special character sequences in a pattern, for example within the sequence (?(, which introduces a conditional subpattern.
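To show how the x modifier lets a pattern carry its own comments, here is a minimal sketch that splits a date into parts. The date format and variable names are assumptions chosen for the example.

```php
<?php
// Whitespace and "# ..." comments inside the pattern are ignored under x.
$pattern = '/
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
/x';

preg_match($pattern, '2009-10-26', $parts);
print_r($parts);    // full match, then 2009, 10, 26

// A literal space must be escaped or placed inside a character class.
$ignoredSpace = preg_match('/a b/x', 'a b');    // 0: pattern is really "ab"
$escapedSpace = preg_match('/a\ b/x', 'a b');   // 1
$classSpace   = preg_match('/a[ ]b/x', 'a b');  // 1
```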


e
When this modifier is used, preg_replace() does normal substitution of references in the replacement string, evaluates it as PHP code, and uses the result of the evaluation for replacing the match found by the pattern.

Only preg_replace() uses this modifier; it's ignored by other PCRE functions.
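Note that the e modifier was deprecated in PHP 5.5.0 and removed in PHP 7.0.0, largely because evaluating the replacement as code is easy to get wrong from a security point of view. On current PHP versions the same effect is usually obtained with preg_replace_callback(). The following is only a sketch of that alternative, not the behaviour of the e modifier itself; the input string and the choice of strtoupper() are assumptions for the example.

```php
<?php
// Instead of evaluating a replacement string as PHP code,
// pass a callback that receives the match array and returns the replacement.
$result = preg_replace_callback(
    '/(\w+)/',
    function (array $m): string {
        // $m[1] holds the first captured group.
        return strtoupper($m[1]);
    },
    'hello world'
);

echo $result, "\n";    // HELLO WORLD
```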


A
When this modifier is used, the pattern is forced to be "anchored"; that is, it's constrained to match only at the start of the string that's being searched (the "subject string"). This effect can also be achieved by appropriate constructs in the pattern itself, which is the only way to do it in Perl.
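A short sketch comparing the A modifier with an explicit anchor in the pattern; the subject string is just an example.

```php
<?php
$subject = 'foobar';

// Unanchored: "bar" is found in the middle of the subject.
$unanchored = preg_match('/bar/', $subject);      // 1

// With A the match must begin at the start of the subject.
$anchoredMiss = preg_match('/bar/A', $subject);   // 0
$anchoredHit  = preg_match('/foo/A', $subject);   // 1

// The same constraint written in the pattern itself.
$caretMiss = preg_match('/^bar/', $subject);      // 0

var_dump($unanchored, $anchoredMiss, $anchoredHit, $caretMiss);
```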


D
When this modifier is used, a dollar metacharacter ($) in the pattern matches only at the end of the subject string. Without this modifier, a dollar sign also matches immediately before the final character if it's a newline (but not before any other newlines). This modifier is ignored if the /m modifier is set. There is no equivalent to this modifier in Perl.
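The effect of the D modifier is easiest to see on a subject that ends with a newline. A minimal sketch, with made-up subject strings:

```php
<?php
// Default: $ also matches just before a trailing newline.
$defaultDollar = preg_match('/abc$/', "abc\n");    // 1

// With D: $ matches only at the very end of the subject.
$strictDollar = preg_match('/abc$/D', "abc\n");    // 0
$strictAtEnd  = preg_match('/abc$/D', "abc");      // 1

// With m set, D is ignored and $ matches before any newline.
$withMultiline = preg_match('/abc$/mD', "abc\nxyz"); // 1

var_dump($defaultDollar, $strictDollar, $strictAtEnd, $withMultiline);
```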


S
When a pattern is going to be used several times, it's worth spending more time analyzing it in order to reduce the time taken for matching. When this modifier is used, this extra analysis is performed. At present, studying a pattern is useful only for non-anchored patterns that don't have a single fixed starting character. This is equivalent to the study() function in Perl.


U
This modifier inverts the "greediness" of the quantifiers so that they're not greedy by default, but become greedy if followed by "?". A greedy quantifier matches as much of the subject string as it can, giving characters back only when that is needed for the rest of the pattern to match. This modifier is not compatible with Perl.
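Here is a minimal sketch of how the U modifier flips greediness. The HTML-like subject and the ~ delimiter are choices made for the example only.

```php
<?php
$subject = '<b>one</b><b>two</b>';

// Default: .* is greedy and runs to the last closing tag.
preg_match('~<b>.*</b>~', $subject, $greedy);

// With U: .* becomes lazy and stops at the first closing tag.
preg_match('~<b>.*</b>~U', $subject, $lazy);

// With U: adding ? makes the quantifier greedy again.
preg_match('~<b>.*?</b>~U', $subject, $greedyAgain);

echo $greedy[0], "\n";        // <b>one</b><b>two</b>
echo $lazy[0], "\n";          // <b>one</b>
echo $greedyAgain[0], "\n";   // <b>one</b><b>two</b>
```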


X
This modifier turns on additional functionality of PCRE that is incompatible with Perl. Any backslash in a pattern that's followed by a letter that has no special meaning causes an error, thus reserving these combinations for future expansion. By default, as in Perl, a backslash followed by a letter with no special meaning is treated as a literal. At present, no other features are controlled by this modifier.
