Dec 15 2006
Regex Notes
Regular expressions (regex) let you search text for a particular pattern. A regex consists of literal characters and special characters, also called meta characters. Regex syntax has 2 modes: one inside character classes, that is, inside [ ], and one outside character classes.
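As a small illustration of the two modes, here is a sketch using Python's re module. Python is used only for illustration, and the sample strings and variable names are made up.

```python
import re

# Outside a character class the dot is a meta character:
# it matches any single character (except a newline by default).
outside = re.search(r"a.c", "abc")

# Inside [] the dot is literal, so only a real "." matches.
inside_miss = re.search(r"a[.]c", "abc")
inside_hit = re.search(r"a[.]c", "a.c")

print(bool(outside), bool(inside_miss), bool(inside_hit))
```

The same character, the dot, behaves differently depending on which mode it appears in.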
Character classes []: Regex inside [] are in a different mode and have a different syntax. A character class matches a single character, specified as a list, a range, or a class shortcut.
  03[-./]19[-./]76 matches the date 03 19 76 with each delimiter being a hyphen, a dot, or a slash, e.g. 03-19-76, 03.19.76 or 03/19/76
Inside [] the dot is no longer a meta character; it is matched literally!

Negation Operator ^: as the first character inside [], ^ negates the class.
  f[^u] matches an f followed by any single character other than u

Character class shortcuts:
  \d  [0-9]            digit
  \D                   non-digit
  \w  [a-zA-Z0-9_]     part of a word
  \W                   non-word
  \s  [ \t\n\r\f\v]    whitespace
  \S                   non-whitespace

Alternation: the | operator is ‘or’
  Jeff(re|er|izz)y => Jeffrey, Jeffery, Jeffizzy

Conditionals: (?if then|else)
  If the “if” part is true, the “then” expression is attempted; otherwise the “else” expression is attempted.
  (?if then): the else part is optional, and so is the pipe character that precedes it.
  In flavors that support conditionals, the “if” part is written in parentheses and is typically a group number (in some flavors a lookaround), e.g. (?(1)then|else).

Anchors:
  ^   beginning of line
  $   end of line
  \b  word boundary
  \B  non-word boundary
  Ruby:
  \A  beginning of string
  \Z  end of string (or before a newline at the end)
  \z  end of string

Optional Items: ? optionally matches the character before it, or a group () of characters before it:
  colou?r matches color and colour
  4(th)? matches 4 and 4th

Repetitions:
  +          1 or more times
  *          0 or more times
  ?          0 or 1 times
  {n}        exactly n times
  {min,max}  min to max times
  {min,}     at least min times
  {,max}     at most max times (ruby)

Backreferences: first matched group ():
  sed, vi: \1
  ruby: $1
  php: $m[1]
  python, java: m.group(1)
  c#: m.Groups[1]
Group zero is typically the entire match.

Escape: \. escapes the dot meta character, so it matches a literal dot

More Info: http://www.regular-expressions.info/
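A few of the patterns from these notes can be tried out in Python's re module. This is only a sketch: the helper all_matches, the sample strings and the variable names are invented for illustration, and other languages spell these calls differently.

```python
import re

def all_matches(pattern, text):
    """Hypothetical helper: return every full match (group zero) of pattern in text."""
    return [m.group(0) for m in re.finditer(pattern, text)]

# Character class: each delimiter may be a hyphen, a dot, or a slash.
date = re.compile(r"03[-./]19[-./]76")
date_hits = [s for s in ["03-19-76", "03.19.76", "03/19/76", "03 19 76"]
             if date.fullmatch(s)]

# Negated character class: f followed by anything other than u.
not_fu = all_matches(r"f[^u]", "fa fu fz")

# Alternation inside a group.
names = all_matches(r"Jeff(re|er|izz)y", "Jeffrey Jeffery Jeffizzy Jeffy")

# Optional items: ? applies to the preceding character or group.
colors = all_matches(r"colou?r", "color colour colr")
ordinals = all_matches(r"4(th)?", "4 4th")

# Repetition: between 2 and 4 digits.
two_to_four = [s for s in ["1", "123", "12345"] if re.fullmatch(r"\d{2,4}", s)]

# Backreferences: group zero is the whole match, group 1 the first ().
m = re.search(r"(\d+)-(\d+)", "pages 12-34")
whole, first = m.group(0), m.group(1)

print(date_hits, not_fu, names, colors, ordinals, two_to_four, whole, first)
```

The helper uses group zero on purpose, so the patterns can stay exactly as written in the notes even when they contain groups.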
