Confoo 2012 : Regular Expressions

Talk by Jakob Westhoff (Qafoo) given on Friday 2nd March

Different languages use different regular expression engines : PHP (PCRE), Java, Python, Ruby, etc…

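Delimiters : the pattern is wrapped in a delimiter, / for example : /foobar/i  : the trailing i modifier makes the match case insensitive. PHP also accepts bracket pairs such as ( ) as delimiters, which is how several patterns in these notes are written.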
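
As a rough illustration of the i modifier, here is a minimal sketch in Python (chosen only for the examples in these notes; the talk itself was PHP/PCRE oriented). Python has no pattern delimiters, so the assumption is that `re.IGNORECASE` plays the role of the trailing i.

```python
import re

# Python has no /.../ delimiters; the trailing "i" of /foobar/i
# corresponds to the re.IGNORECASE flag (assumed equivalent here).
insensitive = re.compile(r"foobar", re.IGNORECASE)

matches_with_flag = insensitive.search("Some FOOBAR text") is not None
matches_without_flag = re.search(r"foobar", "Some FOOBAR text") is not None

print(matches_with_flag, matches_without_flag)
```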

MetaCharacters :

  • * : zero or more occurrences
  • + : at least one occurrence
  • ? : once or not at all
  • {x,y} : between x and y occurrences
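
The four quantifiers above can be compared side by side. This is a small sketch using Python's `re` module; the helper `full` and the sample strings are made up for the illustration.

```python
import re

def full(pattern, text):
    """Return True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# * : zero or more "b"
star = [full(r"ab*c", s) for s in ("ac", "abc", "abbbc")]
# + : at least one "b"
plus = [full(r"ab+c", s) for s in ("ac", "abc", "abbbc")]
# ? : zero or one "b"
optional = [full(r"ab?c", s) for s in ("ac", "abc", "abbc")]
# {2,3} : between two and three "b"
bounded = [full(r"ab{2,3}c", s) for s in ("abc", "abbc", "abbbc", "abbbbc")]

print(star, plus, optional, bounded)
```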

Character classes

  • . : matches any character EXCEPT a new line; with the s modifier, for example : (The.Point)s , it also matches new lines
  • character classes : [abcdef]+ : any character in the brackets is matched one or several times; ranges : [a-cd-f]+ ; most metacharacters lose their meaning inside brackets, except the - used for ranges
  • [^abcef]+ : negates the class (any character not listed); the new line is part of what it matches; to match everything except the new line : [^\n]+
  • predefined character classes : \d (digit), \s (any whitespace); the capital letters negate them : \D is everything but a digit
  • D modifier, for example (something)D : no new line is tolerated at the end of the subject
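
The points above can be checked with a short sketch in Python. Two assumptions are worth flagging: `re.DOTALL` stands in for the s modifier, and since Python has no D modifier, `\Z` is used to show an end anchor that refuses a trailing new line. The sample strings are invented.

```python
import re

# "." skips new lines unless DOTALL (the "s" modifier) is set.
dot_default = re.search(r"The.Point", "The\nPoint") is not None
dot_dotall = re.search(r"The.Point", "The\nPoint", re.DOTALL) is not None

# Character class with two ranges.
ranges = re.findall(r"[a-cd-f]+", "abc xyz fed")

# A negated class also matches the new line...
negated = re.findall(r"[^abcef]+", "abx\nyef")
# ...unless the new line is what gets excluded.
no_newline = re.findall(r"[^\n]+", "line one\nline two")

# Predefined classes and their capitalized negations.
digits = re.findall(r"\d+", "room 42, floor 3")
non_digits = re.findall(r"\D+", "a1b22c")

# "$" tolerates one trailing new line; "\Z" (Python's closest
# counterpart to the D modifier) does not.
dollar_lenient = re.search(r"Point$", "The Point\n") is not None
end_strict = re.search(r"Point\Z", "The Point\n") is not None

print(dot_default, dot_dotall, ranges, negated, no_newline)
print(digits, non_digits, dollar_lenient, end_strict)
```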

Alternatives

  • Logical OR : Open|Source : matches whichever alternative is found first in the subject : Open or Source
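
A small sketch of the alternation, again in Python with invented sample strings. It shows that the match returned is the one found first in the subject, regardless of the order of the alternatives in the pattern.

```python
import re

# Every occurrence of either alternative.
all_hits = re.findall(r"Open|Source", "Open Source software")

# search() returns the leftmost match in the subject, here "Source",
# even though "Open" is listed first in the pattern.
first_hit = re.search(r"Open|Source", "Source is Open").group()

print(all_hits, first_hit)
```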

Escaping

  • put a \ in front of the character to make it a literal one instead of a special one
  • be careful : depending on your programming language, \ also has a meaning inside string literals, so doubled sequences like \\n become common…
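
To make both escaping levels concrete, here is a hedged sketch in Python. Raw strings (`r"..."`) are Python's way of avoiding the doubled backslashes; other languages handle this differently. `re.escape` is a standard library helper, and the sample strings are made up.

```python
import re

# Escaped dot: only a literal "." matches.
literal_dot = re.search(r"3\.14", "3x14") is not None
# Unescaped dot: any character matches.
any_char = re.search(r"3.14", "3x14") is not None

# Without a raw string every backslash has to be doubled,
# because the language consumes one level of escaping first.
same_pattern = "\\d+\\.\\d+" == r"\d+\.\d+"

# re.escape() adds the backslashes needed to match text literally.
escaped = re.escape("1+1=2?")
matches_literally = re.fullmatch(escaped, "1+1=2?") is not None

print(literal_dot, any_char, same_pattern, matches_literally)
```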

Anchors

  • ^ beginning of the subject
  • $ end of the subject
  • m modifier, for example (^abcdef$)m : enables multiline mode, where ^ and $ also match at the beginning and end of each line
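
The effect of multiline mode is easiest to see on a subject with several lines. A minimal sketch in Python, assuming `re.MULTILINE` corresponds to the m modifier; the sample text is invented.

```python
import re

text = "first\nsecond\nthird"

# Default: ^ and $ anchor to the whole subject.
start_default = re.findall(r"^\w+", text)
end_default = re.findall(r"\w+$", text)

# Multiline: ^ and $ anchor to every line.
start_multiline = re.findall(r"^\w+", text, re.MULTILINE)
end_multiline = re.findall(r"\w+$", text, re.MULTILINE)

print(start_default, end_default, start_multiline, end_multiline)
```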

Sub pattern

  • ((abc)(def)) : sub-patterns extract parts of the string; the outer parentheses are the delimiters here, so 1 -> abc and 2 -> def
  • named sub-pattern : (?P<name>...) syntax
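
A short sketch of numbered and named sub-patterns in Python, which happens to share the `(?P<name>...)` syntax. The group names `year` and `month` and the sample strings are invented for the example.

```python
import re

# Numbered sub-patterns: group 0 is the whole match.
numbered = re.search(r"(abc)(def)", "xxabcdefxx")
whole, first, second = numbered.group(0), numbered.group(1), numbered.group(2)

# Named sub-patterns can be read back by name.
named = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "Talk on 2012-03-02")
parts = named.groupdict()

print(whole, first, second, parts)
```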

Readability

  • x modifier : improves your pattern's legibility by permitting whitespace and comments.
  • everything from # to the end of the line is ignored, so you can use it to comment
  • whitespace is ignored unless it is escaped
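
As an illustration of the x modifier, here is a sketch in Python, assuming `re.VERBOSE` behaves like it for these cases. The date pattern and its group names are made up for the example.

```python
import re

# Whitespace and "#" comments inside the pattern are ignored.
date = re.compile(
    r"""
    (?P<year>\d{4})    # four-digit year
    -
    (?P<month>\d{2})   # two-digit month
    -
    (?P<day>\d{2})     # two-digit day
    """,
    re.VERBOSE,
)
found = date.search("Friday 2012-03-02").groupdict()

# An unescaped space is dropped, so this pattern means "OpenSource".
unescaped = re.compile(r"Open Source", re.VERBOSE)
# An escaped space stays a literal space.
escaped = re.compile(r"Open\ Source", re.VERBOSE)

unescaped_hits_spaced = unescaped.search("Open Source") is not None
unescaped_hits_joined = unescaped.search("OpenSource") is not None
escaped_hits_spaced = escaped.search("Open Source") is not None

print(found, unescaped_hits_spaced, unescaped_hits_joined, escaped_hits_spaced)
```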
