Why does using parentheses in a POSIX regular expression change the result of a match?
A. For POSIX (extended and basic) regular expressions, but not for Perl regexes, parentheses don’t only mark sub-expressions; they also help determine what the best match is. When the expression is compiled as a POSIX basic or extended regex, Boost.Regex follows the POSIX standard leftmost-longest rule for determining what matched. So if there is more than one possible match after considering the whole expression, it looks next at the first sub-expression, then the second sub-expression, and so on. So:

“(0*)([0-9]*)” against “00123” would produce $1 = “00”, $2 = “123”

whereas

“0*([0-9]*)” against “00123” would produce $1 = “00123”

If you think about it, had $1 matched only the “123”, this would be “less good” than the match “00123”, which is both further to the left and longer. If you want $1 to match only the “123” part, then you need to use something like “0*([1-9][0-9]*)” as the expression.