python - re.findall not returning full match?


I have a file that includes a bunch of strings like "size=XXX;". I am trying Python's re module for the first time and am a bit mystified by the following behavior: if I use a pipe for 'or' in a regular expression, I only see that bit of the match returned. E.g.:

>>> myfile = open('testfile.txt','r').read()
>>> print re.findall('size=50;',myfile)
['size=50;', 'size=50;', 'size=50;', 'size=50;']
>>> print re.findall('size=51;',myfile)
['size=51;', 'size=51;', 'size=51;']
>>> print re.findall('size=(50|51);',myfile)
['51', '51', '51', '50', '50', '50', '50']
>>> print re.findall(r'size=(50|51);',myfile)
['51', '51', '51', '50', '50', '50', '50']

The "size=" part of the match is gone. (Yet it is certainly used in the search, otherwise there would be more results.) What am I doing wrong?

The problem you have is that if the regex that re.findall tries to match contains capturing groups (i.e. portions of the regex enclosed in parentheses), then it is the groups that are returned, rather than the whole matched string.
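To make that rule concrete, here is a minimal sketch of how the return value of re.findall changes with the number of capturing groups. The sample string and the variable names are made up for illustration and are not from the question's file.

```python
import re

# Hypothetical sample text standing in for the file contents.
text = "size=50;size=51;size=52;"

# No capturing groups: findall returns each full match.
no_group = re.findall(r"size=50;", text)

# One capturing group: findall returns only the text of that group.
one_group = re.findall(r"size=(50|51);", text)

# Two capturing groups: findall returns one tuple of groups per match.
two_groups = re.findall(r"(size)=(50|51);", text)

print(no_group)    # ['size=50;']
print(one_group)   # ['50', '51']
print(two_groups)  # [('size', '50'), ('size', '51')]
```

Note that "size=52;" never shows up in the last two results, which confirms that the literal parts of the pattern still take part in the search even though they are not returned.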

One way to solve this issue is to use non-capturing groups (prefixed with ?:).

>>> import re
>>> s = 'size=50;size=51;'
>>> re.findall('size=(?:50|51);', s)
['size=50;', 'size=51;']

If the regex that re.findall tries to match does not capture anything, it returns the whole of the matched string.
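If you do want to keep the capturing group (say, because the captured number is needed elsewhere), one possible alternative is re.finditer, which yields match objects, so group(0) gives the whole match and group(1) gives the captured part. This is only a sketch, not something the question asked for, and the sample string and variable names are assumptions.

```python
import re

# Hypothetical sample text standing in for the file contents.
text = "size=50;size=51;size=52;"

# Same pattern as in the question, capturing group included.
pattern = re.compile(r"size=(50|51);")

# group(0) is the entire match; group(1) is the first capturing group.
full_matches = [m.group(0) for m in pattern.finditer(text)]
captured = [m.group(1) for m in pattern.finditer(text)]

print(full_matches)  # ['size=50;', 'size=51;']
print(captured)      # ['50', '51']
```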

Although using character classes might be the simplest option in this particular case, non-capturing groups provide a more general solution.
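For the character-class option, a pattern such as size=5[01]; should do here, since 50 and 51 differ only in their last digit. A short sketch comparing it with the non-capturing group, again using a made-up sample string:

```python
import re

# Hypothetical sample text standing in for the file contents.
text = "size=50;size=51;size=52;"

# Character class: matches '5' followed by either '0' or '1'.
with_class = re.findall(r"size=5[01];", text)

# Non-capturing group: matches either alternative as a whole.
with_group = re.findall(r"size=(?:50|51);", text)

print(with_class)  # ['size=50;', 'size=51;']
print(with_group)  # ['size=50;', 'size=51;']
```

The character class only works because the alternatives share a prefix and differ by a single character; for alternatives like 50 and 128, the non-capturing group is the one that still applies.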

