Tuesday, September 11, 2018

Python regex

\w-word character (letter, digit or underscore)
\W-non-word character
\d-digit
\s-whitespace character
\S-non-whitespace character
\D-non-digit character
^-start of string, or start of line in multiline mode
$-end of string, or end of line in multiline mode
(A|B)-A or B (case sensitive)
[^abc]-any single character except a, b or c
[ABC]-match a single character A, B or C
[a-z]-lowercase letter from a to z

.-match any single character (except a newline, by default)
?-0 or 1 of the previous item
*-0 or more of the previous item
+-1 or more of the previous item
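
A quick way to check the symbols above is to run a few of them against one string. This is only a small sketch; the sample text and variable names are made up for illustration.

```python
import re

# Sample text invented for this sketch
text = "Order 66 shipped to Bob_7 on 2018-09-11"

digits = re.findall(r'\d+', text)          # runs of one or more digits
words = re.findall(r'\w+', text)           # runs of word characters (letters, digits, _)
lower = re.findall(r'[a-z]+', text)        # runs of lowercase letters only
names = re.findall(r'(Bob|Alice)', text)   # either alternative, case sensitive
starts = re.search(r'^Order', text) is not None   # ^ anchors at the start
ends = re.search(r'\d\d$', text) is not None      # $ anchors at the end

print(digits)  # ['66', '7', '2018', '09', '11']
print(words)   # ['Order', '66', 'shipped', 'to', 'Bob_7', 'on', '2018', '09', '11']
print(lower)   # ['rder', 'shipped', 'to', 'ob', 'on']
print(names)   # ['Bob']
print(starts, ends)  # True True
```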




re.findall returns all non-overlapping matches in the string as a list
re.search scans the entire string and returns the first match
re.match checks for a match only at the beginning of the string
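
The three functions can be compared side by side on the same input. A minimal sketch, with a made-up sample string:

```python
import re

# Sample text invented for this sketch
text = "cat bat cat"

all_hits = re.findall(r'[cb]at', text)   # every non-overlapping match, as a list
first_bat = re.search(r'bat', text)      # first match anywhere in the string
at_start = re.match(r'bat', text)        # only tries at the very beginning

print(all_hits)          # ['cat', 'bat', 'cat']
print(first_bat.span())  # (4, 7)
print(at_start)          # None, because the string starts with 'cat'
```

Note that re.search and re.match return a match object (or None), while re.findall returns plain strings.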



group(num=0)
This method returns the entire match (or the specific subgroup num)

groups()
This method returns all matching subgroups in a tuple (empty if there weren't any)

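import re
# Escape each dot so it matches a literal '.' instead of any character
m = re.match(r'(\w+)\.(\w+)\.(\w+)','www.facebook.com')
print(m.groups())   # all subgroups as a tuple: ('www', 'facebook', 'com')
print(m.group(0))   # the entire match: www.facebook.com
print("here group(0) is the whole match, groups() is the tuple of subgroups")
print(m.group(1))   # www
print(m.group(2))   # facebook
print(m.group(3))   # com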

search vs match



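Without re.MULTILINE, re.match('pattern') behaves like re.search('^pattern').
With re.MULTILINE, ^ in re.search also matches at the start of each line, while re.match still checks only the start of the string.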


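import re
# Dots are escaped so they match a literal '.'
m = re.match(r'(\w+)\.(\w+)\.(\w+)','www.facebook.com')
print(m.groups())
# search with ^ is anchored at the start, so it gives the same result
m = re.search(r'^(\w+)\.(\w+)\.(\w+)','www.facebook.com')
print(m.groups())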

result
('www', 'facebook', 'com')
('www', 'facebook', 'com')
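
One case where match and search with ^ stop agreeing is multiline mode, where ^ can also match right after a newline. A small sketch with a made-up two-line string:

```python
import re

# Two-line sample text invented for this sketch
text = "first line\nsecond line"

plain_search = re.search(r'^second', text)                # ^ means start of string here
multi_search = re.search(r'^second', text, re.MULTILINE)  # ^ also matches after '\n'
multi_match = re.match(r'second', text, re.MULTILINE)     # match still tries index 0 only

print(plain_search)          # None
print(multi_search.group())  # second
print(multi_match)           # None
```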


CLEAR CONCEPT OF GROUPS AND GROUP IN ONE PICTURE

finditer
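
re.finditer works like re.findall, but it yields match objects one at a time instead of returning strings, so group(), groups() and span() are available for every match. A minimal sketch, reusing the pattern from above on a made-up string with two hostnames:

```python
import re

# Sample text invented for this sketch
text = "www.facebook.com and www.example.org"

results = []
for m in re.finditer(r'(\w+)\.(\w+)\.(\w+)', text):
    # Each m is a match object, just like the one re.search returns
    results.append((m.span(), m.group(0), m.groups()))

for row in results:
    print(row)
# ((0, 16), 'www.facebook.com', ('www', 'facebook', 'com'))
# ((21, 36), 'www.example.org', ('www', 'example', 'org'))
```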


