DEVENKAN: python regex

Tuesday, September 11, 2018

python regex

\w - word character (letter, digit, or underscore)
\W - non-word character
\d - digit
\D - non-digit character
\s - whitespace character
\S - non-whitespace character
^ - start of string, or start of each line in multiline mode
$ - end of string, or end of each line in multiline mode
(A|B) - A or B (case sensitive)
[^abc] - any single character except a, b, or c
[ABC] - a single character: A, B, or C
[a-z] - a single lowercase letter from a to z
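
A quick way to check the classes and anchors above is to run them through re.findall. This is only a small sketch; the sample string and variable names are made up for illustration.

```python
import re

# Sample text (invented for this example); it spans two lines.
text = "Order 66 shipped\nto Zone_7 at 09:30"

words = re.findall(r"\w+", text)        # runs of letters, digits, underscore
digits = re.findall(r"\d+", text)       # runs of digits
chunks = re.findall(r"\S+", text)       # runs of non-whitespace

# Without re.MULTILINE, ^ only matches at the very start of the string.
first_word = re.findall(r"^\w+", text)
# With re.MULTILINE, ^ and $ also match at each line boundary.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
line_ends = re.findall(r"\S+$", text, re.MULTILINE)

either = re.findall(r"(?:Order|Zone)", text)   # A or B
not_lower = re.findall(r"[^a-z\s]", "ab1 C")   # anything except a-z and whitespace
lower_runs = re.findall(r"[a-z]+", "Zone_7 at")

print(words)
print(digits)
print(line_starts, line_ends)
```
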

. - any single character except a newline (unless re.DOTALL is set)
+ - 1 or more of the previous item
* - 0 or more of the previous item
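
The difference between these is easiest to see side by side. A minimal sketch, with test strings invented for illustration:

```python
import re

# "." matches any one character, but not a newline by default.
dot = re.findall(r"c.t", "cat cot c\nt cut")

# "+" needs at least one "b" after the "a".
plus = re.findall(r"ab+", "a ab abb abbb")

# "*" also accepts zero "b"s, so the bare "a" matches too.
star = re.findall(r"ab*", "a ab abb abbb")

print(dot)    # ['cat', 'cot', 'cut']
print(plus)   # ['ab', 'abb', 'abbb']
print(star)   # ['a', 'ab', 'abb', 'abbb']
```
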

re.findall returns all non-overlapping matches in the string as a list
re.search scans the entire string and returns the first match (or None)
re.match checks for a match only at the beginning of the string
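
The three functions can be compared on one string. This is just a sketch; the input text is made up.

```python
import re

text = "id=42; id=7"

all_ids = re.findall(r"\d+", text)    # list of every match
first = re.search(r"\d+", text)       # match object for the first hit anywhere
at_start = re.match(r"\d+", text)     # None: the string does not start with a digit
prefix = re.match(r"id", text)        # match object: the string starts with "id"

print(all_ids)                        # ['42', '7']
print(first.group(), first.start())   # 42 3
print(at_start)                       # None
print(prefix.group())                 # id
```
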

group(num=0)

This method returns the entire match (num=0, the default) or the specific subgroup num.

groups()

This method returns all matched subgroups in a tuple (empty if the pattern has no groups).

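import re

# Each dot is escaped so it matches a literal "." and not any character.
m = re.match(r'(\w+)\.(\w+)\.(\w+)', 'www.facebook.com')

print(m.groups())  # tuple of all subgroups: ('www', 'facebook', 'com')
print(m.group(0))  # the entire match as a string: www.facebook.com

# group(0) is the whole matched string; groups() is the tuple of subgroups,
# so the two are not the same thing.
print(m.group(1))  # www
print(m.group(2))  # facebook
print(m.group(3))  # com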

search vs match

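For a simple pattern, re.match(pattern, string) behaves like re.search('^' + pattern, string). The two differ under re.MULTILINE, where ^ also matches at the start of every line while re.match still matches only at the very start of the string.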

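import re

# Dots are escaped so they match a literal "." only.
m = re.match(r'(\w+)\.(\w+)\.(\w+)', 'www.facebook.com')
print(m.groups())

# Same pattern anchored with ^ and run through re.search.
m = re.search(r'^(\w+)\.(\w+)\.(\w+)', 'www.facebook.com')
print(m.groups())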

result
('www', 'facebook', 'com')
('www', 'facebook', 'com')
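
The one case where match and an anchored search part ways is multiline mode. A small sketch, using a two-line string invented for illustration:

```python
import re

text = "first line\nsecond line"

# Neither finds "second" at the very start of the string.
plain_match = re.match(r"second", text)
plain_search = re.search(r"^second", text)

# With re.MULTILINE, ^ also matches right after the newline,
# so search succeeds while match still only tries position 0.
ml_search = re.search(r"^second", text, re.MULTILINE)
ml_match = re.match(r"second", text, re.MULTILINE)

print(plain_match, plain_search)   # None None
print(ml_search.start())           # 11
print(ml_match)                    # None
```
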

CLEAR CONCEPT OF GROUPS AND GROUP IN ONE PICTURE

finditer
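
re.finditer works like re.findall but yields one match object per hit, so group(), groups() and span() are available for each match. A minimal sketch; the second hostname in the sample string is made up for illustration.

```python
import re

text = "www.facebook.com and docs.python.org"

# Collect (span, whole match, subgroups) for every match in the string.
found = []
for m in re.finditer(r"(\w+)\.(\w+)\.(\w+)", text):
    found.append((m.span(), m.group(0), m.groups()))

for item in found:
    print(item)
```
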
