Regular Expressions – I

A regular expression (regex for short) is an extremely useful way of matching common text patterns such as emails, phone numbers and URLs. Almost all programming languages have a regular expression library. Regexes are also widely used in text editors, where the find and replace tool can search for a pattern we specify and replace the matching text.

In a regular expression, many characters have a special meaning to the regular expression library:

.       – Any Character except new line
\d      – Digit (0-9)
\D      – Not a Digit (0-9)
\w      – Word Character (a-z, A-Z, 0-9, _)
\W      – Not a Word Character
\s      – White space (space, tab, newline)
\S      – Not White space (space, tab, newline)
\b      – Word Boundary
\B      – Not a Word Boundary
^       – Beginning of a String
$       – End of a String
[]      – Matches Characters in brackets
[^ ]    – Matches Characters NOT in brackets
|       – Either Or
( )     – Group
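
As a rough illustration of a few of the characters listed above, here is a minimal sketch using Python's built-in re module. The sample strings and variable names are made up for this example and are not part of the article's own examples.

```python
import re

text = "Order 66 shipped to cat_lady on day 7"

# \d matches one digit at a time
digits = re.findall(r"\d", text)

# \W matches anything that is not a word character; here that is only the spaces
non_word = re.findall(r"\W", text)

# ^ anchors to the beginning of the string, $ to the end
starts_with_order = bool(re.search(r"^Order", text))
ends_with_digit = bool(re.search(r"\d$", text))

# [] matches any one of the characters in the brackets
vowels = re.findall(r"[aeiou]", "regex")

# | means either the left side or the right side
either = re.findall(r"cat|dog", "cat dog bird")

# \b is a word boundary, so only the standalone word "day" matches
whole_word = re.findall(r"\bday\b", "day today daylight")

print(digits, len(non_word), starts_with_order, ends_with_digit)
print(vowels, either, whole_word)
```

Note that the underscore in cat_lady counts as a word character, which is why \W does not match it.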

Quantifiers:

*       – 0 or More
+       – 1 or More
?       – 0 or One
{3}     – Exact Number
{5,10}   – Range of Numbers (Minimum, Maximum)
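
The quantifiers above can be tried out with a short Python sketch. The sample strings are invented for illustration, and {2,3} is used in place of {5,10} only to keep the test data short.

```python
import re

samples = ["ac", "abc", "abbbc"]

# * : zero or more b's, so all three samples match
star = [s for s in samples if re.fullmatch(r"ab*c", s)]

# + : one or more b's, so "ac" is rejected
plus = [s for s in samples if re.fullmatch(r"ab+c", s)]

# ? : zero or one b, so "abbbc" is rejected
optional = [s for s in samples if re.fullmatch(r"ab?c", s)]

numbers = "7 42 123 4567"

# {3} : exactly three digits between word boundaries
exact = re.findall(r"\b\d{3}\b", numbers)

# {2,3} : at least two and at most three digits
ranged = re.findall(r"\b\d{2,3}\b", numbers)

print(star, plus, optional)
print(exact, ranged)
```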

Sample Regex

[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+
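
The sample regex reads as a simple email pattern. Here is a minimal sketch of running it with Python's re module; the addresses and the sentence are made up for this example.

```python
import re

# The sample regex from above, written as a raw string
email_pattern = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+")

text = "Contact alice@example.com or bob.smith@mail.example.org for details"

# findall returns every non-overlapping match in the text
emails = email_pattern.findall(text)
print(emails)
```

Because the last bracket group also allows a dot, an address followed directly by a full stop would have that full stop included in the match. A pattern this simple is a starting point, not a complete email validator.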

 

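The characters listed above are the building blocks of a regular expression, and by combining them we can match patterns or strings. The sample regex, for instance, describes a simple email address: one or more allowed characters, an @, a domain name, a dot, and the rest of the domain.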

Example:

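1.  To find a literal string we simply write the string itself

abc which will search for the exact sequence abc in the text. (Written with brackets, [abc] is a character class that matches any single a, b or c instead.)

Text : aaaaabaabcaabcabc

Regex: abc

Result: aaaaaba abc a abc abc

The three matching ‘abc’ patterns are set apart by spaces in the result above.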
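
To check how the literal pattern abc and the bracketed form [abc] behave on this sample text, here is a small Python sketch (the variable names are only for illustration):

```python
import re

text = "aaaaabaabcaabcabc"

# The literal pattern abc matches only the exact three-character sequence
literal_starts = [m.start() for m in re.finditer(r"abc", text)]

# The character class [abc] matches any single a, b or c,
# so every character of this particular text is a match
class_matches = re.findall(r"[abc]", text)

print(literal_starts)        # start index of each abc match
print(len(class_matches))    # number of single-character matches
```

The literal pattern finds three matches, while the character class matches each of the 17 characters on its own.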

 

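2.  A more complex but still easy to understand example is finding a name in the text

Text : Hello my name is Mr. Hitesh .

Pattern to find: Mr. Hitesh

Regex: Mr\.?\s[A-Z]\w*

Result: Hello my name is Mr. Hitesh . (the match is Mr. Hitesh)

Explanation:

Mr        – a literal

\.           – a literal dot after Mr (a dot is special in regex and matches any character, so we escape it with a \ (backslash) ).

?            – 0 or one of the preceding dot, so both Mr. and Mr are accepted

\s         – White space between the title and the name

[A-Z]   – any one uppercase letter from A to Z

\w*       – 0 or more Word Characters (a-z, A-Z, 0-9, _), which covers the rest of the name; without the * only one character after the capital letter would match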
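
A quick Python sketch shows why a quantifier on the final \w matters for this name example. The second sample sentence, without the dot after Mr, is made up here to show the effect of the ? quantifier.

```python
import re

text = "Hello my name is Mr. Hitesh ."

# A single \w matches just one word character after the capital letter
one_char = re.search(r"Mr\.?\s[A-Z]\w", text).group()

# \w* keeps matching word characters, so the whole name is captured
whole_name = re.search(r"Mr\.?\s[A-Z]\w*", text).group()

# Because of \.? the dot is optional, so "Mr Hitesh" matches too
no_dot = re.search(r"Mr\.?\s[A-Z]\w*", "I met Mr Hitesh today").group()

print(one_char)
print(whole_name)
print(no_dot)
```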

 

This was a simple explanation of what a regex is and how it works. Regex is a very powerful tool that can find far more complex and varied patterns in a string, so in the next article I’ll discuss how grouping and quantifiers work.