In PHP, regular expressions are very useful for extracting information from strings, files, documents, and so on. We have divided the material into separate lessons so that you can learn without any pressure.
Lesson 1: Letters
In this lesson we treat a regular expression as a plain sequence of characters and write patterns that match a specific sequence of letters.
$string = "abcdefgh \n abcdef \n abc";
$pattern = '/abcd(.*)/'; // "abcd" followed by the rest of the line
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => abcdefgh [1] => abcdef )
Lesson 2: Digits
Here we add the digits 0 to 9 to our patterns. Real-world text often mixes digits with letters, so a pattern must be able to match both.
$string = "abc123xyz \n var g = 456 \n hello 123";
$pattern = '/(.*)123(.*)/'; // any line that contains "123"
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => abc123xyz [1] => hello 123 )
Lesson 3: The Period Character
The dot (.) matches any single character: a letter, a digit, whitespace, or a symbol (but not a newline, by default). To match a literal period, escape the dot as \.
$string = "title. \n 123. \n +-. \n chess";
$pattern = '/(.*)\.(.*)/'; // any line that contains a literal period
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => title. [1] => 123. [2] => +-. )
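To see the difference between an unescaped and an escaped dot, the sketch below runs both against the same made-up input.

```php
<?php
$string = 'a.c abc a-c'; // hypothetical sample input

// Unescaped dot: any single character between "a" and "c".
preg_match_all('/a.c/', $string, $any);

// Escaped dot: only a literal period between "a" and "c".
preg_match_all('/a\.c/', $string, $literal);

print_r($any[0]);     // a.c, abc, a-c
print_r($literal[0]); // a.c
```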
Lesson 4: Only a, b, or c
To match one character out of a specific set, list the allowed characters inside square brackets. For example, [abc] matches a single a, b, or c.
$string = "can \n man \n ran \n fan";
$pattern = '/[cmf]an/';
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => can [1] => man [2] => fan )
Lesson 5: Not a, b, nor c
As in the previous lesson, you can list characters in square brackets, but this time to exclude them: [^abc] matches any single character except a, b, or c.
$string = "can \n man \n ran \n fan";
$pattern = '/[^rm]an/';
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => can [1] => fan )
Lesson 6: Characters a to z / Numbers 0 to 9
You can match or exclude a whole range of characters by using a hyphen (-) inside the brackets. For example, [1-5] matches any digit from 1 to 5, and [^a-c] matches any character except a, b, and c.
$string = "Anc \n Fob \n Ppc \n bax \n byy \n bcz";
$pattern = '/[A-Z][n-p][a-c]/'; // an uppercase letter, then n to p, then a to c
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => Anc [1] => Fob [2] => Ppc )
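The two ranges mentioned in the text, [1-5] and [^a-c], can be combined in one pattern. The short sketch below uses a made-up input string; note that [^a-c] also accepts spaces and digits, because it only excludes a, b, and c.

```php
<?php
$string = 'a1 b7 d3 z9'; // hypothetical sample input

// One character that is not a, b, or c, followed by a digit from 1 to 5.
preg_match_all('/[^a-c][1-5]/', $string, $matches);

print_r($matches[0]); // only "d3" qualifies
```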
Lesson 7: Repeat characters
How do you match a character that repeats? The solution is to use curly braces. For example, a{3} matches exactly three a's, a{1,4} matches at least one and at most four, and a{2,} matches two or more.
$string = "helllllo \n helllo \n hello";
$pattern = '/hel{2,4}o/'; // "he", then two to four l's, then "o"
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => helllo [1] => hello )
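To compare the three forms a{3}, a{1,4}, and a{2,} side by side, here is a small sketch that tests each one against a few made-up words. It uses the ^ and $ anchors (covered in Lesson 11) so that each pattern has to match the whole word.

```php
<?php
$words = ['a', 'aa', 'aaa', 'aaaaa']; // hypothetical test words

$results = [];
foreach ($words as $word) {
    // preg_match returns 1 on a match and 0 otherwise.
    $results[$word] = [
        preg_match('/^a{3}$/', $word),   // exactly three
        preg_match('/^a{1,4}$/', $word), // one to four
        preg_match('/^a{2,}$/', $word),  // two or more
    ];
    echo $word . ': ' . implode(' ', $results[$word]) . "\n";
}
```

Each printed line shows the word followed by the three results in the order {3}, {1,4}, {2,}.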
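Lesson 8: Repeat zero or more
Sometimes you do not know in advance how many times a character repeats: one user may write a price as $10,000 and another as $10. The * quantifier matches zero or more of the preceding character, and + matches one or more.
$string = "aaaabcc \n aabbbbc \n aacc \n defff";
$pattern = '/aa+b*c+/'; // two or more a's, any number of b's, at least one c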
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => aaaabcc [1] => aabbbbc [2] => aacc )
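The price case mentioned above can be sketched with the same two quantifiers. The sample sentence and the pattern are only one possible approach, not the only way to match prices.

```php
<?php
// Single quotes keep the dollar signs literal in PHP.
$string = 'Prices: $10, $10,000 and $1,250,000'; // hypothetical sample

// A dollar sign, zero or more digits or commas, then one or more digits,
// so a trailing comma is never part of the match.
preg_match_all('/\$[0-9,]*[0-9]+/', $string, $matches);

print_r($matches[0]); // $10, $10,000, $1,250,000
```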
Lesson 9: Optional
The ? metacharacter makes the preceding character optional, so it matches either zero or one occurrence of it. For example, xy?z matches either xyz or xz because “y” is treated as optional.
$string = "1 hand player \n 2 hand player \n 3 hand player";
$pattern = '/\w hand? player/';
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => 1 hand player [1] => 2 hand player [2] => 3 hand player )
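The xy?z pattern from the text can be tried directly. In this sketch the input string is made up; the third word has two y's, so it should not match.

```php
<?php
$string = 'xyz xz xyyz'; // hypothetical sample input

// "x", an optional single "y", then "z".
preg_match_all('/xy?z/', $string, $matches);

print_r($matches[0]); // xyz and xz, but not xyyz
```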
Lesson 10: Whitespace
Whitespace includes the space, the tab (\t), the carriage return (\r), and the newline (\n). The \s shorthand matches any one of these whitespace characters.
$string = "1. xyz \n 2. xyz \n 3. xyz";
$pattern = '/\d\.\s+xyz/'; // a digit, a period, one or more whitespace characters, then "xyz"
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => 1. xyz [1] => 2. xyz [2] => 3. xyz )
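Because \s covers spaces, tabs, and newlines alike, it is handy for splitting text into words. A small sketch, using PHP's preg_split function and a made-up input:

```php
<?php
// Double quotes so that \t and \n become a real tab and a real newline.
$string = "a b\tc\nd"; // hypothetical sample input

// Split wherever one or more whitespace characters occur.
$parts = preg_split('/\s+/', $string);

print_r($parts); // a, b, c, d
```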
Lesson 11: Starts and ends
To match at the start and at the end of a line, use “^” and “$”. By default these anchors apply to the whole string, so the pattern below adds the m (multiline) modifier to test each line separately.
$string = "Mission: successful\nLast Mission: unsuccessful\nNext Mission: successful upon capture of target";
$pattern = '/^Mission: successful$/m';
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => Mission: successful )
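The effect of the m modifier on ^ is easy to check. The sketch below runs the same pattern with and without it on a made-up two-line string.

```php
<?php
$string = "first line\nsecond line"; // hypothetical sample input

// Without m: ^ matches only at the very start of the string.
preg_match_all('/^\w+/', $string, $whole);

// With m: ^ also matches right after each newline.
preg_match_all('/^\w+/m', $string, $perLine);

print_r($whole[0]);   // first
print_r($perLine[0]); // first, second
```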
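Lesson 12: Group capture
You can group part of a pattern with parentheses, ( and ), and each group captures the text it matches. For example, ^(IMG(\d+))\.png$ captures both the name of an image file and the number inside it. In the code below, $matches[0] holds the full matches and $matches[1] holds the captured file names without the extension.
$string = "file_a_registry_file.pdf \n file_today.pdf \n testfile.pdf.tmp";
$pattern = '/(\w+)\.pdf/'; // the dot is escaped so that it matches only a literal period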
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => file_a_registry_file.pdf [1] => file_today.pdf [2] => testfile.pdf )
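To read what the group itself captured, look at index 1 of the result instead of index 0. A minimal sketch with the same input as above:

```php
<?php
$string = "file_a_registry_file.pdf \n file_today.pdf \n testfile.pdf.tmp";

// Group 1 captures the file name in front of ".pdf".
preg_match_all('/(\w+)\.pdf/', $string, $matches);

print_r($matches[1]); // file_a_registry_file, file_today, testfile
```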
Lesson 13: Sub Group capture
You can nest one group inside another to extract several layers of information from a single match.
$string = "Hello 123 \n Hey 456 \n Hi 2015";
$pattern = '/(\w+ (\d+))/'; // outer group: word and number, inner group: number only
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => Hello 123 [1] => Hey 456 [2] => Hi 2015 )
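The layers become visible when you print each group separately. In this sketch, which uses the same three entries without the extra spaces, index 1 is the outer group and index 2 is the inner one.

```php
<?php
$string = "Hello 123\nHey 456\nHi 2015";

preg_match_all('/(\w+ (\d+))/', $string, $matches);

print_r($matches[1]); // outer group: Hello 123, Hey 456, Hi 2015
print_r($matches[2]); // inner group: 123, 456, 2015
```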
Lesson 14: More Group capture
To capture more than one group, place several pairs of parentheses side by side, as in the code below.
$string = "1024X768 \n 800X600 \n 480X320";
$pattern = '/(\d+)X(\d+)/'; // group 1: width, group 2: height
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => 1024X768 [1] => 800X600 [2] => 480X320 )
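When a pattern has several groups, it is often easier to read the results one match at a time. One way to do that, sketched here, is the PREG_SET_ORDER flag of preg_match_all, which returns one array per match.

```php
<?php
$string = "1024X768\n800X600\n480X320";

// PREG_SET_ORDER groups the results per match instead of per group.
preg_match_all('/(\d+)X(\d+)/', $string, $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    // $match[0] is the full match, $match[1] and $match[2] are the groups.
    echo "width={$match[1]} height={$match[2]}\n";
}
```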
Lesson 15: Match x or z
Inside a group, you can use | (logical OR) to list alternative sequences of characters.
$string = "I love toy \n I love boy \n I love joy";
$pattern = '/I love (toy|joy)/';
preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
Output:
Array ( [0] => I love toy [1] => I love joy )
Lesson 16: Other characters
Using \w you can match any alphanumeric character or underscore, using \D any non-digit character, using \S any non-whitespace character, and using \W any character that \w does not match.
The “.*” pattern matches any sequence of characters (other than newlines), so it can stand in for anything you do not need to describe precisely.
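To see which characters each shorthand picks up, the sketch below runs all four against one short made-up string that contains letters, an underscore, a digit, a space, and a symbol.

```php
<?php
$string = 'ab_1 !'; // hypothetical sample input

preg_match_all('/\w/', $string, $word);     // letters, digits, underscore
preg_match_all('/\D/', $string, $nonDigit); // everything except digits
preg_match_all('/\S/', $string, $nonSpace); // everything except whitespace
preg_match_all('/\W/', $string, $nonWord);  // everything \w does not match

echo implode('', $word[0]), "\n";     // ab_1
echo implode('', $nonDigit[0]), "\n"; // ab_ !
echo implode('', $nonSpace[0]), "\n"; // ab_1!
echo implode('', $nonWord[0]), "\n";  // a space followed by !
```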
Sample regular expression patterns:
foo : The string “foo”
^foo : “foo” at the start of a string
foo$ : “foo” at the end of a string
^foo$ : “foo” when it is the whole string
[abc] : a, b, or c
[a-z] : Any lowercase letter
[^A-Z] : Any character that is not an uppercase letter
(gif|jpg) : Matches either “gif” or “jpg”
[a-z]+ : One or more lowercase letters
[0-9\.\-] : Any digit, dot, or minus sign
^[a-zA-Z0-9_]{1,}$ : Any word of at least one letter, number or _
([wx])([yz]) : wy, wz, xy, or xz
[^A-Za-z0-9] : Any symbol (not a number or a letter)
([A-Z]{3}|[0-9]{4}) : Matches three uppercase letters or four digits
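A quick way to try patterns like these is to loop over a few pattern and subject pairs with preg_match. The subjects in this sketch are made up for illustration.

```php
<?php
// Each entry: [pattern, hypothetical test subject].
$checks = [
    ['/^foo$/', 'foo'],
    ['/^foo$/', 'foobar'],
    ['/(gif|jpg)/', 'photo.jpg'],
    ['/^[a-zA-Z0-9_]{1,}$/', 'user_42'],
    ['/^[a-zA-Z0-9_]{1,}$/', 'user 42'],
    ['/([A-Z]{3}|[0-9]{4})/', 'ABC'],
];

$results = [];
foreach ($checks as [$pattern, $subject]) {
    // preg_match returns 1 on a match and 0 otherwise.
    $matched = preg_match($pattern, $subject);
    $results[] = $matched;
    echo $pattern . ' on "' . $subject . '": ' . ($matched ? 'match' : 'no match') . "\n";
}
```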