Now that we've got all that out of the way, let's take a closer look at
some examples of how regular expressions are used in Perl, PHP and JavaScript.
In Perl, for example, you can perform some pretty advanced pattern matching
using both the rules you've already learnt, and some Perl-specific
additions.
A pattern-matching command in Perl usually looks like this:
operator / regular-expression / string-to-replace / modifiers
Let's take a closer look at each of these
components.
The operator can either be an "m" or an "s", depending on the
purpose of the regular expression - "m" is used for "match" operations only,
while "s" is used for "substitution" operations. When the delimiter is a
forward slash, the "m" is optional, so /abc/ means the same as m/abc/.
The regular expression
is the pattern that is to be matched. This pattern can be constructed using a
variety of characters, meta-characters and pattern anchors.
The string to
replace is...well, the string that replaces whatever the pattern matched in a
find-and-replace operation; it's only used with the "s" operator.
Yeah, every once in a while, we slip you an easy one.
Finally, the
modifiers are used to control the manner in which a particular regex is applied.
There are a whole bunch of modifiers, some of them with pretty exotic names;
unfortunately, none of them are single, or interested in going out to dinner
with you.
So, the statement
s/love/lust/
would replace the first occurrence of the word "love" with
"lust". And if you wanted to perform a global search-and-replace operation,
you'd use the "g" modifier, like this:
s/love/lust/g
And they say romance is dead!
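If you'd like to see the difference between the two in action, here's a minimal sketch of the same pair of substitutions in JavaScript (one of the other languages covered in this article), using the built-in replace() method. The sample sentence is made up for illustration, and the variable names are ours.

```javascript
// Sample text, invented for this illustration.
const text = "love me, love my dog";

// Without the "g" flag, replace() stops after the first match,
// much like Perl's s/love/lust/.
const firstOnly = text.replace(/love/, "lust");

// With the "g" flag, every occurrence is replaced,
// much like Perl's s/love/lust/g.
const everywhere = text.replace(/love/g, "lust");

console.log(firstOnly);  // lust me, love my dog
console.log(everywhere); // lust me, lust my dog
```

Note that replace() returns a new string and leaves the original untouched, whereas Perl's "s" operator changes the variable it is applied to.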
You can also use
case-insensitive pattern matching - simply add the "i" modifier, as in the
following example, and watch in awe as Perl matches "jewel", "Jewel" and
"JEWEL".
m/JewEL/i
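The "i" modifier behaves the same way in JavaScript, so a quick sketch there shows what the pattern above does and doesn't match. The list of sample words is our own, chosen to include a couple of edge cases.

```javascript
// Same pattern as the Perl example, with the "i" (ignore case) flag.
const jewelPattern = /JewEL/i;

// Sample words, invented for this illustration.
const samples = ["jewel", "Jewel", "JEWEL", "jewels", "joule"];

// test() returns true when the pattern is found anywhere in the string.
const results = samples.map((word) => jewelPattern.test(word));

samples.forEach((word, i) => console.log(word, "->", results[i]));
// jewel -> true
// Jewel -> true
// JEWEL -> true
// jewels -> true   (the pattern isn't anchored, so it matches inside longer words)
// joule -> false
```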
In Perl, a regular expression is usually applied to a variable
via the binding operator, represented by =~ (it binds a pattern to a string, and is not an equality test); this is used as follows.
$flag =~ m/abc/
returns true if $flag contains "abc"
$flag =~ s/abc/ABC/
replaces the first "abc" in the variable $flag with "ABC".
And here's an
example of a simple Perl program which asks for your email address, and compares
it with a regex to verify whether or not it's in the correct format.
#!/usr/bin/perl
# get input
print "So what's your email address, anyway?\n";
$email = <STDIN>;
chomp($email);
# match and display result
if($email =~ /^([a-zA-Z0-9_-])+@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+/)
{
print("Ummmmm....that sounds good!\n");
}
else
{
print("Hey - who do you think you're kidding?\n");
}
As you can see, the most important part of this program is
the regular expression - it's been dissected below:
^([a-zA-Z0-9_-])+@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+
The first part
^([a-zA-Z0-9_-])+
matches the username part of the email address - this must begin at the
start of the string, and can be any combination of one or more letters,
numbers, underscores and hyphens.
This is
followed by an @ symbol, which is followed by the domain part of the address;
this could again include letters or numbers, and uses a period as a delimiter -
note our usage of an "escaped" period and the "+" meta-character to represent
these conditions in the second half of the expression:
([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+
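One way to get a feel for the pieces dissected above is to assemble the pattern from named parts and feed it addresses that are each missing one part. Here's a rough sketch in JavaScript using the RegExp() constructor; the variable names and the sample addresses are ours, not part of the original script.

```javascript
// The three pieces of the pattern, as strings.
// Inside a string literal the backslash itself must be escaped ("\\.").
const username = "^([a-zA-Z0-9_-])+";   // one or more allowed characters, from the start
const domain = "([a-zA-Z0-9_-])+";      // one or more allowed characters after the "@"
const suffix = "(\\.[a-zA-Z0-9_-])+";   // a period plus one allowed character, one or more times

// Glue the pieces back together around the "@" symbol.
const assembled = new RegExp(username + "@" + domain + suffix);

console.log(assembled.test("john@example.com")); // true  (all three parts present)
console.log(assembled.test("@example.com"));     // false (no username)
console.log(assembled.test("john@.com"));        // false (nothing between "@" and the period)
console.log(assembled.test("john@example"));     // false (no period part)
```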
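Obviously, this is simply an illustrative example - if you're
planning to use it on your Web site, you need to refine it a bit. For example,
the script above won't accept email addresses of the form
firstname.lastname@somedomain.com, because a period isn't allowed in the
username part - although such addresses are also pretty common on the Web. And
since the pattern isn't anchored at the end with a "$", whatever follows a
valid-looking beginning goes unchecked. You have been warned!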
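To see that limitation for yourself, and to get an idea of what a refinement might look like, here's a small JavaScript sketch that runs a few addresses through the original pattern and through one possible tightened version. The "refined" pattern is purely our own assumption about what a fix could look like; it's still a long way from a complete email validator.

```javascript
// The pattern from the script above, unchanged.
const original = /^([a-zA-Z0-9_-])+@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+/;

// One possible refinement (an assumption, not the author's pattern):
//  - allow periods inside the username,
//  - let each ".label" hold more than one character,
//  - anchor the end with $ so that trailing junk is rejected.
const refined = /^[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)*@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+$/;

// Sample addresses, invented for this illustration.
const addresses = [
  "john@example.com",                  // original: true,  refined: true
  "firstname.lastname@somedomain.com", // original: false, refined: true
  "john@example.com!!!",               // original: true,  refined: false
  "john@example",                      // original: false, refined: false
];

for (const address of addresses) {
  console.log(address, original.test(address), refined.test(address));
}
```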
If you prefer PHP to Perl, you
can use the ereg() function for pattern matching operations (note that ereg()
was deprecated in PHP 5.3 and removed in PHP 7, where preg_match() takes its
place); this usually takes the format
ereg(pattern, string)
where "pattern" is the pattern to be matched, and "string" is
the character string to be searched for the pattern. The next example should
illustrate this a little more clearly:
<?php
if (ereg("^([a-zA-Z0-9_-])+@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+",$email))
{
echo "Ummmmm....that sounds good!";
}
else
{
echo "Hey - who do you think you're kidding?";
}
?>
And finally, JavaScript. JavaScript 1.2 comes with a powerful
RegExp() object, which can be used to match patterns in strings and variables.
The important thing here is the test() method, which searches for a pattern in a
string or variable, and returns either true or false - it's illustrated in the
example below.
<html>
<head>
<script language="Javascript1.2">
<!-- start hiding
function verifyAddress(obj)
{
// obtain form value into variable
var email = obj.email.value;
// define regex
var pattern = /^([a-zA-Z0-9_-])+@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+/;
// test for pattern
var flag = pattern.test(email);
if(flag)
{
alert("Ummmmm....that sounds good!");
return true;
}
else
{
alert("Hey - who do you think you're kidding?");
return false;
}
}
// stop hiding -->
</script>
</head>
<body>
<form onSubmit="return verifyAddress(this);">
<input name="email" type="text">
<input type="submit">
</form>
</body>
</html>
Obviously, there's a whole lot more that you can do with
regular expressions - checking email addresses is just the tip of the iceberg.
You can use regular expressions to validate phone numbers, currency figures, Web
site URLs, and a whole lot more - all you need is a little bit of creativity and
patience, a few slices of leftover pizza...and a therapist who
cares.
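As a starting point for that sort of thing, here's a short JavaScript sketch with two more validators. Both formats are hypothetical and chosen only to keep the patterns short: a phone number written as three digits, a hyphen and four digits, and a currency figure written as a dollar sign, some digits and exactly two decimal places. Real-world formats vary a lot, so treat these as templates rather than finished patterns.

```javascript
// Hypothetical formats, chosen for illustration only:
//  - phone:    three digits, a hyphen, four digits (e.g. "555-1234")
//  - currency: "$", one or more digits, a period, two digits (e.g. "$19.99")
const phonePattern = /^\d{3}-\d{4}$/;
const currencyPattern = /^\$\d+\.\d{2}$/;

console.log(phonePattern.test("555-1234"));    // true
console.log(phonePattern.test("5551234"));     // false (hyphen missing)
console.log(currencyPattern.test("$19.99"));   // true
console.log(currencyPattern.test("19.9"));     // false (no "$", one decimal place)
```

Both patterns are anchored with "^" and "$", so the whole string has to fit the format, not just a part of it.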
Note: All program code and examples in this article have been
tested on Linux 2.2.13/i386 with Perl 5.004, PHP 3.0.9 and JavaScript
1.2.