FortiGate Version 3.0 MR4 Administration Guide
394 01-30004-0203-20070102
Using Perl regular expressions Antispam
• forti*.com matches fortiiii.com but does not match fortinet.com
To match any character 0 or more times, use ‘.*’, where ‘.’ means any character
and ‘*’ means 0 or more times. For example, the wildcard match pattern
forti*.com should be written as the regular expression forti.*\.com.
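
The difference between the wildcard reading and the regular expression reading can be checked with a short sketch. It uses Python's re module as a stand-in, on the assumption that it treats these constructs the same way as the Perl-style engine described here; it does not reproduce the FortiGate unit's own filtering. The regular expression form is written as forti.*\.com so that the literal prefix "forti" is kept.

```python
import re

# Read as a regular expression, 'i*' means "zero or more i" and the
# unescaped '.' means "any single character".
as_regex = re.compile(r"forti*.com")

# Regular expression form of the wildcard idea "forti, anything, then .com".
wildcard_equivalent = re.compile(r"forti.*\.com")

results = {
    "regex_fortiiii": bool(as_regex.search("fortiiii.com")),
    "regex_fortinet": bool(as_regex.search("fortinet.com")),
    "equiv_fortiiii": bool(wildcard_equivalent.search("fortiiii.com")),
    "equiv_fortinet": bool(wildcard_equivalent.search("fortinet.com")),
}

for name, matched in results.items():
    print(name, matched)
```

Only the second pattern matches fortinet.com, because ‘.*’ absorbs the characters between "forti" and the escaped dot.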
Word boundary
In Perl regular expressions, the pattern does not have an implicit word boundary.
For example, the regular expression “test” not only matches the word “test” but
also any word that contains “test” such as “atest”, “mytest”, “testimony”, “atestb”.
The notation “\b” specifies the word boundary. To match exactly the word “test”,
the expression should be \btest\b.
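
As an illustration of the word boundary behavior, the following sketch runs both expressions against the sample words above. It uses Python's re module, assuming its \b handling matches the Perl-style behavior described here; the extra phrase "a test case" is a hypothetical sample added for the example.

```python
import re

words = ["test", "atest", "mytest", "testimony", "atestb"]

# Without a boundary, the pattern matches anywhere inside a word.
plain_hits = [w for w in words if re.search(r"test", w)]

# With \b on both sides, only the standalone word matches.
bounded_hits = [w for w in words if re.search(r"\btest\b", w)]

# A standalone word inside a longer phrase still matches.
phrase_hit = bool(re.search(r"\btest\b", "a test case"))

print(plain_hits)
print(bounded_hits)
print(phrase_hit)
```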
Case sensitivity
Regular expression pattern matching is case sensitive in the web and antispam
filters. To make a word or phrase case insensitive, append the /i modifier to
the regular expression. For example, /bad language/i blocks all instances of
“bad language”, regardless of case.
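
The effect of the case-insensitive modifier can be sketched as follows. Python has no /pattern/i literal syntax, so this example assumes the re.IGNORECASE flag as the equivalent of /i; the sample strings are hypothetical.

```python
import re

samples = ["bad language", "Bad Language", "BAD LANGUAGE"]

# Default matching is case sensitive.
case_sensitive = re.compile(r"bad language")

# re.IGNORECASE plays the role of the /i modifier.
case_insensitive = re.compile(r"bad language", re.IGNORECASE)

sensitive_hits = [s for s in samples if case_sensitive.search(s)]
insensitive_hits = [s for s in samples if case_insensitive.search(s)]

print(sensitive_hits)
print(insensitive_hits)
```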
Perl regular expression formats
Table 42 lists and describes some example Perl regular expression formats.
Table 42: Perl regular expression formats
Expression   Matches
abc          “abc” (the exact character sequence, but anywhere in the string)
^abc         “abc” at the beginning of the string
abc$         “abc” at the end of the string
a|b          Either of “a” and “b”
^abc|abc$    The string “abc” at the beginning or at the end of the string
ab{2,4}c     “a” followed by two, three or four “b”s followed by a “c”
ab{2,}c      “a” followed by at least two “b”s followed by a “c”
ab*c         “a” followed by any number (zero or more) of “b”s followed by a “c”
ab+c         “a” followed by one or more “b”s followed by a “c”
ab?c         “a” followed by an optional “b” followed by a “c”; that is, either “abc” or “ac”
a.c          “a” followed by any single character (not newline) followed by a “c”
a\.c         “a.c” exactly
[abc]        Any one of “a”, “b” and “c”
[Aa]bc       Either of “Abc” and “abc”
[abc]+       Any (nonempty) string of “a”s, “b”s and “c”s (such as “a”, “abba”, “acbabcacaa”)
[^abc]+      Any (nonempty) string that does not contain any of “a”, “b”, and “c” (such as “defg”)
\d\d         Any two decimal digits, such as 42; same as \d{2}
/i           Makes the pattern case insensitive. For example, /bad language/i blocks any instance of “bad language” regardless of case.
\w+          A “word”: a nonempty sequence of alphanumeric characters and underscores (low lines), such as foo, 12bar8, and foo_1
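
Several rows of Table 42 can be spot-checked with a small table-driven sketch. It uses Python's re module, assuming these constructs behave as in the Perl-style engine described here. The sample strings are hypothetical, and the rows that describe a whole string ([abc]+, [^abc]+, \d\d, \w+) are tested with re.fullmatch so the entire sample must fit the description.

```python
import re

# (expression, samples that should match, samples that should not)
search_cases = [
    (r"^abc", ["abcdef"], ["xabc"]),
    (r"abc$", ["xyzabc"], ["abcx"]),
    (r"a|b", ["a", "b", "cab"], ["c"]),
    (r"^abc|abc$", ["abcx", "xabc"], ["xabcx"]),
    (r"ab{2,4}c", ["abbc", "abbbbc"], ["abc", "abbbbbc"]),
    (r"ab{2,}c", ["abbc", "abbbbbbc"], ["abc"]),
    (r"ab*c", ["ac", "abbbc"], ["adc"]),
    (r"ab+c", ["abc", "abbc"], ["ac"]),
    (r"ab?c", ["abc", "ac"], ["abbc"]),
    (r"a.c", ["abc", "a.c"], ["ac", "a\nc"]),
    (r"a\.c", ["a.c"], ["abc"]),
    (r"[Aa]bc", ["Abc", "abc"], ["ABC"]),
]

# Rows that describe an entire string are checked against the whole sample.
fullmatch_cases = [
    (r"[abc]+", ["a", "abba", "acbabcacaa"], ["abd", ""]),
    (r"[^abc]+", ["defg"], ["defa"]),
    (r"\d\d", ["42"], ["4", "4a"]),
    (r"\w+", ["foo", "12bar8", "foo_1"], ["foo bar", ""]),
]


def check(cases, matcher):
    """Return a list of (pattern, sample, problem) for every wrong result."""
    failures = []
    for pattern, should_match, should_not_match in cases:
        for text in should_match:
            if not matcher(pattern, text):
                failures.append((pattern, text, "expected a match"))
        for text in should_not_match:
            if matcher(pattern, text):
                failures.append((pattern, text, "expected no match"))
    return failures


failures = check(search_cases, re.search) + check(fullmatch_cases, re.fullmatch)
print("failures:", failures)
```

An empty failures list means every sample behaved as the table describes under these assumptions.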