C613-22104-00 REV B How to Use URL Filtering | Page 46
Configuring URL filtering Advanced Network Protection
Details of the content of custom lists
A custom list is an ASCII-formatted text file containing zero or more single-line pattern matches. So
far, we have looked at the general syntax of the entries in these files. Here we look in more detail at
the rules governing the content of these files:
■ There is no ordering or precedence for patterns in the file.
■ Spaces are not allowed in a pattern.
■ The asterisk wildcard '*' can be used in a pattern to match zero or more characters.
■ If a pattern contains no '/' or '*' characters, then all content of the domain is blocked.
■ "Match everything" patterns (e.g. '*' or '*/*') are not allowed.
■ Empty lines and comment lines (starting with '#' or ';') are ignored.
■ The ‘www.’ prefix should not be included in the pattern. However, patterns and URLs are
normalized before matching. More specifically:
    ■ The ‘www.’ prefix and the authentication prefix ‘login:<password>@’ that may be prepended
    to a URL are automatically stripped from the URL before pattern matching.
    ■ Patterns are converted to lower case.
■ Only the domain name should be specified for blocking HTTPS traffic, because the TLS SNI
contains only the domain name for the HTTPS request.
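The file-level rules above can be illustrated with a short Python sketch. This is not the device's implementation. The function name parse_custom_list is hypothetical, and the sketch assumes that whitespace around a line is tolerated, that any pattern made up only of '*' and '/' counts as "match everything", and that invalid lines are reported rather than silently dropped. None of these behaviors is stated in this guide.

```python
def parse_custom_list(text):
    """Return (patterns, rejected) for the contents of a custom list file.

    Illustrative sketch only; how the device reports invalid lines
    is not described in the guide.
    """
    patterns, rejected = [], []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()  # assumption: surrounding whitespace is tolerated
        # Empty lines and comment lines ('#' or ';') are ignored.
        if not line or line[0] in "#;":
            continue
        # Spaces in the pattern are not allowed.
        if " " in line:
            rejected.append((line_no, "contains a space"))
            continue
        # "Match everything" patterns such as '*' or '*/*' are not allowed.
        # Assumption: anything built only from '*' and '/' falls in this class.
        if line.replace("*", "") in ("", "/"):
            rejected.append((line_no, "matches everything"))
            continue
        # Patterns are converted to lower case.
        patterns.append(line.lower())
    return patterns, rejected


sample = "# blocked sites\n\n; legacy entries\nMySite.com\n*/*\nbad pattern.com\n*news.example/\n"
patterns, rejected = parse_custom_list(sample)
print(patterns)
print(rejected)
```

Because the file has no ordering or precedence, the sketch returns the accepted patterns as a plain list and makes no attempt to rank them.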
The table below describes how the pattern *mysite.com/ is matched (Blocked URLs) or not
matched (Non-blocked URLs) for a blacklist.
Table 1: A pattern matching example with explanations.
The following table lists a series of blacklisted ‘domain and string pattern’ match criteria, and
examples of URLs that would or would not be matched by these criteria.
THIS PATTERN      BLOCKS THE URLS       NON-BLOCKED URLS
*mysite.com/      mysub.mysite.com      mysub.mysite.com/mypage
                  www.mysite.com

Pattern matching explanations

mysub.mysite.com is a match (and is therefore blocked) because:
■ The asterisk wildcard ‘*’ matches the prepended text ‘mysub.’ in the URL, and the remaining
text in the URL matches the pattern.

www.mysite.com is a match (and is therefore blocked) because:
■ The ‘www.’ prefix is stripped off prior to matching, the wildcard matches zero characters, and
the remaining text in the URL matches the pattern.

mysub.mysite.com/mypage is not a match (and is therefore non-blocked) because:
■ The text ‘mypage’ in the URL is not part of the pattern.
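
The matching behavior in this example can be reproduced with a minimal Python sketch. It is an illustration under stated assumptions, not the device's algorithm. The names normalize_url and is_blocked are hypothetical. The sketch assumes that a scheme such as "http://" is dropped, that the host part of the URL is lower-cased, that a URL without a path is treated as ending in '/', and that a pattern without a '/' is compared against the host only. The guide confirms only the stripping of the ‘www.’ and authentication prefixes, the lower-casing of patterns, and the zero-or-more meaning of '*'.

```python
import re


def normalize_url(url):
    """Reduce a URL to 'host/path' form before matching (sketch)."""
    # Assumption: a scheme such as "http://" is dropped.
    url = re.sub(r"^[a-z][a-z0-9+.-]*://", "", url, flags=re.IGNORECASE)
    host, _, path = url.partition("/")
    # Strip an authentication prefix ("login:password@"), if present.
    host = host.rpartition("@")[2].lower()  # lower-casing is an assumption
    # Strip the "www." prefix.
    if host.startswith("www."):
        host = host[4:]
    # Assumption: a URL without a path is treated as ending in "/".
    return host + "/" + path


def is_blocked(pattern, url):
    """Return True if the normalized URL matches the blacklist pattern."""
    pattern = pattern.lower()  # patterns are converted to lower case
    target = normalize_url(url)
    if "/" not in pattern:
        # No '/' in the pattern: compare against the host only, so that
        # all content of the domain is blocked.
        target = target.partition("/")[0]
    # '*' matches zero or more characters; everything else is literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, target) is not None


for url in ("mysub.mysite.com", "www.mysite.com", "mysub.mysite.com/mypage"):
    print(url, is_blocked("*mysite.com/", url))
```

With the pattern *mysite.com/, the first two URLs are reported as blocked and the third as not blocked, in line with the table above. Whether a domain-only pattern such as mysite.com also covers subdomains is not stated in this guide; the sketch requires an exact host match in that case.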