How to add new patterns to protect additional services?

For BlockHosts versions 2.3 or newer:

Example pattern:

    "sshd-Invalid":
        r'{LOG_PREFIX{sshd}} (Invalid|Illegal) user .* from {HOST_IP}',

See comments in the configuration file /etc/blockhosts.cfg for more patterns. The short-forms {LOG_PREFIX{}} and {HOST_IP} are used to make the patterns easier to write.

A pattern must contain the {HOST_IP} keyword, which identifies the remote host's IP address.
A pattern may contain the {LOG_PREFIX{service_name}} keyword, which denotes the syslog/metalog/multilog prefix at the beginning of a log line.
service_name is a string giving the name of the service as it appears in the prefix of the log lines, for example sshd or vsftpd(pam_unix).
The log prefix does not include the trailing space, and always matches at the beginning of the line.

# Examples of LOG_PREFIX matches (the / delimiter is not part of match):
#   matched by {LOG_PREFIX{vsftpd(pam_unix)}}  :
#    /Dec 25 21:18:58 host vsftpd(pam_unix)[21989]:/
#   matched by {LOG_PREFIX{xinetd}}  :
#    /Oct 18 04:21:52 host xinetd[3316]:/
#   matched by {LOG_PREFIX{sshd}}  :
#    /Oct  4 12:04:50 host.example.com sshd[1110]: [ID 800047 auth.info]/
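
As a rough illustration of what the short-forms stand for, the sketch below substitutes hand-written regular expressions for {LOG_PREFIX{}} and {HOST_IP} and tries the example pattern on a log line. The stand-in expressions and the helper names log_prefix and expand are assumptions made for this example, not the expansions BlockHosts uses internally, and the sample log line is made up.

```python
import re

# Rough stand-in for {HOST_IP}; NOT the expression BlockHosts itself uses.
HOST_IP = r"(::ffff:)?(?P<host>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"


def log_prefix(service_name):
    """Approximate a syslog-style prefix for the given service name (hypothetical)."""
    return (
        r"^\w{3}\s+\d{1,2} \d\d:\d\d:\d\d \S+ "  # timestamp and host name
        + re.escape(service_name)                 # literal service name
        + r"(\[\d+\])?:"                          # optional [pid], then a colon
        + r"( \[ID \d+ \S+\])?"                   # optional "[ID n facility.level]" tag
    )


def expand(pattern):
    """Replace both short-forms in a pattern with the stand-in expressions."""
    pattern = re.sub(
        r"\{LOG_PREFIX\{(.*?)\}\}",
        lambda m: log_prefix(m.group(1)),
        pattern,
    )
    return pattern.replace("{HOST_IP}", HOST_IP)


pattern = expand(r"{LOG_PREFIX{sshd}} (Invalid|Illegal) user .* from {HOST_IP}")
line = "Oct  4 12:04:50 host.example.com sshd[1110]: Invalid user admin from 192.0.2.10"
match = re.search(pattern, line)
print(match.group("host"))  # 192.0.2.10
```

Note that the prefix stand-in stops before the space that follows the colon, which is why the example pattern has a space between the {LOG_PREFIX{sshd}} keyword and the rest of the expression.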

For BlockHosts versions 2.2 or older:

BlockHosts detects repeated probes into your system by matching regular expressions against lines in the system log files.

This list of regular expressions can be updated in the configuration file, which by default is /etc/blockhosts.cfg.

The name of the configuration variable is: ALL_REGEXS_STR
The value for ALL_REGEXS_STR is a Python dictionary. Each key is a string that labels the regular expression (choose any unique string), and each value is the regular expression string.

Each regular expression must contain a named group, written (?P<host>...), that captures the IP address. The "host" group is essential, and required in every pattern.

In some rare cases, the process id also needs to be matched, in a group named (?P<pid>...). The process id needs to be identified when the pattern may match several lines in the system log files for one probe; adding a process-id match prevents a single probe from a rogue host being counted two, three, or more times. This is just a nicety, though. There is no harm in multiple counting: the only effect is that a host may be blocked even when the actual number of attacks is less than the COUNT_THRESHOLD configuration value. Most services do not need this, so it is safe to leave it out.
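
The effect of the pid group can be sketched in a few lines of Python. The counting logic below is an assumption made for illustration, not BlockHosts' own code; count_probes is a hypothetical helper, the log lines are made up, and the expression is the SSHD-Invalid pattern from this section.

```python
import re
from collections import Counter

# The SSHD-Invalid expression from this section.
PATTERN = re.compile(
    r"""sshd\[(?P<pid>\d+)\]:.*?(Invalid|Illegal) user .*? from (::ffff:)?(?P<host>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"""
)

# Made-up log lines: the first two come from the same sshd process (pid 1110).
LINES = [
    "Oct  4 12:04:50 host sshd[1110]: Invalid user admin from 192.0.2.10",
    "Oct  4 12:04:50 host sshd[1110]: Illegal user admin from ::ffff:192.0.2.10",
    "Oct  4 12:05:02 host sshd[1125]: Invalid user test from 192.0.2.10",
]


def count_probes(lines, use_pid):
    """Count matching lines per host; with use_pid, count each (host, pid) pair once."""
    seen = set()
    counts = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m is None:
            continue
        host = m.group("host")
        if use_pid:
            key = (host, m.group("pid"))
            if key in seen:
                continue  # same process already counted for this host
            seen.add(key)
        counts[host] += 1
    return counts


print(count_probes(LINES, use_pid=False)["192.0.2.10"])  # 3
print(count_probes(LINES, use_pid=True)["192.0.2.10"])   # 2
```

Without the pid, the host reaches any given threshold sooner; with it, two lines logged by one process count as a single probe.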

Here is the most widely used pattern, which detects SSH attacks:

    "SSHD-Invalid": r"""sshd\[(?P<pid>\d+)\]:.*?(Invalid|Illegal) user .*? from (::ffff:)?(?P<host>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})""",

Follow the instructions in /etc/blockhosts.cfg to make sure the Python syntax is correct; editing mistakes in this file will cause blockhosts.py to fail to start. Always test by running blockhosts.py --dry-run --verbose after editing blockhosts.cfg.

Pattern Testing

Regular expressions are not easy to write, so it is important to test each one by matching it against a real line from the log. A very good tool for this is kodos, The Python Regular Expression Debugger. Install kodos, and then run it. Enter your test regular expression in the top window and the log line in the second window, named "Search String". The results window at the bottom should show the groups that are matched: there must be a group named "host" for blockhosts.py to work, and this "host" group should display the IP (IPv4) address to be blocked.
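
If kodos is not available, the same check can be scripted with Python's standard re module. This is a minimal sketch, not part of BlockHosts: check_pattern is a hypothetical helper, the log line is made up, and the use of re.search (rather than whatever matching call blockhosts.py makes) is an assumption.

```python
import re


def check_pattern(regex, log_line):
    """Return the named groups matched on log_line, or raise ValueError."""
    compiled = re.compile(regex)  # raises re.error if the syntax is invalid
    if "host" not in compiled.groupindex:
        raise ValueError('pattern has no group named "host"')
    match = compiled.search(log_line)
    if match is None:
        raise ValueError("pattern does not match the log line")
    return match.groupdict()


# Try the SSHD-Invalid expression on a made-up log line.
SSHD_INVALID = r"""sshd\[(?P<pid>\d+)\]:.*?(Invalid|Illegal) user .*? from (::ffff:)?(?P<host>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"""
SAMPLE = "Oct  4 12:04:50 host sshd[1110]: Invalid user admin from 192.0.2.10"

groups = check_pattern(SSHD_INVALID, SAMPLE)
print(groups["host"])  # 192.0.2.10
```

A pattern that lacks the "host" group, or that does not match the sample line, raises ValueError, so mistakes show up before the pattern goes into blockhosts.cfg.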