This document explains how to set up a Bro Cluster, i.e., a set of commodity PCs jointly analyzing the traffic of a network link. See this RAID paper for more information about the general architecture.
A cluster consists of four types of components:
One or more frontends. Frontends load-balance the traffic across a set of worker machines.
Worker nodes. Workers do the actual analysis, with each seeing a slice of the overall traffic as split up by the frontends.
One or more proxies. Proxies relay the communication between worker nodes.
One manager. The manager provides the cluster's user-interface for controlling and logging. During operation, the user only interacts with the manager.
This document focuses on the installation of the manager, the workers, and the proxies. See <somewhere different> for a discussion of how to set up a frontend. We also assume a general idea of how to configure Bro in a traditional stand-alone setup.
In this document the terms "manager", "worker", and "proxy" each refer to one Bro instance, not to physical machines. There may be multiple such instances running on the same host. For example, it's typical to run a proxy on the same host as the manager.
This documentation assumes that all cluster systems
are running a Unix. FreeBSD, Linux, and MacOS are supposed to work, though FreeBSD has seen the most testing. Other Unix systems will quite likely require some tweaking. Note that all systems must be running the same operating system.
have Python >= 2.4 installed.
have ssh and rdist installed.
have a user account set up on all nodes which can run Bro and has monitoring access to the network interface. ssh access from the manager to this account must be set up on all machines to work without asking for a password/passphrase.
have some storage available for the cluster on the local disks. In the following we will use /data/cluster as the base path for this. The Bro user must be able to either create this directory or, where it already exists, must have write permission inside this directory on all nodes.
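The prerequisites above can be verified before starting the installation. The following is a minimal sketch, not part of the cluster distribution: all function names are hypothetical, and the ssh check assumes an OpenSSH-style client that supports the BatchMode and ConnectTimeout options.

```shell
#!/bin/sh
# Sketch of a prerequisite check. All function names are hypothetical.

# Succeeds if the named tool is found in PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Prints every tool from the argument list that is missing locally.
missing_tools() {
    for tool in "$@"; do
        have_tool "$tool" || printf '%s\n' "$tool"
    done
}

# Succeeds if ssh can log in to host $1 without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password
# (assumes an OpenSSH-style client).
ssh_works() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Succeeds if directory $1 exists and is writable, or if it does not
# exist yet and can be created (in which case it is created).
base_writable() {
    if [ -d "$1" ]; then
        [ -w "$1" ]
    else
        mkdir -p "$1"
    fi
}
```

On the manager, one could then run missing_tools python ssh rdist and base_writable /data/cluster, and call ssh_works once for each remote host. Any tool name printed, or any failing call, points at a prerequisite that still needs attention.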
In the following, as an example setup, we will assume that the cluster consists of four machines (not counting the frontend). The host names of the systems are host1, host2, host3, and host4. We will configure the cluster so that host1 runs the manager and the (only) proxy, and host{2,3,4} are each running one worker. This is a typical setup which will work well for many sites.
Get the current version of Bro with the cluster shell. Note that at the moment the cluster framework is still under development and requires you to use Robin's development branch. This is made available for testing only and not suitable for production use.
Configure and compile Bro (do not do make install)
> svn checkout http://svn.icir.org/bro/branches/robin/work
> cd work
> ./autogen.sh && ./configure && make
Change into the cluster's distribution directory:
> cd aux/cluster
Configure the cluster installation with the cluster's base path as --prefix (/data/cluster in our example, as discussed above):
> ./configure --prefix=<prefix>
If your system is set up to compile Python extension modules, build Broccoli's Python module (otherwise some functionality will be disabled). If you're unsure, just try it:
> make pybroccoli
Install the cluster files on the manager:
> make install
Add <prefix>/bin to your PATH.
Create a cluster configuration file. There is an example which you can edit according to the contained instructions:
> cd <prefix>
> cp etc/cluster.cfg.example etc/cluster.cfg
> vi etc/cluster.cfg
Create a node configuration file to define where manager, workers, and proxies are to run. There is an example which defines the example scenario described above and can be edited as needed:
> cd <prefix>
> cp etc/node.cfg.example etc/node.cfg
> vi etc/node.cfg
Create a network configuration file which lists all of the networks which the cluster should consider local to the monitored environment. Again there is an example which you can use as a template:
> cd <prefix>
> cp etc/networks.cfg.example etc/networks.cfg
> vi etc/networks.cfg
Install workers and proxies using the cluster shell:
> cluster install
This install process uses ssh and rdist to copy the configuration over to the remote machines so, as described above, you need to ensure that these services work before the install will succeed.
Some tasks need to be run on a regular basis. Insert a line like this into the crontab of the user running the cluster:
0-59/5 * * * * <prefix>/bin/cluster cron
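Two of the manual steps above, extending PATH and adding the cron entry, can also be scripted. This is only a sketch: the helper names are hypothetical, /data/cluster is the example base path used in this document, and install_cron_line assumes a crontab implementation that reads the new table from standard input when given "-".

```shell
#!/bin/sh
# Hypothetical helper names; /data/cluster is the example base path.
PREFIX=/data/cluster

# Put the cluster script on the PATH for this shell session.
PATH="$PREFIX/bin:$PATH"
export PATH

# Prints the crontab line that runs "cluster cron" every five minutes
# for the installation under base path $1.
cron_line() {
    printf '%s\n' "0-59/5 * * * * $1/bin/cluster cron"
}

# Appends that line to the current user's crontab unless an identical
# line is already present. Assumes "crontab -" reads from stdin.
install_cron_line() {
    line=$(cron_line "$1")
    current=$(crontab -l 2>/dev/null)
    if printf '%s\n' "$current" | grep -qxF -- "$line"; then
        return 0
    fi
    {
        [ -n "$current" ] && printf '%s\n' "$current"
        printf '%s\n' "$line"
    } | crontab -
}
```

Calling install_cron_line "$PREFIX" as the user running the cluster adds the entry once. Running it again leaves the crontab unchanged.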
The shell is an interactive interface to the cluster which allows you to, e.g., start/stop the cluster nodes or update their configuration. The shell is started with the cluster script and then expects commands on its command-line:
> cluster
Welcome to BroCluster 0.1
Type "help" for help.
[BroCluster] >
As the message says, type help to see a list of all commands. We will now briefly summarize the most important commands. A full reference follows below.
Once cluster.cfg and node.cfg are set up as described above, the cluster can be started with the start command. This will successively start manager, proxies, and workers. The status command should then show all nodes as operating. To stop the cluster again, issue the stop command. exit leaves the shell.
On the manager system, you find the current set of (aggregated) cluster logs in spool/manager/. Similarly, the proxies and workers log into spool/proxy/ and spool/<worker-name>/, respectively. The manager's logs are archived in logs/, by default once a day. Log files of workers and proxies are discarded at the same rotation interval.
Whenever the cluster configuration is modified in any way (including changes to custom or provided policy files and new versions of the cluster environment), install installs the new version. No changes will take effect until install is run. Before you run install, check can be used to check for any potential errors in the new configuration, e.g., typos in scripts. If check does not report any problems, install is unlikely to break anything.
Note that generally configuration changes only take effect after a restart of the affected cluster nodes. The restart command triggers this. Some changes, however, can be put into effect on the fly without restarting any of the nodes by using the update command (again after doing install first). Such dynamic updates work with all changes done via the analysis command (see below) as well as generally with all policy changes which only modify global variables declared as redefinable (i.e., with Bro's &redef attribute).
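The sequence of check, install, and then restart or update can be wrapped in a small shell function. This is only a sketch: deploy_config is a hypothetical name, and the function assumes that the cluster script accepts a shell command as its argument (as in cluster install) and exits with a non-zero status when that command fails, which this document does not confirm.

```shell
#!/bin/sh
# Assumption: "cluster <command>" exits non-zero when the command fails.

# Checks and installs a changed configuration, then applies it.
# $1 selects how to apply it: "restart" (default) or "update".
deploy_config() {
    mode=${1:-restart}
    case "$mode" in
        restart|update) ;;
        *) echo "usage: deploy_config [restart|update]" >&2; return 2 ;;
    esac
    cluster check   || return 1   # catch e.g. typos in scripts first
    cluster install || return 1   # nothing takes effect before install
    cluster "$mode"
}
```

Use deploy_config update for changes that can be applied dynamically, and plain deploy_config for everything else. If check reports a problem, nothing is installed and the running nodes are left untouched.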
Generally, site-specific tuning needs to be done with local policy scripts, as in a single-Bro setup. This is described below. Some general types of analysis can, however, be enabled/disabled via the shell's analysis command.
The shell provides various options to control the behaviour of the cluster. These options can be set by editing etc/cluster.cfg. The config command gives a list of all options with their current values. A list of the most important options also follows below.
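Instead of editing etc/cluster.cfg by hand, an option can be changed from a script. The helper below is a hypothetical sketch. It assumes that options are stored as Key=Value lines, one per line and without spaces around the equals sign, which matches the way options such as MailAlarms=0 are written in this document but is not a confirmed description of the file format.

```shell
#!/bin/sh
# Assumption: options are stored as Key=Value lines, one per line.

# Sets option $2 to value $3 in config file $1. Any existing
# assignment of the option is dropped and the new assignment is
# appended at the end of the file.
set_option() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    # Keep every line except existing assignments of this key.
    # grep exits non-zero when it prints nothing, which is fine here.
    grep -v "^$key=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back through the original file to keep its permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, set_option <prefix>/etc/cluster.cfg MailAlarms 0 switches off the alarm mails. As with any configuration change, run install afterwards so that the change takes effect.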
As with a stand-alone setup, you'll likely want to adapt the Bro policy to the local environment. While some types of analysis can be customized via the analysis command, much of the more specific tuning requires writing local policy files.
By default, it is assumed that you put site-specific policy scripts into the policy/local sub-directory inside the manager's base path. To change the location of site policies, set the option SitePolicyPath in cluster.cfg to a different path.
During the first install, sample policy scripts are installed in policy/local which you can edit as appropriate: local-manager.bro and local-worker.bro are loaded by the manager and the workers respectively. In turn, they both load local.bro which contains all configuration code shared by manager and workers. If in doubt, put your customizations into local.bro so that all nodes see it. If you want to change which local scripts are loaded by the nodes, you can set SitePolicyManager for the manager and SitePolicyWorker for the workers.
The main exception to putting everything into local.bro is notice filtering, which should be done only on the manager. The example local-manager.bro comes with an example setup to configure notice policy and notice actions. You will likely want to adapt this to the local environment.
In general, all the cluster's policy scripts are loaded before any site-specific policy so that you can redefine any of the defaults locally.
Please note that enabling a particular kind of analysis via the shell's analysis command only has an effect if the corresponding scripts are loaded by the local site policy in local.bro.
It is also possible to add scripts to individual nodes only. This works by setting the option aux_scripts for the corresponding node(s) in etc/node.cfg. For example, one could add a script experimental.bro to a single worker for trying out new experimental code.
TODO
The cluster sends four types of mail to the address given in MailTo:
When logs are rotated (default: once a day), a list of all alerts during the last rotation interval is sent. This can be disabled by setting MailAlarms=0.
When the cron command notices that a node has crashed, it restarts it and sends a notification. It may also send a more detailed crash report containing information about the crash.
Notices with a notice action of NOTICE_EMAIL; see the Bro documentation for how to configure notice priorities.
If trace-summary is installed, a traffic summary is sent each rotation interval.
TODO: cluster cron logs quite a few statistics which can be analyzed/plotted for understanding the cluster's run-time behaviour.
The cluster shell does not actually need to control a full cluster but can also be used to operate just a traditional single Bro instance on the local machine. To facilitate this, the shell has a special standalone mode. Using the terminology of the Bro cluster, in this mode the single Bro acts as both manager and worker (and there's obviously no need for a proxy). If the standalone mode turns out to work well, it might eventually replace the BroLite framework which you currently get by doing make install-brolite (see the user manual). BroLite is no longer maintained.
Setting up a standalone installation is pretty easy:
Get the right Bro version and compile Bro, as described above.
Change into the cluster's distribution directory, configure the cluster framework for standalone operation and install it:
> cd aux/cluster
> ./configure --standalone --prefix=/usr/local/bro
> make pybroccoli (*)
> make install
(*) Skip this if you're not set up to compile Python extension modules (you can just try it if you aren't sure).
Add <prefix>/bin to your PATH.
Unlike a full cluster installation, the standalone mode automatically installs suitable default configuration files. Initially, you need to make only two changes:
edit the line interface in <prefix>/etc/node.cfg to tell Bro which network interface it should monitor; and
add a list of your local networks to <prefix>/etc/networks.cfg.
Now you can start the standalone Bro:
> cluster start
A default policy is installed in <prefix>/policy/local/local.bro, which you should edit.
Some tasks need to be run on a regular basis. Insert a line like this into the crontab of the user running Bro:
0-59/5 * * * * <prefix>/bin/cluster cron
Everything else works just as it does in a "real" cluster setup, including configuration, mail notifications, log archival, and dynamic updates.
Warning: Please note that the standalone mode is (even) less tested than full cluster setups. The cluster shell is still under development and there might be some quirks (also, but not only, with respect to platform portability). Feel free to send a mail to the Bro mailing list if you encounter any problems.
Can I use an NFS-mounted partition as the cluster's base directory to avoid copying the files with rdist?
Yes. BroBase can be on an NFS partition. Configure and install the shell as usual with --prefix=<BroBase>. Then add HaveNFS=1 and SpoolDir=<spath> to etc/cluster.cfg, where <spath> is a path on the local disks of the nodes; <spath> will be used for all non-shared data (make sure that the parent directory exists and is writable on all nodes!). Then run cluster install again. Finally, you can remove <BroBase>/spool (or link it to <spath>). In addition, you might want to keep the log files locally on the nodes as well by setting LogDir to a non-NFS directory. Usually only the manager's logs are interesting. (In some later version, the default will likely be to not archive worker/proxy logs at all.)
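The two configuration lines for the NFS setup can be generated with a one-line helper. This is a sketch only: nfs_settings is a hypothetical name, and the spool path used in the usage note is an arbitrary example.

```shell
#!/bin/sh
# Prints the two cluster.cfg lines that enable the NFS setup.
# $1 is <spath>, a directory on the local disks of the nodes.
nfs_settings() {
    printf 'HaveNFS=1\nSpoolDir=%s\n' "$1"
}
```

For example, nfs_settings /var/spool/bro-cluster >> <BroBase>/etc/cluster.cfg appends both lines (the spool path here is only an illustration), after which cluster install needs to be run again as described above.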