Monitor all the Things (with monit)!

System administrators have a *ton* of different monitoring solutions to choose from. Many of these (Nagios) are forced on them by evil forces who happen to be higher up in the corporate food chain. Some, however, are a joy to use. In this tutorial, I’ll teach you how to use one of my favorites: Monit ( https://mmonit.com/monit/ ). Monit can help you monitor all the same things as the others (CPU and disk usage, etc.), but it also

intelligently checks your services to make sure they’re up and responding properly,
can react when things go wrong (restarting services, running scripts, etc.),
has cool extra features like service management and file-hash checking (to make sure the bad guys haven’t tampered with your system binaries, for example), and
is really easy to use.

In this post, I’ll take you from “no idea what’s happening on the server” to “closely monitoring critical services.” Follow along!

Installing Monit for Linux/Unix System Monitoring

You heard right — this thing runs on all the Linuxes and Unixes. I use it to monitor an Ubuntu machine, a few Debian VPSs, and several heavy pieces of metal running FreeBSD.

To get it installed (assuming Ubuntu, as always, because it’s what most of you have installed):

$ sudo apt-get install monit

Make sure a config file exists (and will be found by monit)

Next, you’ll want to make sure a monit config file exists — this file will be called ‘monitrc’. It will already exist if you just installed monit on Ubuntu:

$ ls -alh /etc/monit/monitrc
-rw------- 1 root root 12K May 20 2014 /etc/monit/monitrc

On FreeBSD, this file was at /usr/local/etc/monitrc.sample, and I copied it over to /root/monitrc. Make absolutely sure that it has permissions of ‘600’ (owner read/write; nothing for group or others). You do not want any other accounts capable of reading this file or doing things with monit, since it can start and stop services.
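If you want to double-check those permissions from a script, here is a minimal sketch. The helper name monitrc_is_private is my own invention, not part of monit, and it leans on find's -perm test because the stat command takes different options on Linux and FreeBSD.

```shell
# monitrc_is_private FILE
# Hypothetical helper (not part of monit): succeeds only if FILE is a
# regular file whose permission bits are exactly 600.
monitrc_is_private() {
    [ -n "$(find "$1" -prune -type f -perm 600 2>/dev/null)" ]
}

# Example (run as root; adjust the path to wherever your monitrc lives):
#   monitrc_is_private /etc/monit/monitrc || chmod 600 /etc/monit/monitrc
```

The function only reports; fixing the mode is left to an explicit chmod so nothing changes behind your back.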

There are a few different places where you can keep your monitrc file. The documentation is here: https://mmonit.com/monit/documentation/monit.html#FILES.

You can double-check that monit can see your control file (configuration file) by typing

sudo monit -t

This will check the syntax of your configuration file to make sure there are no problems. Here’s what the output of that command looks like if you don’t have a monitrc file in the right place:

$ sudo monit -t
Cannot find the control file at ~/.monitrc, /etc/monitrc, /etc/monit/monitrc, /usr/local/etc/monitrc or at ./monitrc
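If you are not sure which of those locations actually holds a file on your machine, a small sketch like this can tell you. The function name find_monitrc is hypothetical, and I am assuming the order in the error message is the order worth checking; see the documentation linked above for the authoritative search rules.

```shell
# find_monitrc
# Hypothetical helper: prints the first existing control file among the
# locations named in monit's error message, or returns 1 if none exists.
find_monitrc() {
    for candidate in "$HOME/.monitrc" /etc/monitrc /etc/monit/monitrc \
                     /usr/local/etc/monitrc ./monitrc; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

find_monitrc || echo "no monitrc found in the usual places"
```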

If all goes well, you can run monit and check what’s going on by typing:

sudo monit            # this will launch the monit daemon
sudo monit status     # show some basic system information

Set Up Basic Monit Settings + Webserver

Monit runs as a daemon (a background process that periodically wakes up, does some things, and then goes back to sleep). You’re going to define some things in this config file (monitrc), and every time Monit wakes up it goes through the list of things you told it to check and alerts you if it sees anything wrong. The first lines of your config file will be something like this:

set daemon 120
set logfile /var/log/monit.log

The first line means “wake up and check all the things I’ve defined in this config file every 120 seconds, or two minutes.” The second line means “please write your log to a dedicated logfile just for monit, as opposed to syslog.” Now the cool part: we’re going to ask monit to run a little web server on our localhost to give us a graphical interface to the monitoring data from this machine:

set httpd
 port 2812
 use address localhost # only accept connections from localhost
 allow somecleverusername:doodlywhumps_borfinschlumps43 # username:password

What this means is “please run a nice little web interface on localhost port 2812, and only allow someone to see the monitoring data if they know this username and password.” Using the ‘localhost’ address means that only someone sitting on the local machine will be able to contact the web server and try logging in.

This means that on a server, you’re not actually opening up a port to all the bored angry Internet People who would love nothing more than a chance to Break Your Things. Only users who are logged in on your server will be able to log into the monit web interface. We’ll talk about how to connect to that web server a bit later.

Monitoring Rules

Now for the monitoring rules! First, we’ll monitor some of our server’s core metrics, such as cpu usage and swap usage. Add the following to your monitrc file:

# Test CPU usage including user, system and wait. Note that 
# multi-core systems can generate 100% per core 
# so total CPU usage can be more than 100%

check system $HOST
 if memory usage > 80% for 4 cycles then alert
 if swap usage > 20% for 4 cycles then alert
 # Test the user part of CPU usage 
 if cpu usage (user) > 80% for 2 cycles then alert
 # Test the system part of CPU usage 
 if cpu usage (system) > 20% for 2 cycles then alert
 # Test the i/o wait part of CPU usage 
 if cpu usage (wait) > 80% for 2 cycles then alert

This host check is taken from the monit configuration examples, a useful page that will get you up and running with monit configuration snippets. Just like in a shell script, everything after a hash is a comment, and monit ignores it.

Next, we’ll monitor a website which we’re presumably hosting on this server. It doesn’t matter if it’s actually hosted on this server; monit will simply go out, try to connect over HTTP, and happily move on if things seem to be working:

check host tutorialinux.com with address tutorialinux.com
 if failed port 80 protocol http for 2 cycles then alert

If things don’t seem to be working, monit will retry one more time at the next cycle (however long you’ve defined a cycle to be, whether that’s 30 seconds or one hour) and then alert you if the site still doesn’t respond over HTTP.
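To get a feel for how long an outage can go unnoticed with a rule like this, here is a back-of-the-envelope sketch. The alert_delay function is a hypothetical helper of mine, and it assumes that checks run exactly once per cycle and finish instantly (real checks also spend time waiting on timeouts).

```shell
# alert_delay CYCLE_SECONDS CYCLES
# Hypothetical helper: prints the shortest and longest time (in seconds)
# between a service going down and a "for CYCLES cycles" rule firing.
alert_delay() {
    cycle=$1
    cycles=$2
    min=$(( (cycles - 1) * cycle ))   # outage starts just before a check
    max=$(( cycles * cycle ))         # outage starts just after a check
    echo "$min $max"
}

# "set daemon 120" combined with "for 2 cycles", as used in this post
alert_delay 120 2    # prints: 120 240
```

So with the settings from this post, you would hear about a dead website somewhere between two and four minutes after it goes down.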

Next, we’ll monitor MySQL (here, MariaDB) and php-fpm, two things you might be running if you’re hosting PHP-based websites:

# check mariadb
check host mymariadb with address 127.0.0.1
 if failed ping then alert 
 if failed port 3306 protocol mysql then alert

# php-fpm
check process phpfpm with pidfile /var/run/php-fpm.pid
 if cpu > 50% for 2 cycles then alert
 if total cpu > 80% for 5 cycles then restart
 if memory > 300 MB then alert
 # if total memory > 500 MB then restart

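If you’re not running these on the server where you’re setting up monit, there’s obviously no need to add these to your configuration file. Note that the restart action can only work if the check also tells monit how to start and stop the service (with start program and stop program lines), which this example leaves out.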

Once you’ve set up a few rules and you’re happy, run

sudo monit reload
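Since a typo in monitrc is easy to make, I like to run the syntax check before reloading. Here is a minimal sketch of that habit; reload_if_valid and the MONIT variable are hypothetical names of mine, and the function assumes it is called from a shell that is allowed to run monit (for example, a root shell).

```shell
# MONIT can be overridden, e.g. to point at a different monit binary.
MONIT=${MONIT:-monit}

# reload_if_valid
# Hypothetical helper: reloads monit only if "monit -t" accepts the
# control file; otherwise leaves the running daemon alone.
reload_if_valid() {
    if "$MONIT" -t; then
        "$MONIT" reload
    else
        echo "monitrc has syntax errors; not reloading" >&2
        return 1
    fi
}
```

If you would rather not define a function, the one-liner sudo monit -t && sudo monit reload does the same job.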

Checking Your Monit Web Dashboard

Open up a browser and navigate to localhost:2812 (or whichever port you configured in your monitrc config file), and you should be prompted to log in with the username and password you specified. From there, you can see your monitored services: if they’re running, how long they’ve been running, etc.

Connecting to the Monit Web Interface on a Remote Server

When I run the monit web interface on a remote server, I like to keep the same “localhost” settings described above, to make sure that no one can log in from the Net. To make the server think that my browser is “local,” I just use ssh to forward a port on my machine to the monit port on the server. Then I point my browser at the forwarded port and lo! I can access the monit web interface. Here’s how:

Connect to the server with SSH, and ask SSH to forward a local port on your machine (1234) to the monit port on the server (2812). remotehost.net is the server that is running monit.
```
ssh -L 1234:localhost:2812 remoteuser@remotehost.net
```
In your browser, navigate to localhost:1234 and, when asked for a username and password, use the ones you configured for the monit web server. In our case, this is:

     user: somecleverusername
     password: doodlywhumps_borfinschlumps43

That’s it! You’re now connected to your monit instance’s web server, running on the remote server. You should be seeing delicious output from the services you’re monitoring.
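If you only want the tunnel and not an interactive shell on the server, ssh's -N option (do not run a remote command) is handy. Here is a small sketch that wraps it; the monit_tunnel function and the DRY_RUN switch are hypothetical conveniences of mine, and the ports default to the ones used in this post.

```shell
# monit_tunnel TARGET [LOCAL_PORT] [REMOTE_PORT]
# Hypothetical helper: forwards LOCAL_PORT on this machine to monit's web
# interface on TARGET. With DRY_RUN=1 it prints the ssh command instead
# of running it.
monit_tunnel() {
    target=$1
    local_port=${2:-1234}
    remote_port=${3:-2812}
    set -- ssh -N -L "${local_port}:localhost:${remote_port}" "$target"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

DRY_RUN=1 monit_tunnel remoteuser@remotehost.net
# prints: ssh -N -L 1234:localhost:2812 remoteuser@remotehost.net
```

Without DRY_RUN, the command stays in the foreground until you press Ctrl-C, which closes the tunnel.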

Congratulations! Now you’ve got a base to play around with. To dig in deeper, check out the monit documentation.