Here are some tips for keeping your OS environment secure, the DevOps way.
Get your environment scanned early and often. And your first scan should happen even earlier. Nothing is worse than getting your first set of scan results back and realizing you have just been given two weeks' worth of “surprise” work.
Our security department currently uses Nessus. It seems like it could be great, but in my experience it yields many false positives, and even some false negatives. The design is pretty sound, but I currently have no trust in the results and use them only as a starting point for hand-verifying each vulnerability. In defense of Nessus, I don’t know how much of this is user error; I only receive the weekly reports. The best part of the reports is that they are filled with pretty good information. Each flagged vulnerability has a nice write-up on exactly what the scanner was looking for and how it attempted to detect it (usually a package version or a regular expression match in a configuration file). So to appease security, or to help them tune their filters, I can say, “Hey, you were looking at this file, and here is mine. Either let us talk about what this should look like, or troubleshoot your own detection.”
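To make the "regular expression match in a configuration file" idea concrete, here is a minimal Python sketch of that style of check. It is not how Nessus works internally; the function name, the sample sshd-style settings, and the patterns are all assumptions made up for illustration. It also shows one way a false positive can arise: a scanner that does not skip commented-out lines would flag the first line of the sample.

```python
import re
from typing import List


def check_config(text: str, pattern: str) -> List[str]:
    """Return the active (non-comment, non-blank) lines that match pattern."""
    regex = re.compile(pattern)
    matches = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blanks and comments so a commented-out setting is not flagged.
        if not stripped or stripped.startswith("#"):
            continue
        if regex.search(stripped):
            matches.append(stripped)
    return matches


# Hypothetical config contents, used only for this example.
sample = """\
# PermitRootLogin yes
PermitRootLogin no
PasswordAuthentication no
"""

# The "vulnerable" setting appears only inside a comment, so nothing is flagged.
flagged = check_config(sample, r"^PermitRootLogin\s+yes\b")
# The hardened setting is the active one.
hardened = check_config(sample, r"^PermitRootLogin\s+no\b")
print(flagged, hardened)
```

Running the same pattern by hand against your own file is a quick way to decide whether a finding is real before taking it back to the security team.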
The configuration management tool Chef has the perfect set of primitives for ensuring that configuration files look a certain way and that packages are updated as they need to be. Chef scripts (recipes) can now be used to update the packages and then enforce that the configuration files are all rewritten correctly (if need be).
A little organization can be helpful here. We comment each recipe with every Nessus plugin ID that went into making it. In each template we put a comment at the top of the file, “# This file is managed by chef”, and each template file name reflects its path in Linux. These recipes are pretty simple by comparison, so it is doable to make them completely idempotent with minimal work.
Update the environment
You’ll notice that I love Jenkins for automation, and that some plugins can help keep your secrets safe.
Here, Knife’s multi-ssh subcommand (knife ssh) is going to push out a command to every server in an entire environment. The command is going to be “sudo chef-client -o role[Harden]”. This role will contain all of these rock-solid hardening recipes, including package updates and kernel installs.
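As a rough sketch of what that push might look like when scripted, the Python snippet below assembles a knife ssh invocation that uses a chef_environment search query to select the hosts. The environment name "production" and the helper name are assumptions for illustration, and the command is only built and printed here, not executed.

```python
import shlex
from typing import List


def knife_ssh_command(environment: str, role: str = "Harden") -> List[str]:
    """Build the argv for running an override chef-client run across an environment."""
    # knife ssh takes a search query followed by the command to run on each match.
    query = f"chef_environment:{environment}"
    remote = f"sudo chef-client -o role[{role}]"
    return ["knife", "ssh", query, remote]


# "production" is a hypothetical environment name.
cmd = knife_ssh_command("production")
print(shlex.join(cmd))

# To actually run it from a job, something like this would do:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the command as an argument list rather than a single shell string avoids local quoting problems with the square brackets in the role name.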
Reboot [ if necessary ]
Now, people never want to let me take a production environment down during the day. So I’ve got to be able to safely get this all done at night. First, Jenkins has a pretty cool scheduling plugin:
Second, I need a reboot to be as safe as possible. This can be a lot of application-specific work, because I want to stop the second something seems amiss, and that requires automated health checking.
This is the core of the code right now. It reboots each host, one at a time, one zone at a time. It will block indefinitely if a computer fails to come back from a reboot or fails to pass its health check. There is a boring function not shown here, “host2healthcheck(type)”. It is simply a large case/switch ladder that returns a specialized class implementing the health check interface.
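The shape of that loop can be sketched in Python as follows. This is an illustration under stated assumptions, not the original code: the host types, the stand-in health check, the polling interval, and the injected reboot callable are all hypothetical, and a dictionary lookup stands in for the case/switch ladder.

```python
import time
from typing import Callable, Dict, List, Tuple


class HealthCheck:
    """Interface that every host-type-specific check implements."""

    def is_healthy(self, host: str) -> bool:
        raise NotImplementedError


class AlwaysHealthy(HealthCheck):
    """Stand-in for this sketch; a real check would probe the application."""

    def is_healthy(self, host: str) -> bool:
        return True


def host2healthcheck(host_type: str) -> HealthCheck:
    """Map a host type to its health check (dict lookup instead of a switch)."""
    checks = {"web": AlwaysHealthy, "db": AlwaysHealthy}  # hypothetical types
    if host_type not in checks:
        raise ValueError(f"no health check for host type {host_type!r}")
    return checks[host_type]()


def rolling_reboot(
    zones: Dict[str, List[Tuple[str, str]]],
    reboot: Callable[[str], None],
    poll_seconds: float = 30.0,
) -> List[str]:
    """Reboot hosts one at a time, one zone at a time.

    zones maps a zone name to a list of (host, host_type) pairs. The loop
    waits for as long as a host stays unhealthy, so a bad reboot halts the
    rollout instead of moving on to the next host.
    """
    done = []
    for _zone, hosts in zones.items():
        for host, host_type in hosts:
            check = host2healthcheck(host_type)
            reboot(host)
            while not check.is_healthy(host):
                time.sleep(poll_seconds)
            done.append(host)
    return done


# Dry run: "rebooting" just records the host name.
zones = {"zone-a": [("web1", "web"), ("db1", "db")], "zone-b": [("web2", "web")]}
rebooted: List[str] = []
order = rolling_reboot(zones, reboot=rebooted.append, poll_seconds=0)
print(order)
```

Injecting the reboot action as a callable keeps the ordering and blocking logic testable without touching a real server. Blocking forever on an unhealthy host is deliberate here: the rollout stalls where a human can find it rather than continuing into the next zone.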