What Are “DevOps” Skills?

Someone recently asked me about DevOps ‘courses’ which got me thinking about what the required skillsets are for getting into roles that include the word “DevOps” in the title or description.

I’ve been working in various “DevOpsy” roles professionally for about 7 years now and I still don’t know exactly what it means. Each company seems to have a different definition, and the philosophy that it all started from is now only a distant memory :-D.

That said, every “DevOps” role I’ve had has drawn on some combination of the following skillsets:

  • Linux Skills
  • Networking and Web/HTTP skills. And DNS. Always DNS. And caching, oh god the caching problems.
  • Some cloud provider’s stack — AWS, GCP, Azure, whatever — knowing how different architectures are implemented, using the tools they expose to infrastucture designers and operators.
  • Familiarity with a CI/CD process (specific tools are usually not important in interviews, as long as you’re comfortable with ONE of them).
  • Generalized troubleshooting and problem solving skills. Almost every problem you face as a DevOps person will be 15% known, 85% unknown. The ability to quickly learn about the problem domain and start troubleshooting is invaluable.
  • Be comfortable with the software development process — how software gets written and deployed. Know the basics of software tooling — git, the basics of the language your devs are writing in, debugging tools for that language/environment, etc.
  • Be *really* comfortable with reading through (and puzzling over) large codebases.

It *really* helps to have some programming (developing software with a team of other devs) experience, although it’s not a hard requirement.

I’m trying to stay away from specific tools recommendations in this post, but several important ones come to mind.

My question to you: what skills do you find yourself using at your DevOpsified job?

I was wrong about Docker

I recently found an answer that I gave someone to a Facebook chat question in ~2016-2017, and was amazed at how “technically right” I was (i.e. totally wrong for the career question that I was answering).

I just wanted to post this to show that I am often too technologically focused, and that this can often get in the way of getting hired and getting things done the way everyone else is doing them. On the flipside, my heart is in the right place and this approach certainly wouldn’t have HURT the person I was giving advice to.

Here it is!


Yeah, I should make some docker vids! So, LXC is the original Linux-native implementation of containers. There was an implementation before, called OpenVZ, but that relied on a patched Linux kernel and I don’t think it’s used much anymore.

The mainline kernel itself exposes the “ingredients” for containers (user/process namespacing, cgroups, etc.) and LXC was one of the original implementations of ‘containers’ using these native building blocks.

Docker was originally a wrapper around LXC, but has now actually created their own implementation of the container back-end (using the same exposed ‘ingredients’ from the Linux kernel that LXC uses).

LXC (and now LXD) is under-hyped, solid, and a bit more intuitive for learning (more like FreeBSD Jails, less hype/magic than Docker). I think it’s good to learn LXC or FreeBSD jails before jumping into Docker.

All the concepts from LXC are totally applicable to Docker as well.

Docker, as a software product, layers some extra tools on top of this basic ‘containerization’ tech: they have their docker ‘hubs’ — pre-built application containers. They have extra networking stuff on top of what you’d get with LXC. They have something that looks like very basic container linking/service discovery. They have enormous amounts of hype, which can help you get a high-paying job (seriously. It’s worth learning just for that.)

I’m torn, because I’ve used Docker since before it was an open-source product (it used to be a paid thing called DotCloud), and have hated it for almost the entire time. It’s a complex, unstable, and questionably architected piece of software. It’s also incredibly overhyped and misused.

BUT: Docker has some great ideas in it, and is absolutely worth learning for your career. As a technology, it leaves a lot to be desired.

I’d love to hear what you think of Docker after getting comfortable with LXC. LXC will teach you the concepts that underlie all of this containerization stuff, and Docker adds some new features (and new headaches) onto that.

Let me know what you think, if/when you start your Docker journey!

The Hardest (and most fun) Problems to Troubleshoot

I recently wrote a FAQ-style post about System Administration and technology careers in general. One of the best questions I was asked was about what kinds of really interesting troubleshooting problems I’ve had to deal with. Here’s that question, along with my answer:

What’s one of the most interesting things you’ve had to troubleshoot / do while maintaining a system?

I’m leaving out specific examples because they’re a mixture of non-public information and hyperspecific (uninteresting) technical stuff, but I can give some outlines for what generally makes for interesting problems to solve.

The really interesting problems I’ve seen tend to be related to performance, networking, and distributed systems. Usually they require a combination of different knowledge to solve:

  • Systems/OS: What is the operating system doing when everything slows down? What’s causing it to do that?
  • Networking/Distributed Systems: What’s actually happening when these machines communicate? How are they supposed to share and manage state, deal with network partitions, and ensure high availability? What are they *actually* doing when this problem happens?
  • Software Development: Which part of the code is causing this network/OS issue, and which code path leads there? Can I actually look at and modify this code? Is this code written by our developers, or an open-source project? What can I do to confirm the issue and test a fix? Can I contribute a fix back to the upstream project?