And if you think, as a DevOps engineer, that you shouldn’t have to care about the business performance, you’re wrong.
According to DevOps Research and Assessment (DORA), high-performing DevOps teams consistently outpace their competitors in four key areas:
- Deployment frequency: This term refers to how often your engineers can deploy code. Improving performance aligns with deploying multiple times per day as desired.
- Lead time: Lead time is how long you take to go from committing new code to running that code in a production environment. The highest performers, according to DORA, have a lead time of under an hour, whereas average performers need up to a month.
- MTTR (Mean Time to Recover): MTTR refers to how long you take to restore a service after an incident or outage occurs. Ideally, you want to aim for under an hour. An outage costs serious money, especially when it impacts profit centers of the application. Long outages destroy trust, decrease morale, and imply additional organizational challenges.
- Change failure: This term refers to the rate at which changes to your system negatively impact the performance. Although you will never reach a change failure rate of zero percent, you can absolutely approach zero by increasing your automated tests and relying on a deployment pipeline with continuous integration checks and gates — all of which ensure quality.
Eliminating perfection as a measure of DevOps success
DevOps relies on the mantra “Done is better than perfect.” It seems to be one of these impossible-to-attribute quotations, but the words nonetheless speak truth. Attempting to attain perfection is an enemy of effectiveness and productivity.Most engineers, including those of the DevOps variety, suffer from some version of analysis-paralysis — a mental affliction that limits your productivity in an attempt to overanalyze your work and sidestep any potential mishap.
Training imperfection into your work requires you to embrace the possibility of failure and the inevitability of refactoring. Creating feedback loops around the customer and looping back to various stages of the pipeline are primary tenants of DevOps. In DevOps, you’re connecting the ends to bend the line into a circle.
When you think iteratively and circularly, pushing out code that’s not perfect seems a lot less scary because the code isn’t carved into stone. Instead, it’s in a temporary state that DevOps engineers improve frequently as you gather more data and feedback.
Designing small teams for DevOps
You’ve likely heard of Amazon’s “two-pizza” teams. The concept broadly speaks to the importance of small-sized teams. Now, the exact number of people that comprise a two-pizza team varies according to your appetites.It’s a good idea to keep teams under 12 people. When a group approaches 9, 10, or 11 people, try splitting it into two. The sweet spot for group size is around 4–6 people. Your exact number may vary depending on the people involved, but the point is this: When groups get too large, communication becomes challenging, cliques form, and the teamwork suffers.
Here’s one other bonus goal when forming DevOps teams: even numbers. It’s a good idea to give people a “buddy” at work — someone they can trust above all others. In even-numbered groups, everyone has a buddy and no one is left out. You can pair off evenly and it tends to work well. Forming even-numbered groups isn’t always achievable because of personnel numbers, but it’s something to keep in mind.
A formula for measuring communication channels is n (n – 1) / 2, where n represents the number of people. You can estimate how complex your team’s communication will be by doing a simple calculation. For example, the formula for a two-pizza team of 10 would be 10 (10 – 1) / 2 = 45 communication channels. You can imagine how complex larger teams can become.
Tracking your DevOps work
If you can get over the small overhead of jotting down what you do every day, the outcomes will provide you with exceptional value. Having real data on how you use your time assists you in tracking you and your team’s efficacy. As Peter Drucker famously said, “If you can't measure it, you can't improve it.”How many days do you leave work feeling like you did nothing? You just had meeting after meeting or random interruptions all day. You’re not alone. Many workers have the same problem. It can be difficult to track your progress and therefor your productivity. The divergence between our feelings of efficacy and the reality of our efficacy is dangerous territory for any DevOps team.
Try using pen and paper rather than some automated tool for this. Yes, you can use software to track how you use your time on your computer. It can tell you when you’re reading email, when you’re slacking, and when you’re coding, but it lacks nuance and often misses or incorrectly categorizes large chunks of time.
After you have an idea of what you’re doing and when, you can start to identify which activities fall into which quadrants of the Eisenhower Decision Matrix. What busy work are you doing routinely that provides no value to you or the organization?
Reducing friction in DevOps projects
One of the best things a manager can do for a DevOps engineering team is to leave them alone. Hire curious engineers who are capable of solving problems independently and then let them do their job. The more you can reduce the friction that slows their engineering work, the more effective your team will be.Reducing friction includes the friction that exists between teams — especially operations and development. Don’t forget specialists like security, too.
Aligning goals and incentives increases velocity. If everyone is focused on achieving the same things, they can join together as a team and move methodically toward those goals.
Humanizing alerting for DevOps success
Every engineering team has alerts on actions or events that don’t matter. Having all those alerts desensitizes engineers to the truly important alerts. Many engineers have becomes conditioned to ignore email alerts because of an overabundance of messages.Alert fatigue ails many engineering organizations and comes at a high cost. If you’re inundated daily, picking out the important from a sea of the unimportant is impossible. You could even say that these messages are urgent but not important . . . .
Email is not an ideal vehicle for alerting because it’s not time sensitive (many people check email only a few times a day) and it’s easily buried in other minutiae.
Applying what you’ve learned about rapid iteration, reevaluate your alerting thresholds regularly to ensure an appropriate amount of coverage without too many false positives. Identifying which alerts aren’t necessary takes time and work. And it’ll probably be a little scary, right? Deleting an alert or increasing a threshold always comes with a bit of risk.What if the alert is actually important? If it is, you’ll figure it out. Remember, you can’t fear failure in a DevOps organization. You must embrace it so that you can push forward and continuously improve. If you let fear guide your decisions, you stagnate — as an engineer and as an organization.