The Art of Vulnerability Management
tl;dr Alexandra Nassar of Medallia describes how to create a positive vulnerability management culture and process that works for engineers and the security team.
First: Understanding the Lay of the Land
When Alexandra started, there were a number of process and communication challenges causing frustration for both the development and security teams around vulnerability management.
Instead of leaping in and making changes, she took the time to interview engineers, their managers, the security team, and even the PMO (the project management department).
She identified a number of consistent issues:
Developers didn’t know how to prioritize fixing the vulnerabilities assigned to them vs. the other tasks in their backlog.
Vulnerability fix tasks could easily get lost in the backlog. Sometimes engineers didn’t even know they had fixes assigned to them.
Vulnerability-related tasks and info were split across a number of tools and systems. There was no standard workflow and process; instead, issues were often handled as one-offs.
There was an overreliance on individuals - remediations would be assigned to a particular engineer, which could get dropped when they went on vacation or paternity/maternity leave.
Developers felt like the security team was setting all of the logistics around remediation (SLA and timeline). They wanted to be a part of the process.
Alexandra also took a step back and asked teams, “What is a vulnerability to your team?”
Don’t take things like this for granted: different people and teams may view things differently, so it’s really important to nail down important terms together.
Where Does Vulnerability Data Live?
After speaking with a number of dev teams, Alexandra learned that they’d prefer to keep track of vuln data in the issue tracking software they already used for standard feature work, Jira.
There are two main ways vulnerability tasks could live in Jira:
With the associated team and project (e.g. in the SRE team’s project).
In a separate project dedicated to vulnerability data.
Alexandra ended up choosing the second option, a separate project, for two reasons. Different teams can have their own custom workflows for handling their tasks, which she didn’t want to interrupt, and tracking vulnerabilities separately means they don’t get lost in teams’ existing backlogs. This also gave her full control and flexibility over the workflows she creates.
Determining How it Should Work
When defining the vulnerability management workflow, in addition to interviewing all of the relevant stakeholders, Alexandra also manually sampled a number of old vulns to make sure that every state they could be in was represented in the process she defined.
Other important points:
Create a consistent intake workflow - Regardless of the source of the vulnerability, whether it was from a tool (like Qualys or Nessus), an external pen test, bug bounty, or another source, all vulnerabilities undergo the same initial triaging process and are input, tagged, and labeled in the same system.
Make tickets detailed and actionable - Alexandra noticed that teams weren’t working on many of the tickets created by tools. After speaking with them, she found that the tickets didn’t have enough context for the teams to fully understand the issue, its importance, and how to fix it. Now, the security team ensures that every security ticket has sufficient context as well as actionable guidance on how to fix the issue.
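The two points above can be sketched in code. This is a minimal illustration, not Medallia's implementation: the `Finding` schema, the `intake` and `missing_context` helpers, the raw report keys, and the label names are all hypothetical. It shows every report being mapped onto one common shape regardless of source, and a ticket being flagged when it lacks the context developers need.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """One vulnerability in a common intake format (hypothetical schema)."""
    title: str
    source: str                      # e.g. "scanner", "pen test", "bug bounty"
    description: str = ""
    remediation: str = ""
    labels: list[str] = field(default_factory=list)


def missing_context(finding: Finding) -> list[str]:
    """Return the names of the context fields that are still empty."""
    required = ("title", "description", "remediation")
    return [name for name in required if not getattr(finding, name)]


def intake(raw: dict, source: str) -> Finding:
    """Map a raw report from any source onto the same Finding shape.

    The keys read from `raw` are assumptions; a real tool export would
    need its own mapping onto these fields.
    """
    finding = Finding(
        title=raw.get("title", "").strip(),
        source=source,
        description=raw.get("description", "").strip(),
        remediation=raw.get("remediation", "").strip(),
    )
    # Every finding gets the same tags, whatever its origin.
    finding.labels = ["security", "source:" + source.replace(" ", "-")]
    # Tickets without enough context stay with security until completed.
    if missing_context(finding):
        finding.labels.append("needs-context")
    return finding
```

A finding that arrives with only a title would be labeled `needs-context`, signalling that the security team should add a description and fix guidance before it reaches a development team.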
Metadata To Track
Medallia tracks several fields on security-related Jira tickets:
Priority - In security we tend to use “risk level”, but that terminology is unfamiliar to developers. Using “priority” lets developers rank the relative importance of security issues against the other issues in their backlog.
CVSSv3 Score and Vector - Choose whatever rating system you want, but make sure it’s universal and consistent.
Team - Security issues are assigned to teams rather than individuals to ensure they don’t get dropped.
Release tag - Useful for tracking a vulnerability and knowing when you need to validate the fix in production.
Source - What discovered the vulnerability? Third party pen testing, bug bounty, etc. This info is also quite useful for the compliance team.
Due date - A flexible recommendation from the security team, which can be adjusted based on developer feedback and their workload / priorities.
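As a rough sketch, the fields above could be modeled as a single record. The class and field names below are hypothetical and do not correspond to actual Jira field names. The `cvss3_rating` helper uses the qualitative severity bands from the CVSS v3 specification; how a rating maps to a ticket's priority is a team decision and is not shown.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


def cvss3_rating(score: float) -> str:
    """Qualitative severity rating for a CVSS v3 base score."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


@dataclass
class SecurityTicket:
    """Metadata tracked on a security ticket (hypothetical field names)."""
    summary: str
    priority: str                       # developer-facing term for risk level
    cvss_score: float
    cvss_vector: str
    team: str                           # assigned to a team, not an individual
    source: str                         # e.g. "pen test", "bug bounty"
    due_date: date                      # recommendation, open to negotiation
    release_tag: Optional[str] = None   # set once the fix is scheduled


# Illustrative values only.
ticket = SecurityTicket(
    summary="Unauthenticated remote code execution in upload service",
    priority="Highest",
    cvss_score=9.8,
    cvss_vector="CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
    team="SRE",
    source="pen test",
    due_date=date(2020, 3, 1),
)
```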
Building the Process
One common complaint developers had with the prior process is that after they had completed their part, while waiting on someone else to push things forward, the security issue was still cluttering up their Jira board.
To address this, Alexandra created new Security and Development kanban boards so that relevant tickets only show up on someone’s board when they have work to do.
In the security dashboard, there are two primary states a ticket can be in: incoming vulnerabilities that need to be triaged and issues that need security attention (e.g. a developer has a question, a fix needs to be validated, etc.).
The developer dashboard has the following states:
To acknowledge: After the security team has triaged an incoming issue and added the necessary context, it goes here for development teams to start reviewing.
To Scope: The relevant development team estimates how much work the issue will take to fix and places it into the sprint planning process.
In progress: Developers are working on the ticket.
Rejected: The development team doesn’t believe this is an issue, has mitigating factors, or otherwise “Won’t Fix.”
Due date extension: Due to the expected amount of work and their timelines, a different due date is proposed, which must be approved by the security team.
Needs security attention: The developers have completed their part; now they need the security team to review what was done.
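The states above can be thought of as a small state machine. The sketch below is an illustration under stated assumptions: the state names follow the talk, but the `DONE` state, the allowed transitions, and the simplification that "Needs security attention" appears only on the security board are guesses, not the actual Jira workflow.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    TO_TRIAGE = "To triage"
    TO_ACKNOWLEDGE = "To acknowledge"
    TO_SCOPE = "To scope"
    IN_PROGRESS = "In progress"
    REJECTED = "Rejected"
    DUE_DATE_EXTENSION = "Due date extension"
    NEEDS_SECURITY_ATTENTION = "Needs security attention"
    DONE = "Done"  # assumed terminal state, not named in the talk


# Which board shows a ticket in a given state (assumed split).
SECURITY_BOARD = {State.TO_TRIAGE, State.NEEDS_SECURITY_ATTENTION}
DEVELOPMENT_BOARD = {
    State.TO_ACKNOWLEDGE, State.TO_SCOPE, State.IN_PROGRESS,
    State.REJECTED, State.DUE_DATE_EXTENSION,
}

# Assumed transitions; a real workflow may allow more or fewer.
ALLOWED = {
    State.TO_TRIAGE: {State.TO_ACKNOWLEDGE},
    State.TO_ACKNOWLEDGE: {State.TO_SCOPE, State.REJECTED},
    State.TO_SCOPE: {State.IN_PROGRESS, State.DUE_DATE_EXTENSION, State.REJECTED},
    State.IN_PROGRESS: {State.NEEDS_SECURITY_ATTENTION, State.DUE_DATE_EXTENSION},
    State.DUE_DATE_EXTENSION: {State.TO_SCOPE, State.IN_PROGRESS},
    State.REJECTED: {State.TO_ACKNOWLEDGE, State.DONE},
    State.NEEDS_SECURITY_ATTENTION: {State.IN_PROGRESS, State.DONE},
    State.DONE: set(),
}


def board_for(state: State) -> Optional[str]:
    """Return the board a ticket appears on, or None once it is done."""
    if state in SECURITY_BOARD:
        return "security"
    if state in DEVELOPMENT_BOARD:
        return "development"
    return None


def move(current: State, target: State) -> State:
    """Move a ticket to a new state, rejecting transitions not in ALLOWED."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value!r} to {target.value!r}")
    return target
```

Because each state belongs to exactly one board, a ticket leaves the developers' board as soon as it moves to "Needs security attention", which is the behavior developers asked for.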
At 23:30, Alexandra gives a demo of this workflow.
Good metrics are crucial for understanding where your security program is and if the efforts you make are causing an impact.
When leadership comes to Alexandra and says, “Our Quarterly Business Review is coming up, what goals should we set?”, having metrics enables her to make specific suggestions.
For example, if there are a number of overdue vulns, developers may not be scoping the required work appropriately, so a goal for the next quarter could be to get to zero overdue vulns. If there are many high or critical vulns that are overdue, she can focus on those and try to determine the root cause.
It’s useful to show trends, like presenting an overview by team or org to executives who have many teams reporting to them.
Metrics can also help you deal with fixes more effectively; for example, there may be 100 vulns that will be fixed by the same action, such as applying a certain patch. Alexandra tries to help development teams fix issues more efficiently and strategically by applying labels to certain vulnerability classes or efforts to help development teams treat fixing them like a project, rather than a series of one-offs.
Metrics can also be used during management meetings to create some healthy competition between teams.
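Two of the metrics discussed above, overdue vulns per team and groups of vulns that share a fix, can be computed with a few lines of code. This is a minimal sketch: the ticket dictionaries, their keys, and the sample values are hypothetical, and a real setup would pull this data from the issue tracker.

```python
from collections import Counter
from datetime import date


def overdue_by_team(tickets, today):
    """Count unresolved tickets whose due date has passed, per team."""
    return Counter(
        t["team"] for t in tickets
        if not t["resolved"] and t["due_date"] < today
    )


def shared_fix_groups(tickets, min_size=2):
    """Find labels shared by several open tickets.

    A large group suggests one action (such as a single patch) could
    close many tickets, so it can be treated as a project.
    """
    counts = Counter(
        label for t in tickets if not t["resolved"] for label in t["labels"]
    )
    return {label: n for label, n in counts.items() if n >= min_size}


# Hypothetical sample data.
tickets = [
    {"team": "SRE", "due_date": date(2020, 1, 10), "resolved": False,
     "labels": ["openssl-upgrade"]},
    {"team": "SRE", "due_date": date(2020, 1, 20), "resolved": False,
     "labels": ["openssl-upgrade"]},
    {"team": "Web", "due_date": date(2020, 3, 1), "resolved": False,
     "labels": ["xss"]},
    {"team": "Web", "due_date": date(2020, 1, 5), "resolved": True,
     "labels": ["xss"]},
]

overdue = overdue_by_team(tickets, today=date(2020, 2, 1))
groups = shared_fix_groups(tickets)
```

Tracking these counts over time, rather than as a single snapshot, is what makes trends visible per team or org.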
Example vulnerability dashboards
How do you handle escalations?
If there’s something that needs immediate attention, Alexandra brings it up in the weekly security check-in meeting she holds every Thursday, attended by the CISO, all of the security managers, and all directors and VPs of the organization. It’s always 30 minutes at the same time and she sets the agenda on Monday. If there’s nothing in particular to discuss she doesn’t cancel the meeting, but lets it become an open discussion forum that people can attend if they want.
Metrics and Data-driven Security Programs
I’m a big fan of using data and metrics to drive security efforts and measure their effectiveness. If you are too, check out my summary of the talk by Koen Hendrix of Riot Games where he measured the maturity level of various development teams and correlated that with their relative bug bounty payouts, and this summary of a talk by Arkadiy Tetelman on how the Airbnb team used bug bounty reports to determine where and how they invested their AppSec team and money.
Culture: Driving Change
You can have the best tools and processes in the world, but ultimately you need buy-in and cultural change for the security team to be viewed as an ally and have what you recommend be adopted.
When Alexandra made these Jira workflow changes, she met with all of the different engineering teams to train them on how to use the Kanban boards she created and socialized the value / better efficiency it provided them.
One of the biggest wins has been leveraging Medallia’s security champions program, which includes an engineer from every team in the company. The champions are valuable liaisons with development teams, and Alexandra has found that teams with active security champions have fewer overdue vulns.
Other initiatives to get your colleagues excited about security include gamifying security metrics and making a healthy competition of the number of open or overdue vulns, holding CTFs or bug bashes, and definitely swag. Once Alexandra didn’t have enough budget to get everyone t-shirts, as she’d planned, so she asked their facilities team if they’d be willing to make custom badges for the people who won the CTF. They agreed, so the winners got to walk around with unique, security-themed badges and receive props from their peers.
Meet with engineers to understand their workflow and pain points in your current vulnerability management process. Learn the systems and tools they use and how they use them.
Use development teams' language and terminology whenever possible to maximize inter-team understanding and minimize confusion.
Fit the vulnerability management process into how development teams currently work; do not force them to use an external tool, as the friction is too high.
Every security ticket that reaches development teams should a) be verified to be a true positive, b) contain all of the relevant contextual information so developers can understand the issue and its relative importance, and c) have clear, actionable guidance on how to fix it. Adopt a customer service-type of interaction model with development teams.
Create a single, all-encompassing vulnerability management process that all vulnerabilities flow through: a single entry point and process that is followed, from entry, to triage, to resolution. Create this process based on interviewing development and security teams to understand their needs, and manually sample previous bugs to determine what bug "states" were needed in the past.
Once you make process changes, meet with all of the affected teams to ensure they understand why the changes were made and how they can effectively adopt the new process; don't assume they'll automatically get it.
Determine the set of meta info you're going to track about vulnerabilities and track it consistently; for example, the severity ("priority"), CVSSv3 score and vector, relevant team and/or project, release tag, the source that found it (pen test, bug bounty, etc.), and its due date.
Track metrics over time (number of bugs found, by source, by criticality, number of bugs past SLA, etc.). Use these metrics to diagnose process problems as well as areas that merit deeper investment from the security team for more systematic, scalable wins. Share metrics with leadership so they understand the current state of affairs, and consider using metrics to cause some healthy competition between teams.
Get your colleagues excited about security via internal marketing efforts, like gamifying security metrics, holding CTFs and bug bashes, and distributing swag, like stickers, t-shirts, or custom badges for people who make efforts in security.