Data Driven Bug Bounty

tl;dr Arkadiy Tetelman describes how to effectively launch a bug bounty program and how tracking vulnerability metrics can make an AppSec team more impactful.

Arkadiy Tetelman, Application Security Engineer - Airbnb twitter, linkedin
BsidesSF 2018
💬 abstract 🖥️ slides 📹️ video

As an application security engineer, you’re going to have limited resources and availability to get all the projects done that you want. Using this data-driven approach is going to let you prioritize your issues and measure ROI.

Key Takeaways

Based on his bug bounty experiences at Twitter and Airbnb, Arkadiy describes:

  • How vulnerability metrics can make an AppSec team more effective, and which attributes are useful to keep track of

  • How to successfully launch and run an effective bug bounty program

While the talk title is about bug bounty, ultimately these ideas and the methodology apply to taking a data-driven approach as an AppSec team in general, regardless of the source of vulnerabilities.

  • Arkadiy has experience with bug bounty programs from his time at Twitter and Airbnb, whose programs had several similarities: both ramped up their programs slowly (unpaid soft launch vs unpaid public program + paid private program), paid for frontline triage (NCC Group vs HackerOne), and had several AppSec engineers on triage rotations.

Requirements for Running a Data-driven Bug Bounty Program: First you need 1) a breakdown of the different components of the products that vulnerabilities will fall into, 2) the teams responsible for each component, and 3) a bug taxonomy (e.g. XSS, CSRF, broken access controls, etc.).

Arkadiy then presents several figures and shows how they can make your AppSec team more effective:

  • Vulnerabilities by category - inform your AppSec team’s priorities based on the most common and impactful bug classes. Know where to invest in tooling and libraries to mitigate them.

  • Time to Fix by Team - helps you hold dev teams accountable and know which teams need more 1:1 AppSec engineer support.

  • Open Security Vulnerabilities by Priority - measure improvement over time and raise the AppSec team’s visibility in your company, which can provide valuable social capital.

  • Vulnerability Source => Resolution - gives you insight into whether scanners need to be tuned and whether your bug bounty program is healthy.

How to launch a bug bounty program:

  1. Start with a pen test and assess yourself so you’re not flooded with submissions.

  2. Launch as a private program with few researchers and limited scope.

  3. Increase the number of researchers and program scope slowly, tuning your workflow along the way.

  4. Go public when you’re ready - when your process is streamlined, there are hundreds of researchers in your private program, and you’ve seen a downtick in submissions.

The core health metrics of an effective bug bounty program are: the time it takes to first respond to a researcher, time to triage, time to bounty, and time to resolution. Time to response and time to bounty are overall the most important.

Bug Bounty Program Logistics - Twitter and Airbnb

Arkadiy has experience with bug bounty programs from his time at Twitter and Airbnb. There were several similarities between the two companies:

  • Both took steps to ramp up their bug bounty program slowly to manage the initial influx of submissions: Twitter had an initial unpaid soft launch before paying bounties, and Airbnb started with an unpaid public program and a paid private program before merging them into one public paid program.

  • Both paid for frontline triage services: Twitter paid NCC Group and Airbnb used HackerOne’s service.

  • Both had several AppSec engineers on triage rotation: Twitter had ~4-6 people on 1-week rotations, Airbnb had 5 people on 2-week rotations.

Running a Data-driven Program

Thesis: data provides half the value of a bug bounty program.

Fixing the vulnerabilities reported by bug bounty researchers is valuable, but there’s also significant value in collecting metrics, such as the vulnerability classes you’re seeing, their resolution time, how often they’re going past SLA, which teams go past SLA, which teams introduce bugs, how responsive you are in your bug bounty program (time to triage, time to pay out), etc.

Requirements for Running a Data-driven Bug Bounty Program

Here's what you need to get started:

  1. A breakdown of the different components of the products that vulnerabilities will fall into

  2. The teams responsible for each component

  3. A bug taxonomy

Your internal bug taxonomy is the set of issues you're going to attribute vulnerabilities to. Ideally this taxonomy is specific enough that you can gain useful insights from the data but not so detailed that it's prohibitively difficult to tag vulnerabilities.

Once your bug taxonomy is defined, you need to diligently tag each security issue appropriately. Airbnb used custom fields in Jira to store this info. While it can be tedious to label historical data, the insights you gain are well worth it.

Note that bug taxonomies are flexible and will likely change over time. For example, Airbnb has a blanket "security misconfiguration" class, but once one specific type of issue has occurred a number of times, they'll split it into its own category.
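A minimal sketch of what this could look like in practice, assuming tickets are plain dicts; the category names are illustrative, not Airbnb's actual taxonomy:

```python
# Hypothetical internal bug taxonomy: specific enough to yield insights,
# coarse enough that tagging each ticket stays easy. Names are illustrative.
TAXONOMY = {
    "xss", "csrf", "sqli", "broken_access_control",
    "security_misconfiguration",  # blanket class; split subtypes that recur
}

def tag(ticket: dict, category: str) -> dict:
    """Attach a taxonomy tag to a ticket (e.g. via a Jira custom field)."""
    if category not in TAXONOMY:
        raise ValueError(f"unknown category: {category}")
    ticket["category"] = category
    return ticket

# Once a misconfiguration subtype has recurred, promote it to its own class:
TAXONOMY.add("open_s3_bucket")
```

Keeping the set of valid categories in one place makes consistent tagging enforceable rather than a convention.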

Arkadiy presents a number of figures generated from bug bounty data, describes the insights you can gain from them, and how you can use that data to become a more effective AppSec team.

Vulnerabilities by Categories

x-axis is vulnerability class, y-axis is number of vulnerabilities

Having a breakdown of the classes of vulnerabilities that are present across your applications allows you to:

  • Know your risk breakdown and focus your energy there: invest in tools and libraries to mitigate the most prevalent and impactful vulnerability classes.

  • Inform quarter planning - There are always more projects you’d like to work on than you have time for. This info lets you prioritize projects based on their expected impact and the effort required.

    • Some less prevalent vulnerability classes may be technically interesting to work on but solving them won’t move the needle.

  • Measure ROI - At the end of the quarter, did the projects you worked on have the outcomes you wanted? If not, why? Maybe it was the wrong approach or maybe you need to invest additional effort.
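Once tickets are tagged, the breakdown behind this figure is a simple aggregation; a sketch with made-up data:

```python
from collections import Counter

# Illustrative tagged tickets; in practice these come from your bug tracker
tickets = [
    {"id": 1, "category": "xss"},
    {"id": 2, "category": "xss"},
    {"id": 3, "category": "csrf"},
    {"id": 4, "category": "broken_access_control"},
]

# Most common classes first: where investment in tooling pays off most
by_category = Counter(t["category"] for t in tickets)
print(by_category.most_common())
```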

Average Time to Fix by Team

x-axis represents each individual team and the y-axis is time to fix

  • Track vulnerability ticket SLAs - Which teams are consistently missing SLAs? Which teams are trending better or worse over time?

  • Hold teams accountable - When you see a team is working much slower than your top team, investigate to determine why. Teams have their own priorities and roadmap. When you can bring objective data like this to discussions it can provide valuable context and motivation for the team to improve.

  • Give positive reinforcement - Don’t always be the bearer of bad news. Celebrate good behavior so teams want to work with you.

    • For example, the Airbnb AppSec team threw a cupcake party for a specific team that hit a sustained 0 open vulnerabilities past SLA.
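The underlying computation here is also small; a sketch with made-up resolved tickets:

```python
from collections import defaultdict
from datetime import date

# Illustrative resolved tickets: (owning team, opened, fixed)
resolved = [
    ("payments", date(2018, 1, 1), date(2018, 1, 11)),
    ("payments", date(2018, 1, 5), date(2018, 1, 25)),
    ("search",   date(2018, 1, 2), date(2018, 1, 7)),
]

# Collect days-to-fix per team, then average
days_by_team = defaultdict(list)
for team, opened, fixed in resolved:
    days_by_team[team].append((fixed - opened).days)

avg_days = {team: sum(d) / len(d) for team, d in days_by_team.items()}
print(avg_days)  # e.g. {'payments': 15.0, 'search': 5.0}
```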

Open Vulns by Subteam and Priority

x-axis represents individual subteams, with a stacked bar for each vulnerability severity, and the y-axis is the number of open vulnerabilities

When you look at the data like this, clear trends become apparent and next actions become obvious. In this case, the Airbnb AppSec team started partnering closely with the team that had the largest number of open vulnerabilities to help them improve.

This information helps drive conversations forward, as when you reach out to development teams you have data backing up your requests, not just qualitative opinions.

Reach out to teams with tact; otherwise, people can be defensive and not want to work with you. Instead, be positive and collaborative.

What’s the root cause here? Is it a prioritization or resource issue? Can we provide you more support? Are there misaligned incentives? How can we work together to bring these issues/numbers down?

Open Security Vulns (By Priority, Last 90 days)

If your company doesn’t have the resources to build a dashboard around this yet, you can often use existing bug bounty dashboards which present the data similarly.

x-axis is date and the y-axis is the number of open vulnerabilities, with each line representing a vulnerability severity.

Measure improvement (or lack of improvement) over time - This is really important for the business decisions you’re making. If things aren’t working, do you need to invest more? Also, share the wins!

Use data to drive business goals - Data can help you make the case for more resources, changing current resource allocation, or recommending other big picture organizational changes.

Raising the Security Team's Visibility and Getting Organizational Buy-in

This figure is great because unlike the prior one that listed specific vulnerability classes, it requires no security domain knowledge to interpret. This means it can be shared widely, in presentations, a security newsletter, a dashboard, or an internal wiki page.

One challenge for security teams is that many are invisible by default until there's a breach. By sending out the data you're collecting and being broad and visible with it, you show your non-security colleagues the work you're doing and the value you're adding (e.g. the issues that are being resolved because of your work). This visibility is really helpful over the long term.

By being more visible, you gain additional executive and leadership support, building more social capital within your company. This is especially useful, for example, if in the future you need to make a non-transparent change to a process or development workflow that adds some friction: you'll get less pushback because the security team and its wins have been visible.

Vulnerability Source => Resolution

This figure (and the others in this talk) was generated with Superset, a visualization tool Airbnb built that can hook into any SQL data source. Airbnb was running Jira on prem, so they had direct DB access; if your company is using the cloud version, you’ll have to export your data somehow if you want to use Superset.

The left-hand side of this figure represents the vulnerability source (e.g. HackerOne, pen test, vulnerability scanners, etc.) and the right-hand side is the resolution state - whether the issue is still open or fixed, marked as won’t fix, etc.

You can watch for changes in this data over time for purposes like:

  • Tuning scanners - If a greater percentage of scanner reports are getting marked as invalid, perhaps you need to tune, replace, or get rid of this type of tool.

  • Bug bounty program health - Has there been a decrease in the number of bug bounty submissions? Investigate whether the rewards have been appropriate, whether you’ve been responding quickly enough, etc.
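The (source, resolution) pair counts feeding a chart like this, and the invalid-rate signal used for scanner tuning, can be sketched as follows (the data is made up):

```python
from collections import Counter

# Illustrative (source, resolution) pairs, e.g. exported from your bug tracker
issues = [
    ("hackerone", "fixed"), ("hackerone", "fixed"), ("hackerone", "wont_fix"),
    ("scanner", "invalid"), ("scanner", "invalid"), ("scanner", "fixed"),
    ("pentest", "fixed"),
]

# Pair counts are exactly the input a Sankey-style source => resolution chart needs
flows = Counter(issues)

def invalid_rate(source: str) -> float:
    """Fraction of a source's reports that were marked invalid."""
    total = sum(n for (s, _), n in flows.items() if s == source)
    return flows[(source, "invalid")] / total

# A rising invalid rate for a scanner suggests it needs tuning (or replacing)
print(f"scanner invalid rate: {invalid_rate('scanner'):.0%}")
```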

One thing Arkadiy didn’t say, which I think would also be interesting, is correlating the vulnerability source and volume with impact. For example, maybe scanning tools are reporting many issues, but they all tend to be low severity, whereas pen tests aren’t reporting many issues but the ones they do report are high impact. The goal here is determining:

Which sources, by cost, are resulting in the most high impact vulnerabilities / overall reducing our risk?

Bug Bounty Program Health

In order to gain the benefits of bug bounty, it’s essential to maintain a healthy program. The core health metrics that are important are:

  • The time it takes to first respond to a researcher

  • Time to triage

  • Time to bounty

  • Time to resolution

Bug bounty platforms generally collect this information automatically and display it publicly, so you don’t need to capture it separately.

Arkadiy believes the two most important metrics are time to response and time to bounty. These are actually more important than the bounty amount itself; researchers like quick feedback.
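All four health metrics are just deltas from the submission timestamp; a sketch with an illustrative report timeline:

```python
from datetime import datetime

# Illustrative timeline for a single report; platforms like HackerOne track
# these automatically, but the math is just timestamp deltas.
report = {
    "submitted":      datetime(2018, 3, 1, 9, 0),
    "first_response": datetime(2018, 3, 1, 15, 0),
    "triaged":        datetime(2018, 3, 2, 9, 0),
    "bounty_paid":    datetime(2018, 3, 5, 9, 0),
    "resolved":       datetime(2018, 3, 20, 9, 0),
}

def hours_to(event: str) -> float:
    """Hours elapsed from submission to the given event."""
    return (report[event] - report["submitted"]).total_seconds() / 3600

health = {e: hours_to(e) for e in ("first_response", "triaged", "bounty_paid", "resolved")}
print(health)
```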

The benefits of maintaining a healthy bug bounty program include:

  • More researchers testing your assets and higher quality reports.

  • Researchers you have a close relationship with will give you early notice of issues they find before they formally submit them, and may let you provide input on the public write-ups they produce describing the issue.

Another key aspect to note is that researchers talk. If you treat one poorly, they’ll likely mention this to their friends and this may discourage others from working on your program.

How to Launch a Bug Bounty Program

If you’re launching a bug bounty program:

  1. Start with a pen test and assess yourself.

    • If you don’t do this first, you’ll see a large influx of issues and won’t be able to handle them all, leading to bad experiences for researchers.

    • This will also show you what your weak spots are so you can write them into the program scope.

  2. Launch as a private program with few researchers and limited scope.

    • Ramp up slowly so you can work out kinks in the process so everything runs smoothly.

  3. Increase the number of researchers and program scope slowly, tuning your workflow along the way.

    • Make sure you’re triaging and resolving tickets in a reasonable timeframe.

    • If there are any systematic issues, resolve them before doing a full program roll out.

  4. Only go public when you’re ready.

    • By this time, you should have your process down pat: the triage -> validate -> assign -> remediate flow should be efficient.

    • You should also have a significant number of researchers in your private program at this point, at least a few hundred.

    • Ideally you’re also seeing a downtick in the number of issues reported.

When you finally launch your fully public program, be ready to handle a large influx of issues (more eyes cause more issues to surface).

So remember: define your bug taxonomy, and tag each issue in your bug tracking system (like Jira) with the appropriate vulnerability class, the source that discovered it, the team that owns resolving it, and the bug’s SLA.
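A sketch of the SLA piece of that tagging, with hypothetical per-priority windows:

```python
from datetime import date, timedelta

# Hypothetical SLA windows per priority (days allowed to fix)
SLA_DAYS = {"p0": 7, "p1": 30, "p2": 90}

def past_sla(priority: str, opened: date, today: date) -> bool:
    """True if an open ticket has exceeded its SLA window."""
    return today > opened + timedelta(days=SLA_DAYS[priority])

# Illustrative open tickets
open_tickets = [
    {"team": "payments", "priority": "p0", "opened": date(2018, 1, 1)},
    {"team": "search",   "priority": "p2", "opened": date(2018, 1, 1)},
]

today = date(2018, 1, 20)
overdue = [t["team"] for t in open_tickets if past_sla(t["priority"], t["opened"], today)]
print(overdue)  # -> ['payments']
```

Recording the SLA against each ticket is what makes the "Time to Fix by Team" and "past SLA" views above computable at all.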

Stay in Touch!

If you have any feedback, questions, or comments about this post, please reach out! We’d love to chat.