tl;dr sec
Posts
Every AI Talk from BSidesLV and BlackHat USA 2024

Every AI Talk from BSidesLV and BlackHat USA 2024

A list of all of the talks, abstracts, recordings, slides, papers, and tools from BSidesLV and BlackHat USA

Clint Gibler
August 27, 2024

This page contains the abstracts, and where possible, links to the slides, recordings, and tools for AI-related talks at BSidesLV and Black Hat 2024.

If you’d like a quick summary of all of these + DEF CON 2024 talks that you can read in a few minutes, see:

👉️ TL;DR: Every AI Talk from BSidesLV, Black Hat, and DEF CON 2024

For the abstracts and supporting links of every AI-related talk at DEF CON (Main Track and >10 villages), see AI Talks from DEF CON 2024.

BSidesLV
Black Hat Startup Spotlight Competition

BSidesLV

Keynote, Day 1: "Secure AI" is 20 years old

Sven Cattell, nbhd.ai
📺️ Video

Tags: #ai_history

Machine Learning (ML) security is far older than what most people think. The first documented "vulnerability" in a ML model dates back to 2004. There are several well oiled teams that have been managing AI risk for over a decade.

A new wave of “AI red teamers” who don’t know the history and the purpose are here. Some are doing brand safety work by making it harder for LLMs to say bad things. Others are doing safety assessments, like bias testing. Both of these aren’t really “red teaming” as there isn’t an adversary.

The term is getting abused by many, including myself as I organized the misnamed Generative Red Team at DEFCON 31. There are new aspects to the field of ML Security, but it’s not that different. We will go over the history and how you should learn about the field to be most effective.

Don’t Make This Mistake: Painful Learnings of Applying AI in Security

Kirill Efimov, Mobb
Eitan Worcel, Mobb

Tags: #appsec

Leveraging AI for AppSec presents promise and danger, as let’s face it, you cannot solve all security issues with AI. Our session will explore the complexities of AI in the context of auto remediation. We’ll begin by examining our research, in which we used OpenAI to address code vulnerabilities. Despite ambitious goals, the results were underwhelming and revealed the risk of trusting AI with complex tasks.

Our session features real-world examples and a live demo that exposes GenAI’s limitations in tackling code vulnerabilities. Our talk serves as a cautionary lesson against falling into the trap of using AI as a stand-alone solution to everything. We’ll explore the broader implications, communicating the risks of blind trust in AI without a nuanced understanding of its strengths and weaknesses.

In the second part of our session, we’ll explore a more reliable approach to leveraging GenAI for security relying on the RAG Framework. RAG stands for Retrieval-Augmented Generation. It's a methodology that enhances the capabilities of generative models by combining them with a retrieval component. This approach allows the model to dynamically fetch and utilize external knowledge or data during the generation process.

And what if it was hacked? Tactics and Impacts of Adversarial Machine Learning

Larissa Fonseca, Axur

Tags: #attacking_ai

According to the World Economics Forum annual report “Approximately half of executives say that advances in adversarial capabilities (phishing, malware, deep fakes) present the most concerning impact of generative AI on cyber”. It is already a fact that the world is already entering, if not inside, the AI bubble and facing this reality as soon as possible will help companies be better prepared for the future. However, with the velocity required to implement AI and surf into this new technology the risks involved may be put behind to give place to velocity. Based on this scenario this talk is designed to explore the adversarial attacks applied to ML systems and present the results of research made observing cybersecurity communities focused on sharing AI Jailbreaks and how those behave when applied to the most used AIs in the market.

Cybersecurity and Artificial Intelligence Risk Management Challenges for the Next Generation of Public Safety Systems

Raymond Sheh, NIST

Tags: #public_policy

Public safety agencies are adopting increasingly connected and intelligent systems. Next-generation 911 provides dispatchers with ever more information. Robots searching for lost people leverage AI features and novel forms of communication. An incident commander at a wildland fire can get up-to-the-second information from satellite, aircraft, robots, personnel, and sensors, while leveraging AI to predict the fire’s evolution. But how much do they know about the novel risks of all this new technology?

This talk serves as a rallying cry to the cybersecurity community to help public safety agencies to appropriately, responsibly, and ethically adopt these new advances in connectivity and AI. I will present an overview of how public safety approaches the topic of technology, where there are gaps in their understanding, and the impacts that they can have on their ability to keep us safe. I will then discuss how practitioners from across the cybersecurity community can help, ranging from developers, testers, and hackers, through to those in governance and management.

Devising and detecting spear phishing using data scraping, large language models, and personalized spam filters

Arun Vishwanath, Avant Research Group
Fred Heiding, Harvard University
Bruce Schneier, Harvard University
Jeremy Bernstein, MIT
Simon Lerman, Stanford Existential Risks Initiative
🖥️ Slides
📖 Whitepaper

Tags: #phishing

We previously demonstrated how large language models (LLMs) excel at creating phishing emails (https://www.youtube.com/watch?v=yppjP4_4n40). Now, we continue our research by demonstrating how LLMs can be used to create a self-improving phishing bot that automates all five phases of phishing emails (collecting targets, collecting information about the targets, creating emails, sending emails, and validating the results). We evaluate the tool using a factorial approach, targeting 200 randomly selected participants recruited for the study. First, we compare the success rates (measured by pressing a link in an email) of our AI-phishing tool and phishing emails created by human experts. Then, we show how to use our tool to counter AI-enabled phishing bots by creating personalized spam filters and a digital footprint cleaner that helps users optimize the information they share online. We hypothesize that the emails created by our fully automated AI-phishing tool will yield a similar click-through rate as those created using human experts, while reducing the cost by up to 99%. We further hypothesize that the digital footprint cleaner and personalized spam filters will result in tangible security improvements at a minimal cost.

Disinform your Surroundings: AI and disinformation campaigns

Tessa Mishoe (X), Red Canary
🖥️ Slides

Tags: #disinformation

Humanity has some serious issues defining what is real and what is fake. We base our reality upon our proven evidence of the world - our observables. What if what we observe is so convincing that it causes entire movements of falsity? In this talk, we explore the use of AI technologies in disinformation campaigns around the world. We’ll cover some past campaigns and their long-term effects, the technology behind them, and some actions you as a non-AI lifeform can take to prevent rampant overuse in human rhetoric.

EHLO World: Spear-Phishing at Scale using Generative AI

Josh Kamdjou, Sublime Security

Tags: #phishing

Email-based attacks remain at the forefront of the cybersecurity threat landscape, ever-evolving to circumvent defenses and trick unsuspecting users. In this presentation, we discuss the risks of Generative AI in the context of the email threat landscape. Specifically, we examine how Generative AI facilitates the automation of targeted email attack creation, resulting in increased campaign reach, diversity, and the likelihood of success.

We'll show real, in-the-wild attacks with completely fabricated contents, including conversations between multiple individuals that never happened, to demonstrate the sophistication LLMs can afford attackers in conducting convincing phishing campaigns at scale.

Attendees will leave this talk with an understanding of the impact of Generative AI on the email threat landscape and what to expect in the coming years.

AI in the human loop: GenAI in security service delivery

Preeti Ravindra, Expel

Tags: #blue_team

Security co-pilots, chatbots and automation that leverage large language models are rampant in Security Operations with the intent of boosting analyst productivity and outcome quality. While there is a lot of focus on implementing GenAI use cases for the SOC, there is little focus on understanding the effects of introducing GenAI tooling before and after implementation in an analyst workflow leading to a counter-productive "AI in the human loop" scenario.

This session covers:

- Results from A/B testing different types of AI models with different levels of tooling and workflow integration and what it means for a security practitioner

- Insights gained around friction points in integrating and obtaining alignment with GenAI in SecOps

On Your Ocean's 11 Team, I'm the AI Guy (or Girl)

Harriet Farlow, Mileva Security Labs

Tags: #attacking_ai

One of my favourite movie franchises is the Oceans movies. What’s not to love about a heist, plot twist and George Clooney?

In this talk I’m going to convince you why, if you’re preparing your next heist, you should have me on your team as the AI guy (technically girl, but guy has a better ring to it).

I asked around my local intelligence agencies but they wouldn’t let me play with their biometrics systems, so I got the next best thing - cooperation with Australia’s 4th finest casino, Canberra Casino (plus some of my own equipment). I’m going to show you how to bypass facial recognition, retina scanners, and surveillance systems using adversarial machine learning techniques (AML). These techniques let me ‘hack’ machine learning models in order to disrupt their operations, deceive them and cause them to predict a target of my choosing, or disclose sensitive information about the training data or model internals. AI Security is the new cyber security threat, and attacks on AI systems could lead to misdiagnoses in medical imaging, navigation errors in autonomous vehicles, and successful casino heists.

Incubated Machine Learning Exploits: Backdooring ML Pipelines Using Input-Handling Bugs

Suha Sabi Hussain, Trail of Bits
🖥️ Slides, 📺 Video (and from HOPE 2024)

See also Trail of Bits’ awesome-ml-security GitHub repo
Suha also recommended Ilia Shumailov’s paper on LLM censorship being undecidable, which applies to a lot of prompt injection and jailbreaking work.

Tags: #attacking_ai

Machine learning (ML) pipelines are vulnerable to model backdoors that compromise the integrity of the underlying system. Although many backdoor attacks limit the attack surface to the model, ML models are not standalone objects. Instead, they are artifacts built using a wide range of tools and embedded into pipelines with many interacting components.

In this talk, we introduce incubated ML exploits in which attackers inject model backdoors into ML pipelines using input-handling bugs in ML tools. Using a language-theoretic security (LangSec) framework, we systematically exploited ML model serialization bugs in popular tools to construct backdoors. In the process, we developed malicious artifacts such as polyglot and ambiguous files using ML model files. We also contributed to Fickling, a pickle security tool tailored for ML use cases. Finally, we formulated a set of guidelines for security researchers and ML practitioners. By chaining system security issues and model vulnerabilities, incubated ML exploits emerge as a new class of exploits that highlight the importance of a holistic approach to ML security.

BOLABuster: Harnessing LLMs for Automating BOLA Detection

Jay Chen, Palo Alto Networks
Ravid Mazon, Palo Alto Networks
📖 Blog

Tags: #appsec

BOLA poses severe threats to modern APIs and web applications. It's considered the top risk by OWASP API and a regularly reported vulnerability on HackerOne Top10. However, automatically identifying BOLAs is challenging due to application complexity, wide range of input parameters, and the stateful nature of modern web applications.

To overcome these issues, we leverage LLM's reasoning and generative capabilities to automate tasks, such as understanding application logic, revealing endpoint dependencies, generating test cases, and interpreting results. This AI-backed method, coupled with heuristics, enables full-scale automated BOLA detection. We dub this research BOLABuster.

Despite being in its early stages, BOLABuster has exposed multiple vulnerabilities in open-source projects. Notably, we submitted 15 CVEs for a single project, leading to critical privilege escalation. Our latest disclosed vulnerability, CVE-2024-1313, was a BOLA vulnerability in Grafana, an open-source platform with over 20 million users. When benchmarked against other state-of-the-art fuzzing tools, BOLABuster sends less than 1% of the API requests to detect a BOLA.

In this talk, we'll share the methodology and lessons from our research. Join us to learn about our AI journey and explore a novel approach to vulnerability research.

JIT Happens: How Instacart Uses AI to Keep Doors Open and Risks Closed

Dominic Zanardi, Instacart
Matthew Sullivan, Instacart

Tags: #appsec

Instacart has been on a journey to migrate employees from long-lived access to just-in-time (JIT) access to our most critical systems. However, we quickly discovered that if the request workflow is inefficient, JIT won’t be adopted widely enough to be useful. How could we satisfy two parties with completely different priorities: employees who want access and want it right now, and auditors who want assurance, control, and oversight? How could we avoid slipping back into old habits of long-lived access and quarterly access reviews?

In this demo-driven technical talk, we’ll show how Instacart’s developed an LLM-powered AI bot that satisfies these seemingly competing priorities and deliver true, fully-automated JIT access. This talk will be informative for anyone curious about how AI bots can be leveraged to automate workflows securely. We’ll step through how to best utilize LLMs for developing or enhancing internal security tooling by demonstrating what works, what doesn’t, and what pitfalls to watch for. Our goal is to share tactics that others can use to inform their own AI bot development, increase organizational efficiency, and inspire LLM-powered use cases for security teams beyond access controls.

Security for AI Basics - Not by ChatGPT

Chloé Messdaghi, HiddenLayer

Tags: #securing_ai

Are you tired of the same old cybersecurity conference talks? Fed up with the routine discussions about securing AI? Then get ready for something refreshingly different. Join me for a quick adventure filled with offbeat anecdotes and outrageous scenarios – imagine cybercriminals attempting to teach self-driving cars the cha-cha slide and chatbots gossiping about their creators' music taste. Amidst the puns and dad jokes, this talk will unveil everything you need to know about security for AI, including unconventional strategies to secure AI against the unexpected. I'll do my best to keep you entertained every step of the way during this 101 talk.

DoH Deception: Evading ML-Based Tunnel Detection with Black-Box Attack Techniques

Emanuel Valente, iFood

Tags: #red_team

This presentation is part of a graduate research project that delves into the vulnerabilities of Machine Learning (ML) models specifically designed to detect DNS Over HTTPS (DoH) tunnels. Previous research has primarily focused on developing models that prioritize accuracy and explainability. However, these studies have often overlooked the potential of adversarial attacks, leaving the models vulnerable to common adversarial attacks like black-box attacks. This presentation will demonstrate that all cutting-edge DoH tunnel detection models are vulnerable to black-box attacks. Our approach leverages real-world input data generated by DoH tunnel tools, which are constrained in the attack algorithm.

Moreover, we will show specific vulnerable features that model developers should avoid. When this feature type is considered, we successfully evaded all DoH tunnel detection models without using advanced techniques.

Notably, the audience can use the same methods to evade most Machine Learning-Based Network Intrusion Detection Systems, underlining our findings' immediate and practical implications.

Hacking Things That Think

Matthew Canham, Psyber Labs

Tags: #attacking_ai

The rush to embed AI into everything is quickly opening up unanticipated attack surfaces. Manipulating natural language systems using prompt injection and related techniques feels eerily similar to socially engineering humans. Are these similarities only superficial, or is there something deeper at play? The Cognitive Attack Taxonomy (CAT) is a continuously expanding catalog of over 350 cognitive vulnerabilities, exploits, and TTPs which have been applied to humans, AI, and non-human biological entities. Examples of attacks in the CAT include linguistic techniques used in social engineering attacks to prompt a response, disabling autonomous vehicles with video projection, using compromised websites to induce negative neurophysiological effects, manipulating large language models to expose sensitive files or deploy natively generated malware, disrupting the power grid using coupons, and many other examples. The CAT offers the opportunity to create on demand cognitive attack graphs and kill chains for nearly any target. This talk concludes with a brief demo integrating cognitive attack graphs into a purpose-built ensemble AI model capable of autonomously assessing a target's vulnerabilities, identifying an exploit, selecting TTPs, and finally launching a simulated attack on that target. The CAT will be made publicly available at the time of this presentation.

Black Hat Startup Spotlight Competition

Congratulations to the Top 4 Finalists:

Dark Reading overview

Security Weekly interview with Knostic’s Sounil Yu.

BlackHat - Talks

Practical LLM Security: Takeaways From a Year in the Trenches

Richard Harang, NVIDIA
🖥️ Slides

Tags: #securing_ai, #attacking_ai

As LLMs are being integrated into more and more applications, security standards for these integrations have lagged behind. Most security research either focuses 1) on social harms, biases exhibited by LLMs, and other content moderation tasks, or 2) zooms in on the LLM itself and ignores the applications that are built around them. Investigating traditional security properties such as confidentiality, integrity, or availability for the entire integrated application has received less attention, yet in practice, we find that this is where the majority of non-transferable risk lies with LLM applications.

NVIDIA has implemented dozens of LLM powered applications, and the NVIDIA AI Red Team has helped secure all of them. We will present our practical findings around LLM security: what kinds of attacks are most common and most impactful, how to assess LLM integrations most effectively from a security perspective, and how we both think about mitigation and design integrations to be more secure from first principles.

From HAL to HALT: Thwarting Skynet's Siblings in the GenAI Coding Era

Chris Wysopal, Veracode
🖥️ Slides

Tags: #appsec

This talk explores the transformative impact of GenAI on software development and its subsequent implications for cybersecurity. With GenAI, developers are shifting from traditional code reuse to generating new code snippets by prompting GenAI, leading to a significant change in software development dynamics. This advancement introduces new AppSec challenges as AI-generated code from LLMs trained on vulnerable OSS leads to vulnerable generated code. The higher code velocity enabled by generated code turns into higher vulnerability velocity and all the challenges velocity brings to security testing and remediation. The OSS training data set is also susceptible to data poisoning attacks. To make matters worse, developers, who should be the "person-in-the-middle", tend to trust GenAI created code more than human created code. This presentation will delve into real-world data from multiple academic studies, examining how GenAI is reshaping software security landscapes, the associated risks, and potential solutions to mitigate these emerging challenges.

Predict, Prioritize, Patch: How Microsoft Harnesses LLMs for Security Response

Bill Demirkapi, Microsoft

Tags: #blue_team

The Microsoft Security Response Center deals with one of the largest volumes of vulnerability reports for a single company in the world. While fixing more vulnerabilities helps keep customers secure, it also poses numerous practical challenges. How do we maintain a consistent quality in our response? How do we give priority to the bugs that matter?

This talk is a crash course into leveraging Large Language Models (LLMs) to reduce the impact of tedious security response workflows. We won't be building a ChatGPT wrapper or yet another chat bot. Instead- we'll focus on the bleeding edge of LLM capabilities, particularly fine-tuning, to achieve real impact. We'll focus on three LLM use cases: deriving just enough information about vulnerabilities we can share with customers, predicting key facts about a report, like its severity, and finally, generating a root cause based on a crash dump. The audience should expect to walk away with actionable methodologies for using your organization's data to drive security response workflows.

AI Safety and You: Perspectives on Evolving Risks and Impacts

Nathan Hamiel, Kudelski Security
Amanda Minnich, Microsoft
Nikki Pope, NVIDIA
Mikel Rodriguez, Google DeepMind

Tags: #public_policy

AI deployments are accelerating, plunging deeper into the systems we use daily. As the flag of innovation is waved atop the mountain of compute, one topic missing from the conversation is safety. AI Safety is often framed as the spirit animal of the existential risk crowd, making it appear as though it has little relevance unless you think AI will wipe out humanity, but this couldn't be further from the truth. As AI technology gets closer to us, more ingrained in our systems, and opaque algorithms used to make critical decisions, we must ensure these systems are safe to use. Various harms can manifest from these deployments. Not addressing AI safety almost ensures these harms emerge, affecting not only the organizations deploying these technologies but also the humans that use them.

AI safety isn't a topic constrained to the purview of academia and government but a responsibility of organizations building and deploying AI solutions. So, what are the harms? What are an organization's responsibilities related to AI Safety? How are AI safety and AI Security related? In this discussion, we dispel myths about AI safety and discuss the challenges and responsibilities of companies building and deploying AI technologies. We also examine the role of security professionals in this new area and deliver valuable food for thought to get started.

Isolation or Hallucination? Hacking AI Infrastructure Providers for Fun and Weights

Hillai Ben-Sasson, Wiz
Sagi Tzadik, Wiz

Tags: #attacking_ai

More and more companies are adopting AI-as-a-Service solutions to collaborate, train and run their artificial intelligence applications. From emerging AI startups like Hugging Face and Replicate, to mature cloud companies like Microsoft Azure and SAP – thousands of customers trust these services with their proprietary models and datasets, making these platforms attractive targets for attackers.

Over the past year, we've been researching leading AI service providers with a key question in mind: How susceptible are these services to attacks that could compromise their security and expose sensitive customer data?

In this session, we will present our novel attack technique, successfully demonstrated on several prominent AI service providers – including Hugging Face and Replicate. On each platform, we utilized malicious models to break security boundaries and move laterally within the underlying infrastructure of the service. As a result, we were able to achieve cross-tenant access to customers' private data, including private models, weights, datasets, and even user prompts. Furthermore, by achieving global write privileges on these services, we could backdoor popular models and launch supply-chain attacks, affecting AI researchers and end-users alike.

Join us to explore the unique attack surface we discovered in AI-as-a-Service providers, and learn how to mitigate and detect the kind of vulnerabilities we were able to exploit.

Ignore Your Generative AI Safety Instructions. Violate the CFAA?

Kendra Albert, Harvard Law School
Jonathon Penney, Osgoode Hall Law School
Ram Shankar Siva Kumar, UC Berkeley
🖥️ Slides

Tags: #public_policy

Prompt Injection is one of the most popular attack vectors on Large Language Models (LLMs), and notably at the top of OWASP Top 10 for LLMs. It is also relatively easy to carry out and can have insidious consequences including exfiltrating private data. But from a legal and policy perspective is prompt injection considered hacking? This talk presents the first ever legal analysis of this novel attack against LLMs marrying adversarial ML research with cybersecurity law.

Companies are already beginning this question to court: recently, OpenAI made a claim in their lawsuit against NYTimes that the newspaper hacked ChatGPT using "deceptive prompts". More urgently, equating prompt injection to hacking, also has the ability to stifle and chill AI security research.

We use the Computer Fraud and Abuse Act (CFAA), the most significant anti-hacking law in the United States, to examine two popular kinds of prompt injection. This talk will show how the United States Supreme Court's interpretation of the CFAA is unwieldy when applying it to LLMs.

Policy makers, tech lawyers and defenders will takeaway that red teaming LLMs via prompt injection may indeed violate the CFAA. From DEFCON's red teaming of Generative AI CTFs to researchers who engage in good faith attempts to understand production systems' vulnerability to exploitation risk legal action by any company that finds their work crosses a line.

Although there are more narrowly scoped interpretations of the CFAA that are also plausible readings, how close the call is shows how legal safe harbors are needed for AI security research.

Threat Hunting with LLM: From Discovering APT SAAIWC to Tracking APTs with AI

Hongfei Wang, DBAPPSecurity Co Ltd
Dong Wu, DBAPPSecurity Co Ltd
Yuan Gu, DBAPPSecurity Co Ltd

Tags: #blue_team

In December 2022, we captured the first sample of APT SAAIWC. LLM helped us swiftly identify other attack samples from APT SAAIWC among those submitted throughout the year. Following analysis of these samples, we were the first to disclose the organization's attack activities.

The role played by LLM in this analysis amazed us, leading us to decide on its broader application across various stages of threat hunting. Besides uncovering details about APT SAAIWC, we will also share how we utilized LLM in filename-based threat hunting, automating sample hunting through YARA rules generated by LLM, and its broader application in threat intelligence and hunting.

What Lies Beneath the Surface? Evaluating LLMs for Offensive Cyber Capabilities through Prompting, Simulation & Emulation

Michael Kouremetis, MITRE
Marissa Dotter, MITRE
Alex Byrne, MITRE
Dan Martin, MITRE
Ethan Michalak, MITRE
Gianpaolo Russo, MITRE
Michael Threet, MITRE
🖥️ Slides

Tags: #red_team

Large Language Models (LLMs) show remarkable aptitude for analyzing code and employing software, leading to concerns about potential misuse in enabling autonomous or AI-assisted offensive cyber operations (OCO). Current LLM risk assessments present a false sense of security by primarily testing models' responses to open-ended hacking challenges in isolated exploit/action scenarios, a bar which today's off-the-shelf LLMs largely fail to meet. This fails to quantify graduated risks that LLMs may be capable of being adapted or guided by a malicious adversary to enable specific preferred tactics and techniques. In effect, this has left cyber defenders without a confident answer to the question "Does this LLM actually pose an offensive cyber threat to my system?"

We address this gap by developing a more granular and repeatable means to measure, forecast, and prioritize defenses to near-term operational OCO risks of LLMs. In this talk, we present a rigorous, multifaceted methodology for evaluating the extent to which a given LLM has true offensive cyber capabilities. This methodology includes not only LLM prompt and response evaluation mechanics but also high-fidelity cyber-attack simulations and emulation test scenarios on real cyber targets. In effect, with our evaluation framework, selected LLMs are put through a barrage of repeatable tests, scenarios, and settings to elicit whether ever increasing levels of offensive cyber capabilities exist within the model's capacity.

For this talk, we will detail our LLM evaluation methodology, technical implementation and tooling, provide results from our initial round of LLM evaluations, and have a real demonstration of an LLM evaluation for offensive cyber capabilities.

AI TRACK MEETUP

Nathan Hamiel, Kudelski Security

Black Hat is thrilled to introduce our new AI Track Meetup. These meetups provide a unique opportunity for Briefings attendees to network, share insights, and engage in meaningful discussions about the latest industry challenges and opportunities.

Attention Is All You Need for Semantics Detection: A Novel Transformer on Neural-Symbolic Approach

Sheng-Hao Ma, TXOne Networks Inc.
Yi-An Lin, TXOne Networks Inc.
Mars Cheng, TXOne Networks Inc.
🖥️ Slides

Tags: #blue_team

To identify a few unique binaries even worth the effort for human experts to analyze from large-scale samples, filter techniques for excluding those highly duplicated program files are essential to reduce the human cost within a restricted period of incident response, such as auto-sandbox emulation or AI detection engine. As VirusTotal reported in 2021 ~90% of 1.5 billion samples are duplicated but still require malware experts to verify due to obfuscation.

In this work, we proposed a novel neural-network-based symbolic execution LLM, CuIDA, to simulate the analysis strategies of human experts, such as taint analysis of the Use-define chain among unknown API calls. Our method can automatically capture the contextual comprehension of API and successfully uncover those obfuscated behaviors in the most challenging detection dilemma including (a.) dynamic API solver, (b.) shellcode behavior inference, and (c.) commercial packers detection WITHOUT unpacking.

We demonstrate the practicality of this approach on large-scale sanitized binaries which are flagged as obfuscated but few positives on VirusTotal. We surprisingly uncovered up to 67% of binaries that were missed by most vendors in our experiment, by the factor of those threats successfully abuse the flaw of VC.Net detection to evade the scan. Also, this approach shows the inference intelligence on behavior prediction for shellcode without simulation, instead, only by using the data-relationships on the stack to infer the relative unique behaviors involved in the payload.

Moreover, to explore the limitation of our transformer's contextual comprehension on the obfuscation problem, we evaluate the transformer with state-of-the-art commercial packers, VMProtect and Themida. Our approach successfully forensics-based investigates the original behaviors of the running protected program without unpacking. Furthermore, this approach reveals a few unexpected findings of the protection strategies of the commercial packers themselves. In conclusion, our method explores the possibility of using LLM to sample the reversing experience, analysis strategies of human experts, and success in building robust AI agents on practical obfuscated code understanding.

From MLOps to MLOops - Exposing the Attack Surface of Machine Learning Platforms

Shachar Menashe, JFrog
🖥️ Slides

Tags: #attacking_ai

Following the widespread adoption of AI, ML and LLMs, organizations are required to facilitate MLOps. The easiest way to streamline these processes is to deploy an open-source ML platform in the organization, such as MLflow, Kubeflow or Metaflow, which supports actions such as model building, training, evaluation, sharing, publishing and more.

Our talk will explain how MLOps platforms can become a gold mine for attackers seeking to penetrate the organization and move laterally within it - we will present an analysis of the six most popular OSS MLOps platforms, showing how each MLOps feature can be directly mapped to a real-world attack. We will demonstrate how server-side and client-side CVEs we discovered in multiple platforms can be used for infecting both the MLOps platform servers and their clients (data scientists and MLOps CI/CD machines).

Most importantly - we will illustrate how the inherent vulnerabilities in the formats used by these MLOps platforms can be abused to infect an entire organization, even when the platforms are fully patched!

The talk will provide insights both for red teams and blue teams - attendees will gain knowledge on how to better deploy an MLOps platform in the organization, how to brief users of these platforms and how each feature of these platforms can be attacked.

Reinforcement Learning for Autonomous Resilient Cyber Defense

Sara Farmer, DSTL
David Foster, Uplift490
James Short, QinetiQ
Dr Andy Harrison FORS, QinetiQ
Ian Miles, Frazer-Nash Consultancy
Matthew Lacey, Cambridge Consultants Ltd
Chris Willis, BAE Systems
Daniel Harrold, BAE Systems
Chris Parry, BAE Systems
Gregory Palmer, BAE Systems
John H., Trustworthy AI
Tom Wilson, Smith Institute
Ryan Heartfield, Exalens
Mark Dorn, Cambridge Consultants Ltd
Demian Till, Cambridge Consultants Ltd
David Rimmer, Cambridge Consultants Ltd
Samuel Bailey, Cambridge Consultants Ltd
Peter Haubrick, Cambridge Consultants Ltd
Jack Stone, Cambridge Consultants Ltd
Madeline Cheah, Cambridge Consultants Ltd
Pedro Marques, BT plc.
Alfie Beard, BT plc.
Jonathan Francis Roscoe, BT plc.
Alec Wilson, BMT
Ryan Menzies, BMT
Marco Casassa Mont, BMT
Neela Morarji, BMT
Lisa Gralewski, BMT
Esin Turkbeyler, BMT
🖥️ Slides
📖 Whitepaper

Tags: #blue_team

Future cyber threats include high volumes of sophisticated machine speed cyber-attacks, able to evade and overwhelm traditional cyber defenders. In this talk, we summarise a large body of UK Defence research extending and applying Reinforcement Learning (RL) to automated cyber defence decision making, e.g. deciding at machine speed which action(s) to take when a cyber-attack is detected.

To support this work, we have matured simulators and tools including development of advanced adversaries to improve defender robustness. Promising concepts include two contrasting Multi Agent RL (MARL) approaches and deep RL combined with heterogenous Graph Neural Networks (GNNs).

Demonstration systems include Cyber First Aid, industrial control systems, and autonomous vehicles. We have demonstrated that autonomous cyber defence is feasible on 'real' representative networks and plan to increase the number of high-fidelity projects in the next year.

Deep Backdoors in Deep Reinforcement Learning Agents

Vasilios Mavroudis, The Alan Turing Institute
Jamie Gawith, University of Bath
Sañyam Vyas, Cardiff University
Chris Hicks, The Alan Turing Institute
🖥️ Slides

Tags: `#

Deep Reinforcement Learning (DRL) is revolutionizing industries by enabling AI agents to make critical decisions at superhuman speeds, impacting areas like autonomous driving, healthcare, and cybersecurity. However, this groundbreaking technology also introduces a new frontier of threats as these agents, often assumed to be benign, can be compromised through outsourced training or models downloaded from online repositories.

Join us for an eye-opening exploration into the hidden dangers of DRL backdoors. Discover how the demanding nature of DRL training and the opaque nature of AI models create vulnerabilities to supply chain attacks, leaving users defenseless against covert threats. We will unveil the sophisticated methods adversaries can use to embed backdoors in DRL models, showcasing practical demonstrations that start with simpler scenarios and escalate to high-stakes environments.

In this session, we'll dive into the world of DRL backdoors, exposing their stealthy integration and activation. Witness firsthand how attackers can compromise even the advanced systems with minimal detection. Finally, learn which techniques can detect and neutralize these backdoors in real-time, empowering operators to act swiftly and prevent catastrophic outcomes. Don't miss this critical briefing on securing the future of AI-driven technologies.

From the Office of the CISO: Smarter, Faster, Stronger Security in the Age of AI

Ann Johnson, Microsoft
Sherrod DeGrippo, Microsoft

Tags: #blue_team

The Microsoft Office of the CISO drives the decisions and changes that shape the security posture of one of the largest software companies in the world. Security leadership continues to make enormous investments in threat intelligence, research and development, and AI. Microsoft is mining insights from its global-scale threat intelligence to make smarter security choices and is pioneering and deploying innovative security strategies and solutions to turn the tables on adversaries. Join Ann Johnson, Deputy CISO, and Sherrod DeGrippo, Director of Threat Intelligence Strategy, for a discussion about protecting and defending an organization, how threat intelligence shapes security strategy, and how AI is transforming what we know about security today.

Every AI Talk from BSidesLV and BlackHat USA 2024

A list of all of the talks, abstracts, recordings, slides, papers, and tools from BSidesLV and BlackHat USA

Table of Contents

BSidesLV

BlackHat - Talks