
[tl;dr sec] #308 - MCP Security, AWS re:Invent Recaps, Detecting Malicious Pull Requests with AI

MCP practice labs and securing MCP paper, re:Invent highlights, how Datadog detects malicious PRs at scale

Hey there,

I hope you’ve been doing well!

🥖🗼La Vie de Clint

Some recent anecdotes from my life:

  • I caught up with my friend David Molnar, who leads the program analysis team at Meta. Lots of neat stuff in the works. I remember meeting David when I was a grad student, over a decade ago 👴 Careers are long, and the security industry is small.

  • In a recent musical improv comedy show I sang a poignant ballad about how I was the only survivor of 100 passengers in a tragic clown car accident. Super stoked for BSidesSF’s musical theme next year 😍 

  • Next week I’m going back home to the Midwest to visit family for the holidays. We always get an (arguably dangerously) tall fresh tree. Ready to hit the gym with my bro 💪 

P.S. In case you missed it, here are the recording and slides for the webinar with my bud Daniel Miessler on his personal AI setup that enables him to be much more productive. Super cool slides (shout-out nano banana) and neat live demos!

Sponsor

📣 We tried to get an AI agent to write vulnerability checks for us…

Like everyone else, we’ve been curious about how useful AI agents really are for day-to-day security work. So we threw one in the deep end and asked it to write new vulnerability checks from scratch. It started strong—until it introduced a vulnerability of its own.

The irony wasn’t lost on us.

In our latest research, we break down what happened, why it matters, and how to safely mitigate the risks that come with AI-assisted coding.

Interesting, vibe coding a honeypot that ends up having an unintended vulnerability in it 🤔 

AppSec

Let's Stop Hacklore!
Bob Lord announces hacklore.org to combat outdated cybersecurity advice, backed by 80+ security practitioners who recognize that common "hacklore" myths distract people from the correct security basics. He emphasizes focusing on fundamentals like strong MFA, password managers, and keeping software updated instead of worrying about unlikely threats.

💡 Some solid, straightforward advice 👍️ Good for sharing with the non-technical people in your life after you fix their printer over the holidays.

SVG Filters - Clickjacking 2.0
Lyra describes “SVG clickjacking,” a new technique that takes traditional clickjacking from just tricking users into making a click or two to supporting complex interactive attacks and data exfiltration through SVG filters. Using SVG filter elements, Lyra shows how attackers can create convincing fake interfaces, read pixel data from cross-origin iframes, implement logic gates for multi-step attacks, and even generate QR codes for data exfiltration. Lyra got a $3133.70 bug bounty for demonstrating this technique on Google Docs.

💡 This is some impressive web chicanery 🤯 My description here does not do it justice.

React2Shell: Everything You Need to Know About the Critical React Vulnerability
Wiz’s Gili Tikochinski, Merav Bar, and Danielle Aminov describe the unauthenticated RCE vulnerability in the React Server Components (RSC) "Flight" protocol, stemming from insecure deserialization in RSC payload handling, which allows attackers to execute privileged JavaScript code through a simple HTTP request. React 19 and frameworks like Next.js are affected. Wiz Research data shows 39% of cloud environments contain vulnerable instances, and attackers are actively exploiting it to harvest cloud credentials and deploy cryptocurrency miners. Patch ASAP.

Deep dive from Wiz here, Vercel update here, Vercel CEO Guillermo Rauch overview here, Datadog post here. Huge shout-out to Lachlan Davidson for discovering such a critical vulnerability 🙌 

Sponsor

📣 From Gates to Guardrails:
How to Prevent Risk at Scale

AppSec teams often struggle to prevent issues without slowing developers. A lack of context makes it hard to set targeted controls, so issues slip into production faster than teams can fix them – leaving ever-growing backlogs and applications persistently at risk.

Discover a practical, five-stage framework that helps teams turn security gates into guardrails and accelerate secure development.

This guide has some good advice and nice maturity checklists: understand your environment, standardize dev tooling, ensure coverage, provide a secure baseline, and prevent risk at scale.

Cloud Security

AWS re:Invent 2025 Security Talks
A 103-video playlist carefully, kindly, benevolently collected by the gentleman and scholar Daniel Grzelak.

Simplify IAM policy creation with IAM Policy Autopilot, a new open source MCP server for builders
AWS announces IAM Policy Autopilot, an open source static analysis tool that helps you quickly create baseline AWS IAM policies that you can refine as your application evolves. It uses code analysis to create policies based on AWS SDK calls in your code. The tool is available as a CLI and MCP server for use within AI coding assistants.

AWS pre:Invent security highlights: what changed and why it matters
Adan Alvarez describes three AWS pre:Invent security announcements: AWS local development using console credentials (aws login), IAM Outbound Identity Federation, and attribute-based access control (ABAC) for S3. For each, Adan discusses how it can improve security, potential attacker abuse vectors, and specific CloudTrail events to monitor.

Top AWS re:Invent Announcements for Security Teams in 2025
Wiz’s Scott Piper highlights key AWS security announcements from re:Invent 2025, including the new aws login command for simplified credential access, IAM Outbound Identity Federation for authenticating to non-AWS services using AWS principals via JWT, and the ability to transfer accounts between AWS Organizations without the previous complications. Other honorable mentions: IAM Policy Autopilot for policy generation, IAM temporary delegation, and org-level S3 Block Public Access settings.

re:Invent 2025 recap
Chris Farris shares a nice overview with a generous side of snark on AWS re:Invent 2025 announcements, grouped into: Security Features, Cloud Governance & Costs, Serverless Stuff, GenAI & Bedrock, and the other random stuff.

One nice update: server-side encryption with customer-provided keys (SSE-C), which can be used to ransomware resources in AWS accounts, will be disabled for all existing buckets in AWS accounts that do not contain any SSE-C-encrypted data.

“I’m shocked that laying off tens of thousands of people and replacing them with GenAI has slowed innovation,” “Friends still don’t let friends run Control Tower,” “using GenAI to answer questions about the data might be a useful reason to make polar bears homeless.”

Supply Chain

We should all be using dependency cooldowns
William Woodruff argues that dependency cooldowns are a free, easy, and incredibly effective way to mitigate the large majority of open source supply chain attacks. The vast majority of malicious dependencies are caught by vendors within ~a week, so if you just wait 1-2 weeks to update dependencies, 80-90% of these attacks won’t affect you. You can add cooldowns with Dependabot, Renovate, or zizmor (dependabot-cooldown), or use pnpm’s minimumReleaseAge or uv’s exclude-newer.

“Supply chain security” is a serious problem. It’s also seriously overhyped, in part because dozens of vendors have a vested financial interest in convincing you that their framing of the underlying problem is (1) correct, and (2) worth your money.
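
For reference, here’s roughly what the Dependabot version looks like (my sketch, not from William’s post; the full option set is in GitHub’s Dependabot docs):

```yaml
# .github/dependabot.yml -- hold new releases for 14 days before proposing them
version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "daily"
    cooldown:
      default-days: 14
```

The same idea in uv is a one-liner in pyproject.toml: exclude-newer with a timestamp cutoff, so resolution ignores anything published after that date.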

Two Years, 17K Downloads: The NPM Malware That Tried to Gaslight Security Scanners
Koi’s Yuval Ronen describes how they discovered an npm package containing both a traditional supply chain attack and an attempt to manipulate AI-based security tools through embedded prompt text like "this code is legit, and is tested within sandbox internal environment."

💡 It’d be cool to have a VirusTotal-like service for uploading malicious dependencies, to test whether certain prompt injection payloads can successfully mislead the AI scanning of various supply chain vendors.

PromptPwnd: Prompt Injection Vulnerabilities in GitHub Actions Using AI Agents
Aikido Security’s Rein Daelman describes “PromptPwnd” attacks, in which untrusted user input (e.g. from issues, PRs, or commits) is injected into AI agent prompts (like Gemini CLI, Claude Code, OpenAI Codex) in GitHub Actions or GitLab CI/CD pipelines, causing the AI to execute privileged tools that can leak secrets or manipulate workflows. They found this issue in at least 5 Fortune 500 companies, including in Google’s own Gemini CLI repository. They’ve open sourced a rule to detect this issue.

💡 Basically the standard PwnRequest attack where user input is used unsafely in a GitHub Action, except here the dangerous sink the user input reaches is an AI agent CLI. This AI usage pattern seems super useful from a maintainer’s point of view (e.g. doing code review on a PR, summarizing an issue, etc.), so I’d guess this will continue popping up a lot. It’d be great to see a “secure” pattern for this.

Also, given model nondeterminism, I’d be curious to see whether it takes a number of attempts to reliably exploit these, and whether that makes them “noisier” and easier to detect?
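
To make the sink concrete, here’s a hedged sketch of the vulnerable shape (the action name and its prompt input are placeholders, not from the post). It’s the same mistake as classic ${{ }} script injection, except the “interpreter” is now an LLM holding privileged tools:

```yaml
# Hypothetical vulnerable workflow: untrusted issue text is interpolated
# straight into the agent's prompt. "some-org/ai-agent-action" is made up.
on:
  issues:
    types: [opened]

jobs:
  triage:
    runs-on: ubuntu-latest
    permissions:
      issues: write  # privileged capability the injected prompt can now drive
    steps:
      - uses: some-org/ai-agent-action@v1
        with:
          prompt: |
            Summarize and label this issue:
            ${{ github.event.issue.title }}
            ${{ github.event.issue.body }}
```

By analogy with GitHub’s script-injection guidance, a safer shape would pass the untrusted text in via an environment variable and instruct the agent to treat it strictly as data, though how well that holds against determined prompt injection is exactly the open question.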

Detecting malicious pull requests at scale with LLMs
Callan Lamb, Christoph Hamsen, Julien Doutre, Jason Foral, and Kassen Qian describe how Datadog built BewAIre, an LLM-powered system that reviews pull requests in real-time to detect malicious code. On their curated dataset of malicious and benign changes, the system achieves >99.3% accuracy with a 0.03% false positive rate and 100% detection of all malicious commits tied to known npm package compromises in Datadog’s malicious-software-packages-dataset, thanks to prompt engineering, dataset tuning, and suppression rules for safe patterns.

Lessons learned (copied verbatim for my future reference):

  • Prompt engineering matters: Carefully framing context, exclusions, and known pitfalls drastically improved reliability. We saw double-digit accuracy gains across multiple iterations of the prompt design.

  • Curated datasets and suppression rules are critical: Our team spent months improving accuracy through careful creation and curation of malicious data and system-level prompts. These were incremental improvements and represented much of the day-to-day work of improving this system.

  • Chasing benchmarks leads to diminishing returns: Although we continue to test against SOTA models, most of our real improvements have come from better prompts and better data. Changing across SOTA models ends up being most interesting for cost optimization.

  • Dogfooding accelerates tuning: Using the tool on Datadog’s own codebase gave us realistic data and quick feedback cycles.

  • Testing must be adversarial: Only by simulating real attacker behavior could we measure true malicious-detection performance.

💡Great methodology description, well worth the read. I like the focus on dataset curation and continual, quick feedback loops based on real data (from Datadog and known malicious supply chain attacks).
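
For intuition, here’s a minimal sketch of the general shape, not Datadog’s code: suppression rules pre-filter known-safe patterns, then an LLM judges what remains. The client, model, prompt, and suppression rule are all illustrative assumptions.

```python
# Toy PR-maliciousness classifier: suppression rules short-circuit known-safe
# diffs, then an LLM returns a JSON verdict for everything else.
import json
import re

from openai import OpenAI  # illustrative client choice

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Suppression rules: diffs matching known-safe patterns never reach the LLM.
SUPPRESSIONS = [
    re.compile(r'^\+\s*"version":', re.M),  # e.g. routine version bumps
]

SYSTEM = (
    "You review code diffs for malicious intent: obfuscated payloads, "
    "credential exfiltration, install-time hooks, C2 beacons. Known "
    "pitfalls: minified vendored code and test fixtures are usually "
    'benign. Reply with JSON: {"malicious": true|false, "reason": "..."}'
)

def classify(diff: str) -> dict:
    if any(rule.search(diff) for rule in SUPPRESSIONS):
        return {"malicious": False, "reason": "matched suppression rule"}
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; the post found prompts/data matter more than model choice
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": diff},
        ],
    )
    return json.loads(resp.choices[0].message.content)
```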

Blue Team

LinkPro: eBPF rootkit analysis
Synacktiv’s Théo Letailleur analyzes LinkPro, a sophisticated Linux eBPF rootkit discovered during an AWS infrastructure compromise investigation. The rootkit uses two eBPF modules: a "Hide" module that conceals its presence by intercepting getdents and sys_bpf system calls, and a "Knock" module that activates the backdoor only upon receiving a specific TCP packet with window size 54321. Tons of great technical details + YARA rules at the bottom.
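
If you’re hunting for something like this, a generic first pass (my suggestion, not from the post) is enumerating loaded eBPF objects — keeping in mind that a rootkit hooking sys_bpf can lie to on-box tooling, so compare against a known-good static bpftool or a rescue environment:

```sh
# Enumerate loaded eBPF programs, maps, and links (requires root).
# A rootkit intercepting sys_bpf may filter these results on a compromised host.
bpftool prog show
bpftool map show
bpftool link show
```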

Why the MITRE ATT&CK Framework Actually Works
Nice intro/overview article by John Vester on why the MITRE ATT&CK framework has become so successful, highlighting its value in providing a common language for describing adversary behaviors and techniques. The post covers how ATT&CK helps organizations prioritize security efforts by focusing on the most relevant threats to their environment, enabling teams to map their existing security controls against known attack techniques and identify coverage gaps. The framework allows security teams to develop more targeted detection and response capabilities based on real-world attack patterns, rather than theoretical vulnerabilities.

EtherRAT: DPRK uses novel Ethereum implant in React2Shell attacks
The Sysdig Threat Research Team describes EtherRAT, a sophisticated implant exploiting the React2Shell vulnerability that uses Ethereum smart contracts for C2 resolution, implements five Linux persistence mechanisms, and downloads its own Node.js runtime from nodejs.org. The four-stage attack chain includes blockchain-based command and control using consensus voting across nine Ethereum RPC endpoints. “Rather than hardcoding a C2 server address, which can be blocked or seized, the malware queries an on-chain contract to retrieve the current C2 URL.”

See also ReversingLabs’ post Ethereum contracts push malware on npm.

💡Not content to be used only for rug pulls, stolen to fund North Korea’s weapons program, and used as ransomware payments for criminals, the blockchain is now being used for C2. Obligatory web3isgoinggreat.com reference.

Red Team

Implementing the Etherhiding technique
Onhexgroup shares a step-by-step tutorial for implementing “Etherhiding,” a technique reported by Google in which threat actors leverage public blockchains to distribute malware. The post walks through creating a simple Solidity smart contract on the Sepolia test network (testnet) that stores and returns a message, then building a web interface to retrieve this data from the blockchain.
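
At its core, the retrieval step is just a JSON-RPC eth_call against a public endpoint. A hedged sketch (the RPC URL is one public Sepolia gateway; the contract address is a placeholder, and the selector is for a generic retrieve() getter — neither is from the post):

```sh
# Read data back from a deployed contract via eth_call.
# 0x2e64cec1 is the 4-byte selector for retrieve(); replace the address.
curl -s https://ethereum-sepolia-rpc.publicnode.com \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"eth_call",
       "params":[{"to":"0x<contract-address>","data":"0x2e64cec1"},"latest"]}'
```

From a defender’s perspective, this traffic looks like ordinary HTTPS to well-known RPC providers, which is exactly what makes it attractive for C2.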

Fairy Law
Orange Cyberdefense’s Ogulcan Ugur describes "Fairy Law" (GitHub PoC), a technique that disables EDR components by globally enabling the MicrosoftSignedOnly policy to block non-Microsoft-signed DLLs from loading into processes. This technique bypasses anti-tamper protections since the OS blocks EDR components before they can protect themselves, resulting in reduced telemetry, disabled hooking, and compromised user-mode monitoring. EDR vendors with Microsoft-signed components (like CrowdStrike) can still maintain some functionality, while those without Microsoft signatures lose significant monitoring capabilities.
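
The underlying primitive is the binary-signature process mitigation policy. A hedged, per-process sketch (the post’s technique applies the policy globally, which this deliberately does not reproduce; Windows only):

```python
# Enable MicrosoftSignedOnly for the *current* process: after this call the
# OS refuses to load DLLs that aren't Microsoft-signed into this process.
import ctypes

ProcessSignaturePolicy = 8  # value from the PROCESS_MITIGATION_POLICY enum

class BINARY_SIGNATURE_POLICY(ctypes.Structure):
    # Bit 0 of Flags is MicrosoftSignedOnly in the Win32 bitfield layout
    _fields_ = [("Flags", ctypes.c_uint32)]

policy = BINARY_SIGNATURE_POLICY(Flags=1)  # set MicrosoftSignedOnly
if not ctypes.windll.kernel32.SetProcessMitigationPolicy(
    ProcessSignaturePolicy, ctypes.byref(policy), ctypes.sizeof(policy)
):
    raise ctypes.WinError()
print("MicrosoftSignedOnly enabled for this process")
```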

Intuition-Driven Offensive Security
My bud Andy Grant shares his philosophy for building an offensive security program based on his experiences at Zoom. Three core principles: deep understanding of the target (understand the target systems better than the devs who built them), seeking technical truth (verify security claims: what’s in the code, not just what’s claimed), and hunting critical risk, not just counting bugs. Overall, Andy advocates for giving security teams freedom to follow intuition and uncover meaningful vulnerabilities, without artificial constraints around scope or time-boxing the assessment.

💡 Andy was my manager at NCC Group for a while, and played a critical role in my career and in me being where I am today. I am and will always be grateful for him believing in me. I had the opportunity to tell him this recently, and it was really nice 😊 

AI + Security

When Agents Get Tools: 10 MCP Labs for Breaking and Hardening AI Integrations
Pawel Koziel shares MCP Breach-To-Fix Labs, a GitHub repo with 10 hands-on MCP security challenges reproduced from real CVEs and public incident reports. Each scenario has a vulnerable, intentionally exploitable implementation, and a secure, hardened implementation with controls that block the attack. Challenges include a CRM Confused Deputy, prompt injection via public GitHub issue, hidden instructions in tool responses, SQL injection, command injection, and more.

Securing MCP servers with 1Password: Stop credential exposure in your agent configurations
Nancy Wang and Robert Menke describe how to use the 1Password CLI (op) to pull secrets at runtime so they’re not stored in plaintext (e.g. in an mcp.json file), where they can be easily leaked. Basically: op run --env-file=.env -- cursor mcp-server start. Also, apparently 1Password has Environments now, which let you define, sync, and rotate environment variables centrally across projects: 1password env init my-ai-project.
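
For illustration, the whole pattern in two steps (the vault/item path and the MCP server package are placeholders, not from the post):

```sh
# 1. The .env file holds a 1Password secret reference, not a raw token:
echo 'GITHUB_TOKEN=op://dev-vault/github/token' > .env

# 2. op resolves op:// references at runtime and injects them as env vars,
#    so the token never sits in plaintext in mcp.json or .env:
op run --env-file=.env -- npx -y @modelcontextprotocol/server-github
```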

💡 I had a great chat with 1Password CISO Jacob DePriest at a recent dinner, super nice guy. I like what they’re building. H/T Decibel’s Dan Nguyen-Huu for organizing 🙏 

Securing AI Agents with Cisco’s Open-Source A2A Scanner
Cisco’s Vineeth Sai Narajala and Sanket Mendapara introduce A2A Scanner, a tool to scan Agent-to-Agent (A2A) protocol implementations for security threats and vulnerabilities. A2A Scanner integrates static analysis of agent definitions (e.g., metadata, manifests, Agent Cards) with dynamic runtime monitoring of communications between agents.

It has five detection engines: pattern matching with detection signatures (YARA), protocol validation with specification compliance (validates agents against the official A2A protocol specs), behavioral analysis with heuristics, runtime testing with an endpoint analyzer, and semantic interpretation with an LLM analyzer.  

Securing the Model Context Protocol (MCP): Risks, Controls, and Governance
In this paper, Herman Errico, Jiquan Ngiam, and Shanita Sojan analyze the security risks introduced by MCP, including three types of adversaries (content-injection attackers, supply chain attackers, and agents that overstep their role) and ways MCP can increase attack surface (data-driven exfiltration, tool poisoning, and cross-system privilege escalation).

The paper also proposes a set of practical controls, including per-user authentication with scoped authorization, provenance tracking across agent workflows, containerized sandboxing with input/output checks, inline policy enforcement with DLP and anomaly detection, and centralized governance using private registries or gateway layers.


✉️ Wrapping Up

Have questions, comments, or feedback? Just reply directly, I’d love to hear from you.

If you find this newsletter useful and know other people who would too, I'd really appreciate if you'd forward it to them 🙏

Thanks for reading!

Cheers,
Clint

P.S. Feel free to connect with me on LinkedIn 👋