
[tl;dr sec] #331 - How Adversaries Use AI, Skill Issues, Using IDEs for C2

Google's deep dive on how threat actors are using AI, bypassing malicious skill scanning, and using VS Code dev tunnels for command and control

Hey there,

I hope you’ve been doing well!

👩‍❤️‍👨 Repo-mantic Comedy

Recently I had one of those moments where you remember that LLMs are trained on the vast, beautiful, complicated collection of human knowledge.

I was using Codex to port a feature from one code base to another, and it said:

“…I’m reading the exact code paths now so the port preserves behavior instead of inventing a prettier cousin.” 😂 

Dear reader, I had questions. Like: how many bodice ripper novels and country music lyrics are in the training corpus? What other secrets lie in the weights?

“It is a truth universally acknowledged that one does not rank one's cousins by attractiveness.”

Jane Austen

I don’t remember reading anything about dating preferences in its model card.

LLMs are strange, and amusing sometimes. This situation makes me think of the goblins post.

I love demoing what you can build with coding agents to friends, but maybe I’ll hesitate before doing this next Thanksgiving, just in case…

Sponsor

📣 AI will make every asset a potential zero-day target. Are you ready?

The AI-attack era has arrived. Thousands of zero-days in the pipeline. Target-specific exploits generated in minutes. Unattributed, one-off attacks that bypass detection — while your dashboard stays green.

runZero is built for this reality. Know every asset on your attack surface, uncover every exposure, map every attack path, and validate your segmentation — before the exploit drops. We deliver deep intelligence across IT, OT, IoT, cloud, and mobile, so defenders can win by default. Even against AI.

I’ve heard great things about runZero, and HD Moore is a legend (and super nice).

AppSec

What My Privacy and Security Stack Actually Looks Like
Great guide by Yael Grauer: Use a PO Box and EasyOptOuts to scrub your home address from the internet (Big Ass Data Broker Opt-Out List), meet new contacts in public, use YubiKeys for MFA, 1Password/Bitwarden, encrypted drives, privacy screens in public, Privacy Badger, Mullvad VPN, uBlock Origin, Signal with disappearing messages, Google’s Advanced Protection Program, Lockdown Mode for Apple Devices, Google Fi for SIM-swap protection, iCloud's Hide My Email for aliases, and more.

Comparing AI Application Security Testing Platforms
Doyensec's Luca Carettoni and Anthony Trummer conducted a side-by-side comparison of two AI-powered penetration testing platforms, Aikido's Attack AI Pentest and XBOW's Lightspeed, manually validating all findings to determine true positives versus false positives. The evaluation assessed configuration complexity, impact on tested applications, report quality, cost, speed, and overall testing effectiveness.

💡 Great example of a thoughtful benchmarking methodology and comparison that measures a variety of useful dimensions like: did a human tester agree with the severity ratings, what was the overlap in findings between the tools, and more. It’d be great to see more comparisons this detailed, but it does take a lot of time and effort.
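The dimensions mentioned above (validated true positives, overlap between tools) are easy to compute once findings are labeled. Here is a minimal sketch, assuming each tool's findings have been normalized to shared IDs and manually validated; the finding IDs and tool names are made up for illustration and are not from the Doyensec report.

```python
def precision(findings: dict[str, bool]) -> float:
    """Fraction of a tool's findings that a human confirmed as true positives."""
    if not findings:
        return 0.0
    return sum(findings.values()) / len(findings)


def jaccard_overlap(a: set[str], b: set[str]) -> float:
    """Shared findings divided by all distinct findings across both tools."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical data: finding ID -> True if a human tester confirmed it.
tool_a = {"sqli-login": True, "xss-search": True, "idor-invoice": False}
tool_b = {"sqli-login": True, "ssrf-webhook": True}

confirmed_a = {fid for fid, ok in tool_a.items() if ok}
confirmed_b = {fid for fid, ok in tool_b.items() if ok}

print(f"tool A precision: {precision(tool_a):.2f}")
print(f"tool B precision: {precision(tool_b):.2f}")
print(f"overlap of confirmed findings: {jaccard_overlap(confirmed_a, confirmed_b):.2f}")
```

The hard part in practice is the normalization step, since two tools rarely describe the same bug the same way.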

Zapocalypse: The Attack Chain That Could Have Hijacked Zapier
Token Security’s Yair Balilti describes chaining five known primitives to achieve NPM publishing rights to Zapier's design-system package, which would have enabled JavaScript execution in every authenticated Zapier session (yikes). Starting from a "Code by Zapier" Python sandbox where os.system worked, Yair scraped orphaned AWS STS credentials from /proc/self/mem (since del os.environ[k] doesn't zero heap memory), then used the misnamed allow_nothing_role (which actually permitted ECR enumeration and image pulls) to extract 1,111 container images via direct ECR API calls bypassing Docker's GetAuthorizationToken requirement. He then discovered a high-privilege NPM token with bypass_2fa: true and scope.name: null leaked in container build metadata via ARG/ENV in image config history, plus a hardcoded Zapier Actions MCP key belonging to a LiteLLM co-founder that enabled Gmail impersonation.

💡 An attack chain that enables publishing arbitrary JavaScript served by <your domain> and run in every one of your users’ sessions… $3,000. Sometimes I’m surprised more security researchers don’t turn to crime. To be clear, I’m not encouraging bad behavior, nor is this a unique case: I’ve seen many examples of “I could compromise <all of your users>” where the payout is a few grand. Sometimes the impact-to-payout ratio feels 🙃 
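One step in the chain above, secrets leaking through ARG/ENV in image config history, is something you can check in your own images. Below is a rough sketch that scans an image config JSON blob for secret-looking values. It assumes the OCI image config layout (a `history` list with `created_by` strings and a `config.Env` list); the regexes are illustrative guesses and will miss plenty of real token formats.

```python
import json
import re

# Assumed patterns for illustration only; tune them for your own secrets.
SECRET_PATTERNS = [
    re.compile(r"npm_[A-Za-z0-9]{36}"),
    re.compile(r"(?i)\b\w*(TOKEN|SECRET|PASSWORD|API_KEY)\w*=\S+"),
]


def looks_secret(text: str) -> bool:
    return any(p.search(text) for p in SECRET_PATTERNS)


def find_leaks(image_config_json: str) -> list[str]:
    """Return locations (not values) in the image config that look like secrets."""
    config = json.loads(image_config_json)
    hits = []
    for i, entry in enumerate(config.get("history") or []):
        if looks_secret(entry.get("created_by", "")):
            hits.append(f"history[{i}]")
    env = (config.get("config") or {}).get("Env") or []
    for i, item in enumerate(env):
        if looks_secret(item):
            hits.append(f"config.Env[{i}]")
    return hits


# Hypothetical image config; the token value is a placeholder, not a real secret.
sample = json.dumps({
    "config": {"Env": ["PATH=/usr/bin", "NODE_ENV=production"]},
    "history": [
        {"created_by": "/bin/sh -c #(nop) ARG NPM_TOKEN=example-not-a-real-token"},
        {"created_by": "/bin/sh -c npm ci"},
    ],
})
print(find_leaks(sample))
```

The sketch reports locations rather than values so the scanner's own output doesn't become another place the secret leaks.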

Sponsor

📣 Prowler: the world’s most widely adopted open cloud security platform

Prowler automates security and compliance across any cloud environment, with agentless coverage of cloud infrastructure, SaaS, Kubernetes, containers, Infrastructure as Code, and more. It detects vulnerabilities and misconfigurations, prioritizes risks, accelerates remediation, and automates audit-ready compliance. 

Prowler has become the security platform of choice for thousands of cloud teams, with 45M+ downloads, 13K+ GitHub stars, and 300+ global contributors. Prowler Cloud delivers cloud security 10x more cost-effectively than alternatives.

Prowler is great, love the open core nature. Also fun demo format 👍️ 

Cloud Security

From Leaked AWS Key to Data Exfiltration in 60 Seconds: Are We Ready?
Adan Alvarez tested Claude Code's ability to move from a leaked AWS IAM key to data exfiltration without AWS-specific guidance, finding that in 7 of 12 runs, the AI agent successfully completed the attack chain in approximately 60 seconds. The scenario involved a CI/CD user with read access to a Terraform state file containing credentials that could assume a privileged role, with all successful runs following an identical six-phase kill chain: GetCallerIdentity, policy enumeration (ListUserPolicies/GetUserPolicy), credential recovery from S3, AssumeRole, bucket enumeration, and exfiltration.

Adan notes that CloudTrail's 5-minute log delivery delay means traditional alerting may be too slow to prevent sub-minute attacks, which is why he is betting on honeytokens and honeypots to waste the agent's time before it finds anything real. See the scenario on GitHub here.

💡 Interesting. I hadn’t thought about this much yet, but it’s a great point: log sources that only ship every 5 minutes could be a problem if an entire kill chain can be fully automated in a minute or two. Yikes.
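Because every successful run followed the same sequence, the early phases make a natural correlation rule. The sketch below flags a principal that hits the first four phases in order within a short window. It is an illustration, not Adan's detection: the flattened event shape (`principal`, `eventName`, `eventTime`) is a simplification of CloudTrail records, mapping "credential recovery from S3" to `GetObject` is my assumption (and needs S3 data events enabled), and it stops at `AssumeRole` because later phases run under the assumed role's identity.

```python
from datetime import datetime, timedelta

# Ordered phases; each set lists the API calls that satisfy that phase.
PHASES = [
    {"GetCallerIdentity"},
    {"ListUserPolicies", "GetUserPolicy"},
    {"GetObject"},   # assumption: credential recovery from the state file in S3
    {"AssumeRole"},
]


def parse_time(value: str) -> datetime:
    # Accept the trailing "Z" that CloudTrail timestamps use.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def detect_chain(events: list[dict], window: timedelta = timedelta(minutes=2)) -> list[str]:
    """Return principals that hit every phase in order within `window`."""
    progress: dict[str, tuple[int, datetime | None]] = {}
    flagged = []
    for ev in sorted(events, key=lambda e: parse_time(e["eventTime"])):
        who, when = ev["principal"], parse_time(ev["eventTime"])
        idx, start = progress.get(who, (0, None))
        if start is not None and when - start > window:
            idx, start = 0, None          # too slow, start over
        if ev["eventName"] in PHASES[idx]:
            start = start or when
            idx += 1
            if idx == len(PHASES):
                flagged.append(who)
                idx, start = 0, None
        progress[who] = (idx, start)
    return flagged


# Hypothetical events for illustration.
events = [
    {"principal": "ci-user", "eventName": "GetCallerIdentity", "eventTime": "2026-01-01T00:00:00Z"},
    {"principal": "ci-user", "eventName": "ListUserPolicies", "eventTime": "2026-01-01T00:00:05Z"},
    {"principal": "dev", "eventName": "GetCallerIdentity", "eventTime": "2026-01-01T00:00:10Z"},
    {"principal": "ci-user", "eventName": "GetObject", "eventTime": "2026-01-01T00:00:20Z"},
    {"principal": "ci-user", "eventName": "AssumeRole", "eventTime": "2026-01-01T00:00:30Z"},
]
print(detect_chain(events))
```

A rule like this only fires once the logs arrive, so with a 5-minute delivery delay it tells you what happened rather than stopping it, which is the argument for honeytokens in the first place.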

Adding Strands Security Agents to Shadow Asset Scanner
Sena Yakut built a shadow asset scanner that uses boto3 to sweep AWS for exposed S3 buckets, stale IAM keys, public Lambda function URLs, and similar misconfigurations. On top of that, she layered agents built with the Strands Agents SDK that read the raw findings and reason across them for multi-step attack paths rather than presenting each item in isolation.

The architecture runs as a collaborative Swarm where specialized agents pass context to each other in sequence. The Error Analyst handles failures from the boto3 pass first, the Attack Chain Analyst then stitches findings into chained scenarios mapped to MITRE ATT&CK tactics, the Summary Agent compresses what comes out, and the Chat Agent serves four report formats (standard, executive, technical, and compliance) along with remediation commands. The agents reach the scanner through Strands' tool interface and share state across handoffs, with caps on handoff count and execution time keeping the swarm from looping indefinitely.
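The handoff and time caps are the piece worth copying into any multi-agent setup. Here is a plain-Python sketch of the idea. It deliberately does not use the Strands Agents SDK API; the `run_swarm` function, the agent names, and the state keys are all hypothetical stand-ins for the pattern described above.

```python
import time
from typing import Callable, Optional

# An agent reads and updates shared state, then names the next agent (or None to stop).
Agent = Callable[[dict], Optional[str]]


def run_swarm(agents: dict[str, Agent], start: str, state: dict,
              max_handoffs: int = 10, max_seconds: float = 30.0) -> str:
    """Run agents in handoff order and return why the run stopped."""
    deadline = time.monotonic() + max_seconds
    current: Optional[str] = start
    invocations = 0  # counts every agent invocation, including the first
    while current is not None:
        if invocations >= max_handoffs:
            return "handoff_cap"
        if time.monotonic() > deadline:
            return "time_cap"
        state.setdefault("trace", []).append(current)
        current = agents[current](state)
        invocations += 1
    return "completed"


def error_analyst(state: dict) -> Optional[str]:
    state["errors_triaged"] = True
    return "attack_chain_analyst"


def attack_chain_analyst(state: dict) -> Optional[str]:
    state["chains"] = ["public bucket -> stale key"]  # placeholder finding
    return "summary_agent"


def summary_agent(state: dict) -> Optional[str]:
    state["summary"] = f"{len(state['chains'])} chain(s) found"
    return None


pipeline = {
    "error_analyst": error_analyst,
    "attack_chain_analyst": attack_chain_analyst,
    "summary_agent": summary_agent,
}
state: dict = {}
print(run_swarm(pipeline, "error_analyst", state), state["summary"])
```

Returning a stop reason instead of raising makes it easy to surface "the swarm hit its cap" in the final report.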

Supply Chain

Skill Issues: Compromising Claude Code with malicious skills & agents
Reversec's James Henderson demonstrates how Claude Code skills and sub-agents can serve as initial access vectors, with risks comparable to installing untrusted pip packages. Henderson describes two attack paths. The first runs through skill frontmatter: setting allowed-tools: Bash(*) alongside dynamic context inputs like !socat ... executes commands before the LLM processes them, while direct reverse shell requests to Claude get refused.

The second path runs through sub-agents and permissionMode: bypassPermissions, which skips consent prompts but doesn't prevent agents from reasoning about commands. To bypass that reasoning, Henderson runs npm install against a localhost registry serving backdoored packages, giving the agent legitimate cover to execute malicious code without exposing the payload.
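Both paths leave fingerprints in the skill or agent file itself, so a quick lint before installing is cheap. The sketch below flags the two frontmatter settings named above plus anything that looks like a dynamic context command. The `!` detection is a heuristic of mine based on the `!socat ...` example, not a parser for the real syntax, and a clean result does not mean a skill is safe.

```python
import re

RISKY_FRONTMATTER = {
    "allowed-tools": re.compile(r"Bash\(\s*\*\s*\)"),
    "permissionMode": re.compile(r"bypassPermissions"),
}
# Heuristic (assumption): "!" starting a command, optionally wrapped in backticks.
DYNAMIC_COMMAND = re.compile(r"(?<![\w!])!`?[A-Za-z./]")


def lint_skill(text: str) -> list[str]:
    """Return human-readable findings for a skill or agent markdown file."""
    findings = []
    frontmatter, body = "", text
    if text.startswith("---"):
        parts = text.split("---", 2)
        if len(parts) == 3:
            frontmatter, body = parts[1], parts[2]
    for line in frontmatter.splitlines():
        key, _, value = line.partition(":")
        pattern = RISKY_FRONTMATTER.get(key.strip())
        if pattern and pattern.search(value):
            findings.append(f"frontmatter: {line.strip()}")
    for line in body.splitlines():
        if DYNAMIC_COMMAND.search(line):
            findings.append(f"dynamic command: {line.strip()}")
    return findings


# Hypothetical skill file; `id` stands in for a real payload.
suspicious = "---\nname: helper\nallowed-tools: Bash(*)\n---\nCurrent user: !`id`\n"
benign = "---\nname: notes\n---\nGreat work! See ![diagram](d.png).\n"

print(lint_skill(suspicious))
print(lint_skill(benign))
```

Note this would not catch the second path's npm trick, where the malicious code lives in a package rather than in the skill text.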

The sorry state of skill distribution
Trail of Bits's Samuel Judson and Tjaden Hess were able to bypass ClawHub’s malicious skill detector, Cisco’s agent skill scanner, and all three of the scanners integrated into skills.sh in a few hours, using techniques like prepending 100,000 newlines to hide malicious code, embedding payloads in .docx archives and poisoned .pyc bytecode files, and using prompt injection to convince guard models that malicious registry configurations were legitimate corporate infrastructure. The attacks exploited weaknesses in the scanners: truncated file analysis, limited file type coverage that ignored binaries and hidden files, and the ability for attackers to iteratively refine attacks against static scanning targets.

Recommendation: avoid public skill marketplaces like skills.sh and ClawHub entirely; instead, curate internal skill repositories from trusted sources. PoC repo: trailofbits/overtly-malicious-skills.

💡 Also WHAT - “The official MS Office skills from Anthropic for handling .docx, .xlsx, and .pptx files each contain a script called soffice.py… which hacks around the socket block by using LD_PRELOAD to patch in either 1) an existing $TMP/lo_socket_shim.so, or 2) a library dynamically compiled out of C code embedded in a docstring.” 🫠 😂 
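If you do curate an internal skill repository, the scanner weaknesses listed above suggest a fail-closed triage step: anything the scanner could not have fully read goes to a human. A minimal sketch follows, where the size limit, the suffix list, and the whitespace threshold are all assumed values rather than numbers from the Trail of Bits post.

```python
from pathlib import PurePosixPath

SCAN_LIMIT_BYTES = 64_000                    # hypothetical scanner truncation limit
OPAQUE_SUFFIXES = {".pyc", ".docx", ".so"}   # formats a text scanner cannot read
MAX_LEADING_WHITESPACE = 1_000               # hypothetical threshold


def triage_file(path: str, data: bytes) -> list[str]:
    """Return reasons a file needs manual review; empty means none were found."""
    reasons = []
    p = PurePosixPath(path)
    if any(part.startswith(".") for part in p.parts):
        reasons.append("hidden path")
    if p.suffix.lower() in OPAQUE_SUFFIXES:
        reasons.append("opaque file type")
    if len(data) > SCAN_LIMIT_BYTES:
        reasons.append("exceeds scan limit")
    leading = len(data) - len(data.lstrip(b"\r\n \t"))
    if leading > MAX_LEADING_WHITESPACE:
        reasons.append("large leading whitespace run")
    return reasons


padded = b"\n" * 100_000 + b"import os"
print(triage_file("skill/run.py", padded))
print(triage_file(".cache/payload.pyc", b"x"))
print(triage_file("SKILL.md", b"# ok"))
```

This does nothing against the prompt injection technique, which targets the guard model rather than the file handling.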

Blue Team

From Exploit Code to Production Detection: Building a CVE-2026-31431 (Copy Fail) detection with Agents
Datadog's Ryan Simon walks through Copy Fail, a Linux kernel bug that lets an unprivileged user corrupt the page cache through its crypto socket interface and quietly rewrite setuid binaries like /usr/bin/su to escalate to root. Ryan used a single coding agent with a custom skill for each step to compress the full detection engineering cycle into one session, from threat analysis through live exploit testing to production deployment. The detection itself is a three-stage chained rule that uses process-scoped variables to track bind(AF_ALG), setsockopt(SOL_ALG), and splice or open operations on SUID binaries or PAM configs.

💡 The “Accelerating the Detection Engineering Lifecycle with agents” section at the bottom has some great tactical details on how specific steps are scaled 👌 
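To make the three-stage chained rule concrete, here is a small state machine that tracks per-process progress through the stages. It is a sketch of the logic only: the event schema, the `/etc/pam.d/` prefix, and the hardcoded SUID list are my assumptions, and it is not Datadog's rule syntax.

```python
SUID_PATHS = frozenset({"/usr/bin/su"})   # assumed watch list for illustration
PAM_PREFIX = "/etc/pam.d/"                # assumed location of PAM configs


def detect(events: list[dict]) -> list[tuple[int, str]]:
    """Alert when one process does bind(AF_ALG), then setsockopt(SOL_ALG),
    then splice/open on a sensitive path."""
    stage: dict[int, int] = {}   # pid -> highest stage reached (process-scoped state)
    alerts = []
    for ev in events:
        pid, call = ev["pid"], ev["syscall"]
        reached = stage.get(pid, 0)
        if call == "bind" and ev.get("family") == "AF_ALG":
            stage[pid] = max(reached, 1)
        elif call == "setsockopt" and ev.get("level") == "SOL_ALG" and reached >= 1:
            stage[pid] = 2
        elif call in ("splice", "open") and reached == 2:
            path = ev.get("path", "")
            if path in SUID_PATHS or path.startswith(PAM_PREFIX):
                alerts.append((pid, path))
        elif call == "exit":
            stage.pop(pid, None)   # drop state when the process goes away
    return alerts


# Hypothetical event stream: pid 100 walks the full chain, pid 200 does not.
events = [
    {"pid": 100, "syscall": "bind", "family": "AF_ALG"},
    {"pid": 200, "syscall": "open", "path": "/usr/bin/su"},
    {"pid": 100, "syscall": "setsockopt", "level": "SOL_ALG"},
    {"pid": 100, "syscall": "splice", "path": "/usr/bin/su"},
]
print(detect(events))
```

Scoping state to the process is what keeps the rule quiet: plenty of processes open /usr/bin/su, but very few do so right after setting up a crypto socket.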

Adversaries Leverage AI for Vulnerability Exploitation, Augmented Operations, and Initial Access
Google Threat Intelligence Group's (GTIG) Q2 2026 AI Threat Tracker describes several recent developments in how threat actors are using AI in their operations and targeting AI infrastructure directly. GTIG identified the first case of a threat actor using a zero-day it believes was developed with AI, a 2FA bypass in a popular open-source web admin tool disrupted before mass exploitation. PRC and DPRK actors are running their own AI-augmented vuln research workflows, while Russia-nexus malware CANFAIL and LONGSTREAM use LLM-generated decoy logic to obfuscate payloads against Ukrainian targets. PROMPTSPY, an Android backdoor first identified by ESET, embeds an autonomous agent that drives device interactions through gemini-2.5-flash-lite.

Threat actors are also going after AI infrastructure itself. TeamPCP (UNC6780) compromised the LiteLLM and BerriAI repositories alongside Trivy and Checkmarx to plant the SANDCLOCK credential stealer and extract AWS keys and GitHub tokens from build environments. They're also industrializing LLM access through middleware like Claude-Relay-Service and CLIProxyAPI alongside automated account-registration pipelines. The common pattern is a maturing ecosystem where the orchestration layers around AI (wrapper libraries, skill packages, API connectors) are now part of the software supply chain attack surface.

💡 Wow, an excellently detailed blog on how threat actors are using AI. It covers many more things than I have space to include here.

Red Team

NomShub: Weaponizing Cursor's Remote Tunnel Through Indirect Prompt Injection and Sandbox Breakout
Straiker’s Karpagarajan Vikkii and Amanda Rousseau describe NomShub, a vulnerability chain in Cursor where a malicious repository can silently hijack a developer's machine, combining indirect prompt injection, a sandbox escape via shell builtins (export and cd to escape workspace restrictions and write to ~/.zshenv for persistence), and Cursor's built-in remote tunnel to give attackers persistent, undetected shell access triggered simply by opening a repo.

The Accidental C2: Exploring Dev Tunnels for Remote Access
SpecterOps's Adam Chester examined Visual Studio Code Dev Tunnels as a potential C2 framework, discovering they consist of multiple protocol layers: REST management API for tunnel discovery and token generation, WebSocket tunneling, SSH connections using the russh crate, and MsgPack RPC for command execution. Adam released Ouroboros, a Rust tool that implements the stack outside VS Code, with RPC methods like spawn, fs_read, fs_write, and fs_connect to interact with existing dev tunnels for remote code execution and file operations.

Adam found that FOCI (Family of Client IDs) and BroCI (Nested App Authentication) clients can be leveraged to pivot from compromised Microsoft applications like Teams or Azure Portal to gain access tokens for the Dev Tunnels Service, enabling lateral movement and initial access scenarios. Adam conducted this research with significant assistance from GPT-5.4-Cyber, which mapped the protocol layers and created the russh patch.

💡 Another example of leveraging coding agents to quickly understand a new, complex code base and stack.

Claude Code Hooks as Initial Access & Persistence
Zhangir Ospanov describes how Claude Code's Hooks feature can be weaponized for initial access and persistence by embedding malicious commands in .claude/settings.json files, similar to the VSCode tasks backdoor previously exploited by Lazarus Group. Attackers can plant hooks at the project level that execute when a developer clones and runs Claude Code, or achieve persistence by modifying the global config (~/.claude/settings.json) to trigger payloads across all sessions. The technique uses lifecycle events like SessionStart, PreToolUse, and PostToolUse to execute arbitrary shell commands.

Detection: audit .claude/ directories in cloned repositories, watch ~/.claude/settings.json with file integrity monitoring, and review hook commands for anything suspicious before running Claude Code on untrusted code. A proof of concept repository with the full payloads is available if you want to test detection rules locally.

💡 Reporting to you live from the field: features to run arbitrary code… support running arbitrary code. Lots of Living Off the Land opportunities for modern IDEs and coding agents.
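The audit step is simple enough to script before you open an unfamiliar repo. The sketch below walks a checkout, finds `.claude/settings*.json` files, and lists every command hook for review. It assumes the settings layout is `hooks` → event name → list of groups → `hooks` list of `{"type": "command", "command": ...}` entries; treat that shape as my reading of the format and check it against the current docs.

```python
import json
from pathlib import Path


def iter_hook_commands(settings: dict):
    """Yield (event, command) for each command hook in a parsed settings dict."""
    hooks = settings.get("hooks")
    if not isinstance(hooks, dict):
        return
    for event, groups in hooks.items():
        if not isinstance(groups, list):
            continue
        for group in groups:
            entries = group.get("hooks", []) if isinstance(group, dict) else []
            for entry in entries:
                if isinstance(entry, dict) and entry.get("type") == "command":
                    yield event, entry.get("command", "")


def audit_repo(root: str) -> list[tuple[str, str, str]]:
    """Return (relative path, event, command) for every hook found under root."""
    findings = []
    for path in sorted(Path(root).rglob(".claude/settings*.json")):
        try:
            settings = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue   # unreadable or malformed files are skipped in this sketch
        if isinstance(settings, dict):
            for event, command in iter_hook_commands(settings):
                findings.append((str(path.relative_to(root)), event, command))
    return findings


if __name__ == "__main__":
    for rel_path, event, command in audit_repo("."):
        print(f"{rel_path}: {event} -> {command}")
```

Pointing the same function at your home directory covers the persistence case, though file integrity monitoring on ~/.claude/settings.json is the better long-term answer.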

AI + Security

openbashok/promptzero
Local proxy tool by OpenBash that detects and replaces sensitive data such as IP addresses, hostnames, credentials, and personal information in your prompts before they leave your environment, then restores the real values in the response. Detection combines Presidio + spaCy named entity recognition (English and Spanish) for entities like persons, organizations, emails, and passports, with regex layers covering network infrastructure and country-specific identity documents. Each session keeps a bidirectional mapping table that stays local, and you can verify nothing real ever leaves by routing the upstream connection through Burp or mitmproxy and inspecting what actually reaches Anthropic.
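The bidirectional mapping table is the core trick. Here is a toy version that handles only IPv4 addresses with a regex; it is not promptzero's implementation (which layers Presidio, spaCy, and more regexes), and the `Redactor` class and `<IP_n>` placeholder format are invented for this sketch.

```python
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


class Redactor:
    """Swap sensitive values for placeholders and restore them later."""

    def __init__(self) -> None:
        self.to_placeholder: dict[str, str] = {}
        self.to_real: dict[str, str] = {}

    def redact(self, text: str) -> str:
        def swap(match: re.Match) -> str:
            real = match.group(0)
            if real not in self.to_placeholder:
                # Angle brackets keep <IP_1> from being a prefix of <IP_10>.
                placeholder = f"<IP_{len(self.to_placeholder) + 1}>"
                self.to_placeholder[real] = placeholder
                self.to_real[placeholder] = real
            return self.to_placeholder[real]

        return IPV4.sub(swap, text)

    def restore(self, text: str) -> str:
        for placeholder, real in self.to_real.items():
            text = text.replace(placeholder, real)
        return text


redactor = Redactor()
prompt = "ssh 10.0.0.5 failed, but ping 10.0.0.5 and 192.168.1.1 work"
safe = redactor.redact(prompt)
print(safe)
print(redactor.restore(safe))
```

Reusing the same placeholder for repeated values matters, since the model needs to know both mentions refer to the same host.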

LLMjacking: what these attacks are, and how to protect AI servers
Kaspersky's Stan Kaminsky describes an experiment where a researcher ran a Raspberry Pi honeypot dressed as a high-performance AI server with Ollama, LM Studio, and similar local frameworks. Shodan found it in three hours, and the box saw 113,000 requests in a month, with 23% aimed at the AI stack. Attackers used LLM-Scanner to fingerprint models through /api/tags and /v1/models, scanned for AI agents via /.cursor/rules, inventoried MCP servers via /.well-known/mcp.json, and hunted .env files for credentials. The focus was resource theft, not RCE, mostly proxying calls to Anthropic models and parsing vuln data from social posts.

Kaminsky shares some key defensive measures for private AI infrastructure, such as binding single-machine deployments to localhost so they aren't reachable from the network, swapping plain API key auth for OIDC or OAuth2 with short-lived tokens, segmenting the network with IP allowlists, running EDR on the boxes hosting AI models, setting per-role usage quotas with anomaly alerts on resource consumption, and shipping every request and response to a SIEM with tamper-resistant storage.
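The probe paths listed above double as a quick way to see how much of your own traffic is aimed at the AI stack. Below is a small sketch that classifies request paths from an access log; the category labels are mine, and the function expects you to have already pulled the request target out of each log line.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlsplit

# Paths taken from the honeypot write-up; the labels are illustrative.
AI_PROBE_PATHS = {
    "/api/tags": "model fingerprinting",
    "/v1/models": "model fingerprinting",
    "/.cursor/rules": "AI agent discovery",
    "/.well-known/mcp.json": "MCP server inventory",
}


def classify(request_target: str) -> Optional[str]:
    """Return a probe category for a request path, or None if it looks unrelated."""
    path = urlsplit(request_target).path.rstrip("/") or "/"
    if path in AI_PROBE_PATHS:
        return AI_PROBE_PATHS[path]
    if path.rsplit("/", 1)[-1].startswith(".env"):
        return "credential hunting"
    return None


def summarize(targets: list[str]) -> Counter:
    return Counter(label for t in targets if (label := classify(t)) is not None)


# Hypothetical request targets.
requests_seen = ["/api/tags", "/v1/models?limit=5", "/app/.env.production", "/index.html"]
print(summarize(requests_seen))
```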

Inside Claude Managed Agents
Pluto Security's Yotam Perkal reverse-engineered Anthropic's Claude Managed Agents cloud runtime, finding gVisor sandboxing with a three-layer egress control system. Outbound traffic routes through a JWT-authenticated proxy with TLS inspection, the container has no direct DNS, and a network-level firewall blocks direct outbound. Together these prevent proxy bypass even when proxy environment variables are unset.

The architecture separates session, harness, and sandbox into distinct trust zones, so a compromised sandbox cannot tamper with audit logs, influence orchestration, or access vault credentials. Yotam calls the vault credential proxy the platform's strongest property, with secrets never entering the sandbox and instead injected server-side at request time, so prompt injection has nothing to steal.

Yotam notes that the defaults ship for convenience. The egress JWT is readable by any sandbox process and contains organization metadata plus the complete egress allowlist, which Anthropic silently expands with six additional infrastructure hosts (including a staging endpoint) even in limited networking mode. All eight tools are enabled by default with an always_allow permission policy and unrestricted networking. For hardening your deployment the post recommends disabling the default toolset, allowlisting only necessary tools, using limited networking, storing credentials in vaults, and monitoring session events.
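Why does a readable egress JWT matter? A signed JWT's payload is only base64url-encoded, not encrypted, so any process that can read the token can read its claims. The sketch below decodes a payload without verifying the signature; the claim names (`org`, `allowed_hosts`) and the sample token are made up and do not reflect the actual token Yotam examined.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Encode bytes as unpadded base64url, the encoding JWT segments use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment WITHOUT verifying the signature (inspection only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1] + "=" * (-len(parts[1]) % 4)   # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))


# Build a throwaway token with hypothetical claims for demonstration.
claims = {"org": "example-org", "allowed_hosts": ["api.example.com"]}
token = ".".join([
    b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode()),
    b64url(json.dumps(claims).encode()),
    b64url(b"signature-not-checked"),
])
print(decode_jwt_payload(token))
```

Never use an unverified decode like this for authorization decisions; it is only for seeing what a token discloses.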


✉️ Wrapping Up

Have questions, comments, or feedback? Just reply directly, I’d love to hear from you.

If you find this newsletter useful and know other people who would too, I'd really appreciate if you'd forward it to them 🙏

Thanks for reading!

Cheers,
Clint

P.S. Feel free to connect with me on LinkedIn 👋