[tl;dr sec] #295 - AI Code Analysis, AWS Detection Engineering, Anthropic Threat Intel Report
Using AI to find vulnerabilities in code, mastering AWS logs for detection engineering, how threat actors are misusing Claude (#4 will surprise you)
Hey there,
I hope you’ve been doing well!
🥇 Gonna be Golden
This past weekend I watched KPop Demon Hunters, and it was great 🥰
It’s been getting a lot of buzz, and I think it’s well deserved.
I won’t spoil the plot if you haven’t seen it yet (spoiler warning: the linked songs will though), but what stuck out to me is that even though the setting is fantastical, the core themes are very human and universal:
Hiding a part of yourself even from those closest to you, for fear of being judged, and feeling shame about it. Especially if it makes you “different.”
See also: Elsa in Frozen.
Being seen / showing yourself, “no more hiding.” (Golden)
The healing power of vulnerability and sharing past hurts in relationships. (Free)
A group of friends, each with their own “deals,” who ultimately love each other because of, not just in spite of, these quirks.
“I broke into a million pieces and I can’t go back. But now I’m seeing all the beauty in the broken glass. The scars are part of me, darkness and harmony, my voice without the lies, this is what it sounds like.” (What It Sounds Like)
There’s something beautiful about telling stories that connect people. I’d love to do that one day.
(P.S. This intro was not sponsored by Netflix, but… it could be 🤙)
Sponsor
📣 Turn Your AI Policy Into Something People Actually Read
Most AI policies are written by legal, ignored by employees, and forgotten until something breaks. AI Policy Studio is a free tool to help you write something better. It’s practical, usable, and aligned with how people actually work. Built by the team at Harmonic Security, it’s structured, fast, and no login required.
👉 Launch Tool 👈
I gave this a spin, and it only took me like 2min to go through. The questions it asked were useful framing (and referenced some legal/compliance stuff I forgot to consider), and the policy it outputted looks useful 👍️ Nice!
AppSec
Phrack #72
The new Phrack that came out during DEF CON. I’ve linked to individual articles but not the whole thing yet.
tristanlatr/burpa
Burp Automator - A Burp Suite Automation Tool. It provides high-level CLI and Python interfaces to the Burp Suite scanner and can be used to set up Dynamic Application Security Testing (DAST).
The New Commandments of Security Teams
Post by my bud Maya Kaczorowski (who recently-ish cofounded Oblique, a self-serve IGA solution) in which she outlines nine principles that successful security teams have adopted to transform from "the department of no" into strategic business partners and revenue enablers. She emphasizes focusing on fixing classes of problems through engineering approaches rather than addressing individual issues, prioritizing usability in security solutions to prevent workarounds, and measuring actual risk reduction instead of just identifying vulnerabilities.
Maya advocates for security teams to delegate authority within guardrails, teach tradeoffs rather than rigid rules, and recognize that corporate security (including device management and identity) is critical to an organization's overall security posture.
“Instead of being the team that finds problems, become the team that solves them. Instead of just writing policies, build systems where the secure choice is the easy choice. Instead of reviewing every decision, give teams the tools and context to make good decisions themselves.
This is security finally operating like other mature business functions. The security teams making this transition are becoming indispensable business partners. The ones that aren’t are still explaining why security matters and will keep wondering why they're always fighting for resources and relevance.”
Sponsor
📣 Hear From the RSA’25 Sandbox Winner: The Future of Vulnerability Management
If you’re tired of drowning in scanner noise, you’re not alone. First-generation tools like Qualys and Tenable were built 20 years ago for a static internet and bury teams in false positives instead of telling you what is actually exploitable. Only about 6% of CVEs are ever exploited, yet attackers weaponize them in hours. ProjectDiscovery, the 2025 RSA Sandbox winner, flips the script with runtime validation, community-driven detections within hours, and auto-scans that prove what’s exploitable now. Less noise, faster action, less developer waste, and actual risk reduction.
ProjectDiscovery builds and shares so many awesome tools (e.g. nuclei), and they seem to be a favorite company of a number of my sharp friends 🤘
Cloud Security
NetSPI/ATEAM
By NetSPI’s Karl Fosaaen and Thomas Elling: ATEAM (Azure Tenant Enumeration and Attribution Module) is a Python reconnaissance tool that discovers Azure services and attributes tenant ownership information by testing resource names against six different Azure services.
💡 I pity the fool who doesn’t secure their Azure environment ✊
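If you’re curious how this kind of name-based discovery works under the hood, here’s a rough Python sketch of the general idea (my illustration, not ATEAM’s code). The candidate names and the handful of service DNS suffixes are placeholders; ATEAM checks six services and also does tenant attribution:

```python
import socket

# Hypothetical candidate names and an illustrative subset of Azure service
# DNS suffixes -- a resolving name means the resource name is in use.
CANDIDATES = ["acme", "acme-prod", "acmecorp"]
SERVICE_SUFFIXES = {
    "Storage (blob)": "blob.core.windows.net",
    "App Service": "azurewebsites.net",
    "Key Vault": "vault.azure.net",
}

for name in CANDIDATES:
    for service, suffix in SERVICE_SUFFIXES.items():
        fqdn = f"{name}.{suffix}"
        try:
            socket.gethostbyname(fqdn)  # resolves -> the name exists for this service
            print(f"[+] {service}: {fqdn} exists")
        except socket.gaierror:
            pass  # NXDOMAIN -> no resource with this name for this service
```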
AWS Detection Engineering: Mastering Log Sources for Threat Detection
Muh. Fani Akbar describes how their detection engineering team rebuilt their AWS security monitoring after a simulated attacker remained undetected for two weeks. The post discusses evaluating different AWS log sources (rec: prioritize CloudTrail and VPC flow logs), building detections on top of them, implementing real-time detection and behavioral analysis rules, automating response (e.g. disabling a user account and revoking their sessions; see the sketch below), and more.
Recommendations: context is everything (the most effective detections combine multiple data sources and enrich events with business context, user behavior baselines, and threat intel), and the most sophisticated attacks span multiple AWS services, so correlating relationships across CloudTrail events, VPC Flow Logs, and GuardDuty findings can reveal attack patterns that individual log sources would miss.
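To make the automated response idea concrete, here’s a minimal boto3 sketch (my illustration, not the author’s code) that deactivates a user’s access keys and “revokes” their existing sessions with an inline deny policy keyed on aws:TokenIssueTime:

```python
import json
from datetime import datetime, timezone

import boto3

iam = boto3.client("iam")

def contain_user(user_name: str) -> None:
    # 1. Deactivate all of the user's long-lived access keys.
    for key in iam.list_access_keys(UserName=user_name)["AccessKeyMetadata"]:
        iam.update_access_key(
            UserName=user_name,
            AccessKeyId=key["AccessKeyId"],
            Status="Inactive",
        )

    # 2. "Revoke sessions": deny everything for credentials issued before now,
    #    so any temporary credentials the attacker already holds stop working.
    cutoff = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    revoke_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {"DateLessThan": {"aws:TokenIssueTime": cutoff}},
        }],
    }
    iam.put_user_policy(
        UserName=user_name,
        PolicyName="DenyTokensIssuedBeforeContainment",
        PolicyDocument=json.dumps(revoke_policy),
    )
```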
AWS Detection Engineering - Architecting Security Logging at Scale in AWS
Muh. Fani Akbar shares a comprehensive AWS security logging architecture based on lessons from a fintech breach where attackers maintained a 127-day presence due to inadequate logging. The post walks through a four-layer approach (log collectors, aggregators, brokers, and alerting) using tools like CloudWatch Agent, Kinesis Data Firehose, Amazon Managed Streaming for Apache Kafka (MSK), and SNS, with intelligent log sampling and MITRE ATT&CK-mapped detection rules for threats like credential dumping, lateral movement, and data exfiltration.
The architecture emphasizes collecting the right data rather than everything, enriching logs with context, automating verification, and continuously testing logging systems to prevent visibility gaps that attackers exploit.
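For a taste of what the collector-to-aggregator hop can look like, here’s an illustrative boto3 sketch (not from the post; the log group, ARNs, and filter pattern are placeholders) that subscribes a CloudWatch Logs group to a Kinesis Data Firehose delivery stream, in the spirit of collecting the right data rather than everything:

```python
import boto3

logs = boto3.client("logs")

logs.put_subscription_filter(
    logGroupName="/example/app/security",  # placeholder log group
    filterName="ship-to-firehose",
    # Forward only events worth aggregating, using CloudWatch Logs
    # JSON filter pattern syntax.
    filterPattern='{ $.eventName = "ConsoleLogin" || $.errorCode = "AccessDenied*" }',
    destinationArn="arn:aws:firehose:us-east-1:111122223333:deliverystream/security-logs",
    roleArn="arn:aws:iam::111122223333:role/CWLtoFirehoseRole",
)
```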
AWS CDK and SaaS Provider Takeover
Very clever attack described by Ryan Gerstenkorn on how SaaS platforms using AWS's AssumeRole to access customer-provided roles can be vulnerable to account takeover when CDK bootstrap roles exist in the platform’s AWS account. For SaaS platforms that allow customers to specify role ARNs during onboarding, if an attacker specifies the SaaS provider's own CDK roles (which trust the account's root principal without ExternalID protection), the platform will successfully assume its own internal role, granting the attacker access to the provider's AWS account.
Ryan recommends fixing this vulnerability by adding an explicit Deny statement to the SaaS proxy role's identity policy that prevents same-account role assumption using aws:ResourceAccount and aws:PrincipalAccount condition keys.
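Here’s a hedged sketch of what that mitigation can look like (my illustration, not necessarily Ryan’s exact policy; the role and policy names are placeholders). The Deny fires whenever the target role’s account matches the calling principal’s account:

```python
import json

import boto3

iam = boto3.client("iam")

deny_same_account = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenySameAccountAssumeRole",
        "Effect": "Deny",
        "Action": "sts:AssumeRole",
        "Resource": "*",
        "Condition": {
            # Deny when the target role lives in the same account as the caller.
            "StringEquals": {"aws:ResourceAccount": "${aws:PrincipalAccount}"}
        },
    }],
}

iam.put_role_policy(
    RoleName="customer-access-proxy-role",  # placeholder SaaS proxy role
    PolicyName="DenySameAccountAssumeRole",
    PolicyDocument=json.dumps(deny_same_account),
)
```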
Blue Team
Quicklinks
Hackers have threatened to leak Google databases unless Google fires two employees on their Threat Intelligence Group team. Being called out by name by threat actors seems like “career achievement unlocked” in threat intel 🤘
Google is launching a cyber "disruption unit" focused on "legal and ethical disruption" options to proactively identify and take down malicious campaigns. We cut to: Tavis Ormandy and P0 folks cracking their knuckles.
Google's Threat Intelligence Group (GTIG) details a widespread data theft campaign by UNC6395 targeting Salesforce instances through compromised OAuth tokens in the Salesloft Drift application.
Salesloft Drift Breach Tracker - Nudge put together a nice overview of the companies confirmed affected by the breach.
FBI cyber cop: Salt Typhoon pwned 'nearly every American'
"One of the most consequential cyber espionage breaches" in US history. The operation, which began in 2019 and was discovered last fall, targeted approximately 200 American organizations including major telecommunications companies, affected 80 countries, and was linked to three Chinese entities. The attackers collected bulk data from millions of Americans, geo-located mobile phone users, monitored internet traffic, recorded phone calls, and intercepted communications content from high-profile targets including over 100 current and former presidential administration officials from both political parties.
poppopjmp/VMDragonSlayer
By Agostino Panico: An automated multi-engine framework for unpacking, analyzing, and devirtualizing binaries protected by commercial and custom virtual machine based protectors like VMProtect, Themida, and custom malware VMs. It combines dynamic taint tracking, symbolic execution, pattern and semantic classification, and machine learning–driven prioritization to dramatically reduce manual reverse engineering time. Presented at DEF CON 2025.
Velociraptor incident response tool abused for remote access
Sophos researchers investigated an intrusion that involved a threat actor deploying the legitimate open-source Velociraptor digital forensics and incident response (DFIR) tool to download and execute Visual Studio Code with the likely intention of creating a tunnel to an attacker-controlled command and control (C2) server.
Red Team
dashingsoft/pyarmor
A tool to obfuscate Python scripts, bind obfuscated scripts to specific machines, or set expiration dates for obfuscated scripts.
Hiding Your C2 Traffic With Discord & Slack
Arslan Masood describes hiding C2 traffic using the Discord and Slack APIs, which can blend in with legitimate workplace app traffic and be harder to detect than traffic to a dedicated C2 server. They’ve also released SierraOne and SierraTwo, basic yet functional C2 frameworks that leverage Discord and Slack respectively.
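To make the “blends in with workplace traffic” point concrete, here’s a tiny illustrative polling loop (my sketch, not SierraTwo’s code; the token and channel are placeholders). Every request is ordinary HTTPS to slack.com, which is exactly why it’s hard to spot:

```python
import time

import requests

SLACK_TOKEN = "xoxb-example-token"  # hypothetical bot token
CHANNEL_ID = "C0123456789"          # hypothetical channel ID
HEADERS = {"Authorization": f"Bearer {SLACK_TOKEN}"}

last_seen = "0"
while True:
    # Poll the channel for new messages ("tasking" from the operator).
    resp = requests.get(
        "https://slack.com/api/conversations.history",
        headers=HEADERS,
        params={"channel": CHANNEL_ID, "oldest": last_seen, "limit": 5},
    ).json()

    for msg in reversed(resp.get("messages", [])):
        last_seen = msg["ts"]
        text = msg.get("text", "")
        if text.startswith("ack:"):
            continue  # skip our own replies
        # A real implant would parse `text` as a command and act on it;
        # this sketch just acknowledges it to show the request/response flow.
        requests.post(
            "https://slack.com/api/chat.postMessage",
            headers=HEADERS,
            json={"channel": CHANNEL_ID, "text": f"ack: {text}"},
        )

    time.sleep(30)  # low-and-slow polling blends into normal API chatter
```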
Ghost Calls: Abusing Web Conferencing for Covert Command & Control (Part 1 of 2)
Praetorian explores how web conferencing traffic can be leveraged for covert command and control, focusing on Zoom's architecture and protocols (see also Adam Crosser’s Black Hat USA 2025 talk, DEF CON video). Web conferencing is a great channel for C2 because the communication is bursty, uses multiple methods to egress through client networks, is end-to-end encrypted, most traffic is relayed through a large, globally distributed network of proxy-like servers, etc.
The post does an impressive deep dive into Zoom’s protocols, desktop and web clients, and egress techniques. Praetorian has also published turnt, a tool designed for smuggling interactive command and control traffic through legitimate TURN servers hosted by reputable providers such as Zoom.
AI + Security
Google Big Sleep AI Tool Finds Critical Chrome Vulnerability
Google has patched a critical use-after-free vulnerability (CVSS 9.8) in Chrome's ANGLE graphics library that was discovered by their Google DeepMind x Project Zero #collab AI security tool, Google Big Sleep.
💡 This is impressive because Chrome is a quite hardened, heavily reviewed piece of software (targeted by nation states, heavily fuzzed, high bug bounty payouts, etc.), and this autonomous AI-powered detection tool still found a serious bug. Nice.
Slice: SAST + LLM Interprocedural Context Extractor
Friend of the newsletter Caleb Gross introduces Slice (SAST + LLM Interprocedural Context Extractor), a tool that combines CodeQL, Tree-Sitter, and LLMs to find vulnerabilities across complex call graphs without requiring code compilation, tool use, or agentic frameworks. The post walks through using Slice to successfully and consistently reproduce the discovery of a use-after-free in the Linux kernel's SMB implementation (see Sean Heelan’s post) on 10/10 consecutive runs for <$4 per run, using the following process:
A permissive CodeQL query to find potential UAFs. (1722 candidates)
Use tree-sitter to fetch the rest of the code context to send to an LLM, filter by max call depth. (217 candidates)
Triage potential vulnerabilities with a small model (GPT-5 mini); see the sketch below. (9 candidates)
Deeply analyze the remaining candidates with GPT-5 (high reasoning).
💡 This is a very thoughtful post, highly recommend. I think giving LLMs access to the right code info (via program analysis, tree-sitter, LSP, …) and/or static analysis tools directly is going to be Big™️. As is LLM triage, which we’ve already seen a number of places.
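To make step 3 concrete, here’s a minimal sketch of the cheap-triage idea (my illustration, not Slice’s code), assuming the OpenAI Python SDK; the model name and prompt are placeholders:

```python
from openai import OpenAI

client = OpenAI()

TRIAGE_PROMPT = (
    "You are triaging potential use-after-free bugs. Given the code below, "
    "answer LIKELY or UNLIKELY on the first line, then one sentence of reasoning."
)

def triage(candidates: list[dict]) -> list[dict]:
    """candidates: [{'id': ..., 'context': '<code gathered via tree-sitter>'}, ...]"""
    survivors = []
    for cand in candidates:
        resp = client.chat.completions.create(
            model="gpt-5-mini",  # the small/cheap triage model mentioned in the post
            messages=[
                {"role": "system", "content": TRIAGE_PROMPT},
                {"role": "user", "content": cand["context"]},
            ],
        )
        verdict = resp.choices[0].message.content.strip().splitlines()[0]
        if verdict.upper().startswith("LIKELY"):
            survivors.append(cand)  # these go on to deeper GPT-5 (high reasoning) analysis
    return survivors
```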
CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale
Zhun Wang, Tianneng Shi, Dawn Song, et al. introduce CyberGym (code, paper), a large-scale evaluation framework designed to assess the capabilities of AI agents on real-world vulnerability analysis tasks. CyberGym includes 1,507 benchmark instances with historical vulnerabilities from 188 large software projects, sourced from vulnerabilities found by OSS-Fuzz, Google’s continuous fuzzing campaign.
Models are given a description of the vulnerability and the pre-patch codebase and are tasked with generating a PoC that triggers the vulnerability. The best performing agent (OpenHands + Claude-Sonnet-4) successfully reproduced 17.85% of target vulnerabilities. The agents also discovered 15 zero-days.
💡 Love the detailed evaluation of different models and how different factors influence the agents’ effectiveness, and awesome that they released the dataset and code.
Finding vulnerabilities in modern web apps using Claude Code and OpenAI Codex
Semgrep’s Romain Gaucher, Vasilii Ermilov, and yours truly evaluated AI Coding Agents' vulnerability detection capabilities by testing Anthropic's Claude Code and OpenAI Codex against 11 large open source Python web applications, finding they could identify real vulnerabilities with a simple prompt and no scaffolding!
We found that Claude Sonnet 4 was best at finding IDOR vulnerabilities, and both overall struggled with taint-style vulnerabilities across functions and files. We also observed meaningful non-determinism, where identical prompts produced completely different results across multiple runs.
💡 I especially like the section on the problem with benchmarks, especially AI SAST ones. Expect to see more data-focused posts like this!
Detecting and countering misuse of AI: August 2025
New threat intelligence report from Anthropic describing how cybercriminals are misusing Claude, including:
‘Vibe hacking’: In a theft and extortion campaign, Claude Code was used to automate reconnaissance, harvest victim credentials, and penetrate networks. Claude was allowed to make both tactical and strategic decisions, such as deciding which data to exfiltrate and how to craft psychologically targeted extortion demands.
North Korean IT workers: The regime’s training capacity used to be a major bottleneck, but now, with AI assistance, operators who don’t know how to code or speak English can pass technical interviews and maintain their positions.
AI was used to generate ransomware-as-a-service code with advanced evasion capabilities, encryption, and anti-recovery mechanisms.
See also this interview with Jacob Klein and Alex Moix on the report.
💡 The first example, using Claude Code as a command center to execute attacks (vs just generating code) is pretty neat, check out the report for a deeper breakdown.
Misc
“Most obstacles melt away when we make up our minds to walk boldly through them.”
Feelz
Music
Misc
✉️ Wrapping Up
Have questions, comments, or feedback? Just reply directly, I’d love to hear from you.
If you find this newsletter useful and know other people who would too, I'd really appreciate if you'd forward it to them 🙏
Thanks for reading!
Cheers,
Clint
@clintgibler