[tl;dr sec] #333 - Perplexity's Bumblebee, Evading Cloud Logging, AI Vuln Hunting Spec

OSS tool to scan packages, agent configs, editors, and browser extensions for malware, tactics for evading cloud logging, a specification to generate your own custom agentic AI security scanning system

Clint Gibler
June 18, 2026

Hey there,

I hope you’ve been doing well!

🙏 Busy, Exciting, Busy

Thanks so much to everyone who reached out last week! I received so many kind emails, LinkedIn comments, and texts that I was filled with joy. My heart grew at least 3 sizes 🥹

Apologies if I haven’t responded yet. There are somehow even more things going on inside OpenAI than you’d expect, and I’ve been a bit buried.

We’ve been sprinting on some things… that you might see soon 🤭

Speaking of, I’m actually going to be doing a live session with some colleagues next Thursday about Daybreak, our vision for empowering defenders.

We’ll discuss cyber models, early insights from working with leading teams, show how these capabilities fit into security workflows, and discuss where AI-assisted defense is headed next.

We’ll likely cover new security product stuff we’re shipping, and even a live demo (I’m starting my sacrifices to the demo gods now 🙏).

👉️ Join live Thursday June 25, 1pm PDT 👈️

Hope to see you there!

Sponsor

📣 Is Your Segmentation Real, or Just a Comfortable Illusion?

AI-generated exploits don't need sophistication, just a gap you don't know exists. Target-specific attacks are generated in minutes, built to find what your tools can't see.

runZero shatters the segmentation illusion. New attack path mapping and topology visualizations reveal how an attacker can move through your environment. Safely enumerate sub-assets hidden behind protocol gateways like Modbus, BACnet, and EtherNet/IP that other tools miss entirely.

Every asset, every exposure, every attack path across IT, OT, IoT, cloud, and mobile. With runZero, defenders win by default. Even against AI.

👉 Start your 21-day free trial 👈

I’ve heard runZero is crazy good at mapping environments.

AppSec

Formal methods and the future of programming
Jane Street’s Yaron Minsky describes how they’re building a formal methods team after 25 years of skepticism, driven by the belief that agentic coding has fundamentally changed the cost-benefit calculus of formal verification. Yaron argues that AI agents both reduce the cost of formal methods (by making proof construction more accessible) and increase the benefits (by providing better verification for AI-generated code that tends toward "slop", and by offering the universal guarantees that agents need for effective feedback during training and coding).

“Our hope is to make formal methods as pervasively useful of a tool for building software as sophisticated type systems are for us today.”

💡 Useful things to reflect on whenever there are meaningful tech changes, AI or otherwise: what used to be hard that is now easy? What used to be impossible that is now feasible? What used to be too costly or too slow that now could make sense? I’m actually pretty bullish on formal methods in the AI era.

Making secret scanning more trustworthy: Reducing false positives at scale
Microsoft's Mariko Wakabayashi details how the Agents Offense team helped GitHub adopt Microsoft’s Agentic Secret Finder's verification approach into its AI-powered secret scanning, cutting customer-confirmed false positives by 75%. Rather than feeding the model more data, the approach extracts focused, high-signal usage context such as whether a detected value is assigned to a variable and later passed into an API request, authentication header, database client, or cloud SDK call, as well as execution paths. Providing the right context lets the model separate real exposures from noise like UUIDs, test data, or placeholders without reducing detection coverage.
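To make the "usage context" idea concrete, here is a minimal sketch that checks whether a candidate string is assigned to a variable and later passed into a call. It is not Microsoft's or GitHub's implementation; the function name `usage_context`, the sample source, and the placeholder token are all hypothetical, and it only handles Python source via the standard `ast` module.

```python
import ast

def usage_context(source: str, candidate: str) -> list[str]:
    """Return names of calls that receive a variable holding `candidate`."""
    tree = ast.parse(source)

    # Step 1: find variables that are assigned the candidate string literal.
    holders = set()
    for node in ast.walk(tree):
        if (isinstance(node, ast.Assign)
                and isinstance(node.value, ast.Constant)
                and node.value.value == candidate):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    holders.add(target.id)

    # Step 2: find calls whose arguments reference one of those variables.
    sinks = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            args = list(node.args) + [kw.value for kw in node.keywords]
            for arg in args:
                names = {n.id for n in ast.walk(arg) if isinstance(n, ast.Name)}
                if names & holders:
                    sinks.append(ast.unparse(node.func))
                    break
    return sinks

# Hypothetical sample: one value flows into an HTTP call, the other never leaves its variable.
SAMPLE = '''
import requests
token = "EXAMPLE_TOKEN_VALUE"
request_id = "123e4567-e89b-12d3-a456-426614174000"
requests.get("https://api.example.com", headers={"Authorization": token})
print("done")
'''

print(usage_context(SAMPLE, "EXAMPLE_TOKEN_VALUE"))
print(usage_context(SAMPLE, "123e4567-e89b-12d3-a456-426614174000"))
```

A value that reaches something like `requests.get` is a much stronger signal than a UUID that is assigned and never used, which is the kind of focused context the post describes handing to the model.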

💡 This work sounds neat, and How we built Copilot secret scanning shares a bit more detail about GitHub’s approach, but I wish both went deeper. For an example I like, see Wiz’s post How We Fine-Tuned a Small Language Model for Secret Detection in Code.

Codex Discovered a Hidden HTTP/2 Bomb
Calif's Quang Luong, Jun Rong and Duc Phan used Codex to discover HTTP/2 Bomb, a remote denial of service affecting nginx, Apache httpd, Microsoft IIS, Envoy, and Cloudflare Pingora in their default configurations. The attack chains two techniques that have been public since 2016. The first abuses HTTP/2's header compression, where a single saved header can be referenced thousands of times, and each one-byte reference forces the server to allocate a full header in memory. The second tells the server its receive buffer is full, then drips just enough updates to keep the connection from timing out so the server never frees anything.

A single client on a 100Mbps home connection can pin 32GB of server memory in roughly 20 seconds against Apache and Envoy. A Shodan search found 880,000+ websites supporting HTTP/2 and running one of these servers, though many sit behind a CDN. Calif published PoCs and Docker labs at califio/publications.
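For a rough sense of the amplification those numbers imply, here is a back-of-the-envelope calculation. It assumes decimal units and a fully saturated uplink, so the result is a lower bound on memory pinned per byte sent; the function name is just for illustration.

```python
def amplification(link_mbps: float, seconds: float, pinned_gb: float) -> float:
    """Lower bound on server bytes pinned per client byte sent."""
    # Upper bound on what the client could have sent if the link was saturated.
    bytes_sent = link_mbps * 1_000_000 / 8 * seconds
    bytes_pinned = pinned_gb * 1_000_000_000
    return bytes_pinned / bytes_sent

# 100 Mbps for 20 seconds is at most 250 MB sent, against 32 GB pinned.
print(amplification(100, 20, 32))  # 128.0
```

So each byte the attacker sends ties up at least roughly 128 bytes of server memory under these assumptions.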

Sponsor

📣 Discover the Architecture of Stopping-Power

Modern threats move faster than platforms built on delayed telemetry, API-derived state, and after-the-fact correlation can control. The next generation of cloud security will not be won or lost on how much risk it can describe. It will be won or lost on how effectively it can convert context into stopping power.

In their landmark paper, Agentic Cloud Security Platforms: The Shift to Runtime Security, Software Analyst Cyber Research (SACR) demonstrates how the limits of CNAPP are architectural.

👉 Download this important read 👈

I do wonder where things are headed if we assume AI-powered attackers can pivot and move through a network faster. How will cloud tools adapt? 🤔

Cloud Security

Disrupting AWS logging
(In 2016!) Daniel Grzelak demonstrates multiple techniques for disrupting AWS CloudTrail logging after compromising an account, ranging from obvious methods like delete-trail and stop-logging to stealthier approaches. Key tactics include disabling multi-region logging with update-trail --no-is-multi-region-trail --no-include-global-service-events to operate freely in non-home regions while suppressing global IAM events, creating an immutable encryption-only KMS key via create-key --bypass-policy-lockout-safety-check so CloudTrail keeps writing logs that nobody can decrypt, redirecting logs to attacker-controlled S3 buckets, modifying bucket policies to block CloudTrail writes, and setting 1-day S3 lifecycle expirations to auto-delete files.

You can also deploy an AWS Lambda function triggered by S3 object-create events to delete logs immediately on write, winning any race condition against SIEM ingestion while staying within Lambda's 1 million free monthly invocations to avoid detection through unusual billing patterns.
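On the defensive side, the tactics above map to a small set of API calls worth alerting on. The sketch below is a hypothetical filter over already-parsed CloudTrail records, not something from Daniel's post. The event names are the standard ones to the best of my knowledge, and the `isMultiRegionTrail` request parameter key is an assumption, so verify both against your own logs.

```python
from typing import Optional

# (eventSource, eventName) pairs that touch the logging pipeline.
# Assumed names; confirm against real CloudTrail records before relying on them.
SUSPICIOUS = {
    ("cloudtrail.amazonaws.com", "StopLogging"),
    ("cloudtrail.amazonaws.com", "DeleteTrail"),
    ("cloudtrail.amazonaws.com", "UpdateTrail"),
    ("s3.amazonaws.com", "PutBucketPolicy"),
    ("s3.amazonaws.com", "PutBucketLifecycle"),
    ("s3.amazonaws.com", "PutBucketNotification"),
}

def tamper_reason(record: dict) -> Optional[str]:
    """Return a short reason if the record looks like logging tampering, else None."""
    key = (record.get("eventSource"), record.get("eventName"))
    if key not in SUSPICIOUS:
        return None
    params = record.get("requestParameters") or {}
    # Assumed parameter key for `update-trail --no-is-multi-region-trail`.
    if key[1] == "UpdateTrail" and params.get("isMultiRegionTrail") is False:
        return "UpdateTrail disabled multi-region logging"
    return f"{key[1]} touches the logging pipeline"

record = {
    "eventSource": "cloudtrail.amazonaws.com",
    "eventName": "UpdateTrail",
    "requestParameters": {"name": "main-trail", "isMultiRegionTrail": False},
}
print(tamper_reason(record))
```

In practice the S3 events would need scoping to the buckets that actually hold your trails, and the alerts need to run somewhere the compromised account cannot reach.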

💡 As they say, read Daniel’s posts to avoid being a cyber-patsy.

Blinding the Watchmen: Abusing Cloud Logging Services for Defense Evasion and Visibility
Palo Alto Networks' Yahav Festinger describes seven attack techniques targeting AWS CloudTrail and Google Cloud Logging services, organized into defense evasion and continuous visibility (transferring logs to the attacker’s accounts, giving visibility into the victim’s environment). Defense evasion techniques include stopping logging, deleting the log storage destination, deleting the log router, impairing logging via an attacker-controlled encryption key, and log poisoning. Continuous visibility techniques include configuring a new log routing resource and log redirection. Attackers can exploit permissions like cloudtrail:StopLogging, s3:DeleteBucket, and logging.sinks.update, along with KMS key modifications, to blind security tools, manipulate audit trails, or exfiltrate logs to attacker-controlled destinations for passive reconnaissance.

Festinger recommends restricting access to logging service APIs to highly privileged users, using immutable log repositories like AWS's 90-day CloudTrail Event History and Google Cloud's _Required log bucket, and implementing bucket policies that prevent non-admin modifications. For detection, CloudTrail log file integrity validation flags log poisoning but is off by default for trails created via API or CLI.
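One way to act on the "restrict access to logging service APIs" advice is to audit which IAM policies grant the risky actions. This is a minimal sketch, not from the post: the `RISKY_ACTIONS` list is an illustrative subset, and the matcher deliberately ignores `NotAction`, conditions, explicit denies elsewhere, and SCPs.

```python
from fnmatch import fnmatchcase

# Illustrative subset of AWS actions that can impair logging.
RISKY_ACTIONS = [
    "cloudtrail:StopLogging",
    "cloudtrail:DeleteTrail",
    "cloudtrail:UpdateTrail",
    "s3:DeleteBucket",
]

def risky_grants(policy: dict) -> list[str]:
    """List risky actions that some Allow statement in the policy matches."""
    found = set()
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        for pattern in actions:
            for risky in RISKY_ACTIONS:
                # IAM action names are case-insensitive and allow * and ? wildcards.
                if fnmatchcase(risky.lower(), pattern.lower()):
                    found.add(risky)
    return sorted(found)

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["cloudtrail:*", "s3:GetObject"], "Resource": "*"},
        {"Effect": "Deny", "Action": "s3:DeleteBucket", "Resource": "*"},
    ],
}
print(risky_grants(policy))
```

A wildcard like `cloudtrail:*` is the common way these permissions end up on roles that never needed them.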

Supply Chain

Perplexity is open sourcing Bumblebee
Perplexity has released Bumblebee, a read-only scanner that checks developer machines for risky packages, extensions, and AI tool configurations during supply-chain incidents by parsing metadata files directly without executing code. The tool scans four surfaces: language package managers (npm, pnpm, PyPI, Go modules, etc.), AI agent configs (MCP), editor extensions (VS Code family), and browser extensions (Chromium and Firefox). Bumblebee supports three scan profiles: baseline for routine scans, project for targeted repo checks, and deep for active incident response.

Bumblebee avoids triggering malicious install scripts by never invoking package managers or running lifecycle hooks, instead reading lockfiles, manifests, and installed package metadata directly. Perplexity integrates Bumblebee into their workflow where Perplexity Computer drafts catalog updates as GitHub PRs after threat signals emerge, humans review them, and Bumblebee then scans endpoints with the updated catalog to identify exposed systems.
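The read-only idea is easy to picture with npm: everything needed to answer "is a known-bad version installed?" is already in the lockfile. Below is a minimal sketch that parses a `package-lock.json` (lockfileVersion 2 or 3) as plain JSON and compares it with a catalog. This is not Bumblebee's code; the catalog shape and the package names are hypothetical.

```python
import json

def installed_packages(lock_text: str) -> list[tuple[str, str]]:
    """Extract (name, version) pairs from lockfile text without running npm."""
    lock = json.loads(lock_text)
    found = []
    for path, meta in lock.get("packages", {}).items():
        # Skip the root project ("") and workspace entries outside node_modules.
        if "node_modules/" not in path:
            continue
        # The last segment after node_modules/ is the name, including any @scope.
        name = path.split("node_modules/")[-1]
        found.append((name, meta.get("version", "")))
    return found

def catalog_hits(lock_text: str, catalog: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return installed packages whose exact version appears in the catalog."""
    return sorted(
        (name, version)
        for name, version in installed_packages(lock_text)
        if version in catalog.get(name, set())
    )

# Hypothetical lockfile and threat catalog.
LOCK = json.dumps({
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "demo-app", "version": "1.0.0"},
        "node_modules/@acme/logger": {"version": "2.3.1"},
        "node_modules/left-padz": {"version": "1.0.4"},
        "node_modules/@acme/logger/node_modules/left-padz": {"version": "1.0.3"},
    },
})
CATALOG = {"left-padz": {"1.0.4"}}

print(catalog_hits(LOCK, CATALOG))
```

Nothing here invokes a package manager or a lifecycle hook, so a malicious `postinstall` script never gets a chance to run.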

💡 Smart approach, I like this a lot: a single tool that inventories all of these developer attack surfaces, gathers the data in a way that avoids accidental code execution from malicious packages, and integrates with a continuously updating threat catalog 👌

Securing CI/CD in an agentic world: Claude Code Github action case
Microsoft's Dor Edry and Amit Eliahu demonstrate how Anthropic's Claude Code GitHub Action could leak CI/CD secrets when processing untrusted GitHub content like issues, pull requests, and comments. While the Bash tool ran inside a Bubblewrap sandbox with environment variables scrubbed, the Read tool bypassed that isolation and could read /proc/self/environ directly, exposing ANTHROPIC_API_KEY and other runner credentials.

To confirm exploitability, they hid a prompt injection inside an HTML comment that instructed the agent to read sensitive files and truncate the first seven characters of any API key found, bypassing Claude's safety filters and GitHub's Secret Scanner. From there, exfiltration was possible via WebFetch, Bash, or issue comments. Anthropic patched the issue in Claude Code 2.1.128 by blocking access to sensitive /proc files.
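To illustrate the kind of check the fix implies, here is a purely lexical deny rule for `/proc/.../environ` paths. It is a sketch under stated assumptions, not Anthropic's actual patch: the function name and pattern are hypothetical, and it only handles absolute paths. A real guard would also need to resolve symlinks and relative paths, and ideally be enforced by the sandbox rather than by each tool.

```python
import posixpath
import re

# Matches /proc/self/environ, /proc/<pid>/environ, and per-thread variants.
_SENSITIVE = re.compile(r"^/proc/(self|thread-self|\d+)/(task/\d+/)?environ$")

def is_sensitive_proc_path(path: str) -> bool:
    """Lexically normalize an absolute path and test it against the deny pattern."""
    normalized = posixpath.normpath(path)
    # normpath keeps a leading "//", so collapse it to a single slash.
    if normalized.startswith("/"):
        normalized = "/" + normalized.lstrip("/")
    return bool(_SENSITIVE.match(normalized))

for candidate in [
    "/proc/self/environ",
    "/proc/self/../1234/environ",
    "//proc/self/environ",
    "/home/runner/work/README.md",
]:
    print(candidate, is_sensitive_proc_path(candidate))
```

The takeaway from the post still applies: scrubbing environment variables for one tool does little if a sibling tool in the same process can read them back from `/proc`.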

💡 Making agents useful and secure is tough, lots of sharp edges. Bypassing safety filters/secret scanners via truncating the secret prefix + evading the sandbox is clever. See also the “Research methodology” section at the bottom which is neat: first they used an AI model to do automated, black-box research, then fed the AI model the target Actions codebase and the obfuscated Claude SDK for a human/AI white box collaborative security audit.

Blue Team

grepstrength/malsnitch
Tool by Kevin Winborne that scans malware artifacts like string dumps, FLOSS output, or Binary Ninja exports to extract embedded secrets such as C2 credentials, crypto keys, API tokens (GitHub PATs, AWS, Stripe, Slack), exfiltration channel credentials (e.g. Discord webhooks, Telegram bot tokens), and hardcoded SMTP/FTP/HTTP credentials.

Cisco-Talos/EvidenceForge
By Cisco Talos: An open-source tool that generates realistic, multi-format security logs for threat hunting training by solving the core problem of synthetic data: cross-source consistency. EvidenceForge uses a canonical SecurityEvent model to emit 20+ correlated log formats (Windows Security, Sysmon, Zeek, eCAR EDR/XDR, syslog, bash history, Snort, web access, and proxy logs) from a single source of truth, ensuring LogonIDs, PIDs, timestamps, and Zeek UIDs match across all outputs. A causal expansion engine adds prerequisite events with realistic timing, like DNS queries before connections and Kerberos TGT/TGS before domain logons, and a Hawkes process models user activity including Monday login storms and Friday early departures. Network visibility modeling further determines what each sensor can realistically observe.

Scenarios are authored through Claude Code or Codex agent skills that draw on MITRE ATT&CK, while log generation itself is fully deterministic with no LLM calls. A 4-pillar evaluation framework then scores the output across parseability, plausibility, causality, and timing.
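The cross-source consistency and causal expansion ideas can be pictured with a toy model: one canonical event, a rule that adds its prerequisite, and two renderers that read from the same record. This is only an illustration, not EvidenceForge's SecurityEvent model; the `Event` fields, the 40 ms DNS lead time, and both output formats are made up.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass(frozen=True)
class Event:
    ts: datetime
    kind: str   # "dns" or "conn"
    host: str
    pid: int
    dest: str

def expand(conn: Event) -> list[Event]:
    """Causal expansion: emit a DNS lookup shortly before the connection."""
    dns = Event(conn.ts - timedelta(milliseconds=40), "dns", conn.host, conn.pid, conn.dest)
    return [dns, conn]

def as_syslog(e: Event) -> str:
    """Render a syslog-style line (hypothetical layout)."""
    return f"{e.ts.isoformat()} {e.host} proc[{e.pid}]: {e.kind} {e.dest}"

def as_edr_json(e: Event) -> str:
    """Render an EDR-style JSON record (hypothetical layout)."""
    return json.dumps({"time": e.ts.isoformat(), "host": e.host,
                       "pid": e.pid, "action": e.kind, "target": e.dest})

conn = Event(datetime(2026, 6, 15, 9, 0, tzinfo=timezone.utc),
             "conn", "ws-042", 4312, "files.example.com")
events = expand(conn)
for e in events:
    print(as_syslog(e))
    print(as_edr_json(e))
```

Because both renderers read the same record, the PID and timestamp cannot drift between formats, which is the property that makes correlated hunting exercises possible.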

Red Team

Mythic Embarking on the Open Seas: Containerized Payload Delivery for Kubernetes Assessments
SpecterOps' Alex Rodriguez introduces two new Mythic C2 extensions, container_wrapper and container_registry, that streamline containerized payload delivery for Kubernetes assessments by wrapping Mythic payloads into OCI-compatible containers and hosting them in self-managed registries. The container_wrapper uses Buildah to package payloads into container images via Mythic's web UI, while container_registry uses skopeo to push these images to a distribution-based registry deployed behind an HTTPS redirector. This approach lets operators hand clients a simple Kubernetes manifest to deploy in their own clusters, eliminating manual Docker CLI workflows and supporting container-centric penetration testing in environments where worker nodes have internet egress and no restrictive image policies are enforced.

Don’t Jump the Turnstile: Lessons from the Field
SpecterOps' Zach Stein describes how he beat email phishing sandboxes with Cloudflare Turnstile on a red team engagement, after the same sandbox had defeated every standard evasion he reached for. He first used the Satellite framework to block Linux user agents, but the sandbox swapped to a Windows user agent. He then added a filter page that only redirected after mouse movement, but the sandbox crawled the redirect URL straight out of the HTML. So he implemented Turnstile as a CAPTCHA-like verification layer that hides the redirect URL from the page source, keeping sandboxes from crawling to the payload while looking legitimate to a real user.

Zach published the build as Turnstyle, a Flask WSGI app fronted by Apache, mod_wsgi, and a Certbot certificate, with an Ansible deployment script in the repo.

AI + Security

Quicklinks

How Android helps keep you safe from impersonation scams with fake call detection - New feature helps protect you from scammers using AI deepfakes to impersonate your contacts. Shout-out to my friend Rachel Tobac for helping inspire the work 🤯
[NEW] Discover shadow AI agents via the browser - Most AI agent discovery tools rely on APIs. The problem? A lot of agentic AI platforms don’t expose agent details via an API. Nudge Security just closed this blindspot with browser-based AI agent discovery.*
r/ollama - 30 Days of an LLM Honeypot

^*Sponsored

visa/visa-vulnerability-agentic-harness
By Visa: An open-source agentic SAST pipeline that uses Claude, OpenAI, or any combination of frontier models for autonomous vulnerability discovery. It prioritizes triage speed over raw discovery volume, with Mean Time to Adapt as its primary metric. Threat modeling focuses the attack surface, multi-agent deterministic voting reduces false positives, and structured triage artifacts compress the path to actionable findings.

The pipeline runs nine stages, beginning with attack surface mapping and STRIDE/OWASP threat modeling, then specialized research lenses for language, crypto, logic bugs, access control, batch/ETL, and IaC, followed by adversarial verification and exploit chain construction. Output includes Markdown reports and SARIF 2.1.0 artifacts.
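As a rough picture of how voting and SARIF output fit together, here is a sketch that keeps only findings reported by a quorum of agents and wraps them in a minimal SARIF 2.1.0 document. The simple quorum rule, the finding tuples, and the tool name are assumptions for illustration; Visa's harness may vote and serialize quite differently.

```python
import json
from collections import Counter

Finding = tuple[str, str, int]  # (rule id, file URI, start line)

def quorum_findings(votes: list[set[Finding]], quorum: int) -> list[Finding]:
    """Keep findings that at least `quorum` agents reported, in a stable order."""
    counts = Counter(f for agent in votes for f in agent)
    return sorted(f for f, n in counts.items() if n >= quorum)

def to_sarif(findings: list[Finding], tool_name: str = "example-scanner") -> dict:
    """Build a minimal SARIF 2.1.0 log with one run and one result per finding."""
    return {
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name}},
            "results": [{
                "ruleId": rule,
                "message": {"text": f"{rule} confirmed by agent quorum"},
                "locations": [{"physicalLocation": {
                    "artifactLocation": {"uri": uri},
                    "region": {"startLine": line},
                }}],
            } for rule, uri, line in findings],
        }],
    }

# Hypothetical votes from three agents.
sqli = ("sql-injection", "app/db.py", 42)
weak = ("weak-hash", "app/auth.py", 10)
votes = [{sqli, weak}, {sqli}, {sqli}]

kept = quorum_findings(votes, quorum=2)
print(json.dumps(to_sarif(kept), indent=2))
```

Sorting before serializing keeps the output deterministic, so the same votes always produce the same report.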

CiscoDevNet/foundry-security-spec
Cisco has released the Foundry Security Spec, an open specification for building agentic AI security evaluation systems, distilling production lessons into 130 functional requirements across eight core agent roles (Indexer, Cartographer, Detector, Triager, Validator, Reporter, Coverage Guide, and Orchestrator) plus five optional extensions (Deep-Tester, Variant-Hunter, Attack-Mapper, Remediator, Self-Improver).

The spec is deliberately infrastructure-agnostic with explicit [NEEDS CLARIFICATION] markers for organization-specific decisions, designed to be consumed via spec-kit's clarify-specify-plan-implement workflow rather than shipped as runnable code. Foundry implements a detection-to-prevention flywheel where exploratory agents hunt alongside CodeGuard rule sweeps, recording rule gaps that get generalized back into the corpus, so each evaluation improves both detection across all future targets and prevention in developers' LLM coding assistants. See also the companion blog post by Omar Santos.

💡 Very cool project idea: a specification for building your own AI-powered code scanner. You customize the spec with your environment, and then it builds according to that. It makes me think of some SciFi show where you plug in your requirements, and then it materializes food or whatever you’re imagining. “Replicators” in Star Trek.

Misc

Joshua Saxe - Banning Mythos represents a basic misunderstanding of AI cybersecurity
David Sacks on the Fable ban
Katie Moussouris - The Fable 5 Export Controls Harm US Cyber Defense
Hank Green - The Riskiest Moment of the AI Bubble
Bloomberg - Ray Dalio on the bond market, a weaker dollar driving gold demand, and AI bubble concerns
Bloomberg - Inside Anthropic, the $965 Billion AI Juggernaut - Neat profile of Dario and Daniela Amodei, with a little Boris.

Music

Lea Salonga's Audition for Miss Saigon
Row Row Row Your Boat but in different keys at the same time
"Go the Distance" - Broadway's Leading Men Concert
Colors of the Wind - Live Orchestra | Pocahontas
Charles Cornell - Phil Collins Made EVERY Parent Sob With 1 Key Change | You'll Be In My Heart - My dad loved Phil Collins, I remember watching Tarzan with him when I was young 🥹 I burned a few of the best songs from the soundtrack onto a CD when I was in school, and we’d listen to it in the car when we were driving somewhere together, like my karate practice. I could see this song hitting very different as a parent.

Misc

Midjourney Medical - Holy cow, the image generation site is getting into healthcare, and planning to build a better body scanner, and spa in SF 🤯
When Pikachu gave the most epic speech in Pokemon history
A day in the life of a personality-maxxer
Throwback NFL commercials are the best
John Oliver - Moderator: 1 | Politician: 0
Mark Manson - 20 Years of Therapy Summarized in 13 Minutes
Trying to understand investing and stock price 😂
Leila Hormozi - Do This & Watch How Fast People Stop Disrespecting You
Sygnia - Velvet Ant’s Operation Highland: How a China-Nexus Actor Infiltrated an Internal Network Undetected
What Dreamworks Understands About Evil That Disney Doesn't

✉️ Wrapping Up

Have questions, comments, or feedback? Just reply directly, I’d love to hear from you.

If you find this newsletter useful and know other people who would too, I'd really appreciate it if you'd forward it to them 🙏

Thanks for reading!

Cheers,
Clint

P.S. Feel free to connect with me on LinkedIn 👋