• tl;dr sec
  • Posts
  • [tl;dr sec] #229 - Prompt Injection Defenses, Security @ OpenAI, Microsoft's Deception Infrastructure

[tl;dr sec] #229 - Prompt Injection Defenses, Security @ OpenAI, Microsoft's Deception Infrastructure

New repo surveying prompt injection defenses, how OpenAI uses LLMs for internal security, insights on MS's honeypot infra

Hey there,

I hope you’ve been doing well!

🪩 BSidesSF and RSA

Brace yourself- conference week is nigh upon us!

I’m super excited for my BSidesSF talk at 1:30pm on Saturday, where I’m going to distill more applications of AI to cybersecurity than I’ve ever seen in one place. It will be the highest info density talk I’ve ever given 😅 

I’ll have a limited number of tl;dr sec t-shirts on me (see the Semgrep booth for more), and plenty of stickers.

If you’re in town for RSA, here’s the authoritative list of like 100+ RSA parties.

Personally, I’m excited for the Dark Reddit party and Security Soirée with Semgrep, Harness, and Trail of Bits.

I know you already know this, but it’s totally fine to not drink. I usually don’t drink, or at most, I have like one the whole night.

Hope to see you around! Feel free to come say hi, or give a fist bump or knowing nod.

🔥 Announcing: prompt-injection-defenses

We're two years into the history of prompt injection, and there has been some really cool research on possible defenses.

So much so, that I think we needed a tl;dr - so I teamed up again with my bud Rami McCarthy to put together this GitHub repo compiling all the great research we've seen.

Check it out for a half dozen approaches to Instructional Defense, Guardrails & Overseers, Firewalls & Filters, Ensemble Decisions, Canaries, and a variety of research proposals that aren't deployed in practice quite yet.

Please let us know what you think! Contributions welcome 🙏


📣 CVE Management is Painful - Find out why.

Ever feel like CVEs are the ultimate headache?

Chainguard’s recent research highlights the thousands of hours companies spend on vulnerability management, and how the process of identifying, triaging, and remediating CVEs proves to be a significant endeavor. See how organizations across industries handle vulnerability management tasks - from the complexities of remediation, where ease of upgrading and testing software plays a pivotal role, to the impact on developer productivity and business tasks.

Learn more about the key findings shaping container security strategies. 

👉️ Read the full report 👈️

I’ve had friends at multiple companies tell me that Chainguard’s minimal container images save them a huge amount of toil, freeing security headcount up for other things 👍️ 

BSidesSF Talks

There are far too many awesome talks to list, but I wanted to call out a few:


(The) Postman Carries Lots of Secrets
Truffle’s Joe Leon walks through how they found ~1,700 unique credentials, and estimate >4,000 live credentials, to be leaking publicly on Postman’s Public API Network, for a variety of popular SaaS and cloud providers, plus a new TruffleHog command to scan Postman workspaces.

Using feature flags for security
Alex Smolen shares some of the ways they use feature flags for security at Launch Darkly: uncoupling code deployment from security code review (devs can deploy unreviewed code changes to production and security review can occur before release without slowing down shipping), granting access to external security testers (let bug bounty hunters, pen testers, or authenticated scanners test new features before they’re generally available), and faster security incident remediation (turn off problematic code until mitigated).


📣 What REALLY goes into prioritizing application risks? 🌈💎

Prioritization is a hot topic in AppSec—and for good reason. 

Teams are inundated by alerts from SAST, DAST, SCA tools, and security processes. Without question, many of those are false positives. Some are vulnerabilities that are unlikely to be exploitable. And a few might be catastrophic risks to your business. 

How do you zero in on the real risks? 

Apiiro’s ASPM leverages deep code analysis and runtime context to go beyond simple alert aggregation with prioritization based on the risk likelihood and impact. Dive into our blog to uncover Apiiro’s secret sauce behind effective risk prioritization!

The GIF of Apiiro’s new interactive contextual prioritization funnel is pretty slick, filtering down from all risks to those that are Internet-facing. Filters to slice and dice data are 👌 

Cloud Security

By Padok: A script that implements different Cognito attacks, such as unwanted account creation, account oracle, and identity pool escalation.

Semgrep for Terraform Security
Excellently detailed and practical post by Rami McCarthy on evangelizing secure-by-default modules, implementing hard guardrails and invariants, securing CI/CD by detecting malicious Terraform that can execute arbitrary code, and highlighting subtle footguns.

I think secure defaults / guardrails are one of the highest leverage things we can do as security professionals, and it’s great to see concrete examples, plus Semgrep rules you can use for free 🙌 Also, there’s a great References section, love to see it!

Container Security

Image signing validation on K8s
Brian Davis walks through using Sigstore's policy-controller for image signing validation on Kubernetes, and demonstrates the enforcement working using Kind (Kubernetes IN Docker), a tool for running k8s clusters locally.

Fun with Kubernetes Authorization Auditing
Rory McCune discusses the complexity of auditing permissions in Kubernetes clusters with multiple authorizers. Basically, you need to review each authorization system used in the cluster and look at the permissions granted in each one. Be careful when using automated tooling that audits Kubernetes permissions, as in most cases it likely will only support RBAC, and won’t provide any information about rights granted in other authorization systems.

Supply Chain

A GitHub Action that given an organization or repository, produces information about the contributors over the specified time period.

A tool that allows Rust binaries to be auditable by embedding data about the dependency tree in JSON format into a dedicated linker section of the compiled executable, enabling scanning for known bugs or vulnerabilities in production at scale. See also: chalk.

By AppThreat’s Prabhu S, Caroline Russell, et al: A Binary Linter to check the security properties and capabilities in executables (e.g. does it use network connections or do file operations?). It can also generate SBOMs for binaries.

By Adnan Khan: An example repo for GitHub Actions Time of Check to Time of Use (TOCTOU) vulnerabilities, including a tool to monitor for an approval event (either a comment, label, or deployment environment approval) and then quickly replace a file in the pull request (PR) head with a local file specified as a parameter.

Blue Team

Examining the Deception infrastructure in place behind code.microsoft.com
Microsoft’s Ross Bevington describes how they turned what was once a dangling subdomain into a honeypot to collect threat intelligence, allowing Microsoft to monitor malicious activity, better understand the 0day and Nday ecosystem, and gather information about attackers, their tools, techniques, and intentions. Neat details and behind the scenes 👍️ 

JA4T: TCP Fingerprinting and How to Use It to Block Over 60% of Internet Scan Traffic
John Althouse announces the latest additions to the JA4+ family of network fingerprinting tools, that add the ability to fingerprint client and server operating systems, devices, particular applications, hosting characteristics, and even if a connection is going through a tunnel, VPN, or proxy.

If built into a WAF, firewall, or load balancer, it can be used to block 60% of all Internet scan traffic using JA4T fingerprints of known scanners (masscan, ZMap, Nmap), based on GreyNoise data.

Red Team

A site that provides a variety of resources and VMs that can be used to learn about vulnerability analysis, exploit development, software debugging, binary analysis, and more. Challenges: Linux privilege escalation, file system race conditions, buffer overflows, format strings and heap exploitation, etc.

That time I built an LD_PRELOAD worm
lcamtuf discusses when he created a proof-of-concept LD_PRELOAD worm to settle an argument about distributed trust and take a potshot at the paradigm of using su and sudo instead of logging in as root.


How to delete the data Google has on you
Guide by The Verge on how to find your data Google has, manually delete it, turn on auto-delete or auto-delete after a certain amount of time.

FTC Says Ring Employees Illegally Surveilled Customers, Failed to Stop Hackers from Taking Control of Users' Cameras
Under proposed FTC order, Ring will be prohibited from profiting from unlawfully accessing consumers videos, pay $5.8 million in consumer refunds.

AI + Security

How we built Text-to-SQL at Pinterest
Awesomely detailed post on how Pinterest built a Text-to-SQL feature to assist data users with transforming analytical questions directly into code. They share their prompts and overall architecture. This is a great example of iterating on / “productionizing” an LLM use case, beyond just an initial prototype.

The I in LLM stands for intelligence
Curl maintainer Daniel Stenberg shares his frustration with convincing-ish looking bug bounty submissions written by LLMs that hallucinate/aren’t real issues that nonetheless take away maintainer time from doing actual important work.

See also these complaints from Joda-Time on LLM-driven invalid CVEs.

Go Beyond Gatekeeping: A Systems Design Approach to Security Engineering
In this SANS CloudSecNext 2023 keynote, OpenAI’s Karthik Rangarajan shares some great insights and lessons learned from his experiences scaling security in a hyper growth environment (great Paved Road / secure defaults content), and concludes with some examples of how OpenAI is using AI internally, including triaging bug bounty submissions, helping with access management, triaging #security Slack channel questions, an SDLC bot, and a bot to triage detection alerts.

LLM Agents can Autonomously Exploit One-day Vulnerabilities
Paper: “We show that LLM agents can autonomously exploit one-day vulnerabilities in real-world systems. We collected a dataset of 15 one-day vulnerabilities that include ones categorized as critical severity in the CVE description. When given the CVE description, GPT-4 is capable of exploiting 87% of these vulnerabilities.”

Chris Rohlf has a nice rebuttal of the overall methodology and claims; basically that all of the CVEs have PoCs that can be easily Googled, the paper does not indicate that models can recreate the exploitation. CVE co-founder Steve Coley also expressed doubts, and this meme made me laugh.

Personally, I think it’s great that the authors are sharing their work, even if there is room for improvement. It’s important for all of us to collectively push security forward together, and it’s good to respectfully critique work in search of truth. We’re all on the same team  


By Project Discovery: A utility tool for identifying the technology associated with DNS/IP network addresses, including CDN, cloud, and WAF providers.

On IBM acquiring HashiCorp
I was surprised by this acquisition. Fintan Ryan explains that HashiCorp is highly dependent on a subset of customers with >$100K ARR (19% of customers), their Net Dollar Retention has declined, and “simply put, this is a business with rapidly slowing growth that cannot support its existing valuation, never mind the valuation at IPO.”

Some AI generated songs to make you laugh

✉️ Wrapping Up

Have questions, comments, or feedback? Just reply directly, I’d love to hear from you.

If you find this newsletter useful and know other people who would too, I'd really appreciate if you'd forward it to them 🙏

Thanks for reading!