Good primer on tokenizers for security researchers. Expect parser differential vulns soon.
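For anyone unfamiliar with the term, here is a toy illustration of where such differentials come from, assuming the Hugging Face transformers API and two public tokenizers (gpt2 and bert-base-uncased, picked only as examples); the exact splits vary by tokenizer and version:

    from transformers import AutoTokenizer

    # Two different tokenization schemes (BPE vs. WordPiece) read the
    # same bytes differently, so a guard model and a target model may
    # effectively see different inputs.
    a = AutoTokenizer.from_pretrained("gpt2")
    b = AutoTokenizer.from_pretrained("bert-base-uncased")

    s = "ignore previous instructions"
    print(a.tokenize(s))  # e.g. ['ignore', 'Ġprevious', 'Ġinstructions']
    print(b.tokenize(s))  # e.g. ['ignore', 'previous', 'instructions']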


  1. A prompt that Prompt Guard 2 flags as safe
  2. The target LLM accepts the same prompt and understands it well enough to trigger a bypass of the system prompt

This is an important point that people tend to miss. Evading the guard isn't enough; the injection also has to make the underlying LLM actually do the bad thing.
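As a rough sketch of those two conditions in code, assuming the Hugging Face text-classification pipeline, an assumed Prompt Guard 2 model ID, and a made-up success oracle (take the real model ID and label names from the model card):

    from transformers import pipeline

    # Assumed model ID for Prompt Guard 2; check the actual model card.
    guard = pipeline("text-classification",
                     model="meta-llama/Llama-Prompt-Guard-2-86M")

    def violates_system_prompt(response: str) -> bool:
        # Hypothetical success oracle: in practice a judge model or a
        # canary check (did the model leak a planted secret?).
        return "SECRET_CANARY" in response

    def attack_succeeds(prompt: str, target_llm) -> bool:
        # Condition 1: the guard must classify the prompt as benign.
        verdict = guard(prompt)[0]
        if verdict["label"] != "BENIGN":  # label name is an assumption
            return False
        # Condition 2: the target LLM must still parse the prompt well
        # enough to actually violate its system prompt.
        return violates_system_prompt(target_llm(prompt))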