Good primer on tokenizers for security researchers. Expect parser differential vulns soon.
- A prompt that Prompt Guard 2 flags as safe 2. The target LLM accepts the same prompt and understands it well enough to trigger a bypass of the system prompt
This is a very important point that people tend to miss. An injection actually needs to make the underlying LLM do a bad thing.