Tag
#agent-security
12 posts tagged agent-security.
- red-team
LLM Attack Taxonomy: Prompt Injection, Agent Hijack, and What's Hitting Production
A practitioner's map of LLM attack classes — from direct prompt injection and jailbreaks to indirect injection, RAG poisoning, and agent tool-call abuse — organized by OWASP 2025 and MITRE ATLAS.
- prompt-injection
Prompt Injection Examples: Attack Payloads by Class
Concrete prompt injection examples across five attack classes — direct override, system-prompt leak, indirect RAG poisoning, agent tool-call hijack, and multimodal smuggling — with PoC payloads and defender actions.
- jailbreak
LLM Bypass Techniques: Attack Families, PoC Patterns, and Why Guardrails Keep Failing
A practitioner map of LLM bypass technique families — prompt injection, jailbreak personas, encoding obfuscation, RAG poisoning, and agent-specific
- red-team
AI Red Team: Methodology, Tooling, and the Attack Surface That Actually Matters
A practitioner's guide to AI red teaming — what makes LLM attack surface different from traditional app testing, the techniques that reliably produce
- prompt-injection
Prompt Hacking: A Practitioner's Taxonomy of LLM Attack Classes
Prompt hacking covers three distinct attack classes against LLMs: direct injection, indirect injection, and jailbreaking.
- red-team
The Audit Gap: Why Red-Teaming Can't Certify Governance Claims
A new position paper by Seth and Sankarapu formalizes the structural mismatch between what AI governance frameworks require evaluators to verify and what
- prompt-injection
Prompt Injection in 2025: OpenAI vs. Broken Defenses
OpenAI's November 2025 advisory on prompt injection arrived the same week a 14-researcher arXiv paper showed adaptive attacks achieve >90% success against
- prompt-injection
LLM Prompt Injection: From Instruction Override to Agent Takeover
A practitioner's breakdown of how LLM prompt injection payloads are constructed, why the threat class changes when agents can invoke tools, and what
- prompt-injection
Prompt Injection Examples: A Practitioner's Attack Library
A technical breakdown of real prompt injection examples — direct, indirect, multimodal, and RAG-poisoning attacks — with conditions, payloads, and what
- prompt-injection
LLM Prompt Injection: Taxonomy, Real Patterns, and Defenses
A technical breakdown of LLM prompt injection — direct, indirect, and agent-targeting variants — grounded in real-world attack patterns observed in
- prompt-injection
Prompt Injection Attack: Techniques, Variants, and Defenses
A practitioner's breakdown of prompt injection attacks — direct, indirect, and multi-modal — covering the HouYi framework, real CVEs, and mitigations that
- red-team
LLM Security: A Practitioner's Map of the Attack Surface
What LLM security actually means in 2026 — the attack classes red teamers test, the controls that hold up under fire, and the frameworks that map the territory.