Generative AI Security
Generative AI security spans LLMs, multimodal systems, diffusion pipelines, system prompts, and the broader application logic wrapped around these models.
Overview
Generative AI has changed how users interact with models. Instead of static predictions, systems now produce open-ended language, images, code, plans, and tool calls. That flexibility creates new misuse and security issues.
Threat model
Representative threats include prompt injection, jailbreaks, data leakage, unsafe tool use, model theft, alignment bypass, harmful generation, and misuse of retrieval or plugin layers.
Countermeasures
Practical defenses involve layered prompt design, permission scoping, retrieval isolation, model monitoring, output policy checks, abuse detection, watermarking where appropriate, and human oversight for high-risk actions.
Open challenges
A difficult open problem is evaluating security at the system level. The model, prompt, retrieval stack, memory, tools, and user interface together determine the real behavior, so isolated model evaluation is not enough.