Select your country

Not finding what you are looking for, select your country from our regional selector:

Search

6 of 6 - Forging forward with GenAI

Wicus Ross
Senior Security Researcher

 

Introduction

Part 6, this blog post, will be the final blog post in this series. This blog post will explore the unique security challenges that are present in systems that rely on generative artificial intelligence (genAI) as part of its workflow.

The preceding blog posts in this series are located here:

  • Part 1 Recap of the Security Navigator 2025 chapter on AI

  • Part 2 The potential impact of genAI on the world

  • Part 3 High-level overview of how large language models (LLMs) function

  • Part 4 Continuing and wrapping up part 3

  • Part 5 Examining agentic AI and the various pieces of this new paradigm

Everyone is a programmer

On 17 June 2025 Andrej Karpathy delivered a keynote address at the Y Combinator AI Startup School in San Francisco that claimed that LLMs are basically a new type of operating system (OS)1. Karpathy reasons that this new paradigm is born from an evolution of software.

Originally, programs were written with a programming language or computer code, and this executed natively on a computer. Karpathy refers to this as “Software 1.0”. This evolved with the introduction of neural networks. Here weights are tuned and is “executed” by the neural network to give birth to “Software 2.0”. Karpathy reveals that software has changed again, and programs are written using natural language in the form of prompts. These prompts are interpreted by LLMs that result in actions and outcomes. This new way of creating “programs” or “instructions” means that LLMs can be treated as an OS effectively.

Natural language makes for a very rich and powerful medium through which we can interface with machines when using an LLM. The challenge is that there is a lot of room
for ambiguity or misinterpretation. Clarifying intent and managing expectations of the user requires clear context for the benefit of the LLM, but humans can be intentionally vague, manipulative, careless, lazy, etc.

Classical programming languages, the “Software 1.0” paradigm, have syntax rules that are enforced by a compiler or a run-time environment. These are, for the most part, well defined and exact. This ensures that, for the most part, a system behaves in a predictable manner. Over the years we have learned that certain defensive programming practices must be employed to guard against error or exploitation.

LLMs, in the “Software 3.0” paradigm, are less strict regarding the “programs” they execute. This can result in any number of outcomes when combined with other loosely defined elements that an LLM can leverage. The non-deterministic and stochastic properties of neural networks, its weights, along with other execution parameters is responsible for this “emergent behavior”.

Emergent behavior, as defined using George Dyson’s definition, reads:

“Emergent behavior is that which cannot be predicted through analysis at any level simpler than that of the system as a whole.Emergent behavior, by definition, is what’s left after everything else has been explained.”2

Dan Geer and Dave Aitel highlight this problematic scenario and the implications this has for cybersecurity frameworks:

“Current cybersecurity frameworks—such as the National Institute of Standards and Technology’s Secure Software Development Framework or Open Worldwide Application Security Project guidelines—assume predictable human coders and defined accountability. These assumptions collapse beneath the weight of AI’s velocity, opacity, and autonomy.”3

LLMs are an important underpinning of the current agentic AI ecosystem. Chaining these agentic AI services together makes for a large and complex system with many possible unknown outcomes. This is mostly due to the fact that LLMs process natural language as “instructions” but also as “data”. Mixing data and instructions in the same “execution pipeline” breathes life into this emergent behavior.

What recourse exists when one of these autonomous systems produces unwanted results? Can consumers expect recompence if something goes wrong and who determines that? Certainly, malicious users will do their best to identify and exploit weaknesses in these systems at the expense of someone else.

It is now possible for anyone using a few words to script an agentic AI system into performing serval tasks on behalf of the user, which previously would have required expert programmers to accomplish. Everyone can now create instructions for machines.

https://www.youtube.com/watch?v=LCEmiRjPEtQ

George Dyson, Darwin Among the Machines, Addison-Wesley, 1997

3 https://www.lawfaremedia.org/article/ai-and-secure-code-generation

Efforts afoot

Secure software development practices have evolved a lot since the early days of the internet. The software development community realized quickly that generative AI has potential, but also there are aspects of the technology that require defined best practices and mitigations to limit the impact of malicious users and to guard against unforeseen side-effects.

Community groups such as OWASP and the Cloud Security Alliance (CSA) were among the first to publish guidance on how to secure generative AI powered applications. Here is a non-exhaustive list of prominent projects that address challenges in this space:

  • OWASP Top 10 for LLM Application 4
  • OWASP Securing Agentic Application Guide 1.0 5
  • OWASP Multi-Agentic system Threat Modelling Guide v1.0 6
  • OWASP Agentic AI – Threats and Mitigations 7
  • CSA Agentic AI Red Teaming Guide 8
  • CSA Agentic AI Identity & Access Management 9
  • CSA Secure Agentic System Design: A Trait-Based Approach 10
  • CAS Agentic AI Threat Modeling Framework: MAESTRO 11
  • MITRE ATLAS (Adversarial Threat Landscape for Artificial Intelligence Systems) 12

The list of resources is growing; some address specific elements of agentic AI systems such as weakness in the model context protocol (MCP) itself.13

Individually these resources provide good guidance based on current capabilities of LLMs and their respective ecosystems. LLMs as a technology are rapidly growing in capability as a lot of effort is invested due to fierce competition among vendors in this space. The authors of these resources will have to keep pace with the emergent threat landscape as well in conjunction with LLM capabilities.

These guidelines and best practices are based on a common understanding of how systems work and the perceived impact. Please evaluate these in terms of your own use case and implementations to find an acceptable balance between risk mitigation and threat identification.

4https://genai.owasp.org/llm-top-10/
5https://genai.owasp.org/resource/securing-agentic-applications-guide-1-0/
6https://genai.owasp.org/resource/multi-agentic-system-threat-modeling-guide-v1-0/
7https://genai.owasp.org/resource/agentic-ai-threats-and-mitigations/
8https://csa-website-production.herokuapp.com/artifacts/agentic-ai-red-teaming-guide
9https://cloudsecurityalliance.org/artifacts/agentic-ai-identity-and-access-management-a-new- approach
10https://cloudsecurityalliance.org/artifacts/secure-agentic-system-design
11https://cloudsecurityalliance.org/blog/2025/02/06/agentic-ai-threat-modeling-framework-maestro
12https://atlas.mitre.org/
13https://adversa.ai/mcp-security-top-25-mcp-vulnerabilities/

Threat modeling

A security model, in simple terms, must consider identity and associated privileges, the sensitivity of the data accessed, the execution environment that will perform operations on said data, the sensitivity of the generated output and where this output is written to. This is not a complete model, but it serves as a reminder that there are many aspects involved when designing and implementing information systems.

The resources listed in the previous section share many overlapping ideas or concepts and read together these provide a strong initial reference point. Here is a condensed list of universal security themes from the OWASP AI Multi-Agent Threat Model Guide:

  • Untrusted input is dangerous and must be handled carefully.
  • Trust boundaries are fluid and span several systems.
  • Behaviors are influenced by self-organizing heterogeneous components without central control.
  • Information flows are dynamic.
  • Memory and learning can play a role in future context aware interactions.

The controls are necessary because of:

  • Expanded attack surfaces
  • Trust, bias, and adversarial exploitation
  • Agent coordination failure due to dynamic environments
  • Attacker-in-the-middle scenarios
  • Lack of accountability and traceability of decisions
  • Increased identity and access control complexity

The OWASP Agentic AI – Threats and Mitigations define 15 threats, as part of its Agentic Security Initiative (ASI), with appropriate mitigations that are mapped to the seven layers of the CSA’s Multi-Agent Security Threat Risk and Outcome (MAESTRO) framework.

 

MAESTRO LayerOWASP ASI Threat ID
1. Foundational ModelT1 – Memory poisoning
T7 – Misaligned and deceptive behavior
2. Data OperationsT1 – Memory poisoning
T12 – Agent Communication poisoning
3. Agent FrameworksT2 – Tool misuse
T6 – Intent breaking and goal manipulation
T5 – Cascading hallucinations
4. Deployment infrastructureT3 – Privilege compromise
T4 – Resource overload
T13 – Rogue agents
T14 – Human attack on multi-agent system
5. Evaluation & observabilityT8 – Repudiation and untraceability
T10 – Overwhelming human in the loop
6. Security & complianceT3 – Privilege compromise
T7 – Misaligned and deceptive behavior
7. Agent ecosystemT9 – Identity spoofing
T13 – Rogue agents
T14 – Human attacks on multi-agent systems
T15 – Human trust manipulation

 

The MAESTRO framework is much more flexible and goes beyond the ASI threat mappings. The cross-layer mapping highlights an important aspect of threat modeling with agentic AI systems. Designers and implementers must be aware that threats are fluid, and they might only instantiate over time through a gradual residual build or through contagion by exposure to certain weaknesses. These can also be described as cascading trust failures when one breakage or failure causes a break in the chain of trust. That is why trust relations can never be implicit and every interaction between systems must be explicitly verified in the context. This is especially true because of emergent behaviors.

 

MAESTRO Cross-layer ThreatOWASP Threat ID
Inter-agent data leakage cascade and goal misalignment cascadesT1 – Memory poisoning
T3 – Privilege compromise
T12 – Agent communication poisoning
Privilege compromise and lateral movementT3 – Privilege compromise
T14 – Human attacks on multi-agent system
Planning and reflection exploitationT6 – Intent breaking and goal manipulation
T7 – Misaligned and deceptive behaviors
Supply chain attacksCompromise component (library or module) at one layer to impact other layers

 

The MAESTRO framework describes a simple step-by-step approach that can be followed to implement it. The devil is in the details as each system has its peculiarities that need to be identified, and nothing is ever simple when it comes to enterprise level applications. Similarly, the list of mitigations is also high-level and depends entirely on each agent’s use case.

Trait-based approach

Evaluating these resources highlights one thing and that is there is no one way to solve any problem. As with classical “Software 1.0” applications we need to evaluate each environment and deployment based in the use case and define the scope. The emergent behavior property of generative AI makes this rather challenging in terms of existing perimeter-based security models and deterministic assumptions about legacy software.

CSA’s publication titled “Secure Agentic System Design: A Trait-Based Approach” addresses this by complementing existing security standards and approaches rather than replacing these. The approach is justified as it provides:

Improved decision-making through a transparent mental model

  • Proactive risk management
  • Scalability and adaptability
  • More flexible security design
  • Encouraging modular thinking without forcing reuse
  • Easier threat modeling and security audits.

The traits are broken down into the following 7 categories : 

  1. Controls & orchestration
    • How actions are managed in a centralized or decentralized manner.
  2. Interaction & communication
    • Direct or indirect agent-to-agent information exchange.
  3. Planning
    • Reactive or plan-based decision-making of agents and the origin of these decisions.
  4. Perception & context
    • Limited, contextual, or intent inferring interpretation based on an agent’s environment.
  5. Learning & knowledge sharing
    • Local or global improvement of agent’s capabilities.
  6. Trust
    • Is trust explicitly defined or is it assumed, and how is it enforced.
  7. Tools usage
    • Directed or supervised tool use.

These measures supplement existing cybersecurity frameworks and help to create a resilient and trustworthy environment by:

  • Establishing best practices for security architecture across a system
  • Addressing and eliminating common vulnerabilities and known attack vectors that act as prerequisites or amplifiers of more specific attacks
  • Complementing a defense-in-depth strategy through specific mitigations derived from trait-based analysis

In addition, the principal of least privilege must be enforced throughout the system. Decision transparency and traceability, along with explainability will be valuable when performing incident response and system triage. Input validation and sanitization will be as important as ever to ensure system wide protection. Resource monitoring and throttling of agent resource usage will be crucial to identify anomalies as well as identifying service abuse. Likewise, anomaly detection and behavioral monitoring will remain a crucial feature of any modern system.

Step-by-step approach for trait-based security analysis

Trait-based security analysis is an iterative process that allows system designers and implementers to identify requirements and initial scope by asking questions such as:

  • What is the purpose of the system?
  • What are the core capabilities?
  • Who are the stakeholders and users of the system?
  • What are the operational constraints of the system?
  • How sensitive is the data and what privacy concerns should be considered?
  • What are the high-level security goals?
  • What fail-safe mechanisms must be present?

Next the key agentic traits and their patterns must be identified. Select traits from the list of 7 categories.

Once the traits and patterns are identified, analyze interactions and assess the associated risks. Each pattern and trait will have known risks associated with the respective trait. Interactions that cross boundaries that impact other traits are also highlighted.

Once this task is completed, design and implement mitigations. Each trait pattern has a list of risks and applicable mitigation. Risks must be prioritized based on likelihood and potential impact of the risk with the specific system being analyzed.

Finally, the loop reaches the monitor, evaluate and adapt phase that acts to identify areas to correct or improve. Use this to close the feedback loop and adjust the security posture appropriately. This requires periodic review and analysis of the system.

Trait-based analysis and threat modeling

The trait-based analysis is well suited to working with existing security practices, software development lifecycles, etc. including threat modeling. It is possible to map the MAESTRO framework and trait-based categories to perform a system decomposition of agentic traits in terms of potential threats for a given system. The following is an example taken from the CSA Secure Agentic System Design: A Trait- Based Approach document mentioned in the “Efforts afoot” section:

 

Trait categoryRelevant MAESTRO layers
Control & OrchestrationLayer 3 (Agent Frameworks),
Layer 4 (Deployment)
Interaction & CommunicationLayer 7 (Agent Ecosystem),
Layer 5 (Evaluation & Observability)
PlanningLayer 1 (Foundation Models),
Layer 3 (Agent Frameworks)
Perception & ContextLayer 2 (Data Operations),
Layer 5 (Evaluation & Observability)
Learning & Knowledge SharingLayer 1 (Foundation Models),
Layer 2 (Data Operations)
TrustLayer 6 (Security & Compliance),
Layer 7 (Agent Ecosystem)

 

This mapping can also be useful when using other tools such as MITRE’s ATLAS matrix to identify specific tactics for a given trait category. For example, “Decentralized Control”, that falls under the Control & Orchestration trait, could be linked to MITRE ATLAS “Takeover Attacks” or “Policy Bypassing” tactics.

“Traits and Patterns” were not covered in much detail and could easily require its own blog post to discuss. The “Traits and Patterns” section describes risks that could originate from outside the system but also could be risks that materialize because of design or implementation faults.

Agentic AI and identity

Authentication (Authn) and Authorization (Authz) are fundamental security building blocks in cybersecurity. The ability to identify someone or something (Authn) allows us to establish a level of trust and enables us to grant permissions (Authz) to access certain resources. Having an identity is crucial to enforce zero-trust principles.

The OAuth 2.0 specification is the required best practice for multi context protocol (MCP) servers.14 This requirement is fine for simple end user interactions with clients-to- server interfaces. It does not cover use cases where an MCP server must interface with another agent. How does this first MCP server authenticate with the delegated MCP servers? Whose identity does it use? Are the users’ credentials passed onto the downstream servers?

Service accounts or non-human identities (NHI) are assigned to services that require certain permissions. A service’s NHI is either derived from the account that installed the service, or it is derived through explicit means. The CSA Agentic AI Identity & Access Management publication highlights several shortcomings with available options. OAuth 2.0 credentials can be used by NHI to authenticate with APIs, but this lacks behavioral awareness and session integrity. Additionally surrogate secrets or certificates can act as authentication material, but this lacks real-time behavior verification and requires special addons, especially in dynamic environments. Also, role-based access control (RBAC) linked to human accounts is ripe for abuse as this can lead to excessive
permissions and these roles are for the most part also static.

Existing identity and access management (IAM) must adapt to the unique demands of agentic AI. CSA motivates the need for NHI of agentic AI because of:

  • Emergent behavior of LLMs that leads to unpredictability of autonomous systems.
  • Ephemerality and dynamic lifecycles that are associated with the fluid environments, especially because of tool use.
  • New and evolving capabilities with unchecked intent.
  • The need for verifiable provenance and accountability when client impacting decisions are made by systems.
  • The risk of autonomous privilege escalation that must be prevented by design.
  • Over-scoped access and permissions combined with unpredictability will lead to potentially high impact incidents.
  • Secure, efficient cross-agent communication and collaboration must be explicitly enforced through cryptographically verifiable protocols such as mutual TLS (mTLS).
  • The risk that actions from autonomous systems may not originate from a human request.

Figure 1 Source: Cloud Security Alliance: Agentic AI Identity and Access Management: A New Approach15

The CSA is proposing a new standard and lists the following essential components for an Agent Identifier (Agent ID):

  • Cryptographic anchor & verifier
  • Core attributes & metadata
  • Capabilities, scope, and behavior
  • Operational & security parameters
  • Verifiable credentials (VCs) which is the key to dynamic attributes and trust

With this there exists Agent ID ownership and control that describes the principles of Self-Sovereign Identity (SSI). Finally, the ID generation, assignment, and lifecycle management process is also present to regulate how the Agent ID is used throughout the process and when it is revoked, renewed, etc.

15https://cloudsecurityalliance.org/artifacts/agentic-ai-identity-and-access-management-a-new- approach

All of this is required to support the new agentic AI Identity and Access Management framework architecture, which is comprised of:

  1. Zero-trust architecture for agentic AI.
  2. Decentralized identity management.
  3. Dynamic policy-based access control (PBAC).
  4. Authentication delegation framework.
  5. Continuous monitoring and behavioral analytics.
  6. Secure communication protocols.

There exists proof of concept code for these proposals 16 17, but at the time of writing none of this has been adopted as a standard, which suggests that there will be a lot of roll-your-own solutions, or none.

16https://github.com/akramIOT/Agentic-IAM
17https://github.com/kenhuangus/agent-id-sdk

Governance and agentic AI

Security without oversight will always be ineffective and clash with the business it serves. It is important for business leaders and security teams to work together to get the most out of a project that drives automation using generative AI. 18 At the same time, it is important to understand the risks and learn from mistakes but also learn from what works.

An agentic AI council or team must be established with each business. This team is responsible for the overall agentic AI strategy and oversight that allows the business to control and measure the impact of agentic AI on the business. Ideally this team must be led by stakeholders that report to the board and staffed with experts from the business to create a cross-functional team.

The agentic AI group must maintain a live register of agents and their use cases, track what data the agents have access to, and what controls are present that govern these use cases.

The agentic AI group is responsible for assessing the security posture of these agentic AI use cases, ideally before these go live.

The team must hold feedback and postmortems sessions to learn from any successes and failures to ensure the outcomes are shared with the rest of the organization, especially the board. These must be converted into benchmarks for future use cases.

The agentic AI team must establish guidelines for guardrails and when human intervention is required. The team must establish guidance for use cases and map these to desired outcomes. Any scenario that impacts customers must be highlighted and reviewed to ensure that timely and appropriate human level intervention is enacted to avoid unwanted outcomes.

Existing strategies and security principles can be leveraged to define policies for:

  • Human and non-human identities and how these are monitored and alerted on.
  • Least privileged accounts that limit access to and actions on data and systems.
  • Enforcing expiring credentials and access tokens.
  • Auditability and accountability for all activities associated with agentic AI driven systems.

It is this agentic AI team’s responsibility to ensure that zero-trust principles are baked into agent AI use cases from the outset.

18https://www.paloaltonetworks.com/blog/2025/09/agentic-ai-looming-security-crisis/

Conclusion

The flow of information is vital for economies and the need to process information is growing by the day. At the same time the need for deeper insight and faster synthesis of information is growing equally if not more. LLMs are a breakthrough that satisfy this need.

This results in more complex information flows, requiring richer and more sophisticated controls to ensure legitimate access to information for the purpose it was intended. Agentic identity will be crucial as this allows system designers to enforce concepts such as least privilege access and to hold relevant parties accountable.

Security models must evolve to accommodate a new type of risk that originates from non-deterministic behavior of agentic AI systems. Emergent behavior makes it difficult for system designers and administrators to limit the impact of unforeseen behavior or the manipulation of the system through malicious actors. Trait-based security models work with existing processes to best scope, define, mitigate, and possibly even eliminate potential unwanted events.

Secure by design and secure by default must be top of mind for all businesses now that everybody is effectively a programmer.

Incident Response Hotline

Facing cyber incidents right now?

Contact our 24/7/365 world wide service incident response hotline.

CSIRT