수요일, 6월 17, 2026
HomePersonal HealthIntroducing Agent Harness Testing in Cisco AI Protection

Introducing Agent Harness Testing in Cisco AI Protection


Right now, we’re excited to introduce Agent Validation as a brand new analysis functionality in AI Protection: Explorer Version, the free self-service model of Cisco AI Protection, that’s constructed particularly for agentic AI programs. Agent Validation builds on the agentic safety enhancements to Cisco AI Protection introduced at Cisco Stay, which launched adaptive pink teaming, Coverage Studio guardrails, and provide chain discovery for brokers. Agent Validation joins the prevailing suite of pink teaming options, extending Explorer Version’s protection to the surfaces which can be distinctive to agent harnesses: device routes, oblique content material channels, and chronic state throughout periods. 

Agent Validation is the primary functionality in what is going to turn into a broader portfolio of agent harness testing in Cisco AI Protection. We are going to proceed increasing protection as new agent patterns, frameworks, and assault lessons emerge within the menace panorama. 

Why Brokers Want Their Personal Crimson Teaming 

Chat-based pink teaming is important for evaluating how a mannequin handles adversarial prompts, jailbreaks, and multi-turn manipulation. It checks the conversational floor totally, as a result of it’s how most customers work together with most fashions. When a mannequin is wrapped in an agent harness, the scaffolding of instruments, reminiscence, retrieval, and orchestration logic that turns a standalone mannequin into an agent, new assault surfaces seem {that a} conversational evaluator was by no means designed to observe or exploit. 

Brokers learn help tickets, fetch documentation, set up abilities, and write to recordsdata. They might name instruments with arguments the consumer by no means typed or run multi-step workflows that span throughout a number of periods. An attacker who understands agent harnesses might concentrate on plant directions in content material the agent will retrieve, form device arguments in methods the consumer by no means typed, or coerce the agent into modifying persistent state that survives the present session. 

A conversational analysis won’t observe any of this. The chat transcript seems clear.  In the meantime, the precise exploit exists exterior the chat interplay itself. 

We constructed Agent Validation to check the surfaces that matter for agentic programs: 

  • Software routes: what the agent does when its personal professional instruments are invoked with malicious arguments
  • Oblique channels: directions hidden in retrieved paperwork, device outputs, help tickets, and different content material the agent treats as knowledge
  • Persistent state: modifications to coverage recordsdata, workflow definitions, approval state, and put in capabilities that survive previous the present session 

These threats map again to the Cisco AI Safety and Security Framework taxonomy, masking attacker aims like OB-001 Purpose Hijacking, OB-007 Sabotage / Integrity Degradation, and OB-009 Provide Chain Compromise, alongside agent-specific strategies like oblique immediate injection, device parameter abuse, and untrusted ability set up. The framework provides us a shared vocabulary for what we’re testing and why it issues. 

What Makes Our Strategy Completely different 

Each agent deployment has completely different instruments, content material sources, and coverage artifacts; the assault floor is formed by what’s wired into the harness itself. Agent Validation runs an autonomous attacker that performs reside reconnaissance in opposition to your particular agent, builds a structured profile of the assault floor, and adapts if preliminary assaults had been unsuccessful. 

A tough downside in agent pink teaming is understanding whether or not an assault really succeeded. If the agent says “I put in the ability” or “I fetched that URL,” that’s a declare, not proof. Agent Validation solves this with a verification method that produces unbiased floor reality by correlating the agent’s response with what the framework really noticed and with out-of-band telemetry the agent has no motive to deal with as important. A discovering is just marked confirmed when these unbiased indicators agree. 

The Agent Validation UX is three straightforward steps: join an agentic goal, choose Agent Validation because the validation sort, and click on Run. No goal picker, funds slider, or purpose textual content field. Determine 1 reveals this intimately. 

Determine 1. Beginning an Agent Validation Run


Each run executes a pre-defined protection matrix curated by Cisco’s AI Risk Intelligence & Safety Analysis crew—the identical crew that maintains the Cisco AI Safety and Security Framework. The aims cowl oblique immediate injection, system-prompt integrity, device argument abuse, exfiltration, persistence and coverage mutation, functionality chaining, untrusted code paths, and sensitive-data solicitation. 
 

What the Report Delivers 

Determine 2. Protection matrix and overview seen after run completion

 

Each Agent Validation run produces a report organized round what a safety chief must act on: 

  • Protection transparency: aims whole versus aims exercised, so prospects can see truthfully what was executed for any given run (Determine 2) 
  • Findings sorted by severity: every with the originating try, the agent’s response, the device calls noticed, the canary sign if any, the benign-control replay consequence, and a remediation notice (Determine 3) 
  • Found, attacked, and skipped instruments: what reconnaissance enumerated, what the attacker exercised, and what it skipped and why 
  • A full proof path: the immediate, the response, the baseline habits on a impartial floor, the management replay, and the generated “malicious” artifact 


Determine
3. Findings overview of an Agent Validation run

Wanting Forward

As agent frameworks, device ecosystems, and ability codecs evolve, the assault surfaces will evolve with them. The menace panorama will drive what we construct subsequent: new aims, new attacker techniques, and broader protection as agent patterns shift in actual deployments. 

To see Agent Validation in motion, go to Cisco AI Protection: Explorer Version immediately. 

Disclaimer: Agent Validation analysis outcomes mirror agent habits in opposition to the described methodology on the time of testing and don’t represent an endorsement, certification, or assure that any agent is protected, safe, or match for a particular use case. Clients are chargeable for conducting their very own assessments and for layering applicable runtime protections on high of validation outcomes. Cisco AI Protection: Explorer Version is offered as-is with out warranties of any type. 

RELATED ARTICLES
RELATED ARTICLES

Most Popular