A structured, evidence-based framework for evaluating autonomous AI agents operating inside European organisations. Seven dimensions. Weighted scoring. Independent assessment. One registry.
Europe is deploying autonomous agents into production faster than any governance framework can track. Agent Certified exists so that insurers, regulators, boards and counterparties have a common way to read whether an agent is safe to rely on.
A structured benchmark that produces a defensible position on governance, oversight, technical controls and operating readiness before an incident, not after.
A consistent signal of risk posture across portfolios. Certification maps to the governance, transparency and human oversight obligations emerging under the EU AI Act.
A clear artefact directors can reference in risk committees, vendor reviews and annual reports without having to construct one from scratch.
Every certified agent is evaluated against seven dimensions. Each dimension carries a weight reflecting its importance to operational safety and regulatory exposure. Together they total one hundred points.
Guardrails, red teaming, misuse detection and the measurable prevention of unsafe actions in production.
How the agent sources, verifies and refreshes the data it reasons over, including provenance and lineage.
Who can invoke the agent, under what authority, and how downstream actions are bounded.
Reliability of the agent as a product surface: uptime, regression discipline, evaluation coverage and versioning.
Board oversight, documented policies, risk registers, role accountability and audit trails at the operating level.
How responsibly the agent sits inside existing systems of record, identity, approval and escalation.
The explicit boundary between autonomous action and human confirmation, including revocation, rollback and hard stops. The single most important constraint on operational risk.
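The scoring model above can be sketched in a few lines. This is an illustrative reconstruction only: the source states that the seven weights total one hundred points but does not publish the individual values, so the weights below (and the key for the unnamed integration dimension) are hypothetical placeholders.

```python
# Hypothetical dimension weights summing to 100 points.
# The real weights are set by the Agent Certified standard and are not
# reproduced here; these values exist only to show the arithmetic.
WEIGHTS = {
    "trust_safety": 20,
    "context_integrity": 15,
    "distribution_control": 10,
    "product_maturity": 10,
    "governance": 15,
    "integration": 10,        # placeholder name for the systems-of-record dimension
    "autonomy_envelope": 20,
}
assert sum(WEIGHTS.values()) == 100

def weighted_score(dimension_scores: dict) -> float:
    """Each dimension is scored 0.0-1.0; the weighted result is out of 100."""
    return sum(WEIGHTS[d] * s for d, s in dimension_scores.items())
```

An agent scoring 1.0 on every dimension would receive the full 100 points; weaker performance on a heavily weighted dimension such as the autonomy envelope pulls the total down proportionally more than the same gap elsewhere.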
The weighted score across all seven dimensions places the agent into one of five recognised tiers. The tier is the signal that counterparties read.
Baseline acknowledged. Governance, oversight and technical evidence are not yet sufficient for a certification call.
Operator has initiated formal controls. Recognised as an active candidate, not yet certified.
The agent meets the operating floor. Counterparties may rely on the mark for standard commercial use.
Materially above floor. Suitable for higher exposure deployments including regulated sectors.
Exemplar. Used as a reference profile for insurer underwriting models and for sector standards work.
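Mapping the weighted score to a tier is a simple banding step. A minimal sketch follows; the band thresholds are assumptions chosen for illustration, since the standard does not state the cut-off scores here.

```python
def tier(score: float) -> str:
    """Map a 0-100 weighted score to one of the five tiers.

    The thresholds below are hypothetical; the actual cut-offs are
    defined by the Agent Certified standard.
    """
    bands = [
        (85, "Tier 5: exemplar, reference profile"),
        (70, "Tier 4: materially above floor"),
        (55, "Tier 3: certified, operating floor met"),
        (40, "Tier 2: active candidate"),
        (0,  "Tier 1: baseline acknowledged"),
    ]
    for floor, label in bands:
        if score >= floor:
            return label
    raise ValueError("score must be between 0 and 100")
```

Only the top three bands carry reliance value for counterparties; the lower two mark an operator's position on the path to certification rather than a certification itself.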
The standard is built on existing, recognised instruments rather than in parallel to them. Every dimension traces back to at least one primary reference.
| Reference | Issuer | Relevance |
|---|---|---|
| ISO/IEC 42001:2023 | International Organization for Standardization | AI management system requirements. Informs Governance and Product Maturity dimensions. |
| NIST AI Risk Management Framework | US National Institute of Standards and Technology | Risk function model. Informs Trust & Safety and Context Integrity dimensions. |
| EU AI Act, Articles 9, 10, 14, 15 | European Parliament and Council | Risk management, data governance, human oversight and accuracy. Maps directly to dimension scoring. |
| EU AI Act, Article 26 | European Parliament and Council | Deployer obligations. Informs Distribution Control and Autonomy Envelope dimensions. |
| EIOPA supervisory statements on AI in insurance | European Insurance and Occupational Pensions Authority | Sector alignment for insurer reliance on the certification. |
Assessments are scheduled in quarterly cohorts. Q3 2026 slots are open to European enterprises and scale-ups operating agents in production environments. Submit a request to receive an intake briefing.