AI Agent Vetting Criteria 2026: The 7-Criterion Procurement Bar, Published
The vetting framework applied to every AI agent vendor on this site. Seven criteria, each defined, justified, and applied transparently. Pass bar: five of seven.
Why a procurement bar matters in 2026
The AI agent market as of mid-2026 has hundreds of vendors and minimal procurement infrastructure. Directories list thousands of tools. Vendor marketing claims deflection rates and accuracy numbers with no methodology. Enterprise buyers under board pressure to ship AI agent projects have nowhere to go for a defensible shortlist.
Production safety in the AI agent cluster is not yet solved. Industry reporting from 2026 indicates that 88% of enterprises experienced an AI agent security incident in the prior twelve months. A February 2026 multi-university red-team study found that agents under adversarial conditions could delete email infrastructure to cover up errors and disclose personally identifiable information through indirect prompt injection channels. A procurement bar that starts with SOC 2 Type II and ISO 42001 is not paranoia. It is a minimum for a board-defensible vendor shortlist.
The seven criteria below are the criteria applied to every vendor on this site. They are published here so that: (a) buyers can use them as a procurement checklist for any vendor, listed here or not; (b) vendors know exactly what bar they need to clear to qualify for listing; and (c) the editorial process is transparent and auditable.
Security and Compliance Certifications
Every listed vendor must carry SOC 2 Type II and at least one of: ISO 27001, ISO 42001, AIUC-1, HIPAA, PCI-DSS, or FedRAMP, selected as appropriate to the use case. SOC 2 Type II is the floor. It attests controls against the Trust Services Criteria (security, plus availability, confidentiality, processing integrity, and privacy as scoped) over an observation period, verified by an independent third-party audit.
ISO 42001 is the AI-specific governance standard, certifiable since late 2023. As of April 2026, Anthropic, Intercom Fin, Sierra, Glean, Decagon, Cognigy, Cresta, and Writer are certified. OpenAI has publicly committed. ISO 42001 is increasingly preferred in enterprise procurement RFPs as the AI-specific complement to ISO 27001.
AIUC-1 is the agent-specific reliability standard from the Artificial Intelligence Underwriting Company. Schellman was the first authorised auditor in early 2026. UiPath was the first enterprise certificant. AIUC-1 covers 50+ controls across Safety, Security, Reliability, Accountability, Data and Privacy, and Society, mapping to MITRE ATLAS and OWASP Top 10 for Agentic Applications. No vendor in the current v1 directory holds AIUC-1 certification, but it is tracked as an emerging criterion.
What appropriate to the use case means: HIPAA for healthcare-adjacent deployments, PCI-DSS for payment processing or anything touching cardholder data, FedRAMP for US federal government deployments. A vendor in a general enterprise CS or sales context that carries SOC 2 + ISO 27001 + ISO 42001 passes this criterion fully.
Public Reference Customers
At least three named customers publicly cited in a published source: vendor customer page, customer case study, blog post, press release, or analyst report. Anonymous logo walls do not count. A vendor may have 500 enterprise customers, but a logo wall with no accompanying named, published citation fails this criterion.
NDA references count if the vendor will broker a reference call. The procurement reality is that named references willing to be cited publicly give buyers evidence the product has succeeded in a comparable environment. The procurement memo question is: can I call someone who is using this in production?
Concrete examples from the v1 directory: Decagon names Notion, Duolingo, Substack, Bilt, Rippling, ClassPass, Eventbrite, and Figma on its customer page. Sierra names WeightWatchers, SiriusXM, and Sonos. Intercom Fin names Anthropic and Bark. Harvey names Allen & Overy, PwC, and Macfarlanes. Glean names Reddit, Pinterest, and Databricks.
The distinction between marketing and procurement: a vendor logo on a website is marketing. A named customer willing to be cited in a press release or case study is procurement evidence. The standard here is the latter.
Pricing Transparency
Either (a) published pricing tier on the vendor website, or (b) publicly disclosed starting price or pricing model in a credible third-party source: Vendr buyer guides, G2 pricing data, a public marketplace listing, or the vendor's own blog post. Contact sales alone with no triangulated public range fails this criterion.
Public pricing examples in the v1 directory: Intercom Fin at $0.99 per resolved conversation (public pricing page). Clay at $185 to $495 per month (public pricing page, reset March 2026). AiSDR at $900 to $2,500 per month (public pricing page). Apollo.io at $49 to $119 per user per month (public pricing page). Cognition (Devin) starting from $20 per month plus Agent Compute Units (public pricing page).
Triangulated ranges: Decagon at approximately $95,000 to $400,000 per year (Vendr buyer guide). Glean at approximately $97,500 per year median (Vendr buyer guide). Forethought at approximately $56,000 to $60,000 per year (Vendr buyer guide). Triangulated ranges are stamped vendor-claimed in amber where they derive from a single non-vendor source.
Why pricing transparency is itself a procurement signal: a vendor willing to publish pricing or triangulate a public range demonstrates confidence in their value proposition. A vendor with no public pricing and no triangulated range from any credible source is creating information asymmetry that disadvantages buyers.
Data Residency and Training Opt-Out
An explicit, written statement on: (a) where customer data is stored, naming the region or regions (US, EU, APAC, or multi-region), (b) whether the vendor opts customers out of training on their data by default and how to confirm, and (c) the data retention policy. The statement must be findable without reading the full privacy policy: a dedicated security or trust page, a data processing addendum template, or a publicly linked DPA template.
A buried clause in a 12,000-word privacy policy does not pass this criterion. The procurement reality in 2026 is that EU AI Act, GDPR, and data residency requirements in regulated industries mean a compliance lead will ask specifically about these three points in the vendor assessment questionnaire. A vendor that cannot produce a one-page summary is creating procurement friction.
A linked DPA template is a strong positive signal. It means the vendor has done the legal work, has a standard form, and can close a procurement process faster. Decagon, Sierra, Cognigy, Cresta, Anthropic, Glean, Writer, and Intercom Fin all publish accessible DPA templates or dedicated trust center pages with explicit answers to all three questions.
The EU AI Act context: as of April 2026, vendors using AI systems in high-risk categories (credit scoring, employment, healthcare, law enforcement, critical infrastructure) are subject to transparency and documentation requirements. Data residency in the EU for EU deployments is increasingly a hard requirement, not a preference.
Outcome Accountability
Either (a) outcome-based pricing: per resolved conversation, per qualified lead, per booked meeting, per closed ticket with a defined service-level agreement; or (b) published deflection, accuracy, reply, or quality benchmarks with methodology; or (c) a public service-level agreement on uptime, response time, and accuracy. Pure per-seat or per-user pricing with no outcome metric does not pass this criterion on its own.
Why outcome accountability matters: outcome-based pricing puts vendor skin in the game. A vendor charging per resolved conversation has a commercial incentive to actually resolve conversations. A vendor charging per seat has a commercial incentive to sign contracts, not to deliver outcomes. The criterion exists because buyers under procurement pressure need defensible evidence that the vendor expects to deliver measurable results.
Outcome model examples from the v1 directory: Intercom Fin charges $0.99 per resolved conversation only. Sierra uses pre-negotiated outcome-based pricing with custom SLA terms. Cresta publishes conversation quality and accuracy benchmarks with methodology. Cognigy publishes an SLA on uptime and response time. Glean publishes search relevance and knowledge-retrieval accuracy benchmarks. Clay publishes throughput and accuracy benchmarks for enrichment actions.
Vendor-claimed benchmarks (deflection rates, accuracy numbers, NPS improvements) are accepted as a pass with an amber flag indicating vendor-claimed status. Independently audited benchmarks from a third party are rated higher but are rare in the current market.
Time in Market
Greater than six months in production with greater than ten paying customers, with the customer count triangulated from publicly cited references. Pre-revenue vendors and design-partner-only vendors do not pass this gate regardless of how impressive the product looks. The AI agent cluster has hundreds of vendors less than six months past their first production deployment; they are not procurement-ready for enterprises under board pressure.
The six-month bar is deliberately low. It is not a maturity bar; it is a minimum-viability bar. A vendor with six months in production has at minimum started to encounter real edge cases, has iterated on at least one production failure mode, and has at least some reference customers with real experience. A vendor with three months in production has none of those.
Founding date versus production-launch date: these are tracked separately. A vendor founded in 2022 but in closed beta until late 2024 passes on founding date but must be evaluated on public-launch date for the time-in-market criterion. The data schema on this site tracks both. Cognition (Devin) founded 2023, public launch March 2024, production since mid-2024: passes. A hypothetical vendor founded 2025, demo-only as of April 2026: does not pass.
Team founding date is sourced to LinkedIn, Crunchbase, or the vendor's own about page. Customer count is triangulated from publicly cited customer logos plus case studies plus press. Where the count cannot be triangulated to at least ten, the criterion is marked partial.
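The dual-date tracking and triangulation logic above can be sketched as a small record type. This is an illustrative sketch only: the field names, the `VendorRecord` class, and the method are hypothetical, not the site's actual data schema.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class VendorRecord:
    """Illustrative vendor record; field names are hypothetical, not the site's schema."""
    name: str
    founded: date                # sourced from LinkedIn, Crunchbase, or the about page
    public_launch: date          # date the product left closed beta; tracked separately
    cited_customers: list[str] = field(default_factory=list)  # named, published refs

    def time_in_market(self, as_of: date) -> str:
        # Gate: more than six months in production, more than ten paying customers.
        months_live = (as_of.year - self.public_launch.year) * 12 \
                      + (as_of.month - self.public_launch.month)
        if months_live <= 6:
            return "fail"
        # Customer count is triangulated from public citations; when it cannot
        # be triangulated to at least ten, the criterion is marked partial.
        return "pass" if len(self.cited_customers) > 10 else "partial"
```

For example, a hypothetical vendor launched in September 2024 with twelve publicly cited customers would evaluate as a pass as of April 2026, while a demo-only vendor launched two months prior would fail the gate.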
Team Composition
A public team page with named founders and at least one named senior engineering or research lead. Anonymous teams fail. LinkedIn-only presence without a vendor-published team page fails. The criterion exists because anonymous teams are a procurement red flag: enterprise procurement cannot do reference checks, background checks, or press verification on anonymous founders.
Founders' prior employment and relevant publications count toward credibility but are not gating. Sierra founded by Bret Taylor (former Salesforce CEO) and Clay Bavor (former Google VP): high credibility signal. Decagon founded by Jesse Zhang and Ashwin Sreenivas: public profiles, researchable histories. Glean founded by Arvind Jain (former Google search engineer): directly relevant prior experience. Harvey founded by Winston Weinberg and Gabriel Pereyra: named, researchable.
What counts as a public team page: the vendor's own about page or team page with photos, names, and at minimum brief bios. A LinkedIn company page without corresponding vendor-published profiles does not satisfy this criterion. The standard is that a procurement lead can navigate to the page without a LinkedIn account and find the information.
Named senior engineering or research lead: at least one person in a CTO, VP Engineering, or research lead role with a public presence. This does not need to be the founder. It is evidence that the technical organisation has public accountability.
The Pass Bar: Five of Seven
A vendor must pass at least five of the seven criteria to be listed. The two unmet criteria are flagged on the vendor's profile in amber, with the specific gap named. Passing all seven is the ideal; passing five or six qualifies.
Why five and not six: at six, the listed pool was too small for a useful directory at v1. Most enterprise AI agent vendors carry SOC 2 + ISO 27001 and have public references but are on the wrong side of ISO 42001 (which went live in late 2023) or pricing transparency (most enterprise vendors use contact-sales). A five-of-seven bar catches vendors that have done the procurement-readiness work in the areas that matter most to buyers, while acknowledging that the market is still maturing on ISO 42001 and pricing disclosure.
Vendors that pass four or fewer criteria are placed on the watchlist. The watchlist is a transparency mechanism: it shows buyers why a vendor they may have heard of is not yet on the main list, and what would qualify them. Four watchlist vendors appear at launch. See the watchlist page.
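The listing rule reduces to a simple counting function. A minimal sketch in Python; the criterion keys and the function itself are illustrative, not the site's actual tooling:

```python
# Illustrative sketch of the listing rule: pass bar is five of seven,
# unmet criteria are surfaced as amber flags, anything below the bar
# goes to the watchlist. Criterion keys are shorthand for this sketch.
CRITERIA = [
    "security_certifications",
    "public_references",
    "pricing_transparency",
    "data_residency_opt_out",
    "outcome_accountability",
    "time_in_market",
    "team_composition",
]

def listing_decision(results: dict[str, bool]) -> dict:
    passed = [c for c in CRITERIA if results.get(c, False)]
    amber_flags = [c for c in CRITERIA if c not in passed]
    status = "listed" if len(passed) >= 5 else "watchlist"
    return {"status": status, "passed": len(passed), "amber_flags": amber_flags}
```

A vendor passing five of the seven would come back as listed with two amber flags naming the specific gaps, mirroring how profiles are annotated on the site.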
The Seven Criteria at a Glance
Each criterion and the evidence required:
1. Security and compliance certifications: SOC 2 Type II plus at least one of ISO 27001, ISO 42001, AIUC-1, HIPAA, PCI-DSS, or FedRAMP, matched to the use case.
2. Public reference customers: at least three named customers cited in a published source; anonymous logo walls do not count.
3. Pricing transparency: a published pricing page, or a starting price or range triangulated from a credible third-party source.
4. Data residency and training opt-out: an explicit written statement on storage region, default training opt-out, and retention, findable without reading the full privacy policy.
5. Outcome accountability: outcome-based pricing, published benchmarks with methodology, or a public SLA; per-seat pricing alone does not pass.
6. Time in market: more than six months in production and more than ten paying customers, triangulated from public citations.
7. Team composition: a public team page with named founders and at least one named senior engineering or research lead.
Glossary of Standards
Eight terms used throughout this directory, defined for procurement teams.