The Integrity Framework

Public audit · Q2 2026

50 indie AI tools, audited against The Integrity Framework

We audited 50 publicly listed indie AI tools against the six Layer 1 vetoes of The Integrity Framework v1.0. The headline findings: 74% do not disclose AI sub-processors in their privacy policy, 50% have verifiability gaps (missing or partial trust pages), 5 trip the artifact-vs-outcome veto by selling certification language disconnected from any audit motion, and 5 hide pricing behind sales gating. Methodology and per-tool scores below. CSV is downloadable. Published under CC BY 4.0.

Headline findings

  • 74% of audited tools don't disclose AI sub-processors (OpenAI, Anthropic, etc.) in their privacy policy.
  • 50% have verifiability gaps: fewer than three trust pages reachable at the standard paths (/security, /trust, /privacy, /changelog, and the others listed under Methodology).
  • 5 trip the artifact-vs-outcome veto (V1) with badge or certification language disconnected from any audit motion.
  • 5 gate pricing entirely behind "contact us", a pricing-rigor mismatch signal under V5.

Methodology

Each tool's public marketing surface was fetched and scored heuristically against the Layer 1 vetoes:

  • V1 — Artifact vs outcome: homepage scanned for badge / certification language ("get certified", "SOC 2 in days", "audit-ready") disconnected from an actual audit motion. Tripped → risk.
  • V2 — Independence: requires understanding business model and revenue flow. Manual-review-required on every entry.
  • V3 — Verifiability: count of reachable trust pages at standard paths (/security, /trust, /privacy, /privacy-policy, /integrity, /changelog, /methodology, /.well-known/security.txt). ≥3 = pass, 1-2 = partial, 0 = fail.
  • V4 — AI accountability: privacy policy scanned for AI sub-processor mentions (OpenAI, Anthropic, Claude, Mistral, Cohere, Replicate, Together, Perplexity, etc.). Mention found = pass. No mention but privacy page exists = flag-no-mention.
  • V5 — Pricing-rigor alignment: pricing page scanned for visible dollar amounts vs "contact us / book a demo" gating. Visible = pass; gated only = hidden.
  • V6 — TechCrunch test: subjective by design. Manual-review-required on every entry.
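The automated checks above (V3, V4, V5) reduce to small classification rules. The following is a minimal sketch, assuming the reachable-page count and page text have already been collected. The function names, the exact regular expressions, and the "no-privacy-page" and "unknown" labels are illustrative assumptions, not taken from the published script.

```javascript
// Hypothetical helpers illustrating the V3, V4, and V5 rules described above.
// They are not the published audit script.

// V3: >=3 reachable trust pages = pass, 1-2 = partial, 0 = fail.
function scoreV3(reachableCount) {
  if (reachableCount >= 3) return "pass";
  if (reachableCount >= 1) return "partial";
  return "fail";
}

// V4: vendor names from the methodology list (the source list ends in "etc.").
const AI_VENDORS = [
  "openai", "anthropic", "claude", "mistral",
  "cohere", "replicate", "together", "perplexity",
];
// Whole-word, case-insensitive match. Generic words such as "together"
// can still produce false positives, which is one reason flagged
// entries need human review.
const VENDOR_PATTERN = new RegExp("\\b(" + AI_VENDORS.join("|") + ")\\b", "i");

function scoreV4(privacyText) {
  // Assumed label for the case where no privacy page was found at all.
  if (privacyText == null) return "no-privacy-page";
  return VENDOR_PATTERN.test(privacyText) ? "pass" : "flag-no-mention";
}

// V5: a visible dollar amount passes; gating language with no amount is "hidden".
function scoreV5(pricingText) {
  const hasPrice = /\$\s?\d/.test(pricingText);
  const gated = /contact us|book a demo/i.test(pricingText);
  if (hasPrice) return "pass";
  if (gated) return "hidden";
  return "unknown"; // assumed label: neither signal present
}

console.log(scoreV3(2));                         // "partial"
console.log(scoreV4("We share data with OpenAI.")); // "pass"
console.log(scoreV5("Book a demo to learn more"));  // "hidden"
```

Keeping each rule as a pure function of already-collected signals makes it straightforward to swap in a different threshold or term list without re-fetching any pages.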

Scoring is heuristic and conservative. A "fail" or "risk" mark on this page is a signal for human review, not a final verdict. The raw signals (response codes, matched terms, page URLs) are in the JSON behind this page so anyone can re-score with different heuristics. The audit script is published at scripts/audit-indie-ai-tools.cjs in the framework repo.
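As one example of re-scoring, a reader might decide that /privacy and /privacy-policy often resolve to the same document and should count once toward V3. This is a sketch under that assumption, and it is an alternative heuristic rather than the framework's own. The record shape (a reachablePaths array) is hypothetical, and the real JSON's field names may differ.

```javascript
// Hypothetical alias table: treat /privacy-policy as a duplicate of /privacy.
const PATH_ALIASES = { "/privacy-policy": "/privacy" };

// Re-score V3 from a raw-signal record with an adjustable pass threshold
// and optional alias merging.
function rescoreV3(record, options = {}) {
  const { passAt = 3, mergeAliases = false } = options;
  const paths = record.reachablePaths.map((p) =>
    mergeAliases && PATH_ALIASES[p] ? PATH_ALIASES[p] : p
  );
  const distinct = new Set(paths).size; // count each page once
  if (distinct >= passAt) return "pass";
  return distinct > 0 ? "partial" : "fail";
}

// Made-up record for illustration only.
const exampleRecord = {
  tool: "example-tool",
  reachablePaths: ["/privacy", "/privacy-policy", "/security"],
};

console.log(rescoreV3(exampleRecord));                         // "pass"
console.log(rescoreV3(exampleRecord, { mergeAliases: true })); // "partial"
```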

Per-tool scores

Legend: pass; ~ partial; risk / fail / gap; manual review; ? not audited (bot-blocked)

Tool                  Category            V1 V2 V3 V4 V5 V6
Vanta                 compliance
Drata                 compliance          ······
Secureframe           compliance          ~
Sprinto               compliance          ~?
Conveyor              compliance          ~~
Tugboat Logic         compliance
Thoropass             compliance          ~?
Probo                 compliance          ~?
AccessiBE             accessibility
AudioEye              accessibility       ······
UserWay               accessibility       ~?
Reciteme              accessibility       ?
Cursor                ai-dev              ~
Codeium / Windsurf    ai-dev
Tabnine               ai-dev              ?
Continue.dev          ai-dev
Helicone              ai-ops              ~~
Langfuse              ai-ops              ~
Portkey               ai-ops              ~?
LangChain             ai-ops              ~~
LlamaIndex            ai-ops              ~~
CrewAI                ai-agents           ~~
BaseAI                ai-agents           ??
Pinecone              vector-db           ~~
Weaviate              vector-db           ~~
Qdrant                vector-db           ??
Chroma                vector-db
Mem                   knowledge           ?
Granola               meetings
Fathom                meetings            ~~
Fireflies             meetings
Otter                 meetings
Copy.ai               content             ~
Jasper                content
Rytr                  content             ~
Writesonic            content             ~
Canny                 product-feedback    ~
ProductBoard          product-feedback    ~~
Sleekplan             product-feedback    ~
Featurebase           product-feedback    ~?
Savio                 product-feedback    ~
Lindy                 ai-agents           ~
Magic Loops           ai-agents
Aria (Opera)          ai-assistant        ?
Perplexity            ai-search           ······
You.com               ai-search           ~~
Phind                 ai-search           ······
Hex                   data                ?
Hyperarrow            ai-ops              ??
Aikido                security            ?

4 of 50 tools could not be audited (bot detection on homepage fetch): Drata, AudioEye, Perplexity, Phind.

What this tells us

  • AI sub-processor transparency is the biggest gap. 74% of audited tools don't name the AI vendors they pass customer data through. For a category whose products consume third-party LLMs by definition, that's a disclosure problem, not an oversight.
  • Verifiability is close to binary. Tools that have invested in trust pages tend to have the full set; the rest tend to have none. The middle "we have /privacy but nothing else" cohort is small. This is consistent with how teams build out trust artifacts: once one person on the team owns the work, it gets done. When no one owns it, nothing exists.
  • Pricing-rigor mismatch is a smaller issue than expected. Only 5 tools gate pricing entirely. For sub-enterprise AI tools, that is almost always a V5 fail: the segment economics don't support per-customer contracts, so "contact us" usually masks pricing the operator is uncomfortable defending.
  • Most tools don't sell certificates as the product. Only 5 of the 46 scored tools trip the V1 artifact-vs-outcome veto. Compliance tools dominate that group of five; the broader indie AI category is cleaner than the compliance-tooling sub-segment.

If you're on this list and want to publish an INTEGRITY.md

The fastest path to closing the V3 and V4 gaps surfaced here is to publish an INTEGRITY.md at your product's repo root or canonical marketing URL. The template is CC BY 4.0 and addresses each veto in plain language. After publication, submit your listing to the directory for a Bronze or Silver tier badge. Bronze is roughly a half-day of honest reflection; no audit-firm engagement is required.

Download

The raw audit data is available as a CSV. Published under CC BY 4.0; anyone is free to re-score the underlying signals with different heuristics.

Download CSV (50 rows)
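For readers who want to recompute summary percentages from the CSV, here is a rough sketch. The column names (tool, category, v3, v4), the "not-audited" marker, and the sample rows are all hypothetical; the published file's layout may differ. Excluding bot-blocked rows from the denominator is also an assumption of this sketch.

```javascript
// Made-up sample in an assumed layout; not real audit data.
const sampleCsv = [
  "tool,category,v3,v4",
  "tool-a,example,pass,pass",
  "tool-b,example,partial,flag-no-mention",
  "tool-c,example,fail,flag-no-mention",
  "tool-d,example,not-audited,not-audited",
].join("\n");

// Naive parser: assumes no quoted fields and no embedded commas.
function parseCsv(text) {
  const [header, ...lines] = text.trim().split("\n");
  const cols = header.split(",");
  return lines.map((line) => {
    const cells = line.split(",");
    return Object.fromEntries(cols.map((c, i) => [c, cells[i]]));
  });
}

// Percentage of scored rows (bot-blocked rows excluded) whose
// `column` equals `value`, rounded to the nearest whole number.
function share(rows, column, value) {
  const scored = rows.filter((r) => r[column] !== "not-audited");
  if (scored.length === 0) return 0;
  const hits = scored.filter((r) => r[column] === value).length;
  return Math.round((100 * hits) / scored.length);
}

const sampleRows = parseCsv(sampleCsv);
console.log(share(sampleRows, "v4", "flag-no-mention")); // 67
console.log(share(sampleRows, "v3", "pass"));            // 33
```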

Caveats

  • Heuristic scoring is a signal, not a verdict. A "fail" mark means the public artifact didn't match the heuristic; it does not mean the tool is untrustworthy.
  • V2 (independence) and V6 (TechCrunch test) cannot be automated and are flagged manual-review on every entry.
  • Cloudflare bot detection blocked 4 tools from being audited at all. Listed for transparency; not scored.
  • Some "no privacy page" results are genuine misses; others are pages at non-standard paths the heuristic didn't try (e.g., /legal). Reviewing flagged entries by hand is the recommended follow-up.
  • Published 2026-05-18. Trust artifacts change; re-running the script next quarter is expected to produce different numbers.
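The non-standard-path caveat above suggests one cheap follow-up: retry flagged entries against a wider list of candidate paths before treating a privacy page as missing. The sketch below works on already-recorded response codes. Only /legal comes from the caveat itself; the other fallback paths and the statusByPath shape are illustrative guesses.

```javascript
// Standard privacy paths from the methodology.
const STANDARD_PRIVACY_PATHS = ["/privacy", "/privacy-policy"];
// Fallbacks: "/legal" is named in the caveat; the rest are guesses.
const FALLBACK_PRIVACY_PATHS = ["/legal", "/legal/privacy", "/policies/privacy"];

// Return the first candidate path that answered 200, or null if none did.
// `statusByPath` is a hypothetical map of path -> HTTP status code.
function findPrivacyPage(statusByPath, extraPaths = []) {
  const candidates = [...STANDARD_PRIVACY_PATHS, ...extraPaths];
  return candidates.find((p) => statusByPath[p] === 200) ?? null;
}

// Made-up response codes for illustration.
const exampleStatuses = { "/privacy": 404, "/privacy-policy": 404, "/legal": 200 };

console.log(findPrivacyPage(exampleStatuses));                         // null
console.log(findPrivacyPage(exampleStatuses, FALLBACK_PRIVACY_PATHS)); // "/legal"
```

A hit on a fallback path only shows that a page exists there; whether it actually contains the privacy policy still needs a manual look.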