Tokenized proofs for commerce and identity verification

Tokenization in payments was designed to protect card data, and it worked: PAN tokenization removed raw card numbers from most of the merchant stack. But the same logic has not been applied consistently to identity. Commerce and identity verification workflows still routinely transmit raw date of birth, national identifiers, and document images when the downstream system needs nothing more than a binary answer. The gap between what data actually moves and what data needs to move is the surface area for breach exposure, regulatory liability, and unnecessary retention obligations.

What tokenization solved in payments

PAN tokenization replaced the primary account number in merchant systems with a surrogate value that references the real card number in a secure vault operated by the card network or a third-party token service provider. The merchant's systems never see or store the real card number after initial tokenization. If the merchant is breached, the attacker finds tokens that are useless outside the context of the tokenization service that issued them.
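The vault pattern described above can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions, not how any card network or token service provider implements it: `TokenVault`, `tokenize`, and `detokenize` are hypothetical names, and the card number is a common test value.

```python
import secrets


class TokenVault:
    """Toy in-memory vault; a real vault is a hardened, access-controlled service."""

    def __init__(self) -> None:
        self._pan_by_token: dict[str, str] = {}

    def tokenize(self, pan: str) -> str:
        # The token is random, so it has no mathematical relationship to the PAN.
        token = "tok_" + secrets.token_hex(8)
        self._pan_by_token[token] = pan
        return token

    def detokenize(self, token: str) -> str:
        # Only the vault can map a token back to the card number.
        return self._pan_by_token[token]


vault = TokenVault()
pan = "4111111111111111"  # widely used test card number, not a real account
token = vault.tokenize(pan)

# The merchant stores only the token; a breach of merchant storage yields no PAN.
merchant_record = {"order_id": "ord_1", "payment_token": token}
```

The point of the sketch is the dependency it exposes: anything that needs the real card number has to go back through the vault.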

Network tokens extended this further. Visa Token Service, Mastercard Digital Enablement Service, and equivalent schemes generate device-specific and merchant-specific tokens, so the same physical card has a different token at each merchant. A token stolen from one merchant cannot be replayed at another. Each tokenized transaction also carries a transaction-specific cryptogram that validates the payment method's authenticity without exposing the underlying PAN to the receiving merchant.
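The merchant-scoping idea can be illustrated with a toy model. The sketch below assumes a hypothetical `NetworkTokenService` with `provision` and `authorize` methods; it models only the restriction of a token to one merchant and omits the per-transaction cryptogram entirely.

```python
import secrets


class NetworkTokenService:
    """Toy model of merchant-scoped tokens; all names here are hypothetical."""

    def __init__(self) -> None:
        # token -> (PAN, merchant the token was issued for)
        self._records: dict[str, tuple[str, str]] = {}

    def provision(self, pan: str, merchant_id: str) -> str:
        token = "ntk_" + secrets.token_hex(8)
        self._records[token] = (pan, merchant_id)
        return token

    def authorize(self, token: str, merchant_id: str) -> bool:
        # A token is honored only by the merchant it was issued for.
        record = self._records.get(token)
        return record is not None and record[1] == merchant_id


service = NetworkTokenService()
pan = "4111111111111111"  # test value, not a real account
token_a = service.provision(pan, "merchant_a")
token_b = service.provision(pan, "merchant_b")
```

Here `token_a` and `token_b` refer to the same card but differ, and `service.authorize(token_a, "merchant_b")` is rejected.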

The result was a dramatic reduction in the value of payment data stolen from merchants. Tokenization did not eliminate payment fraud, but it changed its economics: stolen card data became less reusable and therefore less valuable. The infrastructure investment required to implement tokenization was significant, but it was driven by a clear principle: transmit the minimum necessary data to accomplish the transaction.

Why the same logic has not reached identity verification

Identity verification has not followed the same trajectory for several reasons. Payment tokenization was driven by a combination of regulatory pressure (PCI DSS), card network mandates, and clear financial incentives for merchants to reduce their breach liability. Identity verification lacks an equivalent set of top-down mandates that drive toward data minimization at the infrastructure level. Most KYC and identity verification workflows were built before proof-based verification was commercially viable, and they were designed around the assumption that the verifying party needs to review the raw data.

The verification provider model reinforced this. Most KYC vendors are designed to ingest raw identity documents, match against databases, and return a confidence score along with extracted data. The calling system receives not just a pass/fail result but a package of extracted identity attributes: name, date of birth, document number, address. Even when the downstream system only needs to know whether the person passed KYC, it receives and must handle all the extracted data. That data then needs to be stored, secured, and eventually deleted, creating compliance obligations that could have been avoided if the downstream system only received a binary verification result.

The distinction between a token and a proof

Tokens and proofs both reduce raw data exposure, but they work differently. A token is a surrogate for a piece of data: it replaces the data in transit and in storage while the original remains in a vault. The token is meaningless without access to the vault that maps it back to the original. A proof answers a specific question without revealing the underlying data at all. The proof does not replace a data record; it confirms whether a condition is met based on a data record, without transmitting the record to the party asking the question.

For payment card verification, a token is the appropriate mechanism: the recipient needs to be able to process the payment, which requires routing information that the token provides through vault lookup. For identity verification, a proof is often the appropriate mechanism: the recipient needs to know whether the person meets a condition, which the proof answers directly without providing the underlying identity data. The distinction matters because tokens still require data vaults and vault access patterns; proofs do not require the recipient to have any access to the underlying data at all.

This creates a fundamentally different data architecture. A system receiving identity tokens must have access, directly or indirectly, to the vault that resolves those tokens. A system receiving identity proofs has no such dependency. The proof is self-contained evidence of the verification result, signed by the verifying infrastructure, with no requirement for the recipient to access the underlying identity record.

Where proofs provide stronger guarantees than tokens in commerce contexts

In multi-party commerce workflows, identity verification evidence must often cross organizational boundaries. A benefit eligibility check for a government program may be initiated by a service provider and verified against a government database, with the result shared with a payment processor to authorize a restricted payment. Each boundary crossing is an opportunity for raw data to leak, be mishandled, or be retained beyond its intended purpose.

A proof-based architecture handles this cleanly. The government database performs the eligibility check and issues a proof. The proof crosses organizational boundaries without carrying the underlying identity data. Each party in the chain receives only the binary result and the cryptographic proof that the result is genuine. None of the parties except the original data holder ever sees the raw identity record.

This is stronger than a token in this context because a token vault lookup requires someone in the chain to have vault access. A proof requires only the ability to verify the cryptographic signature on the result. The verification capability can be widely distributed without distributing vault access, and without distributing the underlying data.
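To make the "verify without vault access" property concrete, the sketch below signs a binary result and verifies it with only a public key. It uses a Lamport one-time signature built from SHA-256 purely because it fits in the standard library; this is a stand-in chosen for illustration, not the scheme any particular provider uses, and a production system would more likely use something like Ed25519. The circuit ID `benefit-eligibility` and all function names are hypothetical.

```python
import hashlib
import json
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    # 256 pairs of secrets; the public key is the hash of each secret.
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def _bits(message: bytes) -> list[int]:
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private, message: bytes) -> list[bytes]:
    # Reveal one secret per digest bit. A Lamport key must sign only once.
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature: list[bytes]) -> bool:
    if len(signature) != 256:
        return False
    return all(
        _h(signature[i]) == public[i][bit] for i, bit in enumerate(_bits(message))
    )


def encode(result: dict) -> bytes:
    # Canonical JSON so issuer and verifier hash identical bytes.
    return json.dumps(result, sort_keys=True, separators=(",", ":")).encode()


# Issuer (the data holder) checks the record privately and signs only the answer.
issuer_private, issuer_public = keygen()
result = {"circuit_id": "benefit-eligibility", "eligible": True}  # hypothetical ID
signature = sign(issuer_private, encode(result))

# A payment processor holds only issuer_public: no vault, no identity record.
accepted = verify(issuer_public, encode(result), signature)
tampered = verify(issuer_public, encode({**result, "eligible": False}), signature)
```

Any party in the chain can run `verify` with the public key alone, and flipping `eligible` invalidates the signature.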

Privacy implications of data minimization in verification

AffixIO's cross-data-consent circuit provides a verifiable record that the subject of a verification has consented to that verification being performed. This matters because data minimization is not only about limiting what data moves, but also about ensuring that the data that does move, or is checked, is used with the knowledge and consent of the person it belongs to. A proof-based verification system can include the consent record as part of the proof structure, so that the evidence of eligibility and the evidence of consent travel together.
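One way to picture eligibility and consent travelling together is an envelope that carries the binary answer plus a digest of the consent record. The sketch below is an assumption about how such a structure could look, not a description of the cross-data-consent circuit's actual format; every field name, the `user:usr_example` identifier, and the helper functions are hypothetical.

```python
import hashlib
import json


def digest(obj: dict) -> str:
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return "sha256:" + hashlib.sha256(canonical).hexdigest()


# Hypothetical consent record; field names are illustrative only.
consent = {
    "subject": "user:usr_example",
    "purpose": "account_onboarding",
    "granted_at": "2024-01-01T00:00:00Z",
}

# The envelope carries the answer and a digest of the consent, not the consent
# details or any identity attributes.
envelope = {
    "circuit_id": "finance-kyc-verification",
    "eligible": True,
    "consent_digest": digest(consent),
}

# One digest over the whole envelope binds result and consent together; in a
# real system this is the value an issuer would sign.
envelope_digest = digest(envelope)


def consent_matches(envelope: dict, presented_consent: dict) -> bool:
    # An auditor holding the consent record can confirm it is the one referenced.
    return envelope["consent_digest"] == digest(presented_consent)
```

Because the consent digest sits inside the envelope, changing either the result or the consent record changes `envelope_digest`.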

GDPR's data minimization principle requires that personal data be adequate, relevant, and limited to what is necessary for the specified purpose. Binary verification directly implements this principle at the technical level: only the answer to the specific question is transmitted, not the underlying record. Organizations that build verification workflows on binary proof infrastructure are implementing data minimization in their architecture rather than relying on policy controls to limit downstream use of data that was transmitted in full.

Proof-backed KYC and AML without exposing raw identity data

KYC verification traditionally involves transmitting identity documents or structured identity data to a verification service, which performs checks against sanctions lists, politically exposed persons databases, and identity verification databases, then returns a pass/fail with extracted data and match details. The calling system then stores this result alongside the extracted identity data for audit purposes.

AffixIO's finance-kyc-verification circuit performs the same underlying checks but returns only the binary result and a cryptographic proof. The calling system receives confirmation that KYC checks were performed and passed, along with a proof that can be produced in an audit context, without receiving or storing the extracted identity data. The finance-aml-screening circuit follows the same pattern for AML transaction monitoring: the underlying data is checked against relevant screening databases, and the result is a binary eligible signal with a proof, not a raw match report. An example request and response:

POST https://api.affix-io.com/v1/verify
Content-Type: application/json
Authorization: Bearer <api_key>

{
  "circuit_id": "finance-kyc-verification",
  "identifier": "user:usr_9kL2m...",
  "context": {
    "purpose": "account_onboarding",
    "jurisdiction": "US"
  }
}

// Response
{
  "eligible": true,
  "proof": "sha256:4c9f2e...",
  "circuit_id": "finance-kyc-verification",
  "latency_ms": 52,
  "logged": true
}

No PII returned. Proof is audit-logged independently of the identity record.
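A caller might wrap the request shown above roughly as follows. The endpoint, headers, and field names are taken from the example; the helper names (`build_request`, `parse_result`, `verify`) are hypothetical, no official client library is assumed, and error handling is deliberately minimal.

```python
import json
import urllib.request

API_URL = "https://api.affix-io.com/v1/verify"  # endpoint from the example above


def build_request(api_key: str, circuit_id: str, identifier: str, context: dict):
    body = json.dumps(
        {"circuit_id": circuit_id, "identifier": identifier, "context": context}
    ).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def parse_result(raw: bytes) -> bool:
    # Keep only what the caller needs; fail closed if the field is missing.
    payload = json.loads(raw)
    return payload.get("eligible") is True


def verify(api_key: str, circuit_id: str, identifier: str, context: dict) -> bool:
    request = build_request(api_key, circuit_id, identifier, context)
    # Network call; error handling and retries are omitted in this sketch.
    with urllib.request.urlopen(request, timeout=5) as response:
        return parse_result(response.read())
```

Returning a plain boolean from `verify` keeps the rest of the calling system from depending on anything but the answer; a caller that needs the proof for audit would persist that field separately.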

The binary verification model: YES or NO without transmitting the underlying record

The binary verification model is the architectural principle that verification answers should be binary and that the calling system should receive the answer, not the underlying data. This is already how payment authorization works at the network level: the issuer returns an approval or decline code, not the cardholder's account balance or transaction history. The question is why identity verification has not converged on the same pattern.

Part of the answer is that payment authorization is a well-defined closed question with a standard response format, while identity verification has historically been open-ended. The calling system might need to know whether the person passed KYC, but it might also need to know which specific documents were provided, what the match confidence was, or whether the address was verified separately from the identity check. Binary verification works when the caller is willing to accept a yes/no on a well-defined question. The value of the binary model is that it forces that precision, which in turn reduces the surface area for ambiguity, edge cases, and downstream data handling obligations.

Use cases: where binary proof-based verification applies

Age verification for content access, product purchase, or service enrollment is a natural fit for binary proof. The downstream system needs to know whether the person is over 18, or 21, or whatever the threshold is. It does not need the person's actual date of birth. A circuit that returns eligible: true for age-over-18 gives the downstream system exactly what it needs without creating a PII handling obligation for a date of birth.
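The age check can be sketched as a function that runs where the record lives and lets only the boolean out. This is an illustrative sketch, not AffixIO's circuit logic; `is_at_least` and `age_over_18` are hypothetical names, and `today` is passed in explicitly so the behavior is deterministic.

```python
from datetime import date


def is_at_least(date_of_birth: date, threshold_years: int, today: date) -> bool:
    # Age in whole years: subtract one if this year's birthday has not occurred yet.
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    age = today.year - date_of_birth.year - (0 if had_birthday else 1)
    return age >= threshold_years


def age_over_18(date_of_birth: date, today: date) -> dict:
    # Runs on the data holder's side; only the boolean leaves this function.
    return {"eligible": is_at_least(date_of_birth, 18, today)}
```

The downstream system sees `{"eligible": ...}` and nothing else, so it never acquires a date of birth to store or protect.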

Access control decisions, benefit eligibility determinations, and employment screening all follow the same pattern. The decision point requires a yes or no. In each case, transmitting the underlying record to enable the decision creates data handling obligations that the binary proof approach avoids. Sanctions screening for payment authorization is a particularly clear example: the payment processor needs to know whether the payee is on a sanctions list, not what information was checked or what the matched record contained.

Where AffixIO fits

AffixIO's identity verification infrastructure and eligibility API are built on the binary verification model. Every circuit in the AffixIO library returns a binary eligible field and a cryptographic proof. No raw identity data is stored by AffixIO. The proof is logged with a timestamp, circuit ID, and verification result, providing an audit trail without retaining the underlying data that was checked.

Relevant circuits

finance-kyc-verification: Binary KYC pass/fail with cryptographic proof, no raw identity data returned
finance-aml-screening: AML screening result as a binary eligible signal with proof
cross-data-consent: Verifiable consent record for the specific verification being performed
cross-biometric-match: Binary biometric verification result without transmitting raw biometric data

The principle: Tokenization moved card numbers out of merchant systems by replacing them with surrogates. Proof-based verification moves identity data out of downstream systems by replacing data transmission with binary answers. The attack surface shrinks when the data never travels.

Frequently asked questions

What is the difference between a token and a proof in payment and identity systems?

A token is a substitute for a piece of data. A PAN token replaces the card number with a reference that can be used in its place; the original data still exists in a secure vault. A proof answers a specific question without returning the underlying data. Tokens protect data in transit and storage. Proofs eliminate the need to transmit or store the data at all, for the purpose of answering a binary question. A system receiving a proof never needs access to the vault that holds the original data.

Why does KYC still use raw PII when a binary answer would work?

Most KYC systems were built when the technical infrastructure for proof-based verification did not exist at commercial scale. The workflow of collecting identity documents, transmitting them to a verification provider, and storing the results is well established and integrates with regulatory reporting requirements that themselves demand access to raw data in some jurisdictions. Moving to a proof-based model requires both technical infrastructure for generating and validating proofs and operational acceptance that a proof-of-verification is sufficient evidence for the specific compliance context.

How does binary verification work for KYC and AML?

Binary verification for KYC and AML runs the relevant checks, such as sanctions list matching, document validity, and identity confirmation, against the underlying data and returns only the result: eligible or not eligible. The AffixIO finance-kyc-verification and finance-aml-screening circuits perform these checks without transmitting the raw identity record to the calling system. The calling system receives a YES or NO with a cryptographic proof and a latency measurement in milliseconds. The underlying data remains in the verified data store; only the result crosses the API boundary.

What is data minimization and why does it matter for verification?

Data minimization is the principle of collecting, transmitting, and storing only the data necessary for a specific purpose. In verification contexts, it means not sending raw identity records to every system that needs to know whether a person is eligible for something. Each system that receives raw PII is a potential breach point. Binary verification reduces the attack surface by ensuring that downstream systems receive only the binary result of the check, never the underlying data. Those systems therefore do not inherit retention obligations for identity data they never held.

Which verification use cases are most suited to proof-based identity checks?

Age verification, sanctions screening, benefit eligibility, KYC onboarding, AML transaction monitoring, access control, and employment eligibility are all well-suited to proof-based verification. In each case, the downstream system needs a YES or NO, not access to the underlying identity record. The fewer systems that receive the raw record, the smaller the data breach exposure. Proof-based verification is particularly valuable in multi-party workflows where verification evidence must be shared across organizational boundaries without sharing the underlying PII.

How does AffixIO handle identity verification without storing PII?

AffixIO runs verification circuits against identity identifiers provided by the calling system. The circuit evaluates eligibility conditions and returns a binary result with a cryptographic proof. No raw identity data is stored by AffixIO. The proof is logged with a timestamp and the circuit result, but the underlying data that was checked is not retained. This means AffixIO can produce an audit trail of verification events without becoming a secondary repository of the identity data it was asked to verify.

Verify identity without moving identity data

See how AffixIO's binary verification model works for KYC, AML, and eligibility checks.
