The sanitize pre-hook scans input text for personally identifiable information (PII). It supports 50+ pattern types, 5 checksum validators, and 6 categories, with a focus on EU member state identifiers.

Three Modes

import { complior } from '@complior/sdk';
import OpenAI from 'openai';

// Mode 1: Replace (default). Redacts PII with labels.
const replaceClient = complior(new OpenAI(), { sanitizeMode: 'replace' });
// Input:  "My SSN is 123-45-6789"
// Passed: "My SSN is [PII:SSN]"

// Mode 2: Block. Throws PIIDetectedError on first match.
const blockClient = complior(new OpenAI(), { sanitizeMode: 'block' });

// Mode 3: Warn. Passes input through and adds metadata only.
const warnClient = complior(new OpenAI(), { sanitizeMode: 'warn' });
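
To make the difference between the three modes concrete, here is a minimal, self-contained sketch of how they might behave for a single pattern (US SSN). It does not use or reproduce the SDK's internals: `sanitizeSketch`, `SanitizeResult`, and `PIIDetectedErrorSketch` are hypothetical names introduced only for illustration.

```typescript
// Illustrative only: a local stand-in, not the @complior/sdk implementation.
type SanitizeMode = 'replace' | 'block' | 'warn';

// Hypothetical stand-in for the SDK's PIIDetectedError.
class PIIDetectedErrorSketch extends Error {
  readonly piiType: string;
  constructor(piiType: string) {
    super('PII detected: ' + piiType);
    this.piiType = piiType;
    this.name = 'PIIDetectedErrorSketch';
  }
}

interface SanitizeResult {
  text: string;       // text that would be passed on to the model
  detected: string[]; // PII types found in the input
}

// A single pattern keeps the sketch short.
const SSN_PATTERN = /\b\d{3}-\d{2}-\d{4}\b/g;

function sanitizeSketch(input: string, mode: SanitizeMode): SanitizeResult {
  const redacted = input.replace(SSN_PATTERN, '[PII:SSN]');
  const found = redacted !== input;
  if (!found) return { text: input, detected: [] };

  // Block: refuse the request as soon as PII is seen.
  if (mode === 'block') throw new PIIDetectedErrorSketch('SSN');
  // Warn: pass the text through unchanged, report what was found.
  if (mode === 'warn') return { text: input, detected: ['SSN'] };
  // Replace: swap each match for its redaction label.
  return { text: redacted, detected: ['SSN'] };
}
```

In this sketch, block mode is the only one that interrupts the call, so callers would wrap the request in a `try`/`catch`; replace and warn both return normally and differ only in whether the text is altered.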

Categories & Types

National IDs (identity_national)

8 EU/US national identification numbers.
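| Type | Country | Pattern | Validator | Redaction Label |
| --- | --- | --- | --- | --- |
| SSN | US | `\d{3}-\d{2}-\d{4}` | | `[PII:SSN]` |
| BSN | NL | `\d{9}` | 11-check | `[PII:BSN]` |
| NIR | FR | `[12] dd dd dd ddd ddd dd` | mod-97 | `[PII:NIR]` |
| PESEL | PL | `\d{11}` | Weighted checksum | `[PII:PESEL]` |
| CODICE_FISCALE | IT | `[A-Z]{6}\d{2}[A-Z]\d{2}[A-Z]\d{3}[A-Z]` | Check character | `[PII:CODICE_FISCALE]` |
| PERSONALAUSWEIS | DE | `[CFGHJKLMNPRTVWXYZ0-9]{9}\d` | | `[PII:PERSONALAUSWEIS]` |
| DNI | ES | `\d{8}[A-Z]` | | `[PII:DNI]` |
| NIF | PT | `\d{9}[A-Z]{2}` | | `[PII:NIF]` |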
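
The 11-check (BSN) and weighted checksum (PESEL) validators can be sketched as below. This is an illustration of the publicly documented check-digit rules, not the SDK's code; `isValidBsn` and `isValidPesel` are hypothetical function names.

```typescript
// Sketch of two checksum checks; the function names are hypothetical, not SDK exports.

// Dutch BSN "11-proof": weights 9..2 for the first eight digits and -1 for the
// ninth; the weighted sum must be divisible by 11.
function isValidBsn(candidate: string): boolean {
  if (!/^\d{9}$/.test(candidate)) return false;
  const weights = [9, 8, 7, 6, 5, 4, 3, 2, -1];
  const sum = weights.reduce(
    (acc, w, i) => acc + w * (candidate.charCodeAt(i) - 48),
    0,
  );
  return sum % 11 === 0;
}

// Polish PESEL: weights 1,3,7,9 repeating over the first ten digits; the
// eleventh digit must equal (10 - sum mod 10) mod 10.
function isValidPesel(candidate: string): boolean {
  if (!/^\d{11}$/.test(candidate)) return false;
  const weights = [1, 3, 7, 9, 1, 3, 7, 9, 1, 3];
  const sum = weights.reduce(
    (acc, w, i) => acc + w * (candidate.charCodeAt(i) - 48),
    0,
  );
  const expected = (10 - (sum % 10)) % 10;
  return expected === candidate.charCodeAt(10) - 48;
}
```

A bare `\d{9}` or `\d{11}` pattern matches many order numbers and timestamps; requiring the checksum to pass is what lets such short numeric patterns sit early in the matching order without flooding the output with false positives.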

Passports (identity_passport)

14 EU country formats + generic international passport pattern.
| Type | Country |
| --- | --- |
| PASSPORT_DE | Germany |
| PASSPORT_FR | France |
| PASSPORT_NL | Netherlands |
| PASSPORT_PL | Poland |
| PASSPORT_IT | Italy |
| PASSPORT_ES | Spain |
| PASSPORT_PT | Portugal |
| PASSPORT_BE | Belgium |
| PASSPORT_AT | Austria |
| PASSPORT_SE | Sweden |
| PASSPORT_DK | Denmark |
| PASSPORT_FI | Finland |
| PASSPORT_IE | Ireland |
| PASSPORT_CZ | Czech Republic |
| PASSPORT_GENERIC | International |

Financial (financial)

| Type | Description | Validator | Article |
| --- | --- | --- | --- |
| IBAN | International Bank Account Number (ISO 13616) | mod-97 | GDPR Art.6 |
| CREDIT_CARD | Card numbers (spaced and contiguous) | | GDPR Art.6 |
| SWIFT_BIC | SWIFT/BIC bank codes | | GDPR Art.6 |
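
The mod-97 check used for IBANs can be sketched as follows. This follows the standard ISO 13616 / ISO 7064 procedure and is offered as an illustration, not as the SDK's implementation; `isValidIban` is a hypothetical name, and the length bounds in the regular expression are a simplifying assumption (per-country lengths are not checked).

```typescript
// Sketch of IBAN mod-97 validation; hypothetical helper, not an SDK export.
function isValidIban(raw: string): boolean {
  const iban = raw.replace(/\s+/g, '').toUpperCase();
  // Country code, two check digits, then 11 to 30 alphanumerics (15 to 34 total).
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/.test(iban)) return false;

  // Step 1: move the first four characters to the end.
  const rearranged = iban.slice(4) + iban.slice(0, 4);

  // Step 2: read the string as one big decimal number (A=10 ... Z=35) and
  // reduce it modulo 97 piece by piece so it never exceeds safe integers.
  let remainder = 0;
  for (let i = 0; i < rearranged.length; i++) {
    const code = rearranged.charCodeAt(i);
    const value = code >= 65 ? code - 55 : code - 48;
    const shift = value > 9 ? 100 : 10; // letters contribute two decimal digits
    remainder = (remainder * shift + value) % 97;
  }

  // Step 3: a valid IBAN leaves remainder 1.
  return remainder === 1;
}
```

Both spaced and contiguous input are accepted because whitespace is stripped before the check.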

Contact (contact)

| Type | Description | Article |
| --- | --- | --- |
| EMAIL | Email addresses | GDPR Art.6 |
| PHONE | International + EU domestic phone numbers | GDPR Art.6 |
| IP_ADDRESS | IPv4 and IPv6 addresses | GDPR Recital 30 |

Medical (medical)

| Type | Country | Description | Article |
| --- | --- | --- | --- |
| EHIC | EU | European Health Insurance Card | GDPR Art.9(2)(h) |
| HEALTH_ID_DE | DE | Krankenversichertennummer (KVNR) | GDPR Art.9(2)(h) |
| HEALTH_ID_FR | FR | Carte Vitale (NIR-based, with mod-97) | GDPR Art.9(2)(h) |
| HEALTH_ID_UK | UK | NHS number | GDPR Art.9(2)(h) |

GDPR Art.9 Special Categories (gdpr_art9)

8 context-dependent patterns for special category data. These use keyword context matching: a pattern triggers only when it appears alongside relevant context keywords.
| Type | Category | Context Keywords |
| --- | --- | --- |
| ART9_RACIAL | Racial/ethnic origin | race, ethnicity, heritage, nationality |
| ART9_POLITICAL | Political opinions | political, party, vote, election, ideology |
| ART9_RELIGIOUS | Religious beliefs | religion, faith, church, mosque, temple |
| ART9_TRADE_UNION | Trade union membership | union, labor, collective bargaining |
| ART9_GENETIC | Genetic data | DNA, genome, gene, hereditary |
| ART9_BIOMETRIC | Biometric data | fingerprint, facial recognition, iris |
| ART9_HEALTH | Health data | medical, diagnosis, patient, treatment |
| ART9_SEXUAL | Sexual orientation | orientation, gender identity, LGBTQ |
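
Keyword context matching can be pictured roughly as follows. This is a sketch under stated assumptions: the trigger pattern (`diagnosed with`), the 80-character window, and the names `ContextRule` and `matchesWithContext` are all hypothetical, since this page does not document the underlying patterns or how close a keyword must be. Only the keyword list is taken from the ART9_HEALTH row.

```typescript
// Illustrative sketch of context-gated matching; not the SDK's implementation.
interface ContextRule {
  type: string;
  pattern: RegExp;    // hypothetical trigger pattern (no global flag)
  keywords: string[]; // context keywords for this type
}

const healthRuleSketch: ContextRule = {
  type: 'ART9_HEALTH',
  pattern: /\bdiagnosed with\b/i, // assumption made for this example only
  keywords: ['medical', 'diagnosis', 'patient', 'treatment'],
};

// Returns true only if the pattern matches AND at least one keyword occurs
// within `windowChars` characters on either side of the first match.
function matchesWithContext(
  text: string,
  rule: ContextRule,
  windowChars: number = 80,
): boolean {
  const hit = rule.pattern.exec(text);
  if (hit === null) return false;
  const start = Math.max(0, hit.index - windowChars);
  const end = Math.min(text.length, hit.index + hit[0].length + windowChars);
  const context = text.slice(start, end).toLowerCase();
  return rule.keywords.some((kw) => context.indexOf(kw.toLowerCase()) !== -1);
}
```

Under these assumptions, "The patient was diagnosed with asthma" would trigger, while "The bug was diagnosed with a debugger" would not, because no health keyword appears near the match.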

Checksum Validators

Five validators use algorithmic checks to reduce false positives (the NIR validator also covers HEALTH_ID_FR):
| Validator | Algorithm | Used By |
| --- | --- | --- |
| IBAN | ISO 7064 mod-97 | IBAN |
| BSN | 11-proof (Dutch) | BSN |
| NIR | mod-97 key check | NIR, HEALTH_ID_FR |
| PESEL | Weighted digit sum | PESEL |
| Codice Fiscale | Check character lookup | CODICE_FISCALE |
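
As an example of how a key check filters out look-alike numbers, the NIR mod-97 rule can be sketched like this: the two-digit key must equal 97 minus the first thirteen digits modulo 97. `isValidNir` is a hypothetical name, and the sketch deliberately handles only the all-digit form shown in the pattern column (Corsican department codes 2A/2B are not covered).

```typescript
// Sketch of the NIR key check; hypothetical helper, not an SDK export.
function isValidNir(raw: string): boolean {
  const compact = raw.replace(/\s+/g, '');
  // 1 sex digit (1 or 2) + 12 digits of body + 2-digit key = 15 digits.
  if (!/^[12]\d{14}$/.test(compact)) return false;
  // The 13-digit body is far below Number.MAX_SAFE_INTEGER, so plain
  // number arithmetic is exact here.
  const body = Number(compact.slice(0, 13));
  const key = Number(compact.slice(13));
  return 97 - (body % 97) === key;
}
```

Because only about one in 97 random 15-digit strings satisfies the key, a validator of this kind removes most accidental matches on long digit runs.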

Pattern Priority

Patterns are ordered by matching priority: specific patterns with validators run first, generic patterns last:
1. National IDs
   Most specific: SSN, BSN (with 11-check), NIR (with mod-97), PESEL, Codice Fiscale
2. Financial
   IBAN (with mod-97), credit cards, SWIFT/BIC
3. Medical
   EHIC, KVNR, Carte Vitale, NHS
4. GDPR Art.9
   Context-dependent special categories
5. Contact
   Generic: email, phone, IP addresses
6. Passports
   Most generic: country-specific and generic passport formats
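
One way to picture priority ordering is a scanner that walks the pattern list in order and lets earlier patterns claim a span of text before later, more generic ones can. The sketch below is an assumption about how such ordering could work, not a description of the SDK's matcher; `PatternSketch`, `Detection`, `detectInPriorityOrder`, and both example regular expressions are hypothetical and simplified.

```typescript
// Illustrative priority-ordered scan; not the SDK's implementation.
interface PatternSketch {
  type: string;
  regex: RegExp; // must carry the global flag
  validate?: (match: string) => boolean;
}

interface Detection {
  type: string;
  start: number;
  end: number;
}

function detectInPriorityOrder(text: string, patterns: PatternSketch[]): Detection[] {
  const claimed: Detection[] = [];
  for (const p of patterns) {
    p.regex.lastIndex = 0; // reset state left over from earlier scans
    let m: RegExpExecArray | null;
    while ((m = p.regex.exec(text)) !== null) {
      if (m[0].length === 0) {
        p.regex.lastIndex++; // guard against zero-length matches looping forever
        continue;
      }
      const start = m.index;
      const end = start + m[0].length;
      // A span already claimed by a higher-priority pattern is skipped.
      const overlaps = claimed.some((d) => start < d.end && d.start < end);
      const valid = p.validate ? p.validate(m[0]) : true;
      if (!overlaps && valid) claimed.push({ type: p.type, start, end });
    }
  }
  return claimed.sort((a, b) => a.start - b.start);
}

// Simplified example patterns: a specific one first, a generic one last.
const orderedPatterns: PatternSketch[] = [
  { type: 'SSN', regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  { type: 'PHONE', regex: /\b\d[\d-]{7,}\d\b/g },
];

const detections = detectInPriorityOrder(
  'SSN 123-45-6789, phone 555-123-4567',
  orderedPatterns,
);
```

With this ordering the first number is labelled SSN and only the second is labelled PHONE. If the list were reversed, the generic phone pattern would claim both spans, which is the kind of mislabelling that running specific patterns first is meant to avoid.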

Configuration

Set sanitizeMode and other options.

Error Handling

PIIDetectedError reference.