PII Redaction

Data & Privacy

Detect and mask personally identifiable information in healthcare text.

PII, Redaction, De-identification, Privacy

About

Scans free text for personally identifiable information — names, SSNs, dates, contact details — and returns the text with each item masked, alongside an inventory of what was found and its category.

Use it as a privacy gate before text leaves a controlled context: logging, analytics, sharing chart excerpts, or feeding documents into general-purpose LLM calls.

How it works

  1. Raw text submitted as JSON
  2. LLM-based PII detection across categories (NAME, SSN, dates, contacts, …)
  3. Masked text assembly + per-item category inventory
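
Step 3 can be pictured with a small local sketch. This is not the service's implementation: the function name `assemble_response`, the `(value, category)` input shape, and the `value`/`category` keys inside `pii_items` are assumptions made for illustration. Only `redacted_text`, `pii_items`, and the `[MASK]` token come from this page.

```python
import re
from typing import Dict, List, Tuple

MASK = "[MASK]"  # mask token as described under "Key outputs"


def assemble_response(text: str, detected: List[Tuple[str, str]]) -> Dict[str, object]:
    """Build masked text plus an inventory from (value, category) detections.

    Hypothetical helper: the detection step itself (the LLM call) is not shown.
    """
    # Sort longest first so a short value cannot split a longer one that contains it.
    values = sorted({value for value, _ in detected if value}, key=len, reverse=True)
    if values:
        pattern = re.compile("|".join(re.escape(v) for v in values))
        masked = pattern.sub(MASK, text)
    else:
        masked = text
    # Assumed item shape: one dict per detection with its original value and category.
    items = [{"value": value, "category": category} for value, category in detected]
    return {"redacted_text": masked, "pii_items": items}
```

Literal string replacement masks every occurrence of a detected value, which may or may not match how the service handles repeated or overlapping items.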

Intended use

  • De-identifying note excerpts before display, export, or analytics
  • Pre-processing text for downstream LLM calls outside the clinical pipelines
  • Building redaction review UIs (show original vs masked with the pii_items list)
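
For the review-UI case, one possible approach is to locate each detected value in the original text so a front end can highlight it next to the masked version. The sketch below assumes each `pii_items` entry is a dict with `value` and `category` keys, which this page does not specify; `highlight_spans` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple


def highlight_spans(original: str, pii_items: List[Dict[str, str]]) -> List[Tuple[int, int, str]]:
    """Return sorted (start, end, category) spans for every occurrence of each item."""
    spans = []
    for item in pii_items:
        value = item["value"]  # assumed key name
        if not value:
            continue
        start = original.find(value)
        while start != -1:
            spans.append((start, start + len(value), item["category"]))
            # Continue searching after this occurrence.
            start = original.find(value, start + len(value))
    return sorted(spans)
```

If the response carries no character offsets, string search like this is a fallback; it cannot distinguish a flagged occurrence from an identical unflagged one.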

Key outputs

  • redacted_text — input with PII replaced by [MASK]
  • pii_items[] — each detected item with its original value and category
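
A client might parse these two outputs into typed objects along the following lines. The top-level names `redacted_text` and `pii_items` come from this page; the per-item keys `value` and `category`, and the class and function names, are assumptions for illustration.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PiiItem:
    value: str      # original detected text (assumed key: "value")
    category: str   # e.g. NAME or SSN (assumed key: "category")


@dataclass(frozen=True)
class RedactionResult:
    redacted_text: str
    pii_items: List[PiiItem]


def parse_redaction_response(body: str) -> RedactionResult:
    """Parse a JSON response body into a RedactionResult."""
    data = json.loads(body)
    items = [
        PiiItem(value=entry["value"], category=entry["category"])
        for entry in data.get("pii_items", [])
    ]
    return RedactionResult(redacted_text=data["redacted_text"], pii_items=items)
```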

Endpoints

Try each endpoint with your signed-in session — usage counts toward your monthly budget.

Use synthetic data only. Do not submit real patient records or PHI when testing endpoints.

Limitations & caveats

  • Model-based detection: recall is high but not guaranteed, so the output alone does not establish de-identification under the HIPAA Safe Harbor method
  • Text-only: scanned documents and images must be OCR'd first
  • Returns detected originals in pii_items — treat the response itself as sensitive
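
Because the response echoes the detected originals, a caller may want to log only a reduced summary. The sketch below keeps per-category counts and the masked text length and drops everything else; `loggable_summary` is a hypothetical helper, and the `category` key inside each item is an assumed field name.

```python
from collections import Counter
from typing import Any, Dict


def loggable_summary(response: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a redaction response to fields that carry no original values."""
    counts = Counter(item["category"] for item in response.get("pii_items", []))
    return {
        # The masked text is left out too, since detection recall is not guaranteed.
        "redacted_length": len(response.get("redacted_text", "")),
        "pii_counts": dict(counts),
    }
```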