On April 22, 2026, OpenAI released Privacy Filter, a lightweight open-weight model designed to detect and redact personally identifiable information (PII) in unstructured text1. The model is positioned as a building block for privacy-by-design workflows, enabling developers to sanitize training data, knowledge-base indexes, system logs, and user-facing content without sending sensitive text to third-party APIs1.
Technical Architecture
Privacy Filter is a bidirectional token-classification model with span decoding1. It starts from an autoregressive pretrained checkpoint and replaces the language-modeling head with a token-classification head, then post-trains with a supervised classification objective1. Unlike a generative LLM, it does not produce text token by token; instead, it labels an entire input sequence in a single forward pass and decodes coherent spans using a constrained Viterbi procedure1.
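The article names constrained Viterbi decoding but does not publish the algorithm's details. As a hedged illustration of the general technique, the sketch below runs Viterbi over per-token label scores with a standard BIO constraint (an `I-X` tag may only follow `B-X` or `I-X`); the label set and scores are invented for the example and are not the model's actual schema.

```python
# Illustrative constrained Viterbi decoding over BIO tags.
# Label set and scores are made up for this sketch.
import math

LABELS = ["O", "B-PER", "I-PER"]

def allowed(prev: str, cur: str) -> bool:
    # An I-X tag may only continue a span of the same type X.
    if cur.startswith("I-"):
        etype = cur[2:]
        return prev in (f"B-{etype}", f"I-{etype}")
    return True

def viterbi(scores):
    """scores: one {label: score} dict per token.
    Returns the best label sequence respecting BIO constraints."""
    n = len(scores)
    best = [{} for _ in range(n)]  # best[i][label] = (score, prev_label)
    for lab in LABELS:
        if not lab.startswith("I-"):  # a sequence cannot start mid-span
            best[0][lab] = (scores[0].get(lab, -math.inf), None)
    for i in range(1, n):
        for cur in LABELS:
            cands = [(best[i - 1][p][0] + scores[i].get(cur, -math.inf), p)
                     for p in best[i - 1] if allowed(p, cur)]
            if cands:
                best[i][cur] = max(cands)
    # Backtrack from the highest-scoring final label.
    lab = max(best[-1], key=lambda l: best[-1][l][0])
    path = [lab]
    for i in range(n - 1, 0, -1):
        lab = best[i][lab][1]
        path.append(lab)
    return path[::-1]
```

The constraint is what makes decoded spans coherent: even if a token's highest raw score is `I-PER`, it cannot be emitted without a preceding `B-PER`.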
The model has 1.5 billion total parameters with 50 million active parameters1, making it practical to run in production pipelines with modest compute. It supports a 128,000-token context window1, allowing it to process long documents in a single pass.
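For inputs that exceed even a 128,000-token window, a common pipeline workaround (not part of the release itself) is to split the token sequence into overlapping chunks and run each through the model. A minimal sketch, with tiny illustrative sizes:

```python
# Split a token sequence into overlapping windows for inputs longer than
# the context limit. Sizes here are tiny for illustration; the real
# limit is 128,000 tokens.

def windows(tokens, max_len, overlap):
    step = max_len - overlap
    out = []
    for start in range(0, len(tokens), step):
        out.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
    return out
```

Predictions falling in the overlap region appear twice and need deduplication when stitching results back together; the overlap exists so spans straddling a chunk boundary are seen whole at least once.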
Detection Taxonomy
The model classifies tokens into an 8-label privacy taxonomy1:
| Label | Coverage |
|---|---|
| private_person | Personal identity (names, usernames) |
| private_address | Physical addresses |
| private_email | Email addresses |
| private_phone | Phone numbers |
| private_url | Personal URLs |
| private_date | Personal dates (birthdays, anniversaries) |
| account_number | Financial account numbers, credit cards |
| secret | Secrets (passwords, API keys) |
Redaction output replaces detected spans with their label in uppercase (e.g., [PRIVATE_PERSON], [ACCOUNT_NUMBER]) to preserve structural information while removing sensitive content1.
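The replacement step described above can be sketched in a few lines. The `(start, end, label)` span format is an assumption for the example, not the model's documented output schema:

```python
# Replace each detected span with its uppercase label in brackets,
# working right to left so earlier character offsets stay valid.
# Span format (start, end, label) is assumed for illustration.

def redact(text: str, spans):
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label.upper()}]" + text[end:]
    return text

msg = "Contact Jane at jane@example.com"
spans = [(8, 12, "private_person"), (16, 32, "private_email")]
print(redact(msg, spans))
# Contact [PRIVATE_PERSON] at [PRIVATE_EMAIL]
```

Keeping the label in the output preserves the structure of the sentence, so downstream systems can still tell that a person and an email address were mentioned.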
Performance
On the PII-Masking-300k benchmark, Privacy Filter achieves an F1 of 96% (precision 94.04%, recall 98.04%)1. After correcting annotation errors in the benchmark, the F1 rises to 97.43% (precision 96.79%, recall 98.08%), placing it at state-of-the-art on this benchmark1.
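The reported F1 figures follow directly from the stated precision and recall, since F1 is their harmonic mean, F1 = 2PR / (P + R):

```python
# Sanity-checking the reported scores: F1 = 2PR / (P + R).
p, r = 0.9404, 0.9804                 # reported precision / recall
f1 = 2 * p * r / (p + r)              # ~0.96

p2, r2 = 0.9679, 0.9808               # after benchmark label corrections
f1_corrected = 2 * p2 * r2 / (p2 + r2)  # ~0.9743
```

The high recall relative to precision suggests a decision boundary tuned to prefer over-redaction over missed PII, a sensible default for a privacy filter.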
The model is designed to be domain-adaptable: with a small amount of domain-specific labeled data for fine-tuning, the F1 on a target domain can jump from 54% to 96%1.
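Fine-tuning a token-classification model requires token-level supervision, which teams usually derive from character-span annotations. A small pure-Python sketch of that conversion; the whitespace tokenization and span format are illustrative assumptions, not the release's documented training format:

```python
# Convert character-span PII annotations into token-level BIO tags,
# the supervision format a token-classification fine-tune consumes.
# Whitespace tokenization and the (start, end, label) span format
# are assumptions for this sketch.

def spans_to_bio(text, spans):
    tokens, tags, pos = [], [], 0
    for tok in text.split():
        start = text.index(tok, pos)   # locate token in the original text
        end = start + len(tok)
        pos = end
        tag = "O"
        for s, e, label in spans:
            if start >= s and end <= e:
                # B- opens a span; I- continues one.
                tag = ("B-" if start == s else "I-") + label
                break
        tokens.append(tok)
        tags.append(tag)
    return tokens, tags
```

A few hundred examples in this form per target domain is the kind of "small amount of domain-specific labeled data" the fine-tuning claim refers to.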
Availability and Licensing
Privacy Filter is released under the Apache 2.0 license1. It is available immediately from:
- HuggingFace: openai/privacy-filter1
- GitHub: openai/privacy-filter1
- Model Card: a detailed PDF covering architecture, label schema, decoding rules, intended use cases, evaluation setup, and known limitations1
OpenAI reports that it has already deployed a fine-tuned version of Privacy Filter internally within its own privacy workflows1.
Limitations
OpenAI explicitly states that Privacy Filter is not a compliance certification tool and does not replace domain-specific policy review or human oversight in high-sensitivity contexts (legal, healthcare, finance)1. Detection performance depends on the training label schema and decision boundaries; organizations with privacy policies that differ materially from the training distribution may need additional domain evaluation or fine-tuning1.
The model may miss uncommon identifiers or ambiguous personal references, and in low-context settings (especially short text sequences), it may over-redact or under-redact1.
Significance
Privacy Filter represents OpenAI’s effort to contribute practical privacy infrastructure to the open-source community. By releasing a capable, lightweight, Apache-licensed PII detection model, OpenAI lowers the barrier for developers building privacy-respecting AI applications — particularly those who cannot or will not send user data to third-party APIs for sanitization.
The 50M active parameter count is notably small, making the model viable for high-throughput production pipelines where latency and cost matter. The bidirectional architecture (unusual for a model derived from an autoregressive pretrained checkpoint) demonstrates a deliberate design choice: classification accuracy benefits from bidirectional context, even when the base model is autoregressive.