OpenAI Releases Privacy Filter, an Open-Weight Model for PII Detection and De-identification in Text
According to an official announcement, OpenAI has released Privacy Filter, an open-weight model designed to detect and redact personally identifiable information (PII) in text. The model can run locally and processes long inputs in a single forward pass, with a context window of up to 128,000 tokens. Privacy Filter has 1.5 billion total parameters, of which 50 million are active, and can identify sensitive information including personal names, addresses, email addresses, phone numbers, URLs, dates, account numbers, passwords, and API keys. OpenAI states that the model is available on Hugging Face and GitHub under the Apache 2.0 license and is suited to privacy-preserving workflows such as training, indexing, logging, and auditing.
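
To illustrate the redaction step in general terms, the following minimal Python sketch replaces detected spans with label placeholders. It does not call Privacy Filter or reflect its actual output format: the `(start, end, label)` span tuples, the label names, and the `redact` helper are all hypothetical, standing in for whatever a PII detector returns.

```python
from typing import List, Tuple

# Hypothetical span format: (start offset, end offset, label).
# This is an assumption for illustration, not Privacy Filter's real output.
Span = Tuple[int, int, str]


def redact(text: str, spans: List[Span]) -> str:
    """Replace each detected span with a bracketed label placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip spans that overlap one already redacted.
            continue
        pieces.append(text[cursor:start])  # keep text before the span
        pieces.append(f"[{label}]")        # substitute the placeholder
        cursor = end
    pieces.append(text[cursor:])           # keep the remaining tail
    return "".join(pieces)


text = "Contact Jane Doe at jane@example.com."
spans = [(8, 16, "NAME"), (20, 36, "EMAIL")]  # hand-written example spans
print(redact(text, spans))
```

In a logging or indexing workflow of the kind mentioned above, a step like this would sit between the detector and the storage layer, so that only the redacted text is persisted.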