Open Redact PDF

Open Redact PDF is a browser-first PDF redaction engine implemented in Rust and exposed to browsers through WebAssembly. The project operates on PDF structure instead of flattening pages into images, removes targeted content for a constrained but real subset of PDFs, and preserves unredacted text where the supported subset allows it.

Start here

Reference

Design and security

Guides

Engine Internals

Deep technical documentation covering PDF spec concepts, implementation decisions, tradeoffs, and code-level explanations. Start with the reading order guide.

Current MVP scope

  • Unencrypted PDFs, plus Standard Security Handler decryption at V = 1/2 (RC4), V = 4 (AES-128), and V = 5 (AES-256, R = 5 or R = 6), under either the user or owner password
  • Classic xref tables, PDF 1.5+ cross-reference streams, object streams, and the hybrid XRefStm form
  • Unfiltered or FlateDecode streams, including PNG and TIFF DecodeParms predictors
  • Deterministic full-document rewrites with FlateDecode-compressed content streams
  • Form XObjects traversed for text extraction, search, and copy-on-write redaction (text, vector paint, and Image Do invocations inside the Form), with nested Forms handled recursively
  • Type1, TrueType, and Type0 / Identity-H text with ToUnicode, WinAnsiEncoding, MacRomanEncoding, StandardEncoding, and /Encoding /Differences decoding
  • Rectangle, quad, and quad-group redaction targets in canonical page space
  • Three redaction modes: strip, redact (default), and erase, with optional overlayText labels in redact mode
  • Conservative image redaction at invocation level
  • PDFs with hidden-by-default Optional Content Groups are refused unless the caller opts in with sanitizeHiddenOcgs: true, which strips BDC /OC /<name> ... EMC content gated by hidden layers before redaction
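To make the scope list more concrete, here is a minimal Rust sketch of how a redaction request covering the three modes, the three target shapes, and the hidden-OCG opt-in could be modeled. Every name in it (Mode, Target, Options, validate, bounding_rect) is hypothetical and chosen for illustration; none is taken from the project's API. Rejecting overlay text outside redact mode is also an assumption: the list above only says that labels are optional in redact mode.

```rust
// Hypothetical request types for illustration; not the project's actual API.

/// The three redaction modes. `Redact` is the documented default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Mode {
    Strip,
    #[default]
    Redact,
    Erase,
}

/// A point in canonical page space.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// The three target shapes: rectangle, quad, and quad group.
#[derive(Debug, Clone, PartialEq)]
pub enum Target {
    Rect { min: Point, max: Point },
    Quad([Point; 4]),
    QuadGroup(Vec<[Point; 4]>),
}

/// Request options. The derived default is redact mode, no overlay label,
/// and hidden Optional Content Groups refused (opt-in flag off).
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Options {
    pub mode: Mode,
    pub overlay_text: Option<String>,
    pub sanitize_hidden_ocgs: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum OptionsError {
    /// Assumption: overlay labels are rejected outside redact mode.
    OverlayTextRequiresRedactMode,
}

/// Checks that the option combination is coherent before any PDF work starts.
pub fn validate(options: &Options) -> Result<(), OptionsError> {
    if options.overlay_text.is_some() && options.mode != Mode::Redact {
        return Err(OptionsError::OverlayTextRequiresRedactMode);
    }
    Ok(())
}

/// Axis-aligned bounding rectangle of a target as (lower-left, upper-right).
/// Returns `None` only for an empty quad group.
pub fn bounding_rect(target: &Target) -> Option<(Point, Point)> {
    let points: Vec<Point> = match target {
        Target::Rect { min, max } => vec![*min, *max],
        Target::Quad(quad) => quad.to_vec(),
        Target::QuadGroup(quads) => quads.iter().flatten().copied().collect(),
    };
    let first = *points.first()?;
    Some(points.iter().fold((first, first), |(lo, hi), p| {
        (
            Point { x: lo.x.min(p.x), y: lo.y.min(p.y) },
            Point { x: hi.x.max(p.x), y: hi.y.max(p.y) },
        )
    }))
}
```

Modeling the default mode and the opt-in flag through Default keeps the conservative behavior (redact mode, hidden layers refused) as what a caller gets when nothing is specified.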

Fail-explicit design

Unsupported features return an explicit error instead of being silently ignored or producing incorrect output.
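As an illustration of the fail-explicit idea, the sketch below maps a stream's filter name and an encryption V value onto the supported subset listed under the MVP scope, and returns a typed error for anything else. The names Unsupported, Decoder, select_decoder, and check_encryption_version are hypothetical, and the engine's real error types and checks may look quite different.

```rust
use std::fmt;

/// Hypothetical error type: each variant names the feature that was refused.
#[derive(Debug, PartialEq, Eq)]
pub enum Unsupported {
    StreamFilter(String),
    EncryptionVersion(u8),
}

impl fmt::Display for Unsupported {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Unsupported::StreamFilter(name) => {
                write!(f, "unsupported stream filter: {name}")
            }
            Unsupported::EncryptionVersion(v) => {
                write!(f, "unsupported encryption version: V = {v}")
            }
        }
    }
}

impl std::error::Error for Unsupported {}

/// Decoders for the two stream forms in the scope list.
#[derive(Debug, PartialEq, Eq)]
pub enum Decoder {
    Identity,
    Flate,
}

/// Chooses a decoder for a stream's filter name (without the leading slash).
/// Anything outside the supported subset is an error, never a silent passthrough.
pub fn select_decoder(filter: Option<&str>) -> Result<Decoder, Unsupported> {
    match filter {
        None => Ok(Decoder::Identity),
        Some("FlateDecode") => Ok(Decoder::Flate),
        Some(other) => Err(Unsupported::StreamFilter(other.to_string())),
    }
}

/// Accepts the Standard Security Handler V values from the scope list
/// (1, 2, 4, 5). Assumption: every other value is refused.
pub fn check_encryption_version(v: u8) -> Result<(), Unsupported> {
    match v {
        1 | 2 | 4 | 5 => Ok(()),
        other => Err(Unsupported::EncryptionVersion(other)),
    }
}
```

Because the refusal is a value in the return type, a caller has to handle it in order to compile, which makes it harder for an unsupported document to slip through as a partially redacted output.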