Unicode Normalization Detector
Detects text that changes under Unicode normalization (NFKC).
The normalization detector identifies text that changes when converted to Unicode NFKC (Normalization Form Compatibility Composition).
This surfaces content where the text a reader sees is not the text a system processes once normalization has been applied.
Why this matters
Unicode allows multiple representations for visually similar characters.
Normalization can silently transform text such as:
- compatibility characters
- presentation forms
- mathematical alphabets
- full-width variants
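To make these categories concrete, the sketch below normalizes one sample character from each. It uses Python's standard `unicodedata` module purely for illustration; the sample characters and the `nfkc` helper are assumptions chosen for this example, not part of PromptShield.

```python
import unicodedata

# One illustrative sample per category (samples are assumptions for this sketch).
SAMPLES = {
    "compatibility character": "\ufb01",   # LATIN SMALL LIGATURE FI
    "presentation form": "\ufe8d",         # ARABIC LETTER ALEF ISOLATED FORM
    "mathematical alphabet": "\U0001d400", # MATHEMATICAL BOLD CAPITAL A
    "full-width variant": "\uff21",        # FULLWIDTH LATIN CAPITAL LETTER A
}


def nfkc(text: str) -> str:
    """Return the NFKC-normalized form of text."""
    return unicodedata.normalize("NFKC", text)


for category, ch in SAMPLES.items():
    normalized = nfkc(ch)
    code_points = " ".join(f"U+{ord(c):04X}" for c in normalized)
    # Show the original code point and the code point(s) it normalizes to.
    print(f"{category}: U+{ord(ch):04X} -> {code_points}")
```

In each case the normalized output is a different code point sequence from the input, even though the two may render almost identically.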
This creates risk in:
- prompts
- system instructions
- authentication text
- code
- policy validation logic
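If the normalized text differs from the original, a component that normalizes its input processes different text than one that does not, so a check applied to one form may not hold for the other.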
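As an illustration of the policy validation risk, the following sketch compares a check on raw text with a check on NFKC-normalized text. The blocked-word list and both function names are hypothetical and exist only for this example.

```python
import unicodedata

BLOCKED_WORDS = {"admin"}  # hypothetical policy list for this sketch


def naive_check(text: str) -> bool:
    """Return True if the raw text contains a blocked word."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKED_WORDS)


def normalized_check(text: str) -> bool:
    """Return True if the NFKC-normalized text contains a blocked word."""
    normalized = unicodedata.normalize("NFKC", text).lower()
    return any(word in normalized for word in BLOCKED_WORDS)


# "admin" written with full-width Latin letters (U+FF41, U+FF44, U+FF4D, U+FF49, U+FF4E).
disguised = "\uff41\uff44\uff4d\uff49\uff4e"

print(naive_check(disguised))       # the raw comparison misses the word
print(normalized_check(disguised))  # the normalized comparison finds it
```

If a later stage normalizes the text, the word that passed the raw check is exactly the word the policy was meant to catch.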
Detection rule
PSN001
Normalization-sensitive text
Flags spans where characters change under NFKC normalization.
Severity: HIGH
Example
Input
ℌello admin
Normalized
Hello admin
Result
The character ℌ (black-letter capital H) normalizes to H.
PromptShield reports the span as normalization-sensitive.
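The normalization in this example can be reproduced with any NFKC implementation. The snippet below uses Python's standard `unicodedata` module as a stand-in; it is not how PromptShield itself is invoked.

```python
import unicodedata

original = "\u210cello admin"  # first character is U+210C
normalized = unicodedata.normalize("NFKC", original)

print(unicodedata.name(original[0]))  # Unicode name of the first character
print(normalized)                     # text after NFKC normalization
print(original != normalized)         # whether normalization changed the text
```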
Detection model
The detector:
- Normalizes text using NFKC
- Compares each character to its normalized form
- Groups adjacent differences into spans
- Emits one threat per span
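The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not PromptShield's implementation: the `Threat` class, the `detect` function, and the index fields are hypothetical, and the `offending_text` and `decoded_payload` fields simply mirror the detector's `offendingText` and `decodedPayload` in Python naming.

```python
import unicodedata
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Threat:
    """Hypothetical result record for this sketch."""
    rule_id: str
    start: int            # inclusive index into the original text
    end: int              # exclusive index into the original text
    offending_text: str   # original span
    decoded_payload: str  # normalized span


def changes_under_nfkc(ch: str) -> bool:
    """Return True if a single character differs from its NFKC form."""
    return unicodedata.normalize("NFKC", ch) != ch


def make_threat(text: str, start: int, end: int) -> Threat:
    span = text[start:end]
    return Threat("PSN001", start, end, span, unicodedata.normalize("NFKC", span))


def detect(text: str) -> List[Threat]:
    """Group adjacent normalization-sensitive characters into spans."""
    threats: List[Threat] = []
    start: Optional[int] = None
    for index, ch in enumerate(text):
        if changes_under_nfkc(ch):
            if start is None:
                start = index  # open a new span
        elif start is not None:
            threats.append(make_threat(text, start, index))  # close the span
            start = None
    if start is not None:
        # Close a span that runs to the end of the text.
        threats.append(make_threat(text, start, len(text)))
    return threats


for threat in detect("\u210cello admin"):
    print(threat)
```

Because this sketch compares characters one at a time, it does not flag sequences that change only when normalized together, such as a base letter followed by a combining accent.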
Span semantics:
offendingText = original span
decodedPayload = normalized span
What this detector does NOT do
It does not:
- block multilingual content
- enforce normalization automatically
- attempt semantic interpretation
It only surfaces differences that may affect interpretation.
When to care
Normalization differences are most important in:
- system prompts
- tool instructions
- code generation inputs
- security-sensitive text
- identity strings
- configuration files
Less important in:
- natural multilingual prose
- UI copy
- documentation text
Remediation
Recommended fix:
- Replace characters with their normalized equivalents
- Avoid compatibility Unicode forms in prompts or code
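A minimal sketch of the first fix, assuming the text is security-sensitive and safe to rewrite wholesale. The function names are hypothetical, and Python's `unicodedata` module stands in for whichever NFKC implementation your toolchain provides.

```python
import unicodedata


def needs_remediation(text: str) -> bool:
    """Return True if the text changes under NFKC normalization."""
    return unicodedata.normalize("NFKC", text) != text


def remediate(text: str) -> str:
    """Replace characters with their NFKC-normalized equivalents."""
    return unicodedata.normalize("NFKC", text)


prompt = "\u210cello admin"
if needs_remediation(prompt):
    prompt = remediate(prompt)
print(prompt)
```

Since the detector does not enforce normalization automatically, a step like this would run explicitly, for example when reviewing system prompts or configuration files before they are committed.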
References
Unicode Standard Annex #15 — Unicode Normalization Forms https://unicode.org/reports/tr15/
PromptShield rule reference https://promptshield.js.org/docs/detectors/normalization#PSN001