Incorrect behavior order: validate before canonicalize
Description
The application was found validating a variable against a regular expression and only afterwards calling a Unicode normalization function on it. This ordering usually indicates a weak input validation strategy: characters that the pattern does not match in the raw string can become matching characters once the string is normalized, so an adversary may exploit the normalization step to slip past the check.
Examples
Insecure Code
```java
// Validation runs on the raw string ...
Pattern pattern = Pattern.compile("[<>]");
Matcher matcher = pattern.matcher(userInput);
if (matcher.find()) {
    throw new IllegalArgumentException("found banned characters in input");
}
// ... and normalization only afterwards, so NFKC can fold characters the
// check let through (e.g. U+FE64, U+FE65) into the banned '<' and '>'.
userInput = Normalizer.normalize(userInput, Normalizer.Form.NFKC);
```
Secure Code
```java
import java.text.Normalizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// U+FE64 / U+FE65 (small less-than / greater-than) fold to '<' and '>' under NFKC.
String userInput = "\uFE64" + "tag" + "\uFE65";
// Canonicalize first, then validate the canonical form.
userInput = Normalizer.normalize(userInput, Normalizer.Form.NFKC);
Pattern pattern = Pattern.compile("[<>]");
Matcher matcher = pattern.matcher(userInput);
if (matcher.find()) {
    throw new IllegalArgumentException("found banned characters in input");
}
```
Remediation
Always perform Unicode normalization before any validation of a string, and make sure the validation (and every later use) operates on the normalized value rather than the original.
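The following added example (not part of the original rule page) reproduces the two orderings in Python with `unicodedata.normalize`, which applies the same NFKC folding as Java's `Normalizer`, and adds a small audit helper that enumerates which BMP code points fold to a given banned ASCII character. The `[<>]` filter and the sample inputs are constructed for the demonstration.

```python
import re
import unicodedata

BANNED = re.compile(r"[<>]")

# Constructed samples. U+FE64/U+FE65 (small forms) and U+FF1C/U+FF1E
# (fullwidth forms) all fold to ASCII '<' and '>' under NFKC.
SAMPLES = {
    "small_forms": "\uFE64tag\uFE65",
    "fullwidth_forms": "\uFF1Cscript\uFF1E",
    "plain_ascii": "<b>",
    "benign": "hello",
}


def validate_then_normalize(s: str):
    """Insecure order (CWE-180): the filter inspects the raw string, and NFKC
    then introduces exactly the characters the filter was meant to reject.
    Returns (accepted, value_the_application_would_go_on_to_use)."""
    if BANNED.search(s):
        return (False, None)
    return (True, unicodedata.normalize("NFKC", s))


def normalize_then_validate(s: str):
    """Secure order: canonicalize first, then validate the canonical form and
    hand only that canonical form to the rest of the application."""
    canonical = unicodedata.normalize("NFKC", s)
    if BANNED.search(canonical):
        return (False, None)
    return (True, canonical)


def audit(samples=SAMPLES):
    """Compare both orderings over the samples: name -> (insecure, secure)."""
    return {name: (validate_then_normalize(s), normalize_then_validate(s))
            for name, s in samples.items()}


def fold_sources(ascii_char: str):
    """Every BMP code point whose NFKC form is exactly `ascii_char`; useful
    for reviewing what a pre-normalization filter silently lets through."""
    return [chr(cp) for cp in range(0x80, 0x10000)
            if unicodedata.normalize("NFKC", chr(cp)) == ascii_char]


if __name__ == "__main__":
    for name, (insecure, secure) in audit().items():
        print(f"{name:16} validate->normalize={insecure}  normalize->validate={secure}")
    print("code points folding to '<':", [hex(ord(c)) for c in fold_sources("<")])
```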
Rule Details
| Field | Value |
|---|---|
| ID | CODE-0727 |
| Category | Injection |
| Severity | MEDIUM |
| CWE | CWE-180 |
| Confidence | HIGH |
| Impact | MEDIUM |
| Likelihood | MEDIUM |
| Exploitability | MODERATE |
| Tags | input validation, unicode normalization |
| OWASP | A1:2017-Injection, A03:2021-Injection |