Class Sanitizer
This class provides static methods for escaping content in various contexts:
- escapeHtml(String) - For HTML text content
- escapeHtmlAttribute(String) - For HTML attribute values
- escapeJavaScript(String) - For JavaScript string literals
- escapeUrl(String) - For URL parameters
- escapeCss(String) - For CSS values
- stripHtml(String) - For removing HTML tags
Usage Examples:
// Escaping user input for display in HTML
String userComment = "<script>alert('xss')</script>";
div.setText(Sanitizer.escapeHtml(userComment));
// Output: &lt;script&gt;alert('xss')&lt;/script&gt;, with each ' also replaced by its character entity
// Safe attribute values
String title = "Click \"here\" for more";
element.addAttribute("title", Sanitizer.escapeHtmlAttribute(title));
// Safe JavaScript strings
String message = "User's \"special\" message";
page.executeJs("alert('" + Sanitizer.escapeJavaScript(message) + "')");
// URL parameters
String searchTerm = "foo&bar=baz";
String url = "/search?q=" + Sanitizer.escapeUrl(searchTerm);
Security Note: Always use the appropriate escape method for the context. Using the wrong escape method may leave your application vulnerable to injection attacks.
- Since:
- 2025
- Version:
- 1.0
- Author:
- Marvin P. Warble Jr.
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| static boolean | containsDangerousHtml(String input) | Checks if a string contains potentially dangerous HTML content. |
| static String | escapeCss(String input) | Escapes a string for safe use in CSS values. |
| static String | escapeHtml(String input) | Escapes HTML special characters to prevent XSS attacks in HTML text content. |
| static String | escapeHtmlAttribute(String input) | Escapes a string for safe use in HTML attribute values. |
| static String | escapeJavaScript(String input) | Escapes a string for safe use in JavaScript string literals. |
| static String | escapeUrl(String input) | URL-encodes a string for safe use in URL parameters. |
| static String | stripHtml(String input) | Removes all HTML tags from the input string, leaving only plain text content. |
-
Method Details
-
escapeHtml
Escapes HTML special characters to prevent XSS attacks in HTML text content.
This method escapes the following characters:
- & → &amp;
- < → &lt;
- > → &gt;
- " → &quot;
- ' → its character entity (for example &#39;)
Use this method when: Inserting user-provided content into HTML text nodes.
- Parameters:
- input - The string to escape. May be null.
- Returns:
- The escaped string, or an empty string if input is null.
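The character mapping above can be sketched as a single pass over the input. This is a minimal illustration and not the Sanitizer source: the class name HtmlEscapeSketch is hypothetical, and the use of &#39; for the apostrophe is an assumption.

```java
/** Hypothetical sketch of HTML text escaping; not the actual Sanitizer implementation. */
final class HtmlEscapeSketch {
    private HtmlEscapeSketch() {}

    /** Replaces the five HTML-significant characters with entities; null becomes "". */
    static String escapeHtml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break; // assumption: numeric entity for the apostrophe
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Walking the input once, rather than chaining replace calls, avoids re-escaping the ampersands that earlier replacements introduce.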
-
escapeHtmlAttribute
Escapes a string for safe use in HTML attribute values.
This method performs the same escaping as escapeHtml(String), plus additional characters that could break out of attribute context:
- ` → its character entity, for example &#96; (backtick; prevents template literal injection)
- = → its character entity, for example &#61; (equals sign; prevents attribute injection in some contexts)
Use this method when: Setting HTML attribute values with user-provided content.
- Parameters:
- input - The string to escape. May be null.
- Returns:
- The escaped string, or an empty string if input is null.
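As a rough illustration of how attribute escaping extends the HTML text mapping, the sketch below adds the backtick and equals sign to the same single-pass loop. The class name AttributeEscapeSketch is hypothetical, and the decimal entities &#96; and &#61; (as well as &#39;) are assumptions about the exact output form.

```java
/** Hypothetical sketch of attribute-value escaping; not the actual Sanitizer implementation. */
final class AttributeEscapeSketch {
    private AttributeEscapeSketch() {}

    /** HTML text escaping plus the backtick and equals sign; null becomes "". */
    static String escapeHtmlAttribute(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break; // assumed entity form
                case '`':  out.append("&#96;");  break; // assumed entity form
                case '=':  out.append("&#61;");  break; // assumed entity form
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this mapping, a value such as x" onmouseover="alert(1) can no longer close the surrounding quote, because every double quote is emitted as an entity.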
-
escapeJavaScript
Escapes a string for safe use in JavaScript string literals.
This method escapes the following characters:
- \ → \\
- ' → \'
- " → \"
- / → \/ (prevents </script> from breaking out)
- Newline → \n
- Carriage return → \r
- Tab → \t
- Line separator (U+2028) → \u2028
- Paragraph separator (U+2029) → \u2029
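Use this method when: Inserting user-provided content into JavaScript string literals, whether single-quoted, double-quoted, or template literals. For template literals, note that the characters listed above do not include the backtick or ${; verify that both are handled before relying on this method in that context.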
- Parameters:
- input - The string to escape. May be null.
- Returns:
- The escaped string, or an empty string if input is null.
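The escape table for JavaScript string literals can be sketched the same way. The class name JsEscapeSketch is hypothetical and this is not the Sanitizer source; in particular, the sketch covers only the characters listed for this method, so it makes no attempt to handle backticks or ${ in template literals.

```java
/** Hypothetical sketch of JavaScript string-literal escaping; not the actual Sanitizer implementation. */
final class JsEscapeSketch {
    private JsEscapeSketch() {}

    /** Backslash-escapes the characters listed for escapeJavaScript; null becomes "". */
    static String escapeJavaScript(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '\'': out.append("\\'");  break;
                case '"':  out.append("\\\""); break;
                case '/':  out.append("\\/");  break; // keeps a closing script tag from ending the block
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                case '\u2028': out.append("\\u2028"); break; // line separator
                case '\u2029': out.append("\\u2029"); break; // paragraph separator
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

The backslash case matters most: if it were left unescaped, an input ending in a backslash would swallow the closing quote of the literal.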
-
escapeUrl
URL-encodes a string for safe use in URL parameters.
This method uses UTF-8 encoding to convert the input string to a URL-safe format. All characters except alphanumeric characters and -_.~ are percent-encoded.
Use this method when: Building URLs with user-provided query parameters or path segments.
- Parameters:
- input - The string to encode. May be null.
- Returns:
- The URL-encoded string, or an empty string if input is null.
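One way to obtain the behavior described above with the standard library is to post-process java.net.URLEncoder, which targets form encoding and therefore differs in three places: it emits + for a space, leaves * unencoded, and encodes ~. The sketch below assumes Java 10 or later for the Charset overload; the class name UrlEscapeSketch is hypothetical and the actual Sanitizer implementation may work differently.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of percent-encoding built on URLEncoder; not the actual Sanitizer implementation. */
final class UrlEscapeSketch {
    private UrlEscapeSketch() {}

    /** Percent-encodes everything except alphanumerics and -_.~ using UTF-8; null becomes "". */
    static String escapeUrl(String input) {
        if (input == null) {
            return "";
        }
        return URLEncoder.encode(input, StandardCharsets.UTF_8)
                .replace("+", "%20")   // URLEncoder writes a space as '+'; a literal '+' is already %2B
                .replace("*", "%2A")   // URLEncoder leaves '*' as-is
                .replace("%7E", "~");  // URLEncoder encodes '~', which is unreserved
    }
}
```

For the search example in the usage section, this sketch turns foo&bar=baz into foo%26bar%3Dbaz, so the ampersand and equals sign stay inside the q parameter.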
-
escapeCss
Escapes a string for safe use in CSS values.
This method escapes characters that could break out of CSS context or enable CSS injection attacks:
- Backslash, quotes, parentheses, semicolons, colons, etc.
- Characters that could enable expression() or url() injection
Use this method when: Setting CSS property values with user-provided content, such as in inline styles or dynamic stylesheets.
Note: For maximum security, consider using a whitelist approach for CSS values rather than escaping.
- Parameters:
- input - The string to escape. May be null.
- Returns:
- The escaped string, or an empty string if input is null.
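The whitelist approach mentioned in the note can be sketched as a validator that accepts only values matching known-safe shapes and otherwise returns a fallback. Everything here is illustrative: the class CssAllowlistSketch, the method safeCssValue, and the two accepted shapes (hex colors and simple lengths) are assumptions, not part of the Sanitizer API.

```java
import java.util.regex.Pattern;

/** Hypothetical allowlist validator for CSS values; not part of the Sanitizer API. */
final class CssAllowlistSketch {
    // Hex colors such as #fff or #1a2b3c.
    private static final Pattern HEX_COLOR =
            Pattern.compile("#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})");
    // Simple lengths such as 12px, 1.5em, 2rem or 80%.
    private static final Pattern LENGTH =
            Pattern.compile("\\d{1,4}(?:\\.\\d{1,3})?(?:px|em|rem|%)");

    private CssAllowlistSketch() {}

    /** Returns the trimmed value if it fully matches an allowed shape, else the fallback. */
    static String safeCssValue(String input, String fallback) {
        if (input == null) {
            return fallback;
        }
        String value = input.trim();
        if (HEX_COLOR.matcher(value).matches() || LENGTH.matcher(value).matches()) {
            return value;
        }
        return fallback;
    }
}
```

Because matches() requires the whole value to fit the pattern, payloads such as 12px; background:url(...) are rejected outright instead of being partially escaped.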
-
stripHtml
Removes all HTML tags from the input string, leaving only plain text content.
This method removes:
- All HTML/XML tags (including attributes)
- HTML comments
After tag removal, common HTML entities are decoded:
- &amp; → &
- &lt; → <
- &gt; → >
- &quot; → "
- the numeric and named apostrophe entities (for example &#39; and &apos;) → '
- &nbsp; → space
Use this method when: Extracting plain text content from HTML, such as for search indexing or plain text display.
Warning: This method should NOT be used as a security measure to sanitize untrusted HTML for display. Use escapeHtml(String) instead.
- Parameters:
- input - The HTML string to strip. May be null.
- Returns:
- The plain text content, or an empty string if input is null.
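The two stages described above (remove comments and tags, then decode entities) can be sketched with regular expressions. This is a simplified illustration under stated assumptions: the class name StripHtmlSketch is hypothetical, the apostrophe entities are assumed to be &#39; and &apos;, and a regex-based stripper does not handle every malformed input the way a real HTML parser would.

```java
/** Hypothetical sketch of tag stripping and entity decoding; not the actual Sanitizer implementation. */
final class StripHtmlSketch {
    private StripHtmlSketch() {}

    /** Removes comments and tags, then decodes a handful of common entities; null becomes "". */
    static String stripHtml(String input) {
        if (input == null) {
            return "";
        }
        String text = input
                .replaceAll("(?s)<!--.*?-->", "") // comments first, so their contents are dropped too
                .replaceAll("<[^>]*>", "");       // then any remaining tags
        return text
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&#39;", "'")            // assumed apostrophe entity
                .replace("&apos;", "'")           // assumed apostrophe entity
                .replace("&nbsp;", " ")
                .replace("&amp;", "&");           // decoded last to avoid double-decoding
    }
}
```

The sketch also shows why the warning applies: an input of &lt;script&gt;alert(1)&lt;/script&gt; contains no tags to remove, yet decoding turns it into live markup, so the result is unsafe to insert into a page without escapeHtml(String).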
-
containsDangerousHtml
Checks if a string contains potentially dangerous HTML content.
This method checks for the presence of:
- Script tags
- Event handler attributes (onclick, onerror, etc.)
- JavaScript URLs
- Data URLs
- Other potentially dangerous patterns
Use this method when: You need to validate user input before allowing it in contexts where full escaping is not possible.
- Parameters:
- input - The string to check. May be null.
- Returns:
- true if the string contains potentially dangerous content, false otherwise (including when input is null).
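A pattern-based check of this kind might look like the sketch below, which covers only the four named categories. The class name DangerousHtmlSketch and every regular expression in it are assumptions chosen for illustration; the patterns the actual method uses, including its "other potentially dangerous patterns", are not documented here. The sketch assumes Java 9 or later for List.of.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical heuristic detector; the real Sanitizer patterns are not documented here. */
final class DangerousHtmlSketch {
    private static final List<Pattern> PATTERNS = List.of(
            Pattern.compile("<\\s*script\\b", Pattern.CASE_INSENSITIVE),    // script tags
            Pattern.compile("\\bon[a-z]+\\s*=", Pattern.CASE_INSENSITIVE),  // event handler attributes
            Pattern.compile("javascript\\s*:", Pattern.CASE_INSENSITIVE),   // JavaScript URLs
            Pattern.compile("data\\s*:", Pattern.CASE_INSENSITIVE));        // data URLs

    private DangerousHtmlSketch() {}

    /** Returns true if any pattern is found anywhere in the input; null yields false. */
    static boolean containsDangerousHtml(String input) {
        if (input == null) {
            return false;
        }
        for (Pattern pattern : PATTERNS) {
            if (pattern.matcher(input).find()) {
                return true;
            }
        }
        return false;
    }
}
```

Heuristics like these err toward false positives (the data URL pattern also flags ordinary text such as "metadata: none") and can still miss obfuscated payloads, so a check of this kind works best as an additional validation step rather than a replacement for context-appropriate escaping.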
-