HTML Entity Encode / Decode

Convert special characters to HTML entities and decode them back.

Privacy First

This tool runs entirely in your browser. No data is sent to any server. Your input remains completely private.

What are HTML Entities?

HTML entities are special codes used to represent characters that have special meaning in HTML or that cannot be easily typed on a keyboard. They begin with an ampersand (&) and end with a semicolon (;), with either a name or number in between.

For example, the less-than sign (<) cannot be written directly in HTML because browsers would interpret it as the start of a tag. Instead, we write &lt; to display a literal less-than sign.

Why HTML Entity Encoding Matters

HTML entity encoding serves several critical purposes in web development:

Security (XSS Prevention)

Cross-Site Scripting (XSS) attacks occur when malicious scripts are injected into web pages. By encoding special characters such as <, >, &, and quotes in user input before it is written into a page, you prevent attackers from injecting executable HTML or JavaScript code.

Displaying Reserved Characters

HTML reserves certain characters for its syntax. To display these characters as text (like < > & "), you must encode them so browsers don't misinterpret them as HTML markup.

Special Symbols and Typography

HTML entities provide access to symbols not found on standard keyboards: copyright (©), trademark (™), mathematical symbols (±, ×, ÷), currency symbols (€, £, ¥), and typographic marks such as dashes and curly quotes.

Character Encoding Safety

When you're unsure about the character encoding of a document, using numeric entities ensures characters display correctly regardless of encoding settings.
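As a rough illustration of this idea, the sketch below rewrites every character above code point 126 as a decimal numeric entity, so the output contains only ASCII. The function name encodeNonAscii is hypothetical and is not necessarily how this tool works internally.

```javascript
// Hypothetical helper: replace characters above code point 126 with
// decimal numeric entities so the result is plain ASCII.
function encodeNonAscii(text) {
  let out = "";
  // for...of iterates by code point, so characters outside the
  // Basic Multilingual Plane (such as emoji) are handled as one unit.
  for (const ch of text) {
    const codePoint = ch.codePointAt(0);
    out += codePoint > 126 ? "&#" + codePoint + ";" : ch;
  }
  return out;
}

// "\u00E9" is é and "\u20AC" is the euro sign.
console.log(encodeNonAscii("Caf\u00E9 \u20AC5")); // Caf&#233; &#8364;5
```

Reserved characters such as < and & are left untouched here; they still need their own encoding step.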

Types of HTML Entities

Named Entities

Human-readable names for common characters: &amp; for &, &lt; for <, &copy; for ©. There are over 2,000 named entities defined in HTML5.

Decimal Numeric Entities

Use the decimal Unicode code point: &#60; for < (code point 60 decimal). Works for any Unicode character.

Hexadecimal Numeric Entities

Use the hexadecimal code point: &#x3C; for < (code point 3C hex). Often preferred for Unicode values which are commonly written in hex.
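The two numeric forms differ only in how the code point is written. A minimal sketch, assuming hypothetical helper names toDecimalEntity and toHexEntity:

```javascript
// Hypothetical helpers: build the numeric entity for the first
// character of a string from its Unicode code point.
function toDecimalEntity(char) {
  return "&#" + char.codePointAt(0) + ";";
}

function toHexEntity(char) {
  return "&#x" + char.codePointAt(0).toString(16).toUpperCase() + ";";
}

console.log(toDecimalEntity("<")); // &#60;
console.log(toHexEntity("<"));     // &#x3C;
console.log(toHexEntity("\u00A9")); // &#xA9; (copyright sign)
```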

Essential HTML Entities

The five characters to encode whenever untrusted text is placed in HTML content or attribute values:

  • & - Ampersand: &amp; or &#38;
  • < - Less than: &lt; or &#60;
  • > - Greater than: &gt; or &#62;
  • " - Double quote: &quot; or &#34;
  • ' - Single quote: &#39; or &apos; (&apos; is not defined in HTML 4, so &#39; is the safer choice)
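
The list above maps directly onto a small lookup table. The following is a minimal sketch, not necessarily the code this tool uses; ESSENTIAL_ENTITIES and escapeHtml are names chosen for illustration.

```javascript
// Illustrative lookup table for the five essential characters.
const ESSENTIAL_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replace each essential character in a single pass, so the "&"
// inside an entity that was just produced is never encoded again.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => ESSENTIAL_ENTITIES[ch]);
}

console.log(escapeHtml('<a href="x">Tom & Jerry\'s</a>'));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
```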

Common Mistakes

Double Encoding

Encoding an already-encoded string turns &lt; into &amp;lt;, which displays as &lt; instead of <.
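
The effect is easy to reproduce by running an encoder twice. A small sketch, with encodeOnce as a hypothetical stand-in for any encoder:

```javascript
// Hypothetical minimal encoder covering &, < and >.
// "&" is replaced first so the entities produced afterwards stay intact.
function encodeOnce(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const once = encodeOnce("a < b");
const twice = encodeOnce(once);

console.log(once);  // a &lt; b      (renders as: a < b)
console.log(twice); // a &amp;lt; b  (renders as: a &lt; b)
```

Encoding exactly once, at the point where text is written into HTML, avoids the problem.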

Missing Semicolons

Entities should always end with a semicolon. Browsers accept a few legacy names such as &copy without it, but most named entities written without a semicolon are left as literal text.

Inconsistent Encoding

Encoding some special characters but not others can still leave security vulnerabilities.

Privacy and Security

All encoding and decoding happens entirely in your browser. Your content never leaves your computer, making this tool safe for processing sensitive HTML content.

Common Use Cases

XSS Prevention

Encode user-generated content before displaying it in HTML to prevent cross-site scripting attacks.

Displaying Code Snippets

Encode HTML code examples so they display as text rather than being rendered as HTML.

Email Template Safety

Encode special characters in email templates to ensure they render correctly across different email clients.

CMS Content Cleanup

Decode entity-encoded content from content management systems that over-encode text for editing.

Special Symbol Insertion

Convert symbols like copyright, trademark, or currency signs to their HTML entity equivalents.

Character Encoding Debugging

Decode garbled text that contains HTML entities to see the original intended characters.

Worked Examples

Encode for HTML Display

Input

<script>alert("XSS")</script>

Output

&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;

The script tags and quotes are encoded, making this safe to display in HTML. The browser will show the text rather than execute it.

Decode Entities

Input

&copy; 2024 &mdash; All Rights Reserved &trade;

Output

© 2024 — All Rights Reserved ™

Named entities are converted back to their character equivalents: copyright symbol, em dash, and trademark.
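
Decoding can be sketched as a single regular-expression pass that handles named, decimal, and hexadecimal forms. This is an illustration only: the NAMED table below is a tiny hypothetical subset of the 2,000+ names in HTML5, and decodeEntities is not necessarily how this tool is implemented.

```javascript
// Tiny illustrative subset of named entities (a real decoder needs the full list).
const NAMED = {
  amp: "&", lt: "<", gt: ">", quot: '"', apos: "'",
  copy: "\u00A9", mdash: "\u2014", trade: "\u2122", nbsp: "\u00A0",
};

function decodeEntities(text) {
  // Matches &#xHEX; then &#DECIMAL; then &name;
  const pattern = /&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g;
  return text.replace(pattern, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const codePoint = isHex
        ? parseInt(body.slice(2), 16)
        : parseInt(body.slice(1), 10);
      // Leave references outside the Unicode range untouched.
      return codePoint > 0 && codePoint <= 0x10ffff
        ? String.fromCodePoint(codePoint)
        : match;
    }
    // Unknown names are left as written.
    return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : match;
  });
}

console.log(decodeEntities("&copy; 2024 &mdash; All Rights Reserved &trade;"));
console.log(decodeEntities("&#60;p&#x3E;")); // <p>
```

Because the replacement runs in one pass, &amp;lt; decodes to &lt; rather than all the way to <.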

Frequently Asked Questions

What is the difference between named and numeric entities?

Named entities use readable names (&copy; for ©) while numeric entities use Unicode code points (&#169; or &#xA9;). Both produce the same result, but named entities are more readable and numeric entities work for any Unicode character.

Should I encode all text for HTML?

Only characters with special HTML meaning need encoding: & < > " '. Regular letters, numbers, and most punctuation are safe. Over-encoding makes content harder to read and maintain.

What about non-breaking spaces?

Non-breaking spaces (&nbsp;) prevent line breaks between words and add visible space. They are useful for formatting but should not be used excessively as they can cause accessibility issues.

How do I choose between named and numeric entities?

Use named entities for common characters (they are more readable). Use numeric entities for characters without named equivalents or when you need guaranteed compatibility across all systems.

Is my content sent to any server?

No, all encoding and decoding happens locally in your browser using JavaScript. Your content never leaves your device, making it safe to process sensitive data.

Can this tool help prevent XSS attacks?

Encoding user input is one layer of XSS prevention. Use this tool to encode content before displaying it in HTML. However, proper security requires encoding at the right point in your application, not just as a one-time conversion.
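
One way to encode "at the right point" is to do it at the moment a value is written into markup. The sketch below uses a JavaScript tagged template for that; html and escapeHtml are hypothetical names, and the approach assumes values land in HTML text or quoted attribute values. It does not cover other contexts such as script blocks or URLs.

```javascript
// Hypothetical encoder for the five essential characters.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Tagged template: the literal markup passes through unchanged,
// while every interpolated value is encoded as it is inserted.
function html(strings, ...values) {
  return strings.reduce(
    (out, part, i) =>
      out + part + (i < values.length ? escapeHtml(values[i]) : ""),
    ""
  );
}

const comment = '<img src=x onerror="alert(1)">';
const page = html`<p class="comment">${comment}</p>`;

console.log(page);
// <p class="comment">&lt;img src=x onerror=&quot;alert(1)&quot;&gt;</p>
```

Stored data stays unencoded, and each value is encoded exactly once on output, which also avoids double encoding.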