ASCII & Unicode Lookup

ASCII & Unicode Character Lookup Tool

The definitive reference for character encoding. Look up any character's decimal, hexadecimal, binary, HTML entity, CSS escape, JavaScript string, UTF-8 byte sequence and UTF-16 representation instantly. Includes the full ASCII table, extended character sets and Unicode block reference.

Features

12 Properties Per Char

Decimal, hex, binary, HTML entity, CSS escape, JS string, UTF-8 bytes, UTF-16, Unicode code point and more.
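
As a rough illustration of how several of these properties relate to one another, here is a minimal Python sketch. The function name `char_properties` and the dictionary keys are hypothetical, and the formats are an assumption rather than the tool's actual output.

```python
def char_properties(ch: str) -> dict[str, str]:
    """Return several common representations of one code point.

    Hypothetical helper for illustration; lone surrogates are not
    handled because they cannot be encoded as UTF-8.
    """
    if len(ch) != 1:
        raise ValueError("expected exactly one code point")
    cp = ord(ch)
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")
    # Split the big-endian UTF-16 bytes into 16-bit code units.
    units = [utf16[i:i + 2].hex().upper() for i in range(0, len(utf16), 2)]
    return {
        "code_point": f"U+{cp:04X}",
        "decimal": str(cp),
        "hex": f"0x{cp:X}",
        "binary": f"{cp:b}",
        "html_entity": f"&#{cp};",        # numeric character reference
        "css_escape": f"\\{cp:X} ",       # the trailing space ends the escape
        "js_string": f"\\u{{{cp:X}}}",    # ES2015 code point escape
        "utf8_bytes": " ".join(f"{b:02X}" for b in utf8),
        "utf16_units": " ".join(units),
    }


for name, value in char_properties("é").items():
    print(f"{name:12} {value}")
```

For "é" this gives U+00E9, decimal 233, UTF-8 bytes C3 A9 and a single UTF-16 unit 00E9. A character outside the Basic Multilingual Plane, such as an emoji, produces two UTF-16 units (a surrogate pair).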

String Breakdown

Enter any string to see a complete encoding table for every individual character.

Full ASCII Table

Browse the complete ASCII table (0–127) and extended set (128–255) organised by type.

Unicode Blocks

Reference guide to 12 major Unicode blocks with clickable navigation.

CSV Export

Export your string's character breakdown as a CSV file for offline reference.
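
A per-character breakdown exported as CSV might look roughly like the sketch below. The column names and the function `breakdown_csv` are assumptions made for illustration; the tool's real export format may differ.

```python
import csv
import io


def breakdown_csv(text: str) -> str:
    """Build a CSV table with one row per code point in `text`."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["char", "code_point", "decimal", "utf8_bytes"])
    for ch in text:
        writer.writerow([
            ch,
            f"U+{ord(ch):04X}",
            ord(ch),
            ch.encode("utf-8").hex(" ").upper(),  # e.g. "C3 A9"
        ])
    return buf.getvalue()


print(breakdown_csv("Aé"))
```

The `csv` module quotes fields that contain commas, quotes or newlines, so characters such as "," survive the round trip.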

Dual Input

Look up by typing a character, pasting text, or entering a decimal or hex code point.
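
One way to accept both decimal and hex input is sketched below. The accepted prefixes ("U+" and "0x") and the function name `parse_code_point` are assumptions, not a description of how the tool itself parses input.

```python
def parse_code_point(text: str) -> str:
    """Turn '65', '0x41' or 'U+0041' into the character it names."""
    s = text.strip()
    if s.upper().startswith(("U+", "0X")):
        cp = int(s[2:], 16)   # hexadecimal with an explicit prefix
    else:
        cp = int(s, 10)       # plain decimal
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"{text!r} is outside the Unicode code space")
    return chr(cp)


print(parse_code_point("65"), parse_code_point("0x41"), parse_code_point("U+00E9"))
```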

Who Uses This Tool?

Web Developers

Find HTML entities for special characters and verify UTF-8 encoding of content.

Writers

Identify the correct code for typographic characters: em dash, curly quotes, ellipsis.

Security Researchers

Analyse Unicode homoglyphs and control characters used in phishing and injection attacks.

CS Students

Study character encoding, UTF-8 encoding algorithm and Unicode code point structure.

Frequently Asked Questions

What is the difference between ASCII and Unicode?

ASCII is a 7-bit encoding standard for 128 characters (English letters, digits and control codes). Unicode is a universal standard covering over 140,000 characters from all the world's writing systems. UTF-8 is the most common way of encoding Unicode.

What is UTF-8 and why is it standard?

UTF-8 is a variable-length encoding: ASCII characters use 1 byte, most European characters use 2 bytes, and other scripts use 3–4 bytes. Its backward compatibility with ASCII and space efficiency made it the dominant encoding on the web (98%+ of websites).
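
The variable-length scheme can be sketched by hand in a few lines of Python. This is a teaching sketch under the standard UTF-8 bit layout; in practice `str.encode("utf-8")` does the same job, and the function name `utf8_encode` is only illustrative.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode scalar value as UTF-8 by hand."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x80:        # 1 byte:  0xxxxxxx (ASCII)
        return bytes([cp])
    if cp < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


for ch in "Aé€😀":
    encoded = utf8_encode(ord(ch))
    print(ch, len(encoded), encoded.hex(" ").upper())
```

The four sample characters need 1, 2, 3 and 4 bytes respectively, and each result matches Python's built-in encoder.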

What is an HTML entity?

An HTML entity is a text representation of a character using an ampersand prefix and semicolon suffix (e.g., &amp;amp; for &, &amp;lt; for <). They are used to display reserved HTML characters and characters not easily typed on a keyboard.
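
For a quick check of how escaping and unescaping behave, Python's standard `html` module can be used as below. The sample strings are made up for illustration.

```python
import html

raw = '<a href="x">Tom & Jerry</a>'

# Replace the reserved characters &, <, > and quotes with entities.
escaped = html.escape(raw)
print(escaped)  # &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;

# Named, decimal and hexadecimal references decode to the same character.
decoded = html.unescape("&eacute; &#233; &#xE9;")
print(decoded)  # é é é
```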

What are control characters (0–31)?

Control characters are non-printable ASCII codes that control text flow: Tab (9), Line Feed/newline (10), Carriage Return (13), Null (0), Bell (7) etc. They are invisible but affect how text is processed and displayed.
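
Because control characters are invisible, it often helps to make them explicit when inspecting a string. The sketch below is one possible approach; the `<XX>` notation and the name `show_controls` are arbitrary choices for illustration.

```python
import unicodedata


def show_controls(text: str) -> str:
    """Replace each control character with a visible <XX> hex marker."""
    out = []
    for ch in text:
        # General category "Cc" covers U+0000-U+001F and U+007F-U+009F.
        if unicodedata.category(ch) == "Cc":
            out.append(f"<{ord(ch):02X}>")
        else:
            out.append(ch)
    return "".join(out)


print(show_controls("name\tvalue\r\n"))  # name<09>value<0D><0A>
```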

Pro Tip

When debugging text encoding issues, check for a UTF-8 BOM (Byte Order Mark, U+FEFF, stored as the bytes EF BB BF) at the start of files. Software that does not expect it may display it as "mystery characters" such as ï»¿. Also watch for the difference between a regular hyphen (-), en dash (–) and em dash (—), which look similar but have different code points.
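
A BOM check takes only a few lines with Python's standard library. The helper name `strip_utf8_bom` is hypothetical; `codecs.BOM_UTF8` and the `utf-8-sig` codec are part of the standard library.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return `data` without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):   # b"\xef\xbb\xbf"
        return data[len(codecs.BOM_UTF8):]
    return data


sample = codecs.BOM_UTF8 + "hello".encode("utf-8")

# Plain "utf-8" keeps the BOM as an invisible U+FEFF at the start.
print(repr(sample.decode("utf-8")))      # '\ufeffhello'

# "utf-8-sig" drops a leading BOM while decoding.
print(repr(sample.decode("utf-8-sig")))  # 'hello'
```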

Did You Know?

128

Original ASCII Characters

ASCII (American Standard Code for Information Interchange) defined only 128 characters in 1963, which was sufficient for English. When computers went global, 128 characters proved woefully inadequate for the world's 6,500+ languages. This limitation created chaos: vendors and countries each defined their own, mutually incompatible "extended ASCII" code pages.

149,813

Unicode Characters (v15.1)

Unicode 15.1 (2023) defines 149,813 characters covering virtually every writing system on Earth, plus historical scripts, mathematical symbols, musical notation, emoji and more. The Unicode code space has room for 1,114,112 code points (U+0000 to U+10FFFF), so most of it is still unassigned.
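
The 1,114,112 figure follows from the code space being divided into 17 planes of 65,536 code points each. The arithmetic can be checked with a short sketch; the constant names are illustrative only.

```python
PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000            # 65,536
CODE_SPACE = PLANES * CODE_POINTS_PER_PLANE

# U+D800-U+DFFF are reserved for UTF-16 surrogates and never encode characters.
SURROGATES = 0xDFFF - 0xD800 + 1           # 2,048
SCALAR_VALUES = CODE_SPACE - SURROGATES

ASSIGNED_15_1 = 149_813                    # character count quoted above

print(CODE_SPACE)      # 1114112
print(SCALAR_VALUES)   # 1112064
print(f"{ASSIGNED_15_1 / CODE_SPACE:.1%} of the code space holds assigned characters")
```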

98%

of Websites Use UTF-8

UTF-8 is used by over 98% of websites — it is the dominant text encoding on the internet. UTF-8 is brilliant: it is backward-compatible with ASCII (the first 128 characters are identical), and efficiently encodes most common characters in 1–2 bytes while supporting all 1.1 million Unicode code points.

ASCII Control Characters Quick Reference

Dec	Hex	Abbrev	Meaning	Usage
0	0x00	NUL	Null	String terminator in C
7	0x07	BEL	Bell	Audio alert (rarely used)
8	0x08	BS	Backspace	Delete previous character
9	0x09	HT	Horizontal Tab	Indentation, TSV files
10	0x0A	LF	Line Feed	Unix/Mac newline (\n)
13	0x0D	CR	Carriage Return	Windows uses CR+LF (\r\n)
27	0x1B	ESC	Escape	ANSI terminal sequences
32	0x20	SP	Space	Word separator
127	0x7F	DEL	Delete	Originally punched tape erasure

Common Mistakes

Not setting charset="UTF-8" in HTML

Without explicit charset declaration, browsers use heuristics that can misinterpret encoding, causing garbled text (mojibake) for non-ASCII characters.

Always include <meta charset="UTF-8"> as the first element inside <head>.
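
The garbling caused by a misread encoding is easy to reproduce. The sketch below decodes UTF-8 bytes as Latin-1, which is one common way mojibake arises; the sample word is arbitrary.

```python
original = "café"

# UTF-8 stores "é" as two bytes (C3 A9). Read as Latin-1, each byte
# becomes its own character, producing classic mojibake.
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)   # cafÃ©

# While the wrongly decoded text is still intact, reversing the two
# steps recovers the original string.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # café
```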

Using Latin-1 database columns for user content

Latin-1 (latin1 in MySQL) cannot store emoji, Chinese characters or text in most non-European languages. Attempting to store them causes errors or silent data loss.

Use utf8mb4 (not utf8!) in MySQL/MariaDB. MySQL's "utf8" is an alias for utf8mb3, which stores at most 3 bytes per character and therefore cannot hold emoji or any other character above U+FFFF.
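
To see which text would not fit in a 3-byte column, one can test for code points above U+FFFF, since those are exactly the ones that need 4 bytes in UTF-8. The helper name `needs_utf8mb4` is hypothetical.

```python
def needs_utf8mb4(text: str) -> bool:
    """True if any character needs 4 bytes in UTF-8 (code point > U+FFFF)."""
    return any(ord(ch) > 0xFFFF for ch in text)


print(needs_utf8mb4("héllo 中文"))  # False: every character fits in 3 bytes
print(needs_utf8mb4("ok 😀"))       # True: the emoji takes 4 bytes
```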

Hardcoding character comparisons without normalisation

The same visual character (e.g., "é") can be encoded as a single code point (U+00E9) or as "e" + combining acute accent (U+0065 + U+0301). These look identical but are different code point sequences, so a plain string or byte comparison treats them as unequal.

Apply Unicode normalisation (NFC or NFD) before comparing or storing user text.
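
The effect of normalisation can be seen with Python's standard `unicodedata` module. This is a minimal sketch using the "é" example above.

```python
import unicodedata

composed = "\u00e9"      # é as one precomposed code point
decomposed = "e\u0301"   # e followed by a combining acute accent

print(composed == decomposed)          # False: different code point sequences
print(len(composed), len(decomposed))  # 1 2

# After normalising both sides to the same form, they compare equal.
nfc_equal = (unicodedata.normalize("NFC", composed)
             == unicodedata.normalize("NFC", decomposed))
nfd_equal = (unicodedata.normalize("NFD", composed)
             == unicodedata.normalize("NFD", decomposed))
print(nfc_equal, nfd_equal)            # True True
```

Whichever form is chosen, the important part is applying the same one consistently on both sides of a comparison and before storage.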
