A Compendium of Characters, Scripts, Ideographs, Symbols, and Glyphs of the World

Generating and Rendering Unicode™ Characters in Your Browser Through HTML Decimal References

A Primer on Unicode and the Generator

The Generator

If you scroll down you'll come across what I've dubbed the Unicode Character Generator. Clicking a Unicode category opens a new page in a new browser window or tab that shows the Unicode characters in that category (in effect, it tests your browser's capacity to display them).

Go ahead and do some clicking right now. Check out the category Basic Latin (it contains the English alphabet) for starters to get a feel for what the Generator does and should do. Keep in mind the Generator is JavaScript-based, so if the JavaScript option in your browser is switched off then clicking on the categories won't trigger anything.

Note that this document is designed to run off-line. There's no need to connect to the Internet. Bear in mind as well that some categories, like the Oriental language sets for instance, have hundreds or even tens of thousands of characters. Therefore, depending on your hardware and browser, the Generator may take a while before it displays anything and finishes its task. Be patient. Or else run this page on a supercomputer.

What is the Unicode standard and what are HTML Decimal References?

Simply put, Unicode is the set of numbers that identifies each and every character a computer can understand. In the words of the Unicode Consortium: "Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language." To see these numbers (in hexadecimal, i.e., base 16) and their associated glyphs, you can download the Consortium's Unibook.

The HTML decimal reference system is one implementation of the Unicode standard for Hypertext Markup Language (HTML). For example, the capital letter A has been assigned the Unicode number 65 (decimal, or base 10). Its HTML decimal reference is &#65;. When your browser encounters that code it will print A on your screen. The ampersand and hash symbol prefix are mandatory, and so is the semicolon suffix. Therefore, an HTML decimal reference is simply the Unicode number (in base 10) accompanied by the prefix and suffix mentioned.
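To make the rule concrete, here is a minimal JavaScript sketch that builds a decimal reference from a character and also shows the same number in hexadecimal. The function names toDecimalReference and toHexNotation are made up for this illustration and are not taken from the Generator's own code.

```javascript
// Hypothetical helper names, chosen for this illustration only.

// Return the HTML decimal reference for the first character of a string:
// the mandatory "&#" prefix, the Unicode number in base 10, and the ";" suffix.
function toDecimalReference(ch) {
  const codePoint = ch.codePointAt(0); // the character's Unicode number
  return "&#" + codePoint + ";";
}

// Return the same Unicode number in the customary "U+XXXX" hexadecimal form.
function toHexNotation(ch) {
  const hex = ch.codePointAt(0).toString(16).toUpperCase();
  return "U+" + hex.padStart(4, "0");
}

console.log(toDecimalReference("A")); // &#65;
console.log(toHexNotation("A"));      // U+0041
```

Both functions look only at the first character of the string they are given, which is enough for a one-character lookup.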

So What?

Need Greek characters like pi (π), sigma (Σ), or omega (Ω) without relying on Microsoft-specific Greek TrueType fonts? Or want an authentic en-dash (–) or em-dash (—) instead of two (--) or three (---) hyphens? Any use for this pair: ♀ ♂ ? Converting your francs to euros? Then you'll need this euro symbol. That'll be €10.00 please.

You can display all these symbols without relying on any special font face. Incidentally, you can highlight and copy any of the foregoing characters and paste them into your email or website. Unfortunately, word processors are a different lot, so test whether they recognize the characters by copying from your email or browser and pasting into your word processor document. If a rectangle or a question mark appears, or if nothing happens at all, then your software urgently needs to take a remedial language course.
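As a rough illustration of what a browser does with such references, the following JavaScript sketch turns decimal references in a string back into characters. The function name decodeDecimalReferences is hypothetical, and the sketch is a simplification: it handles only the decimal form, and it leaves numbers above the Unicode maximum (1114111) untouched, which is not necessarily how a real browser treats them.

```javascript
// Hypothetical helper: replace every "&#NNN;" decimal reference in a string
// with the character that has that Unicode number.
function decodeDecimalReferences(text) {
  return text.replace(/&#(\d+);/g, (match, digits) => {
    const codePoint = Number(digits);
    // Numbers beyond the Unicode range are left as they are (an assumption
    // of this sketch), because String.fromCodePoint would throw on them.
    if (codePoint > 0x10FFFF) return match;
    return String.fromCodePoint(codePoint);
  });
}

console.log(decodeDecimalReferences("That'll be &#8364;10.00 please."));
// That'll be €10.00 please.
console.log(decodeDecimalReferences("&#960; &#931; &#937; &#9792; &#9794;"));
// π Σ Ω ♀ ♂
```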

Here's how to benefit from the Generator: Make sure you have an HTML-enabled email program. Create a new message. Entitle it "Special Characters and Symbols" or some descriptive name that'll make sense to you. Run the Generator and browse through the characters generated. When you come across a character you need or just fancy, highlight and copy both it and its decimal reference. Paste them into the email. When you're done, edit if need be, then save the 'message' in your drafts box. You can always add to this list in the future. Now, whenever you need a special character you can just go to the email and copy the character. Including the decimal reference is a good idea just in case you need to generate the character again. You can also print the page displayed by the Generator. Beware, though: some pages may contain hundreds or thousands of lines!

The Catch

Unfortunately, not all browsers and email software are created equal. Old (pre-millennium) editions probably have poor or no support for decimal references beyond 255 (the top of the 8-bit Latin-1 range; ASCII proper stops at 127). The latest versions (as of 2016) of Firefox, Opera, and Chrome are able to display a very large set of symbols and special characters that can be accessed through character references (numeric character references or character entity references).

As you will discover, most of the categories listed below will not generate any characters. No browser to date displays the Cherokee and Tagbanwa sets. The bottom line is that the more decimal references your browser recognizes, the more characters it can output to screen. In time, as the Unicode Consortium adds more character sets and as browsers become more powerful, the number of characters that can be displayed by the Generator will certainly grow.

Tip

You can save this page on your computer. It's self-contained. All the Cascading Style Sheet rules and JavaScript code are contained in this single HTML document. No external files are accessed.

The Unicode Character Generator

How to Use

  1. Choose a font face the characters will appear in by clicking one of the radio buttons below. Default font is sans serif.
  2. Choose a Unicode category by clicking the name of the category. Placing your mouse over the categories will highlight them.
  3. A new page in a new window/tab will be created containing three columns: the Unicode numbers (in hexadecimal), the HTML decimal references, and the corresponding characters as rendered by your browser. Your browser will display either a blank space, a question mark, or an empty square or rectangle for those character codes that it does not (yet) understand or for which it has no character/glyph.
  4. Return to this page if you wish to choose another Unicode category to view.
Font face options: sans serif, serif, monospace, cursive, fantasy
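The three columns described in step 3 can be sketched in JavaScript roughly as follows. This is only an illustration under stated assumptions, not the Generator's actual code: the function name buildRows and the row field names are hypothetical, and the sketch prints to the console instead of building a new page.

```javascript
// Hypothetical sketch: build one row per Unicode number in the inclusive
// range [first, last], with the three columns the Generator's page shows.
function buildRows(first, last) {
  const rows = [];
  for (let cp = first; cp <= last; cp++) {
    rows.push({
      hex: cp.toString(16).toUpperCase().padStart(4, "0"), // Unicode number, base 16
      reference: "&#" + cp + ";",                          // HTML decimal reference
      character: String.fromCodePoint(cp),                 // the character itself
    });
  }
  return rows;
}

// The first three capital letters of Basic Latin (decimal 65 to 67).
for (const row of buildRows(65, 67)) {
  console.log(row.hex + "  " + row.reference + "  " + row.character);
}
// 0041  &#65;  A
// 0042  &#66;  B
// 0043  &#67;  C
```

If rows like these were written into an HTML page, the ampersand in the reference column would have to be written as &amp;amp; so that the browser shows the reference as text instead of converting it into the character.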