HTML character entities

Superscript:

http://code.google.com/p/doctype/wiki/Sup2CharacterEntity

http://code.google.com/p/doctype/wiki/CharacterEntitiesS
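
As a rough illustration of what these entity pages describe, the sketch below decodes three equivalent references for the superscript two character (U+00B2) using Python's standard html module. The helper name to_numeric_reference is made up for this example and is not taken from the linked pages.

```python
import html

# Named, decimal, and hexadecimal references for SUPERSCRIPT TWO (U+00B2).
references = ["&sup2;", "&#178;", "&#xB2;"]

# html.unescape turns each reference into the character it stands for.
decoded = [html.unescape(ref) for ref in references]
print(decoded)


def to_numeric_reference(ch: str) -> str:
    """Hypothetical helper: build a decimal character reference for one character."""
    return f"&#{ord(ch)};"


print(to_numeric_reference("\u00b2"))
```

All three references should decode to the same single character, which is why a page can use whichever form is most readable.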

Unicode

Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world’s writing systems.

UTF-8

UTF-8 is a variable-width character encoding for Unicode that uses one to four bytes per code point.
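
To make the multibyte idea concrete, here is a small sketch that encodes a few sample characters and reports how many bytes each one takes in UTF-8. The sample characters are chosen only for illustration.

```python
# Sample characters from different ranges of the Unicode code space.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

byte_lengths = []
for ch in samples:
    encoded = ch.encode("utf-8")
    byte_lengths.append(len(encoded))
    # Show the code point, the byte count, and the raw bytes in hex.
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")
```

The byte count grows with the code point value, so plain Latin text stays compact while less common characters cost more bytes.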

Like UTF-16 and UTF-32, UTF-8 can represent every character in the Unicode character set.
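
One way to see this is to round-trip the same string through all three encodings. The sketch below assumes Python's built-in codec names "utf-8", "utf-16", and "utf-32"; the sample string is arbitrary.

```python
# A string mixing one-, two-, three-, and four-byte UTF-8 characters.
text = "A\u00e9\u20ac\U0001F600"

sizes = {}
for encoding in ("utf-8", "utf-16", "utf-32"):
    data = text.encode(encoding)
    # Decoding must give back exactly the original string.
    assert data.decode(encoding) == text
    sizes[encoding] = len(data)

print(sizes)
```

The encoded sizes differ, but the decoded text is identical in every case: the three encodings are different byte layouts for the same set of code points.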

Unlike UTF-16 and UTF-32, UTF-8 has the advantage of being backward-compatible with ASCII: any ASCII text is also valid UTF-8, byte for byte.
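
A minimal check of the ASCII compatibility claim, using an arbitrary sample string: the ASCII and UTF-8 encodings come out as the same bytes, while the UTF-16 encoding does not.

```python
ascii_text = "Hello, world!"

as_ascii = ascii_text.encode("ascii")
as_utf8 = ascii_text.encode("utf-8")
as_utf16 = ascii_text.encode("utf-16-le")  # little-endian, no byte order mark

# Pure ASCII text produces identical bytes under ASCII and UTF-8.
same_as_ascii = as_ascii == as_utf8

# UTF-16 uses two bytes per character here, so an ASCII-only tool cannot read it.
print(same_as_ascii, len(as_utf8), len(as_utf16))

# Bytes written by an ASCII-only program decode cleanly as UTF-8.
decoded = as_ascii.decode("utf-8")
```

This is why files and protocols that were designed around ASCII can often adopt UTF-8 without changing existing data.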

UTF-8 has become the dominant character encoding for the World Wide Web, accounting for more than half of all Web pages.