HTML character entities
Superscript:
http://code.google.com/p/doctype/wiki/Sup2CharacterEntity
http://code.google.com/p/doctype/wiki/CharacterEntitiesS
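As a rough illustration of how the superscript entities map to characters, the sketch below decodes them with Python's standard `html` module. The helper name `decode_entities` and the sample strings are made up for this example; they do not come from the linked pages.

```python
import html
from html.entities import name2codepoint

# Named entities for the superscript digits 1, 2 and 3.
SUPERSCRIPT_ENTITIES = ["&sup1;", "&sup2;", "&sup3;"]


def decode_entities(text: str) -> str:
    """Replace HTML character references in text with the characters they name."""
    return html.unescape(text)


# Decode each named entity into its single Unicode character.
decoded = [decode_entities(entity) for entity in SUPERSCRIPT_ENTITIES]

# Look up the code point behind the name "sup2" (U+00B2).
sup2_codepoint = name2codepoint["sup2"]

# Named (&sup2;) and numeric (&#178;) references decode to the same character.
mixed = decode_entities("x&sup2; + y&#178;")

print(decoded)
print(hex(sup2_codepoint))
print(mixed)
```

A named reference such as `&sup2;` and its numeric form `&#178;` are interchangeable in HTML; the named form is easier to read in the source.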
Unicode
Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world’s writing systems.
UTF-8
UTF-8 is a variable-width character encoding for Unicode that uses one to four bytes per code point.
Like UTF-16 and UTF-32, UTF-8 can represent every character in the Unicode character set.
Unlike UTF-16 and UTF-32, UTF-8 has the advantage of being backward-compatible with ASCII: any ASCII text is also valid UTF-8 with identical bytes.
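The comparison between the three encodings can be checked with a short sketch. The sample characters and the helper name `encoded_lengths` are chosen here purely for illustration; the little-endian codec variants are used so that no byte order mark is counted.

```python
# One sample character from each UTF-8 length class (1 to 4 bytes).
SAMPLES = ["A", "é", "€", "😀"]


def encoded_lengths(ch: str) -> dict:
    """Return the number of bytes needed for ch in each Unicode encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),  # -le variant: no BOM added
        "utf-32": len(ch.encode("utf-32-le")),  # -le variant: no BOM added
    }


lengths = {ch: encoded_lengths(ch) for ch in SAMPLES}

# ASCII compatibility: pure ASCII text yields the same bytes in both codecs.
ascii_text = "plain ASCII"
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# UTF-16 is not ASCII-compatible: it pads "A" with a zero byte.
utf16_a = "A".encode("utf-16-le")

for ch, sizes in lengths.items():
    print(repr(ch), sizes)
print(same_bytes, utf16_a)
```

All four samples round-trip through every encoding, but only UTF-8 leaves the ASCII character as a single unchanged byte.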
UTF-8 has become the dominant character encoding for the World Wide Web, accounting for more than half of all Web pages.