Character encoding
From Just Solve the File Format Problem
Revision as of 16:12, 8 May 2013
See Fonts for how characters are rendered on screens and printouts.
- Adobe Standard Encoding
- ANSEL
- APL code page
- ARMSCII
- ASCII
- ATASCII (used by Atari computers)
- Baudot code
- Braille
- Compucolor character set
- EBCDIC
- IBM PC code pages
- ISO 646
  - ISO 646-CA (Canada / French)
  - ISO 646-CA-2 (Canada / French)
  - ISO 646-CH (Switzerland)
  - ISO 646-CN (China / Basic Latin)
  - ISO 646-CU (Cuba / Spanish)
  - ISO 646-DE (Germany)
  - ISO 646-DK (Denmark)
  - ISO 646-FI (Finland)
  - ISO 646-FR (France)
  - ISO 646-GB (Great Britain)
  - ISO 646-HU (Hungary)
  - ISO 646-IRV (International Reference Version)
  - ISO 646-IT (Italy)
  - ISO 646-JP (Japan / Romaji)
  - ISO 646-JP OCR-B (Japan / Romaji)
  - ISO 646-KR (Korea / Latin)
  - ISO 646-MT (Malta)
  - ISO 646-NL (Netherlands)
  - ISO 646-NO (Norway)
  - ISO 646-NO-2 (Norway)
  - ISO 646-PT (Portugal)
  - ISO 646-SE (Sweden)
  - ISO 646-SE-2 (Sweden)
  - ISO 646-US (Same as ASCII)
  - ISO 646-YU (Yugoslavia)
- ISO 2022
- ISO 8859
  - ISO 8859-1 (Latin-1)
  - ISO 8859-2 (Latin-2, Central/East European)
  - ISO 8859-3 (Latin-3, Esperanto, Galician, Maltese, and Turkish)
  - ISO 8859-4 (Latin-4, Scandinavian and Baltic)
  - ISO 8859-5 (Cyrillic)
  - ISO 8859-6 (Arabic)
  - ISO 8859-7 (Modern Greek)
  - ISO 8859-8 (Hebrew)
  - ISO 8859-9 (Latin-5, Turkish)
  - ISO 8859-10 (Latin-6, Lappish, Nordic, and Inuit)
  - ISO 8859-11 (Thai)
  - ISO 8859-13 (Latin-7, Baltic Rim)
  - ISO 8859-14 (Latin-8, Celtic)
  - ISO 8859-15 (Latin-9, Latin-1 with a Euro sign)
  - ISO 8859-16 (Latin-10, Romanian)
- JIS
- KOI8
- Macintosh encodings
- Morse code
- MS-DOS encodings
- PETSCII (or PET ASCII or CBM ASCII; used by Commodore computers)
- Unicode
- VISCII
- Windows encodings
  - Windows 1252 (ISO 8859-1 with printable characters in place of the C1 controls at 0x80–0x9F)
  - Windows 1255 (Hebrew)
  - Windows 1256 (Arabic, Farsi, Urdu)
  - Windows 1257 (Baltic Rim)
  - Windows 1258 (Vietnamese)
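Several of the encodings listed above assign different characters to the same byte value, which is why a file's encoding has to be known (or guessed) before its text can be read. The following is a minimal sketch in Python, using only the codecs that ship with the standard library; the `SAMPLES` table and the `decode_samples` helper are names made up for this illustration.

```python
# The same byte can decode to different characters under different encodings.
# SAMPLES and decode_samples are illustrative names, not part of any library.
SAMPLES = [
    (b"\xa4", "latin-1"),     # ISO 8859-1: currency sign
    (b"\xa4", "iso8859-15"),  # ISO 8859-15: euro sign
    (b"\x80", "cp1252"),      # Windows 1252: euro sign
    (b"\x80", "latin-1"),     # ISO 8859-1: a C1 control character
]


def decode_samples(samples=SAMPLES):
    """Return (encoding, byte in hex, Unicode code point) for each sample."""
    rows = []
    for raw, encoding in samples:
        char = raw.decode(encoding)
        rows.append((encoding, raw.hex(), f"U+{ord(char):04X}"))
    return rows


if __name__ == "__main__":
    for encoding, byte_hex, code_point in decode_samples():
        print(f"0x{byte_hex} as {encoding:<11} -> {code_point}")
```

Byte 0xA4 comes out as U+00A4 under ISO 8859-1 but as U+20AC under ISO 8859-15, and byte 0x80 is the euro sign in Windows 1252 but a control character in ISO 8859-1.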
Format details
- Byte Order Mark
- C0 controls (ASCII control characters, 7 bit)
- C1 controls (extended control characters, 8 bit)
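As a rough illustration of how a Byte Order Mark can be used, the sketch below checks the start of a byte string against the BOM constants in Python's standard `codecs` module. The `sniff_bom` function and the `BOMS` table are hypothetical names chosen for this example, and the approach is only a heuristic: many files carry no BOM at all.

```python
import codecs

# Longer signatures are tested first, because the UTF-32 LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return (encoding name, BOM length), or (None, 0) if no BOM is found."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


if __name__ == "__main__":
    raw = codecs.BOM_UTF8 + "caf\u00e9".encode("utf-8")
    encoding, skip = sniff_bom(raw)
    # Strip the BOM before decoding so it does not end up in the text.
    print(encoding, raw[skip:].decode(encoding))
```

One caveat of this ordering: UTF-16 LE text whose first character after the BOM is U+0000 would be reported as UTF-32 LE, so a real detector would need further checks.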
Character escape codes
(used to enter characters in various systems and formats)
- Alt codes (DOS/Windows)
- Backslash escapes (used in various programming and markup languages)
- HTML character references (entities and numeric values)
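The three kinds of escape code listed above are different routes to the same character. The sketch below resolves each of them to "é" using Python's standard library. The function names are invented for this example, the Alt code helper assumes code page 437 (other DOS code pages map the same number differently), and the backslash helper follows Python's own escape rules, which differ in detail from those of other languages.

```python
import html


def from_alt_code(code):
    """Character typed with a DOS Alt code, assuming code page 437."""
    return bytes([code]).decode("cp437")


def from_backslash_escape(text):
    """Expand backslash escapes in ASCII text using Python's escape rules."""
    return text.encode("ascii").decode("unicode_escape")


def from_html_reference(text):
    """Expand named and numeric HTML character references."""
    return html.unescape(text)


if __name__ == "__main__":
    print(from_alt_code(130))                   # Alt+130
    print(from_backslash_escape(r"\u00e9"))     # backslash escape
    print(from_html_reference("&eacute;"))      # named entity
    print(from_html_reference("&#233;&#xE9;"))  # decimal and hex references
```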
Tools
- Kreative Recode: software to convert character encodings
Commentary and satire
- The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky
- Character encoding bugs are 𝒜wesome!
- xkcd: Encoding
Other external links
- Lots of character encoding charts
- The Evolution of Character Codes, 1874–1968
- Collection of character encodings
References
- Ken Lunde, CJKV Information Processing, O'Reilly 2008, ISBN 978-0-596-51447-1 (has lots of information on encodings and Unicode in general, not only for CJKV locales)
- IBM 3270 character set reference (1987)