Character sets. LETTERS, TOKENS and CODES
By Johan W van Wingen, Leiden, the Netherlands

From Standards for the electronic
exchange of personal data. Manual, Part 5.
 

Foreword
Introduction
1. Data, units and notations
2. Scripts and their constituents
3. Communication and coding, ASCII
4. International standardization of 7-bit codes, ISO 646
5. IBM and EBCDIC
6. International standardization of 8-bit codes, ISO 8859
7. Control functions, ISO 6429 and 10538
8. Code extensions, ISO 2022 and 2375, ISO 4873 and 10367
9. CCITT, Teletex, Videotex and ISO 6937
10. European languages and scripts and their letters
11. The International Register according to ISO 2375
12. Industry codes for printers or personal computers
13. Multiple-octet code, ISO 10646
14. Bibliographical codes, ISO 5426
15. Character as object, identification methods, ISO 7350
16. Conversion and transliteration
17. Input and output of characters
18. Ordering problems
19. Conformance and testing
20. Recommendations

Annexes

Annex 1. Members of the advisory board on character sets
Annex 2. Rules for the use of the IJ
Annex 3. List of ISO standards in the field of coded character sets
Annex 4. Letters For Europe, report on the characters used in the various European languages
Annex 5. Contents of the character sets ISO/IEC 8859-1/6, with codes and names of characters, ordered by code
Annex 6. Characters from ISO 6937 not included in ISO 8859
Annex 7. The Atlantic Subset of ISO/IEC 10646-1, in the form of a standard, based on the Repertoire of ISO 6937, with code in ISO/IEC 10646.

  • Table 1. Complete repertoire of letters and digits required for European languages written in the Latin script
  • Table 2. Complete repertoire of special characters required for European languages written in the Latin script
Annex 8. A Transformation Scheme
Annex 9. Map of Europe, with indication of the areas of application of LATIN-1 (vertical hatching) and LATIN-2 (horizontal hatching)
Annex 10. Some keyboards (due to M. Claerhout, IBM Belgium)
Annex 11. Code tables from ISO/IEC 8859-1/9 and 6937 and some industrial code tables (IBM EBCDIC, DEC, HP, IBM PC CP 437, CP 850 and Apple)

Test pages

1. To Annex 7. The Atlantic Subset of ISO/IEC 10646-1

  • Letters and digits defined by SGML public entities
  • Letters and digits defined by UCS codes
  • Special characters defined by SGML public entities
  • Special characters defined by UCS codes

2. Selected multilingual test pages

  • Complete Unicode two-byte character set by Unicode pages
  • (source: Unicode Database) by UCS codes
  • Example multilingual text from webpages
  • Converted from Word 97 (with Language specification)
  • Converted from Netscape Composer (in UTF-8)


Copyright © 1999. J.W. van Wingen
Design 1999. Yu. Demchenko, TERENA