With the rise of computers built on 8-bit structures, led by IBM and its followers from 1965 onward, the need to restrict oneself to 7-bit coding systems for reasons of cost gradually disappeared. It made no sense to leave the 8th bit unused. Because EBCDIC had spread so widely on the large machines, it took until 1987 before ISO adopted the first 8-bit code based on ASCII.
The structure of ISO 8859 is that of ISO 646, but "doubled". The codetable uses columns 2-7 and A-F for graphic characters. The contents of columns 2-7, the "left half" (GL), are identical to ASCII (which equals the IRV of ISO 646:1991, not the old IRV of ISO 646:1983). For the "right half" (GR) a selection had to be made, since the total number of letters required for the European languages using Latin script is too large for 96 positions. The selections that were made are designated LATIN-1, LATIN-2, LATIN-3 and LATIN-4. Each of them corresponds to a "Part" of ISO 8859. After these four had been approved, the combinations of ASCII with Cyrillic, Arabic, Greek and Hebrew were adopted as further Parts. Because LATIN-3 proved unsatisfactory for Turkish, LATIN-5 was added, and later LATIN-6 for the Scandinavian languages as well.
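The "doubled" layout can be explored with a short sketch. It assumes that the codecs shipped with Python (`iso8859_1` and so on) faithfully reflect the Parts of the standard; the helper names `position`, `gl_matches_ascii` and `gr_repertoire` are invented for this illustration and are not part of any standard or library.

```python
# Python codec names for the Latin alphabets mentioned in the text.
# Assumption of this sketch: these codecs mirror the ISO 8859 Parts.
LATIN_PARTS = {
    "LATIN-1": "iso8859_1",
    "LATIN-2": "iso8859_2",
    "LATIN-3": "iso8859_3",
    "LATIN-4": "iso8859_4",
    "LATIN-5": "iso8859_9",
    "LATIN-6": "iso8859_10",
}


def position(byte):
    """Return the column/row notation of a code position, e.g. 0xD0 -> '13/00'."""
    return f"{byte >> 4:02d}/{byte & 0x0F:02d}"


def gl_matches_ascii(codec):
    """True if positions 2/00 to 7/14 of `codec` decode exactly as in ASCII."""
    left_half = bytes(range(0x20, 0x7F))
    return left_half.decode(codec) == left_half.decode("ascii")


def gr_repertoire(codec):
    """Set of characters in columns A-F; unassigned positions are skipped."""
    chars = set()
    for byte in range(0xA0, 0x100):
        ch = bytes([byte]).decode(codec, errors="ignore")
        if ch:
            chars.add(ch)
    return chars


# The left half is the same in every Part.
for name, codec in LATIN_PARTS.items():
    print(name, "GL identical to ASCII:", gl_matches_ascii(codec))

# The right halves of LATIN-1 to LATIN-4 together need more than 96 positions.
first_four = ["iso8859_1", "iso8859_2", "iso8859_3", "iso8859_4"]
combined = set().union(*(gr_repertoire(c) for c in first_four))
print("characters needed by LATIN-1..4 together:", len(combined))
```

The last line illustrates why a selection had to be made: no single right half of 96 positions can hold the combined repertoire.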
ISO 8859 8-bit single byte coded graphic character sets, in Parts:
A list of all characters in ISO 8859-1, -2, -3, -4, with the official names (cf. Chap. 15), the Latin alphabet in which they occur and their code position, is included in the Annexes. (The meaning of the column under SID is explained in Chapter 15.)
Apart from extra letters, the tables contain an additional set of special signs and marks, which is, however, not the same in all Parts. The copyright sign was evidently only relevant to the capitalist part of Europe. The accents included are to be used free-standing only. NBS stands for NO-BREAK SPACE. This space differs from the normal one in that it is not interpreted as a delimiter in word processing, but is part of the word itself. SHY stands for SOFT HYPHEN: a hyphen like the HYPHEN-MINUS, but displayed only if a certain condition is satisfied, for example at the end of a line where a word is hyphenated.
ISO 8859 is considerable progress in the sense that it allows the more important Western European languages, such as English, French and German, to be coded without loss of the accents. That the French Œ œ are not available in LATIN-1 is still a matter of regret. Historic mistakes in standards can hardly be repaired afterwards.
For the user in the Netherlands the six Latin alphabets are the most interesting. In an implementation, however, one has to select one of the six, and the letters needed for writing languages used in other parts of Europe are then not available. For this reason certain languages cannot be combined in the same text. A quotation from French in a Czech text (see Table 2) cannot be coded correctly with LATIN-2. The barrier that ISO 8859 creates between languages coincides geographically with the Iron Curtain, whose existence it thus prolongs.
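The difference between the two kinds of space can be made concrete with a minimal sketch. It assumes the LATIN-1 positions 10/00 for NBS and 10/13 for SHY as exposed by Python's `iso8859_1` codec; `split_on_space` is a hypothetical stand-in for the word delimiting of a text processor.

```python
import unicodedata

# Decode the two special positions from LATIN-1.
NBS = bytes([0xA0]).decode("iso8859_1")  # position 10/00
SHY = bytes([0xAD]).decode("iso8859_1")  # position 10/13

print(unicodedata.name(NBS))  # NO-BREAK SPACE
print(unicodedata.name(SHY))  # SOFT HYPHEN


def split_on_space(text):
    """Split on the ordinary SPACE (position 2/00) only."""
    return text.split(" ")


# The no-break space stays inside the word; only the normal space delimits.
words = split_on_space("ISO" + NBS + "8859 Part" + NBS + "1")
print(len(words))  # 2
```

Note that Python's argument-less `str.split()` treats every whitespace character, the no-break space included, as a delimiter, which is why the sketch passes the ordinary space explicitly.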
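The problem of mixing languages can be demonstrated with a small sketch. The sample text and the helper `unencodable` are invented for this illustration, and Python's `iso8859_1` and `iso8859_2` codecs are assumed to represent LATIN-1 and LATIN-2.

```python
def unencodable(text, codec):
    """Return, in order of first appearance, the characters of `text`
    that have no code position in `codec`."""
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# A Czech phrase followed by a French quotation (hypothetical sample text).
mixed = "Příliš žluťoučký kůň: où est la forêt"

print("missing from LATIN-2:", unencodable(mixed, "iso8859_2"))
print("missing from LATIN-1:", unencodable(mixed, "iso8859_1"))
```

Whichever of the two codetables is selected, some letters of the sample cannot be coded: LATIN-2 lacks the French letters, LATIN-1 the Czech ones.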
The design of ISO 8859 covered initially:
It soon became clear that the Turks were dissatisfied with LATIN-3, with the result that a new LATIN-5 was created. Unfortunately, the effect is now that Turkish and Icelandic exclude each other in a text.
The Scandinavians wanted Sami (Lappish) to be included, resulting in LATIN-6. With these additions the design of ISO 8859, initially so clean, became increasingly untidy. The present situation is therefore rather confusing with respect to North and South.
In Northern Europe LATIN-1 is well suited to the needs of the main languages. LATIN-6 turned out not to be acceptable to the Baltic countries. For these (the "Baltic Rim") a codetable was designed which at present has only an ISO-IR registration number (IR 179); it contains Polish as well, but not Sami. How things will develop in actual practice remains to be seen, now that there are four codetables to choose from:
After what has been said above, the choice to be made by the Netherlands was no longer difficult. Proceeding from the needs of Western European languages, only LATIN-1 and LATIN-5 could be candidates. Turkish had priority over Icelandic, thus Latin Alphabet No. 5 (ISO 8859-9) has been selected as the national Netherlands standard for an 8-bit coded character set.
Technical developments show that LATIN-1 has already been implemented on a large scale, followed at a considerable distance by LATIN-2. But the differences between LATIN-1 and LATIN-5 are not great. If a producer can supply LATIN-1 for an application but not LATIN-5, modifications have to be made in only a few places to adapt a system to the requirements.
The following correspondences exist between the coding of the six letters that make up the differences in the repertoires of LATIN-1 and LATIN-5 (LATIN is omitted from the names):
|Position||LATIN-1||Name in LATIN-1||LATIN-5||Name in LATIN-5|
|13/00||Ð||CAPITAL LETTER ETH||Ğ||CAPITAL LETTER G WITH BREVE|
|13/13||Ý||CAPITAL LETTER Y WITH ACUTE||İ||CAPITAL LETTER I WITH DOT ABOVE|
|13/14||Þ||CAPITAL LETTER THORN||Ş||CAPITAL LETTER S WITH CEDILLA|
|15/00||ð||SMALL LETTER ETH||ğ||SMALL LETTER G WITH BREVE|
|15/13||ý||SMALL LETTER Y WITH ACUTE||ı||SMALL LETTER DOTLESS I|
|15/14||þ||SMALL LETTER THORN||ş||SMALL LETTER S WITH CEDILLA|
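The positions at which the two codetables differ can be cross-checked mechanically. The following sketch assumes that Python's `iso8859_1` and `iso8859_9` codecs correspond to LATIN-1 and LATIN-5; the function names are chosen for this illustration only.

```python
import unicodedata


def position(byte):
    """Return the column/row notation of a code position, e.g. 0xD0 -> '13/00'."""
    return f"{byte >> 4:02d}/{byte & 0x0F:02d}"


def codetable_differences(codec_a, codec_b):
    """List (position, char in codec_a, char in codec_b) for every position
    in columns A-F at which the two codetables differ."""
    rows = []
    for byte in range(0xA0, 0x100):
        a = bytes([byte]).decode(codec_a)
        b = bytes([byte]).decode(codec_b)
        if a != b:
            rows.append((position(byte), a, b))
    return rows


differences = codetable_differences("iso8859_1", "iso8859_9")
for pos, a, b in differences:
    print(pos, a, unicodedata.name(a), "|", b, unicodedata.name(b))
```

The names printed are the Unicode names, which carry the prefix LATIN that is omitted in the table. A system delivered with LATIN-1 only needs these positions remapped to produce LATIN-5.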
If it is only a matter of display, then modifying the keyboard or printer is the first thing to do. Text processing presents more problems, if only because of the handling of the "I": "I" is no longer simply the capital letter of "i", because Turkish has both a SMALL LETTER DOTLESS I and a CAPITAL LETTER I WITH DOT ABOVE.
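What the handling of the "I" means for text processing can be sketched as follows. The functions `turkish_upper` and `turkish_lower` are hypothetical helpers written for this illustration; they only cover the two pairs of i-letters and rely on Python's default case conversion for everything else.

```python
DOTLESS_I = "\u0131"     # SMALL LETTER DOTLESS I
DOTTED_CAP_I = "\u0130"  # CAPITAL LETTER I WITH DOT ABOVE


def turkish_upper(text):
    """Upper-case `text` with the Turkish pairing i -> I WITH DOT ABOVE,
    DOTLESS I -> I."""
    return text.replace("i", DOTTED_CAP_I).replace(DOTLESS_I, "I").upper()


def turkish_lower(text):
    """Lower-case `text` with the Turkish pairing I WITH DOT ABOVE -> i,
    I -> DOTLESS I."""
    return text.replace(DOTTED_CAP_I, "i").replace("I", DOTLESS_I).lower()


# Language-neutral conversion pairs "i" with "I", which is wrong for Turkish.
print("ilgi".upper())         # ILGI
print(turkish_upper("ilgi"))  # capital I WITH DOT ABOVE in both places
print(turkish_lower("ISI"))   # dotless i in both places
```

The point of the sketch is that case conversion can no longer be a fixed table lookup per code position: it depends on the language of the text, not only on the codetable in use.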
ISO 8859-1, LATIN-1
ISO 8859-2, LATIN-2
ISO 8859-9, LATIN-5
List of languages with the number of the Latin alphabet that contains the required letters
|Latin 1||Latin 2||Latin 3||Latin 4||Latin 5||Latin 6||Latin 1||Latin 2||Latin 3||Latin 4||Latin 5||Latin 6|
* without Œ œ