Request for Comments: *****

Registration of a Ukrainian Cyrillic Character Set KOI8-U/Internet
(as an extension to Russian KOI8-R and ISO-IR-111)


APRIL 1997



Status of this Memo

This memo provides information for the Internet community. It does not specify an Internet standard. Distribution of this memo is unlimited.

Introduction

Though the proposed character set "KOI8-U" is not currently an international standard, a large Internet user community (in Ukraine and in the worldwide Ukrainian-speaking community) supports it.
"KOI8-RU" is a de facto standard accepted by the whole Ukrainian community on the Internet and unofficially published at many sites (for example, ftp://ftp.ua.net/pub/info/encodings/koi8-u/ukr_chars_in_koi8-u_and_others.txt; ftp://ftp.gu.kiev.ua/pub/koi8-u/ukr_chars_in_koi8-u_and_others.txt; http://cad.ntu-kpi.kiev.ua/~demch/multiling/KOI8-U.html).

The Ukrainian language ranks 20th among the world's languages (http://www.isoc.org:8080/langues/iso639.htm) and is supported not only in Ukraine as a national state but also in many Ukrainian communities around the world.

KOI8-RU should be registered to support and facilitate general and cultural information content. Support for the Ukrainian language in new software products is held back by the absence of an officially registered and widely published Ukrainian charset, even though one is in de facto use.

One of the current problems is that the old code pages ISO-IR-111 and ISO 8859-5 do not include the new Ukrainian letter KGE (with upturn). It is now registered in UNICODE 2.0.14 as Cyrillic GHE with upturn (0490 - capital, 0491 - small). It is used in more than 25 Ukrainian words and in some cases carries specific national features.
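The two code points cited above can be checked against the Unicode character database. The following is a minimal sketch, assuming a Python interpreter with the standard unicodedata module; the variable name UPTURN_NAMES is illustrative only and is not part of this memo.

```python
import unicodedata

# Code points quoted in the text for the letter with upturn.
UPTURN_CODE_POINTS = (0x0490, 0x0491)

# Map each code point to the name recorded in the Unicode character database.
UPTURN_NAMES = {cp: unicodedata.name(chr(cp)) for cp in UPTURN_CODE_POINTS}

for cp, name in UPTURN_NAMES.items():
    print(f"U+{cp:04X} {chr(cp)} {name}")
```

Note that the database uses the name GHE WITH UPTURN, which is the naming this memo proposes to correct.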

The new standard has to state the specifics of this letter, which sounds as KGE, in contrast to the ordinary letter G, which sounds as GHE. This follows from linguistic research as well as from the history of the letter: its introduction in 1818, reintroduction in 1924, and rehabilitation (after Stalin's linguistic research) in 1992. Such a correction in the spelling/transliteration of this letter therefore has to be made in the proposed and other standards.

MIME character set name: koi8-ru

Published specification:

This standard is unpublished but is based on several published standards: first of all RFC1489 (with which it is fully compliant in all Russian letters), ISO 8859-5, ISO-IR-111, and UNICODE 2.0.14.

The Appendix contains coding/conversion tables for the upper half of the code table from KOI8-U to UNICODE, CP1251, and ISO 8859-5.

KOI8-U/Internet is compatible with KOI8-R in all Cyrillic letters and completes it with all Cyrillic letters from ISO-IR-111 in positions #160-#191. FORMS in positions #128-#146 and bullets in positions #148, #149, #158 coincide with KOI8-R.

Positions #147, #150-#153, #155-#157 are used for important characters that are currently missing from ISO-IR-111 and KOI8-R, replacing some special mathematical symbols.
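The text gives positions in decimal (#147 and so on), while the character table uses two-digit hexadecimal codes. The following is a minimal sketch of the conversion; the function and variable names are hypothetical and chosen only for illustration.

```python
def position_to_hex(position: int) -> str:
    """Convert a decimal position such as 147 to the two-digit hex code used in the table."""
    return format(position, "02X")


def expand(ranges):
    """Expand inclusive (first, last) decimal ranges into a list of hex codes."""
    codes = []
    for first, last in ranges:
        codes.extend(position_to_hex(p) for p in range(first, last + 1))
    return codes


# Positions the text lists as carrying characters missing from ISO-IR-111 and KOI8-R.
NEW_CHARACTER_POSITIONS = [(147, 147), (150, 153), (155, 157)]

print(expand(NEW_CHARACTER_POSITIONS))
# prints ['93', '96', '97', '98', '99', '9B', '9C', '9D']
```

These hex codes correspond to the quotation mark, dash, and sign entries in the character table.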

KOI8-U is completely compatible with ISO-IR-111 but differs in the positions of one additional Ukrainian letter, KGE (WITH UPTURN), which replaces the characters SHY (soft hyphen) and "currency sign". Positions #128-#159 are used for FORMS elements from KOI8-R and other important characters that are currently missing from ISO-IR-111.

The code of the Belorussian letter SHORT U is compliant with ISO-IR-111.

The characters in the upper half of the table are described in compliance with ISO 10646 (Unicode), with a correction for the Ukrainian letter KGE with upturn (UNICODE #0490, 0491). All Russian letters are left at their original KOI8-R positions. The newly introduced Ukrainian letters occupy the positions where they are used as a de facto standard in Ukrainian-language applications and newsgroup exchange, as accepted by the whole Ukrainian-language community.

<hex-code> <description>

80 FORMS LIGHT HORIZONTAL
81 FORMS LIGHT VERTICAL
82 FORMS LIGHT DOWN AND RIGHT
83 FORMS LIGHT DOWN AND LEFT
84 FORMS LIGHT UP AND RIGHT
85 FORMS LIGHT UP AND LEFT
86 FORMS LIGHT VERTICAL AND RIGHT
87 FORMS LIGHT VERTICAL AND LEFT
88 FORMS LIGHT DOWN AND HORIZONTAL
89 FORMS LIGHT UP AND HORIZONTAL
8A FORMS LIGHT VERTICAL AND HORIZONTAL
8B UPPER HALF BLOCK
8C LOWER HALF BLOCK
8D FULL BLOCK
8E LEFT HALF BLOCK
8F RIGHT HALF BLOCK
90 LIGHT SHADE
91 MEDIUM SHADE
92 DARK SHADE
93 LEFT DOUBLE QUOTATION MARK
94 BLACK SMALL SQUARE
95 BULLET OPERATOR
96 RIGHT DOUBLE QUOTATION MARK
97 EM DASH
98 COPYRIGHT SIGN
99 TRADE MARK SIGN
9A NOT USED
9B DOUBLE RIGHT-POINTING ANGLE QUOTATION MARK
9C REGISTERED SIGN
9D DOUBLE LEFT-POINTING ANGLE QUOTATION MARK
9E MIDDLE DOT
9F CURRENCY SIGN
A0 NON-BREAKING SPACE
A1 CYRILLIC SMALL LETTER DJE (Serbocroatian)
A2 CYRILLIC SMALL LETTER GJE (Macedonian)
A3 CYRILLIC SMALL LETTER IO
A4 CYRILLIC SMALL LETTER UKRAINIAN IE Ukrainian
A5 CYRILLIC SMALL LETTER DZE (Macedonian)
A6 CYRILLIC SMALL LETTER BELORUSSIAN-UKRAINIAN I Ukrainian
A7 CYRILLIC SMALL LETTER YI (UKRAINIAN) Ukrainian
A8 CYRILLIC SMALL LETTER JE
A9 CYRILLIC SMALL LETTER LJE
AA CYRILLIC SMALL LETTER NJE
AB CYRILLIC SMALL LETTER TSHE (Serbocroatian)
AC CYRILLIC SMALL LETTER KJE (Macedonian)
AD CYRILLIC SMALL LETTER UKRAINIAN KGE (WITH UPTURN) Ukrainian
AE CYRILLIC SMALL LETTER BELORUSSIAN SHORT U Belorussian
AF CYRILLIC SMALL LETTER DZHE
B0 NUMERO SIGN
B1 CYRILLIC CAPITAL LETTER DJE (Serbocroatian)
B2 CYRILLIC CAPITAL LETTER GJE (Macedonian)
B3 CYRILLIC CAPITAL LETTER IO
B4 CYRILLIC CAPITAL LETTER UKRAINIAN IE Ukrainian
B5 CYRILLIC CAPITAL LETTER DZE (Macedonian)
B6 CYRILLIC CAPITAL LETTER BELORUSSIAN-UKRAINIAN I Ukrainian
B7 CYRILLIC CAPITAL LETTER YI (UKRAINIAN) Ukrainian
B8 CYRILLIC CAPITAL LETTER JE
B9 CYRILLIC CAPITAL LETTER LJE
BA CYRILLIC CAPITAL LETTER NJE
BB CYRILLIC CAPITAL LETTER TSHE (Serbocroatian)
BC CYRILLIC CAPITAL LETTER KJE (Macedonian)
BD CYRILLIC CAPITAL LETTER UKRAINIAN KGE (WITH UPTURN) Ukrainian
BE CYRILLIC CAPITAL LETTER BELORUSSIAN SHORT U Belorussian
BF CYRILLIC CAPITAL LETTER DZHE (COPYRIGHT SIGN in KOI8-R)
C0 CYRILLIC SMALL LETTER IU
C1 CYRILLIC SMALL LETTER A
C2 CYRILLIC SMALL LETTER BE
C3 CYRILLIC SMALL LETTER TSE
C4 CYRILLIC SMALL LETTER DE
C5 CYRILLIC SMALL LETTER IE
C6 CYRILLIC SMALL LETTER EF
C7 CYRILLIC SMALL LETTER GE (UKRAINIAN GHE) Ukrainian (spelling)
C8 CYRILLIC SMALL LETTER KHA
C9 CYRILLIC SMALL LETTER II
CA CYRILLIC SMALL LETTER SHORT II
CB CYRILLIC SMALL LETTER KA
CC CYRILLIC SMALL LETTER EL
CD CYRILLIC SMALL LETTER EM
CE CYRILLIC SMALL LETTER EN
CF CYRILLIC SMALL LETTER O
D0 CYRILLIC SMALL LETTER PE
D1 CYRILLIC SMALL LETTER IA
D2 CYRILLIC SMALL LETTER ER
D3 CYRILLIC SMALL LETTER ES
D4 CYRILLIC SMALL LETTER TE
D5 CYRILLIC SMALL LETTER U
D6 CYRILLIC SMALL LETTER ZHE
D7 CYRILLIC SMALL LETTER VE
D8 CYRILLIC SMALL LETTER SOFT SIGN
D9 CYRILLIC SMALL LETTER YERI
DA CYRILLIC SMALL LETTER ZE
DB CYRILLIC SMALL LETTER SHA
DC CYRILLIC SMALL LETTER REVERSED E
DD CYRILLIC SMALL LETTER SHCHA
DE CYRILLIC SMALL LETTER CHE
DF CYRILLIC SMALL LETTER HARD SIGN
E0 CYRILLIC CAPITAL LETTER IU
E1 CYRILLIC CAPITAL LETTER A
E2 CYRILLIC CAPITAL LETTER BE
E3 CYRILLIC CAPITAL LETTER TSE
E4 CYRILLIC CAPITAL LETTER DE
E5 CYRILLIC CAPITAL LETTER IE
E6 CYRILLIC CAPITAL LETTER EF
E7 CYRILLIC CAPITAL LETTER GE (UKRAINIAN GHE) Ukrainian (spelling)
E8 CYRILLIC CAPITAL LETTER KHA
E9 CYRILLIC CAPITAL LETTER II
EA CYRILLIC CAPITAL LETTER SHORT II
EB CYRILLIC CAPITAL LETTER KA
EC CYRILLIC CAPITAL LETTER EL
ED CYRILLIC CAPITAL LETTER EM
EE CYRILLIC CAPITAL LETTER EN
EF CYRILLIC CAPITAL LETTER O
F0 CYRILLIC CAPITAL LETTER PE
F1 CYRILLIC CAPITAL LETTER IA
F2 CYRILLIC CAPITAL LETTER ER
F3 CYRILLIC CAPITAL LETTER ES
F4 CYRILLIC CAPITAL LETTER TE
F5 CYRILLIC CAPITAL LETTER U
F6 CYRILLIC CAPITAL LETTER ZHE
F7 CYRILLIC CAPITAL LETTER VE
F8 CYRILLIC CAPITAL LETTER SOFT SIGN
F9 CYRILLIC CAPITAL LETTER YERI
FA CYRILLIC CAPITAL LETTER ZE
FB CYRILLIC CAPITAL LETTER SHA
FC CYRILLIC CAPITAL LETTER REVERSED E
FD CYRILLIC CAPITAL LETTER SHCHA
FE CYRILLIC CAPITAL LETTER CHE
FF CYRILLIC CAPITAL LETTER HARD SIGN
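As an illustration of how the table above could be applied, the following is a minimal sketch of a decoder for the Ukrainian and Belorussian letters only. It makes several assumptions: the function name decode_draft_koi8u is hypothetical, the Unicode code points in EXTRA_LETTERS come from the Unicode standard rather than from this section (only 0490 and 0491 are quoted in the memo), and the Russian letters in C0-FF are decoded with Python's built-in koi8_r codec on the strength of the statement that they keep their KOI8-R positions. All other upper-half codes are left undecoded.

```python
# Ukrainian and Belorussian letters at the positions given in the table.
# Code points are an assumption taken from the Unicode standard.
EXTRA_LETTERS = {
    0xA4: "\u0454",  # small Ukrainian IE
    0xA6: "\u0456",  # small Belorussian-Ukrainian I
    0xA7: "\u0457",  # small YI
    0xAD: "\u0491",  # small KGE (GHE with upturn)
    0xAE: "\u045E",  # small Belorussian SHORT U
    0xB4: "\u0404",  # capital Ukrainian IE
    0xB6: "\u0406",  # capital Belorussian-Ukrainian I
    0xB7: "\u0407",  # capital YI
    0xBD: "\u0490",  # capital KGE (GHE with upturn)
    0xBE: "\u040E",  # capital Belorussian SHORT U
}


def decode_draft_koi8u(data: bytes) -> str:
    """Decode bytes using the letter positions of the table (partial sketch)."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))                       # lower half is ASCII
        elif b in EXTRA_LETTERS:
            out.append(EXTRA_LETTERS[b])             # letters added by this memo
        elif b >= 0xC0:
            out.append(bytes([b]).decode("koi8_r"))  # Russian letters as in KOI8-R
        else:
            out.append("\ufffd")                     # not covered by this sketch
    return "".join(out)


print(decode_draft_koi8u(b"\xeb\xc9\xa7\xd7"))
```

The sample bytes are EB (capital KA), C9 (small II), A7 (small YI), and D7 (small VE) from the table.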


Legend

* New letters introduced

+ Change in name for Ukrainian letter

Security Considerations

Security issues are not discussed in this memo.


APPENDIX A

DIFFERENCE OF KOI8-U/Internet from EXISTING KOI8-R and ISO-IR-111


KOI8-U/Internet is compatible with KOI8-R in all Cyrillic letters and completes it with all Cyrillic letters from ISO-IR-111 in positions #160-#191.

FORMS in positions #128-#146, bullets in positions #148, #149, #158 and mathematical symbols in positions #151, #159 coincide with KOI8-R.

Positions #147, #150, #152, #153, #155-#157, #159 are used for important characters which are currently missing from ISO-IR-111:

93 LEFT DOUBLE QUOTATION MARK
96 RIGHT DOUBLE QUOTATION MARK
97 EM DASH
98 COPYRIGHT SIGN
99 TRADE MARK SIGN
9B DOUBLE RIGHT-POINTING ANGLE QUOTATION MARK
9C REGISTERED SIGN
9D DOUBLE LEFT-POINTING ANGLE QUOTATION MARK
9F CURRENCY SIGN
A0 NON-BREAKING SPACE
A1 CYRILLIC SMALL LETTER DJE (Serbocroatian)
A2 CYRILLIC SMALL LETTER GJE (Macedonian)
A3 CYRILLIC SMALL LETTER IO
A4 CYRILLIC SMALL LETTER UKRAINIAN IE Ukrainian
A5 CYRILLIC SMALL LETTER DZE (Macedonian)
A6 CYRILLIC SMALL LETTER BELORUSSIAN-UKRAINIAN I Ukrainian
A7 CYRILLIC SMALL LETTER YI (UKRAINIAN) Ukrainian
A8 CYRILLIC SMALL LETTER JE
A9 CYRILLIC SMALL LETTER LJE
AA CYRILLIC SMALL LETTER NJE
AB CYRILLIC SMALL LETTER TSHE (Serbocroatian)
AC CYRILLIC SMALL LETTER KJE (Macedonian)
AD CYRILLIC SMALL LETTER UKRAINIAN KGE (WITH UPTURN) Ukrainian
AE CYRILLIC SMALL LETTER BELORUSSIAN SHORT U Belorussian
AF CYRILLIC SMALL LETTER DZHE
B0 NUMERO SIGN
B1 CYRILLIC CAPITAL LETTER DJE (Serbocroatian)
B2 CYRILLIC CAPITAL LETTER GJE (Macedonian)
B3 CYRILLIC CAPITAL LETTER IO
B4 CYRILLIC CAPITAL LETTER UKRAINIAN IE Ukrainian
B5 CYRILLIC CAPITAL LETTER DZE (Macedonian)
B6 CYRILLIC CAPITAL LETTER BELORUSSIAN-UKRAINIAN I Ukrainian
B7 CYRILLIC CAPITAL LETTER YI (UKRAINIAN) Ukrainian
B8 CYRILLIC CAPITAL LETTER JE
B9 CYRILLIC CAPITAL LETTER LJE
BA CYRILLIC CAPITAL LETTER NJE
BB CYRILLIC CAPITAL LETTER TSHE (Serbocroatian)
BC CYRILLIC CAPITAL LETTER KJE (Macedonian)
BD CYRILLIC CAPITAL LETTER UKRAINIAN KGE (WITH UPTURN) Ukrainian
BE CYRILLIC CAPITAL LETTER BELORUSSIAN SHORT U Belorussian
BF CYRILLIC CAPITAL LETTER DZHE (COPYRIGHT SIGN in KOI8-R)
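To see what KOI8-R itself assigns to any of the codes listed above, the code can be looked up in an existing KOI8-R implementation. The following is a minimal sketch, assuming Python's built-in koi8_r codec as the reference for RFC1489; the helper name koi8r_name is hypothetical.

```python
import unicodedata


def koi8r_name(code: int) -> str:
    """Return the Unicode name of the character KOI8-R assigns to a one-byte code."""
    return unicodedata.name(bytes([code]).decode("koi8_r"))


# Compare a few codes from the list above with their KOI8-R assignments.
for code in (0xA3, 0xB3, 0xBF):
    print(f"{code:02X} {koi8r_name(code)}")
```

Codes A3 and B3 (IO) keep the same letter in both sets, while BF holds the copyright sign in KOI8-R.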


KOI8-U/Internet is completely compatible with ISO-IR-111 but differs in the positions of one additional Ukrainian letter, KGE WITH UPTURN, and in positions #128-#159, which are used for FORMS elements from KOI8-R and other important characters which are currently missing from ISO-IR-111.

80 FORMS LIGHT HORIZONTAL
81 FORMS LIGHT VERTICAL
82 FORMS LIGHT DOWN AND RIGHT
83 FORMS LIGHT DOWN AND LEFT
84 FORMS LIGHT UP AND RIGHT
85 FORMS LIGHT UP AND LEFT
86 FORMS LIGHT VERTICAL AND RIGHT
87 FORMS LIGHT VERTICAL AND LEFT
88 FORMS LIGHT DOWN AND HORIZONTAL
89 FORMS LIGHT UP AND HORIZONTAL
8A FORMS LIGHT VERTICAL AND HORIZONTAL
8B UPPER HALF BLOCK
8C LOWER HALF BLOCK
8D FULL BLOCK
8E LEFT HALF BLOCK
8F RIGHT HALF BLOCK
90 LIGHT SHADE
91 MEDIUM SHADE
92 DARK SHADE
93 LEFT DOUBLE QUOTATION MARK
94 BLACK SMALL SQUARE
95 BULLET OPERATOR
96 RIGHT DOUBLE QUOTATION MARK
97 EM DASH
98 COPYRIGHT SIGN
99 TRADE MARK SIGN
9A NOT USED
9B DOUBLE RIGHT-POINTING ANGLE QUOTATION MARK
9C REGISTERED SIGN
9D DOUBLE LEFT-POINTING ANGLE QUOTATION MARK
9E MIDDLE DOT
9F CURRENCY SIGN
AD CYRILLIC SMALL LETTER UKRAINIAN KGE (WITH UPTURN) Ukrainian
BD CYRILLIC CAPITAL LETTER UKRAINIAN KGE (WITH UPTURN) Ukrainian