Request for Comments: *****

Registration of a Ukrainian Cyrillic Character Set KOI8-U/RUB
(as extension to Russian KOI8-R and ISO-IR-111)


APRIL 1997



Status of this Memo

This memo provides information for the Internet community. It does not specify an Internet standard. Distribution of this memo is unlimited.

Introduction

Though the proposed character set "KOI8-U" is not currently an international standard, a large Internet user community (in Ukraine and among Ukrainian speakers worldwide) supports it.
"KOI8-RU" is a de facto standard accepted by the entire Ukrainian community on the Internet and unofficially published at many sites (e.g., ftp://ftp.ua.net/pub/info/encodings/koi8-u/ukr_chars_in_koi8-u_and_others.txt; ftp://ftp.gu.kiev.ua/pub/koi8-u/ukr_chars_in_koi8-u_and_others.txt; http://cad.ntu-kpi.kiev.ua/~demch/multiling/KOI8-U.html).

The Ukrainian language ranks 20th among the world's languages (http://www.isoc.org:8080/langues/iso639.htm) and is supported not only in Ukraine as a national state but also in many Ukrainian communities around the world.

KOI8-U should be registered to support and facilitate general and cultural information content. Support for the Ukrainian language in new software products is held back by the absence of an officially registered and widely published Ukrainian charset, even though one is in de facto use.

One current problem is that the older code pages ISO-IR-111 and ISO 8859-5 do not include the new Ukrainian letter KGE (with upturn). It is now registered in UNICODE 2.0.14 as Cyrillic GHE with upturn (0490 - capital, 0491 - small). It is used in more than 25 Ukrainian words and in some cases carries specific national features.

The new standard has to state the specifics of this letter, which sounds as KGE, in contrast to the ordinary letter G, which sounds as GHE. This follows from linguistic research as well as from the letter's history: its introduction in 1818, reintroduction in 1924, and rehabilitation (after Stalin's linguistic researches) in 1992. A corresponding correction in the spelling/transliteration of this letter therefore has to be made in the proposed standard and in other standards.
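The distinction drawn above between the ordinary letter and the letter with upturn can be checked mechanically. The following is a minimal sketch, assuming Python and its built-in `iso8859_5` codec (choices of this example, not of the memo). The code points 0413/0433 for the ordinary letter and the helper name `in_codepage` are likewise introduced here for illustration only.

```python
import unicodedata

# 0490/0491 are the code points cited in the memo; 0413/0433 (the ordinary
# letter) are filled in by this example for comparison.
LETTERS = {
    "ordinary, capital": "\u0413",
    "ordinary, small": "\u0433",
    "with upturn, capital": "\u0490",
    "with upturn, small": "\u0491",
}


def in_codepage(char: str, codec: str) -> bool:
    """Return True if the named codec has a position for char."""
    try:
        char.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


for label, char in LETTERS.items():
    print(f"U+{ord(char):04X}  {unicodedata.name(char)}  ({label})  "
          f"in ISO 8859-5: {in_codepage(char, 'iso8859_5')}")
```

Both forms have their own Unicode code points, but only the ordinary letter has a position in ISO 8859-5, which is the gap the proposed charset is meant to close.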

MIME character set name: koi8-u
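As an illustration of how the charset name would appear in practice, the sketch below labels a message body with `koi8-u` using Python's standard `email` package. It relies on Python's built-in `koi8_u` codec, which may not match every position of the table in this memo; the sample text uses only letters whose positions agree. The sample string and variable names are assumptions of this example.

```python
from email.mime.text import MIMEText

text = "Україна"  # illustrative sample containing the Ukrainian letter yi

# Label the body with the charset name proposed in this memo.
msg = MIMEText(text, "plain", "koi8-u")

# The charset parameter travels in the Content-Type header.
print(msg["Content-Type"])

# Round trip: undo the transfer encoding, then decode the KOI8-U bytes.
raw = msg.get_payload(decode=True)
body = raw.decode(msg.get_content_charset())
```

A receiving agent that does not recognize the charset name cannot perform the last step, which is the practical argument for registration.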

Published specification:

This standard is unpublished but is based on several published standards: first of all RFC1489 (with which it is fully compliant in all Russian letters), ISO 8859-5, ISO-IR-111, and UNICODE 2.0.14.

The Appendix contains coding/conversion tables for the upper half of the code table, from KOI8-U to UNICODE, CP1251, and ISO8859-5.
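Rows of such a conversion table can be derived rather than typed by hand. The sketch below does this for the four Ukrainian letter pairs only, assuming Python's built-in `cp1251` and `iso8859_5` codecs. The Unicode code points are this example's reading of the character names (the memo itself cites only 0490 and 0491), and the function names are hypothetical.

```python
# KOI8-U positions of the Ukrainian letters, as given in this memo, mapped
# to Unicode code points assumed by this example.
KOI8U_UKRAINIAN = {
    0xA4: "\u0454",  # small Ukrainian ie
    0xA6: "\u0456",  # small Byelorussian-Ukrainian i
    0xA7: "\u0457",  # small yi
    0xAD: "\u0491",  # small ghe with upturn
    0xB4: "\u0404",  # capital Ukrainian ie
    0xB6: "\u0406",  # capital Byelorussian-Ukrainian i
    0xB7: "\u0407",  # capital yi
    0xBD: "\u0490",  # capital ghe with upturn
}


def position_in(char, codec):
    """Return the single-byte position of char in codec, or None if absent."""
    try:
        return char.encode(codec)[0]
    except UnicodeEncodeError:
        return None


def hex_or_dash(value):
    """Format a byte position as two hex digits, or '--' when missing."""
    return "--" if value is None else f"{value:02X}"


def conversion_rows():
    """Build (KOI8-U, Unicode, CP1251, ISO 8859-5) rows in KOI8-U order."""
    return [
        (pos, ord(char), position_in(char, "cp1251"), position_in(char, "iso8859_5"))
        for pos, char in sorted(KOI8U_UKRAINIAN.items())
    ]


print("KOI8-U  UNICODE  CP1251  ISO8859-5")
for koi8u, uni, cp1251, iso in conversion_rows():
    print(f"{koi8u:02X}      {uni:04X}     {hex_or_dash(cp1251)}      {hex_or_dash(iso)}")
```

The rows for ghe with upturn show `--` in the ISO 8859-5 column, since that code page has no position for the letter.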

KOI8-U is compatible with KOI8-R in all Cyrillic letters and completes it with four Ukrainian letters (#164, #180 - ukr. ie; #166, #182 - ukr. i; #167, #183 - ukr. yi; #173, #189 - ukr. ghe with upturn) and one Byelorussian letter (#174, #190 - byelorussian short u), placed in positions compliant with ISO-IR-111.
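The paragraph above gives positions in decimal, while the code table uses hexadecimal. The short sketch below converts between the two and makes the regular spacing of the small and capital forms visible. The dictionary and function names are illustrative only.

```python
# Decimal positions (small, capital) exactly as listed in the paragraph above.
NEW_LETTERS = {
    "Ukrainian ie": (164, 180),
    "Ukrainian i": (166, 182),
    "Ukrainian yi": (167, 183),
    "Ukrainian ghe with upturn": (173, 189),
    "Byelorussian short u": (174, 190),
}


def describe(name, small, capital):
    """Render one letter pair with both decimal and hexadecimal positions."""
    return f"{name}: small #{small} = {small:02X}, capital #{capital} = {capital:02X}"


def case_offsets():
    """Return the distance between the capital and small form of each letter."""
    return {name: capital - small for name, (small, capital) in NEW_LETTERS.items()}


for name, (small, capital) in NEW_LETTERS.items():
    print(describe(name, small, capital))
```

Every capital form sits 16 positions (hex 10) above its small form, the same spacing as small and capital IO at A3 and B3 in the table.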

All FORMS, except those in positions occupied by Ukrainian and Byelorussian letters, and the Bullets in positions #148, #149, #158 coincide with KOI8-R.

Positions #147, #150-#153, #155-#157, #159 are used for important characters which are currently missing from ISO-IR-111.

The code of the Byelorussian letter SHORT U is compliant with ISO-IR-111.

The table below describes all characters from the upper half of the code table in compliance with ISO 10646 (Unicode), with a correction for the Ukrainian letter KGE with upturn (UNICODE #0490, #0491). All Russian letters remain at their original KOI8-R positions. The newly introduced Ukrainian letters occupy the positions where they are already used as a de facto standard in Ukrainian-language applications and newsgroup exchange, as accepted by the entire Ukrainian-language community.

<hex-code> <description>

80 FORMS LIGHT HORIZONTAL
81 FORMS LIGHT VERTICAL
82 FORMS LIGHT DOWN AND RIGHT
83 FORMS LIGHT DOWN AND LEFT
84 FORMS LIGHT UP AND RIGHT
85 FORMS LIGHT UP AND LEFT
86 FORMS LIGHT VERTICAL AND RIGHT
87 FORMS LIGHT VERTICAL AND LEFT
88 FORMS LIGHT DOWN AND HORIZONTAL
89 FORMS LIGHT UP AND HORIZONTAL
8A FORMS LIGHT VERTICAL AND HORIZONTAL
8B UPPER HALF BLOCK
8C LOWER HALF BLOCK
8D FULL BLOCK
8E LEFT HALF BLOCK
8F RIGHT HALF BLOCK
90 LIGHT SHADE
91 MEDIUM SHADE
92 DARK SHADE
93 LEFT DOUBLE QUOTATION MARK
94 BLACK SMALL SQUARE
95 BULLET OPERATOR
96 RIGHT DOUBLE QUOTATION MARK
97 EM DASH
98 COPYRIGHT SIGN
99 TRADE MARK SIGN
9A NOT USED
9B DOUBLE RIGHT-POINTING ANGLE QUOTATION MARK
9C REGISTERED SIGN
9D DOUBLE LEFT-POINTING ANGLE QUOTATION MARK
9E MIDDLE DOT
9F CURRENCY SIGN
A0 FORMS DOUBLE HORIZONTAL
A1 FORMS DOUBLE VERTICAL
A2 FORMS DOWN SINGLE AND RIGHT DOUBLE
A3 CYRILLIC SMALL LETTER IO
A4 CYRILLIC SMALL LETTER UKRAINIAN IE Ukrainian
A5 FORMS DOUBLE DOWN AND RIGHT
A6 CYRILLIC SMALL LETTER BELORUSSIAN-UKRAINIAN I Ukrainian
A7 CYRILLIC SMALL LETTER YI (UKRAINIAN) Ukrainian
A8 FORMS DOUBLE DOWN AND LEFT
A9 FORMS UP SINGLE AND RIGHT DOUBLE
AA FORMS UP DOUBLE AND RIGHT SINGLE
AB FORMS DOUBLE UP AND RIGHT
AC FORMS UP SINGLE AND LEFT DOUBLE
AD CYRILLIC SMALL LETTER UKRAINIAN KGE (WITH UPTURN) Ukrainian
AE CYRILLIC SMALL LETTER BELORUSSIAN SHORT U Belorussian
AF FORMS VERTICAL SINGLE AND RIGHT DOUBLE
B0 FORMS VERTICAL DOUBLE AND RIGHT SINGLE
B1 FORMS DOUBLE VERTICAL AND RIGHT
B2 FORMS VERTICAL SINGLE AND LEFT DOUBLE
B3 CYRILLIC CAPITAL LETTER IO
B4 CYRILLIC CAPITAL LETTER UKRAINIAN IE Ukrainian
B5 FORMS DOUBLE VERTICAL AND LEFT
B6 CYRILLIC CAPITAL LETTER BELORUSSIAN-UKRAINIAN I Ukrainian
B7 CYRILLIC CAPITAL LETTER YI (UKRAINIAN) Ukrainian
B8 FORMS DOUBLE DOWN AND HORIZONTAL
B9 FORMS UP SINGLE AND HORIZONTAL DOUBLE
BA FORMS UP DOUBLE AND HORIZONTAL SINGLE
BB FORMS DOUBLE UP AND HORIZONTAL
BC FORMS VERTICAL SINGLE AND HORIZONTAL DOUBLE
BD CYRILLIC CAPITAL LETTER UKRAINIAN KGE (WITH UPTURN) Ukrainian
BE CYRILLIC CAPITAL LETTER BELORUSSIAN SHORT U Belorussian
BF COPYRIGHT SIGN
C0 CYRILLIC SMALL LETTER IU
C1 CYRILLIC SMALL LETTER A
C2 CYRILLIC SMALL LETTER BE
C3 CYRILLIC SMALL LETTER TSE
C4 CYRILLIC SMALL LETTER DE
C5 CYRILLIC SMALL LETTER IE
C6 CYRILLIC SMALL LETTER EF
C7 CYRILLIC SMALL LETTER GE (UKRAINIAN GHE) Ukrainian (spelling)
C8 CYRILLIC SMALL LETTER KHA
C9 CYRILLIC SMALL LETTER II
CA CYRILLIC SMALL LETTER SHORT II
CB CYRILLIC SMALL LETTER KA
CC CYRILLIC SMALL LETTER EL
CD CYRILLIC SMALL LETTER EM
CE CYRILLIC SMALL LETTER EN
CF CYRILLIC SMALL LETTER O
D0 CYRILLIC SMALL LETTER PE
D1 CYRILLIC SMALL LETTER IA
D2 CYRILLIC SMALL LETTER ER
D3 CYRILLIC SMALL LETTER ES
D4 CYRILLIC SMALL LETTER TE
D5 CYRILLIC SMALL LETTER U
D6 CYRILLIC SMALL LETTER ZHE
D7 CYRILLIC SMALL LETTER VE
D8 CYRILLIC SMALL LETTER SOFT SIGN
D9 CYRILLIC SMALL LETTER YERI
DA CYRILLIC SMALL LETTER ZE
DB CYRILLIC SMALL LETTER SHA
DC CYRILLIC SMALL LETTER REVERSED E
DD CYRILLIC SMALL LETTER SHCHA
DE CYRILLIC SMALL LETTER CHE
DF CYRILLIC SMALL LETTER HARD SIGN
E0 CYRILLIC CAPITAL LETTER IU
E1 CYRILLIC CAPITAL LETTER A
E2 CYRILLIC CAPITAL LETTER BE
E3 CYRILLIC CAPITAL LETTER TSE
E4 CYRILLIC CAPITAL LETTER DE
E5 CYRILLIC CAPITAL LETTER IE
E6 CYRILLIC CAPITAL LETTER EF
E7 CYRILLIC CAPITAL LETTER GE (UKRAINIAN GHE) Ukrainian (spelling)
E8 CYRILLIC CAPITAL LETTER KHA
E9 CYRILLIC CAPITAL LETTER II
EA CYRILLIC CAPITAL LETTER SHORT II
EB CYRILLIC CAPITAL LETTER KA
EC CYRILLIC CAPITAL LETTER EL
ED CYRILLIC CAPITAL LETTER EM
EE CYRILLIC CAPITAL LETTER EN
EF CYRILLIC CAPITAL LETTER O
F0 CYRILLIC CAPITAL LETTER PE
F1 CYRILLIC CAPITAL LETTER IA
F2 CYRILLIC CAPITAL LETTER ER
F3 CYRILLIC CAPITAL LETTER ES
F4 CYRILLIC CAPITAL LETTER TE
F5 CYRILLIC CAPITAL LETTER U
F6 CYRILLIC CAPITAL LETTER ZHE
F7 CYRILLIC CAPITAL LETTER VE
F8 CYRILLIC CAPITAL LETTER SOFT SIGN
F9 CYRILLIC CAPITAL LETTER YERI
FA CYRILLIC CAPITAL LETTER ZE
FB CYRILLIC CAPITAL LETTER SHA
FC CYRILLIC CAPITAL LETTER REVERSED E
FD CYRILLIC CAPITAL LETTER SHCHA
FE CYRILLIC CAPITAL LETTER CHE
FF CYRILLIC CAPITAL LETTER HARD SIGN


Legend

* New letters introduced

+ Change in name for Ukrainian letter
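The table above can be turned into a decoder by starting from KOI8-R and overriding only the positions that differ. The following is a sketch under stated assumptions: it uses Python's built-in `koi8_r` codec as the base, the Unicode code points are this example's reading of the character names in the table, and the position marked NOT USED is mapped to U+FFFD as an arbitrary choice. `DRAFT_OVERRIDES`, `build_table`, and `decode` are hypothetical names.

```python
# Positions that differ from KOI8-R, keyed by byte value. Code points are
# assumed from the character names in the table.
DRAFT_OVERRIDES = {
    0x93: "\u201C",  # left double quotation mark
    0x96: "\u201D",  # right double quotation mark
    0x97: "\u2014",  # em dash
    0x98: "\u00A9",  # copyright sign
    0x99: "\u2122",  # trade mark sign
    0x9B: "\u00BB",  # double right-pointing angle quotation mark
    0x9C: "\u00AE",  # registered sign
    0x9D: "\u00AB",  # double left-pointing angle quotation mark
    0x9F: "\u00A4",  # currency sign
    0xA4: "\u0454", 0xB4: "\u0404",  # Ukrainian ie
    0xA6: "\u0456", 0xB6: "\u0406",  # Byelorussian-Ukrainian i
    0xA7: "\u0457", 0xB7: "\u0407",  # yi
    0xAD: "\u0491", 0xBD: "\u0490",  # ghe with upturn
    0xAE: "\u045E", 0xBE: "\u040E",  # Byelorussian short u
}
UNUSED = {0x9A}  # listed as NOT USED in the table


def build_table():
    """Return a 256-entry byte-to-character table for the draft charset."""
    table = {}
    for byte in range(0x100):
        if byte in UNUSED:
            table[byte] = "\uFFFD"  # replacement character (example's choice)
        elif byte in DRAFT_OVERRIDES:
            table[byte] = DRAFT_OVERRIDES[byte]
        else:
            table[byte] = bytes([byte]).decode("koi8_r")
    return table


TABLE = build_table()


def decode(data: bytes) -> str:
    """Decode bytes according to the table sketched above."""
    return "".join(TABLE[byte] for byte in data)
```

For example, `decode(bytes([0xF5, 0xCB, 0xD2, 0xC1, 0xA7, 0xCE, 0xC1]))` yields the word "Україна", with the letter yi taken from position A7.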

Security Considerations

Security issues are not discussed in this memo.


APPENDIX A

DIFFERENCE OF KOI8-U/Internet from EXISTING KOI8-R and ISO-IR-111


KOI8-U is compatible with KOI8-R in all Cyrillic letters and completes it with four Ukrainian letters and one Byelorussian letter, whose positions coincide with ISO-IR-111.

FORMS in positions #128-#146 and Bullets in positions #148, #149, #158 coincide with KOI8-R.

Positions #147, #150-#153, #155-#157, #159 are used for important characters which are currently missing from ISO-IR-111:

93 LEFT DOUBLE QUOTATION MARK
96 RIGHT DOUBLE QUOTATION MARK
97 EM DASH
98 COPYRIGHT SIGN
99 TRADE MARK SIGN
9B DOUBLE RIGHT-POINTING ANGLE QUOTATION MARK
9C REGISTERED SIGN
9D DOUBLE LEFT-POINTING ANGLE QUOTATION MARK
9F CURRENCY SIGN
A4 CYRILLIC SMALL LETTER UKRAINIAN IE Ukrainian
A6 CYRILLIC SMALL LETTER BELORUSSIAN-UKRAINIAN I Ukrainian
A7 CYRILLIC SMALL LETTER YI (UKRAINIAN) Ukrainian
AD CYRILLIC SMALL LETTER UKRAINIAN KGE (WITH UPTURN) Ukrainian
AE CYRILLIC SMALL LETTER BELORUSSIAN SHORT U Belorussian
B4 CYRILLIC CAPITAL LETTER UKRAINIAN IE Ukrainian
B6 CYRILLIC CAPITAL LETTER BELORUSSIAN-UKRAINIAN I Ukrainian
B7 CYRILLIC CAPITAL LETTER YI (UKRAINIAN) Ukrainian
BD CYRILLIC CAPITAL LETTER UKRAINIAN KGE (WITH UPTURN) Ukrainian
BE CYRILLIC CAPITAL LETTER BELORUSSIAN SHORT U Belorussian
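To see what each listed position displaces, the list above can be compared against KOI8-R directly. This sketch assumes Python's built-in `koi8_r` codec as the reference for KOI8-R. The Unicode code points assigned to the listed names and the function names are assumptions of this example.

```python
import unicodedata

# Positions from the list above, with code points assumed from the names.
LISTED = {
    0x93: "\u201C", 0x96: "\u201D", 0x97: "\u2014", 0x98: "\u00A9",
    0x99: "\u2122", 0x9B: "\u00BB", 0x9C: "\u00AE", 0x9D: "\u00AB",
    0x9F: "\u00A4",
    0xA4: "\u0454", 0xA6: "\u0456", 0xA7: "\u0457", 0xAD: "\u0491",
    0xAE: "\u045E",
    0xB4: "\u0404", 0xB6: "\u0406", 0xB7: "\u0407", 0xBD: "\u0490",
    0xBE: "\u040E",
}


def koi8r_char(byte):
    """Return the character KOI8-R assigns to a byte position."""
    return bytes([byte]).decode("koi8_r")


def replaced_characters():
    """Map each differing position to the name of the KOI8-R character there."""
    return {
        byte: unicodedata.name(koi8r_char(byte))
        for byte in sorted(LISTED)
        if koi8r_char(byte) != LISTED[byte]
    }


for byte, old_name in replaced_characters().items():
    print(f"{byte:02X} (#{byte}): replaces {old_name}")
```

Under these assumptions every listed position differs from KOI8-R: the letter positions displace box-drawing characters, and the positions in the 93-9F range displace mathematical and other symbols.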


KOI8-U is compatible with ISO-IR-111 in the positions of all Russian, Ukrainian and Byelorussian letters, but is completed with one additional Ukrainian letter, KGE WITH UPTURN. Other Cyrillic letters are excluded from the code table and replaced with FORMS elements from KOI8-R. Positions #128-#159 are used for FORMS elements from KOI8-R and for other important characters which are currently missing from ISO-IR-111.

80 FORMS LIGHT HORIZONTAL
81 FORMS LIGHT VERTICAL
82 FORMS LIGHT DOWN AND RIGHT
83 FORMS LIGHT DOWN AND LEFT
84 FORMS LIGHT UP AND RIGHT
85 FORMS LIGHT UP AND LEFT
86 FORMS LIGHT VERTICAL AND RIGHT
87 FORMS LIGHT VERTICAL AND LEFT
88 FORMS LIGHT DOWN AND HORIZONTAL
89 FORMS LIGHT UP AND HORIZONTAL
8A FORMS LIGHT VERTICAL AND HORIZONTAL
8B UPPER HALF BLOCK
8C LOWER HALF BLOCK
8D FULL BLOCK
8E LEFT HALF BLOCK
8F RIGHT HALF BLOCK
90 LIGHT SHADE
91 MEDIUM SHADE
92 DARK SHADE
93 LEFT DOUBLE QUOTATION MARK
94 BLACK SMALL SQUARE
95 BULLET OPERATOR
96 RIGHT DOUBLE QUOTATION MARK
97 EM DASH
98 COPYRIGHT SIGN
99 TRADE MARK SIGN
9A NOT USED
9B DOUBLE RIGHT-POINTING ANGLE QUOTATION MARK
9C REGISTERED SIGN
9D DOUBLE LEFT-POINTING ANGLE QUOTATION MARK
9E MIDDLE DOT
9F CURRENCY SIGN
A0 FORMS DOUBLE HORIZONTAL
A1 FORMS DOUBLE VERTICAL
A2 FORMS DOWN SINGLE AND RIGHT DOUBLE
A5 FORMS DOUBLE DOWN AND RIGHT
A8 FORMS DOUBLE DOWN AND LEFT
A9 FORMS UP SINGLE AND RIGHT DOUBLE
AA FORMS UP DOUBLE AND RIGHT SINGLE
AB FORMS DOUBLE UP AND RIGHT
AC FORMS UP SINGLE AND LEFT DOUBLE
AD CYRILLIC SMALL LETTER UKRAINIAN KGE (WITH UPTURN) Ukrainian
AE CYRILLIC SMALL LETTER BELORUSSIAN SHORT U Belorussian
AF FORMS VERTICAL SINGLE AND RIGHT DOUBLE
B0 FORMS VERTICAL DOUBLE AND RIGHT SINGLE
B1 FORMS DOUBLE VERTICAL AND RIGHT
B2 FORMS VERTICAL SINGLE AND LEFT DOUBLE
B3 CYRILLIC CAPITAL LETTER IO
B4 CYRILLIC CAPITAL LETTER UKRAINIAN IE Ukrainian
B5 FORMS DOUBLE VERTICAL AND LEFT
B6 CYRILLIC CAPITAL LETTER BELORUSSIAN-UKRAINIAN I Ukrainian
B7 CYRILLIC CAPITAL LETTER YI (UKRAINIAN) Ukrainian
B8 FORMS DOUBLE DOWN AND HORIZONTAL
B9 FORMS UP SINGLE AND HORIZONTAL DOUBLE
BA FORMS UP DOUBLE AND HORIZONTAL SINGLE
BB FORMS DOUBLE UP AND HORIZONTAL
BC FORMS VERTICAL SINGLE AND HORIZONTAL DOUBLE
BD CYRILLIC CAPITAL LETTER UKRAINIAN KGE (WITH UPTURN) Ukrainian
BE CYRILLIC CAPITAL LETTER BELORUSSIAN SHORT U Belorussian
BF COPYRIGHT SIGN