Multilingual Support of WWW Applications in Ukraine/CIS
Yuri Demchenko, Kiev Polytechnic Institute (demch@cad.polytech.kiev.ua)
Igor Sinitsyn, Volt Computer Services (a-igors@microsoft.com)
The development of Internet networking infrastructure and the active
creation of Internet information resources in Ukraine demand support
for multiple languages when presenting meaningful information about
the cultural and historical heritage of Ukraine.
Multilingual support of WWW applications has to address several issues.
Many of these problems are now solved by popular browsers such as
Netscape Navigator 3.0, MS Internet Explorer 3.0 and Alis Tango 1.5,
which support different document encodings. Support for different
charsets and codepages of the same document is now provided by newer
HTTP servers and browsers that use the HTTP/1.0 "charset=" parameter
during client-server negotiation to choose the document encoding. In
practice, however, this approach still runs into problems when
documents are retrieved, browsed and stored.
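As a minimal sketch of the client side of this negotiation, the Python
fragment below reads the "charset=" parameter from a Content-Type header
value and uses it to decode the document body. The function names and the
ISO-8859-1 fallback are assumptions made for illustration; they are not
taken from any of the browsers or servers named above.

```python
from email.message import Message


def charset_from_content_type(header_value, default="iso-8859-1"):
    """Return the lower-cased charset parameter of a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # Falls back to `default` (an assumption) when no charset is announced.
    return msg.get_content_charset(failobj=default)


def decode_body(body, header_value):
    """Decode raw document bytes using the charset announced by the server."""
    return body.decode(charset_from_content_type(header_value))


# "Привет" written with escapes so the example does not depend on file encoding.
raw = "\u041f\u0440\u0438\u0432\u0435\u0442".encode("koi8_r")
text = decode_body(raw, "text/html; charset=KOI8-R")
print(text)
```

If the server announces the wrong charset, or none at all, the same bytes
decode to unreadable text, which is one form the retrieval and browsing
problems can take.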
It is common practice in Ukraine to support two (Russian and English)
or three (Ukrainian, Russian and English) languages. As educational
and humanitarian organisations become involved in creating WWW
information resources, use of the Ukrainian language will increase.
But wide use of a Ukrainian charset for Internet exchange faces two
problems: standardisation of the Ukrainian codepage across different
computer platforms (MS DOS, MS Windows, ISO8859, UNIX KOI8 or
Macintosh) and standardisation of keyboard mapping.
The Ukrainian charset encoding known as KOI8-RU, commonly used as the
de facto standard for mail, news exchange and WWW publishing, is not
officially standardised yet. It uses the standardised KOI8-R charset [1]
as its base and adds four Ukrainian characters, "ukr. i", "ukr. yi",
"ukr. ie" and "ukr. ghe (with upturn)" [2, 3], which replace some
pseudographic symbols not used in the common practice of mail and news
exchange. Official standardisation of this Ukrainian net charset is
planned to be considered by a working group of the Committee of
Standardisation of Ukraine (CSU).
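The idea of reusing pseudographic positions can be illustrated with
Python's standard codecs. Python ships no KOI8-RU codec, so the sketch
below uses the closely related KOI8-U codec as a stand-in (an assumption;
its byte assignments are not claimed to match KOI8-RU) and lists the byte
values that decode differently from KOI8-R.

```python
import unicodedata


def differing_bytes(base="koi8_r", extended="koi8_u"):
    """Map each high byte that the two codecs decode differently
    to the pair (character in base, character in extended)."""
    diff = {}
    for value in range(0x80, 0x100):
        raw = bytes([value])
        old = raw.decode(base)
        new = raw.decode(extended)
        if old != new:
            diff[value] = (old, new)
    return diff


# Print every reassigned position with the Unicode names of both characters.
for value, (old, new) in sorted(differing_bytes().items()):
    print(f"0x{value:02X}: {unicodedata.name(old)} -> {unicodedata.name(new)}")
```

Text that uses only Russian letters decodes identically under both
codecs, which is why such an extension can coexist with existing KOI8-R
mail and news traffic.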
In general, it is expedient to extend KOI8-R to support all three
Cyrillic languages of the former Soviet Union (FSU): Russian, Ukrainian
and Belorussian. This charset, KOI8-RUB, could be regarded as a meta
(base) encoding from which other encoding types are derived. The only
problem still to be discussed is support of the full Ukrainian charset
in ISO8859-5, which currently lacks one Ukrainian character, "gapa"
(strong "g").
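The gap in ISO8859-5 can be checked directly. The sketch below, which
assumes Python's standard codecs as stand-ins for the platform codepages
(cp1251 for MS Windows is an assumption of this example), reports which
of the letters discussed here a given codec cannot encode.

```python
# rus. io, ukr. ie, ukr. i, ukr. yi, ukr. gapa, belorus. short u
LETTERS = "\u0451\u0454\u0456\u0457\u0491\u045e"


def missing_letters(codec, letters=LETTERS):
    """Return the letters that the given codec cannot encode."""
    missing = []
    for letter in letters:
        try:
            letter.encode(codec)
        except UnicodeEncodeError:
            missing.append(letter)
    return missing


# Compare coverage across a few codepages.
for codec in ("koi8_r", "iso8859_5", "cp1251"):
    print(codec, missing_letters(codec))
```

A document containing "gapa" therefore cannot be converted to ISO8859-5
without loss, whatever base encoding it is stored in.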
The draft standard for the Ukrainian keyboard now being discussed in
the CSU was prepared without wide consultation with IT experts or
hardware and software developers. It does not allow convenient
switching between the Ukrainian, Russian and English keyboards.
A more efficient solution can be proposed that combines all three
languages, Ukrainian (as the national language), Russian (as the common
CIS language) and English (as the Internet metalanguage), by applying
the recommendations of ISO/IEC CD2 14755 [4]: some non-ASCII characters
are entered by switching standard keys while holding down one of the
control keys (CTRL, ALT or ESC). This approach does not change the
MS Windows Ukrainian keyboard, the de facto standard in Ukraine, and
can be used to input all "missing" (or seldom used) characters:
"rus. io", "ukr. gapa", "belorus. short u".
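A rough sketch of such a modifier-based layout follows. The key
assignments and function names are hypothetical, chosen only for
illustration; neither this paper nor ISO/IEC CD2 14755 is claimed to
prescribe them.

```python
# Hypothetical assignments: base key -> character produced while the
# modifier key is held down.
MODIFIED_KEYS = {
    "\u0435": "\u0451",  # Cyrillic ie -> rus. io
    "\u0433": "\u0491",  # Cyrillic ghe -> ukr. gapa
    "\u0443": "\u045e",  # Cyrillic u -> belorus. short u
}


def translate_key(key, modifier_down=False):
    """Return the character for a key press; keys without a modified
    assignment pass through unchanged."""
    if modifier_down:
        return MODIFIED_KEYS.get(key, key)
    return key


def type_keys(presses):
    """Turn a sequence of (key, modifier_down) pairs into text."""
    return "".join(translate_key(key, mod) for key, mod in presses)


# Modifier + ghe, then a plain Cyrillic a.
print(type_keys([("\u0433", True), ("\u0430", False)]))
```

Because unmodified keys are left untouched, the existing layout keeps
working as before and only the rarely used letters need the extra key.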
This work was substantially advanced by consultation with members of
the Microsoft international test team that tests multi-language
functionality, in particular for the Ukrainian and Belorussian
languages, in Internet products such as IE 3.0 and Internet Mail and
News 1.0.
References
[1] RFC 1489. Registration of a Cyrillic Character Set. - July 1993.
[2] Shevchenko L., Rizun V., Lysenko Yu. Modern Ukrainian Language. - Kiev. - Lybid'. - 1993. - 336 pp.
[3] Nadine Kano. Developing International Software for Windows 95 and Windows NT: a handbook for software design. - Microsoft Press.
[4] Second CD - ISO/IEC CD2 14755 - Input methods to enter characters from the repertoire of ISO/IEC 10646 with a keyboard or other input devices. - http://www-rocq.inria.fr/~deschamp/www/divers/ALB-CD.html