KOI8-U
Language(s): Ukrainian, Russian, Bulgarian
Classification: 8-bit KOI, extended ASCII
Extends: KOI8-B
Based on: KOI8-R
Other related encoding(s): KOI8-RU, KOI8-F

KOI8-U (RFC 2319) is an 8-bit character encoding designed to cover Ukrainian, which uses a Cyrillic alphabet. It is based on KOI8-R, which covers Russian and Bulgarian, but replaces eight box-drawing characters with the four Ukrainian letters Ґ, Є, І, and Ї, each in upper case and lower case.
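The relationship to KOI8-R can be checked with a short sketch. It assumes that Python's built-in `koi8_r` and `koi8_u` codecs reflect the published mappings; the variable names are illustrative only.

```python
# Compare Python's built-in "koi8_r" and "koi8_u" codecs over the upper half
# of the byte range and collect the positions where they disagree.
# Assumption: the standard-library tables match the published mappings.
diffs = {}
for byte in range(0x80, 0x100):
    raw = bytes([byte])
    in_r = raw.decode("koi8_r")
    in_u = raw.decode("koi8_u")
    if in_r != in_u:
        diffs[byte] = (in_r, in_u)

# Print each differing position with both interpretations.
for byte, (in_r, in_u) in sorted(diffs.items()):
    print(
        f"0x{byte:02X}: KOI8-R {in_r!r} (U+{ord(in_r):04X})"
        f" -> KOI8-U {in_u!r} (U+{ord(in_u):04X})"
    )
```

The positions listed should be the box-drawing slots that KOI8-U reassigns to the lower-case and upper-case forms of the four Ukrainian letters.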

KOI8-RU is closely related, but adds Ў for Belarusian. In both, the letter allocations match those in KOI8-E, except for Ґ, which is added to KOI8-F.

In Microsoft Windows, KOI8-U is assigned the code page number 21866. In IBM, KOI8-U is assigned code page/CCSID 1168.[1][2][3]

KOI8 remains much more commonly used than ISO 8859-5, which never really caught on. Another common Cyrillic character encoding is Windows-1251. Both may eventually give way to Unicode.

KOI8 stands for Kod Obmena Informatsiey, 8 bit (Russian: Код Обмена Информацией, 8 бит), which means "Code for Information Exchange, 8 bit".

The KOI8 character sets have the property that the Russian Cyrillic letters are in pseudo-Roman order rather than the natural Cyrillic alphabetical order used in ISO 8859-5. Although this may seem unnatural, it has the useful property that if the eighth bit is stripped, the text can still be read (or at least deciphered) in case-reversed transliteration on an ordinary ASCII terminal. For instance, "Русский Текст" in KOI8-U becomes rUSSKIJ tEKST ("Russian Text") if the 8th bit is stripped.
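The bit-stripping property can be reproduced in a few lines. This is a minimal sketch that assumes Python's built-in `koi8_u` codec; the helper name `strip_high_bit` is hypothetical and chosen only for illustration.

```python
def strip_high_bit(text: str, encoding: str = "koi8_u") -> str:
    """Encode text, clear the eighth bit of every byte, and read the result as ASCII."""
    stripped = bytes(b & 0x7F for b in text.encode(encoding))
    return stripped.decode("ascii")


print(strip_high_bit("Русский Текст"))  # prints: rUSSKIJ tEKST
```

Upper-case Cyrillic letters come out as lower-case Latin letters and vice versa, which is the case reversal described above.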

Character set

The following table shows the KOI8-U encoding.[1][4] Each character is shown with its equivalent Unicode code point.

Although RFC 2319 says that character 0x95 should be U+2219 (∙), it may also be U+2022 (•) to match the bullet character in Windows-1251.

Some references have a typo and incorrectly state that character 0xB4 is U+0403, rather than the correct U+0404. This typo is present in Appendix A of RFC 2319 (but the table in the main text of the RFC gives the correct mapping).
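Both mapping notes can be inspected in a given implementation. The sketch below looks at Python's built-in `koi8_u` and `cp1251` codecs; it shows only what that one implementation does and is not a normative statement about the encoding.

```python
import unicodedata

# Byte 0x95 as interpreted by KOI8-U and by Windows-1251.
bullet_koi8u = b"\x95".decode("koi8_u")
bullet_cp1251 = b"\x95".decode("cp1251")

# Byte 0xB4 in KOI8-U, the position affected by the U+0403 / U+0404 typo.
ukrainian_ie = b"\xb4".decode("koi8_u")

for label, char in [
    ("KOI8-U 0x95", bullet_koi8u),
    ("Windows-1251 0x95", bullet_cp1251),
    ("KOI8-U 0xB4", ukrainian_ie),
]:
    print(f"{label}: U+{ord(char):04X} {unicodedata.name(char)}")
```

A converter that needs the Windows-1251 style bullet would have to remap 0x95 itself after decoding.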

References

  1. "SBCS code page information - CPGID: 01168 / Name: Ukrainian KOI8-U". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. IBM. C-H 3-3220-050. Archived from the original on 2017-02-18.
  2. "CCSID information document; CCSID 1168; KOI8-U". IBM. Archived from the original on 2017-02-18.
  3. International Components for Unicode (ICU), ibm-1168_P100-2002.ucm, 2002-12-03.
  4. Verdy, Philippe; Richter, Helmut (2016-01-04) [2008-10-13]. "KOI8-U.TXT". 2.0.

  This article uses material from the Wikipedia page available here. It is released under the Creative Commons Attribution-Share-Alike License 3.0.
