KOI8-R (RFC 1489) is an 8-bit character encoding that the programmer Andrei Chernov derived from the KOI-8 encoding in 1993. It is designed to cover Russian, which is written in the Russian subset of the Cyrillic script. KOI-8, in turn, is an 8-bit extension of the KOI-7 encoding, which inherited a phonetic correspondence between Russian and Latin letters from the MTK-2 teletype code. As a result, the Russian Cyrillic letters in KOI8-R follow a pseudo-Latin alphabetical order rather than the normal Cyrillic order used in, for example, ISO 8859-5. Although this may seem unnatural, it has a useful effect: if the 8th bit is stripped, the text remains partially readable in any ASCII-based encoding (including KOI8-R itself) as a case-reversed transliteration. For example, "Код для обмена и обработки информации" (the Russian meaning of the "KOI" acronym) becomes "kOD DLQ OBMENA I OBRABOTKI INFORMACII".
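The bit-stripping behaviour described above can be reproduced with a few lines of Python. This is a minimal sketch that relies on the `koi8_r` codec shipped with Python's standard library; the helper name `strip_high_bit` is illustrative and not part of any standard.

```python
def strip_high_bit(text: str) -> str:
    """Encode text as KOI8-R, clear bit 7 of every byte, and read the result as ASCII."""
    raw = text.encode("koi8_r")              # one byte per character
    stripped = bytes(b & 0x7F for b in raw)  # drop the 8th (most significant) bit
    return stripped.decode("ascii")


if __name__ == "__main__":
    # Prints: kOD DLQ OBMENA I OBRABOTKI INFORMACII
    print(strip_high_bit("Код для обмена и обработки информации"))
```

The case reversal arises because the lowercase Cyrillic letters occupy 0xC0 to 0xDF, which map onto the uppercase Latin range 0x40 to 0x5F once the high bit is cleared, while the uppercase Cyrillic letters at 0xE0 to 0xFF map onto the lowercase range 0x60 to 0x7F.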

KOI-8 stands for 8-bitnyy kod dlya obmena i obrabotki informatsii (Russian: 8-битный код для обмена и обработки информации), which means "8-bit code for the exchange and processing of information". Microsoft Windows assigns KOI8-R the code page number 20866, and IBM assigns it code page 878. KOI8-R also happens to cover Bulgarian.

KOI8-R lacks the proper quotation marks for Russian and Bulgarian: both «...» and the Bulgarian „...“. Windows-1251 does support these, as well as more letters, and has thus become more popular. KOI8-R is used by less than 0.004% of websites, mostly Russian and Bulgarian.[citation needed] Unicode and UTF-8 are preferred to single-byte Cyrillic encodings in modern applications; Unicode contains 436 Cyrillic letters, including those for Old Cyrillic.
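The difference in quotation-mark coverage can be checked directly. The following sketch assumes Python's built-in `koi8_r` and `cp1251` codecs; `can_encode` is a hypothetical helper, and the sample strings are chosen only for illustration.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    samples = ["Привет", "«Привет»", "Здравей", "„Здравей“"]
    for sample in samples:
        # Plain words encode in both; the quoted forms fail only in KOI8-R.
        print(sample, can_encode(sample, "koi8_r"), can_encode(sample, "cp1251"))
```

Plain Russian and Bulgarian words pass in both encodings, while the strings wrapped in «...» or „...“ are rejected by KOI8-R and accepted by Windows-1251.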

Character set

The following table shows the KOI8-R encoding. Each character is shown with its equivalent Unicode code point.

KOI8-R
    0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x
1x
2x  SP  !  "  #  $  %  &  '  (  )  *  +  ,  -  .  /
3x  0  1  2  3  4  5  6  7  8  9  :  ;  <  =  >  ?
4x  @  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O
5x  P  Q  R  S  T  U  V  W  X  Y  Z  [  \  ]  ^  _
6x  `  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o
7x  p  q  r  s  t  u  v  w  x  y  z  {  |  }  ~
8x  ─ 2500  │ 2502  ┌ 250C  ┐ 2510  └ 2514  ┘ 2518  ├ 251C  ┤ 2524  ┬ 252C  ┴ 2534  ┼ 253C  ▀ 2580  ▄ 2584  █ 2588  ▌ 258C  ▐ 2590
9x  ░ 2591  ▒ 2592  ▓ 2593  ⌠ 2320  ■ 25A0  ∙ 2219  √ 221A  ≈ 2248  ≤ 2264  ≥ 2265  NBSP 00A0  ⌡ 2321  ° 00B0  ² 00B2  · 00B7  ÷ 00F7
Ax  ═ 2550  ║ 2551  ╒ 2552  ё 0451  ╓ 2553  ╔ 2554  ╕ 2555  ╖ 2556  ╗ 2557  ╘ 2558  ╙ 2559  ╚ 255A  ╛ 255B  ╜ 255C  ╝ 255D  ╞ 255E
Bx  ╟ 255F  ╠ 2560  ╡ 2561  Ё 0401  ╢ 2562  ╣ 2563  ╤ 2564  ╥ 2565  ╦ 2566  ╧ 2567  ╨ 2568  ╩ 2569  ╪ 256A  ╫ 256B  ╬ 256C  © 00A9
Cx  ю 044E  а 0430  б 0431  ц 0446  д 0434  е 0435  ф 0444  г 0433  х 0445  и 0438  й 0439  к 043A  л 043B  м 043C  н 043D  о 043E
Dx  п 043F  я 044F  р 0440  с 0441  т 0442  у 0443  ж 0436  в 0432  ь 044C  ы 044B  з 0437  ш 0448  э 044D  щ 0449  ч 0447  ъ 044A
Ex  Ю 042E  А 0410  Б 0411  Ц 0426  Д 0414  Е 0415  Ф 0424  Г 0413  Х 0425  И 0418  Й 0419  К 041A  Л 041B  М 041C  Н 041D  О 041E
Fx  П 041F  Я 042F  Р 0420  С 0421  Т 0422  У 0423  Ж 0416  В 0412  Ь 042C  Ы 042B  З 0417  Ш 0428  Э 042D  Щ 0429  Ч 0427  Ъ 042A
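Individual rows of the table can be regenerated programmatically. The sketch below assumes Python 3.9 or later and its built-in `koi8_r` codec; the function name `koi8r_row` is hypothetical and used here only for illustration.

```python
def koi8r_row(high_nibble: int) -> list[tuple[int, str, str]]:
    """Return (byte value, character, Unicode code point in hex) for one table row."""
    cells = []
    for low_nibble in range(16):
        byte = (high_nibble << 4) | low_nibble
        char = bytes([byte]).decode("koi8_r")  # single byte -> single character
        cells.append((byte, char, f"{ord(char):04X}"))
    return cells


if __name__ == "__main__":
    # Row Cx starts with the lowercase letter ю at byte 0xC0 (U+044E).
    for byte, char, code_point in koi8r_row(0xC):
        print(f"{byte:02X}  {char}  {code_point}")
```

Calling `koi8r_row(0x8)` through `koi8r_row(0xF)` walks the upper half of the table; rows 0x2 through 0x7 coincide with ASCII.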
