Path: sserve!manuel.anu.edu.au!munnari.oz.au!hp9000.csc.cuhk.hk!saimiri.primate.wisc.edu!sdd.hp.com!swrinde!cs.utexas.edu!uwm.edu!spool.mu.edu!sol.ctr.columbia.edu!ira.uka.de!math.fu-berlin.de!unidui!du9ds3!veit
From: veit@du9ds3 (Holger Veit)
Newsgroups: comp.unix.bsd
Subject: Re: [386BSD] What about localisation?
Date: 9 Dec 92 07:46:27 GMT
Organization: Uni-Duisburg FB9 Datenverarbeitung
Lines: 85
Message-ID: <veit.723887187@du9ds3>
References: <1992Dec7.182103.1799@rdrel.relcom.msk.su> <1992Dec8.214215.24804@fcom.cc.utah.edu>
Reply-To: veit@du9ds3.fb9dv.uni-duisburg.de
NNTP-Posting-Host: du9ds3.fb9dv.uni-duisburg.de

In <1992Dec8.214215.24804@fcom.cc.utah.edu> terry@cs.weber.edu (A Wizard of Earth C) writes:

>In article <1992Dec7.182103.1799@rdrel.relcom.msk.su> sir@rdrel.relcom.msk.su (Sergey I.Ryzhkov) writes:
>>Gentelmens!

>In any case, I would prefer any Unicode standard, however badly implemented,
>to XPG3, which would fail to deal with anything but Western Europe and
>North and South America, in my opinion.

I would also advocate Unicode, or, in its more general representation,
ISO-10646-DIS 1.2. Unicode is basically the first ("basic multilingual")
plane of ISO 10646, and standardization is in progress. ISO 10646 may be
extended to 64K planes with 64K characters each (a real investment in the
future, including all common and uncommon intergalactic languages :-))

>Another thing which requires consideration is a set of standardized
>messages translated into all supported languages through whatever
>localization mechanism we will use for messages in the shell, programs,
>and etc. for perror and family. This will tend to go a long way towards
>usability in an international forum -- and probably constitutes our best
>bet for high return on the effort we invest, guaranteeing at least base
>functionality in supported languages.

This would require a large "resource data base".

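To make the "resource data base" idea a little more concrete, here is a
minimal sketch of a message catalog with a fallback language. It is only
an illustration under stated assumptions: the names CATALOG,
FALLBACK_LANG, lookup_message and perror_localized, the message ids, and
the sample translations are all hypothetical and are not taken from
386BSD or from any existing localization mechanism.

```python
# Hypothetical message catalog: message id -> {language tag -> text}.
# The ids, language tags, and translations are illustrative assumptions.
CATALOG = {
    "ENOENT": {
        "en": "No such file or directory",
        "de": "Datei oder Verzeichnis nicht gefunden",
    },
    "EACCES": {
        "en": "Permission denied",
    },
}

# Language used when no translation exists for the requested one.
FALLBACK_LANG = "en"


def lookup_message(msg_id, lang):
    """Return the text for msg_id in lang, falling back to
    FALLBACK_LANG, and finally to the bare id if nothing is known."""
    translations = CATALOG.get(msg_id, {})
    if lang in translations:
        return translations[lang]
    if FALLBACK_LANG in translations:
        return translations[FALLBACK_LANG]
    return msg_id


def perror_localized(prefix, msg_id, lang):
    """Format a perror-style line: '<prefix>: <localized message>'."""
    return "%s: %s" % (prefix, lookup_message(msg_id, lang))
```

The fallback chain is the design point here: an untranslated message
still yields usable output, so translations could be contributed
language by language without breaking base functionality.
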
>I think that this eventually assumes an X environment and a full Unicode
>"fixed" font; this is ~250K for a 5x8, <1M for a 10x20 (non-default).

Does such a font already exist? 5x8 is probably too small for several
Asian languages; 10x20 looks feasible. I have looked for such a monster
myself, without much success.

>The other fundamental assumption is multibyte data stream to the tty, and
>appropriate localization by the tty itself. This is an easy mod for
>xterm, but requires spanning sets within a given non-PC-ASCII driver
>for (for instance) a downloaded Cyrillic font in a VGA/EGS card. This
>would be, fundamentally, a 16-bit to 8-bit "mapchan".

The console driver should, if not purely ASCII, understand and display all
available characters. There should be no asymmetry between modes; the
"text mode" should behave like a single, fullscreen "xterm" with a fixed
font only. It is not very difficult to make codrv accept and emit 16-bit
chars; the real problems start with the restricted VGA scheme if we
remain in text mode. There are only two font slots available (some SVGAs
understand eight), so constant remapping of characters may occur (BTW: an
80x25 screen may display more different characters than are available in
the font slots). So we need to run the console in graphics mode.

A really bad problem we may encounter is the different direction of
writing. Some languages write right-to-left; handling this requires
"change direction" prefixes, and it is difficult to handle mixed texts
under all circumstances.

>I suggest we attack it in this order:

>1) Pick a standard for encoding (I vote Unicode).
>2) Pick a standard for storage (I vote character set attributed files
>   to avoid stream encoding while maintaining the benefits of 8-bit
>   storage for most languages).
>3) Create an X environment capable of supporting all languages by
>   default (Again, I vote Unicode).
>4) Build some tools for running two character sets simultaneously
>   (requires combination of Anglicized/Localized encoding and
>   data entry mechanisms).
>5) Provide basic error message and prompting translations (requires
>   fluently bilingual volunteers).
>6) Perform a code integration (probably at the 0.2 level, although
>   this may drag on until 0.3).

>Anything else that needs to be handled?

I could agree with this. Volunteers :-) ?

>Unlike most OS products (with a possible exception for NT, which is Unicode
>aware), we have a chance to do this right before the product is too
>mature to let us do things "the right way". We should take the opportunity
>while it still presents itself.

> Terry Lambert

Holger
--
|  |   / Dr. Holger Veit         | INTERNET: veit@du9ds3.fb9dv.uni-duisburg.de
|__|  /  University of Duisburg  | "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
|  | /   Dept. of Electr. Eng.   | Sorry, the above really good fortune has
|  |/    Inst. f. Dataprocessing | been CENSORED because of obscenity"