Received: by minnie.vk1xwt.ampr.org with NNTP id AA5982 ; Sat, 02 Jan 93 13:02:10 EST
Newsgroups: comp.unix.bsd
Path: sserve!manuel.anu.edu.au!munnari.oz.au!spool.mu.edu!caen!hellgate.utah.edu!fcom.cc.utah.edu!cs.weber.edu!terry
From: terry@cs.weber.edu (A Wizard of Earth C)
Subject: Re: INTERNATIONALIZATION: JAPAN, FAR EAST
Message-ID: <1993Jan5.093059.29631@fcom.cc.utah.edu>
Sender: news@fcom.cc.utah.edu
Organization: Weber State University (Ogden, UT)
References: <2565@titccy.cc.titech.ac.jp> <1992Dec28.064029.24421@fcom.cc.utah.edu> <2616@titccy.cc.titech.ac.jp>
Date: Tue, 5 Jan 93 09:30:59 GMT
Lines: 106

In article <2616@titccy.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>In article <1992Dec28.064029.24421@fcom.cc.utah.edu>
>  terry@cs.weber.edu (A Wizard of Earth C) writes:
>
>>|> True. But, it should be noted that they don't fit even in 16 bits.
>>
>>Work is already under way to adapt Unicode to 32 bits. I would be interested
>>in any similar work you know of in progress for XPG4/JIS.
>
>Anyway, a program written for 16 bit Unicode can not be usable with 32 bit
>Unicode.

Not true. Attribution of files/users/compound data within the files/etc.
allows easy identification of version changes.

>>I am *not*
>>interested in proposing or attempting to provide yet another standard, if
>>that is what you believe is necessary.
>
>Then, why you are interested in using yet non-existent standard?

Unicode has been codified. It exists:

        The Unicode Standard
        Worldwide Character Encoding
        Version 1.0, Volume 1
        _The Unicode Consortium_
        Addison-Wesley Publishing Company, Inc.
        ISBN 0-201-56788-1

        The Unicode Standard
        Worldwide Character Encoding
        Version 1.0, Volume 2
        _The Unicode Consortium_
        Addison-Wesley Publishing Company, Inc.
        ISBN 0-201-60845-6

>BTW, can you explain what XPG4 is?

The internationalization mechanism following XPG3, the SVR4.2 standard for
internationalization. XPG4 is XPG3 with East Asian language support.
Standards documents are currently available, but a reference implementation
(to the best of my knowledge) is not. I can look up and post the publication
information if you are truly interested.

>>Again, I want to stress that we are about the identification and adoption
>>of an existing standard rather than the specification and ratification of
>>a new one.
>
>Then, the only standard available now for internationalization is ISO 2022.
>
>It can, at least, differentiate Chinese and Japanese character.
>
>Do you want to use it?

ISO 2022 places the unacceptable burden of Runic encoding on monolingual
environments (post localization). While it is an "OK" standard for
internationalization (multinationalization, really, since it deals with the
concept of multilingual documents directly), the penalties of a change in
apparent environment for the purely localized user are unacceptable, since
the purely localized user is the majority case.

The differentiation of Chinese and Japanese characters is the job of the
input mechanism, which would, in any case, be required to change between the
job of inputting Chinese and the job of inputting Japanese. This switch is an
acceptable tagging mechanism for multilingual use. Tagging for monolingual
use can be done on a per-system or per-user basis, since it is unlikely that
all localization databases for each language (message catalogs, etc.) will be
kept around.

As an English-speaker only (hypothetically speaking, since I can get along in
Japanese, German, Latin, and Spanish and have a very little Greek, Gaelic,
French and Swahili on the side), I am unlikely to localize my machine to
anything other than English, and the only thing served by carrying around
localization sets for 20 other languages is my disk drive vendor.

The primary use for an internationalization mechanism will be localization;
anything on top of that (and yes, we can build multilingual applications on
top of that with little effort) is gravy.

>>We may invoke "tricks" to reduce storage requirements or to
>>retrofit existing input mechanisms, but we are not attempting a new standard.
>
>Again, existing input mechanism for Japanese is so large that even very
>complex tranlation does not affect its performance.

All the more reason to not quibble about storage mechanisms and get down to
the job of coding a reference implementation.

Unicode is a storage mechanism; it does *not* dictate the display font any
more than a plain text file dictates the default PostScript font that will be
used when you type "lpr". Both input and display are functions of the
localization (or multinationalization) mechanisms we choose to employ on top
of the storage mechanism.

                                        Terry Lambert
                                        terry@icarus.weber.edu
                                        terry_lambert@novell.com
---
Any opinions in this posting are my own and not those of my present
or previous employers.
--
-------------------------------------------------------------------------------
                                        "I have an 8 user poetic license" - me
 Get the 386bsd FAQ from agate.berkeley.edu:/pub/386BSD/386bsd-0.1/unofficial
-------------------------------------------------------------------------------