Received: by minnie.vk1xwt.ampr.org with NNTP id AA6110 ; Mon, 04 Jan 93 23:11:58 EST
Xref: sserve comp.unix.bsd:9674 comp.std.internat:1621
Path: sserve!manuel.anu.edu.au!munnari.oz.au!sgiblab!nec-gw!nec-tyo!wnoc-tyo-news!cs.titech!titccy.cc.titech!necom830!mohta
From: mohta@necom830.cc.titech.ac.jp (Masataka Ohta)
Newsgroups: comp.unix.bsd,comp.std.internat
Subject: Dumb Terry
Keywords: Han Kanji Katakana Hirugana ISO10646 Unicode Codepages
Message-ID: <2637@titccy.cc.titech.ac.jp>
Date: 7 Jan 93 06:41:06 GMT
References: <2615@titccy.cc.titech.ac.jp> <1993Jan5.090747.29232@fcom.cc.utah.edu> <2628@titccy.cc.titech.ac.jp> <1993Jan7.045612.13244@fcom.cc.utah.edu>
Sender: news@titccy.cc.titech.ac.jp
Followup-To: comp.unix.bsd
Organization: Tokyo Institute of Technology
Lines: 118

In article <1993Jan7.045612.13244@fcom.cc.utah.edu>
	terry@cs.weber.edu (A Wizard of Earth C) writes:

>Before I proceed, I will [ once again ] remove the "dumb Americans" from my
>original topic line.

I changed the subject to reflect the content better.

>>>>>This I don't understand.  The maximum translation table from one 16 bit value
>>>>>to another is 16k.

>>>>WHAAAAT? It's 128KB, not 16k.

>It is still a translation of one 16 bit value to another.  It is *not* an
>*arbitrary* translation we are talking about, since the spanning sets will
>be known.

You wrote MAXIMUM.

>>>>>This means 2 16k tables for translation into/out of
>>>>>Unicode for Input/Output devices,

>Sorry; I misspoke (mistyped?) here.

You are dumb.

>I meant to refer to any arbitrary 8-bit
>set for which a localization set is available (example: an ISO 8859-x set).

Do you know what HASHING is? If not, read Knuth.

>Obviously, by this response, you meant "cat two files to a third file" rather
>than what you stated,

You don't have to create a third file, as the output might be piped.

>what you stated, which would have resulted in the files going to the
>screen.
>Display device attribution based on supported character

While you may not know UNIX at all, "cat" has nothing to do with display.
Instead, some device drivers and terminal emulators might.

>Obviously what you are asking is "how do I make two monolingual/bilingual/
>multilingual files of different language attribution into a single bilingual/
>multilingual file using cat" -- not the question as you have phrased it, nor
>as I have answered it, but in the context of the discussion, clearly the
>intended tack.

"How to "cat" files with different attributes" is the classic question
to piss off attribute-lovers, which all UNIX lovers know.

Of course, there are several other reasons not to use file attributes,
which you don't know. But, I'm tired.

>Rather than pretending I don't know what you are getting at,

Then, don't post anymore.

>The answer is "you don't use 'cat'".  The "cat" command does not deal with

OK, say it in comp.unix.misc and see what happens.

>What this means is that all files which are multilingual in nature require
>a compound document architecture.

No thank you. I do want to grep my multilingual files.

>What this means is that a utility to combine documents (let's call it
>"combine") must have the ability to either generate language attributed
>files (if the source files are all of a single language attribution) or
>our default compound document format (TBD).

You are making a simple problem unsolvable.

>The correct approach is to note that since Unicode does not provide a
>mechanism directly for language attribution, and that file attribution
>is only a partial solution,

So, the correct approach is not to use Unicode as it is.

>What this means is that a utility to combine documents (let's call it
>"combine")

Wow!

>Does this answer your "cat" question sufficiently?

Congratulations! You are now prepared to accept the second question.

Under an internationalized environment, we often create a file with a
Japanese name.
At the same time,

1) we might have a file with a Chinese name in the same directory.

2) we might have a file with a Chinese name in a different directory.

3) the Japanese file's full pathname might contain Chinese in one of its
   intermediate directory names.

Could you design a replacement for "ls" for such a situation?

Then, the third:

>Attribution of output and clever construction of our output device drivers
>would even allow us to switch fonts as dictated by the compound document
>architecture controls embedded in the file and/or the attribution of the
>file descriptor (the absence of such attribution being an indicator of a
>compound document).

Given the above situation for "ls", I'm afraid that "argv" to any command
would have to be a compound document. Am I correct? Does it still have
type "char"? Do you think the entire OS is still UNIX?

>The problem seemed to
>be that there was not a means around the problem from your point of view.

Just include language information in the character code, and the problem
disappears.

						Masataka Ohta
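
The 128KB figure disputed near the top of this post can be checked with a
few lines of arithmetic. The sketch below is only an illustration and
assumes a flat, uncompressed lookup table with one fixed-width entry per
possible input value; the function name flat_table_bytes is hypothetical
and does not come from the thread.

```python
# Size of a flat lookup table that maps every possible input code
# to an output code of fixed width.
# Assumption (not from the thread): one entry per input value,
# no hashing, no compression, no sparse ranges.

def flat_table_bytes(input_bits: int = 16, entry_bits: int = 16) -> int:
    """Return the table size in bytes for the given code widths."""
    entries = 2 ** input_bits          # number of distinct input codes
    bytes_per_entry = entry_bits // 8  # width of each stored output code
    return entries * bytes_per_entry

# Arbitrary 16-bit to 16-bit translation: 65536 entries of 2 bytes each.
full = flat_table_bytes(16, 16)
print(full, "bytes =", full // 1024, "KB")   # 131072 bytes = 128 KB

# An 8-bit set mapped into 16-bit codes needs only 256 entries.
small = flat_table_bytes(8, 16)
print(small, "bytes")                        # 512 bytes
```

Under these assumptions the worst-case 16-bit table is 128KB, while a
table from an 8-bit set into 16-bit codes is 512 bytes. Smaller figures
for the 16-bit case require knowing in advance which codes can occur.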