Xref: sserve comp.os.386bsd.bugs:1671 alt.sources:6578
Newsgroups: comp.os.386bsd.bugs,alt.sources
Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!agate!spool.mu.edu!torn!newshub.ccs.yorku.ca!oz
From: oz@ursa.sis.yorku.ca (Ozan S. Yigit)
Subject: a much improved version of pd/bsd m4.
Message-ID: <OZ.93Oct29000532@ursa.sis.yorku.ca>
Sender: news@newshub.ccs.yorku.ca (USENET News System)
Organization: York U. Student Information Systems Project
Date: Fri, 29 Oct 1993 05:05:32 GMT
Lines: 3295

here is a much improved, 8-bit-clean version of the pd m4 which
should work with all versions of sendmail cfs. this is the base
version for a new release that is in preparation. I would like to
thank Richard A. O'Keefe for all his support.

any comments, bug reports [+/-fixes] and other improvements would
be appreciated.

... oz
---
this upper bound on 24 hours in one  | electric: oz@sis.yorku.ca
day gets to me.... -- Udi Manber     | or [416] 736 2100 x 33976
----------------------------------------------------------------------
# to unbundle, sh this file
echo expr.c 1>&2
sed 's/.//' >expr.c <<'//GO.SYSIN DD expr.c'
-/*  File   : expr.c
-    Authors: Mike Lutz & Bob Harper
-    Editors: Ozan Yigit & Richard A. O'Keefe
-    Updated: %G%
-    Purpose: arithmetic expression evaluator.
-
-    expr() performs a standard recursive descent parse to evaluate any
-    expression permitted by the following grammar:
-
-      expr    :       query EOS
-      query   :       lor
-              |       lor "?" query ":" query
-      lor     :       land { "||" land }      or OR,  for Pascal
-      land    :       bor { "&&" bor }        or AND, for Pascal
-      bor     :       bxor { "|" bxor }
-      bxor    :       band { "^" band }
-      band    :       eql { "&" eql }
-      eql     :       relat { eqrel relat }
-      relat   :       shift { rel shift }
-      shift   :       primary { shop primary }
-      primary :       term { addop term }
-      term    :       unary { mulop unary }
-      unary   :       factor
-              |       unop unary
-      factor  :       constant
-              |       "(" query ")"
-      constant:       num
-              |       "'" CHAR "'"            or '"' CHAR '"'
-      num     :       DIGIT                   full ANSI C syntax
-              |       DIGIT num
-      shop    :       "<<"
-              |       ">>"
-      eqrel   :       "="
-              |       "=="
-              |       "!="
-      rel     :       "<"                     or <>, Pascal not-equal
-              |       ">"
-              |       "<="                    or =<, for Prolog users.
-              |       ">="
-
-    This expression evaluator was lifted from a public-domain
-    C Pre-Processor included with the DECUS C Compiler distribution.
-    It has been hacked somewhat to be suitable for m4.
-
-    26-Mar-1993     Changed to work in any of EBCDIC, ASCII, DEC MNCS,
-                    or ISO 8859/n.
-
-    26-Mar-1993     Changed to use "long int" rather than int, so that
-                    we get the same 32-bit arithmetic on a PC as on a Sun.
-                    It isn't fully portable, of course, but then on a
-                    64-bit machine we _want_ 64-bit arithmetic...
-                    Shifting rewritten (using LONG_BIT) to give signed
-                    shifts even when (long) >> (long) is unsigned.
-
-    26-Mar-1993     I finally got sick of the fact that &&, ||, and ?:
-                    don't do conditional evaluation.  What is the good
-                    of having eval(0&&(1/0)) crash and dump core?  Now
-                    every function has a doit? argument.
-
-    26-Mar-1993     charcon() didn't actually accept 'abcd', which it
-                    should have.  Fixed it.
-
-    20-Apr-1993     eval(1/0) and eval(1%0) dumped core and crashed.
-                    This is also true of the System V r 3.2 m4, but
-                    it isn't good enough for ours!  Changed it so that
-                    x % 0 => x      as per Concrete Mathematics
-                    x / 0 => error and return 0 from expr().
-*/
-
-#define FALSE   0
-#define TRUE    1
-
-#include <stdio.h>
-#include <setjmp.h>
-static jmp_buf expjump;         /* Error exit point for expr() */
-
-static unsigned char *nxtchr;   /* Parser scan pointer */
-
-#define deblank0        while ((unsigned)(*nxtchr-1) < ' ') nxtchr++
-#define deblank1        while ((unsigned)(*++nxtchr - 1) < ' ')
-#define deblank2        nxtchr++; deblank1
-
-#include "ourlims.h"
-static char digval[1+UCHAR_MAX];
-
-/*  This file should work in any C implementation that doesn't have too
-    many characters to fit in one table.  We use a table to convert
-    (unsigned) characters to numeric codes:
-         0 to  9        for '0' to '9'
-        10 to 35        for 'a' to 'z'
-        10 to 35        for 'A' to 'Z'
-        36              for '_'
-    Instead of asking whether tolower(c) == 'a' we ask whether
-    digval[c] == DIGIT_A, and so on.  This essentially duplicates the
-    chtype[] table in main.c; we should use just one table.
-*/
-#define DIGIT_A 10
-#define DIGIT_B 11
-#define DIGIT_C 12
-#define DIGIT_D 13
-#define DIGIT_E 14
-#define DIGIT_F 15
-#define DIGIT_G 16
-#define DIGIT_H 17
-#define DIGIT_I 18
-#define DIGIT_J 19
-#define DIGIT_K 20
-#define DIGIT_L 21
-#define DIGIT_M 22
-#define DIGIT_N 23
-#define DIGIT_O 24
-#define DIGIT_P 25
-#define DIGIT_Q 26
-#define DIGIT_R 27
-#define DIGIT_S 28
-#define DIGIT_T 29
-#define DIGIT_U 30
-#define DIGIT_V 31
-#define DIGIT_W 32
-#define DIGIT_X 33
-#define DIGIT_Y 34
-#define DIGIT_Z 35
-
-
-#ifdef  __STDC__
-static long int query(int);
-#else
-static long int query();
-#endif
-
-
-/*  experr(msg)
-    prints an error message, resets environment to expr(), and
-    forces expr() to return FALSE.
-*/
-void experr(msg)
-    char *msg;
-    {
-        (void) fprintf(stderr, "m4: %s\n", msg);
-        longjmp(expjump, -1);           /* Force expr() to return FALSE */
-    }
-
-
-/*  <numcon> ::= '0x' <hex> | '0X' <hex> | '0' <oct> | <dec>
-    For ANSI C, an integer may be followed by u, l, ul, or lu,
-    in any mix of cases.  We accept and ignore those letters;
-    all the numbers are treated as long.
-*/
-static long int numcon(doit)
-    int doit;
-    {
-        register long int v;    /* current value */
-        register int b;         /* base (radix) */
-        register int c;         /* character or digit value */
-
-        if (!doit) {
-            do nxtchr++; while (digval[*nxtchr] <= 36);
-            deblank0;
-            return 0;
-        }
-
-        v = digval[*nxtchr++];  /* We already know it's a digit */
-        if (v != 0) {
-            b = 10;             /* decimal number */
-        } else
-        if (digval[*nxtchr] == DIGIT_X) {
-            nxtchr++;
-            b = 16;             /* hexadecimal number */
-        } else {
-            b = 8;              /* octal number */
-        }
-        do {
-            while (digval[c = *nxtchr++] < b) v = v*b + digval[c];
-        } while (c == '_');
-        while (digval[c] == DIGIT_L || digval[c] == DIGIT_U) c = *nxtchr++;
-        nxtchr--;               /* unread c */
-        if ((unsigned)(c-1) < ' ') { deblank1; }
-        return v;
-    }
-
-
-/*  <charcon> ::= <qt> { <char> } <qt>
-    Note: multibyte constants are accepted.
-    Note: BEL (\a) and ESC (\e) have the same values in EBCDIC and ASCII.
-*/
-static long int charcon(doit)
-    int doit;
-    {
-        register int i;
-        long int value;
-        register int c;
-        int q;
-        int v[sizeof value];
-
-        q = *nxtchr++;          /* the quote character */
-        for (i = 0; ; i++) {
-            c = *nxtchr++;
-            if (c == q) {       /* end of literal, or doubled quote */
-                if (*nxtchr != c) break;
-                nxtchr++;       /* doubled quote stands for one quote */
-            }
-            if (i == sizeof value) experr("Unterminated character constant");
-            if (c == '\\') {
-                switch (c = *nxtchr++) {
-                    case '0': case '1': case '2': case '3':
-                    case '4': case '5': case '6': case '7':
-                        c -= '0';
-                        if ((unsigned)(*nxtchr - '0') < 8)
-                            c = (c << 3) | (*nxtchr++ - '0');
-                        if ((unsigned)(*nxtchr - '0') < 8)
-                            c = (c << 3) | (*nxtchr++ - '0');
-                        break;
-                    case 'n': case 'N': c = '\n'; break;
-                    case 'r': case 'R': c = '\r'; break;
-                    case 't': case 'T': c = '\t'; break;
-                    case 'b': case 'B': c = '\b'; break;
-                    case 'f': case 'F': c = '\f'; break;
-                    case 'a': case 'A': c = 007;  break;
-                    case 'e': case 'E': c = 033;  break;
-#if     ' ' == 64
-                    case 'd': case 'D': c = 045;  break; /*EBCDIC DEL */
-#else
-                    case 'd': case 'D': c = 127;  break; /* ASCII DEL */
-#endif
-                    default :           break;
-                }
-            }
-            v[i] = c;
-        }
-        deblank0;
-        if (!doit) return 0;
-        for (value = 0; --i >= 0; ) value = (value << CHAR_BIT) | v[i];
-        return value;
-    }
-
-
-/*  <unary> ::= <unop> <unary> | <factor>
-    <unop> ::= '!' | '~' | '-'
-    <factor> ::= '(' <query> ')' | <'> <char> <'> | <"> <char> <"> | <num>
-*/
-static long int unary(doit)
-    int doit;
-    {
-        long int v;
-
-        switch (nxtchr[0]) {
-            case 'n': case 'N':
-                        if (digval[nxtchr[1]] != DIGIT_O
-                        ||  digval[nxtchr[2]] != DIGIT_T)
-                            experr("Bad 'not'");
-                        nxtchr += 2;
-                        /*FALLTHROUGH*/
-            case '!':   deblank1; return !unary(doit);
-            case '~':   deblank1; return ~unary(doit);
-            case '-':   deblank1; return -unary(doit);
-            case '+':   deblank1; return  unary(doit);
-            case '(':   deblank1; v = query(doit);
-                        if (nxtchr[0] != ')') experr("Bad factor");
-                        deblank1; return v;
-            case '\'':
-            case '\"':  return charcon(doit);
-            case '0': case '1': case '2':
-            case '3': case '4': case '5':
-            case '6': case '7': case '8':
-            case '9':   return numcon(doit);
-            default :   experr("Bad constant");
-        }
-        return 0;       /*NOTREACHED*/
-    }
-
-
-/*  <term> ::= <unary> { <mulop> <unary> }
-    <mulop> ::= '*' | '/' | '%'
-*/
-static long int term(doit)
-    int doit;
-    {
-        register long int vl, vr;
-
-        vl = unary(doit);
-        for (;;)
-            switch (nxtchr[0]) {
-                case '*':
-                    deblank1;
-                    vr = unary(doit);
-                    if (doit) vl *= vr;
-                    break;
-                case 'd': case 'D':
-                    if (digval[nxtchr[1]] != DIGIT_I
-                    ||  digval[nxtchr[2]] != DIGIT_V)
-                        experr("Bad 'div'");
-                    nxtchr += 2;
-                    /*FALLTHROUGH*/
-                case '/':
-                    deblank1;
-                    vr = unary(doit);
-                    if (doit) {
-                        if (vr == 0) experr("Division by 0");
-                        vl /= vr;
-                    }
-                    break;
-                case 'm': case 'M':
-                    if (digval[nxtchr[1]] != DIGIT_O
-                    ||  digval[nxtchr[2]] != DIGIT_D)
-                        experr("Bad 'mod'");
-                    nxtchr += 2;
-                    /*FALLTHROUGH*/
-                case '%':
-                    deblank1;
-                    vr = unary(doit);
-                    if (doit) {
-                        if (vr != 0) vl %= vr;
-                    }
-                    break;
-                default:
-                    return vl;
-            }
-        /*NOTREACHED*/
-    }
-
-/*  <primary> ::= <term> { <addop> <term> }
-    <addop> ::= '+' | '-'
-*/
-static long int primary(doit)
-    int doit;
-    {
-        register long int vl;
-
-        vl = term(doit);
-        for (;;)
-            if (nxtchr[0] == '+') {
-                deblank1;
-                if (doit) vl += term(doit); else (void)term(doit);
-            } else
-            if (nxtchr[0] == '-') {
-                deblank1;
-                if (doit) vl -= term(doit); else (void)term(doit);
-            } else
-                return vl;
-        /*NOTREACHED*/
-    }
-
-
-/*  <shift> ::= <primary> { <shop> <primary> }
-    <shop> ::= '<<' | '>>'
-*/
-static long int shift(doit)
-    int doit;
-    {
-        register long int vl, vr;
-
-        vl = primary(doit);
-        for (;;) {
-            if (nxtchr[0] == '<' && nxtchr[1] == '<') {
-                deblank2;
-                vr = primary(doit);
-            } else
-            if (nxtchr[0] == '>' && nxtchr[1] == '>') {
-                deblank2;
-                vr = -primary(doit);
-            } else {
-                return vl;
-            }
-            /* The following code implements shifts portably */
-            /* Shifts are signed shifts, and the shift count */
-            /* acts like repeated one-bit shifts, not modulo anything */
-            if (doit) {
-                if (vr >= LONG_BIT) {
-                    vl = 0;
-                } else
-                if (vr <= -LONG_BIT) {
-                    vl = -(vl < 0);
-                } else
-                if (vr > 0) {
-                    vl <<= vr;
-                } else
-                if (vr < 0) {
-                    vl = (vl >> -vr) | (-(vl < 0) << (LONG_BIT + vr));
-                }
-            }
-        }
-        /*NOTREACHED*/
-    }
-
-
-/*  <relat> ::= <shift> { <rel> <shift> }
-    <rel> ::= '<=' | '>=' | '=<' | '=>' | '<' | '>'
-    Here I rely on the fact that '<<' and '>>' are swallowed by <shift>
-*/
-static long int relat(doit)
-    int doit;
-    {
-        register long int vl;
-
-        vl = shift(doit);
-        for (;;)
-            switch (nxtchr[0]) {
-                case '=':
-                    switch (nxtchr[1]) {
-                        case '<':                       /* =<, take as <= */
-                            deblank2;
-                            vl = vl <= shift(doit);
-                            break;
-                        case '>':                       /* =>, take as >= */
-                            deblank2;
-                            vl = vl >= shift(doit);
-                            break;
-                        default:                        /* == or =; OOPS */
-                            return vl;
-                    }
-                    break;
-                case '<':
-                    if (nxtchr[1] == '=') {             /* <= */
-                        deblank2;
-                        vl = vl <= shift(doit);
-                    } else
-                    if (nxtchr[1] == '>') {             /* <> (Pascal) */
-                        deblank2;
-                        vl = vl != shift(doit);
-                    } else {                            /* < */
-                        deblank1;
-                        vl = vl < shift(doit);
-                    }
-                    break;
-                case '>':
-                    if (nxtchr[1] == '=') {             /* >= */
-                        deblank2;
-                        vl = vl >= shift(doit);
-                    } else {                            /* > */
-                        deblank1;
-                        vl = vl > shift(doit);
-                    }
-                    break;
-                default:
-                    return vl;
-            }
-        /*NOTREACHED*/
-    }
-
-
-/*  <eql> ::= <relat> { <eqrel> <relat> }
-    <eqrel> ::= '!=' | '==' | '='
-*/
-static long int eql(doit)
-    int doit;
-    {
-        register long int vl;
-
-        vl = relat(doit);
-        for (;;)
-            if (nxtchr[0] == '!' && nxtchr[1] == '=') {
-                deblank2;
-                vl = vl != relat(doit);
-            } else
-            if (nxtchr[0] == '=' && nxtchr[1] == '=') {
-                deblank2;
-                vl = vl == relat(doit);
-            } else
-            if (nxtchr[0] == '=') {
-                deblank1;
-                vl = vl == relat(doit);
-            } else
-                return vl;
-        /*NOTREACHED*/
-    }
-
-
-/*  <band> ::= <eql> { '&' <eql> }
-*/
-static long int band(doit)
-    int doit;
-    {
-        register long int vl;
-
-        vl = eql(doit);
-        while (nxtchr[0] == '&' && nxtchr[1] != '&') {
-            deblank1;
-            if (doit) vl &= eql(doit); else (void)eql(doit);
-        }
-        return vl;
-    }
-
-
-/*  <bxor> ::= <band> { '^' <band> }
-*/
-static long int bxor(doit)
-    int doit;
-    {
-        register long int vl;
-
-        vl = band(doit);
-        while (nxtchr[0] == '^') {
-            deblank1;
-            if (doit) vl ^= band(doit); else (void)band(doit);
-        }
-        return vl;
-    }
-
-
-/*  <bor> ::= <bxor> { '|' <bxor> }
-*/
-static long int bor(doit)
-    int doit;
-    {
-        register long int vl;
-
-        vl = bxor(doit);
-        while (nxtchr[0] == '|' && nxtchr[1] != '|') {
-            deblank1;
-            if (doit) vl |= bxor(doit); else (void)bxor(doit);
-        }
-        return vl;
-    }
-
-
-/*  <land> ::= <bor> { '&&' <bor> }
-*/
-static long int land(doit)
-    int doit;
-    {
-        register long int vl;
-
-        vl = bor(doit);
-        for (;;) {
-            if (nxtchr[0] == '&') {
-                if (nxtchr[1] != '&') break;
-                deblank2;
-            } else
-            if (digval[nxtchr[0]] == DIGIT_A) {
-                if (digval[nxtchr[1]] != DIGIT_N) break;
-                if (digval[nxtchr[2]] != DIGIT_D) break;
-                nxtchr += 2; deblank1;
-            } else {
-                /* neither && nor and */
-                break;
-            }
-            /* a skipped right operand returns junk, so mask it with */
-            /* the left operand: a false left side must stay false */
-            vl = (bor(doit && vl) != 0) & (vl != 0);
-        }
-        return vl;
-    }
-
-
-/*  <lor> ::= <land> { '||' <land> }
-*/
-static long int lor(doit)
-    int doit;
-    {
-        register long int vl;
-
-        vl = land(doit);
-        for (;;) {
-            if (nxtchr[0] == '|') {
-                if (nxtchr[1] != '|') break;
-            } else
-            if (digval[nxtchr[0]] == DIGIT_O) {
-                if (digval[nxtchr[1]] != DIGIT_R) break;
-            } else {
-                /* neither || nor or */
-                break;
-            }
-            deblank2;
-            /* a skipped right operand returns junk, so combine it with */
-            /* the left operand: a true left side must stay true */
-            vl = (land(doit && !vl) != 0) | (vl != 0);
-        }
-        return vl;
-    }
-
-
-/*  <query> ::= <lor> [ '?' <query> ':' <query> ]
-*/
-static long int query(doit)
-    int doit;
-    {
-        register long int bool, true_val, false_val;
-
-        bool = lor(doit);
-        if (*nxtchr != '?') return bool;
-        deblank1;
-        true_val = query(doit && bool);
-        if (*nxtchr != ':') experr("Bad query");
-        deblank1;
-        false_val = query(doit && !bool);
-        return bool ? true_val : false_val;
-    }
-
-
-static void initialise_digval()
-    {
-        register unsigned char *s;
-        register int c;
-
-        for (c = 0; c <= UCHAR_MAX; c++) digval[c] = 99;
-        for (c =  0, s = (unsigned char *)"0123456789";
-        /*while*/ *s;
-        /*doing*/ digval[*s++] = c++) /* skip */;
-        for (c = 10, s = (unsigned char *)"ABCDEFGHIJKLMNOPQRSTUVWXYZ";
-        /*while*/ *s;
-        /*doing*/ digval[*s++] = c++) /* skip */;
-        for (c = 10, s = (unsigned char *)"abcdefghijklmnopqrstuvwxyz";
-        /*while*/ *s;
-        /*doing*/ digval[*s++] = c++) /* skip */;
-        digval['_'] = 36;
-    }
-
-
-long int expr(expbuf)
-    char *expbuf;
-    {
-        register long int rval; /* long, so the result is not truncated */
-
-        if (digval['1'] == 0) initialise_digval();
-        nxtchr = (unsigned char *)expbuf;
-        deblank0;
-        if (setjmp(expjump) != 0) return FALSE;
-        rval = query(TRUE);
-        if (*nxtchr) experr("Ill-formed expression");
-        return rval;
-    }
-
//GO.SYSIN DD expr.c
echo int2str.c 1>&2
sed 's/.//' >int2str.c <<'//GO.SYSIN DD int2str.c'
-/*  File   : int2str.c
-    Author : Richard A. O'Keefe
-    Updated: 6 February 1993
-    Defines: int2str()
-
-    int2str(dst, radix, val)
-    converts the (long) integer "val" to character form and moves it to
-    the destination string "dst" followed by a terminating NUL.  The
-    result is normally a pointer to this NUL character, but if the radix
-    is dud the result will be NullS and nothing will be changed.
-
-    If radix is -2..-36, val is taken to be SIGNED.
-    If radix is  2.. 36, val is taken to be UNSIGNED.
-    That is, val is signed if and only if radix is.  You will normally
-    use radix -10 only through itoa and ltoa, for radix 2, 8, or 16
-    unsigned is what you generally want.
-*/
-
-static char dig_vec[] =
-    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
-
-
-char *int2str(dst, radix, val)
-    register char *dst;
-    register int radix;
-    register long val;
-    {
-        char buffer[65];        /* Ready for 64-bit machines */
-        register char *p;
-
-        if (radix < 2 || radix > 36) {  /* Not 2..36 */
-            if (radix > -2 || radix < -36) return (char *)0;
-            if (val < 0) {
-                *dst++ = '-';
-                val = -val;
-            }
-            radix = -radix;
-        }
-        /*  The slightly contorted code which follows is due to the
-            fact that few machines directly support unsigned long / and %.
-            Certainly the VAX C compiler generates a subroutine call.  In
-            the interests of efficiency (hollow laugh) I let this happen
-            for the first digit only; after that "val" will be in range so
-            that signed integer division will do.  Sorry 'bout that.
-            CHECK THE CODE PRODUCED BY YOUR C COMPILER.  The first % and /
-            should be unsigned, the second % and / signed, but C compilers
-            tend to be extraordinarily sensitive to minor details of style.
-            This works on a VAX, that's all I claim for it.
-        */
-        p = &buffer[sizeof buffer];
-        *--p = '\0';
-        *--p = dig_vec[(unsigned long)val%(unsigned long)radix];
-        val = (unsigned long)val/(unsigned long)radix;
-        while (val != 0) *--p = dig_vec[val%radix], val /= radix;
-        while (*dst++ = *p++) ;
-        return dst-1;
-    }
-
//GO.SYSIN DD int2str.c
echo look.c 1>&2
sed 's/.//' >look.c <<'//GO.SYSIN DD look.c'
-/*  File   : look.c
-    Author : Ozan Yigit
-    Updated: 4 May 1992
-    Purpose: Hash table for M4
-*/
-
-#include "mdef.h"
-#include "extr.h"
-
-ndptr hashtab[HASHSIZE];
-
-
-/*
- * hash - get a hash value for string s
- */
-int
-hash(name)
-char *name;
-{
-        register unsigned long h = 0;
-
-        while (*name)
-                h = (h << 5) + h + *name++;
-
-        return h % HASHSIZE;
-}
-
-/*
- * lookup(name) - find name in the hash table
- */
-ndptr lookup(name)
-    char *name;
-    {
-        register ndptr p;
-
-        for (p = hashtab[hash(name)]; p != nil; p = p->nxtptr)
-            if (strcmp(name, p->name) == 0)
-                break;
-        return p;
-    }
-
-/*
- * addent(name) - hash and create an entry in the hash table.
- * The new entry is added at the front of a hash bucket.
- * BEWARE: the type and defn fields are UNDEFINED.
- */
-ndptr addent(name)
-    char *name;
-    {
-        register ndptr p, *h;
-
-        p = (ndptr)malloc(sizeof *p);
-        if (p == NULL) error("m4: no more memory.");
-        h = &hashtab[hash(name)];
-        p->name = strsave(name);
-        p->defn = null;
-        p->nxtptr = *h;
-        *h = p;
-        return p;
-    }
-
-
-/*
- * addkywd(name, type) - stores a keyword in the hash table.
- */
-void addkywd(name, type)
-    char *name;
-    int type;
-    {
-        register ndptr p = addent(name);
-        p->type = type | STATIC;
-    }
-
-
-/*
- * remhash(name, all)
- * remove one entry (all==0) or all entries (all!=0) for a given name
- * from the hash table.  All hash table entries must have been obtained
- * from malloc(), so it is safe to free the records themselves.
- * However, the ->name and ->defn fields might point to storage which
- * was obtained from strsave() -- in which case they may be freed -- or
- * to static storage -- in which case they must not be freed.  If the
- * STATIC bit is set, the fields are not to be freed.
- */
-void remhash(name, all)
-    char *name;
-    int all;
-    {
-        register ndptr p, *h;
-        /* h always points to the pointer to p */
-
-        h = &hashtab[hash(name)];
-        while ((p = *h) != nil) {
-            if (strcmp(p->name, name) == 0) {
-                *h = p->nxtptr;                 /* delink this record */
-                if (!(p->type & STATIC)) {      /* free the name and defn */
-                    free(p->name);              /* if they came from strsave */
-                    if (p->defn != null) free(p->defn);
-                }                               /* otherwise leave them */
-                free(p);                        /* free the record itself */
-                if (!all) return;               /* first occurrence has gone */
-            } else {
-                h = &(p->nxtptr);
-            }
-        }
-    }
-
//GO.SYSIN DD look.c
echo main.c 1>&2
sed 's/.//' >main.c <<'//GO.SYSIN DD main.c'
-/*  File   : main.c
-    Author : Ozan Yigit
-    Updated: 4 May 1992
-    Defines: M4 macro processor.
-*/
-
-#include "mdef.h"
-#include "extr.h"
-#include "ourlims.h"
-
-char chtype[1 - EOF + UCHAR_MAX];
-
-#define is_sym1(c) (chtype[(c)-EOF] > 10)
-#define is_sym2(c) (chtype[(c)-EOF] >  0)
-#define is_blnk(c) ((unsigned)((c)-1) < ' ')
-
-/*
- * m4 - macro processor
- *
- * PD m4 is based on the macro tool distributed with the software
- * tools (VOS) package, and described in the "SOFTWARE TOOLS" and
- * "SOFTWARE TOOLS IN PASCAL" books. It has been expanded to include
- * most of the command set of SysV m4, the standard UN*X macro processor.
- *
- * Since both PD m4 and UN*X m4 are based on SOFTWARE TOOLS macro,
- * there may be certain implementation similarities between
- * the two. The PD m4 was produced without ANY references to m4
- * sources.
- *
- * References:
- *
- *      Software Tools distribution: macro
- *
- *      Kernighan, Brian W. and P. J. Plauger, SOFTWARE
- *      TOOLS IN PASCAL, Addison-Wesley, Mass. 1981
- *
- *      Kernighan, Brian W. and P. J. Plauger, SOFTWARE
- *      TOOLS, Addison-Wesley, Mass. 1976
- *
- *      Kernighan, Brian W. and Dennis M. Ritchie,
- *      THE M4 MACRO PROCESSOR, Unix Programmer's Manual,
- *      Seventh Edition, Vol. 2, Bell Telephone Labs, 1979
- *
- *      System V man page for M4
- *
- * Modification History:
- *
- * Mar 26 1992 RAOK 1.  Eliminated magic numbers 8, 255, 256 in favour
- *                      of the standard limits CHAR_BIT, UCHAR_MAX, which
- *                      are in the new header ourlims.h.  This is part of
- *                      the "8-bit-clean M4" project.  To the best of my
- *                      belief, all of the code should work in EBCDIC,
- *                      ASCII, DEC MNCS, ISO 8859/n, or the Mac character
- *                      set, as long as chars are unsigned.  There are
- *                      still some places where signed bytes can cause
- *                      trouble.
- *
- *                  2.  Changed expr() to use long int rather than int.
- *                      This is so that we'd get 32-bit arithmetic on a Sun,
- *                      Encore, PC, Mac &c.  As part of this, the code for
- *                      shifts has been elaborated to yield signed shifts
- *                      on all machines.  The charcon() function didn't work
- *                      with multi-character literals, although it was meant
- *                      to.  Now it does.  pbrad() has been changed so that
- *                      eval('abcd',0) => abcd, not dcba, which was useless.
- *
- *                  3.  I finally got sick of the fact that &&, ||, and
- *                      ?: always evaluate all their arguments.  This is
- *                      consistent with UNIX System V Release 3, but I for
- *                      one don't see anything to gain by having eval(0&&1/0)
- *                      crash when it would simply yield 0 in C.  Now these
- *                      operators are more consistent with the C preprocessor.
- *
- * Nov 13 1992 RAOK     Added the quoter facility.  The purpose of this is
- *                      to make it easier to generate data for a variety of
- *                      programming languages, including sh, awk, Lisp, C.
- *                      There are two holes in the implementation:  dumpdef
- *                      prints junk and undefine doesn't release everything.
- *                      This was mainly intended as a prototype to show that
- *                      it could be done.
- *
- * Jun 16 1992 RAOK     Added vquote and gave changequote a 3rd argument.
- *                      The idea of this is to make it possible to quote
- *                      ANY string, including one with unbalanced ` or '.
- *                      I also made eval(c,0) convert decimal->ASCII, so
- *                      that eval(39,0) yields ' and eval(96,0) yields `.
- *
- * Apr 28 1992 RAOK     Used gcc to find and fix ANSI clashes, so that
- *                      PD M4 could be ported to MS-DOS (Turbo C 3).
- *                      Main known remaining problem:  use of mktemp().
- *                      Also, command line handling needs to be worked out.
- *
- * Mar 26 1992 RAOK     PD M4 now accepts file names on the command line
- *                      just like UNIX M4.  Warning:  macro calls must NOT
- *                      cross file boundaries.  UNIX M4 doesn't mind;
- *                      (m4 a b c) and (cat a b c | m4) are just the same
- *                      except for error messages.  PD M4 will report an
- *                      unexpected EOF if a file ends while a macro call or
- *                      string is still being parsed.  When there is one
- *                      file name argument, or none, you can't tell the
- *                      difference, and that's all I need.
- *
- * May 15 1991 RAOK     DIVNAM was a string constant, but was changed!
- *                      Fixed that and a couple of other things to make
- *                      GCC happy.  (Also made "foo$bar" get through.)
- *
- * Apr 17 1991 RAOK     There was a major mistake.  If you did
- *                              define(foo, `1 include(bar) 2')
- *                      where file bar held "-bar-" you would naturally
- *                      expect "1 -bar- 2" as the output, but you
- *                      got "1  2-bar-".  That is, include file
- *                      processing was postponed until all macros
- *                      had been expanded.  The macro gpbc() was
- *                      at fault.  I added bb, bbstack[], and the
- *                      code in main.c and serv.c that maintains
- *                      them, in order to work around this bug.
- *
- * Apr 12 1991 RAOK     inspect() didn't handle overflow well.
- *                      Added the automatically maintained macro
- *                      __FILE__, just as in C.  To suppress it,
- *                      define NO__FILE.  At some point, $# had
- *                      been made to return a value that was off
- *                      by one; it now agrees with SysV M4.
- *
- * Aug 13 1990 RAOK     The System V expr() has three arguments:
- *                      expression [, radix:10 [, mindigits: 1]]
- *                      Brought in my int2str() and wrote pbrad()
- *                      to make this work here.  With the wrong #
- *                      of args, acts like System V.
- *
- * Aug 11 1990 RAOK     Told expr.c about the Pascal operators
- *                      not, div, mod, and, or
- *                      so that Pascal constant expressions could
- *                      be evaluated.  (It still doesn't handle
- *                      floats.)  Fixed a mistake in 'character's.
- *
- * Apr 23 1988 RAOK     Sped it up, mainly by making putback() and
- *                      chrsave() into macros.
- *                      Finished the -o option (was half done).
- *                      Added the System V -e (interactive) option.
- *
- * Jan 28 1986 Oz       Break the whole thing into little
- *                      pieces, for easier (?) maintenance.
- *
- * Dec 12 1985 Oz       Optimize the code, try to squeeze
- *                      few microseconds out.. [didn't try very hard]
- *
- * Dec 05 1985 Oz       Add getopt interface, define (-D),
- *                      undefine (-U) options.
- *
- * Oct 21 1985 Oz       Clean up various bugs, add comment handling.
- *
- * June 7 1985 Oz       Add some of SysV m4 stuff (m4wrap, pushdef,
- *                      popdef, decr, shift etc.).
- *
- * June 5 1985 Oz       Initial cut.
- *
- * Implementation Notes:
- *
- * [1]  PD m4 uses a different (and simpler) stack mechanism than the one
- *      described in Software Tools and Software Tools in Pascal books.
- *      The triple stack nonsense is replaced with a single stack containing
- *      the call frames and the arguments. Each frame is back-linked to a
- *      previous stack frame, which enables us to rewind the stack after
- *      each nested call is completed. Each argument is a character pointer
- *      to the beginning of the argument string within the string space.
- *      The only exceptions to this are (*) arg 0 and arg 1, which are
- *      the macro definition and macro name strings, stored dynamically
- *      for the hash table.
- *
- *          .                                      .
- *      |   .   |  <-- sp                       |  .  |
- *      +-------+                               +-----+
- *      | arg 3 ------------------------------->| str |
- *      +-------+                               |  .  |
- *      | arg 2 --------------+                    .
- *      +-------+             |
- *          *                 |                 |     |
- *      +-------+             |                 +-----+
- *      | plev  |  <-- fp     +---------------->| str |
- *      +-------+                               |  .  |
- *      | type  |                                  .
- *      +-------+
- *      | prcf  -----------+          plev: paren level
- *      +-------+          |          type: call type
- *      |   .   |          |          prcf: prev. call frame
- *          .              |
- *      +-------+          |
- *      |       <----------+
- *      +-------+
- *
- * [2]  We have three types of null values:
- *
- *              nil  - nodeblock pointer type 0
- *              null - null string ("")
- *              NULL - Stdio-defined NULL
- *
- */
-
-char buf[BUFSIZE];              /* push-back buffer             */
-char *bp = buf;                 /* first available character    */
-char *bb = buf;                 /* buffer beginning             */
-char *endpbb = buf+BUFSIZE;     /* end of push-back buffer      */
-stae mstack[STACKMAX+1];        /* stack of m4 machine          */
-char strspace[STRSPMAX+1];      /* string space for evaluation  */
-char *ep = strspace;            /* first free char in strspace  */
-char *endest= strspace+STRSPMAX;/* end of string space          */
-int sp;                         /* current m4 stack pointer     */
-int fp;                         /* m4 call frame pointer        */
-char *bbstack[MAXINP];          /* stack where bb is saved      */
-FILE *infile[MAXINP];           /* input file stack (0=stdin)   */
-FILE *outfile[MAXOUT];          /* diversion array(0=bitbucket) */
-FILE *active;                   /* active output file pointer   */
-int ilevel = 0;                 /* input file stack pointer     */
-int oindex = 0;                 /* diversion index..            */
-char *null = "";                /* as it says.. just a null..   */
-char *m4wraps = "";             /* m4wrap string default..      */
-char lquote = LQUOTE;           /* left quote character  (`)    */
-char rquote = RQUOTE;           /* right quote character (')    */
-char vquote = VQUOTE;           /* verbatim quote character ^V  */
-char scommt = SCOMMT;           /* start character for comment  */
-char ecommt = ECOMMT;           /* end character for comment    */
-int strip = 0;                  /* throw away comments?         */
-
-/*  Definitions of diversion files.  The last 6 characters MUST be
-    "XXXXXX" -- that is a requirement of mktemp().  The character
-    '0' is to be replaced by the diversion number; we assume here
-    that it is just before the Xs.  If not, you will have to alter
-    the definition of UNIQUE.
-*/
-
-#if unix
-static char DIVNAM[] = "/tmp/m40XXXXXX";
-#else
-#if vms
-static char DIVNAM[] = "sys$login:m40XXXXXX";
-#else
-static char DIVNAM[] = "M40XXXXXX";     /* was \M4, should it be \\M4? */
-#endif
-#endif
-int UNIQUE = sizeof DIVNAM - 7; /* where to change m4temp.     */
-char *m4temp;                   /* filename for diversions     */
-extern char *mktemp();
-
-
-void cantread(s)
-    char *s;
-    {
-        fprintf(stderr, "m4: %s: ", s);
-        error("cannot open for input.");
-    }
-
-
-/*  initkwds()
-    initialises the hash table to contain all the m4 built-in functions.
-    The original version breached module boundaries, but there did not
-    seem to be any benefit in that.
-*/
-static void initkwds()
-    {
-        register int i;
-        static struct { char *name; int type; } keyword[] =
-            {
-                "include",      INCLTYPE,
-                "sinclude",     SINCTYPE,
-                "define",       DEFITYPE,
-                "defn",         DEFNTYPE,
-                "divert",       DIVRTYPE,
-                "expr",         EXPRTYPE,
-                "eval",         EXPRTYPE,
-                "substr",       SUBSTYPE,
-                "ifelse",       IFELTYPE,
-                "ifdef",        IFDFTYPE,
-                "len",          LENGTYPE,
-                "incr",         INCRTYPE,
-                "decr",         DECRTYPE,
-                "dnl",          DNLNTYPE,
-                "changequote",  CHNQTYPE,
-                "changecom",    CHNCTYPE,
-                "index",        INDXTYPE,
-#ifdef EXTENDED
-                "paste",        PASTTYPE,
-                "spaste",       SPASTYPE,
-                "m4trim",       TRIMTYPE,
-                "defquote",     DEFQTYPE,
-#endif
-                "popdef",       POPDTYPE,
-                "pushdef",      PUSDTYPE,
-                "dumpdef",      DUMPTYPE,
-                "shift",        SHIFTYPE,
-                "translit",     TRNLTYPE,
-                "undefine",     UNDFTYPE,
-                "undivert",     UNDVTYPE,
-                "divnum",       DIVNTYPE,
-                "maketemp",     MKTMTYPE,
-                "errprint",     ERRPTYPE,
-                "m4wrap",       M4WRTYPE,
-                "m4exit",       EXITTYPE,
-#if unix || vms
-                "syscmd",       SYSCTYPE,
-                "sysval",       SYSVTYPE,
-#endif
-#if unix
-                "unix",         MACRTYPE,
-#else
-#if vms
-                "vms",          MACRTYPE,
-#endif
-#endif
-                (char*)0,       0
-            };
-
-        for (i = 0; keyword[i].type != 0; i++)
-            addkywd(keyword[i].name, keyword[i].type);
-    }
-
-
-/*  inspect(Name)
-    Build an input token.., considering only those which start with
-    [A-Za-z_].  This is fused with lookup() to speed things up.
-    name must point to an array of at least MAXTOK characters.
-*/
-ndptr inspect(name)
-    char *name;
-    {
-        register char *tp = name;
-        register char *etp = name+(MAXTOK-1);
-        register int c;
-        register unsigned long h = 0;
-        register ndptr p;
-
-        while (is_sym2(c = gpbc())) {
-            if (tp == etp) error("m4: token too long");
-            *tp++ = c, h = (h << 5) + h + c;
-        }
-        putback(c);
-        *tp = EOS;
-        for (p = hashtab[h%HASHSIZE]; p != nil; p = p->nxtptr)
-            if (strcmp(name, p->name) == 0)
-                return p;
-        return nil;
-    }
-
-
-/*
- * macro - the work horse..
- *
- */
-void macro()
-    {
-        char token[MAXTOK];
-        register int t;
-        register FILE *op = active;
-        static char ovmsg[] = "m4: internal stack overflow";
-
-        for (;;) {
-            t = gpbc();
-            if (is_sym1(t)) {
-                register char *s;
-                register ndptr p;
-
-                putback(t);
-                if ((p = inspect(s = token)) == nil) {
-                    if (sp < 0) {
-                        while (t = *s++) putc(t, op);
-                    } else {
-                        while (t = *s++) chrsave(t);
-                    }
-                } else {
-                    /* real thing.. First build a call frame */
-                    if (sp >= STACKMAX-6) error(ovmsg);
-                    mstack[1+sp].sfra = fp;             /* previous call frm */
-                    mstack[2+sp].sfra = p->type;        /* type of the call  */
-                    mstack[3+sp].sfra = 0;              /* parenthesis level */
-                    fp = sp+3;                          /* new frame pointer */
-                    /* now push the string arguments */
-                    mstack[4+sp].sstr = p->defn;        /* defn string */
-                    mstack[5+sp].sstr = p->name;        /* macro name  */
-                    mstack[6+sp].sstr = ep;             /* start next.. */
-                    sp += 6;
-
-                    t = gpbc();
-                    putback(t);
-                    if (t != LPAREN) { putback(RPAREN); putback(LPAREN); }
-                }
-            } else
-            if (t == EOF) {
-                if (sp >= 0) error("m4: unexpected end of input");
-                if (--ilevel < 0) break;                /* all done thanks */
-#ifndef NO__FILE
-                remhash("__FILE__", TOP);
-#endif
-                bb = bbstack[ilevel+1];
-                (void) fclose(infile[ilevel+1]);
-            } else
-            /*  non-alpha single-char token seen..
-                [the order of else if .. stmts is important.]
-            */
-            if (t == lquote) {                          /* strip quotes */
-                register int nlpar;
-
-                for (nlpar = 1; ; ) {
-                    t = gpbc();
-                    if (t == rquote) {
-                        if (--nlpar == 0) break;
-                    } else
-                    if (t == lquote) {
-                        nlpar++;
-                    } else {
-                        if (t == vquote) t = gpbc();
-                        if (t == EOF) {
-                            error("m4: missing right quote");
-                        }
-                    }
-                    if (sp < 0) {
-                        putc(t, op);
-                    } else {
-                        chrsave(t);
-                    }
-                }
-            } else
-            if (sp < 0) {                       /* not in a macro at all */
-                if (t != scommt) {              /* not a comment, so */
-                    putc(t, op);                /* copy it to output */
-                } else
-                if (strip) {                    /* discard a comment */
-                    do {
-                        t = gpbc();
-                    } while (t != ecommt && t != EOF);
-                } else {                        /* copy comment to output */
-                    do {
-                        putc(t, op);
-                        t = gpbc();
-                    } while (t != ecommt && t != EOF);
-                    putc(t, op);
-                    /*  A note on comment handling:  this is NOT robust.
-                    |   We should do something safe with comments that
-                    |   are missing their ecommt termination.
-                    */
-                }
-            } else
-            switch (t) {
-                /*  There is a peculiar detail to notice here.
-                    Layout is _always_ discarded after left parentheses,
-                    but it is only discarded after commas if they separate
-                    arguments.  For example,
-                    define(foo,`|$1|$2|')
-                    foo( a, b)          => |a|b|
-                    foo(( a ), ( b ))   => |(a )|(b )|
-                    foo((a, x), (b, y)) => |(a, x)|(b, y)|
-                    I find this counter-intuitive, and would expect the code
-                    for LPAREN to read something like this:
-
-                    if (PARLEV == 0) {
-                        (* top level left parenthesis: skip layout *)
-                        do t = gpbc(); while (is_blnk(t));
-                        putback(t);
-                    } else {
-                        (* left parenthesis inside an argument *)
-                        chrsave(t);
-                    }
-                    PARLEV++;
-
-                    However, it turned out that Oz wrote the actual code
-                    very carefully to mimic the behaviour of "real" m4;
-                    UNIX m4 really does skip layout after all left parens
-                    but only some commas in just this fashion.  Sigh.
- */ - case LPAREN: - if (PARLEV > 0) chrsave(t); - do t = gpbc(); while (is_blnk(t)); /* skip layout */ - putback(t); - PARLEV++; - break; - - case COMMA: - if (PARLEV == 1) { - chrsave(EOS); /* new argument */ - if (sp >= STACKMAX) error(ovmsg); - do t = gpbc(); while (is_blnk(t)); /* skip layout */ - putback(t); - mstack[++sp].sstr = ep; - } else { - chrsave(t); - } - break; - - case RPAREN: - if (--PARLEV > 0) { - chrsave(t); - } else { - char **argv = (char **)(mstack+fp+1); - int argc = sp-fp; -#if unix | vms - static int sysval; -#endif - - chrsave(EOS); /* last argument */ - if (sp >= STACKMAX) error(ovmsg); -#ifdef DEBUG - fprintf(stderr, "argc = %d\n", argc); - for (t = 0; t < argc; t++) - fprintf(stderr, "argv[%d] = %s\n", t, argv[t]); -#endif - /* If argc == 3 and argv[2] is null, then we - have a call like `macro_or_builtin()'. We - adjust argc to avoid further checking.. - */ - if (argc == 3 && !argv[2][0]) argc--; - - switch (CALTYP & ~STATIC) { - case MACRTYPE: - expand(argv, argc); - break; - - case DEFITYPE: /* define(..) */ - for (; argc > 2; argc -= 2, argv += 2) - dodefine(argv[2], argc > 3 ? argv[3] : null); - break; - - case PUSDTYPE: /* pushdef(..) */ - for (; argc > 2; argc -= 2, argv += 2) - dopushdef(argv[2], argc > 3 ? 
argv[3] : null); - break; - - case DUMPTYPE: - dodump(argv, argc); - break; - - case EXPRTYPE: /* eval(Expr) */ - { /* evaluate arithmetic expression */ - /* eval([val: 0[, radix:10 [,min: 1]]]) */ - /* excess arguments are ignored */ - /* eval() with no arguments returns 0 */ - /* this is based on V.3 behaviour */ - int min_digits = 1; - int radix = 10; - long int value = 0; - - switch (argc) { - default: - /* ignore excess arguments */ - case 5: - min_digits = expr(argv[4]); - case 4: - radix = expr(argv[3]); - case 3: - value = expr(argv[2]); - case 2: - break; - } - pbrad(value, radix, min_digits); - } - break; - - case IFELTYPE: /* ifelse(X,Y,IFX=Y,Else) */ - doifelse(argv, argc); - break; - - case IFDFTYPE: /* ifdef(Mac,IfDef[,IfNotDef]) */ - /* select one of two alternatives based on the existence */ - /* of another definition */ - if (argc > 3) { - if (lookup(argv[2]) != nil) { - pbstr(argv[3]); - } else - if (argc > 4) { - pbstr(argv[4]); - } - } - break; - - case LENGTYPE: /* len(Arg) */ - /* find the length of the argument */ - pbnum(argc > 2 ? strlen(argv[2]) : 0); - break; - - case INCRTYPE: /* incr(Expr) */ - /* increment the value of the argument */ - if (argc > 2) pbnum(expr(argv[2]) + 1); - break; - - case DECRTYPE: /* decr(Expr) */ - /* decrement the value of the argument */ - if (argc > 2) pbnum(expr(argv[2]) - 1); - break; - -#if unix || vms - case SYSCTYPE: /* syscmd(Command) */ - /* execute system command */ - /* Make sure m4 output is NOT interrupted */ - fflush(stdout); - fflush(stderr); - - if (argc > 2) sysval = system(argv[2]); - break; - - case SYSVTYPE: /* sysval() */ - /* return value of the last system call. 
*/ - pbnum(sysval); - break; -#endif - - case INCLTYPE: /* include(File) */ - for (t = 2; t < argc; t++) - if (!doincl(argv[t])) cantread(argv[t]); - break; - - case SINCTYPE: /* sinclude(File) */ - for (t = 2; t < argc; t++) - (void) doincl(argv[t]); - break; - -#ifdef EXTENDED - case PASTTYPE: /* paste(File) */ - for (t = 2; t < argc; t++) - if (!dopaste(argv[t])) cantread(argv[t]); - break; - - case SPASTYPE: /* spaste(File) */ - for (t = 2; t < argc; t++) - (void) dopaste(argv[t]); - break; - - case TRIMTYPE: /* m4trim(Source,..) */ - if (argc > 2) m4trim(argv, argc); - break; - - case DEFQTYPE: /* defquote(Mac,...) */ - dodefqt(argv, argc); - break; - - case QUTRTYPE: /* <quote>(text...) */ - doqutr(argv, argc); - break; -#endif - - case CHNQTYPE: /* changequote([Left[,Right]]) */ - dochq(argv, argc); - break; - - case CHNCTYPE: /* changecom([Left[,Right]]) */ - dochc(argv, argc); - break; - - case SUBSTYPE: /* substr(Source[,Offset[,Length]]) */ - /* select substring */ - if (argc > 3) dosub(argv, argc); - break; - - case SHIFTYPE: /* shift(~args~) */ - /* push back all arguments except the first one */ - /* (i.e. skip argv[2]) */ - if (argc > 3) { - for (t = argc-1; t > 3; t--) { - pbqtd(argv[t]); - putback(','); - } - pbqtd(argv[3]); - } - break; - - case DIVRTYPE: /* divert(N) */ - if (argc > 2 && (t = expr(argv[2])) != 0) { - dodiv(t); - } else { - active = stdout; - oindex = 0; - } - op = active; - break; - - case UNDVTYPE: /* undivert(N...) */ - doundiv(argv, argc); - op = active; - break; - - case DIVNTYPE: /* divnum() */ - /* return the number of current output diversion */ - pbnum(oindex); - break; - - case UNDFTYPE: /* undefine(..) */ - /* undefine a previously defined macro(s) or m4 keyword(s). */ - for (t = 2; t < argc; t++) remhash(argv[t], ALL); - break; - - case POPDTYPE: /* popdef(Mac...) */ - /* remove the topmost definitions of macro(s) or m4 keyword(s). 
*/ - for (t = 2; t < argc; t++) remhash(argv[t], TOP); - break; - - case MKTMTYPE: /* maketemp(Pattern) */ - /* create a temporary file */ - if (argc > 2) pbstr(mktemp(argv[2])); - break; - - case TRNLTYPE: /* translit(Source,Dom,Rng) */ - /* replace all characters in the source string that */ - /* appears in the "from" string with the corresponding */ - /* characters in the "to" string. */ - - if (argc > 3) { - char temp[MAXTOK]; - - if (argc > 4) - map(temp, argv[2], argv[3], argv[4]); - else - map(temp, argv[2], argv[3], null); - pbstr(temp); - } else if (argc > 2) - pbstr(argv[2]); - break; - - case INDXTYPE: /* index(Source,Target) */ - /* find the index of the second argument string in */ - /* the first argument string. -1 if not present. */ - pbnum(argc > 3 ? indx(argv[2], argv[3]) : -1); - break; - - case ERRPTYPE: /* errprint(W,...,W) */ - /* print the arguments to stderr file */ - for (t = 2; t < argc; t++) fprintf(stderr, "%s ", argv[t]); - fprintf(stderr, "\n"); - break; - - case DNLNTYPE: /* dnl() */ - /* eat upto and including newline */ - while ((t = gpbc()) != '\n' && t != EOF) ; - break; - - case M4WRTYPE: /* m4wrap(AtExit) */ - /* set up for wrap-up/wind-down activity. */ - /* NB: if there are several calls to m4wrap */ - /* only the last is effective; strange, but */ - /* that's what System V does. */ - m4wraps = argc > 2 ? strsave(argv[2]) : null; - break; - - case EXITTYPE: /* m4exit(Expr) */ - /* immediate exit from m4. */ - killdiv(); /* mustn't forget that one! */ - exit(argc > 2 ? expr(argv[2]) : 0); - break; - - case DEFNTYPE: /* defn(Mac) */ - for (t = 2; t < argc; t++) - dodefn(argv[t]); - break; - - default: - error("m4: major botch in eval."); - break; - } - - ep = PREVEP; /* flush strspace */ - sp = PREVSP; /* previous sp.. */ - fp = PREVFP; /* rewind stack... 
*/ - } - break; - - default: - chrsave(t); /* stack the char */ - break; - } - } - } - - -int main(argc, argv) - int argc; - char **argv; - { - register int c; - register int n; - char *p; - -#ifdef SIGINT - if (signal(SIGINT, SIG_IGN) != SIG_IGN) - signal(SIGINT, onintr); -#endif - - /* Initialise the chtype[] table. - '0' .. '9' -> 1..10 - 'A' .. 'Z' -> 11..36 - 'a' .. 'z' -> 11..36 - '_' -> 38 - all other characters -> 0 - */ - for (c = EOF; c <= UCHAR_MAX; c++) chtype[c - EOF] = 0; - for (c = 1, p = "0123456789"; *p; p++, c++) - chtype[*(unsigned char *)p - EOF] = c; - for (c = 11, p = "abcdefghijklmnopqrstuvwxyz"; *p; p++, c++) - chtype[*(unsigned char *)p - EOF] = c; - for (c = 11, p = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"; *p; p++, c++) - chtype[*(unsigned char *)p - EOF] = c; - chtype['_' - EOF] = 38; - -#ifdef NONZEROPAGES - /* If your system does not initialise global variables to */ - /* 0 bits, do it here. */ - for (n = 0; n < HASHSIZE; n++) hashtab[n] = nil; - for (n = 0; n < MAXOUT; n++) outfile[n] = NULL; -#endif - initkwds(); - - while ((c = getopt(argc, argv, "cetD:U:o:B:H:S:T:")) != EOF) { - switch (c) { -#if 0 - case 's': /* enable #line sync in output */ - fprintf(stderr, "m4: this version does not support -s\n"); - exit(2); -#endif - - case 'c': /* strip comments */ - strip ^= 1; - break; - - case 'e': /* interactive */ - (void) signal(SIGINT, SIG_IGN); - setbuf(stdout, NULL); - break; - - case 'D': /* define something..*/ - for (p = optarg; *p && *p != '='; p++) ; - if (*p) *p++ = EOS; - dodefine(optarg, p); - break; - - case 'U': /* undefine...
*/ - remhash(optarg, TOP); - break; - - case 'B': case 'H': /* System V compatibility */ - case 'S': case 'T': /* ignore them */ - break; - - case 'o': /* specific output */ - if (!freopen(optarg, "w", stdout)) { - perror(optarg); - exit(1); - } - break; - - case '?': - default: - usage(); - } - } - - active = stdout; /* default active output */ - m4temp = mktemp(DIVNAM); /* filename for diversions */ - - sp = -1; /* stack pointer initialized */ - fp = 0; /* frame pointer initialized */ - - if (optind == argc) { /* no more args; read stdin */ - infile[0] = stdin; /* default input (naturally) */ -#ifndef NO__FILE - dodefine("__FILE__", "-"); /* Helas */ -#endif - macro(); /* process that file */ - } else /* file names in commandline */ - for (; optind < argc; optind++) { - char *name = argv[optind]; /* next file name */ - infile[0] = fopen(name, "r"); - if (!infile[0]) cantread(name); -#ifndef NO__FILE - dodefine("__FILE__", name); -#endif - macro(); - fclose(infile[0]); - } - - if (*m4wraps) { /* anything for rundown ?? */ - ilevel = 0; /* in case m4wrap includes.. */ - putback(EOF); /* eof is a must !! */ - pbstr(m4wraps); /* user-defined wrapup act */ - macro(); /* last will and testament */ - } else { /* default wrap-up: undivert */ - for (n = 1; n < MAXOUT; n++) - if (outfile[n] != NULL) getdiv(n); - } - - if (outfile[0] != NULL) { /* remove bitbucket if used */ - (void) fclose(outfile[0]); - m4temp[UNIQUE] = '0'; -#if unix - (void) unlink(m4temp); -#else - (void) remove(m4temp); -#endif - } - exit(0); - return 0; - } - //GO.SYSIN DD main.c echo misc.c 1>&2 sed 's/.//' >misc.c <<'//GO.SYSIN DD misc.c' -/* File : misc.c - Author : Ozan Yigit - Updated: 26-Mar-1993 - Purpose: Miscellaneous support code for PD M4. 
-*/ - -#include "mdef.h" -#include "extr.h" -#include "ourlims.h" - -#ifdef DUFFCP - -/* This version of the ANSI standard function memcpy() - uses Duff's Device (tm Tom Duff) to unroll the copying loop: - while (count-- > 0) *to++ = *from++; -*/ -void memcpy(to, from, count) - register char *from, *to; - register int count; - { - if (count > 0) { - register int loops = (count+8-1) >> 3; /* div 8 round up */ - - switch (count & (8-1)) { /* mod 8 */ - case 0: do { *to++ = *from++; - case 7: *to++ = *from++; - case 6: *to++ = *from++; - case 5: *to++ = *from++; - case 4: *to++ = *from++; - case 3: *to++ = *from++; - case 2: *to++ = *from++; - case 1: *to++ = *from++; - } while (--loops > 0); - } - } - } - -#endif - - -/* strsave(s) - return a new malloc()ed copy of s -- same as V.3's strdup(). -*/ -char *strsave(s) - char *s; - { - register int n = strlen(s)+1; - char *p = malloc(n); - - if (p) memcpy(p, s, n); - return p; - } - - -/* indx(s1, s2) - if s1 can be decomposed as alpha || s2 || omega, return the length - of the shortest such alpha, otherwise return -1. -*/ -int indx(s1, s2) - char *s1; - char *s2; - { - register char *t; - register char *m; - register char *p; - - for (p = s1; *p; p++) { - for (t = p, m = s2; *m && *m == *t; m++, t++); - if (!*m) return p-s1; - } - return -1; - } - - -char pbmsg[] = "m4: too many characters pushed back"; - -/* Xputback(c) - push character c back onto the input stream. - This is now macro putback() in misc.h -*/ -void Xputback(c) - char c; - { - if (bp < endpbb) *bp++ = c; else error(pbmsg); - } - - -/* pbstr(s) - push string s back onto the input stream. - putback() has been unfolded here to improve performance. 
- Example: - s = <ABC> - bp = <more stuff> - After the call: - bp = <more stuffCBA> - It would be more efficient if we ran the pushback buffer in the - opposite direction -*/ -void pbstr(s) - register char *s; - { - register char *es; - register char *zp; - - zp = bp; - for (es = s; *es; ) es++; /* now es points to terminating NUL */ - bp += es-s; /* advance bp as far as it should go */ - if (bp >= endpbb) error("m4: too many characters to push back"); - while (es > s) *zp++ = *--es; - } - - -/* pbqtd(s) - pushes string s back "quoted", doing whatever has to be done to it to - make sure that the result will evaluate to the original value. As it - happens, we have only to add lquote and rquote. -*/ -void pbqtd(s) - register char *s; - { - register char *es; - register char *zp; - - zp = bp; - for (es = s; *es; ) es++; /* now es points to terminating NUL */ - bp += 2+es-s; /* advance bp as far as it should go */ - if (bp >= endpbb) error("m4: too many characters to push back"); - *zp++ = rquote; - while (es > s) *zp++ = *--es; - *zp++ = lquote; - } - - -/* pbnum(n) - convert a number to a (decimal) string and push it back. - The original definition did not work for MININT; this does. -*/ -void pbnum(n) - int n; - { - register int num; - - num = n > 0 ? -n : n; /* MININT <= num <= 0 */ - do { - putback('0' - (num % 10)); - } while ((num /= 10) < 0); - if (n < 0) putback('-'); - } - - -/* pbrad(n, r, m) - converts a number n to base r ([-36..-2] U [2..36]), with at least - m digits. If r == 10 and m == 1, this is exactly the same as pbnum. - However, this uses the function int2str() from R.A.O'Keefe's public - domain string library, and puts the results of that back. - The Unix System V Release 3 version of m4 accepts radix 1; - THIS VERSION OF M4 DOES NOT ACCEPT RADIX 1 OR -1, - nor do we accept radix < -36 or radix > 36. At the moment such bad - radices quietly produce nothing. 
The V.3 treatment of radix 1 is - push back abs(n) "1"s, then - if n < 0, push back one "-". - Until I come across something which uses it, I can't bring myself to - implement this. - - I have, however, found a use for radix 0. Unsurprisingly, it is - related to radix 0 in Edinburgh Prolog. - eval('c1c2...cn', 0, m) - pushes back max(m-n,0) blanks and the characters c1...cn. This can - adjust to any byte size as long as UCHAR_MAX = (1 << CHAR_BIT) - 1. - In particular, eval(c, 0) where 0 < c <= UCHAR_MAX, pushes back the - character with code c. Note that this has to agree with eval(); so - both of them have to use the same byte ordering. -*/ -void pbrad(n, r, m) - long int n; - int r, m; - { - char buffer[2+sizeof(long)*CHAR_BIT]; /* sign, radix-2 digits, NUL */ - char *p; - int L; - - if (r == 0) { - unsigned long int x = (unsigned long)n; - int n; - - for (n = 0; x; x >>= CHAR_BIT, n++) buffer[n] = x & UCHAR_MAX; - for (L = n; --L >= 0; ) putback(buffer[L]); - for (L = m-n; --L >= 0; ) putback(' '); - return; - } - L = m - (int2str(p = buffer, -r, n)-buffer); - if (buffer[0] == '-') L++, p++; - if (L > 0) { - pbstr(p); - while (--L >= 0) putback('0'); - if (p != buffer) putback('-'); - } else { - pbstr(buffer); - } - } - - -char csmsg[] = "m4: string space overflow"; - -/* chrsave(c) - put the character c in the string space. -*/ -void Xchrsave(c) - char c; - { -#if 0 - if (sp < 0) putc(c, active); else -#endif - if (ep < endest) *ep++ = c; else - error(csmsg); - } - - -/* getdiv(ind) - read in a diversion file and then delete it.
-*/ -void getdiv(ind) - int ind; - { - register int c; - register FILE *dfil; - register FILE *afil; - - afil = active; - if (outfile[ind] == afil) - error("m4: undivert: diversion still active."); - (void) fclose(outfile[ind]); - outfile[ind] = NULL; - m4temp[UNIQUE] = '0' + ind; - if ((dfil = fopen(m4temp, "r")) == NULL) - error("m4: cannot undivert."); - while ((c = getc(dfil)) != EOF) putc(c, afil); - (void) fclose(dfil); - -#if vms - if (remove(m4temp)) error("m4: cannot unlink."); -#else - if (unlink(m4temp) == -1) error("m4: cannot unlink."); -#endif - } - - -/* killdiv() - delete all the diversion files which have been created. -*/ -void killdiv() - { - register int n; - - for (n = 0; n < MAXOUT; n++) { - if (outfile[n] != NULL) { - (void) fclose(outfile[n]); - m4temp[UNIQUE] = '0' + n; -#if unix - (void) unlink(m4temp); -#else - (void) remove(m4temp); -#endif - } - } - } - - -/* error(s) - close all files, report a fatal error, and quit, letting the caller know. -*/ -void error(s) - char *s; - { - killdiv(); - fprintf(stderr, "%s\n", s); - exit(1); - } - - -/* Interrupt handling -*/ -static char *msg = "\ninterrupted."; - -#ifdef __STDC__ -void onintr(int signo) -#else -onintr() -#endif - { - error(msg); - } - - -void usage() - { - fprintf(stderr, "Usage: m4 [-c] [-e] [-[BHST]int] [-Dname[=val]] [-Uname] [-ofile] [file...]\n"); - exit(1); - } - -#ifdef GETOPT -/* Henry Spencer's getopt() - get option letter from argv */ - -char *optarg; /* Global argument pointer. */ -int optind = 0; /* Global argv index. */ - -static char *scan = NULL; /* Private scan pointer.
*/ - -#ifndef __STDC__ -extern char *index(); -#define strchr index -#endif - -int getopt(argc, argv, optstring) - int argc; - char **argv; - char *optstring; - { - register char c; - register char *place; - - optarg = NULL; - - if (scan == NULL || *scan == '\0') { - if (optind == 0) optind++; - if (optind >= argc - || argv[optind][0] != '-' - || argv[optind][1] == '\0') - return EOF; - if (strcmp(argv[optind], "--") == 0) { - optind++; - return EOF; - } - scan = argv[optind]+1; - optind++; - } - c = *scan++; - place = strchr(optstring, c); - - if (place == NULL || c == ':') { - fprintf(stderr, "%s: unknown option -%c\n", argv[0], c); - return '?'; - } - place++; - if (*place == ':') { - if (*scan != '\0') { - optarg = scan; - scan = NULL; - } else { - optarg = argv[optind]; - optind++; - } - } - return c; - } -#endif - //GO.SYSIN DD misc.c echo serv.c 1>&2 sed 's/.//' >serv.c <<'//GO.SYSIN DD serv.c' -/* File : serv.c - Author : Ozan Yigit - Updated: 4 May 1992 - Defines: Principal built-in macros for PD M4. -*/ - -#include "mdef.h" -#include "extr.h" -#include "ourlims.h" - -#define ucArgv(n) ((unsigned char *)argv[n]) - -/* 26-Mar-1993 Made m4trim() 8-bit clean. -*/ - -/* expand(<DS FN A1 ... An>) - 0 1 2 n+1 -- initial indices in argv[] - -1 0 1 n -- after adjusting argv++, argc-- - This expands a user-defined macro; FN is the name of the macro, DS - is its definition string, and A1 ... An are its arguments. 
-*/ -void expand(argv, argc) - char **argv; - int argc; - { - register char *t; - register char *p; - register int n; - -#ifdef DEBUG - fprintf(stderr, "expand(%s,%d)\n", argv[1], argc); -#endif - argc--; /* discount definition string (-1th arg) */ - t = *argv++; /* definition string as a whole */ - for (p = t; *p++; ) ; - p -= 2; /* points to last character of definition */ - while (p > t) { /* if definition is empty, fails at once */ - if (*--p != ARGFLAG) { - putback(p[1]); - } else { - switch (p[1]) { - case '#': - pbnum(argc-1); - break; - case '0': case '1': case '2': case '3': case '4': - case '5': case '6': case '7': case '8': case '9': - if ((n = p[1]-'0') < argc) pbstr(argv[n]); - break; - case '*': /* push all arguments back */ - for (n = argc-1; n > 1; n--) { - pbstr(argv[n]); - putback(','); - } - pbstr(argv[1]); - break; - case '@': /* push arguments back quoted */ - for (n = argc-1; n > 1; n--) { - pbqtd(argv[n]); - putback(','); - } - pbqtd(argv[1]); - break; - case '$': /* $$ => $ */ - break; - default: - putback(p[1]); - putback(p[0]); - break; - } - p--; - } - } - if (p == t) putback(*p); /* do last character */ - } - - -static char nuldefmsg[] = "m4: defining null name."; -static char recdefmsg[] = "m4: macro defined as itself."; - -/* dodefine(Name, Definition) - install Definition as the only definition of Name in the hash table. - */ -void dodefine(name, defn) - register char *name; - register char *defn; - { - register ndptr p; - - if (!name || !*name) error(nuldefmsg); - if (strcmp(name, defn) == 0) error(recdefmsg); -#ifdef DEBUG - fprintf(stderr, "define(%s,--)\n", name); -#endif - if ((p = lookup(name)) == nil) { - p = addent(name); - } else - if (p->defn != null) { /* what if p->type & STATIC ? */ - free(p->defn); - } - p->defn = !defn || !*defn ? 
null : strsave(defn); - p->type = MACRTYPE; - } - - -/* dopushdef(Name, Definition) - install Definition as the *first* definition of Name in the hash table, - but do not remove any existing definitions. The new definition will - hide any old ones until a popdef() removes it. -*/ -void dopushdef(name, defn) - register char *name; - register char *defn; - { - register ndptr p; - - if (!name || !*name) error(nuldefmsg); - if (strcmp(name, defn) == 0) error(recdefmsg); -#ifdef DEBUG - fprintf(stderr, "pushdef(%s,--)\n", name); -#endif - p = addent(name); - p->defn = !defn || !*defn ? null : strsave(defn); - p->type = MACRTYPE; - } - - -/* dodefn(Name) - push back a *quoted* copy of Name's definition. -*/ -void dodefn(name) - char *name; - { - register ndptr p; - - if ((p = lookup(name)) != nil && p->defn != null) pbqtd(p->defn); - } - - -/* dodump(<? dump>) dump all definitions in the hash table - dodump(<? dump F1 ... Fn>) dump the definitions of F1 ... Fn in that order - The requested definitions are written to stderr. What happens to names - which have a built-in (numeric) definition? -*/ -void dodump(argv, argc) - register char **argv; - register int argc; - { - register int n; - ndptr p; - static char dumpfmt[] = "define(`%s',\t`%s')\n"; - - if (argc > 2) { - for (n = 2; n < argc; n++) - if ((p = lookup(argv[n])) != nil) - fprintf(stderr, dumpfmt, p->name, p->defn); - } else { - for (n = 0; n < HASHSIZE; n++) - for (p = hashtab[n]; p != nil; p = p->nxtptr) - fprintf(stderr, dumpfmt, p->name, p->defn); - } - } - - -/* doifelse(<? ifelse {x y ifx=y}... [else]>) - 0 1 2 3 4 [2 when we get to it] -*/ -void doifelse(argv, argc) - register char **argv; - register int argc; - { - for (; argc >= 5; argv += 3, argc -= 3) - if (strcmp(argv[2], argv[3]) == 0) { - pbstr(argv[4]); - return; - } - if (argc >= 3) pbstr(argv[2]); - } - - -/* doincl(FileName) - include a given file.
-*/ -int doincl(FileName) - char *FileName; - { - if (ilevel+1 == MAXINP) error("m4: too many include files."); -#ifdef DEBUG - fprintf(stderr, "include(%s)\n", FileName); -#endif - if ((infile[ilevel+1] = fopen(FileName, "r")) != NULL) { -#ifndef NO__FILE - dopushdef("__FILE__", FileName); -#endif - bbstack[ilevel+1] = bb; - bb = bp; - ilevel++; - return 1; - } else { - return 0; - } - } - - -#ifdef EXTENDED -/* dopaste(FileName) - copy a given file to the output stream without any macro processing. -*/ -int dopaste(FileName) - char *FileName; - { - register FILE *pf; - register FILE *afil = active; - register int c; - - if ((pf = fopen(FileName, "r")) != NULL) { - while ((c = getc(pf)) != EOF) putc(c, afil); - (void) fclose(pf); - return 1; - } else { - return 0; - } - } -#endif - - -/* dochq(<? changequote [left [right [verbatim]]]>) - 0 1 2 3 4 - change the quote characters; to single characters only. - Empty arguments result in no change for that parameter. - Missing arguments result in defaults: - changequote => ` ' ^V - changequote(q) => q q ^V - changequote(l,r) => l r ^V - changequote(l,r,v) => l r v - There isn't any way of switching the verbatim-quote off, - but if you make it the same as the right quote it won't - be able to do anything (we check for R, L, V in that order). -*/ -void dochq(argv, argc) - register char **argv; - register int argc; - { - if (argc > 2) { - if (*argv[2]) lquote = *argv[2]; - if (argc > 3) { - if (*argv[3]) rquote = *argv[3]; - if (argc > 4 && *argv[4]) vquote = *argv[4]; - } else { - rquote = lquote; - } - } else { - lquote = LQUOTE; - rquote = RQUOTE; - vquote = VQUOTE; - } - } - - -/* dochc(<? changecomment [left [right]]>) - 0 1 2 3 - change the comment delimiters; to single characters only. 
-*/ -void dochc(argv, argc) - register char **argv; - register int argc; - { - if (argc > 2) { - if (*argv[2]) scommt = *argv[2]; - if (argc > 3) { - if (*argv[3]) ecommt = *argv[3]; - } else { - ecommt = ECOMMT; - } - } else { - scommt = '\0'; /* assuming no nulls in input */ - ecommt = '\0'; - } - } - - -/* dodivert - divert the output to a temporary file -*/ -void dodiv(n) - register int n; - { - if (n < 0 || n >= MAXOUT) n = 0; /* bitbucket */ - if (outfile[n] == NULL) { - m4temp[UNIQUE] = '0' + n; - if ((outfile[n] = fopen(m4temp, "w")) == NULL) - error("m4: cannot divert."); - } - oindex = n; - active = outfile[n]; - } - - -/* doundivert - undivert a specified output, or all - * other outputs, in numerical order. -*/ -void doundiv(argv, argc) - register char **argv; - register int argc; - { - register int ind; - register int n; - - if (argc > 2) { - for (ind = 2; ind < argc; ind++) { - n = expr(argv[ind]); - if (n > 0 && n < MAXOUT && outfile[n] != NULL) getdiv(n); - } - } else { - for (n = 1; n < MAXOUT; n++) - if (outfile[n] != NULL) getdiv(n); - } - } - - -/* dosub(<? substr {offset} [{length}]>) - The System V Interface Definition does not say what happens when the - offset or length are out of range. I have chosen to force them into - range, with the result that unlike the former version of this code, - dosub cannot be tricked into SIGSEGV. - - BUG: This is not 8-bit clean yet. 
-*/ -void dosub(argv, argc) - char **argv; - int argc; - { - register int nc; /* number of characters */ - register char *ap = argv[2]; /* target string */ - register int al = strlen(ap); /* its length */ - register int df = expr(argv[3]);/* offset */ - - if (df < 0) df = 0; else /* force df back into the range */ - if (df > al) df = al; /* 0 <= df <= al */ - al -= df; /* now al limits nc */ - - if (argc >= 5) { /* nc is provided */ - nc = expr(argv[4]); - if (nc < 0) nc = 0; else /* force nc back into the range */ - if (nc > al) nc = al; /* 0 <= nc <= strlen(ap)-df */ - } else { - nc = al; /* default is all rest of ap */ - } - ap += df + nc; - while (--nc >= 0) putback(*--ap); - } - - -/* map(dest, src, from, to) - map every character of src that is specified in from - into "to" and replace in dest. (source "src" remains untouched) - - This is a standard implementation of Icon's map(s,from,to) function. - Within mapvec, we replace every character of "from" with the - corresponding character in "to". If "to" is shorter than "from", - then the corresponding entries are null, which means that those - characters disappear altogether. Furthermore, imagine a call like - map(dest, "sourcestring", "srtin", "rn..*"). In this case, `s' maps - to `r', `r' maps to `n' and `n' maps to `*'. Thus, `s' ultimately - maps to `*'. In order to achieve this effect in an efficient manner - (i.e. without multiple passes over the destination string), we loop - over mapvec, starting with the initial source character. If the - character value (dch) in this location is different from the source - character (sch), sch becomes dch, once again to index into mapvec, - until the character value stabilizes (i.e. sch = dch, in other words - mapvec[n] == n). Even if the entry in the mapvec is null for an - ordinary character, it will stabilize, since mapvec[0] == 0 at all - times. At the end, we restore mapvec* back to normal where - mapvec[n] == n for 0 <= n <= 127. 
This strategy, along with the - restoration of mapvec, is about 5 times faster than any algorithm - that makes multiple passes over the destination string. -*/ - -void map(d, s, f, t) - char *d, *s, *f, *t; - { - register unsigned char *dest = (unsigned char *)d; - register unsigned char *src = (unsigned char *)s; - unsigned char *from = (unsigned char *)f; - register unsigned char *to = (unsigned char *)t; - register unsigned char *tmp; - register unsigned char sch, dch; - static unsigned char mapvec[1+UCHAR_MAX] = {1}; - - if (mapvec[0]) { - register int i; - for (i = 0; i <= UCHAR_MAX; i++) mapvec[i] = i; - } - if (src && *src) { - /* create a mapping between "from" and "to" */ - if (to && *to) - for (tmp = from; sch = *tmp++; ) mapvec[sch] = *to++; - else - for (tmp = from; sch = *tmp++; ) mapvec[sch] = '\0'; - - while (sch = *src++) { - while ((dch = mapvec[sch]) != sch) sch = dch; - if (*dest = dch) dest++; - } - /* restore all the changed characters */ - for (tmp = from; sch = *tmp++; ) mapvec[sch] = sch; - } - *dest = '\0'; - } - - -#ifdef EXTENDED - -/* m4trim(<? m4trim [string [leading [trailing [middle [rep]]]]]>) - 0 1 2 3 4 5 6 - - (1) Any prefix consisting of characters in the "leading" set is removed. - The default is " \t\n". - (2) Any suffix consisting of characters in the "trailing" set is removed. - The default is to be the same as leading. - (3) Any block of consecutive characters in the "middle" set is replaced - by the rep string. The default for middle is " \t\n", and the - default for rep is the first character of middle. -*/ -void m4trim(argv, argc) - char **argv; - int argc; - { - static unsigned char repbuf[2] = " "; - static unsigned char layout[] = " \t\n\r\f"; - unsigned char *string = argc > 2 ? ucArgv(2) : repbuf+1; - unsigned char *leading = argc > 3 ? ucArgv(3) : layout; - unsigned char *trailing = argc > 4 ? ucArgv(4) : leading; - unsigned char *middle = argc > 5 ? ucArgv(5) : trailing; - unsigned char *rep = argc > 6 ? 
ucArgv(6) : - (repbuf[0] = *middle, repbuf); - static unsigned char sets[1+UCHAR_MAX]; -# define PREF 1 -# define SUFF 2 -# define MIDL 4 - register int i, n; - - for (i = UCHAR_MAX; i >= 0; ) sets[i--] = 0; - while (*leading) sets[*leading++] |= PREF; - while (*trailing) sets[*trailing++] |= SUFF; - while (*middle) sets[*middle++] |= MIDL; - - while (*string && sets[*string]&PREF) string++; - n = strlen((char *)string); - while (n > 0 && sets[string[n-1]]&SUFF) n--; - while (n > 0) { - i = string[--n]; - if (sets[i]&MIDL) { - pbstr((char*)rep); - while (n > 0 && sets[string[n-1]]&MIDL) n--; - } else { - putback(i); - } - } - } - - -/* defquote(MacroName # The name of the "quoter" macro to be defined. - [, Opener # default: "'". The characters to place at the - # beginning of the result. - [, Separator # default: ",". The characters to place between - # successive arguments. - [, Closer # default: same as Opener. The characters to - # place at the end of the result. - [, Escape # default: `' The escape character to put in - # front of things that need escaping. - [, Default # default: simple. Possible values are - # [lL].* = letter, corresponds to PLAIN1. - # [dD].* = digit, corresponds to PLAIN2. - # [sS].* = simple, corresponds to SIMPLE. - # [eE].* = escaped,corresponds to SCAPED. - # .*, corresponds to FANCY - [, Letters # default: `'. The characters of type "L". - [, Digits # default: `'. The characters of type "D". - [, Simple # default: `'. The characters of type "S". - [, Escaped # default: `'. The characters of type "E". - {, Fancy # default: none. Each has the form `C'`Repr' - # saying that the character C is to be represented - # as Repr. Can be used for trigraphs, \n, &c. - }]]]]]]]]]) - - Examples: - defquote(DOUBLEQT, ") - defquote(SINGLEQT, ') - After these definitions, - DOUBLEQT(a, " b", c) => "a,"" b"",c" - SINGLEQT("Don't`, 'he said.") => '"Don''t, he said."' - Other examples defining quote styles for several languages will be - provided later. 
- - A quoter is represented in M4 by a special identifying number and a - pointer to a Quoter record. I expect that there will be few quoters - but that they will need to go fairly fast. - -*/ - -#define PLAIN1 0 -#define PLAIN2 1 -#define SIMPLE 2 -#define SCAPED 3 -#define FANCY 4 - -struct Quoter - { - char *opener; - char *separator; - char *closer; - char *escape; - char *fancy[1+UCHAR_MAX]; - char class[1+UCHAR_MAX]; - }; - -void freeQuoter(q) - struct Quoter *q; - { - int i; - - free(q->opener); - free(q->separator); - free(q->closer); - free(q->escape); - for (i = UCHAR_MAX; i >= 0; i--) - if (q->fancy[i]) free(q->fancy[i]); - free((char *)q); - } - -/* dodefqt(< - 0 ? - 1 defquote - 2 MacroName - [ 3 Opener - [ 4 Separator - [ 5 Closer - [ 6 Escape - [ 7 Default - [ 8 Letters - [ 9 Digits - [10 Simple - [11 Escaped - [11+i Fancy[i] ]]]]]]]]]]>) -*/ - -void dodefqt(argv, argc) - char **argv; - int argc; - { - struct Quoter q, *r; - register int i; - register unsigned char *s; - register int c; - ndptr p; - - if (!(argc > 2 && *argv[2])) error(nuldefmsg); - switch (argc > 7 ? argv[7][0] : '\0') { - case 'l': case 'L': c = PLAIN1; break; - case 'd': case 'D': c = PLAIN2; break; - case 'e': case 'E': c = SCAPED; break; - case 'f': case 'F': c = FANCY; break; - default: c = SIMPLE; - } - /* both tables have 1+UCHAR_MAX entries, so start at UCHAR_MAX itself */ - for (i = UCHAR_MAX; i >= 0; i--) q.class[i] = c; - for (i = UCHAR_MAX; i >= 0; i--) q.fancy[i] = 0; - q.opener = strsave(argc > 3 ? argv[3] : ""); - q.separator = strsave(argc > 4 ? argv[4] : ","); - q.closer = strsave(argc > 5 ? argv[5] : q.opener); - q.escape = strsave(argc > 6 ?
argv[6] : ""); - if (argc > 8) - for (s = (unsigned char *)argv[8]; c = *s++; ) - q.class[c] = PLAIN1; - if (argc > 9) - for (s = (unsigned char *)argv[9]; c = *s++; ) - q.class[c] = PLAIN2; - if (argc > 10) - for (s = (unsigned char *)argv[10]; c = *s++; ) - q.class[c] = SIMPLE; - if (argc > 11) - for (s = (unsigned char *)argv[11]; c = *s++; ) - q.class[c] = SCAPED; - for (i = 12; i < argc; i++) { - s = (unsigned char *)argv[i]; - c = *s++; - q.fancy[c] = strsave((char *)s); - q.class[c] = FANCY; - } - /* Now we have to make sure that the closing quote works. */ - if ((c = q.closer[0]) && q.class[c] <= SIMPLE) { - if (q.escape[0]) { - q.class[c] = SCAPED; - } else { - char buf[3]; - buf[0] = c, buf[1] = c, buf[2] = '\0'; - q.fancy[c] = strsave(buf); - q.class[c] = FANCY; - } - } - /* We also have to make sure that the escape (if any) works. */ - if ((c = q.escape[0]) && q.class[c] <= SIMPLE) { - q.class[c] = SCAPED; - } - r = (struct Quoter *)malloc(sizeof *r); - if (r == NULL) error("m4: no more memory"); - *r = q; - p = addent(argv[2]); - p->defn = (char *)r; - p->type = QUTRTYPE; - } - - -/* doqutr(<DB MN A1 ... An>) - 0 1 2 n+1 argc - argv[0] points to the struct Quoter. - argv[1] points to the name of this quoting macro - argv[2..argc-1] point to the arguments. - This applies a user-defined quoting macro. For example, we could - define a macro to produce Prolog identifiers: - defquote(plid, ', , ', , simple, - abcdefghijklmnopqrstuvwxyz, - ABCDEFGHIJKLMNOPQRSTUVWXYZ_0123456789) - - After doing that, - plid(foo) => foo - plid(*) => '*' - plid(Don't) => 'Don''t' - plid(foo,) => 'foo' -*/ -void doqutr(argv, argc) - char **argv; - int argc; - /* DEFINITION-BLOCK MacroName Arg1 ... 
Argn - 0 1 2 n-1 argc - */ - { - struct Quoter *r = (struct Quoter *)argv[0]; - char *p; - register unsigned char *b, *e; - int i; - register int c; - - for (;;) { /* does not actually loop */ - if (argc != 3) break; - b = ucArgv(2); - e = b + strlen((char*)b); - if (e == b) break; - if (r->class[*b++] != PLAIN1) break; - while (b != e && r->class[*b] <= PLAIN2) b++; - if (b != e) break; - pbstr(argv[2]); - return; - } - - p = r->closer; - if (argc < 3) { - pbstr(p); - } else - for (i = argc-1; i >= 2; i--) { - pbstr(p); - b = ucArgv(i); - e = b+strlen((char *)b); - while (e != b) - switch (r->class[c = *--e]) { - case FANCY: - p = r->fancy[c]; - if (p) { - pbstr(p); - } else { - pbrad(c, 8, 1); - pbstr(r->escape); - } - break; - case SCAPED: - putback(c); - pbstr(r->escape); - break; - default: - putback(c); - break; - } - p = r->separator; - } - pbstr(r->opener); - } - -#endif //GO.SYSIN DD serv.c echo extr.h 1>&2 sed 's/.//' >extr.h <<'//GO.SYSIN DD extr.h' -/* Header : extr.h - Author : Ozan Yigit - Updated: %G% -*/ -#ifndef putback - -extern ndptr hashtab[]; /* hash table for macros etc. */ -extern char buf[]; /* push-back buffer */ -extern char *bp; /* first available character */ -extern char *bb; /* current beginning of bp */ -extern char *endpbb; /* end of push-back buffer */ -extern stae mstack[]; /* stack of m4 machine */ -extern char *ep; /* first free char in strspace */ -extern char *endest; /* end of string space */ -extern int sp; /* current m4 stack pointer */ -extern int fp; /* m4 call frame pointer */ -extern char *bbstack[]; -extern FILE *infile[]; /* input file stack (0=stdin) */ -extern FILE *outfile[]; /* diversion array(0=bitbucket)*/ -extern FILE *active; /* active output file pointer */ -extern char *m4temp; /* filename for diversions */ -extern int UNIQUE; /* where to change m4temp */ -extern int ilevel; /* input file stack pointer */ -extern int oindex; /* diversion index.. */ -extern char *null; /* as it says.. just a null.. 
*/ -extern char *m4wraps; /* m4wrap string default.. */ -extern char lquote; /* left quote character (`) */ -extern char rquote; /* right quote character (') */ -extern char vquote; /* verbatim quote character ^V */ -extern char scommt; /* start character for comment */ -extern char ecommt; /* end character for comment */ - -/* inlined versions of chrsave() and putback() */ - -extern char pbmsg[]; /* error message for putback */ -extern char csmsg[]; /* error message for chrsave */ - -#define putback(c) do { if (bp >= endpbb) error(pbmsg); *bp++ = c; } while (0) -#define chrsave(c) do { if (ep >= endest) error(csmsg); *ep++ = c; } while (0) - -/* getopt() interface */ - -extern char * optarg; -extern int optind; -#ifdef __STDC__ -extern int getopt(int, char **, char *); -#else -extern int getopt(); -#endif - -#ifdef __STDC__ -#include <stdlib.h> - -/* functions from misc.c */ - -extern char * strsave(char *); -extern int indx(char *, char *); -extern void pbstr(char *); -extern void pbqtd(char *); -extern void pbnum(int); -extern void pbrad(long int, int, int); -extern void getdiv(int); -extern void killdiv(); -extern void error(char *); -extern void onintr(int); -extern void usage(); - -/* functions from look.c */ - -extern ndptr lookup(char *); -extern ndptr addent(char *); -extern void remhash(char *, int); -extern void addkywd(char *, int); - -/* functions from int2str.c */ - -extern char* int2str(/* char*, int, long */); - -/* functions from serv.c */ - -extern void expand(char **, int); -extern void dodefine(char *, char *); -extern void dopushdef(char *, char *); -extern void dodefn(char *); -extern void dodump(char **, int); -extern void doifelse(char **, int); -extern int doincl(char *); -extern void dochq(char **, int); -extern void dochc(char **, int); -extern void dodiv(int); -extern void doundiv(char **, int); -extern void dosub(char **, int); -extern void map(char *, char *, char *, char *); -#ifdef EXTENDED -extern int dopaste(char *); -extern void 
m4trim(char **, int); -extern void dodefqt(char **, int); -extern void doqutr(char **, int); -#endif - -/* functions from expr.c */ - -extern long expr(char *); - -#else - -/* functions from misc.c */ - -extern char * malloc(); -extern char * strsave(); -extern int indx(); -extern void pbstr(); -extern void pbqtd(); -extern void pbnum(); -extern void pbrad(); -extern void getdiv(); -extern void killdiv(); -extern void error(); -extern int onintr(); -extern void usage(); - -/* functions from look.c */ - -extern ndptr lookup(); -extern ndptr addent(); -extern void remhash(); -extern void addkywd(); - -/* functions from int2str.c */ - -extern char* int2str(/* char*, int, long */); - -/* functions from serv.c */ - -extern void expand(); -extern void dodefine(); -extern void dopushdef(); -extern void dodefn(); -extern void dodump(); -extern void doifelse(); -extern int doincl(); -extern void dochq(); -extern void dochc(); -extern void dodiv(); -extern void doundiv(); -extern void dosub(); -extern void map(); -#ifdef EXTENDED -extern int dopaste(); -extern void m4trim(); -extern void dodefqt(); -extern void doqutr(); -#endif - -/* functions from expr.c */ - -extern long expr(); - -#endif -#endif //GO.SYSIN DD extr.h echo mdef.h 1>&2 sed 's/.//' >mdef.h <<'//GO.SYSIN DD mdef.h' -/* Header : mdef.h - Author : Ozan Yigit - Updated: 4 May 1992 -*/ -#ifndef MACRTYPE - -#ifndef unix -#define unix 0 -#endif - -#ifndef vms -#define vms 0 -#endif - -#include <stdio.h> -#include <signal.h> - -#ifdef __STDC__ -#include <string.h> -#else -#ifdef VOID -#define void int -#endif -extern int strlen(); -extern int strcmp(); -extern void memcpy(); -#endif - -/* m4 constants */ - -#define MACRTYPE 1 -#define DEFITYPE 2 -#define EXPRTYPE 3 -#define SUBSTYPE 4 -#define IFELTYPE 5 -#define LENGTYPE 6 -#define CHNQTYPE 7 -#define SYSCTYPE 8 -#define UNDFTYPE 9 -#define INCLTYPE 10 -#define SINCTYPE 11 -#define PASTTYPE 12 -#define SPASTYPE 13 -#define INCRTYPE 14 -#define IFDFTYPE 15 -#define 
PUSDTYPE 16 -#define POPDTYPE 17 -#define SHIFTYPE 18 -#define DECRTYPE 19 -#define DIVRTYPE 20 -#define UNDVTYPE 21 -#define DIVNTYPE 22 -#define MKTMTYPE 23 -#define ERRPTYPE 24 -#define M4WRTYPE 25 -#define TRNLTYPE 26 -#define DNLNTYPE 27 -#define DUMPTYPE 28 -#define CHNCTYPE 29 -#define INDXTYPE 30 -#define SYSVTYPE 31 -#define EXITTYPE 32 -#define DEFNTYPE 33 -#define LINETYPE 34 -#define TRIMTYPE 35 -#define TLITTYPE 36 -#define DEFQTYPE 37 /* defquote */ -#define QUTRTYPE 38 /* quoter thus defined */ - -#define STATIC 128 - -/* m4 special characters */ - -#define ARGFLAG '$' -#define LPAREN '(' -#define RPAREN ')' -#define LQUOTE '`' -#define RQUOTE '\'' -#define VQUOTE ('V'&(' '- 1)) -#define COMMA ',' -#define SCOMMT '#' -#define ECOMMT '\n' - -/* - * other important constants - */ - -#define EOS (char) 0 -#define MAXINP 10 /* maximum include files */ -#define MAXOUT 10 /* maximum # of diversions */ -#ifdef SMALL -#define MAXSTR 512 /* maximum size of string */ -#define BUFSIZE 4096 /* size of pushback buffer */ -#define STACKMAX 1024 /* size of call stack */ -#define STRSPMAX 4096 /* size of string space */ -#define HASHSIZE 199 /* maximum size of hashtab */ -#else -#define MAXSTR 1024 /* maximum size of string */ -#define BUFSIZE 8192 /* size of pushback buffer */ -#define STACKMAX 2048 /* size of call stack */ -#define STRSPMAX 8192 /* size of string space */ -#define HASHSIZE 509 /* maximum size of hashtab */ -#endif -#define MAXTOK MAXSTR /* maximum chars in a tokn */ - -#define ALL 1 -#define TOP 0 - -#define TRUE 1 -#define FALSE 0 - -/* m4 data structures */ - -typedef struct ndblock *ndptr; - -struct ndblock /* hashtable structure */ - { - char *name; /* entry name.. */ - char *defn; /* definition.. */ - int type; /* type of the entry.. */ - ndptr nxtptr; /* link to next entry.. 
*/ - }; - -#define nil ((ndptr) 0) - -typedef union /* stack structure */ - { int sfra; /* frame entry */ - char *sstr; /* string entry */ - } stae; - -/* - * macros for readability and/or speed - * - * gpbc() - get a possibly pushed-back character - * min() - select the minimum of two elements - * pushf() - push a call frame entry onto stack - * pushs() - push a string pointer onto stack - */ -#define gpbc() (bp == bb ? getc(infile[ilevel]) : *--bp) -#define min(x,y) (((x) > (y)) ? (y) : (x)) -#define pushf(x) if (sp < STACKMAX) mstack[++sp].sfra = (x) -#define pushs(x) if (sp < STACKMAX) mstack[++sp].sstr = (x) - -/* - * . . - * | . | <-- sp | . | - * +-------+ +-----+ - * | arg 3 ----------------------->| str | - * +-------+ | . | - * | arg 2 ---PREVEP-----+ . - * +-------+ | - * . | | | - * +-------+ | +-----+ - * | plev | PARLEV +-------->| str | - * +-------+ | . | - * | type | CALTYP . - * +-------+ - * | prcf ---PREVFP--+ - * +-------+ | - * | . | PREVSP | - * . | - * +-------+ | - * | <----------+ - * +-------+ - * - */ -#define PARLEV (mstack[fp].sfra) -#define CALTYP (mstack[fp-1].sfra) -#define PREVEP (mstack[fp+3].sstr) -#define PREVSP (fp-3) -#define PREVFP (mstack[fp-2].sfra) - -#endif //GO.SYSIN DD mdef.h echo ourlims.h 1>&2 sed 's/.//' >ourlims.h <<'//GO.SYSIN DD ourlims.h' -/* File : ourlims.h - Author : Richard A. O'Keefe - Defines: UCHAR_MAX, CHAR_BIT, LONG_BIT -*/ -/* If <limits.h> is available, use that. - Otherwise, use 8-bit byte as the default. - If the number of characters is a power of 2, you might be able - to use (unsigned char)(~0), but why get fancy? 
-*/ -#ifdef __STDC__ -#include <limits.h> -#else -#define UCHAR_MAX 255 -#define CHAR_BIT 8 -#endif -#define LONG_BIT 32 //GO.SYSIN DD ourlims.h echo m4.1 1>&2 sed 's/.//' >m4.1 <<'//GO.SYSIN DD m4.1' -.\" -.\" @(#) $Id$ -.\" -.Dd January 26, 1993 -.Dt m4 1 -.Os -.Sh NAME -.Nm m4 -.Nd macro language processor -.Sh SYNOPSIS -.Nm m4 -.Oo -.Fl D Ns Ar name Ns Op Ar =value -.Oc -.Op Fl U Ns Ar name -.Sh DESCRIPTION -The -.Nm m4 -utility is a macro processor that can be used as a front end to any -language (e.g., C, ratfor, fortran, lex, and yacc). -.Nm m4 -reads from the standard input and writes -the processed text to the standard output. -.Pp -Macro calls have the form name(argument1[, argument2, ...,] argumentN). -.Pp -There cannot be any space between the macro name and the open -parenthesis '('. If the macro name is not followed by an open -parenthesis it is processed with no arguments. -.Pp -Macro names consist of a leading alphabetic or underscore -possibly followed by alphanumeric or underscore characters, so -valid macro names match the pattern [a-zA-Z_][a-zA-Z0-9_]*. -.Pp -In arguments to macros, leading unquoted space, tab and newline -characters are ignored. To quote strings use left and right single -quotes (e.g., ` this is a string with a leading space'). You can change -the quote characters with the changequote built-in macro. -.Pp -The options are as follows: -.Bl -tag -width "-Dname[=value]xxx" -.It Fl D Ns Ar name Ns Oo -.Ar =value -.Oc -Define the symbol -.Ar name -to have some value (or NULL). -.It Fl "U" Ns Ar "name" -Undefine the symbol -.Ar name . -.El -.Sh SYNTAX -.Nm m4 -provides the following built-in macros. They may be -redefined, losing their original meaning. -Return values are NULL unless otherwise stated. -.Bl -tag -width changequotexxx -.It changecom -Change the start and end comment sequences. The default is -the pound sign `#' and the newline character. With no arguments -comments are turned off. 
The maximum length for a comment marker is -five characters. -.It changequote -Defines the quote symbols to be the first and second arguments. -The symbols may be up to five characters long. If no arguments are -given it restores the default open and close single quotes. -.It decr -Decrements the argument by 1. The argument must be a valid numeric string. -.It define -Define a new macro named by the first argument to have the -value of the second argument. Each occurrence of $n (where n -is 0 through 9) is replaced by the n'th argument. $0 is the name -of the calling macro. Undefined arguments are replaced by a -NULL string. $# is replaced by the number of arguments; $* -is replaced by all arguments comma separated; $@ is the same -as $* but all arguments are quoted against further expansion. -.It defn -Returns the quoted definition for each argument. This can be used to rename -macro definitions (even for built-in macros). -.It divert -There are 10 output queues (numbered 0-9). -At the end of processing -.Nm m4 -concatenates all the queues in numerical order to produce the -final output. Initially the output queue is 0. The divert -macro allows you to select a new output queue (an invalid argument -passed to divert causes output to be discarded). -.It divnum -Returns the current output queue number. -.It dnl -Discard input characters up to and including the next newline. -.It dumpdef -Prints the names and definitions for the named items, or for everything -if no arguments are passed. -.It errprint -Prints the first argument on the standard error output stream. -.It eval -Computes the first argument as an arithmetic expression using 32-bit -arithmetic. Operators are the standard C ternary, arithmetic, logical, -shift, relational, bitwise, and parentheses operators. You can specify -octal, decimal, and hexadecimal numbers as in C. 
The second argument (if -any) specifies the radix for the result and the third argument (if -any) specifies the minimum number of digits in the result. -.It expr -This is an alias for eval. -.It ifdef -If the macro named by the first argument is defined then return the second -argument, otherwise the third. If there is no third argument, -the value is NULL. The word `unix' is predefined. -.It ifelse -If the first argument matches the second argument then ifelse returns -the third argument. If the match fails the three arguments are -discarded and the next three arguments are used, and so on until -zero or one argument is left; that last argument, or NULL if there -is none, is returned when no match was found. -.It include -Returns the contents of the file specified in the first argument. -Include aborts with an error message if the file cannot be included. -.It incr -Increments the argument by 1. The argument must be a valid numeric string. -.It index -Returns the index of the second argument in the first argument (e.g., -index(the quick brown fox jumped, fox) returns 16). If the second -argument is not found index returns -1. -.It len -Returns the number of characters in the first argument. Extra arguments -are ignored. -.It m4exit -Immediately exits with the return value specified by the first argument, -0 if none. -.It m4wrap -Allows you to define what happens at the final EOF, usually for cleanup -purposes (e.g., m4wrap("cleanup(tempfile)") causes the macro cleanup to be -invoked after all other processing is done). -.It maketemp -Replaces the string XXXXX in the first argument with the current process -ID, leaving other characters alone. This can be used to create unique -temporary file names. -.It paste -Includes the contents of the file specified by the first argument without -any macro processing. Aborts with an error message if the file cannot be -included. -.It popdef -Restores the pushdef'ed definition for each argument. 
-.It pushdef -Takes the same arguments as define, but it saves the definition on a -stack for later retrieval by popdef. -.It shift -Returns all but the first argument; the remaining arguments are -quoted and pushed back with commas in between. The quoting -nullifies the effect of the extra scan that will subsequently be -performed. -.It sinclude -Similar to include, except it ignores any errors. -.It spaste -Similar to paste, except it ignores any errors. -.It substr -Returns a substring of the first argument starting at the offset specified -by the second argument and with the length specified by the third argument. -If no third argument is present it returns the rest of the string. -.It syscmd -Passes the first argument to the shell. Nothing is returned. -.It sysval -Returns the return value from the last syscmd. -.It translit -Transliterate the characters in the first argument from the set -given by the second argument to the set given by the third. You cannot -use -.Xr tr 1 -style abbreviations. -.It undefine -Removes the definition for the macro specified by the first argument. -.It undivert -Flushes the named output queues (or all queues if no arguments). -.It unix -A pre-defined macro for testing the OS platform. -.El -.Sh AUTHOR -Ozan Yigit <oz@sis.yorku.ca> and Richard A. O'Keefe <ok@goanna.cs.rmit.OZ.AU> //GO.SYSIN DD m4.1
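The manual page above says that index returns the zero-based position of the second argument within the first, or -1 when it does not occur. The following is a minimal sketch of that behaviour in C, for illustration only. The name m4_index is hypothetical and is not part of the posted sources; the archive's own indx() in misc.c is not reproduced here and may be written differently.

```c
#include <stddef.h>
#include <string.h>

/*
 * m4_index (hypothetical name, not from the archive):
 * return the zero-based offset of the first occurrence of `pattern`
 * in `text`, or -1 if `pattern` does not occur, which is the result
 * the manual page documents for the index built-in.
 */
long m4_index(const char *text, const char *pattern)
{
    const char *hit = strstr(text, pattern);   /* NULL when not found */

    return hit == NULL ? -1L : (long)(hit - text);
}
```

With this sketch, m4_index("the quick brown fox jumped", "fox") gives 16, the value quoted in the manual page's own example, and a pattern that never occurs gives -1.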