path: root/contrib/one-true-awk/FIXES
diff options
Diffstat (limited to 'contrib/one-true-awk/FIXES')
1 files changed, 703 insertions, 0 deletions
diff --git a/contrib/one-true-awk/FIXES b/contrib/one-true-awk/FIXES
new file mode 100644
index 000000000000..236323fcdbcd
--- /dev/null
+++ b/contrib/one-true-awk/FIXES
@@ -0,0 +1,703 @@
+Copyright (C) Lucent Technologies 1997
+All Rights Reserved
+Permission to use, copy, modify, and distribute this software and
+its documentation for any purpose and without fee is hereby
+granted, provided that the above copyright notice appear in all
+copies and that both that the copyright notice and this
+permission notice and warranty disclaimer appear in supporting
+documentation, and that the name Lucent Technologies or any of
+its entities not be used in advertising or publicity pertaining
+to distribution of the software without specific, written prior
+This file lists all bug fixes, changes, etc., made since the AWK book
+was sent to the printers in August, 1987.
+Nov 15, 2000:
+ fixed a bug introduced in august 1997 that caused expressions
+ like $f[1] to be syntax errors. thanks to arnold robbins for
+ noticing this and providing a fix.
+Oct 30, 2000:
+ fixed some nextfile bugs: not handling all cases. thanks to
+ arnold robbins for pointing this out. new regressions added.
+ close() is now a function. it returns whatever the library
+ fclose returns, and -1 for closing a file or pipe that wasn't
+ opened.
+Sep 24, 2000:
+ permit \n explicitly in character classes; won't work right
+ if comes in as "[\n]" but ok as /[\n]/, because of multiple
+ processing of \'s. thanks to arnold robbins.
+July 5, 2000:
+ minor fiddles in tran.c to keep compilers happy about uschar.
+ thanks to norman wilson.
+May 25, 2000:
+ yet another attempt at making 8-bit input work, with another
+ band-aid in b.c (member()), and some (uschar) casts to head
+ off potential errors in subscripts (like isdigit). also
+ changed HAT to NCHARS-2. thanks again to santiago vila.
+ changed maketab.c to ignore apparently out of range definitions
+ instead of halting; new freeBSD generates one. thanks to
+ jon snader <jsnader@ix.netcom.com> for pointing out the problem.
+May 2, 2000:
+ fixed an 8-bit problem in b.c by making several char*'s into
+ unsigned char*'s. not clear i have them all yet. thanks to
+ Santiago Vila <sanvila@unex.es> for the bug report.
+Apr 21, 2000:
+ finally found and fixed a memory leak in function call; it's
+ been there since functions were added ~1983. thanks to
+ jon bentley for the test case that found it.
+ added test in envinit to catch environment "variables" with
+ names begining with '='; thanks to Berend Hasselman.
+Jul 28, 1999:
+ added test in defn() to catch function foo(foo), which
+ otherwise recurses until core dump. thanks to arnold
+ robbins for noticing this.
+Jun 20, 1999:
+ added *bp in gettok in lex.c; appears possible to exit function
+ without terminating the string. thanks to russ cox.
+Jun 2, 1999:
+ added function stdinit() to run to initialize files[] array,
+ in case stdin, etc., are not constants; some compilers care.
+May 10, 1999:
+ replaced the ERROR ... FATAL, etc., macros with functions
+ based on vprintf, to avoid problems caused by overrunning
+ fixed-size errbuf array. thanks to ralph corderoy for the
+ impetus, and for pointing out a string termination bug in
+ qstring as well.
+Apr 21, 1999:
+ fixed bug that caused occasional core dumps with commandline
+ variable with value ending in \. (thanks to nelson beebe for
+ the test case.)
+Apr 16, 1999:
+ with code kindly provided by Bruce Lilly, awk now parses
+ /=/ and similar constructs more sensibly in more places.
+ Bruce also provided some helpful test cases.
+Apr 5, 1999:
+ changed true/false to True/False in run.c to make it
+ easier to compile with C++. Added some casts on malloc
+ and realloc to be honest about casts; ditto. changed
+ ltype int to long in struct rrow to reduce some 64-bit
+ complaints; other changes scattered throughout for the
+ same purpose. thanks to Nelson Beebe for these portability
+ improvements.
+ removed some horrible pointer-int casting in b.c and elsewhere
+ by adding ptoi and itonp to localize the casts, which are
+ all benign. fixed one incipient bug that showed up on sgi
+ in 64-bit mode.
+ reset lineno for new source file; include filename in error
+ message. also fixed line number error in continuation lines.
+ (thanks to Nelson Beebe for both of these.)
+Mar 24, 1999:
+ Nelson Beebe notes that irix 5.3 yacc dies with a bogus
+ error; use a newer version or switch to bison, since sgi
+ is unlikely to fix it.
+Mar 5, 1999:
+ changed isnumber to is_number to avoid the problem caused by
+ versions of ctype.h that include the name isnumber.
+ distribution now includes a script for building on a Mac,
+ thanks to Dan Allen.
+Feb 20, 1999:
+ fixed memory leaks in run.c (call) and tran.c (setfval).
+ thanks to Stephen Nutt for finding these and providing the fixes.
+Jan 13, 1999:
+ replaced srand argument by (unsigned int) in run.c;
+ avoids problem on Mac and potentially on Unix & Windows.
+ thanks to Dan Allen.
+ added a few (int) casts to silence useless compiler warnings.
+ e.g., errorflag= in run.c jump().
+ added proctab.c to the bundle outout; one less thing
+ to have to compile out of the box.
+ added calls to _popen and _pclose to the win95 stub for
+ pipes (thanks to Steve Adams for this helpful suggestion).
+ seems to work, though properties are not well understood
+ by me, and it appears that under some circumstances the
+ pipe output is truncated. Be careful.
+Oct 19, 1998:
+ fixed a couple of bugs in getrec: could fail to update $0
+ after a getline var; because inputFS wasn't initialized,
+ could split $0 on every character, a misleading diversion.
+ fixed caching bug in makedfa: LRU was actually removing
+ least often used.
+ thanks to ross ridge for finding these, and for providing
+ great bug reports.
+May 12, 1998:
+ fixed potential bug in readrec: might fail to update record
+ pointer after growing. thanks to dan levy for spotting this
+ and suggesting the fix.
+Mar 12, 1998:
+ added -V to print version number and die.
+Feb 11, 1998:
+ subtle silent bug in lex.c: if the program ended with a number
+ longer than 1 digit, part of the input would be pushed back and
+ parsed again because token buffer wasn't terminated right.
+ example: awk 'length($0) > 10'. blush. at least i found it
+ myself.
+Aug 31, 1997:
+ s/adelete/awkdelete/: SGI uses this in malloc.h.
+ thanks to nelson beebe for pointing this one out.
+Aug 21, 1997:
+ fixed some bugs in sub and gsub when replacement includes \\.
+ this is a dark, horrible corner, but at least now i believe that
+ the behavior is the same as gawk and the intended posix standard.
+ thanks to arnold robbins for advice here.
+Aug 9, 1997:
+ somewhat regretfully, replaced the ancient lex-based lexical
+ analyzer with one written in C. it's longer, generates less code,
+ and more portable; the old one depended too much on mysterious
+ properties of lex that were not preserved in other environments.
+ in theory these recognize the same language.
+ now using strtod to test whether a string is a number, instead of
+ the convoluted original function. should be more portable and
+ reliable if strtod is implemented right.
+ removed now-pointless optimization in makefile that tries to avoid
+ recompilation when awkgram.y is changed but symbols are not.
+ removed most fixed-size arrays, though a handful remain, some
+ of which are unchecked. you have been warned.
+Aug 4, 1997:
+ with some trepidation, replaced the ancient code that managed
+ fields and $0 in fixed-size arrays with arrays that grow on
+ demand. there is still some tension between trying to make this
+ run fast and making it clean; not sure it's right yet.
+ the ill-conceived -mr and -mf arguments are now useful only
+ for debugging. previous dynamic string code removed.
+ numerous other minor cleanups along the way.
+Jul 30, 1997:
+ using code provided by dan levy (to whom profuse thanks), replaced
+ fixed-size arrays and awkward kludges by a fairly uniform mechanism
+ to grow arrays as needed for printf, sub, gsub, etc.
+Jul 23, 1997:
+ falling off the end of a function returns "" and 0, not 0.
+ thanks to arnold robbins.
+Jun 17, 1997:
+ replaced several fixed-size arrays by dynamically-created ones
+ in run.c; added overflow tests to some previously unchecked cases.
+ getline, toupper, tolower.
+ getline code is still broken in that recursive calls may wind
+ up using the same space. [fixed later]
+ increased RECSIZE to 8192 to push problems further over the horizon.
+ added \r to \n as input line separator for programs, not data.
+ damn CRLFs.
+ modified format() to permit explicit printf("%c", 0) to include
+ a null byte in output. thanks to ken stailey for the fix.
+ added a "-safe" argument that disables file output (print >,
+ print >>), process creation (cmd|getline, print |, system), and
+ access to the environment (ENVIRON). this is a first approximation
+ to a "safe" version of awk, but don't rely on it too much. thanks
+ to joan feigenbaum and matt blaze for the inspiration long ago.
+Jul 8, 1996:
+ fixed long-standing bug in sub, gsub(/a/, "\\\\&"); thanks to
+ ralph corderoy.
+Jun 29, 1996:
+ fixed awful bug in new field splitting; didn't get all the places
+ where input was done.
+Jun 28, 1996:
+ changed field-splitting to conform to posix definition: fields are
+ split using the value of FS at the time of input; it used to be
+ the value when the field or NF was first referred to, a much less
+ predictable definition. thanks to arnold robbins for encouragement
+ to do the right thing.
+May 28, 1996:
+ fixed appalling but apparently unimportant bug in parsing octal
+ numbers in reg exprs.
+ explicit hex in reg exprs now limited to 2 chars: \xa, \xaa.
+May 27, 1996:
+ cleaned up some declarations so gcc -Wall is now almost silent.
+ makefile now includes backup copies of ytab.c and lexyy.c in case
+ one makes before looking; it also avoids recreating lexyy.c unless
+ really needed.
+ s/aprintf/awkprint, s/asprintf/awksprintf/ to avoid some name clashes
+ with unwisely-written header files.
+ thanks to jeffrey friedl for several of these.
+May 26, 1996:
+ an attempt to rationalize the (unsigned) char issue. almost all
+ instances of unsigned char have been removed; the handful of places
+ in b.c where chars are used as table indices have been hand-crafted.
+ added some latin-1 tests to the regression, but i'm not confident;
+ none of my compilers seem to care much. thanks to nelson beebe for
+ pointing out some others that do care.
+May 2, 1996:
+ removed all register declarations.
+ enhanced split(), as in gawk, etc: split(s, a, "") splits s into
+ a[1]...a[length(s)] with each character a single element.
+ made the same changes for field-splitting if FS is "".
+ added nextfile, as in gawk: causes immediate advance to next
+ input file. (thanks to arnold robbins for inspiration and code).
+ small fixes to regexpr code: can now handle []], [[], and
+ variants; [] is now a syntax error, rather than matching
+ everything; [z-a] is now empty, not z. far from complete
+ or correct, however. (thanks to jeffrey friedl for pointing out
+ some awful behaviors.)
+Apr 29, 1996:
+ replaced uchar by uschar everwhere; apparently some compilers
+ usurp this name and this causes conflicts.
+ fixed call to time in run.c (bltin); arg is time_t *.
+ replaced horrible pointer/long punning in b.c by a legitimate
+ union. should be safer on 64-bit machines and cleaner everywhere.
+ (thanks to nelson beebe for pointing out some of these problems.)
+ replaced nested comments by #if 0...#endif in run.c, lib.c.
+ removed getsval, setsval, execute macros from run.c and lib.c.
+ machines are 100x faster than they were when these macros were
+ first used.
+ revised filenames: awk.g.y => awkgram.y, awk.lx.l => awklex.l,
+ y.tab.[ch] => ytab.[ch], lex.yy.c => lexyy.c, all in the aid of
+ portability to nameless systems.
+ "make bundle" now includes yacc and lex output files for recipients
+ who don't have yacc or lex.
+Aug 15, 1995:
+ initialized Cells in setsymtab more carefully; some fields
+ were not set. (thanks to purify, all of whose complaints i
+ think i now understand.)
+ fixed at least one error in gsub that looked at -1-th element
+ of an array when substituting for a null match (e.g., $).
+ delete arrayname is now legal; it clears the elements but leaves
+ the array, which may not be the right behavior.
+ modified makefile: my current make can't cope with the test used
+ to avoid unnecessary yacc invocations.
+Jul 17, 1995:
+ added dynamically growing strings to awk.lx.l and b.c
+ to permit regular expressions to be much bigger.
+ the state arrays can still overflow.
+Aug 24, 1994:
+ detect duplicate arguments in function definitions (mdm).
+May 11, 1994:
+ trivial fix to printf to limit string size in sub().
+Apr 22, 1994:
+ fixed yet another subtle self-assignment problem:
+ $1 = $2; $1 = $1 clobbered $1.
+ Regression tests now use private echo, to avoid quoting problems.
+Feb 2, 1994:
+ changed error() to print line number as %d, not %g.
+Jul 23, 1993:
+ cosmetic changes: increased sizes of some arrays,
+ reworded some error messages.
+ added CONVFMT as in posix (just replaced OFMT in getsval)
+ FILENAME is now "" until the first thing that causes a file
+ to be opened.
+Nov 28, 1992:
+ deleted yyunput and yyoutput from proto.h;
+ different versions of lex give these different declarations.
+May 31, 1992:
+ added -mr N and -mf N options: more record and fields.
+ these really ought to adjust automatically.
+ cleaned up some error messages; "out of space" now means
+ malloc returned NULL in all cases.
+ changed rehash so that if it runs out, it just returns;
+ things will continue to run slow, but maybe a bit longer.
+Apr 24, 1992:
+ remove redundant close of stdin when using -f -.
+ got rid of core dump with -d; awk -d just prints date.
+Apr 12, 1992:
+ added explicit check for /dev/std(in,out,err) in redirection.
+ unlike gawk, no /dev/fd/n yet.
+ added (file/pipe) builtin. hard to test satisfactorily.
+ not posix.
+Feb 20, 1992:
+ recompile after abortive changes; should be unchanged.
+Dec 2, 1991:
+ die-casting time: converted to ansi C, installed that.
+Nov 30, 1991:
+ fixed storage leak in freefa, failing to recover [N]CCL.
+ thanks to Bill Jones (jones@cs.usask.ca)
+Nov 19, 1991:
+ use RAND_MAX instead of literal in builtin().
+Nov 12, 1991:
+ cranked up some fixed-size arrays in b.c, and added a test for
+ overflow in penter. thanks to mark larsen.
+Sep 24, 1991:
+ increased buffer in gsub. a very crude fix to a general problem.
+ and again on Sep 26.
+Aug 18, 1991:
+ enforce variable name syntax for commandline variables: has to
+ start with letter or _.
+Jul 27, 1991:
+ allow newline after ; in for statements.
+Jul 21, 1991:
+ fixed so that in self-assignment like $1=$1, side effects
+ like recomputing $0 take place. (this is getting subtle.)
+Jun 30, 1991:
+ better test for detecting too-long output record.
+Jun 2, 1991:
+ better defense against very long printf strings.
+ made break and continue illegal outside of loops.
+May 13, 1991:
+ removed extra arg on gettemp, tempfree. minor error message rewording.
+May 6, 1991:
+ fixed silly bug in hex parsing in hexstr().
+ removed an apparently unnecessary test in isnumber().
+ warn about weird printf conversions.
+ fixed unchecked array overwrite in relex().
+ changed for (i in array) to access elements in sorted order.
+ then unchanged it -- it really does run slower in too many cases.
+ left the code in place, commented out.
+Feb 10, 1991:
+ check error status on all writes, to avoid banging on full disks.
+Jan 28, 1991:
+ awk -f - reads the program from stdin.
+Jan 11, 1991:
+ failed to set numeric state on $0 in cmd|getline context in run.c.
+Nov 2, 1990:
+ fixed sleazy test for integrality in getsval; use modf.
+Oct 29, 1990:
+ fixed sleazy buggy code in lib.c that looked (incorrectly) for
+ too long input lines.
+Oct 14, 1990:
+ fixed the bug on p. 198 in which it couldn't deduce that an
+ argument was an array in some contexts. replaced the error
+ message in intest() by code that damn well makes it an array.
+Oct 8, 1990:
+ fixed horrible bug: types and values were not preserved in
+ some kinds of self-assignment. (in assign().)
+Aug 24, 1990:
+ changed NCHARS to 256 to handle 8-bit characters in strings
+ presented to match(), etc.
+Jun 26, 1990:
+ changed struct rrow (awk.h) to use long instead of int for lval,
+ since cfoll() stores a pointer in it. now works better when int's
+ are smaller than pointers!
+May 6, 1990:
+ AVA fixed the grammar so that ! is uniformly of the same precedence as
+ unary + and -. This renders illegal some constructs like !x=y, which
+ now has to be parenthesized as !(x=y), and makes others work properly:
+ !x+y is (!x)+y, and x!y is x !y, not two pattern-action statements.
+ (These problems were pointed out by Bob Lenk of Posix.)
+ Added \x to regular expressions (already in strings).
+ Limited octal to octal digits; \8 and \9 are not octal.
+ Centralized the code for parsing escapes in regular expressions.
+ Added a bunch of tests to T.re and T.sub to verify some of this.
+Feb 9, 1990:
+ fixed null pointer dereference bug in main.c: -F[nothing]. sigh.
+ restored srand behavior: it returns the current seed.
+Jan 18, 1990:
+ srand now returns previous seed value (0 to start).
+Jan 5, 1990:
+ fix potential problem in tran.c -- something was freed,
+ then used in freesymtab.
+Oct 18, 1989:
+ another try to get the max number of open files set with
+ relatively machine-independent code.
+ small fix to input() in case of multiple reads after EOF.
+Oct 11, 1989:
+ FILENAME is now defined in the BEGIN block -- too many old
+ programs broke.
+ "-" means stdin in getline as well as on the commandline.
+ added a bunch of casts to the code to tell the truth about
+ char * vs. unsigned char *, a right royal pain. added a
+ setlocale call to the front of main, though probably no one
+ has it usefully implemented yet.
+Aug 24, 1989:
+ removed redundant relational tests against nullnode if parse
+ tree already had a relational at that point.
+Aug 11, 1989:
+ fixed bug: commandline variable assignment has to look like
+ var=something. (consider the man page for =, in file =.1)
+ changed number of arguments to functions to static arrays
+ to avoid repeated malloc calls.
+Aug 2, 1989:
+ restored -F (space) separator
+Jul 30, 1989:
+ added -v x=1 y=2 ... for immediate commandline variable assignment;
+ done before the BEGIN block for sure. they have to precede the
+ program if the program is on the commandline.
+ Modified Aug 2 to require a separate -v for each assignment.
+Jul 10, 1989:
+ fixed ref-thru-zero bug in environment code in tran.c
+Jun 23, 1989:
+ add newline to usage message.
+Jun 14, 1989:
+ added some missing ansi printf conversion letters: %i %X %E %G.
+ no sensible meaning for h or L, so they may not do what one expects.
+ made %* conversions work.
+ changed x^y so that if n is a positive integer, it's done
+ by explicit multiplication, thus achieving maximum accuracy.
+ (this should be done by pow() but it seems not to be locally.)
+ done to x ^= y as well.
+Jun 4, 1989:
+ ENVIRON array contains environment: if shell variable V=thing,
+ ENVIRON["V"] is "thing"
+ multiple -f arguments permitted. error reporting is naive.
+ (they were permitted before, but only the last was used.)
+ fixed a really stupid botch in the debugging macro dprintf
+ fixed order of evaluation of commandline assignments to match
+ what the book claims: an argument of the form x=e is evaluated
+ at the time it would have been opened if it were a filename (p 63).
+ this invalidates the suggested answer to ex 4-1 (p 195).
+ removed some code that permitted -F (space) fieldseparator,
+ since it didn't quite work right anyway. (restored aug 2)
+Apr 27, 1989:
+ Line number now accumulated correctly for comment lines.
+Apr 26, 1989:
+ Debugging output now includes a version date,
+ if one compiles it into the source each time.
+Apr 9, 1989:
+ Changed grammar to prohibit constants as 3rd arg of sub and gsub;
+ prevents class of overwriting-a-constant errors. (Last one?)
+ This invalidates the "banana" example on page 43 of the book.
+ Added \a ("alert"), \v (vertical tab), \xhhh (hexadecimal),
+ as in ANSI, for strings. Rescinded the sloppiness that permitted
+ non-octal digits in \ooo. Warning: not all compilers and libraries
+ will be able to deal with \x correctly.
+Jan 9, 1989:
+ Fixed bug that caused tempcell list to contain a duplicate.
+ The fix is kludgy.
+Dec 17, 1988:
+ Catches some more commandline errors in main.
+ Removed redundant decl of modf in run.c (confuses some compilers).
+ Warning: there's no single declaration of malloc, etc., in awk.h
+ that seems to satisfy all compilers.
+Dec 7, 1988:
+ Added a bit of code to error printing to avoid printing nulls.
+ (Not clear that it actually would.)
+Nov 27, 1988:
+ With fear and trembling, modified the grammar to permit
+ multiple pattern-action statements on one line without
+ an explicit separator. By definition, this capitulation
+ to the ghost of ancient implementations remains undefined
+ and thus subject to change without notice or apology.
+Oct 30, 1988:
+ Fixed bug in call() that failed to recover storage.
+ A warning is now generated if there are more arguments
+ in the call than in the definition (in lieu of fixing
+ another storage leak).
+Oct 20, 1988:
+ Fixed %c: if expr is numeric, use numeric value;
+ otherwise print 1st char of string value. still
+ doesn't work if the value is 0 -- won't print \0.
+ Added a few more checks for running out of malloc.
+Oct 12, 1988:
+ Fixed bug in call() that freed local arrays twice.
+ Fixed to handle deletion of non-existent array right;
+ complains about attempt to delete non-array element.
+Sep 30, 1988:
+ Now guarantees to evaluate all arguments of built-in
+ functions, as in C; the appearance is that arguments
+ are evaluated before the function is called. Places
+ affected are sub (gsub was ok), substr, printf, and
+ all the built-in arithmetic functions in bltin().
+ A warning is generated if a bltin() is called with
+ the wrong number of arguments.
+ This requires changing makeprof on p167 of the book.
+Aug 23, 1988:
+ setting FILENAME in BEGIN caused core dump, apparently
+ because it was freeing space not allocated by malloc.
+July 24, 1988:
+ fixed egregious error in toupper/tolower functions.
+ still subject to rescinding, however.
+July 2, 1988:
+ flush stdout before opening file or pipe
+July 2, 1988:
+ performance bug in b.c/cgoto(): not freeing some sets of states.
+ partial fix only right now, and the number of states increased
+ to make it less obvious.
+June 1, 1988:
+ check error status on close
+May 28, 1988:
+ srand returns seed value it's using.
+ see 1/18/90
+May 22, 1988:
+ Removed limit on depth of function calls.
+May 10, 1988:
+ Fixed lib.c to permit _ in commandline variable names.
+Mar 25, 1988:
+ main.c fixed to recognize -- as terminator of command-
+ line options. Illegal options flagged.
+ Error reporting slightly cleaned up.
+Dec 2, 1987:
+ Newer C compilers apply a strict scope rule to extern
+ declarations within functions. Two extern declarations in
+ lib.c and tran.c have been moved to obviate this problem.
+Oct xx, 1987:
+ Reluctantly added toupper and tolower functions.
+ Subject to rescinding without notice.
+Sep 17, 1987:
+ Error-message printer had printf(s) instead of
+ printf("%s",s); got core dumps when the message
+ included a %.
+Sep 12, 1987:
+ Very long printf strings caused core dump;
+ fixed aprintf, asprintf, format to catch them.
+ Can still get a core dump in printf itself.