The most significant new feature
is that the gawk internals have been
did not integrate his changes, because
they were major, and I wanted to understand them better.
Bad move: John disappeared in early
2004, and the code languished, unused.
Finally, in fall 2009, I got a volunteer
(Stephen Davies) to start bringing the
last version of the byte-code gawk that
I had into the present. He and I had
things working, pretty much, and I even
announced a test release to the world.
Again, out of the blue, John resurfaced in early 2010 and joined the
effort to make the byte-code gawk
viable. This moved things into high
gear, and we made a lot of progress.
As I write this, the byte-code version
has been merged with my “new
features” branch of the code. This is
the basis for gawk 4.0.
If you don’t yet have gawk 4.0, see
Resources for information on where to
download the source and how to build
it; building from source is very easy.
New Stuff in gawk 4.0
With all the background out of the way,
let’s look at the cool stuff. Due to space
considerations, this is just a quick tour;
96 | SEPTEMBER 2011 WWW.LINUXJOURNAL.COM
see the documentation (listed in
Resources) for details.
The most significant new feature is that
the gawk internals have been completely
redone. The parser now builds a linked
list of “instructions”. Each instruction
contains a code indicating what it is and
a few members with needed information, such as the next instruction and
which instruction to jump to if a jump is
needed. This list then is interpreted for
each record by a big switch statement
running inside a for loop that traverses
the list. Data for operations are pushed
and popped off a runtime stack.
This implementation performs no
worse than the original recursive evaluator, and in many cases, it performs
better. But what’s really cool is that
John added an awk-level debugger!
Since 1978 when awk was first introduced into the world, the only debugging tool was the print statement.
Now, gawk has a full debugger, with
breakpoints, watchpoints, stepping by
statement or instruction, the ability to
step into and out of functions, and