awk for newLisp
newLisp is the better perl ;-)
And... what about the better awk? — Here is the one!
later addition
Huh... :-)
If you feel nostalgic for awk, this page shows how it can be done in newLisp.
Nevertheless, if you try it, you'll find that it's not as useful as expected, because the Lisp style is quite different. For example, it is usually more powerful (and simpler) to parse the input into a data structure than to crunch it with tons of regexp tests.
In real life I now use a (dofile) iterator, with or without (rawk). Both are now part of the funlib.lsp library.
I leave the awk.lsp file for download here only for your reference. It isn't maintained, and the parts of it that went into funlib.lsp are slightly different there.
Introduction
awk.lsp introduces the AWK context, which is built around two main functions:
(awk...) processes character streams in the awk manner.
(rawk...) implements awk-style "/regexp/{code}" sequences.
Documentation
(awk str (ini-seq) (match-seq) (fin-seq))
If str is not a string, it is evaluated first.
Next, (ini-seq) is evaluated. You may change RS or FS here or do some custom initialization.
str is then split according to the regexp delimiter RS.
For each substring of str (placed in S in turn):
- S is split into a list of fields F according to the regexp delimiter FS
- then (match-seq) is evaluated
Finally, (fin-seq) is evaluated.
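The flow above can be imitated with newLisp built-ins alone. The following is only a rough sketch, not the code from awk.lsp: the helper name awk-sketch, the default values of RS and FS, and the stand-ins for (match-seq) and (fin-seq) are assumptions made for illustration.

```newlisp
;; Sketch only: mimics the documented record/field loop with built-ins.
;; RS, FS, S and F follow the names used above; their default values
;; and the name awk-sketch are assumptions.
(set 'RS "\n" 'FS "[ \t]+")

(define (awk-sketch str)
  (let (firsts '())
    ;; split the input into records according to the regexp delimiter RS
    (dolist (S (parse str RS 0))
      ;; split each record into the field list F according to FS
      (let (F (parse S FS 0))
        ;; stand-in for (match-seq): collect the first field of each record
        (push (F 0) firsts -1)))
    ;; stand-in for (fin-seq): return what was collected
    firsts))

(awk-sketch "alpha 1\nbeta 2\ngamma 3")
;; → ("alpha" "beta" "gamma")
```

In this sketch the fields are indexed from 0, as in any newLisp list; how awk.lsp itself numbers the fields in F is not stated here.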
(rawk str (str-pattern body) (reg-pattern body) ...)
reg-pattern has the form (str-pattern int-option), which is the way to specify the int-option for the regexp.
If str is not a string, it is evaluated first.
str is then tested sequentially against each str-pattern or reg-pattern.
If a pattern matches, the corresponding body is evaluated.
This is repeated for all patterns regardless of previous matches (as in original awk).
To stop processing the string (like next in awk), (catch)/(throw) must be used.
When evaluating body:
- RLIST is filled with the regexp result
- F and S from (awk ...) are also available (if called from (awk), of course).
Returns the value of the last body evaluated, or nil.
(rawk) does not use any internal context variables, so it can be used on its own in any application.
Using (rawk) in the (match-seq) of (awk) gives awk-style text processing.
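To make the matching loop concrete, here is a minimal sketch built on the built-in (regex). It is not the real (rawk): the name rawk-sketch and its clause format, a list of (pattern function) pairs instead of unevaluated bodies, are hypothetical and chosen only to keep the example short.

```newlisp
;; Sketch only: every pattern is tested, regardless of earlier matches.
;; rawk-sketch and the (pattern function) clause format are hypothetical.
(define (rawk-sketch str clauses)
  (let (result nil)
    (dolist (clause clauses)
      ;; (regex) returns nil, or a list of matched text, offset and length
      (let (RLIST (regex (clause 0) str))
        (if RLIST
            (set 'result (apply (clause 1) (list RLIST))))))
    result))

;; both patterns match, so both bodies run
(set 'hits '())
(rawk-sketch "error: disk full"
  (list (list "^error" (fn (r) (push (r 0) hits -1)))
        (list "disk"   (fn (r) (push (r 0) hits -1)))))
;; hits is now ("error" "disk")

;; emulating awk's next: a body throws, the enclosing (catch) ends the loop
(set 'hits '())
(catch
  (rawk-sketch "error: disk full"
    (list (list "^error" (fn (r) (push (r 0) hits -1) (throw 'next)))
          (list "disk"   (fn (r) (push (r 0) hits -1))))))
;; hits is now ("error")
```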
(rcase str (str-pattern body) (reg-pattern body) ...)
Usage and the meaning of the arguments are the same as for (rawk).
The difference from (rawk) is that pattern matching stops at the first match.
Warning: (catch)/(throw) are used internally to break the loop.
Returns the value of the last body evaluated, nil, or a caught error.
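The first-match behaviour can be sketched the same way. Again this is an illustration, not the code from awk.lsp: rcase-sketch and its (pattern function) clause format are hypothetical.

```newlisp
;; Sketch only: stop at the first matching pattern.
;; rcase-sketch and the (pattern function) clause format are hypothetical.
(define (rcase-sketch str clauses)
  (catch
    (dolist (clause clauses)
      (let (RLIST (regex (clause 0) str))
        (if RLIST
            ;; leave the loop with the value of the first matching body
            (throw (apply (clause 1) (list RLIST))))))))

(rcase-sketch "2024-05-01 started"
  (list (list "^[0-9]+-[0-9]+-[0-9]+" (fn (r) (string "date: " (r 0))))
        (list "started"               (fn (r) "event"))))
;; → "date: 2024-05-01"
```

Because the loop is left through an internal (catch), a (throw) issued inside a body is intercepted there as well and never reaches an outer (catch). That is presumably what the warning above is about.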
Usage notes
The AWK context keeps its state after (awk ...) finishes, so while processing a long input stream it is possible to pass parts of the stream to the (awk) function sequentially.
Flow difference from original awk
In original awk, while processing one line, setting a field $n to a new value is reflected in $0 and is available to subsequent /pattern/{} tests. In (rawk ...) this behavior is not implemented. It would not be difficult to add, but this rarely used feature would cause some processing overhead.
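If the awk behaviour is needed in a particular place, the record can be rebuilt by hand after changing a field. The lines below are a sketch with plain built-ins: S and F only stand in for the variables described above, and the sample values and the single-space separator are assumptions.

```newlisp
;; Sketch only: S and F stand in for the record and its field list.
(set 'S "john 25 london")
(set 'F (parse S "[ \t]+" 0))   ; ("john" "25" "london")

(setf (F 1) "26")               ; change the second field; S is unchanged
(set 'S (join F " "))           ; rebuild the record by hand
;; S is now "john 26 london"
```

Doing this only where a later pattern test depends on the changed field avoids paying the rebuild cost for every record.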
Examples
In the newLisp code, suppose awk.lsp is loaded and the current context is AWK (or its copy):
... or you may use explicit AWK:FS, AWK:RS and AWK:rcase calls instead.
1. Field cutter (yes, it can easily be done without AWK too ;-)
does the same as
2. Field cutter with processing
does nearly the same as
3. And a more powerful example that demonstrates the real goal of the AWK context:
does the same as