Commit graph

257 commits

Author SHA1 Message Date
Denys Vlasenko
b668e52c90 *: placate warnings where strchr/strstr returns constant pointer
Newer glibc is now smarter and can propagate const-ness from those!

function                                             old     new   delta
readtoken1                                          3111    3108      -3

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2026-02-15 15:15:30 +01:00
Denys Vlasenko
5cba59e9ce awk: use more understandable form of "split-globals" trick
function                                             old     new   delta
parse_expr                                           986     998     +12
chain_group                                          633     640      +7
next_token                                           930     934      +4
getvar_s                                             102     101      -1
awk_main                                             891     888      -3
evaluate                                            3379    3355     -24
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 3/3 up/down: 23/-28)             Total: -5 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2026-01-24 05:43:45 +01:00
Denys Vlasenko
0a88a7ae3b awk: mktime() with no arguments is not allowed
It was SEGVing.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2024-07-10 07:04:28 +02:00
Denys Vlasenko
2eea3494f1 awk: improve comments and constants, no code changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2024-07-10 06:58:51 +02:00
Denys Vlasenko
45d471d435 qwk: code shrink
function                                             old     new   delta
mk_splitter                                          100      96      -4
as_regex                                             103      99      -4
parse_expr                                           991     986      -5
awk_split                                            544     538      -6
awk_getline                                          559     552      -7
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/5 up/down: 0/-26)             Total: -26 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2024-07-09 17:50:58 +02:00
Denys Vlasenko
38335df9e9 awk: restore assignment precedence to be lower than ternary ?:
Something is fishy with constrcts like "3==v=3" in gawk,
they should not work, but do. Ignore those for now.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2024-07-09 15:30:46 +02:00
Denys Vlasenko
49340d93ed awk: do not infinitely recurse getvar_s() if CONVFMT is set to a numeric value
function                                             old     new   delta
fmt_num                                              247     257     +10
evaluate                                            3385    3379      -6
getvar_s                                             111     102      -9
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 1/2 up/down: 10/-15)             Total: -5 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2024-07-09 03:04:26 +02:00
Natanael Copa
fb08d43d44 awk: fix use after free (CVE-2023-42363)
function                                             old     new   delta
evaluate                                            3377    3385      +8

Fixes https://bugs.busybox.net/show_bug.cgi?id=15865

Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2024-07-09 02:34:00 +02:00
Ron Yorston
e1a6874106 awk: fix segfault when compiled by clang
A 32-bit build of BusyBox using clang segfaulted in the test
"awk assign while assign".  Specifically, on line 7 of the test
input where the adjustment of the L.v pointer when the Fields
array was reallocated

   	L.v += Fields - old_Fields_ptr;

was out by 4 bytes.

Rearrange to code so both gcc and clang generate code that works.

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
2024-03-02 17:03:37 +01:00
Denys Vlasenko
789ccac7d9 awk: fix handling of empty fields
Patch by M Rubon <rubonmtz@gmail.com>:
Busybox awk handles references to empty (not provided in the input)
fields differently during the first line of input, as compared to
subsequent lines.

$ (echo a ; echo b) | awk '$2 != 0'    #wrong
b

No field $2 value is provided in the input.  When awk references field
$2 for the "a" line, it is seen to have a different behaviour than
when it is referenced for the "b" line.

Problem in BusyBox v1.36.1 embedded in OpenWrt 23.05.0
Same problem also in 21.02 versions of OpenWrt
Same problem in BusyBox v1.37.0.git

I get the correct expected output from Ubuntu gawk and Debian mawk,
and from my fix.
will@dev:~$ (echo a ; echo b) | awk '$2 != 0'  #correct
a
b
will@dev:~/busybox$ (echo a ; echo b ) | ./busybox awk '$2 != 0'  #fixed
a
b

I built and poked into the source code at editors/awk.c  The function
fsrealloc(int size) is core to allocating, initializing, reallocating,
and reinitializing fields, both real input line fields and imaginary
fields that the script references but do not exist in the input.

When fsrealloc() needs more field space than it has previously
allocated, it initializes those new fields differently than how they
are later reinitialized for the next input line.  This works fine for
fields defined in the input, like $1, but does not work the first time
when there is no input for that field (e.g. field $99)

My one-line fix simply makes the initialization and clrvar()
reinitialization use the same value for .type.  I am not sure if there
are regression tests to run, but I have not done those.

I'm not sure if I understand why clrvar() is not setting .type to a
default constant value, but in any case I have left that untouched.

function                                             old     new   delta
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0)                 Total: 0 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-12-31 15:49:54 +01:00
Denys Vlasenko
92ab29fcf0 awk: implement -E; do not reorder -f and -e
function                                             old     new   delta
awk_main                                             843     891     +48
next_input_file                                      243     261     +18
packed_usage                                       34631   34638      +7
.rodata                                           105391  105390      -1
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 3/1 up/down: 73/-1)              Total: 72 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-10-02 15:24:06 +02:00
Denys Vlasenko
5353df91cb Update applet size estimates
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-07-10 17:25:21 +02:00
Denys Vlasenko
2ca39ffd44 awk: fix subst code to handle "start of word" pattern correctly (needs REG_STARTEND)
function                                             old     new   delta
awk_sub                                              637     714     +77

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-06-08 10:42:39 +02:00
Denys Vlasenko
113685fbcd awk: fix SEGV on read error in -f PROGFILE
function                                             old     new   delta
awk_main                                             829     843     +14

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-06-07 10:54:34 +02:00
Denys Vlasenko
f4789164e0 awk: code shrink
function                                             old     new   delta
awk_sub                                              544     548      +4
exec_builtin                                        1136    1130      -6
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 1/1 up/down: 4/-6)               Total: -2 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-06-06 12:48:11 +02:00
Denys Vlasenko
5f84c56336 awk: fix backslash handling in sub() builtins
function                                             old     new   delta
awk_sub                                              559     544     -15

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-06-03 00:42:10 +02:00
Denys Vlasenko
0256e00a9d awk: fix precedence of = relative to ==
Discovered while adding code to disallow assignments to non-lvalues

function                                             old     new   delta
parse_expr                                           936     991     +55
.rodata                                           105243  105247      +4
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 59/0)               Total: 59 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-05-30 16:44:04 +02:00
Denys Vlasenko
721bf6eaf4 awk: printf(INVALID_FMT) prints it verbatim
function                                             old     new   delta
awk_printf                                           628     640     +12

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-05-29 10:55:40 +02:00
Denys Vlasenko
4d7339204f awk: shrink - use setvar_sn() to set variables from non-NUL terminated strings
function                                             old     new   delta
setvar_sn                                              -      39     +39
exec_builtin                                        1145    1136      -9
awk_getline                                          591     559     -32
------------------------------------------------------------------------------
(add/remove: 1/0 grow/shrink: 0/2 up/down: 39/-41)             Total: -2 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-05-28 18:08:06 +02:00
Denys Vlasenko
05e60007d4 awk: code shrink
function                                             old     new   delta
awk_getline                                          620     591     -29

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-05-28 17:51:59 +02:00
Denys Vlasenko
b76b420b5d awk: fix closing of non-opened file
function                                             old     new   delta
setvar_ERRNO                                           -      53     +53
.rodata                                           105252  105246      -6
awk_getline                                          639     620     -19
evaluate                                            3402    3377     -25
------------------------------------------------------------------------------
(add/remove: 1/0 grow/shrink: 0/3 up/down: 53/-50)              Total: 3 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-05-28 17:25:56 +02:00
Denys Vlasenko
21dce1c3c3 awk: do not read ARGIND, only set it (gawk compat)
function                                             old     new   delta
next_input_file                                      216     243     +27
evaluate                                            3396    3402      +6
awk_main                                             826     829      +3
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 3/0 up/down: 36/0)               Total: 36 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-05-27 19:11:28 +02:00
Denys Vlasenko
5c8a9dfd97 awk: remove a local variable "caching" a struct member
Since we take its address, the variable lives on stack (not a GPR).
Thus, nothing is improved by caching it.

function                                             old     new   delta
awk_getline                                          642     639      -3

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-05-27 18:21:38 +02:00
Denys Vlasenko
528808bcd2 awk: get rid of one indirection level for iF (input file structure)
function                                             old     new   delta
try_to_assign                                          -      91     +91
next_input_file                                      214     216      +2
awk_main                                             827     826      -1
evaluate                                            3403    3396      -7
is_assignment                                         91       -     -91
------------------------------------------------------------------------------
(add/remove: 1/1 grow/shrink: 1/2 up/down: 93/-99)             Total: -6 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-05-27 18:05:42 +02:00
Denys Vlasenko
84ff1825dd awk: fix splitting with default FS
function                                             old     new   delta
awk_split                                            543     544      +1

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-05-27 16:17:38 +02:00
Denys Vlasenko
5dcc443dba awk: fix use-after-realloc (CVE-2021-42380), closes 15601
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2023-05-26 19:36:58 +02:00
Natanael Copa
e63d7cdfda awk: fix use after free (CVE-2022-30065)
fixes https://bugs.busybox.net/show_bug.cgi?id=14781

function                                             old     new   delta
evaluate                                            3343    3357     +14

Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-07-11 17:18:07 +02:00
Denys Vlasenko
e2952dfaff awk: input numbers are never octal or hex (only program consts can be)
function                                             old     new   delta
next_token                                           825     930    +105
getvar_i                                             114     129     +15
nextchar                                              49      53      +4
my_strtod                                            138       -    -138
------------------------------------------------------------------------------
(add/remove: 0/1 grow/shrink: 3/0 up/down: 124/-138)          Total: -14 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-01-08 22:42:35 +01:00
Denys Vlasenko
857800c655 awk: never return NULL from awk_printf()
function                                             old     new   delta
awk_printf                                           651     628     -23

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-09-09 19:26:39 +02:00
Denys Vlasenko
e60c56932e awk: code shrink
function                                             old     new   delta
awk_printf                                           652     651      -1

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-09-09 19:13:32 +02:00
Denys Vlasenko
8a0adba9f6 awk: code shrink: avoid duplicate NUL checks and strlen()
function                                             old     new   delta
awk_printf                                           665     652     -13

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-09-09 18:58:39 +02:00
Ron Yorston
305a30d80b awk: fix read beyond end of buffer
Commit 7d06d6e18 (awk: fix printf %%) can cause awk printf to read
beyond the end of a strduped buffer:

  2349      while (*f && *f != '%')
  2350          f++;
  2351      c = *++f;

If the loop terminates because a NUL character is detected the
character after the NUL is read.  This can result in failures
depending on the value of that character.

function                                             old     new   delta
awk_printf                                           672     665      -7

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-09-09 18:12:21 +02:00
Daniel Thau
7d06d6e186 awk: fix printf %%
A refactor of the awk printf code in
e2e3802987
appears to have broken the printf interpretation of two percent signs,
which normally outputs only one percent sign.

The patch below brings busybox awk printf behavior back into alignment
with the pre-e2e380 behavior, the busybox printf util, and other common
(awk and non-awk) printf implementations.

function                                             old     new   delta
awk_printf                                           626     672     +46

Signed-off-by: Daniel Thau <danthau at bedrocklinux.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-09-05 03:42:51 +02:00
Denys Vlasenko
dabbeeb793 awk: whitespace and debugging tweaks
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-14 16:58:05 +02:00
Denys Vlasenko
d3480dd582 awk: disallow break/continue outside of loops
function                                             old     new   delta
.rodata                                           104139  104186     +47
chain_group                                          610     633     +23
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 70/0)               Total: 70 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-14 16:32:19 +02:00
Denys Vlasenko
d62627487a awk: tighten parsing - disallow extra semicolons
'; BEGIN {...}' and 'BEGIN {...} ;; {...}' are not accepted by gawk

function                                             old     new   delta
parse_program                                        332     353     +21

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-14 16:32:19 +02:00
Denys Vlasenko
ab755e3717 awk: in parsing, remove superfluous NEWLINE check; optimize builtin arg evaluation
function                                             old     new   delta
exec_builtin                                        1149    1145      -4

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-12 13:30:30 +02:00
Denys Vlasenko
8d269ef859 awk: fix printf "%-10c", 0
function                                             old     new   delta
awk_printf                                           596     626     +30

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-12 11:27:11 +02:00
Denys Vlasenko
caa93ecdd3 awk: fix corner case in awk_printf
Example where it wasn't working:
    awk 'BEGIN { printf "qwe %s rty %c uio\n", "a", 0, "c" }'
- the NUL printing in %c caused premature stop of printing.

function                                             old     new   delta
awk_printf                                           593     596      +3

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 18:16:10 +02:00
Denys Vlasenko
39aabfe8f0 awk: unbreak "cmd" | getline
function                                             old     new   delta
evaluate                                            3337    3343      +6

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 12:51:43 +02:00
Denys Vlasenko
4ef8841b21 awk: unbreak "printf('%c') can output NUL" testcase
function                                             old     new   delta
awk_printf                                           546     593     +47

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 12:25:33 +02:00
Denys Vlasenko
3d57a84907 awk: undo TI_PRINT, it introduced a bug (print with any redirect acting as printf)
function                                             old     new   delta
evaluate                                            3329    3337      +8

Patch by Ron Yorston <rmy@pobox.com>

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 12:00:31 +02:00
Denys Vlasenko
49c3ce64f0 awk: rollback_token() + chain_group() == chain_until_rbrace()
function                                             old     new   delta
parse_program                                        336     332      -4

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 11:46:21 +02:00
Denys Vlasenko
e2e3802987 awk: fix printf buffer overflow
function                                             old     new   delta
awk_printf                                           468     546     +78
fmt_num                                              239     247      +8
getvar_s                                             125     111     -14
evaluate                                            3343    3329     -14
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/2 up/down: 86/-28)             Total: 58 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-04 01:25:34 +02:00
Denys Vlasenko
08ca313d7e awk: simplify tests for operation class
Usually, an operation class has only one possible value of "info" word.
In this case, just compare the entire info word, do not bother
to mask OPCLSMASK bits.

(Example where this is not the case: OC_REPLACE for "<op>=")

function                                             old     new   delta
mk_splitter                                          106     100      -6
chain_group                                          616     610      -6
nextarg                                               40      32      -8
exec_builtin                                        1157    1149      -8
as_regex                                             111     103      -8
awk_split                                            553     543     -10
parse_expr                                           948     936     -12
awk_getline                                          656     642     -14
evaluate                                            3387    3343     -44
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/9 up/down: 0/-116)           Total: -116 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 14:11:51 +02:00
Denys Vlasenko
cb042b0582 awk: restore strdup elision optimization in assignment
function                                             old     new   delta
evaluate                                            3339    3387     +48

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 13:30:45 +02:00
Denys Vlasenko
90404ed2f6 awk: match(): code shrink
function                                             old     new   delta
do_match                                               -     165    +165
exec_builtin_match                                   202       -    -202
------------------------------------------------------------------------------
(add/remove: 1/1 grow/shrink: 0/0 up/down: 165/-202)          Total: -37 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 12:20:36 +02:00
Denys Vlasenko
0e3ef4efb0 awk: rand(): 64-bit constants should be ULL
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 11:57:59 +02:00
Denys Vlasenko
2211fa70cc awk: do not use a copy of g_progname for node->l.new_progname
We never destroy g_progname's, the strings still exist, no need to copy

function                                             old     new   delta
chain_node                                           104      97      -7

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 11:54:01 +02:00
Denys Vlasenko
e1e7ad6b60 awk: support %F %a %A in printf
function                                             old     new   delta
.rodata                                           104111  104120      +9

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 01:59:36 +02:00