#! (shbang) lines and in functions, self implicit parameters.
samples
directory.
If comment removal is disabled, LuaSrcDiet only removes trailing whitespace. Trailing whitespace is not removed in long strings; a warning is generated instead. If empty line removal is disabled, LuaSrcDiet keeps all significant code on the same lines. Thus, a user is able to debug using the original sources as a reference, since the line numbering is unchanged.
String optimization deals mainly with optimizing escape sequences, but delimiters can be switched between single quotes and double quotes if the source size of the string can be reduced. For long strings and long comments, LuaSrcDiet also tries to reduce the '=' separators in the delimiters if possible. For number optimization, LuaSrcDiet saves space by trying to generate the shortest possible sequence, and in the process it does not produce 'proper' scientific notation (e.g. 1.23e5) but does away with the decimal point (e.g. 123e3) instead.
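The kinds of literal rewrites described above can be checked by hand. The following is a minimal sketch, not LuaSrcDiet output: each pair is a hand-picked example of a longer and a shorter spelling of the same value, and the table name literal_pairs is only an illustration.

```lua
-- Hand-written illustration (not generated by LuaSrcDiet): each pair
-- spells the same value two ways, the second being shorter in source form.
local literal_pairs = {
  -- number: 'proper' scientific notation vs. no decimal point
  { 1.23e5, 123e3 },
  -- string: switching delimiters removes the need for escapes
  { "say \"hi\"", 'say "hi"' },
  -- escape sequence: a decimal escape vs. the literal character
  { "\65BC", "ABC" },
  -- long string: fewer '=' signs in the delimiters
  { [==[two words]==], [[two words]] },
}

for i, pair in ipairs(literal_pairs) do
  assert(pair[1] == pair[2], "pair " .. i .. " differs")
end
print("all literal pairs are equal")
```

Since every pair compares equal at run time, a shorter spelling can replace a longer one without changing what the program computes.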
The local variable name optimizer uses a full parser of Lua 5.1 source
code, thus it can rename all local variables, including upvalues and
function parameters. It should handle the implicit self parameter
gracefully. In addition, local variable names are either renamed into
the shortest possible names following English frequent letter usage or
are arranged by calculating entropy with the --opt-entropy option.
Variable names are reused whenever possible, reducing the number of
unique variable names. For example, for LuaSrcDiet.lua (version 0.11.0),
683 local identifiers representing 88 unique names were optimized into
32 unique names, all of which are one character in length, saving over
2600 bytes.
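To make the renaming concrete, here is a minimal hand-written sketch. It is not actual LuaSrcDiet output, and the function names and the short names chosen are assumptions for illustration only. It shows a parameter, a local, and an upvalue being shortened, with one short name reused in an inner scope.

```lua
-- Hand-written illustration of local renaming (not LuaSrcDiet output).

-- Before: descriptive names for a parameter, a local and an upvalue.
local function make_counter_before(start_value)
  local current_count = start_value
  return function(step_size)
    current_count = current_count + step_size  -- upvalue access
    return current_count
  end
end

-- After: the same code with every local shortened to one character.
-- The name 'a' is reused for the inner parameter because the outer
-- parameter 'a' is not referenced inside the inner function.
local function make_counter_after(a)
  local b = a
  return function(a)
    b = b + a
    return b
  end
end

-- Both versions behave identically.
local c1, c2 = make_counter_before(10), make_counter_after(10)
assert(c1(5) == c2(5))
assert(c1(2) == c2(2))
```

Names that are not locals, such as globals and table fields, are left as written in this sketch, which matches the report below where the global byte count is unchanged.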
If you need some kind of reassurance that your app will still work at reduced size, see the section on verification below.
LuaSrcDiet myscript.lua -o myscript_.lua
On Windows machines, the above command line can be used on Cygwin, or you can run Lua with the LuaSrcDiet script like this:
lua LuaSrcDiet.lua myscript.lua -o myscript_.lua
When run without arguments, LuaSrcDiet prints a list of options. Also,
you can check the Makefile for some examples of command lines to use.
For example, for maximum code size reduction and maximum verbosity, use:
LuaSrcDiet --maximum --details myscript.lua -o myscript_.lua
llex.lua at --maximum settings is as follows:
Statistics for: LuaSrcDiet.lua -> sample/LuaSrcDiet.lua

*** local variable optimization summary ***
----------------------------------------------------------
Variable        Unique    Decl.    Token     Size  Average
Types            Names    Count    Count    Bytes    Bytes
----------------------------------------------------------
Global              10        0       19       95     5.00
----------------------------------------------------------
Local (in)          88      153      683     3340     4.89
TOTAL (in)          98      153      702     3435     4.89
----------------------------------------------------------
Local (out)         32      153      683      683     1.00
TOTAL (out)         42      153      702      778     1.11
----------------------------------------------------------

*** lexer-based optimizations summary ***
--------------------------------------------------------------------
Lexical           Input    Input    Input   Output   Output   Output
Elements          Count    Bytes  Average    Count    Bytes  Average
--------------------------------------------------------------------
TK_KEYWORD          374     1531     4.09      374     1531     4.09
TK_NAME             795     3963     4.98      795     1306     1.64
TK_NUMBER            54       59     1.09       54       59     1.09
TK_STRING           152     1725    11.35      152     1717    11.30
TK_LSTRING            7     1976   282.29        7     1976   282.29
TK_OP               997     1092     1.10      997     1092     1.10
TK_EOS                1        0     0.00        1        0     0.00
--------------------------------------------------------------------
TK_COMMENT          140     6884    49.17        1       18    18.00
TK_LCOMMENT           7     1723   246.14        0        0     0.00
TK_EOL              543      543     1.00      197      197     1.00
TK_SPACE           1270     2465     1.94      263      263     1.00
--------------------------------------------------------------------
Total Elements     4340    21961     5.06     2841     8159     2.87
--------------------------------------------------------------------
Total Tokens       2380    10346     4.35     2380     7681     3.23
--------------------------------------------------------------------
Overall, the file size is reduced by more than 13KB (from 21961 to 8159 bytes). Tokens in the above report can be classified into 'real' or actual tokens, and 'fake' or whitespace tokens. The number of 'real' tokens remained the same. Long comments were completely eliminated, and only a single 18-byte short comment remains. The number of line endings was reduced by 346, while all but 263 whitespace characters were optimized away. So, token separators (whitespace, including line endings) now take up less than 6% of the total file size. No optimization of number tokens was possible, while 8 bytes were saved for string tokens.
For local variable name optimization, the report shows that 88 unique local variable names were reduced to 32 unique names. The number of identifier tokens should stay the same (there is currently no optimization option to optimize away non-essential or unused 'real' tokens). Since there can be at most 53 single-character identifiers, all local variables are now one character in length. Over 2600 bytes were saved. --details will give a longer report and much more information.
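The summary figures can be re-derived from the report itself. The sketch below is plain arithmetic on numbers copied from the LuaSrcDiet.lua report above; the variable names are illustrative only.

```lua
-- Figures copied from the sample report above.
local total_in, total_out = 21961, 8159   -- "Total Elements" bytes
local eol_out, space_out  = 197, 263      -- TK_EOL and TK_SPACE output bytes
local local_in, local_out = 3340, 683     -- "Local (in)" / "Local (out)" bytes

local bytes_saved     = total_in - total_out
local separator_share = (eol_out + space_out) / total_out
local name_savings    = local_in - local_out

-- One-character identifiers: 26 lowercase + 26 uppercase + underscore.
-- Digits are excluded because an identifier cannot start with a digit.
local single_char_names = 26 * 2 + 1

print(string.format("bytes saved overall: %d", bytes_saved))
print(string.format("separator share of output: %.1f%%", separator_share * 100))
print(string.format("bytes saved on local names: %d", name_savings))
print(string.format("single-character identifiers: %d", single_char_names))
```

Because the output uses only 32 unique local names, which is fewer than the 53 available single-character identifiers, every local can be one character long.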
A sample output of LuaSrcDiet 0.12.0 for processing the one-file
LuaSrcDiet.lua program itself at --maximum and --opt-experimental
settings is as follows:
*** local variable optimization summary ***
----------------------------------------------------------
Variable        Unique    Decl.    Token     Size  Average
Types            Names    Count    Count    Bytes    Bytes
----------------------------------------------------------
Global              27        0       51      280     5.49
----------------------------------------------------------
Local (in)         482     1063     4889    21466     4.39
TOTAL (in)         509     1063     4940    21746     4.40
----------------------------------------------------------
Local (out)         55     1063     4889     4897     1.00
TOTAL (out)         82     1063     4940     5177     1.05
----------------------------------------------------------

*** BINEQUIV: binary chunks are sort of equivalent

Statistics for: LuaSrcDiet.lua -> app_experimental.lua

*** lexer-based optimizations summary ***
--------------------------------------------------------------------
Lexical           Input    Input    Input   Output   Output   Output
Elements          Count    Bytes  Average    Count    Bytes  Average
--------------------------------------------------------------------
TK_KEYWORD         3083    12247     3.97     3083    12247     3.97
TK_NAME            5401    24121     4.47     5401     7552     1.40
TK_NUMBER           467      494     1.06      467      494     1.06
TK_STRING           787     7983    10.14      787     7974    10.13
TK_LSTRING           14     3453   246.64       14     3453   246.64
TK_OP              6381     6861     1.08     6171     6651     1.08
TK_EOS                1        0     0.00        1        0     0.00
--------------------------------------------------------------------
TK_COMMENT         1611    72339    44.90        1       18    18.00
TK_LCOMMENT          18     4404   244.67        0        0     0.00
TK_EOL             4419     4419     1.00     1778     1778     1.00
TK_SPACE          10439    24475     2.34     2081     2081     1.00
--------------------------------------------------------------------
Total Elements    32621   160796     4.93    19784    42248     2.14
--------------------------------------------------------------------
Total Tokens      16134    55159     3.42    15924    38371     2.41
--------------------------------------------------------------------

* WARNING: before and after lexer streams are NOT equivalent!
The command line was:
lua LuaSrcDiet.lua LuaSrcDiet.lua -o app_experimental.lua --maximum --opt-experimental --noopt-srcequiv
The important thing to note is that while the binary chunks are equivalent, the source lexer streams are not equivalent. Hence, the --noopt-srcequiv option makes LuaSrcDiet report a warning for failing the source equivalence test.
LuaSrcDiet.lua was reduced from 157KB to about 41.3KB. The
--opt-experimental option saves an extra 205 bytes over standard
--maximum. Note the reduction in TK_OP count due to a reduction in
semicolons and parentheses. TK_SPACE has actually increased a bit due to
semicolons that are changed into single spaces; some of these spaces
could not be removed.
For more performance numbers, see the PerformanceStats page.
eLua and nspire, adding a verification step will reduce risk for all
users of LuaSrcDiet.
LuaSrcDiet performs two kinds of equivalence testing as of version 0.12.0. The two tests can be very, very loosely termed as source equivalence testing and binary equivalence testing. They are controlled by the --opt-srcequiv and --opt-binequiv options and are enabled by default.
Testing behaviour can be summarized as follows:
loadstring() and string.dump() and the results compared.
If your file passes this test, it means that a Lua 5.1.x binary should see the exact same token streams for both before and after files. That is, the parser in Lua will see the same lexer sequence coming from the source for both files and thus they should be equivalent. Touch wood. Heh.
However, if you are cross-compiling, it may be possible for this test to
fail. Experienced Lua developers can modify equiv.lua to handle such
cases.
loadstring() and string.dump() to generate binary chunks of the entire
before and after files. Also, any shbang (#!) lines are removed prior to
generation of the binary chunks.
The binary chunks are then run through a fake undump routine to verify
the integrity of the binary chunks and to compare all parts that ought
to be identical.
On a per-function prototype basis (where ignored means that any difference between the two binary chunks is ignored):
linedefined and lastlinedefined.
This test may also cause problems if you are cross-compiling.
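The idea behind the binary test can be tried in a few lines. The following is a simplified stand-in, not the logic used in equiv.lua: it compares two dumped chunks byte for byte instead of walking them with an undump routine. The before and after sources and the chunk name "=sample" are hypothetical.

```lua
-- Simplified stand-in for the binary equivalence test; NOT the logic
-- used by equiv.lua. Works on Lua 5.1 (loadstring) and later (load).
local compile = loadstring or load

-- Hypothetical before/after sources. The comment and spaces were removed
-- by hand, but line breaks and local names were kept, so the debug
-- information stored in the chunks matches too.
local before = "local total = 1 -- start value\nreturn total + 1\n"
local after  = "local total=1\nreturn total+1\n"

-- Compile under the same chunk name so that the source name stored in
-- the dump is the same for both files.
local function dump_chunk(source)
  local fn = assert(compile(source, "=sample"))
  return string.dump(fn)
end

local same = dump_chunk(before) == dump_chunk(after)
print(same and "binary chunks are identical" or "binary chunks differ")
```

A byte-for-byte comparison like this is stricter than the test described above, which ignores some differences. It would report a mismatch as soon as line breaks or local variable names change, because both are recorded in the chunk's debug information.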
For sample files, see the samples directory.
Currently implemented experimental optimizations are as follows:
For example, the following:
fish("cow")fish('cow')fish([[cow]])
is turned into:
fish"cow"fish'cow'fish[[cow]]
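A quick run-time check of this rewrite is sketched below. The function fish here is a stand-in that records its argument; it is an assumption for illustration and not part of LuaSrcDiet.

```lua
-- 'fish' is a hypothetical stand-in that records each argument it gets.
local seen = {}
local function fish(s) seen[#seen + 1] = s end

fish("cow")fish('cow')fish([[cow]])   -- original form
fish"cow"fish'cow'fish[[cow]]         -- form without parentheses

-- Both lines made three calls each, all with the same string.
assert(#seen == 6)
for i = 1, #seen do assert(seen[i] == "cow") end
```

Lua allows the parentheses to be dropped when a call has a single string literal argument, which is what makes this rewrite safe for these three forms.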
local keyword removal. Planned to work for a few kinds of patterns only.