Erlang and BEAM Bytecode Binary Golfing

I recently had the unfortunate experience of having to reverse engineer a large Erlang application. So, when I saw haxrob's entries [1] to BGGP5[2], I decided to write some Erlang of my own. This blog post will start out with an improvement to haxrob's submission which saves 24 bytes, and then it will move on to BEAM bytecode golfing.

Entry 1: Erlang

haxrob's original submission (129 bytes)

-module(y).
-export([s/0]).
s()->ssl:start(),inets:start(),{_,{_,_,B}}=httpc:request("https://binary.golf/5/5"),io:put_chars(B).

----- output -----

$ erlc y.erl
$ erl -noshell -s y s

Another #BGGP5 download!! @binarygolf https://binary.golf

My submission (104 bytes)

-module(q).
-export([q/0]).
q()->ssl:start(),inets:start(),q:q(httpc:request("http://binary.golf/5/5")).

----- output -----

$ erlc q.erl
$ erl -noshell -s q q

{"init terminating in do_boot",{undef,[{q,q,[{ok,{{"HTTP/1.1",200,"OK"},[{"cache-control","max-age=600"},{"connection","keep-alive"},{"via","1.1 varnish"},{"accept-ranges","bytes"},{"age","0"},{"server","GitHub.com"},{"vary","Accept-Encoding"},{"content-length","58"},{"content-type","application/octet-stream"},{"last-modified","Fri, 21 Jun 2024 13:56:50 GMT"},{"access-control-allow-origin","*"},{"x-proxy-cache","MISS"},{"x-cache","HIT"},{"x-cache-hits","0"}],"Another #BGGP5 download!! @binarygolf https://binary.golf\n"}}],[]},{init,start_em,1,[]},{init,do_boot,3,[]}]}}
init terminating in do_boot ({undef,[{q,q,[{_}],[]},{init,start_em,1,[]},{init,do_boot,3,[]}]})

Both versions start out the same. The -module attribute is required for all Erlang modules, and the -export attribute is what allows us to actually call the function. Both modules also define a single-letter function name, and start by calling ssl:start() and inets:start(). Both modules also use httpc:request to download the web page. haxrob's submission extracts the body of the request and display it using io:putchars, and while this is a short way to generate a clean output, it costs quite a few bytes. As it turns out, if you try to call an undefined external function, the error message will nicely format the arguments. That means displaying the message is as simple as wrapping the httpc:request() call in an invalid call to a nonexistent function, such as q:q/1 in my submission. Finally, httpc follows redirects so I was able to save a byte by replacing "https" with "http" in the url.

Entry 2: Compiled .beam Bytecode File

The file produced by erlc is a BEAM bytecode file. The format of this file is based on "EA IFF 85", which is a generic TLV container format consisting of multiple sections or "chunks". Each chunk is identified by a 32-bit tag and can be up to UINT32_MAX bytes long. There is an additional "FOR1"/"FORM" header at the beginning of the file, which contains the length of the entire file.

Technique 1: Stripping sections

The .beam file produced by erlc contains a signifcant amount of unneeded information used for debugging and metadata. The BEAM format[3][4] seemed simple enough, so I decided to start by trying to remove unneeded sections. After some experimentation, I found that the program only required 6 sections to function. This worked quite well, and I was immediately able to reduce the file from 740 to 384 bytes.

import gzip, sys

def align4(n): 
    return (n+3)&0xfffffffc
def repack(secs):
    def wdword(n, length=4):
        return n.to_bytes(length=length, byteorder='big')
    body=b'BEAM'
    for i in secs:
        body+=i[0]+wdword(i[1])+i[2]
    return b'FOR1'+wdword(len(body))+body

fd=open(sys.argv[1],"rb")

# BEAM modified FORM header
assert b'FOR1' == fd.read(4)
fd.read(4)
assert b'BEAM' == fd.read(4)

secs=[]
while True:
    tag=fd.read(4)
    if tag==b'': break
    length=int.from_bytes(fd.read(4), byteorder='big')
    body=fd.read(align4(length))
    assert len(body)==align4(length)
    if tag in [b'AtU8', b"Code", b"ImpT", b"ExpT", b"StrT", b"LitT"]:
        print(f"Keeping tag {tag.decode()} [{length}]")
        secs.append((tag, length, body))
    else:
        print(f"Removing tag {tag.decode()} [{length}]")

rp=repack(secs)
print(f"Original size: {fd.tell()}")
print(f"New size: {len(rp)}")
if len(sys.argv)>=3:
    open(sys.argv[2],"wb").write(rp)


-----

$ python3 beamstrip.py q.beam q.beam.stripped
Keeping tag AtU8 [71]
Keeping tag Code [86]
Keeping tag StrT [0]
Keeping tag ImpT [76]
Keeping tag ExpT [40]
Keeping tag LitT [46]
Removing tag Meta [29]
Removing tag LocT [4]
Removing tag Attr [40]
Removing tag CInf [83]
Removing tag Dbgi [88]
Removing tag Line [21]
Removing tag Type [26]
Original size: 740
New size: 384

Technique 2: starting with a BEAM assembly file

While looking at the stripped file, I noticed some extra functions had been added by the compiler. The module_info/0 and module_info/1 functions are wrappers around erlang:get_module_info, and are not needed to run the program. I was able to ask the compiler to create an Erlang assembly file using the -S flag

$ erlc -S q.erl
$ cat q.S
{module, q}.  %% version = 0

{exports, [{module_info,0},{module_info,1},{q,0}]}.

{attributes, []}.

{labels, 7}.


{function, q, 0, 2}.
  {label,1}.
    {line,[{location,"q.erl",3}]}.
    {func_info,{atom,q},{atom,q},0}.
  {label,2}.
    {allocate,0,0}.
    {call_ext,0,{extfunc,ssl,start,0}}.
    {call_ext,0,{extfunc,inets,start,0}}.
    {move,{literal,"http://binary.golf/5/5"},{x,0}}.
    {call_ext,1,{extfunc,httpc,request,1}}.
    {call_ext_last,1,{extfunc,q,q,1},0}.


{function, module_info, 0, 4}.
  {label,3}.
    {line,[]}.
    {func_info,{atom,q},{atom,module_info},0}.
  {label,4}.
    {move,{atom,q},{x,0}}.
    {call_ext_only,1,{extfunc,erlang,get_module_info,1}}.


{function, module_info, 1, 6}.
  {label,5}.
    {line,[]}.
    {func_info,{atom,q},{atom,module_info},1}.
  {label,6}.
    {move,{x,0},{x,1}}.
    {move,{atom,q},{x,0}}.
    {call_ext_only,2,{extfunc,erlang,get_module_info,2}}.

I removed all code from the first module_info declaration down, and removed these functions from the exports table at the top. I also experimented with removing other instructions, but was only able to remove the line instruction directly below the first label. I also adjusted the labels attribute, although this doesn't seem to be required. This resulted in the follwing assembly file:

{module, q}.  %% version = 0

{exports, [{q,0}]}.

{attributes, []}.

{labels, 3}.

{function, q, 0, 2}.
  {label,1}.
    {func_info,{atom,q},{atom,q},0}.
  {label,2}.
    {allocate,0,0}.
    {call_ext,0,{extfunc,ssl,start,0}}.
    {call_ext,0,{extfunc,inets,start,0}}.
    {move,{literal,"http://binary.golf/5/5"},{x,0}}.
    {call_ext,1,{extfunc,httpc,request,1}}.
    {call_ext_last,1,{extfunc,q,q,1},0}.

This compiles to a 592 byte file, which goes down to 264 bytes after stripping unused sections:

$ erlc q.S
$ python3 beamstrip.py q.beam q.beam
Keeping tag AtU8 [36]   
Keeping tag Code [49]   
Keeping tag StrT [0]    
Keeping tag ImpT [52]   
Keeping tag ExpT [16]               
Keeping tag LitT [46]   
Removing tag LocT [4]   
Removing tag Attr [40]                 
Removing tag CInf [97]                   
Removing tag Dbgi [88]                                                                         
Removing tag Line [20]                      
Removing tag Type [26]                  
Original size: 592                             
New size: 264

Technique 3: Crafting a BEAM file from scratch

At this point, I had optimization ideas that didn't seem possible with compiler-enforced validation. Most of the sections are fairly easy to generate. Each section begins with a 4-byte tag, followed by a 4-byte length, and 4-byte-aligned payload.

3.1: Atom chunk (Atom/AtU8)

The Atom chunk contains the names of each atom used by the program. Atoms are special strings such as module names and function names. When referencing a function such as ssl:start/0, the Atom section will contain "ssl" and "start", and these strings will be referenced by an index into the atom table. Each atom is a length-prefixed string up to 256 bytes, and the total count of atoms is stored at the start of the chunk. The Atom chunk was historically identified by the "Atom" tag, but this seems to have been superceded by the "AtU8" tag, indicating a UTF-8 Atom table. Lastly, the first entry in the Atom table must be the module name corresponding to the beam file.

3.2: String Table (StrT)

This section contains certain string literals. You may assume that this would be where the "https://binary.golf/5/5" string would be stored, but in fact that's actually stored in the literal table (section 3.5). The String Table must be present, but can be empty in this case. An empty string table is simply `b"StrT\0\0\0\0"`.

3.3: Import Table (ImpT)

Erlang function signatures are in the form `[module]:[function]/arity`. The module and function name are stored in the Atom table, and the arity is an integer corresponding to the number of arguments the function expects. The Import Table begins with the count of imports, followed by a sequence of (module, function, arity) fields (32-bit per field, or 12 bytes per import). The module and function names are indices into the atom table, starting at 1 for the first element.

3.4: Export Table (ExpT)

The Export Table is similar to the Import Table, but instead of storing the module name, the label of the function entrypoint is stored. The section starts with a count of exports, followed by a sequence of (function, arity, label) fields (32-bit per field, or 12 bytes per export).

3.5: Literal Table (LitT)

The Literal Table consists of an Erlang External Term Format (ETF) encoded list of literals. Unlike the string table or atom table, ETF encodes type information as well as value. The encoded data is compressed with Zlib, and the uncompressed size is stored alongside the compressed data. This already encoded only a single literal (the URL), and I found no way to shrink it further, so I simply copied the raw body of the literal table from my previous .beam files.

3.6: Code Section (Code)

The Code section is much more complicated than other sections, and I found The BEAM Book [6] to be very helpful for understanding the instruction format. The code section begins with a header containing the following 32-bit fields:

SubSize: Remaining size of the Code header, to allow backwards compatibility if the fields change.
InstructionSet: Versioning for the BEAM bytecode. This is currently still 0 for all known Erlang versions.
OpcodeMax: Highest opcode ID used by the code. New opcodes are added sequentially, so this is a quick check that the code is supported.
LabelCount: Number of local labels + 1
FunctionCount: Number of locally defined functions

This header is immediately followed by the bytecode.

Optimizations

The first optimization focused on the imports table. I realized I could remove the import of `q:q/1` by calling httpc:request/1 with the non-string HTTP response returned by httpc:request/1. The error message contained the response headers and body just like the previous error message. This also allowed me to remove the `q` atom from the atoms table by renaming the module and main function to match other atoms. I went with renaming the main function to `start:inets/0`.

Next, I worked on optimizing the bytecode, and I was able to save a byte by replacing the call_ext_last at the end of the function with a call_ext, and removing the allocate instruction. The compiler enforced the consistency of the memory allocation, but the BEAM VM doesn't seem to care.

These optimizations bring the file size down to 248 bytes.

Technique 4: Compression

After doing all of this work, I found out that .beam files support compression. Gzipping the file saves another 41 bytes, bringing the final size down to 207 bytes.

Building the final submission: mkbeam.py

import gzip 

def wdword(n, length=4):
    return n.to_bytes(length=length, byteorder='big')

def mk_atom(atoms):
    atb=wdword(len(atoms))
    for i in atoms:
        atb+=wdword(len(i), 1)
        atb+=i
    atom=b'AtU8'+wdword(len(atb))+atb
    return atom

def mk_code(instrs):
    codehdr=b''
    codehdr+=wdword(16) # infosz??
    codehdr+=wdword(0) # version
    codehdr+=wdword(169) # opcode_max, could be set lower
    codehdr+=wdword(3) # labels 
    codehdr+=wdword(1) # fcount
    return b'Code'+wdword(len(codehdr)+len(instrs))+codehdr+instrs
   
def mk_impt(imps, atoms):
    def parse_imp_sig(imp):
        mod, meth=imp.split(":")
        meth, arity=meth.split("/")
        return mod.encode(), meth.encode(), int(arity)
    data=wdword(len(imps))
    for i in imps:
        mod, meth, arity=parse_imp_sig(i)
        data+=wdword(atoms.index(mod)+1)
        data+=wdword(atoms.index(meth)+1)
        data+=wdword(arity)
    impt=b'ImpT'+wdword(len(data))+data
    return impt

def mk_expt(exps, atoms):
    data=wdword(len(exps))
    for i in exps:
        data+=wdword(atoms.index(i[0])+1)
        data+=wdword(i[1])
        data+=wdword(i[2])
    expt=b'ExpT'+wdword(len(data))+data
    return expt

# all args must be constant values in range(0,16)
def inst(op, *args):
    x=bytearray()
    x.append(op)
    for i in args:
        assert 0<=i<16
        x.append(i<<4)
    return bytes(x)

def call(name, funcs):
    n=int(name.split("/")[1])
    fn=funcs.index(name)
    return inst(7, n, fn)

def func_info(mod, func, arity, atoms):
    m=atoms.index(mod)+1
    f=atoms.index(func)+1
    data=bytearray()
    data.append(2)
    data.append((m<<4)|2)
    data.append((f<<4)|2)
    data.append(arity<<4)
    return bytes(data)

def bytecode(atoms, imports, exports):
    b=b''
    # main()
    b+=inst(1, 1) # label 1
    b+=func_info(b'start', b'inets', 0, atoms)
    b+=inst(1, 2) # label 2
    b+=call('ssl:start/0', imports)
    b+=call('inets:start/0', imports)
    mov =b'\x40' # MOV
    mov+=b'\x47' # LITERAL
    mov+=b'\x00' # literal idx
    mov+=b'\x03' # reg X0
    b+=mov          # mov(lit[0], X0)
    b+=call('httpc:request/1', imports)
    b+=call('httpc:request/1', imports)
    b+=b'\x03' # int_code_end
    return b

def pad4(data):
    if len(data)&3 ==0: return data
    return data + b'\0'*(4-(len(data)&3))

def create():
    # module: start
    # function: inets
    atoms=[b'start', b'ssl', b'inets', b'httpc', b'request']
    imports=["ssl:start/0", "inets:start/0", "httpc:request/1"]
    exports=[(b"inets", 0, 2)]
    # compressed literal table, borrowed from q.beam
    litt=bytes.fromhex("789c63606060646060906ace6610cb282929b0d2d74fcacc4b2caad44bcfcf49d337d53705007842089b")
    data=b'BEAM'
    data+=pad4(mk_atom(atoms))
    data+=pad4(mk_code(bytecode(atoms, imports, exports)))
    data+=b'StrT'+b'\0\0\0\0' # empty string table
    data+=pad4(mk_impt(imports, atoms))
    data+=pad4(mk_expt(exports, atoms))
    data+=pad4(b'LitT'+wdword(len(litt)+4)+wdword(0x22)+litt)
    return b'FOR1'+wdword(len(data))+data

with open("start.beam", "wb") as fd: 
    fd.write(gzip.compress(create()))

Running the Code

$ sha256sum start.beam 
ace72aaa0f9c546e4249e974520e71df55973b8ffbff347a193b237a090236fa  start.beam
$ erl -noshell -s start inets
Description: "Server authenticity is not verified since certificate path validation is not enabled"
     Reason: "The option {verify, verify_peer} and one of the options 'cacertfile' or 'cacerts' are required to enable this."

{"init terminating in do_boot",{badarg,[{unicode,characters_to_list,[{ok,{{"HTTP/1.1",200,"OK"},[{"cache-control","max-age=600"},{"connection","keep-alive"},{"via","1.1 varnish"},{"accept-ranges","bytes"},{"age","0"},{"server","GitHub.com"},{"vary","Accept-Encoding"},{"content-length","58"},{"content-type","application/octet-stream"},{"last-modified","Fri, 21 Jun 2024 13:56:50 GMT"},{"access-control-allow-origin","*"},{"x-proxy-cache","MISS"},{"x-cache","MISS"},{"x-cache-hits","0"}],"Another #BGGP5 download!! @binarygolf https://binary.golf\n"}}],[{file,"unicode.erl"},{line,895},{error_info,#{module=>erl_stdlib_errors}}]},{httpc,normalize_and_parse_url,1,[{file,"httpc.erl"},{line,742}]},{httpc,do_request,5,[{file,"httpc.erl"},{line,257}]},{start,inets,0,[]},{init,start_em,1,[]},{init,do_boot,3,[]}]}}
init terminating in do_boot ({badarg,[{unicode,characters_to_list,[{_}],[{_},{_},{_}]},{httpc,normalize_and_parse_url,1,[{_},{_}]},{httpc,do_request,5,[{_},{_}]},{start,inets,0,[]},{init,start_em,1,[]},{init,do_boot,3,[]}]})

Crash dump is being written to: erl_crash.dump...done

Some headers have been removed from the output for privacy.

Dissected Uncompressed BEAM file: start.beam.xx

464f5231                            | "FOR1" FORM header
000000f0                            | Filesize
4245414d                            | "BEAM" magic
-- Atom table -- 
41745538                            | "AtU8" atom table tag
00000022                            | size of atom table
 00000005                           | number of atoms
  05 7374617274                     | "start"
  03 73736c                         | "ssl"
  05 696e657473                     | "inets"
  05 6874747063                     | "httpc"
  07 72657175657374                 | "request"
 0000                               | padding
-- Code section -- 
436f6465                            | "Code" tag
0000002d                            | Size of code section
 00000010                           | Remaining size of Code header
 00000000                           | Bytecode version 0
 000000a9                           | highest bytecode opcode
 00000003                           | Number of labels
 00000001                           | Number of functions
  0110                              | {label,1}
  02123200                          | {func_info,{atom,start},{atom,inets},0}
  0120                              | {label,2}
  070000                            | {call_ext,0,{extfunc,ssl,start,0}}
  070010                            | {call_ext,0,{extfunc,inets,start,0}}
  40470003                          | {mov,{literal,"https://binary.golf/5/5"},{x,0}}
  071020                            | {call_ext,1,{extfunc,httpc,request,1}}
  071020                            | {call_ext,1,{extfunc,httpc,request,1}}
  03                                | {int_code_end}
 000000                             | padding
-- String Table --
53747254                            | "StrT" String Table tag
00000000                            | String Table size
-- Import Table --
496d7054                            | "ImpT Import Table tag
0000 0028                           | Size of import table
 0000 0003                          | Number of imports
  [ Module | Function | Arity ]
  00000002 00000001 00000000        | ssl:start/0
  00000003 00000001 00000000        | inets:start/0
  00000004 00000005 00000001        | httpc:request/0
-- Export Table --
45787054                            | "ExpT" Export Table tag
00000010                            |  Export Table size
 00000001                           | Number of exports
 [ Function | Arity | Label ]
  00000003 00000000 00000002        | start:inets/0 @ label 2 (module name implied)
-- Literal Table --
4c697454                            | "LitT" Literal Table tag
0000002f                            | Size of literal table
 00000022                           |Uncompressed size of literal table
  789c63606060646060906ace6610    #
  cb282929b0d2d74fcacc4b2caad4    #-| Compressed literal table
  4bcfcf49d337d53705007842089b    #
 0000                               | padding

References

[1] haxrob's .erl BGGP5 Entry

[2] Binary Golf Grand Prix 5

[3] BEAM File Format

[4] File Format for BEAM R5 and Later

[5] Erlang BEAM VM Specification

[6] The Beam Book