Hello assembler!
Over the last weeks, I experimented with coding in assembler on OpenBSD. In this
article, I want to demonstrate how a "hello world" binary can be written. But
keep in mind that I myself am only learning. I know some assembler from my
earlier years (I did some ROM hacking for NES and SNES games) and I had a course
that taught MIPS assembler using SPIM.
But until now, I had never written any Intel/AMD assembly.
In theory, you can choose between two ways of writing an application in
assembly: You can issue syscall commands directly or you can call syscall
wrapping functions in existing runtime libraries like libc. I opted to implement
the second method, as it is the more stable interface on OpenBSD - the OpenBSD
developers are not shy about changing the ABI if it helps them get rid of old
cruft or otherwise improve the system's overall design. On the other hand, this
makes it impractical to ship closed source software for OpenBSD, but nobody
likes closed source software either way ;)
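To make "calling a syscall wrapping function" concrete, here is a minimal amd64
sketch that prints through libc's wrapper for write(2) instead of issuing the
syscall itself. It assumes the Makefile and openbsd-note.s described further
down, and the System V convention of passing the first three arguments in rdi,
rsi and rdx. The stack alignment at the top is a precaution added for this
example, not something taken from the program in this article.

```asm
.intel_syntax noprefix

.global _start

.text
_start:
    and rsp, -16             # precaution: C code expects a 16-byte aligned stack
    mov edi, 1               # 1st argument: file descriptor 1 (stdout)
    lea rsi, [rip + msg]     # 2nd argument: address of the buffer
    mov edx, 13              # 3rd argument: length of msg in bytes
    call write@plt           # libc's wrapper performs the actual syscall
    xor edi, edi             # 1st argument: exit status 0
    call exit@plt

.data
msg:
    .ascii "Hello world!\n"
```

The program never contains a syscall instruction of its own. It only depends on
the names and signatures of write and exit, which is the interface that stays
stable.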
Another reason to choose libc is that there was some effort to limit which code
can use syscalls directly at all. This increases security by reducing the ways
for attackers to make syscalls. While ASLR makes it difficult for an attacker to
find and call a specific function, making syscalls is much easier as they are
statically known. When only libc is allowed to use syscalls, the attacker has a
much harder time. Read msyscall(2) if you are interested in this.
A final reason to call into libc from assembler is that the same method will
also help me use other existing C libraries. There is no reason why I should not
be able to write, say, an ncurses application with assembler, if I find a use
case for that.
Anyway, let's get the show started!
Preparations
If you want to follow along, you need an i386 or amd64 OpenBSD system. The base
system contains the GNU assembler in version 2.17, which is recent enough for my
needs, but you can also install a more recent version as a package/port. The
package is called 'gas', which will also be the name of the binary.
Create a new folder and copy the following Makefile into it. I use the same
Makefile for both i386 and amd64, so you do not need to modify it.

hello: hello.o openbsd-note.o
	ld --dynamic-linker=/usr/libexec/ld.so -pie -L/usr/lib -lc -o hello \
		openbsd-note.o hello.o

openbsd-note.o: openbsd-note.s
	as -g -o openbsd-note.o openbsd-note.s

hello.o: hello.s
	as -g -o hello.o hello.s

clean:
	-rm *.o
	-rm hello

openbsd-note.s
As you can see in the Makefile, my project contains an openbsd-note.s. That file
contains a mark that identifies the resulting binary as an executable for
OpenBSD. If you leave it out, the binary seems to be executed like a script
would be: As it has no shebang, it is given to a very confused shell.

.section ".note.openbsd.ident", "a"
    .p2align 2
    .long 8,4,1
    .ascii "OpenBSD\0"
    .long 0

I admit that the contents of this file are some cargo culting - projects like
fpc, the FreePascal compiler, use it, but I do not fully understand if all
values are correct and necessary. The only thing I can say so far is that with
this note everything works. I might research it later.
hello.s (i386 edition)
Let's now start with the fun part.

.intel_syntax noprefix

.global _start
.global puts
.global exit

.text
_start:
    call get_sp_in_eax
    add eax, OFFSET FLAT:_GLOBAL_OFFSET_TABLE_
    lea edx, msg@GOTOFF[eax]
    push edx
    mov ebx, eax
    call puts@plt
    pop edx
    push 0
    call exit@plt

get_sp_in_eax:
    mov eax, DWORD PTR [esp]
    ret

.data
msg:
    .asciz "Hello world!"

I start the code with a directive that states that I want to work in Intel syntax.
After that, I declare the global functions I want to use, which are _start as
the program entry point (think 'main' in C, but more low-level: even your C
programs start at a global named _start normally, which initializes the C
runtime and calls main!), as well as the two libc functions I want to call. I
believe I could leave these two out and everything would still work, but I like
that the file says which functions it uses at the top.
The next and most important part is marked with ".text". This tells the
assembler: Put the following stuff into the text section of the resulting ELF
file, which is where executable code goes. This is where my _start function
lives.
Most of _start is straightforward. Starting from the bottom, I call exit at the
end and pass it the parameter 0 via a push onto the stack. This guarantees that
my hello world program will terminate cleanly and return an exit code of 0.
The call to puts on the other hand is a little more complicated. The binary is
written as a position-independent executable, which means that no address is
known statically. The code will be loaded at a random address, but all symbols
from hello.o will be at the same distance to each other with every run. This
can be used to calculate any symbol's address from a single known address. That
is what we have to do to find msg, the string constant that shall be passed to
puts.
I first call a little helper function named get_sp_in_eax. When a call
instruction is used, the address of the next instruction (i.e. the call's return
address) is pushed onto the stack. The helper function reads this address from
the stack and copies it to the eax register. Despite its name, it therefore
returns a code address, not the stack pointer.
After returning, we use that information to calculate the address of the GOT,
the _GLOBAL_OFFSET_TABLE_ that can be used to find the other symbols. The lea
instruction uses it to find msg and put its address into the edx register.
Finally, we push the edx register onto the stack, so that puts can find the
address of msg there.
Now, I haven't explained the '@plt' stuff yet. This is used to find functions
in libraries. On i386, I've read that it expects the address of the GOT in the
ebx register, which is why I copy it into ebx from eax (I might improve the code
by putting the address into ebx in the first place, but I have not tried that
yet). The PLT is the Procedure Linkage Table. It contains a small stub for each
library function the program calls, and it also allows lazy binding as an
optimization: Each function's address is resolved on the first call to the
function.
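The untried variation from the parentheses above could look roughly like the
following sketch. It is modelled on the helper that GCC generates for
position-independent i386 code, has not been run on OpenBSD, and the name
get_pc_in_ebx is made up for this example.

```asm
.intel_syntax noprefix

.global _start
.global puts
.global exit

.text
_start:
    call get_pc_in_ebx                          # ebx = address of the next instruction
    add ebx, OFFSET FLAT:_GLOBAL_OFFSET_TABLE_  # ebx = address of the GOT
    lea edx, msg@GOTOFF[ebx]                    # edx = address of msg
    push edx
    call puts@plt                               # the PLT stub finds the GOT via ebx
    pop edx
    push 0
    call exit@plt

get_pc_in_ebx:                                  # hypothetical helper name
    mov ebx, DWORD PTR [esp]                    # copy the return address into ebx
    ret

.data
msg:
    .asciz "Hello world!"
```

Because ebx is a callee-saved register in the C calling convention, the GOT
address should survive the call to puts, so the extra mov is no longer needed.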
One last thing you might find superfluous is the pop after the call. According
to the Wikipedia article on x86 calling conventions, the C calling
convention uses caller cleanup. That means that the address of msg is still on
top of the stack after puts has returned. This is no problem here as the next
call will terminate the program either way, but if you write a longer program
and do not clean up after your calls, you will eventually overflow your stack.
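As a sketch of what caller cleanup looks like once there is more than one call,
the following i386 variation prints two lines. The second string and the use of
'add esp, 4' instead of pop are additions for this example. Adding to esp
discards the argument without needing a scratch register, and for a call with n
arguments of four bytes each it would be 'add esp, 4*n'.

```asm
.intel_syntax noprefix

.global _start
.global puts
.global exit

.text
_start:
    call get_sp_in_eax
    add eax, OFFSET FLAT:_GLOBAL_OFFSET_TABLE_
    mov ebx, eax                 # ebx is callee-saved, eax is not

    lea edx, msg1@GOTOFF[ebx]
    push edx                     # argument for the first puts
    call puts@plt
    add esp, 4                   # caller cleanup: drop the 4-byte argument

    lea edx, msg2@GOTOFF[ebx]    # eax now holds puts' return value, so use ebx
    push edx                     # argument for the second puts
    call puts@plt
    add esp, 4                   # stack is back where it was before the push

    push 0
    call exit@plt

get_sp_in_eax:
    mov eax, DWORD PTR [esp]
    ret

.data
msg1:
    .asciz "Hello world!"
msg2:
    .asciz "Hello again!"
```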
After the text section, I define msg and put it into the data section. The
.asciz directive can be used for zero-terminated ASCII strings - just what C
functions like puts expect.
hello.s (amd64 edition)
On amd64, many things are easier.

.intel_syntax noprefix

.global _start
.global puts
.global exit

.text
_start:
    lea rdi, [rip + msg]
    call puts@plt
    xor rdi, rdi
    call exit@plt

.data
msg:
    .asciz "Hello world!"

As you can see, the whole GOT magic is gone, as we can now address data relative
to the rip register (the instruction pointer) directly. We don't need the helper
function and can directly get the address of msg.
The next change is because function parameters are passed via registers: We do
not need to push and pop anymore, but instead put the right value into the rdi
register (the stack will still be used if you need more space than the registers
allow for).
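To illustrate a call with several register arguments, here is a sketch that
uses printf instead of puts. It assumes the System V calling convention for
amd64: integer and pointer arguments go into rdi, rsi, rdx, rcx, r8 and r9, in
that order, and for variadic functions al states how many vector registers carry
arguments. The strings and the explicit stack alignment are additions for this
example.

```asm
.intel_syntax noprefix

.global _start

.text
_start:
    and rsp, -16             # precaution: C code expects a 16-byte aligned stack
    lea rdi, [rip + fmt]     # 1st argument: format string
    lea rsi, [rip + who]     # 2nd argument: string for %s
    mov edx, 64              # 3rd argument: integer for %d
    xor eax, eax             # variadic call: no vector registers are used
    call printf@plt
    xor edi, edi             # 1st argument: exit status 0
    call exit@plt            # exit also flushes the buffered output

.data
fmt:
    .asciz "Hello %s, this is %d-bit code!\n"
who:
    .asciz "world"
```

Nothing has to be popped after the call, because no argument was pushed in the
first place.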
The 'xor rdi, rdi' is an idiom I learned a long time ago. It zeroes the given
register. I have no idea if it is still better (faster/smaller) than a mov, but
it will work either way.
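On the faster/smaller question: the xor form is at least the smaller one. The
sketch below lists the machine code that the GNU assembler should produce for
three ways of zeroing rdi, assuming it applies no optimizations of its own. It
then uses the shortest one, which works because a write to a 32-bit register
also clears the upper half of the 64-bit register on amd64.

```asm
.intel_syntax noprefix

.global _start
.global puts
.global exit

.text
_start:
    lea rdi, [rip + msg]
    call puts@plt

    # Three ways to put 0 into rdi, with their machine code:
    #   mov rdi, 0      48 c7 c7 00 00 00 00   (7 bytes)
    #   xor rdi, rdi    48 31 ff               (3 bytes)
    #   xor edi, edi    31 ff                  (2 bytes)
    xor edi, edi
    call exit@plt

.data
msg:
    .asciz "Hello world!"
```

The byte counts can be checked by running 'objdump -d hello.o' on the assembled
file. One difference to keep in mind is that xor modifies the flags, while mov
leaves them alone.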
And that's it!
If you now call make, you should get a binary that prints "Hello world!" and
exits. I learned a lot on the journey and was surprised how much is going on
beneath the level you have to know about when writing C.
I might do a similar exercise for my Raspberry Pi next. ARM assembly is less
ugly than what I had to work with here. But that aside, I believe that learning
on OpenBSD has prepared me for the worst :)