Over the last few weeks, I experimented with coding in assembler on OpenBSD. In this article, I want to demonstrate how a "hello world" binary can be written. But keep in mind that I myself am only learning. I know some assembler from my earlier years (I did some ROM hacking for NES and SNES games) and I had a course that taught MIPS assembler using SPIM. But until now, I had never written any Intel/AMD assembly.
In theory, you can choose between two ways of writing an application in assembly: You can make system calls directly or you can call the syscall wrapper functions in existing runtime libraries like libc. I opted for the second method, as it is the more stable interface on OpenBSD - the OpenBSD developers are not shy about changing the ABI if it helps them get rid of old cruft or otherwise improve the system's overall design. This also makes it impractical to ship closed source software for OpenBSD, but nobody likes closed source software either way ;)
Another reason to choose libc is that there was some effort to limit which code can make syscalls directly at all. This increases security by reducing the ways in which attackers can make syscalls. While ASLR makes it difficult for an attacker to find and call a specific function, making syscalls is much easier as the syscall numbers are statically known. When only libc is allowed to make syscalls, the attacker has a much harder time. Research msyscall(2) if you are interested in this.
A final reason to call into libc from assembler is that the same method will also help me use other existing C libraries. There is no reason why I should not be able to write, say, an ncurses application with assembler, if I find a use case for that.
Anyway, let's get the show started!
If you want to follow along, you need an i386 or amd64 OpenBSD system. The base system contains the GNU assembler in version 2.17, which is recent enough for my needs, but you can also install a more recent version as a package/port. The package is called 'gas', which is also the name of the binary.
Create a new folder and copy the following Makefile into it. I use the same Makefile for both i386 and amd64, so you do not need to modify it.
hello: hello.o openbsd-note.o
	ld --dynamic-linker=/usr/libexec/ld.so -pie -L/usr/lib -lc -o hello \
		openbsd-note.o hello.o

openbsd-note.o: openbsd-note.s
	as -g -o openbsd-note.o openbsd-note.s

hello.o: hello.s
	as -g -o hello.o hello.s

clean:
	-rm *.o
	-rm hello
As you can see in the Makefile, my project contains an openbsd-note.s. That file contains a mark that identifies the resulting binary as an executable for OpenBSD. If you leave it out, the binary seems to be executed like a script would be: As it has no shebang, it is given to a very confused shell.
	.section ".note.openbsd.ident", "a"
	.p2align 2
	.long 8,4,1
	.ascii "OpenBSD\0"
	.long 0
I admit that the contents of this file are some cargo culting - projects like fpc, the FreePascal compiler, use it, but I do not fully understand whether all values are correct and necessary. The only thing I can say so far is that with this note everything works. I might research it later.
hello.s (i386 edition)
Let's now start with the fun part.
	.intel_syntax noprefix

	.global _start
	.global puts
	.global exit

	.text
_start:
	call get_sp_in_eax
	add eax, OFFSET FLAT:_GLOBAL_OFFSET_TABLE_
	lea edx, msg@GOTOFF[eax]
	push edx
	mov ebx, eax
	call puts@plt
	pop edx
	push 0
	call exit@plt

get_sp_in_eax:
	mov eax, DWORD PTR [esp]
	ret

	.data
msg:
	.asciz "Hello world!"
I start the code with a directive that states that I want to work in Intel syntax. After that, I declare the global symbols I want to use, which are _start as the program entry point (think 'main' in C, but more low-level: even your C programs normally start at a global named _start, which initializes the C runtime and calls main!), as well as the two libc functions I want to call. I believe I could leave these two out and everything would still work, but I like that the file says at the top which functions it uses.
The next and most important part is marked with ".text". This tells the assembler: Put the following stuff into the text section of the resulting ELF file, which is where executable code goes. This is where my _start function lives.
Most of _start is straightforward. Starting from the bottom, I call exit at the end and pass it the parameter 0 via a push onto the stack. This guarantees that my hello world program will terminate cleanly and return an exit code of 0.
The call to puts, on the other hand, is a little more complicated. The binary is written as a position-independent executable, which means that no address is known statically. The code will be loaded at a random address, but all symbols from hello.o will be at the same distance from each other on every run. This can be used to calculate any symbol's address from a single known address. That is what we have to do to find msg, the string constant that shall be passed to puts.
I first call a little helper function named get_sp_in_eax. When a call instruction is used, the address of the next instruction (i.e. the call's return address) is pushed onto the stack. The helper function reads this address from the stack and copies it to the eax register.
After returning, we use that information to calculate the address of the GOT, the _GLOBAL_OFFSET_TABLE_ that can be used to find the other symbols. The lea instruction uses it to find msg and put its address into the edx register. Finally, we push the edx register onto the stack, so that puts can find the address of msg there.
Now, I haven't explained the '@plt' stuff yet. It is used to find functions in libraries. I've read that on i386 it expects the address of the GOT in the ebx register, which is why I copy it from eax into ebx (I might improve the code by having the helper return the address in ebx in the first place, but I have not tried that yet). The PLT is the Procedure Linkage Table. It contains a small stub for every imported function and uses lazy binding as an optimization: Each function's address is resolved on the first call to the function.
One last thing you might find superfluous is the pop after the call. According to the Wikipedia article on x86 calling conventions, the C calling convention uses caller cleanup. That means that the address of msg is still on top of the stack after puts has returned. This is no problem here, as the next call will terminate the program either way, but if you write a longer program and do not clean up after your calls, you will eventually overflow your stack.
After the text section, I define msg and put it into the data section. The .asciz directive can be used for zero-terminated ASCII strings - just what C functions like puts like.
hello.s (amd64 edition)
On amd64, many things are easier.
	.intel_syntax noprefix

	.global _start
	.global puts
	.global exit

	.text
_start:
	lea rdi, [rip + msg]
	call puts@plt
	xor rdi, rdi
	call exit@plt

	.data
msg:
	.asciz "Hello world!"
As you can see, the whole GOT magic got replaced, as we can now use the register rip directly for this. We don't need the helper function and can directly get the address of msg.
The next change is because function parameters are passed via registers: We do not need to push and pop anymore, but instead put the right value into the rdi register (the stack will still be used if you need more space than the registers allow for).
The 'xor rdi, rdi' is an idiom I learned a long time ago. It zeroes the given register. I have no idea if it is still better (faster/smaller) than a mov, but it will work either way.
And that's it!
If you now call make, you should get a binary that prints "Hello world!" and exits. I learned a lot on the journey and was surprised how much is going on below what you have to know when writing C.
I might do a similar exercise for my Raspberry Pi next. ARM assembly is less ugly than what I had to work with here. That aside, I believe that learning on OpenBSD has prepared me for the worst :)