From time to time we might observe special Malware storing themselves into a MBR and run during the booting process. Attackers could use this neat technique to infect and to mess-up your disk and eventually asking for a ransom before restoring original disk-configurations (Petya was just one of the most infamous boot-ransomware). But this is only an already known scenario while humongous possibilities are still available for the attacker who holds physical rights to open your disk and to write in it whatever he desires. For this reason I believe it would be interesting to understand how MBR works and how is it possible to write a boot loader program, this skill will help you during the analysis of your next Boot Loader Malware.
How the PC boot process works ?
Actually the boot process is super easy. When you press the power button you are providing the right power to every electronic chips who needs it. The BIOS once is reached by electrical power starts by running its own stored code and when it finishes running its initialization routines it looks for bootable devices. A bootable device is a physically connected device who has 521 bytes of code at its beginning and that contains the boot magic number:
0x55AA as last 2 bytes. If the BIOS find 510 bytes followed by
0x55AA it takes the previous 510 bytes moves them into RAM (to
0x7c00 address) and assumes they are executable bytes. This code is the so-called bootloader. Just a side note: bootloader shall be written in 16bit since x86 compatible CPUs are working in “real_mode” due the limited available instruction set.
I am used to write and read assembly on “Intel sintax” (it’s the one I learned during my studies) but today I’d love to use GNU Assembler (compiler&linker) who implements AT&T syntax, which is quite different from the Intel one but it will just work fine for the simple code we are going to write. The first tool we are going to use is
as, the GNU compiler, which takes as input an assembly file and it returns its binary representation.
as -o boot.o boot.asm is what we are looking for. After the compiler we need a “linker” (GNU linker is called ld). We need to tell to the liner that we want a plain binary file without linked libraries or linked symbols, fir such a reason we’re going to use
--oformat binar. We also need to tell to the “linker” where the code starts (
-e main). We would add the parameter
-Ttext 0x7c00 just in case the code we are going to write does not fit into a 16bit address space, so we will force our linker to map the main function at such address which we know be the address where the BIOS runs bootloaders. Assuming our code named boot.asm and our original entry point to be labelled as ‘main’ we could use the following command:
ld -o boot.bin --oformat binary -e main -Ttext 0x7c00 boot.o. For running the compiled code I’ve just used qemu in the following simple way:
The following code runs on boot showing up 3 strings and a realtime clock progression. The code have been developed as demo, not caring about performance and optimization, I am sure the code could be optimized and beautified, but this is not my point for this post.
Since the BIOS is in near memory, we are able to use a whole BIOS instruction set as described in here. The used interrupts for the demo bootloader are the following:
1. Int_10,02 for setting up screen size
2. int_10,07 for cleaning the screen from BIOS outputs
3. int_12a,02 for setting cursor positions
4. int_1a,02 for reading the clock status
5.int_10,0e for writing character to screen
Following the “booting source” code is getting explained
Even if the code is slef-described let’s dig a little bit into the structure. The first two lines:
say that the code is going to be written in 16bit mode and the external (exposed) tagged function is the one labelled as ‘main’ (the linker needs it in order to setup the original entry point in proper address space).
The last two lines:
.fill 510-(.-init), 1, 0
say the code is bootalbe. In line 113 we have little-endian magic code while in line 112 we have the filling command, interpreted by the compiler, to fill-up (nop) the eventually empty bytes (up to 521 bytes) for getting safe the MBR structure.
The entire code exploits %cx register to setup the current state. For example %cx could be:
0x0000if msg is printed,
0x0001 if msg2 is printed,
0x0002 if msg3 is printed and
0x0003 if we want to start the clock printing loop. A very nice primitive command
lodsb is used to iterate over string characters (for more details here) in order to print them to monitor until null byte (\0).