Sequence before the bootloader is reached:
jmp to 0xFFFFF0
Writing a bootloader can only be done in assembly, not C. Here's a NASM refresher:
When you use db/dw after a label a, the bytes are written to memory at the current position; a is then the address of the start of the sequence, and [a] is the value stored there. Words are stored with the least significant byte first, so after a: dw 0xAA55, byte [a] will be 0x55. The bytes/words are just stored at the current position. You might have to jmp over them in order to avoid executing them.
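The byte order can be checked without assembling anything. The following is only a sketch using Python's struct module to mimic what dw 0xAA55 emits; the variable names are made up for illustration.

```python
import struct

# dw 0xAA55 emits one 16-bit word in little-endian order ("<H" in struct terms).
word_bytes = struct.pack("<H", 0xAA55)

# The byte at the label's address is the low byte; the high byte follows it.
first_byte = word_bytes[0]
second_byte = word_bytes[1]

print(f"{first_byte:#04x} {second_byte:#04x}")
```

So a byte-sized read at the label sees 0x55, and only a word-sized read sees 0xAA55.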
You'd think somebody would have created a language which is higher level than assembly but lower level than C by now; something that maps quite cleanly to assembly.
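Does NASM support both 0x7C and 7Ch syntax? Yes, it accepts both hex forms, as long as the h-suffixed form starts with a digit (0FFh, not FFh).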
More instructions. Some register definitions.
The bootloader's job is mainly to load the kernel. But it only has 512 bytes to do it in.
org 0x7C00 (origin 0x7C00)
bits 16 (or [bits 16])
Apparently, though, some BIOSes use 07C0:0000 (older ones) and some use 0000:7C00 (newer ones). Both name the same physical address but with different offsets, which means that you have to canonicalise CS:IP. CS is the code segment, and IP is the instruction pointer or offset. I guess this is what org does? "Its sole function is to specify one offset which is added to all internal address references within the section; it does not permit any of the trickery that MASM's version does." (NASM Manual) Oh, this just tells NASM which offset the code will be loaded at. It's the offset within the current segment, so org 0x7C00 will only work if we're in 0000:7C00, not 07C0:0000. That's why we have to canonicalise; but org doesn't do it for us.
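To see why the two conventions clash, here is a small sketch of the real-mode address calculation (segment times 16 plus offset). The helper name linear and the example label position of 0x10 bytes into the sector are assumptions for illustration.

```python
def linear(segment, offset):
    """Real-mode physical address: segment * 16 + offset."""
    return (segment << 4) + offset

# Both BIOS conventions point at the same physical byte.
old_style = linear(0x07C0, 0x0000)
new_style = linear(0x0000, 0x7C00)

# With org 0x7C00, a label 0x10 bytes into the sector (hypothetical position)
# is assembled as offset 0x7C10.
label_offset = 0x7C00 + 0x10
with_cs_0000 = linear(0x0000, label_offset)  # lands inside the boot sector
with_cs_07c0 = linear(0x07C0, label_offset)  # lands 0x7C00 bytes too high

print(hex(old_style), hex(new_style), hex(with_cs_0000), hex(with_cs_07c0))
```

The same label resolves correctly only when the segment register matches the value that org assumed, which is the reason for forcing CS to a known value.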
So I guess that jmp far 0:0x7C00 works. Maybe. Though then that is probably going to reach the jmp instruction again, so that'll just loop. You can't load CS and IP with mov, hence the jump. Perhaps you can also do something like this:
org 0x7C00
jmp far 0:start
start:
...
There are at least three different ways to approach this. This thread describes some of them, and advises this as the best approach:
[BITS 16]
[ORG 0]
jmp 0x7C0:main
main:
mov ax, 0x7C0
mov ds, ax
mov es, ax
mov ss, ax
mov sp, [whatever you like]
This is nice and clean. ORG gives the positions for the labels, but we just jump into the 07C0(:0000) segment so the labels end up in the right place anyway. The ORG 0 and jmp here canonicalise. If you're using an ascending stack you could set it to 0x7C00 + 512 = 07E0:0000 anyway.
The registers are:
So, CS is Code Segment, DS is Data Segment, SS is Stack Segment. Cf. x86 Architecture.
You could set the stack first and then save DS and ES along with FS, GS, and BX, CX, and DX.
Direction flag: string instructions (movs, stos, lods, cmps, scas) increment SI/DI if the flag is clear and decrement them if it is set. In other words, those instructions walk up through memory or down through memory based on the flag. It has nothing to do with the stack: push always decrements SP. Since the direction flag only affects string instructions, clearing it (cld) early on just guarantees that string operations run forwards, whatever state the BIOS left the flag in.
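As a rough model of what the flag changes, the sketch below imitates rep movsb on a flat bytearray. It ignores the DS and ES segment bases, and the function name and buffer layout are invented for the example.

```python
def rep_movsb(memory, si, di, count, direction_flag):
    """Model of rep movsb: copy count bytes, stepping SI and DI by +1 or -1."""
    step = -1 if direction_flag else 1
    for _ in range(count):
        memory[di] = memory[si]
        si = (si + step) & 0xFFFF  # SI and DI are 16-bit registers
        di = (di + step) & 0xFFFF
    return si, di

# DF clear (after cld): start at the first byte and walk upward.
mem_up = bytearray(32)
mem_up[0:5] = b"hello"
up_regs = rep_movsb(mem_up, si=0, di=16, count=5, direction_flag=False)

# DF set (after std): start at the last byte and walk downward.
mem_down = bytearray(32)
mem_down[0:5] = b"hello"
down_regs = rep_movsb(mem_down, si=4, di=20, count=5, direction_flag=True)
```

Both runs copy the same five bytes; only the starting point and the final SI/DI values differ. SP is never touched.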
Note how setting SS and SP works:
any move to ss will cause the move and the next instruction to be
executed atomically. This way you can set up a stack using
mov ss,ax
mov sp,bx
without fearing that some interrupt will use the new ss with the old sp
The boot signature, which needs to go at the end of the boot sector, is dw 0xAA55. There's a standard way to put it there:
times 510 - ($ - $$) db 0
dw 0xAA55
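times is a prefix rather than a macro: it repeats the instruction or data directive that follows it the given number of times.
$ is the current offset / pointer, and $$ is the start of the current section (0x7C00 with org 0x7C00), so $ - $$ is the number of bytes emitted so far.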
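The padding arithmetic can be mirrored outside the assembler. This is a sketch, not a replacement for NASM: pad_boot_sector is a made-up helper, and the two-byte program is just jmp $ (bytes EB FE) used as a stand-in for real code.

```python
SECTOR_SIZE = 512
SIGNATURE = bytes([0x55, 0xAA])  # dw 0xAA55, low byte first

def pad_boot_sector(code):
    """Pad code with zeros up to byte 510, then append the boot signature."""
    if len(code) > SECTOR_SIZE - 2:
        raise ValueError("code does not fit in 510 bytes")
    # Same count as: times 510 - ($ - $$) db 0
    padding = bytes(SECTOR_SIZE - 2 - len(code))
    return code + padding + SIGNATURE

# Stand-in program: jmp $ (EB FE), an endless loop.
sector = pad_boot_sector(bytes([0xEB, 0xFE]))
print(len(sector), sector[510:].hex())
```

If the code grows past 510 bytes the count goes negative, which NASM reports as an error; the ValueError plays the same role here.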
0x500 to 0x9FBFF is mostly free (0000:0500 to 9000:FBFF). You get 1 MB in Real Mode, which is up to byte 0xFFFFF, except that BIOS takes some of that. (0x9FBFF - 0x500) bytes in KB = 637.749 KB, about the 640 KB figure. Actually, of course, 0x7C00 to 0x7DFF is taken by the MBR. Memory at 0x80000 and above might not exist, but if it does, you can use it. The CMU course page advises loading the second stage bootloader into 0x1000 to 0x31FF. You want it there because you really only want to use the first segment, 0000:0000 to 0000:FFFF. Below the boot sector there's only 0x500 to 0x7C00 of contiguous space there, which is about 29.75 KB. Besides CS, the segment registers are DS, ES, FS, GS, and SS.
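The sizes quoted above are easy to re-derive. The sketch below only repeats the arithmetic from the text; the variable names are illustrative, and the sector count for the second stage is my own derived figure, not something the CMU page is quoted as saying.

```python
KB = 1024
SECTOR = 512

# Mostly free low memory, computed the same way as in the text.
low_free_kb = (0x9FBFF - 0x500) / KB

# Contiguous space between the BIOS data area and the boot sector.
below_boot_sector_kb = (0x7C00 - 0x500) / KB

# Suggested second-stage window, 0x1000 to 0x31FF inclusive.
stage2_bytes = 0x31FF - 0x1000 + 1
stage2_sectors = stage2_bytes // SECTOR

print(round(low_free_kb, 3), below_boot_sector_kb, stage2_bytes, stage2_sectors)
```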
One designer seems to have chosen SS = 0x7FC0 and SP = 0x0400 for the boot stack, which I first read as 0x83C0 linear. But: "Addresses loaded into the segment registers are actually pointers to 16-byte blocks, so to convert the number to its actual address, shift it left by 4-bits. That means that the address 07C0h is actually referring to 7C00h, which is where your bootloader code is placed. 288 is 120h in hex, and so the actual location of the stack is really 7C00h + 1200h = 8E00h." So the stack segment actually starts at 0x7FC0 << 4, which is 0x7FC00, which makes more sense. This is 0x400 bytes short of the 0x80000 memory ceiling. Setting SP to 0x400 then means that SP starts at the top of the 0x400 (1024) bytes of allocated stack, at 0x80000 linear. More about stacks.
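A short check of the stack arithmetic, as a sketch: stack_region is a hypothetical helper that returns the linear base of the stack segment and the linear address SP initially points at.

```python
def stack_region(ss, sp):
    """Return (segment base, initial top of stack) as linear addresses."""
    base = ss << 4  # segment values count 16-byte blocks
    return base, base + sp

# SS = 0x7FC0, SP = 0x0400 from the text.
base, top = stack_region(0x7FC0, 0x0400)

# The quoted example: segment 07C0h plus 288 paragraphs.
quoted_base = (0x07C0 + 288) << 4

print(hex(base), hex(top), hex(quoted_base))
```

Because push decrements SP before storing, the first push writes just below top, so the stack occupies the 0x400 bytes between base and top.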
Memory segments: a compromise between the availability of 64 KB memory and the emergence of larger memory spaces, or between 20-bit and 16-bit addressing? 07C0:0000 is 0x7C00. Registers CS, DS, ES, and SS store segment addresses (the 07C0 part in the previous example).
If you use a BPB (BIOS Parameter Block) it looks like BIOS can shortcut looking up stuff from a filesystem.
If all else fails, the bootloader should do int 0x18 and int 0x19 to pass control back to BIOS.
Three basic steps to switch to protected mode:
; Disable interrupts
cli
; Could also disable NMI and enable A20 Gate here
; Load a GDT (Global Descriptor Table)
lgdt [gdtr]
; Set the lowest bit of CR0 (or MSW)
mov eax, cr0
or eax, 1
mov cr0, eax
The x86 Assembly page says to flush registers after loading the GDT:
flush_gdt:
lgdt [gdtr]
jmp 0x08:complete_flush
complete_flush:
mov ax, 0x10
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax
mov ss, ax
ret
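The constants 0x08 and 0x10 in that snippet are segment selectors, not addresses. The sketch below decodes the selector bit fields; it assumes the usual layout where GDT entry 0 is the null descriptor, entry 1 is code and entry 2 is data, which the snippet implies but does not show. decode_selector is an illustrative name.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into its three fields."""
    return {
        "index": selector >> 3,      # which descriptor table entry
        "ti": (selector >> 2) & 1,   # 0 = GDT, 1 = LDT
        "rpl": selector & 3,         # requested privilege level
    }

code_selector = decode_selector(0x08)  # target of the far jmp, loaded into CS
data_selector = decode_selector(0x10)  # loaded into DS, ES, FS, GS and SS

print(code_selector, data_selector)
```

Each GDT entry is 8 bytes, which is why consecutive ring 0 selectors are 0x08 apart.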