
Well, Intel documented that the very first instruction after enabling protected mode had to be an "intra-segment" (not inter-segment) jump, to flush the prefetch queue. At least that was what it said in the 286 and 386 documents I read. You were supposed to set up everything else needed before that, do this near jump, and then jump to the new protected mode code segment.
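
For illustration, the sequence those 286/386 documents describe might look roughly like this. It is only a sketch in NASM-style syntax: Gdtr and Idtr are borrowed from the test code further down, while CodeSel and pm_entry are hypothetical names for a code selector in the GDT and the protected mode entry point.

        cli                     ;no interrupts until the new IDT is usable
        lgdt    [Gdtr]          ;set up everything else first
        lidt    [Idtr]
        smsw    ax              ;read the current MSW
        or      al,1            ;set the PE bit
        lmsw    ax              ;protected mode is on from here
        jmp     short flush     ;near jump, discards what was prefetched in real mode
    flush:
        jmp     CodeSel:pm_entry ;hypothetical far jump, reloads CS from the GDT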

Some later documentation contradicted this, saying that instead this first jump had to be to the protected mode segment.

From the patent (US4442484), it is apparent that the processor decodes opcodes into a microcode entry point before they are executed, and the PE bit is one of the inputs for the entry point PLA. So that would be the obvious reason for flushing the prefetch queue - but it turns out that at least on the 80286, most instructions go to the same entry point regardless of the mode they are decoded in. So they should work the same without flushing the queue.

And yet for some reason, what I've seen in my experiments is that the system would reset if there were three instructions following the "LMSW" without a jump. Even something harmless like "NOP" or "MOV AX,AX", that couldn't be different between real and protected mode. Maybe there is some clock phase where the PE bit changing during the decoding of an instruction leads to an invalid entry point, that either causes a triple fault or resets the processor?



I disassemble and read a lot of vintage BIOSes for fun. Recently I looked at something more recent: an Atom N270 945GSE Mini-ITX industrial board from 2010, with a Phoenix BIOS:

    seg000:FD56 Unreal_FFD56    proc near               ; CODE XREF: CPU_MicrocodeUpdate+A↑j
    seg000:FD56                                         ; VGA_BIOS_Shadow+20↑p ...
    seg000:FD56                 lgdt    fword ptr cs:[bx]
    seg000:FD5A                 mov     eax, cr0
    seg000:FD5D                 or      al, 1
    seg000:FD5F                 mov     cr0, eax
    seg000:FD62                 jmp     short $+2
    seg000:FD64 ; ---------------------------------------------------------------------------
    seg000:FD64
    seg000:FD64 loc_FFD64:                              ; CODE XREF: Unreal_FFD56+C↑j
    seg000:FD64                 mov     ax, 8
    seg000:FD67                 mov     ds, ax
    seg000:FD69                 assume ds:nothing
    seg000:FD69                 mov     es, ax
    seg000:FD6B                 assume es:nothing
    seg000:FD6B                 mov     eax, cr0
    seg000:FD6E                 and     al, 0FEh
    seg000:FD70                 mov     cr0, eax
    seg000:FD73                 jmp     short $+2
    seg000:FD75 ; ---------------------------------------------------------------------------
    seg000:FD75
    seg000:FD75 loc_FFD75:                              ; CODE XREF: Unreal_FFD56+1D↑j
    seg000:FD75                 xor     ax, ax
    seg000:FD77                 mov     ds, ax
    seg000:FD79                 assume ds:nothing
    seg000:FD79                 mov     es, ax
    seg000:FD7B                 assume es:nothing
    seg000:FD7B                 retn
    seg000:FD7B Unreal_FFD56    endp
Two short jumps, no far jumps in sight. Apparently this works just fine on Pentium 4s, Core 2s and Atoms.
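
The listing does not show the GDT that cs:[bx] points to. Here is a guess at what selector 8 would have to look like for this to give "unreal" mode: a flat data descriptor with a 4 GiB limit, so that the large limit stays cached in DS and ES after PE is cleared again. All labels below are made up, and the actual table in this BIOS may well differ.

    gdt_start:
        dq      0                       ; selector 0x00: mandatory null descriptor
        dw      0FFFFh                  ; selector 0x08: limit bits 15..0
        dw      0                       ; base bits 15..0
        db      0                       ; base bits 23..16
        db      92h                     ; access: present, ring 0, data, writable
        db      0CFh                    ; 4 KiB granularity, 32-bit, limit bits 19..16 = 0Fh
        db      0                       ; base bits 31..24
    gdt_end:

    gdt_ptr:                            ; 6-byte operand for "lgdt fword ptr"
        dw      gdt_end - gdt_start - 1 ; table size in bytes, minus one
        dd      gdt_start               ; linear base (assumes segment base 0, else add segment*16)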


Yes, the far jump was never necessary on any processor, only a convention. You can stay in the same segment as in real mode and it will continue to work. But some kind of control transfer to flush the queue must be done shortly after the LMSW / MOV CR0, or things may break in ways that I'm not entirely clear on.

My test code looked like this:

        mov     ax,1            ;new MSW
        mov     bx,TestSel      ;pointer to selector value into BX
        mov     dx,[bx]         ;and load into DX
        mov     cl,31           ;shift count for delay
        cli                     ;disable interrupts
        lgdt    [Gdtr]
        lidt    [Idtr]
        jmp     enter_pm        ;flush queue now
               
    align 2
    enter_pm:                   ;go!
        rol     cl,cl           ;delay while following instructions decode
        lmsw    ax              ;set PE bit
        mov     es,[bx]         ;should load selector 0x0010 into ES
        mov     ds,dx           ;should set DS base to 0x00100 [NOPE]
        str     ax              ;should trap because not allowed in real mode
        ud2                     ;trap anyway in case it didn't
On the 286, this always caused the processor to reset. Replacing one of the two segment load instructions with a same-length "mov ax,ax" didn't change that, but removing one of them did.

In that case the "str ax" acted as the control transfer that flushed the queue (it was still decoded in real mode, so it went to the "invalid opcode" entry point). I have no clue what exactly causes the reset when three instructions are run from the queue. Perhaps it is some timing issue related to when the PE bit actually changes vs. what the decoder is doing at that point?


Until this BIOS, I don't remember ever seeing a switch to protected mode without a far jump into a freshly loaded 32-bit selector.


Guess: Intel changed the spec. There are quite a few generations between a 286 and a P4, and new BIOS code doesn't need to run on discontinued CPU types. And new execution contexts like https://en.wikipedia.org/wiki/System_Management_Mode might benefit from minimizing the setup needed to run in protected mode.



