Deciphering Intel Instruction Set

    I have been asking questions like this lately, and LOVE that there is now a more appropriate forum for assembly.
    Below is one such question. Anyone at PB, feel free to move any of my assembly language questions to this new forum if you see fit.

    Oh, and thank you, Paul Dixon, for helping me work through my more confusing points.

    I am working on an error handler, but the more I read the Intel docs on how instructions are formatted, the more confused I get. So I was hoping that if I provide one example, someone can help me find where my confusion is.

    If I have a "Divide by Zero" Error then the following is what I think is happening.

    1.) I get the location of the Error (Given by the EIP copy of the actual IP when passed to my exception handler)

    2.) If there is a prefix (up to 4 possible) I increment my EIP by 1 byte for each prefix (in this case none found, so I sit at offset 0)

    3.) I then look at the OpCode starting at where my point now is. (in this case offset 0)

    4.) I then determine if a 1 byte or 2 byte op-code (in this case I get "F7 /6")
    Now for this particular example this becomes confusing to me. As the docs say
    /digit—A digit between 0 and 7 indicates that the ModR/M byte of the instruction uses
    only the r/m (register or memory) operand. The reg field contains the digit that provides an
    extension to the instruction's opcode.
    which would be where my "/6" comes into play.

    The rest of the below may be incorrect because I got confused at #4
    5.) At this point my opcode is F7 at offset 0, so if it is a 1-byte opcode (which I think it is) I look at the next byte for the ModR/M.

    6.) If the ModR/M exists then I increment to look at the SIB byte (at this point I now have Prefix = 0 bytes, OpCode = 1 byte, ModR/M = 1 byte).

    7.) If the SIB byte exists then I increment on to the Immediate and Displacement bytes. But "Wait for it.....Wait for it...." where the heck do I find out how long the immediate and displacement are? They could each be 1, 2, or 4 bytes. What did I miss?
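Step 2 above (skipping prefixes) can be sketched roughly as follows. This is only an illustration in Python, not code from the thread: the helper name `count_prefixes` is made up, and the byte values are the legacy prefix bytes listed in the Intel manual for 32-bit code. Note that a 66 or 67 prefix also changes the operand or address size, which affects how long a later immediate or displacement is.

```python
# Legacy prefix bytes for 32-bit x86 code, grouped as in the Intel manual.
LEGACY_PREFIXES = frozenset({
    0xF0, 0xF2, 0xF3,                    # group 1: LOCK, REPNE, REP/REPE
    0x2E, 0x36, 0x3E, 0x26, 0x64, 0x65,  # group 2: segment overrides
    0x66,                                # group 3: operand-size override
    0x67,                                # group 4: address-size override
})


def count_prefixes(code: bytes) -> int:
    """Return how many leading bytes of `code` are legacy prefixes."""
    n = 0
    while n < len(code) and code[n] in LEGACY_PREFIXES:
        n += 1
    return n


# F7 is not a prefix, so the opcode starts at offset 0.
print(count_prefixes(bytes([0xF7, 0xBD])))        # 0
# A 66 prefix in front of F7 F1 moves the opcode to offset 1.
print(count_prefixes(bytes([0x66, 0xF7, 0xF1])))  # 1
```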

    Now I may be amiss with using Divide by zero as an example but I am more for figuring out procedure than I am 1 particular code. If my guess is right then in the case of my example,

    1.) no prefix codes (none that I found anyways)
    2.) 1 byte OpCode
    3.) 1 byte ModR/M
    4.) 1 byte SIB
    Total of 3 bytes to increment and look at the next instruction.....but I could be WAYYYYyyyyy off because of confusion points above.
    haven't you been here before?

    The first byte of the opcode is F7. In this case that single byte does not specify the entire opcode; it specifies one of a set of opcodes. The extra bits that determine which opcode are held in 3 bits of the ModR/M byte that follows.

    The /6 refers to the second byte: it means that the ModR/M byte is present and that the Mod and R/M fields of that byte are valid as usual, but the reg/opcode field is 6. When the manual shows /digit, the 3 REG bits of the ModR/M byte do not represent a register; they instead form part of the opcode.
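As a small illustration of the three fields, here is a Python sketch that splits a ModR/M byte with shifts and masks. The function name `split_modrm` is hypothetical, and the example byte F1 is not from the thread; it is the ModR/M byte of DIV ECX (F7 F1), chosen because its reg/opcode field is 6.

```python
def split_modrm(modrm: int) -> tuple:
    """Split a ModR/M byte into (mod, reg_or_opcode, rm)."""
    mod = (modrm >> 6) & 0b11   # bits 7-6
    reg = (modrm >> 3) & 0b111  # bits 5-3: register, or the /digit extension
    rm = modrm & 0b111          # bits 2-0
    return mod, reg, rm


# 0xF1 = 11 110 001: Mod = 3 (register operand), /6, R/M = 1 (ECX).
print(split_modrm(0xF1))  # (3, 6, 1)
```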
              1st byte    2nd byte    3rd byte?
              of opcode   ModR/M
    DIV       1111 0111   mm110rrr
                F    7       /6
    IDIV      1111 0111   mm111rrr
                F    7       /7
    MUL       1111 0111   mm100rrr
                F    7       /4
    mm = the Mod bits and rrr = the R/M bits as usual.
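The table can be turned into a lookup keyed on the reg/opcode field. This is a sketch only: the names `F7_GROUP` and `f7_mnemonic` are invented for illustration, and the entries other than DIV, IDIV and MUL are taken from the Intel opcode map for this group rather than from the post. Note that a ModR/M byte of BD has a reg/opcode field of 7, which selects IDIV; IDIV raises the same divide error as DIV.

```python
# Instructions sharing first opcode byte F7, keyed by the /digit value.
F7_GROUP = {
    0: "TEST",
    2: "NOT",
    3: "NEG",
    4: "MUL",
    5: "IMUL",
    6: "DIV",
    7: "IDIV",
}


def f7_mnemonic(modrm: int) -> str:
    """Name the F7-group instruction selected by a ModR/M byte."""
    digit = (modrm >> 3) & 0b111  # bits 5-3 of ModR/M
    return F7_GROUP.get(digit, "(reserved)")


print(f7_mnemonic(0xF1))  # DIV
print(f7_mnemonic(0xBD))  # IDIV
```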
    Whether the third and subsequent bytes exist depends on the values of mm and rrr. You look up what they mean in Table 2-2 of the Intel manual (Table 2-2. 32-Bit Addressing Forms with the ModR/M Byte).
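The rules that table encodes can be written down compactly. The following Python sketch assumes 32-bit addressing with no 67 address-size prefix; the function name `modrm_tail` and its `sib` parameter are made up for illustration. It answers two questions for a given Mod and R/M: is there a SIB byte, and how many displacement bytes follow?

```python
def modrm_tail(mod: int, rm: int, sib=None) -> tuple:
    """Return (has_sib, disp_bytes) for 32-bit addressing forms."""
    if mod == 0b11:
        return False, 0          # register operand: nothing follows ModR/M
    has_sib = rm == 0b100        # R/M = 100 means a SIB byte follows
    if mod == 0b00:
        disp = 4 if rm == 0b101 else 0   # 00/101 is a bare disp32
    elif mod == 0b01:
        disp = 1                 # disp8
    else:
        disp = 4                 # mod = 10: disp32
    # Special case: Mod = 00 with a SIB whose base field is 101 has
    # no base register and a disp32 instead.
    if has_sib and mod == 0b00 and sib is not None and (sib & 0b111) == 0b101:
        disp = 4
    return has_sib, disp


print(modrm_tail(0b10, 0b101))            # (False, 4)
print(modrm_tail(0b00, 0b100, sib=0x25))  # (True, 4)
print(modrm_tail(0b11, 0b100))            # (False, 0)
```

The size of any immediate is a separate matter: it comes from the opcode itself (and the operand-size prefix), not from the ModR/M byte.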
    Last edited by Cliff Nichols; 19 Jun 2009, 04:42 PM.
    Engineer's Motto: If it aint broke take it apart and fix it

    "If at 1st you don't succeed... call it version 1.0"

    "Half of Programming is coding"....."The other 90% is DEBUGGING"

    "Document my code????" .... "WHYYY??? do you think they call it CODE? "

    Ok so starting at ground zero if I have it right so far.

    Example used - Divide by Zero
    Bytes for example - Byte0 = F7, and Byte1 = BD

    1.) Look at Byte0 at the address I need (in this case my EIP)
    2.) My Byte0 = F7
    3.) Look up F7 to see if it is a prefix
    4.) If a prefix, then keep looking if next few bytes are prefixes

    If prefix(es) exist, increment bytes I am pointing at. (Up to 4 possible)
    Since my example has no prefixes then
    Byte0 = F7, and Byte1 = BD

    Then I look up the OpCode (Byte0)
    F7 on its own does not name one instruction; it selects a group, and the specific instruction (my divide) is picked by bits 5 through 3, also known as Reg/OpCode, of Byte1. Byte1 happens to be BD, which in binary is 10111101.

    So at this point my
    Mod = 10 (bits 7,6)
    Reg/OpCode = 111 (bits 5,4,3)
    R/M = 101 (bits 2,1,0)

    Looking up these values in the table: Mod = 10 (category 10) and R/M = 101 give Row = disp32[EBP]. (I think I had been misreading the row numbers, counting from 1 instead of 0.)

    So I have a displacement of 4 bytes (I think)

    Is this correct so far?
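For what it is worth, the byte count for this example can be checked with a short sketch. It assumes 32-bit mode with no prefixes and handles only the F7 group; the function name `f7_length` is hypothetical, and in the sample bytes only F7, BD and 08 come from the post (the three trailing zero bytes are made up to complete a 4-byte displacement). For Mod = 10 and R/M = 101 there is no SIB byte, so the four bytes after the ModR/M are all displacement.

```python
def f7_length(code: bytes) -> int:
    """Length of an F7-group instruction (32-bit mode, no prefixes)."""
    if code[0] != 0xF7:
        raise ValueError("not an F7-group instruction")
    modrm = code[1]
    mod, digit, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    length = 2                        # opcode byte + ModR/M byte
    if mod != 3:                      # memory operand
        if rm == 4:
            length += 1               # SIB byte
            if mod == 0 and (code[2] & 7) == 5:
                length += 4           # SIB with no base: disp32
        if mod == 1:
            length += 1               # disp8
        elif mod == 2 or (mod == 0 and rm == 5):
            length += 4               # disp32
    if digit == 0:
        length += 4                   # TEST r/m32, imm32 carries an immediate
    return length


# F7 BD 08 00 00 00: opcode + ModR/M + disp32, no SIB, no immediate.
print(f7_length(bytes([0xF7, 0xBD, 0x08, 0x00, 0x00, 0x00])))  # 6
# F7 F1 (DIV ECX): register operand, nothing after the ModR/M.
print(f7_length(bytes([0xF7, 0xF1])))                          # 2
```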

    The next byte = 08 (I think it is the SIB). If it is, then I get
    Scale = 00 (Bits 7,6)
    Index = 001 (Bits 5,4,3)
    Base = 000 (Bits 2,1,0)

    Looking this up, I increment by 1 for SIB

    Is this correct?
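A SIB byte splits the same way a ModR/M byte does, which the Python sketch below illustrates (the name `split_sib` is made up). One caveat on the example itself: a SIB byte is only present when Mod is not 11 and R/M = 100. With R/M = 101 as decoded above, the byte 08 would be the first (lowest) byte of the displacement rather than a SIB, so the split is shown here only to demonstrate the bit layout.

```python
def split_sib(sib: int) -> tuple:
    """Split a SIB byte into (scale, index, base)."""
    scale = (sib >> 6) & 0b11    # bits 7-6: index is multiplied by 1 << scale
    index = (sib >> 3) & 0b111   # bits 5-3
    base = sib & 0b111           # bits 2-0
    return scale, index, base


# 0x08 = 00 001 000
print(split_sib(0x08))  # (0, 1, 0)
```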

    so my next byte to look at the next command would be

    The light green is me just thinking how the next step is....maybe I am still overthinking or lost in lookup point1 to 3 to find out 2 concept????