Opcode Mnemonic Parameters Bytes OSZAPC Ref #s General
Timing Notes:
- - Times separated by a dash/hyphen (-) are ranges.
# - This instruction's execution time is dependent on the following instruction.
n - n represents the number of iterations
pm= - In protected mode, this instruction uses this separate set of timings
Pentium and Pentium w/ MMX Specials for piping:
NP - This instruction may not pipe with other instructions
PU - This instruction may pipe, but only in the U pipe
UV - This instruction may pipe in either pipe
The "General" Column
| - In a few rare cases, the timing for 16bit vs 32bit access differs (eg, DIV and IDIV). In these cases, the timing is separated by pipes (|), in "16bit|32bit" order.
/ - For processors before the Pentium Pro, on instructions that support either a register or a memory operand (eg, r/m8, r/m16/32, etc), the slash separates the "reg/mem" timings. On instructions supporting REP or REPE/REPNE, the values show the timing without the prefix and then the timing with the prefix (eg, "MOVSB / REP MOVSB")
: - Separates a True condition from a False condition (eg, "TRUE time : FALSE time"). This is similar to the C++ ternary notation of (A ? B : C), where A is the condition, B is what happens if it's true, C if it is false
+EA - On the 8086/8088 computers, all memory accesses required an EA calculation to determine the effective address of the operation. The amount of time required for this relies on the type of memory access used. Please refer to the "Clocks Details" tab for a list of the EA times required. And yes, the 8086/8088s were really that slow.
m - A special count of "pieces" that added time to the execution of some instructions on the 80286 and 80386. See the "Clocks Details" tab for further details.
! - This instruction's timing is far too complicated to show in the chart. Listed are the most common cases. Refer to the Clocks Details tab for full details
PV - This instruction may pipe, but only in the V pipe. If it cannot pipe with an instruction in the U pipe, it takes up both pipes
* - This instruction may pipe as shown, so long as there is not an immediate value AND an immediate displacement in the instruction. If there is an immediate value and a displacement, then the instruction may pipe in the U pipe only, on the Pentium w/ MMX (not the non-MMX Pentium)
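The separators described above can be read mechanically. The following is a minimal sketch in Python, assuming a cell uses at most one separator type and contains only plain numbers or dash ranges. The function names, the dictionary labels and the sample cells are hypothetical and not taken from the chart. It does not handle +EA, m, n, or the other meanings of the slash.

```python
def parse_range(text):
    """Parse "3" or "2-5" into a (low, high) tuple of clock counts."""
    low, _, high = text.strip().partition("-")
    return (int(low), int(high) if high else int(low))

# Separator -> labels for the two halves, in the order the notes give them.
# Assumption: "/" is read as "reg/mem" only; the REP meaning is ignored here.
SEPARATORS = [
    ("|", ("16bit", "32bit")),
    ("/", ("reg", "mem")),
    (":", ("true", "false")),
]

def parse_timing(cell):
    """Split one timing cell into labelled (low, high) ranges."""
    for sep, labels in SEPARATORS:
        if sep in cell:
            parts = cell.split(sep)
            if len(parts) != len(labels):
                raise ValueError("unexpected number of fields in %r" % cell)
            return {label: parse_range(part) for label, part in zip(labels, parts)}
    return {"any": parse_range(cell)}
```

For example, `parse_timing("16|24")` labels the first value as the 16bit timing and the second as the 32bit timing, matching the "16bit|32bit" order given above.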
Timing is a very complicated matter on the x86 processors, especially with how many versions of the processor there have been. The General column is an attempt to try to generalize the speeds of various instructions. There is no way to do this perfectly, but with a little tweaking it can be made fairly accurate. The numbers are on a semi-exponential scale from 0 (practically free) to 10 (a "why does a modern processor still go so slow" opcode). 0 is generally used for prefixes which really are free on the latest processors, 1 for the quickest arithmetic and move operations (the ones that the processor creators like using for MIPS calculations), 2 for slightly slower, etc. Any timing ending with a + means it can go up to 10 and often well beyond (like a rep movsd)
If the instruction is not available to every processor, the pseudo-timing is followed by what processor(s) first supported the specific instruction.
Timing Notes:
Times separated by a dash/hyphen (-) are ranges.
Pentium and Pentium w/ MMX Specials for Piping:
Times with a * mean that values outside of a specific range take more cycles. See the full opcode description for details.
Times separated by a slash for processors before the Pentium are for "real mode/protected mode".
Times separated by a slash for the Pentium mean Latency/Throughput. Latency is the actual amount of time the processor needs to fully execute the instruction, whereas throughput is how long the instruction takes before the next non-conflicting instruction may execute. In practice, it is possible to get the most common FPU instructions to only use 1 noticeable clock each by properly staggering the instructions and using FXCH to switch register values around as needed. Note that instructions that do not have a slash cannot be used in this fashion.
NP - Means that this instruction does not support pipelining with FXCH.
FX - Means that an FXCH instruction may be used after this instruction with no CPU time wasted. (0 clocks)
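To illustrate the Latency/Throughput distinction, here is a rough sketch in Python of the usual pipelining estimate. It assumes every instruction in the run is independent of the others, all have the same latency/throughput pair, and any FXCH used to stagger them costs 0 clocks (the FX case). The function names and the 3/1 figures in the usage note are hypothetical, not values from the chart.

```python
def pipelined_clocks(count, latency, throughput):
    """Estimated clocks for `count` independent, perfectly staggered instructions.

    Each instruction starts `throughput` clocks after the previous one,
    and the last one still needs its full `latency` to finish.
    """
    if count <= 0:
        return 0
    return (count - 1) * throughput + latency

def serial_clocks(count, latency):
    """Estimated clocks when each instruction must wait for the previous result."""
    return max(count, 0) * latency
```

With four instructions marked 3/1, the staggered estimate is 6 clocks against 12 when each one waits for the previous result, which is where the "1 noticeable clock each" figure comes from as the run gets longer.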
This instruction has undocumented behavior. See the detailed opcode descriptions for further details.
This instruction is undocumented. See the detailed opcode descriptions for further details.
This instruction is an extension or a prefix and is not designed to be used on its own
This instruction permits use of the REP prefix.
This instruction permits use of the REPE and REPNE prefixes.
This instruction permits use of the LOCK prefix
This is a floating point (x87) instruction.
This is an MMX Instruction
This is an SSE Instruction
This is an SSE2 Instruction
This is an SSE3 Instruction
This is an SSSE3 Instruction
This is an SSE4.1 Instruction
This is an SSE4.2 Instruction
This is a 3DNow! Instruction
This is a 3DNow! Enhanced Instruction
This is a 3DNow! Professional Instruction
This is an SSE4a Instruction
This is an SSE5 Instruction
This instruction is designed for use in operating systems only, but normal user programs may still use it.
This is a serializing instruction
This instruction only exists in protected mode
Note for 386+ Only: This opcode supports both 16bit and 32bit operand sizes. In real mode and SMM mode, it uses 16bit operands by default. In protected mode and virtual-86 mode, the selector used controls whether it is 16bit or 32bit by default. The 66 opcode reverses the operand size. For some instructions, this changes the name of the opcode: either a D [meaning DWORD] is added to the end (such as IRET to IRETD), or the final W [meaning word] changes to D (such as MOVSW to MOVSD). There are also the following special cases: JCXZ vs JECXZ, CBW to CWDE, and CWD to CDQ. If there are multiple ways to spell the same instruction because of this, both versions will be listed.
Note for 386+ Only: This opcode supports both 16bit and 32bit addressing. In real mode and SMM, it always uses 16bit addressing by default. In protected mode and virtual-86 mode, the selector used controls whether it uses 16bit or 32bit by default. The 67 opcode reverses the addressing size.
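The size rules in the two notes above reduce to a toggle: the 66 or 67 opcode flips whatever default is in force. The sketch below, in Python, shows that toggle and the renaming examples quoted in the operand-size note. The function names are hypothetical, and the rename table holds only the pairs named in the note rather than a complete list.

```python
def effective_size(default_bits, has_override):
    """Return the size in bits after applying a 66 (operand) or 67 (address) opcode.

    `default_bits` is 16 or 32, as chosen by the mode or selector.
    """
    if default_bits not in (16, 32):
        raise ValueError("default size must be 16 or 32")
    return (48 - default_bits) if has_override else default_bits

# 16bit spelling -> 32bit spelling, taken from the examples in the note.
RENAMES_32 = {
    "IRET": "IRETD",
    "MOVSW": "MOVSD",
    "JCXZ": "JECXZ",
    "CBW": "CWDE",
    "CWD": "CDQ",
}

def mnemonic_for(name_16, default_bits, has_override):
    """Pick the spelling that matches the effective operand size."""
    if effective_size(default_bits, has_override) == 32:
        return RENAMES_32.get(name_16, name_16)
    return name_16
```

For example, in real mode (16bit default) `mnemonic_for("CBW", 16, True)` gives `"CWDE"`, while the same 66 opcode under a 32bit default selects the 16bit form `"CBW"`.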
This instruction acts as if the LOCK prefix was used in all cases, regardless of whether LOCK is used with it or not
This instruction is designed for use in operating systems only. Normal user programs cannot use this instruction, as it throws a #GP exception if used outside of a CPL of 0.
This is a conditional or unknown target jump-type instruction. Use of it may incur huge cycle penalties on all processors that are Pentium and higher.
Specials
EA 8088 - 8086 Only:
Description                                       Clocks
Displacement                                      6
Base or Index                                     5
Displacement + (Base or Index)                    9
Base + Index (BP+DI, BX+SI)                       7
Base + Index (BP+SI, BX+DI)                       8
Base + Index + Disp (BP+DI+disp, BX+SI+disp)      11
Base + Index + Disp (BP+SI+disp, BX+DI+disp)      12
Add 4 cycles for word operands at odd addresses
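The EA table above can be expressed as a small lookup. This is a minimal sketch in Python that uses only the clock counts listed in the table. The function name and its parameters are hypothetical, and it assumes the base register is BX or BP and the index register is SI or DI.

```python
# The two Base + Index pairings the table lists as the faster case.
FAST_PAIRS = {("BP", "DI"), ("BX", "SI")}

def ea_clocks(base=None, index=None, disp=False, word_at_odd_address=False):
    """Return the 8086/8088 EA clocks for one memory operand.

    base:  "BX", "BP" or None
    index: "SI", "DI" or None
    disp:  True if the operand has a displacement
    """
    if base and index:
        clocks = 7 if (base, index) in FAST_PAIRS else 8
        if disp:
            clocks += 4          # 11 or 12 with a displacement
    elif base or index:
        clocks = 9 if disp else 5
    elif disp:
        clocks = 6               # displacement only
    else:
        raise ValueError("a memory operand needs a base, index or displacement")
    if word_at_odd_address:
        clocks += 4              # word operand at an odd address
    return clocks
```

For example, `ea_clocks(base="BP", index="SI", disp=True)` returns 12, the last row of the table. The result is what a "+EA" entry in the chart adds on top of the instruction's own time.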
m 286 - 386 Only:
286: m represents the number of bytes in the next instruction
! Complex Timing:
CALL
far16:16/32 (9A far32/48)
KERBLUH - TO DO - Get info on the call gates and all that fun stuff
m16:16/32 (FF /3)
KERBLUH - TO DO - Get info on the call gates and all that fun stuff
ENTER
imm16, 0 (C8 iw 00)
This is the time shown in the chart. Refer to the chart for details.
imm16, 1 (C8 iw 01)
80186: 25
80286: 15
80386: 12
80486: 17
Pentium: 15 NP
imm16, imm8 (2 or more) (C8 iw ib)
i = imm8 (the second operand)
80186: 6+16i
80286: 8+4i
80386: 11+4i
80486: 17+3i
Pentium: 15+2i NP
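As a worked example of the ENTER formulas above, the sketch below in Python evaluates them for a given nesting level. The numbers are copied from the ENTER entries. The function name, the dictionary names and the processor keys are hypothetical. Level 0 is not covered because its time is given in the chart.

```python
# Clocks for ENTER imm16, 1.
LEVEL_ONE = {"80186": 25, "80286": 15, "80386": 12, "80486": 17, "Pentium": 15}

# (constant, clocks per level) for ENTER imm16, imm8 with imm8 >= 2.
NESTED = {
    "80186": (6, 16),
    "80286": (8, 4),
    "80386": (11, 4),
    "80486": (17, 3),
    "Pentium": (15, 2),
}

def enter_clocks(cpu, level):
    """Return the clocks for ENTER with nesting level `level` (the imm8 operand)."""
    if level < 1:
        raise ValueError("level 0 uses the time shown in the chart")
    if level == 1:
        return LEVEL_ONE[cpu]
    constant, per_level = NESTED[cpu]
    return constant + per_level * level
```

For example, `enter_clocks("80286", 3)` evaluates 8+4i with i = 3 and returns 20.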
386: m represents the number of bytes in the next instruction plus the number of components, or pieces, of that instruction. Each displacement (AX + BX + 1 counts as one), immediate value and prefix (such as opcode 66) adds one
The timings below are for protected mode. The values in the chart are for real mode. Each value shows three times separated by a /. These are "CPL <= IOPL / CPL > IOPL / V86"
INT 3 (CC)
KERBLUH - TO DO - Get info on the call gates and all that fun stuff
INT imm8 (CD ib)
KERBLUH - TO DO - Get info on the call gates and all that fun stuff
INTO
KERBLUH - TO DO - Get info on the call gates and all that fun stuff
IRET
KERBLUH - TO DO - Get info on the call gates and all that fun stuff
JMP
far 16:16/32 (EA far32/48)
KERBLUH - TO DO - Get info on the call gates and all that fun stuff
m16:16/32 (FF /5)
KERBLUH - TO DO - Get info on the call gates and all that fun stuff
MOV
The timings below are for protected mode on the Pentium only.
Sreg, r/m16 (8E /r)
Add 8 cycles if the segment becomes a new descriptor, 6 if it's SS.
MOVS*
The timings below are just extras for the 486 in REP mode.
REP MOVSB (F3 A4), REP MOVSW/MOVSD (F3 A5)
If n is 0, 5 cycles are taken
If n is 1, 13 cycles are taken
The timings below are for special gates and task switch interrupts in protected mode. Real mode and same-task interrupts are listed in the chart
The timings below are for special gates and task switch interrupts in protected mode, and only if overflow causes the interrupt. Real mode, same-task interrupts and failed conditions are all listed in the chart.
The timings below are for returning from special gates and task switch interrupts in protected mode. Real mode and same-task interrupts are listed in the chart.
The timings below are for jumping through call gates. Real mode and same-task jumps are listed in the chart.
The timings below are for protected mode. The values in the chart are for real mode. Each item shows three values separated by a /. These are "CPL <= IOPL / CPL > IOPL / V86"
SCAS*
The timings below are just extras for the 486 in REPE/REPNE mode.
REPE/REPNE SCASB (F3/F2 AE), REPE/REPNE SCASW/SCASD (F3/F2 AF)
If n is 0, 5 cycles are taken
STOS*
The timings below are just extras for the 486 in REP mode.
REP STOSB (F3 AA), REP STOSW/STOSD (F3 AB)
If n is 0, 5 cycles are taken
If n is 1, 13 cycles are taken
The timings below are for protected mode. The values in the chart are for real mode. Each item shows two values separated by a /. These are "Same Privilege/Lower Privilege"