Implementations of D on different architectures are free to innovate upon the memory model, function call/return conventions, argument passing conventions, etc.
This document describes the x86 and x86_64 implementations of the inline assembler. The inline assembler platform support that a compiler provides is indicated by the D_InlineAsm_X86 and D_InlineAsm_X86_64 version identifiers, respectively.
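As a minimal sketch of how these version identifiers might be tested, the hypothetical helper below (its name and return strings are illustrative, not part of the specification) reports which inline assembler flavor the compiler provides:

```d
/// Reports which inline assembler flavor, if any, this compiler provides.
/// The function name and the returned strings are illustrative only.
string inlineAsmFlavor()
{
    version (D_InlineAsm_X86_64)
        return "x86_64";
    else version (D_InlineAsm_X86)
        return "x86";
    else
        return "none"; // no inline assembler of the kind described here
}
```

Code that contains asm blocks can be guarded the same way, with a plain D fallback in the final else branch.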
AsmStatement:
asm FunctionAttributes opt { AsmInstructionList opt }
AsmInstructionList:
AsmInstruction ;
AsmInstruction ; AsmInstructionList
Assembler instructions must be located inside an asm block. Like functions, asm statements must be annotated with adequate function attributes to be compatible with the caller. The attributes of an asm statement must be specified explicitly; they are not inferred.
void func1() pure nothrow @safe @nogc
{
    asm pure nothrow @trusted @nogc
    {}
}

void func2() @safe @nogc
{
    asm @nogc // Error: asm statement is assumed to be @system - mark it with '@trusted' if it is not
    {}
}
AsmInstruction:
Identifier : AsmInstruction
align IntegerExpression
even
naked
db Operands
ds Operands
di Operands
dl Operands
df Operands
dd Operands
de Operands
db StringLiteral
ds StringLiteral
di StringLiteral
dl StringLiteral
dw StringLiteral
dq StringLiteral
Opcode
Opcode Operands
Opcode:
Identifier
int
in
out
Operands:
Operand
Operand , Operands
Assembler instructions can be labeled just like other statements. They can be the target of goto statements. For example:
void *pc;
asm
{
    call L1          ;
L1:                  ;
    pop  EBX         ;
    mov  pc[EBP],EBX ; // pc now points to code at L1
}
IntegerExpression:
IntegerLiteral
Identifier
align IntegerExpression causes the assembler to emit NOP instructions to align the next assembler instruction on an IntegerExpression boundary. IntegerExpression must evaluate at compile time to an integer that is a power of 2.
Aligning the start of a loop body can sometimes have a dramatic effect on the execution speed.
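The following is a sketch, not a benchmark, of aligning a loop body. The function name sumDownFrom, the enum hasX86Asm, and the labels are hypothetical; the sketch assumes a compiler that defines D_InlineAsm_X86 or D_InlineAsm_X86_64 and falls back to plain D otherwise.

```d
// hasX86Asm is an illustrative helper, not part of the language.
version (D_InlineAsm_X86_64)
    enum hasX86Asm = true;
else version (D_InlineAsm_X86)
    enum hasX86Asm = true;
else
    enum hasX86Asm = false;

/// Adds n + (n-1) + ... + 1 in a loop whose first instruction is 16-byte aligned.
uint sumDownFrom(uint n)
{
    uint result;
    static if (hasX86Asm)
    {
        asm
        {
            mov ECX, n;       // loop counter
            xor EAX, EAX;     // running total
            test ECX, ECX;
            jz Ldone;         // nothing to add when n == 0
            align 16;         // pad with NOPs so Lloop starts on a 16-byte boundary
        Lloop:
            add EAX, ECX;
            dec ECX;
            jnz Lloop;
        Ldone:
            mov result, EAX;
        }
    }
    else
    {
        for (uint i = n; i != 0; --i)
            result += i;
    }
    return result;
}
```

Whether the alignment helps at all depends on the processor and the surrounding code, so it is worth measuring before relying on it.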
even causes the assembler to emit NOP instructions to align the next assembler instruction on an even boundary.
naked causes the compiler to not generate the function prolog and epilog sequences. Generating them then becomes the responsibility of the inline assembly programmer. naked is normally used when the entire function is to be written in assembler.
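A minimal sketch of a naked function follows. The names answer and hasX86Asm are hypothetical, and the sketch assumes the usual x86 and x86_64 convention that an int result is returned in EAX.

```d
// hasX86Asm is an illustrative helper, not part of the language.
version (D_InlineAsm_X86_64)
    enum hasX86Asm = true;
else version (D_InlineAsm_X86)
    enum hasX86Asm = true;
else
    enum hasX86Asm = false;

/// Returns 42. With naked there is no compiler-generated prolog or epilog,
/// so the asm block loads the result and executes ret itself.
int answer()
{
    static if (hasX86Asm)
    {
        asm
        {
            naked;
            mov EAX, 42; // assumed return register for int
            ret;
        }
    }
    else
    {
        return 42;
    }
}
```

Because no stack frame is set up, a function like this should avoid referring to parameters or locals by name.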
The db, ds, di, dl, df, dd, and de pseudo ops insert raw data directly into the code. db is for bytes, ds is for 16 bit words, di is for 32 bit words, dl is for 64 bit words, df is for 32 bit floats, dd is for 64 bit doubles, and de is for 80 bit extended reals. Each can have multiple operands. If an operand is a string literal, it is as if there were length operands, where length is the number of characters in the string. One character is used per operand. For example:
asm
{
    db 5,6,0x83;   // insert bytes 0x05, 0x06, and 0x83 into code
    ds 0x1234;     // insert bytes 0x34, 0x12
    di 0x1234;     // insert bytes 0x34, 0x12, 0x00, 0x00
    dl 0x1234;     // insert bytes 0x34, 0x12, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
    df 1.234;      // insert float 1.234
    dd 1.234;      // insert double 1.234
    de 1.234;      // insert real 1.234
    db "abc";      // insert bytes 0x61, 0x62, and 0x63
    ds "abc";      // insert bytes 0x61, 0x00, 0x62, 0x00, 0x63, 0x00
}
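Since the inserted bytes land in the instruction stream, they are executed if control reaches them. As a small sketch, the hypothetical function spinHint below emits the two bytes 0xF3 0x90, which is the x86 machine encoding of the pause instruction. The names spinHint and hasX86Asm are assumptions made for illustration.

```d
// hasX86Asm is an illustrative helper, not part of the language.
version (D_InlineAsm_X86_64)
    enum hasX86Asm = true;
else version (D_InlineAsm_X86)
    enum hasX86Asm = true;
else
    enum hasX86Asm = false;

/// Executes a spin-wait hint by emitting its raw encoding with db.
void spinHint()
{
    static if (hasX86Asm)
    {
        asm
        {
            db 0xF3, 0x90; // raw bytes placed verbatim in the code
        }
    }
    // On other targets this sketch does nothing.
}
```

Raw data that is not a valid instruction sequence must be kept out of the execution path, for example by jumping over it.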
A list of supported opcodes is at the end.
The following registers are supported. Register names are always in upper case.
Register:
AL
AH
AX
EAX
BL
BH
BX
EBX
CL
CH
CX
ECX
DL
DH
DX
EDX
BP
EBP
SP
ESP
DI
EDI
SI
ESI
ES
CS
SS
DS
GS
FS
CR0
CR2
CR3
CR4
DR0
DR1
DR2
DR3
DR6
DR7
TR3
TR4
TR5
TR6
TR7
ST
ST(0)
ST(1)
ST(2)
ST(3)
ST(4)
ST(5)
ST(6)
ST(7)
MM0
MM1
MM2
MM3
MM4
MM5
MM6
MM7
XMM0
XMM1
XMM2
XMM3
XMM4
XMM5
XMM6
XMM7
x86_64 adds these additional registers.
Register64:
RAX
RBX
RCX
RDX
BPL
RBP
SPL
RSP
DIL
RDI
SIL
RSI
R8B
R8W
R8D
R8
R9B
R9W
R9D
R9
R10B
R10W
R10D
R10
R11B
R11W
R11D
R11
R12B
R12W
R12D
R12
R13B
R13W
R13D
R13
R14B
R14W
R14D
R14
R15B
R15W
R15D
R15
XMM8
XMM9
XMM10
XMM11
XMM12
XMM13
XMM14
XMM15
YMM0
YMM1
YMM2
YMM3
YMM4
YMM5
YMM6
YMM7
YMM8
YMM9
YMM10
YMM11
YMM12
YMM13
YMM14
YMM15
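The lock, rep, repe, repne, repnz, and repz prefixes do not appear in the same statement as the instruction they prefix; each appears in its own statement. For example:
asm
{
    rep   ;
    movsb ;
}
If the pause opcode is not accepted by the assembler, use
asm
{
    rep  ;
    nop  ;
}
which produces the same result.
For floating point instructions, use the two-operand form:
fdiv ST(1);     // wrong
fmul ST;        // wrong
fdiv ST,ST(1);  // right
fmul ST,ST(0);  // right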
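The two-operand form marked "right" above could be exercised as in this sketch. The names fpuProduct and hasX86Asm are hypothetical, and the sketch assumes the assembler infers the 64-bit operand size of fld and fstp from the double variables they name.

```d
// hasX86Asm is an illustrative helper, not part of the language.
version (D_InlineAsm_X86_64)
    enum hasX86Asm = true;
else version (D_InlineAsm_X86)
    enum hasX86Asm = true;
else
    enum hasX86Asm = false;

/// Multiplies two doubles on the x87 stack using the two-operand fmul form.
double fpuProduct(double a, double b)
{
    static if (hasX86Asm)
    {
        double r;
        asm
        {
            fld a;           // ST(0) = a
            fld b;           // ST(0) = b, ST(1) = a
            fmul ST, ST(1);  // ST(0) = ST(0) * ST(1)
            fstp r;          // store the product and pop it
            fstp ST(0);      // pop the remaining value to balance the FPU stack
        }
        return r;
    }
    else
    {
        return a * b;
    }
}
```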
Operand:
AsmExp
AsmExp:
AsmLogOrExp
AsmLogOrExp ? AsmExp : AsmExp
AsmLogOrExp:
AsmLogAndExp
AsmLogOrExp || AsmLogAndExp
AsmLogAndExp:
AsmOrExp
AsmLogAndExp && AsmOrExp
AsmOrExp:
AsmXorExp
AsmOrExp | AsmXorExp
AsmXorExp:
AsmAndExp
AsmXorExp ^ AsmAndExp
AsmAndExp:
AsmEqualExp
AsmAndExp & AsmEqualExp
AsmEqualExp:
AsmRelExp
AsmEqualExp == AsmRelExp
AsmEqualExp != AsmRelExp
AsmRelExp:
AsmShiftExp
AsmRelExp < AsmShiftExp
AsmRelExp <= AsmShiftExp
AsmRelExp > AsmShiftExp
AsmRelExp >= AsmShiftExp
AsmShiftExp:
AsmAddExp
AsmShiftExp << AsmAddExp
AsmShiftExp >> AsmAddExp
AsmShiftExp >>> AsmAddExp
AsmAddExp:
AsmMulExp
AsmAddExp + AsmMulExp
AsmAddExp - AsmMulExp
AsmMulExp:
AsmBrExp
AsmMulExp * AsmBrExp
AsmMulExp / AsmBrExp
AsmMulExp % AsmBrExp
AsmBrExp:
AsmUnaExp
AsmBrExp [ AsmExp ]
AsmUnaExp:
AsmTypePrefix AsmExp
offsetof AsmExp
seg AsmExp
+ AsmUnaExp
- AsmUnaExp
! AsmUnaExp
~ AsmUnaExp
AsmPrimaryExp
AsmPrimaryExp:
IntegerLiteral
FloatLiteral
__LOCAL_SIZE
$
Register
Register64 : AsmExp
DotIdentifier
this
DotIdentifier:
Identifier
Identifier . DotIdentifier
FundamentalType . Identifier
The operand syntax more or less follows the Intel CPU documentation conventions. In particular, for two-operand instructions the source is the right operand and the destination is the left operand. The syntax differs from Intel's in order to be compatible with the D language tokenizer and to simplify parsing.
The seg operator loads the segment number that the symbol is in. This is not relevant for flat model code. Instead, do a move from the relevant segment register.
A dotted expression is evaluated at compile time and must then either yield a constant or refer to a higher-level variable that fits in the target register or variable.
AsmTypePrefix:
near ptr
far ptr
word ptr
dword ptr
qword ptr
FundamentalType ptr
In cases where the operand size is ambiguous, as in:
add [EAX],3 ;
it can be disambiguated by using an AsmTypePrefix:
add byte ptr [EAX],3 ;
add int ptr [EAX],7 ;
far ptr is not relevant for flat model code.
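As a sketch of a type prefix in a complete function, the hypothetical addThree below adds 3 to the 32-bit integer behind a pointer. The function name is an assumption, and the choice of RAX or EAX to hold the pointer depends on which inline assembler the compiler provides.

```d
/// Adds 3 to *p, using dword ptr to select a 32-bit memory operand.
void addThree(int* p)
{
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, p;              // pointers are 64 bits wide here
            add dword ptr [RAX], 3;  // without the prefix the size is ambiguous
        }
    }
    else version (D_InlineAsm_X86)
    {
        asm
        {
            mov EAX, p;
            add dword ptr [EAX], 3;
        }
    }
    else
    {
        *p += 3; // plain D fallback
    }
}
```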
To access members of an aggregate when a pointer to the aggregate is in a register, use the .offsetof property of the qualified name of the member:
struct Foo
{
    int a,b,c;
}
int bar(Foo *f)
{
    asm
    {
        mov EBX,f                   ;
        mov EAX,Foo.b.offsetof[EBX] ;
    }
}
void main()
{
    Foo f = Foo(0, 2, 0);
    assert(bar(&f) == 2);
}
Alternatively, inside the scope of an aggregate, only the member name is needed:
struct Foo   // or class
{
    int a,b,c;
    int bar()
    {
        asm
        {
            mov EBX, this   ;
            mov EAX, b[EBX] ;
        }
    }
}
void main()
{
    Foo f = Foo(0, 2, 0);
    assert(f.bar() == 2);
}
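The examples above hold a pointer in a 32-bit register. A sketch of the same member access that also covers x86_64, where pointers need a 64-bit register, might look as follows. The struct Vec3 and the function readY are hypothetical names, and the result is copied into a local so that the function can end with an ordinary return statement.

```d
struct Vec3 { int x, y, z; } // illustrative aggregate

/// Reads v.y by adding the member offset to a pointer held in a register.
int readY(Vec3* v)
{
    int result;
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RCX, v;                    // 64-bit pointer to the aggregate
            mov EAX, Vec3.y.offsetof[RCX]; // load the int at offset of y
            mov result, EAX;
        }
    }
    else version (D_InlineAsm_X86)
    {
        asm
        {
            mov ECX, v;
            mov EAX, Vec3.y.offsetof[ECX];
            mov result, EAX;
        }
    }
    else
    {
        result = v.y; // plain D fallback
    }
    return result;
}
```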
Stack variables (variables local to a function and allocated on the stack) are accessed via the name of the variable indexed by EBP:
int foo(int x)
{
    asm
    {
        mov EAX,x[EBP] ; // loads value of parameter x into EAX
        mov EAX,x      ; // does the same thing
    }
}
If the EBP index is omitted, it is assumed for local variables. If naked is used, this no longer holds.
$ represents the program counter of the start of the next instruction. So,
jmp $ ;
branches to the instruction following the jmp instruction. The $ can only appear as the target of a jmp or call instruction.
The following opcodes are supported:
| aaa | aad | aam | aas | adc |
| add | addpd | addps | addsd | addss |
| and | andnpd | andnps | andpd | andps |
| arpl | bound | bsf | bsr | bswap |
| bt | btc | btr | bts | call |
| cbw | cdq | clc | cld | clflush |
| cli | clts | cmc | cmova | cmovae |
| cmovb | cmovbe | cmovc | cmove | cmovg |
| cmovge | cmovl | cmovle | cmovna | cmovnae |
| cmovnb | cmovnbe | cmovnc | cmovne | cmovng |
| cmovnge | cmovnl | cmovnle | cmovno | cmovnp |
| cmovns | cmovnz | cmovo | cmovp | cmovpe |
| cmovpo | cmovs | cmovz | cmp | cmppd |
| cmpps | cmps | cmpsb | cmpsd | cmpss |
| cmpsw | cmpxchg | cmpxchg8b | cmpxchg16b | |
| comisd | comiss | | | |
| cpuid | cvtdq2pd | cvtdq2ps | cvtpd2dq | cvtpd2pi |
| cvtpd2ps | cvtpi2pd | cvtpi2ps | cvtps2dq | cvtps2pd |
| cvtps2pi | cvtsd2si | cvtsd2ss | cvtsi2sd | cvtsi2ss |
| cvtss2sd | cvtss2si | cvttpd2dq | cvttpd2pi | cvttps2dq |
| cvttps2pi | cvttsd2si | cvttss2si | cwd | cwde |
| da | daa | das | db | dd |
| de | dec | df | di | div |
| divpd | divps | divsd | divss | dl |
| dq | ds | dt | dw | emms |
| enter | f2xm1 | fabs | fadd | faddp |
| fbld | fbstp | fchs | fclex | fcmovb |
| fcmovbe | fcmove | fcmovnb | fcmovnbe | fcmovne |
| fcmovnu | fcmovu | fcom | fcomi | fcomip |
| fcomp | fcompp | fcos | fdecstp | fdisi |
| fdiv | fdivp | fdivr | fdivrp | feni |
| ffree | fiadd | ficom | ficomp | fidiv |
| fidivr | fild | fimul | fincstp | finit |
| fist | fistp | fisub | fisubr | fld |
| fld1 | fldcw | fldenv | fldl2e | fldl2t |
| fldlg2 | fldln2 | fldpi | fldz | fmul |
| fmulp | fnclex | fndisi | fneni | fninit |
| fnop | fnsave | fnstcw | fnstenv | fnstsw |
| fpatan | fprem | fprem1 | fptan | frndint |
| frstor | fsave | fscale | fsetpm | fsin |
| fsincos | fsqrt | fst | fstcw | fstenv |
| fstp | fstsw | fsub | fsubp | fsubr |
| fsubrp | ftst | fucom | fucomi | fucomip |
| fucomp | fucompp | fwait | fxam | fxch |
| fxrstor | fxsave | fxtract | fyl2x | fyl2xp1 |
| hlt | idiv | imul | in | inc |
| ins | insb | insd | insw | int |
| into | invd | invlpg | iret | iretd |
| iretq | ja | jae | jb | jbe |
| jc | jcxz | je | jecxz | jg |
| jge | jl | jle | jmp | jna |
| jnae | jnb | jnbe | jnc | jne |
| jng | jnge | jnl | jnle | jno |
| jnp | jns | jnz | jo | jp |
| jpe | jpo | js | jz | lahf |
| lar | ldmxcsr | lds | lea | leave |
| les | lfence | lfs | lgdt | lgs |
| lidt | lldt | lmsw | lock | lods |
| lodsb | lodsd | lodsw | loop | loope |
| loopne | loopnz | loopz | lsl | lss |
| ltr | maskmovdqu | maskmovq | maxpd | maxps |
| maxsd | maxss | mfence | minpd | minps |
| minsd | minss | mov | movapd | movaps |
| movd | movdq2q | movdqa | movdqu | movhlps |
| movhpd | movhps | movlhps | movlpd | movlps |
| movmskpd | movmskps | movntdq | movnti | movntpd |
| movntps | movntq | movq | movq2dq | movs |
| movsb | movsd | movss | movsw | movsx |
| movupd | movups | movzx | mul | mulpd |
| mulps | mulsd | mulss | neg | nop |
| not | or | orpd | orps | out |
| outs | outsb | outsd | outsw | packssdw |
| packsswb | packuswb | paddb | paddd | paddq |
| paddsb | paddsw | paddusb | paddusw | paddw |
| pand | pandn | pavgb | pavgw | pcmpeqb |
| pcmpeqd | pcmpeqw | pcmpgtb | pcmpgtd | pcmpgtw |
| pextrw | pinsrw | pmaddwd | pmaxsw | pmaxub |
| pminsw | pminub | pmovmskb | pmulhuw | pmulhw |
| pmullw | pmuludq | pop | popa | popad |
| popf | popfd | por | prefetchnta | prefetcht0 |
| prefetcht1 | prefetcht2 | psadbw | pshufd | pshufhw |
| pshuflw | pshufw | pslld | pslldq | psllq |
| psllw | psrad | psraw | psrld | psrldq |
| psrlq | psrlw | psubb | psubd | psubq |
| psubsb | psubsw | psubusb | psubusw | psubw |
| punpckhbw | punpckhdq | punpckhqdq | punpckhwd | punpcklbw |
| punpckldq | punpcklqdq | punpcklwd | push | pusha |
| pushad | pushf | pushfd | pxor | rcl |
| rcpps | rcpss | rcr | rdmsr | rdpmc |
| rdtsc | rep | repe | repne | repnz |
| repz | ret | retf | rol | ror |
| rsm | rsqrtps | rsqrtss | sahf | sal |
| sar | sbb | scas | scasb | scasd |
| scasw | seta | setae | setb | setbe |
| setc | sete | setg | setge | setl |
| setle | setna | setnae | setnb | setnbe |
| setnc | setne | setng | setnge | setnl |
| setnle | setno | setnp | setns | setnz |
| seto | setp | setpe | setpo | sets |
| setz | sfence | sgdt | shl | shld |
| shr | shrd | shufpd | shufps | sidt |
| sldt | smsw | sqrtpd | sqrtps | sqrtsd |
| sqrtss | stc | std | sti | stmxcsr |
| stos | stosb | stosd | stosw | str |
| sub | subpd | subps | subsd | subss |
| syscall | sysenter | sysexit | sysret | test |
| ucomisd | ucomiss | ud2 | unpckhpd | unpckhps |
| unpcklpd | unpcklps | verr | verw | wait |
| wbinvd | wrmsr | xadd | xchg | xlat |
| xlatb | xor | xorpd | xorps | |
The following Pentium 4 (Prescott) opcodes are supported:
| addsubpd | addsubps | fisttp | haddpd | haddps |
| hsubpd | hsubps | lddqu | monitor | movddup |
| movshdup | movsldup | mwait | | |
The following AMD opcodes are supported:
| pavgusb | pf2id | pfacc | pfadd | pfcmpeq |
| pfcmpge | pfcmpgt | pfmax | pfmin | pfmul |
| pfnacc | pfpnacc | pfrcp | pfrcpit1 | pfrcpit2 |
| pfrsqit1 | pfrsqrt | pfsub | pfsubr | pi2fd |
| pmulhrw | pswapd | | | |
SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2 and AVX are supported.
The GNU D Compiler uses an alternative, GCC-based syntax for inline assembler:
GccAsmStatement:
asm FunctionAttributes opt { GccAsmInstructionList }
GccAsmInstructionList:
GccAsmInstruction ;
GccAsmInstruction ; GccAsmInstructionList
GccAsmInstruction:
GccBasicAsmInstruction
GccExtAsmInstruction
GccGotoAsmInstruction
GccBasicAsmInstruction:
AssignExpression
GccExtAsmInstruction:
AssignExpression : GccAsmOperands opt
AssignExpression : GccAsmOperands opt : GccAsmOperands opt
AssignExpression : GccAsmOperands opt : GccAsmOperands opt : GccAsmClobbers opt
GccGotoAsmInstruction:
AssignExpression : : GccAsmOperands opt : GccAsmClobbers opt : GccAsmGotoLabels opt
GccAsmOperands:
GccSymbolicName opt StringLiteral ( AssignExpression )
GccSymbolicName opt StringLiteral ( AssignExpression ) , GccAsmOperands
GccSymbolicName:
[ Identifier ]
GccAsmClobbers:
StringLiteral
StringLiteral , GccAsmClobbers
GccAsmGotoLabels:
Identifier
Identifier , GccAsmGotoLabels
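To make the extended form of this grammar concrete, here is a hedged sketch of a GccExtAsmInstruction with an output operand, an empty input list, and a clobber. The function name addOne and the enum gccX86Asm are hypothetical. The sketch assumes GDC on an x86 or x86_64 target, where the instruction template is written in AT&T syntax, and uses plain D everywhere else.

```d
// gccX86Asm is an illustrative helper, not part of the language.
version (GNU)
{
    version (X86_64)
        enum gccX86Asm = true;
    else version (X86)
        enum gccX86Asm = true;
    else
        enum gccX86Asm = false;
}
else
    enum gccX86Asm = false;

/// Returns x + 1.
int addOne(int x)
{
    static if (gccX86Asm)
    {
        // "+r" makes x a read-write register operand referred to as %0;
        // the input list is empty and "cc" records that the flags change.
        asm { "addl $1, %0" : "+r" (x) : : "cc"; }
        return x;
    }
    else
    {
        return x + 1; // fallback for other compilers and targets
    }
}
```

The constraint strings and clobber names follow GCC's conventions for extended asm, so its documentation is the place to check them for a given target.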
D, being a systems programming language, provides an inline assembler. The inline assembler is standardized for D implementations across the same CPU family. For example, the Intel Pentium inline assembler for a Win32 D compiler will be syntax compatible with the inline assembler for Linux running on an Intel Pentium.