This is simply a test piece that uses an argument-passing technique similar to Win64's. The caller writes the data below the address in ESP, and the called procedure reads it from there. Because the stack pointer is never modified, there is no need to balance the stack on procedure exit, and the only overhead is the call/ret pair.
Code:
' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

FUNCTION PBmain as LONG

    #REGISTER NONE

    LOCAL pa as DWORD
    LOCAL pb as DWORD

    a$ = "12345678901234567890"
    b$ = Space$(20)                         ' destination buffer, 20 bytes to match the count

    pa = StrPtr(a$)
    pb = StrPtr(b$)

    PREFIX "!"
        mov eax, pa
        mov edx, pb
        mov DWORD PTR [esp-8], eax          ' source address
        mov DWORD PTR [esp-12], edx         ' destination address
        mov DWORD PTR [esp-16], 20          ' byte count
        call fastcopy
    END PREFIX

    StdOut b$
    waitkey$

End FUNCTION

' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

FASTPROC fastcopy

    PREFIX "!"
        mov edx, esi                        ' save esi
        mov eax, edi                        ' save edi
        mov esi, [esp-4]                    ' the call mnemonic reduces
        mov edi, [esp-8]                    ' stack memory address by 4
        mov ecx, [esp-12]                   ' byte count
        rep movsb
        mov esi, edx                        ' restore esi
        mov edi, eax                        ' restore edi
        ret                                 ' ret with no stack correction
    END PREFIX

END FASTPROC

' ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
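
The caller stores its arguments at [esp-8], [esp-12] and [esp-16], but the procedure reads them at [esp-4], [esp-8] and [esp-12]. The 4-byte difference comes from the return address that call pushes. The following is only a rough model of that arithmetic, written in Python rather than PowerBASIC so it can be run anywhere. The names `simulate_call`, `store32`, `load32` and the starting ESP value of 48 are made up for the illustration, and a bytearray stands in for stack memory.

```python
import struct

STACK_SIZE = 64  # bytes of simulated stack memory (arbitrary size for the model)


def store32(mem, addr, value):
    """Write a 32-bit little-endian value at byte offset addr."""
    struct.pack_into("<I", mem, addr, value)


def load32(mem, addr):
    """Read a 32-bit little-endian value from byte offset addr."""
    return struct.unpack_from("<I", mem, addr)[0]


def simulate_call(src, dst, count, esp=48):
    """Model the caller/callee offsets; returns (args seen by callee, stack balanced?)."""
    mem = bytearray(STACK_SIZE)
    caller_esp = esp

    # Caller: write the arguments below ESP without moving ESP.
    # The slot at esp-4 is left free for the return address.
    store32(mem, esp - 8, src)      # source address
    store32(mem, esp - 12, dst)     # destination address
    store32(mem, esp - 16, count)   # byte count

    # CALL pushes the return address, so ESP drops by 4.
    esp -= 4
    store32(mem, esp, 0xDEADBEEF)   # placeholder return address

    # Callee: the same slots are now 4 bytes closer to ESP.
    args = (load32(mem, esp - 4), load32(mem, esp - 8), load32(mem, esp - 12))

    # RET pops the return address; nothing else needs correcting.
    esp += 4
    return args, esp == caller_esp


if __name__ == "__main__":
    args, balanced = simulate_call(0x1000, 0x2000, 20)
    print(args, balanced)   # (4096, 8192, 20) True
```

The model also shows why the caller starts at [esp-8] rather than [esp-4]: the slot directly below ESP is the one call overwrites with the return address.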