Secure 512-bit hashing (ver. 2) for PowerBASIC
Code for the following two files appears below:
- SHA512a.INC
Hash routines for returning 64-byte SHA512 hashes of buffers and files. Included are a 32-bit version as well as versions that make use of SSE2 or MMX functionality if available.
- SHA512a.BAS
Test bed EXE illustrating buffer and file hashing.
SHA512a.ZIP (available below) contains both files as well as a class implementation of the same functionality.
See the function declarations below for detailed calling information.
All code compiles with either PBWIN 9+ or PBCC 5+. It replaces an earlier version of SHA512 which I posted in 2004. My thanks go to Eddy Van Esch for help with testing and debugging the new version.
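As a quick orientation before the full listings, the two entry points can be called roughly as follows. This is only a minimal sketch modeled on the calls made in the test bed; the file name "example.dat" is a hypothetical placeholder, and the variable names are arbitrary.

```powerbasic
#COMPILE EXE
#DIM ALL
#INCLUDE "SHA512a.INC"

FUNCTION PBMAIN() AS LONG
   LOCAL dataBuffer, sha, file_name AS STRING
   LOCAL ecode AS LONG

   ' Buffer hashing: the caller supplies a %HASHLEN-byte (64-byte) result buffer
   dataBuffer = "abc"
   sha = NUL$(%HASHLEN)
   SHA512_Buffer BYVAL STRPTR(dataBuffer), LEN(dataBuffer), BYVAL STRPTR(sha)

   ' File hashing: pass a dynamic string; the routine resizes it itself
   file_name = "example.dat"             ' hypothetical file name
   ecode = SHA512_File(file_name, sha)

   ' Nonzero is a PB (not Win32) error code; ERROR$(ecode) gives its text
   FUNCTION = ecode
END FUNCTION
```

Note that SHA512_Buffer() does no parameter checking, so the result buffer must really be %HASHLEN bytes long before the call.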
----------------------------------------------------------------------
Available here is a PDF file containing the NIST specifications for SHA-1 as well as the 224-bit, 256-bit, 384-bit, and 512-bit extensions to the SHA standard.
A hash is considered secure when it possesses the following qualities.
-- Determining the input string from the hash (i.e., working backward from the hash alone to determine the string which generated it) is not considered feasible.
-- Given an input string, it is not considered feasible to find another string which hashes to the same value.
-- It is not considered feasible to find any two different strings which hash to the same value.
Secure hashes are not designed for speed. The implementation below relies on assembly language to improve speed, but unless security is required, a secure hash is a poor choice when compared with the many simpler, far more efficient hash algorithms in widespread use. Moreover, unless compelling reasons exist for employing a 512-bit secure hash, SHA256 offers faster results on most systems. My PB implementation of SHA256 is available here.
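Whether the cost matters for a given application is easiest to judge by measuring it. The following is a rough timing sketch, not part of the posted files: the constants %PASSES and %BUFSIZE are hypothetical values chosen only for illustration, and GetTickCount's resolution (typically 10 to 16 ms) limits the accuracy of short runs.

```powerbasic
#COMPILE EXE
#DIM ALL
#INCLUDE "WIN32API.INC"      ' for GetTickCount and MessageBox
#INCLUDE "SHA512a.INC"

%PASSES  = 20                ' hypothetical repetition count
%BUFSIZE = 1000000           ' hypothetical buffer size, in bytes

FUNCTION PBMAIN() AS LONG
   LOCAL i AS LONG
   LOCAL t1 AS DWORD, t2 AS DWORD
   LOCAL seconds AS DOUBLE
   LOCAL dataBuffer, sha, report AS STRING

   dataBuffer = STRING$(%BUFSIZE, "a")
   sha = NUL$(%HASHLEN)

   t1 = GetTickCount
   FOR i = 1 TO %PASSES
      SHA512_Buffer BYVAL STRPTR(dataBuffer), LEN(dataBuffer), BYVAL STRPTR(sha)
   NEXT i
   t2 = GetTickCount

   seconds = (t2 - t1) / 1000
   report = "Hashed" + STR$(%PASSES) + " x" + STR$(%BUFSIZE) + " bytes in " + _
            FORMAT$(seconds, "###.###") + " seconds"
   ' Skip the rate if the run was too short for the timer to register
   IF seconds > 0 THEN
      report = report + $CRLF + _
               FORMAT$((%PASSES * %BUFSIZE) / seconds, "#########,") + " bytes per second"
   END IF

   #IF %DEF(%PB_CC32)
      STDOUT report
   #ELSE
      MessageBox 0&, BYCOPY (report), "SHA512 timing", %MB_OK
   #ENDIF
END FUNCTION
```

Running the same loop against another hash routine with the same buffer gives a like-for-like comparison on the machine in question.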
NIST is currently preparing an open competition to replace SHA-1 as the secure hash standard. Info.
----------------------------------------------------------------------
This PB implementation of SHA512 is hereby placed in the public domain. Use it as you wish.
Greg Turgeon
10/2008
Code:
'=====================================================================
'-- SHA512a.BAS
'-- Test bed for SHA512a.INC
'-- Compiles with either PBWIN 9+ or PBCC 5+
'   Greg Turgeon 10/2008
'=====================================================================
#COMPILE EXE
#DIM ALL
'============
#INCLUDE "WIN32API.INC"
#INCLUDE "SHA512a.INC"

'--------------------
' Utility macros
'--------------------
#IF %def(%pb_win32)
   MACRO eol=$CR
   MACRO say(t)
      MessageBox 0&, BYCOPY (t), EXE.Namex$, %MB_OK OR %MB_TASKMODAL
   END MACRO
   MACRO EnterCC
   END MACRO
   MACRO ExitCC
   END MACRO
#ELSEIF %def(%pb_cc32)
   MACRO eol=$CRLF
   MACRO say(t)=stdout t
   MACRO EnterCC
      LOCAL launched AS LONG
      if (cursory = 1) and (cursorx = 1) then launched = -1
   END MACRO
   MACRO ExitCC
      if launched then
         input flush
         stdout "Press any key to end"
         waitkey$
      end if
   END MACRO
#ENDIF

'--------------------
'-- Utility functions
'--------------------
DECLARE FUNCTION Get_FileSize(File_Name$) AS DWORD
DECLARE FUNCTION ShowHash64(ShouldBe$, Hash$) AS LONG
DECLARE FUNCTION Hex2ShowQuad(Buffer$) AS STRING

'====================
FUNCTION PBMain() AS LONG
LOCAL ecode AS LONG
LOCAL dataBuffer, sha, shouldBe AS STRING

EnterCC
gosub TestVectors1
gosub TestVectors2
gosub TestVectors3
gosub FileHash
function = ecode

'============
ExitMain:
ExitCC
EXIT FUNCTION

'============
TestVectors1:
dataBuffer$ = "abc"        'target data
sha$ = nul$(%HASHLEN)      'buffer into which hash routine will place hash
SHA512_Buffer byval strptr(dataBuffer$), len(dataBuffer$), byval strptr(sha$)
shouldBe$ = "DDAF35A193617ABA CC417349AE204131 12E6FA4E89A97EA2 0A9EEEE64B55D39A 2192992A274FC1A8 36BA3C23A3FEEBBD 454D4423643CE80E 2A9AC94FA54CA49F"
ShowHash64 shouldBe$, sha$
RETURN

'============
TestVectors2:
dataBuffer$ = "abcdefghbcdefghicdefghijdefghijkefghijklfghijklmghijklmnhijklmnoijklmnopjklmnopqklmnopqrlmnopqrsmnopqrstnopqrstu"
sha$ = nul$(%HASHLEN)
SHA512_Buffer byval strptr(dataBuffer$), len(dataBuffer$), byval strptr(sha$)
shouldBe$ = "8E959B75DAE313DA 8CF4F72814FC143F 8F7779C6EB9F7FA1 7299AEADB6889018 501D289E4900F7E4 331B99DEC4B5433A C7D329EEB6DD2654 5E96E55B874BE909"
ShowHash64 shouldBe$, sha$
RETURN

'============
TestVectors3:
dataBuffer$ = string$(1000000,"a")
sha$ = nul$(%HASHLEN)
SHA512_Buffer byval strptr(dataBuffer$), len(dataBuffer$), byval strptr(sha$)
shouldBe$ = "E718483D0CE76964 4E2E42C7BC15B463 8E1F98B13B204428 5632A803AFA973EB DE0FF244877EA60A 4CB0432CE577C31B EB009C5C2C49AA2E 4EADB217AD8CC09B"
ShowHash64 shouldBe$, sha$
RETURN

'============
FileHash:
LOCAL t$, file_name$, file_size AS DWORD
LOCAL t1 AS DWORD, t2 AS DWORD   'tick counts (a SINGLE cannot hold large tick values exactly)
LOCAL t3 AS DOUBLE               'elapsed seconds
file_name = command$
if len(file_name) = 0 then
   say(eol + "No file specified")
   return
end if
if isfile(file_name) = 0 then
   say(eol + "Cannot find file " + file_name)
   return
end if
t1 = GetTickCount
ecode = SHA512_File(File_Name$, sha$)
t2 = GetTickCount
if ecode then
   say(eol+ "SHA512_File error" + str$(ecode) + eol + error$(ecode))
   return
end if
t = t + file_name + eol
file_size = Get_FileSize(file_name$)
t3 = (t2-t1)/1000
t = t + "File size: " + using$(",",file_size) + " bytes" + eol
t = t + "Time elapsed: " + format$(t3,"###.###") + " seconds" + eol
'-- Skip the rate when the elapsed time is below timer resolution
if t3 > 0 then t = t + format$(file_size/t3,"#########,") + " BPS"
say(t)
RETURN
END FUNCTION

'====================
FUNCTION Get_FileSize(File_Name$) AS DWORD
'-- Returns the low 32 bits of the file size only
LOCAL totalbytes AS DWORD, fdata AS DIRDATA
if len(dir$(File_Name$, to fdata)) then
   totalbytes = fdata.FileSizeLow
end if
function = totalbytes
END FUNCTION

'====================
FUNCTION ShowHash64(shouldBe$, Hash$) AS LONG
LOCAL t$
t = "Should be:" + eol + shouldBe$ + eol
t = t + "Actual: " + eol + Hex2ShowQuad(Hash$)
say(t)
END FUNCTION

'====================
FUNCTION Hex2ShowQuad(Buffer$) AS STRING
REGISTER i AS LONG, j AS LONG
LOCAL t$, pbyte AS BYTE PTR
pbyte = strptr(Buffer$)
for i = 0 to 7
   for j = 0 to 7
      t = t + hex$(@pbyte,2)
      incr pbyte
   next j
   t = t + " "
next i
function = t
END FUNCTION
'-- END SHA512a.BAS ---------------------------------------------------
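The test bed above displays the expected and actual hashes for visual comparison. Where an automated check is preferred, something along these lines could be used. It is a sketch only: HashToHex() is a hypothetical helper, not part of the posted files, and the comparison assumes the same default %RETURN_LITTLE_ENDIAN setting under which the test bed's own output matches the expected strings.

```powerbasic
#COMPILE EXE
#DIM ALL
#INCLUDE "SHA512a.INC"

' Hypothetical helper: uppercase hex text of a hash, byte by byte
FUNCTION HashToHex(Hash$) AS STRING
   LOCAL i AS LONG, t AS STRING
   FOR i = 1 TO LEN(Hash$)
      t = t + HEX$(ASC(Hash$, i), 2)
   NEXT i
   FUNCTION = t
END FUNCTION

FUNCTION PBMAIN() AS LONG
   LOCAL dataBuffer, sha, shouldBe AS STRING

   dataBuffer = "abc"
   sha = NUL$(%HASHLEN)
   SHA512_Buffer BYVAL STRPTR(dataBuffer), LEN(dataBuffer), BYVAL STRPTR(sha)

   shouldBe = "DDAF35A193617ABA CC417349AE204131 12E6FA4E89A97EA2 0A9EEEE64B55D39A " + _
              "2192992A274FC1A8 36BA3C23A3FEEBBD 454D4423643CE80E 2A9AC94FA54CA49F"

   ' Strip the group-separating spaces, then compare the hex digits.
   ' Process exit code: 0 on a match, 1 on a mismatch.
   IF HashToHex(sha) = REMOVE$(shouldBe, " ") THEN
      FUNCTION = 0
   ELSE
      FUNCTION = 1
   END IF
END FUNCTION
```

The same comparison can be repeated for the other two test vectors by changing dataBuffer and shouldBe.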
Code:
'=====================================================================
'-- SHA512a.INC
'-- Implementation of the SHA512 secure hash algorithm
'-- Compiles with either PBWIN 9+ or PBCC 5+
'-- WIN32 API not required
'-- Uses no global data
'   Greg Turgeon 10/2008
'=====================================================================
%TRUE  = 1
%FALSE = 0
'-- Set to %FALSE to return big-endian hash$
%RETURN_LITTLE_ENDIAN = %TRUE
%ALIGNMENT       = 16
%WORKSPACESIZE   = ((8*8)+(80*8)+(8*8))   's_array+w_array+xx,t0,t1,etc.
%HASHLEN         = 64      'bytes
%BLOCKSIZE       = 128     'bytes
%FILE_BUFFERSIZE = 32000   'bytes

TYPE SHA512_CONTEXT
   state((%HASHLEN\8)+(%ALIGNMENT\8)) AS QUAD   'here, 80 bytes
   lendata    AS DWORD
   pdata      AS BYTE PTR
   pstate     AS QUAD PTR
   k_array    AS QUAD PTR
   s_array    AS BYTE PTR
   w_array    AS BYTE PTR
   pworkspace AS BYTE PTR
   dummy1     AS LONG      'padding for 64-byte alignment
   workspace  AS STRING * (%WORKSPACESIZE + %ALIGNMENT)
END TYPE

DECLARE FUNCTION SHA512_Buffer(BYVAL DataBuffer AS BYTE PTR, _
                               BYVAL Length AS DWORD, _
                               BYVAL HashBuffer AS BYTE PTR) AS LONG
#IF 0
Parameters for SHA512_Buffer() specify the location of the data to be
hashed, the size of the data, and the location where the hash$ is to be
placed. Byte pointers are listed only to fit the logic of the action
being performed. For example, the routine can be called with:

   ecode = SHA512_Buffer(byval strptr(buffer$), _
                         len(buffer$), _
                         byval strptr(hash$))

   ecode = SHA512_Buffer(byval varptr(AnArray&(0)), _
                         (ubound(AnArray)*4)+4, _
                         byval varptr(aUDT.HashString))

   ecode = SHA512_Buffer(byval varptr(aUDT), _
                         sizeof(aUDT), _
                         byval varptr(HashArray?(0)))

However, the routine performs no error checking to verify the validity
of the parameters passed.
#ENDIF

DECLARE FUNCTION SHA512_File(File_Name$, Hash$) AS LONG
#IF 0
SHA512_File() expects a dynamic string$ to be passed for return of the
hash; the string$ itself is resized within the routine. The routine
returns zero on success or a PB (not Win32) error code.
#ENDIF

'-- Routines used internally
'   SHA512_Buffer() and SHA512_File() test for SSE2 and MMX
'   availability and call the routine w/highest performance
DECLARE FUNCTION SHA512_Init(Ctx AS SHA512_CONTEXT) AS LONG
DECLARE FUNCTION SHA512_MakePadding(BYVAL TotalBytes AS DWORD) AS STRING
DECLARE FUNCTION SHA512_Compress128(Ctx AS SHA512_CONTEXT) AS LONG
DECLARE FUNCTION SHA512_Compress64(Ctx AS SHA512_CONTEXT) AS LONG
DECLARE FUNCTION SHA512_Compress32(Ctx AS SHA512_CONTEXT) AS LONG
DECLARE FUNCTION HasSSE2() AS LONG
DECLARE FUNCTION HasMMX() AS LONG

'--------------------
MACRO align(p,alignment)=((p+(alignment-1)) AND (NOT(alignment-1)))
'--------------------
MACRO ROR8_128(XMMReg,RotateVal)
'-- Returns (x >> n) | (x << (64 - n))
'-- Destroys eax, edx, xmm6, xmm7
! mov eax, RotateVal
! mov edx, 64
! sub edx, eax
! movd xmm6, edx
! movdqa xmm7, XMMReg        ;copy to xmm7
! psrlq XMMReg, RotateVal    ;shift each quad right
! psllq xmm7, xmm6           ;shift each quad left by edx
! por XMMReg, xmm7           ;OR the results
END MACRO
'--------------------
MACRO ROR8_64(MMXReg,RotateVal)
'-- Returns (x >> n) | (x << (64 - n))
'-- Destroys eax, edx, mm6, mm7
! mov eax, RotateVal
! mov edx, 64
! sub edx, eax
! movd mm6, edx
! movq mm7, MMXReg           ;copy to mm7
! psrlq MMXReg, RotateVal    ;shift right
! psllq mm7, mm6             ;shift left by edx
! por MMXReg, mm7            ;OR the results
END MACRO
'--------------------
MACRO ROR8(pQuad,RotateVal)
MACROTEMP RorStore
'-- Destroys eax, ebx, ecx, edx
! mov eax, pQuad
! mov ecx, RotateVal
! mov edx, [eax+4]
! mov eax, [eax]
! mov ebx, eax               ;duplicate ebx = QuadLO
! and ecx, 63                ;RotateVal mod 64
! shrd eax, edx, cl
! shrd edx, ebx, cl
! test ecx, 32               ;RotateVal > 31?
! jz RorStore                ;done if no
! xchg eax, edx              ;otherwise rotate edx:eax, 32
RorStore:
! mov ecx, pQuad
! mov [ecx], eax
! mov [ecx+4], edx
END MACRO
'--------------------
MACRO SHR8_128(XMMReg,ShiftVal)
! psrlq XMMReg, ShiftVal
END MACRO
'--------------------
MACRO SHR8_64(MMXReg,ShiftVal)
! psrlq MMXReg, ShiftVal
END MACRO
'--------------------
MACRO SHR8(pQuad,ShiftVal)
'-- Destroys eax, ecx, edx
MACROTEMP SHR8Done
! mov eax, pQuad
! mov ecx, ShiftVal
! mov edx, [eax+4]
! mov eax, [eax]
! and ecx, 63                ;ShiftVal mod 64
! shrd eax, edx, cl          ;edx:eax shr (ShiftVal mod 32)
! shr edx, cl
! test ecx, 32               ;ShiftVal > 31?
! jz SHR8Done                ;done if no
! mov eax, edx               ;otherwise shift right edx:eax, 32
! xor edx, edx
SHR8Done:
! mov ecx, pQuad
! mov [ecx], eax
! mov [ecx+4], edx
END MACRO
'--------------------
MACRO XOR8(px,py,pz)
'-- Destroys eax, ebx, ecx, edx
'-- Returns result at [px]
! mov edx, pz
! mov ecx, py
! mov edx, [edx]
! mov ecx, [ecx]
! mov eax, px
! xor edx, ecx               ;edx = (pzLO XOR pyLO)
! mov ebx, eax               ;save ebx --> px
! mov ecx, [eax]
! xor edx, ecx               ;edx = (pzLO XOR pyLO) XOR pxLO
! mov [ebx], edx             ;store low dword to pxLO
! mov ecx, py
! mov edx, [eax+4]
! mov ecx, [ecx+4]
! mov eax, pz
! xor edx, ecx               ;edx = (pxHI XOR pyHI)
! mov ecx, [eax+4]
! xor edx, ecx               ;edx = (pxHI XOR pyHI) XOR pzHI
! mov [ebx+4], edx           ;store hi dword to pxHI
END MACRO
'--------------------
MACRO Chh128(px,py,pz,presult)
'-- Chh(x,y,z)=(z XOR (x AND (y XOR z)))
'-- Returns presult at [presult]
'-- Destroys eax, ecx, edx, xmm0, xmm1, xmm2
! mov eax, px
! mov ecx, py
! mov edx, pz
! movq xmm0, [eax]
! movq xmm1, [ecx]
! movq xmm2, [edx]
! mov eax, presult
! pxor xmm1, xmm2            ;(y XOR z)
! pand xmm0, xmm1            ;(x AND (y XOR z))
! pxor xmm2, xmm0            ;(z XOR (x AND (y XOR z)))
! movq [eax], xmm2
END MACRO
'--------------------
MACRO Chh64(px,py,pz,presult)
'-- Chh(x,y,z)=(z XOR (x AND (y XOR z)))
'-- Returns presult at [presult]
'-- Destroys eax, ecx, edx, mm0, mm1, mm2
! mov eax, px
! mov ecx, py
! mov edx, pz
! movq mm0, [eax]
! movq mm1, [ecx]
! movq mm2, [edx]
! mov eax, presult
! pxor mm1, mm2              ;(y XOR z)
! pand mm0, mm1              ;(x AND (y XOR z))
! pxor mm2, mm0              ;(z XOR (x AND (y XOR z)))
! movq [eax], mm2
END MACRO
'--------------------
MACRO Chh(px,py,pz,presult)
'-- Chh(x,y,z)=(z XOR (x AND (y XOR z)))
'-- Returns presult at [presult]
'-- Destroys eax, ecx, edx
! mov eax, px
! mov ecx, py
! mov edx, pz
! mov eax, [eax]
! mov ecx, [ecx]
! mov edx, [edx]
! xor ecx, edx               ;ecx = (y XOR z)
! and eax, ecx               ;eax = (x AND (y XOR z))
! mov ecx, presult
! xor edx, eax               ;edx = (z XOR (x AND (y XOR z)))
! mov [ecx], edx
! mov eax, px
! mov ecx, py
! mov edx, pz
! mov eax, [eax+4]
! mov ecx, [ecx+4]
! mov edx, [edx+4]
! xor ecx, edx               ;ecx = (y XOR z)
! and eax, ecx               ;eax = (x AND (y XOR z))
! mov ecx, presult
! xor edx, eax               ;edx = (z XOR (x AND (y XOR z)))
! mov [ecx+4], edx
END MACRO
'--------------------
MACRO Maj128(px,py,pz,presult)
'-- Maj(x,y,z)=(((x OR y) AND z) OR (x AND y))
'-- Destroys eax, ecx, edx, xmm0, xmm1, xmm2, xmm3
! mov eax, px
! mov ecx, py
! mov edx, pz
! movq xmm0, [eax]           ;xmm0 = [px]
! movq xmm1, [ecx]           ;xmm1 = [py]
! movdqa xmm3, xmm0          ;copy: xmm3 = xmm0 = [px]
! movq xmm2, [edx]           ;xmm2 = [pz]
! por xmm0, xmm1             ;xmm0 = (x OR y)
! pand xmm3, xmm1            ;xmm3 = (x AND y)
! pand xmm0, xmm2            ;xmm0 = ((x OR y) AND z)
! mov eax, presult
! por xmm0, xmm3             ;xmm0 = ((x OR y) AND z) OR (x AND y)
! movq [eax], xmm0
END MACRO
'--------------------
MACRO Maj64(px,py,pz,presult)
'-- Maj(x,y,z)=(((x OR y) AND z) OR (x AND y))
'-- Destroys eax, ecx, edx, mm0, mm1, mm2, mm3
! mov eax, px
! mov ecx, py
! mov edx, pz
! movq mm0, [eax]            ;mm0 = [px]
! movq mm1, [ecx]            ;mm1 = [py]
! movq mm3, mm0              ;copy: mm3 = mm0 = [px]
! movq mm2, [edx]            ;mm2 = [pz]
! por mm0, mm1               ;mm0 = (x OR y)
! pand mm3, mm1              ;mm3 = (x AND y)
! pand mm0, mm2              ;mm0 = ((x OR y) AND z)
! mov eax, presult
! por mm0, mm3               ;mm0 = ((x OR y) AND z) OR (x AND y)
! movq [eax], mm0
END MACRO
'--------------------
MACRO Maj(px,py,pz,presult)
'-- Maj(x,y,z)=(((x OR y) AND z) OR (x AND y))
'-- Destroys eax, ebx, ecx, edx
! push esi
! push edi
! mov eax, px
! mov ecx, py
! mov edx, pz
! mov eax, [eax]             ;eax = [pxLO]
! mov esi, [ecx]             ;esi = [pyLO]
! mov edi, [edx]             ;edi = [pzLO]
! mov ebx, eax               ;copy: ebx = pxLO
! or eax, esi                ;eax = (x OR y)
! and ebx, esi               ;ebx = (x AND y)
! and eax, edi               ;eax = ((x OR y) AND z)
! or eax, ebx                ;eax = ((x OR y) AND z) OR (x AND y)
! mov edi, presult
! mov [edi], eax             ;presultLO
! mov eax, px
! mov ecx, py
! mov eax, [eax+4]           ;eax = [pxHI]
! mov esi, [ecx+4]           ;esi = [pyHI]
! mov edi, [edx+4]           ;edi = [pzHI]
! mov ebx, eax               ;save: ebx = pxHI
! or eax, esi                ;eax = (x OR y)
! and ebx, esi               ;ebx = (x AND y)
! and eax, edi               ;eax = ((x OR y) AND z)
! or eax, ebx                ;eax = ((x OR y) AND z) OR (x AND y)
! mov edi, presult
! mov [edi+4], eax           ;presultHI
! pop edi
! pop esi
END MACRO
'--------------------
MACRO Sigma0_128(pn,presult)
'-- Destroys edx, xmm0, xmm1, xmm2
! mov edx, pn
! movq xmm0, [edx]
! movdqa xmm1, xmm0
! movdqa xmm2, xmm0
ROR8_128(xmm0,28) : ROR8_128(xmm1,34) : ROR8_128(xmm2,39)
! pxor xmm0, xmm1
! mov edx, presult
! pxor xmm0, xmm2
! movq [edx], xmm0
END MACRO
'--------------------
MACRO Sigma0_64(pn,presult)
'-- Destroys edx, mm0, mm1, mm2
! mov edx, pn
! movq mm0, [edx]
! movq mm1, mm0
! movq mm2, mm0
ROR8_64(mm0,28) : ROR8_64(mm1,34) : ROR8_64(mm2,39)
! pxor mm0, mm1
! mov edx, presult
! pxor mm0, mm2
! movq [edx], mm0
END MACRO
'--------------------
MACRO Sigma0(pn,presult)
Copy8XtoY(pn,xx) : Copy8XtoY(pn,yy) : Copy8XtoY(pn,zz)
ROR8(xx,28) : ROR8(yy,34) : ROR8(zz,39)
XOR8(xx,yy,zz)
Copy8XtoY(xx,presult)
END MACRO
'--------------------
MACRO Sigma1_128(pn,presult)
'-- Destroys edx, xmm0, xmm1, xmm2
! mov edx, pn
! movq xmm0, [edx]
! movdqa xmm1, xmm0
! movdqa xmm2, xmm0
ROR8_128(xmm0,14) : ROR8_128(xmm1,18) : ROR8_128(xmm2,41)
! pxor xmm0, xmm1
! mov edx, presult
! pxor xmm0, xmm2
! movq [edx], xmm0
END MACRO
'--------------------
MACRO Sigma1_64(pn,presult)
'-- Destroys edx, mm0, mm1, mm2
! mov edx, pn
! movq mm0, [edx]
! movq mm1, mm0
! movq mm2, mm0
ROR8_64(mm0,14) : ROR8_64(mm1,18) : ROR8_64(mm2,41)
! pxor mm0, mm1
! mov edx, presult
! pxor mm0, mm2
! movq [edx], mm0
END MACRO
'--------------------
MACRO Sigma1(pn,presult)
Copy8XtoY(pn,xx) : Copy8XtoY(pn,yy) : Copy8XtoY(pn,zz)
ROR8(xx,14) : ROR8(yy,18) : ROR8(zz,41)
XOR8(xx,yy,zz)
Copy8XtoY(xx,presult)
END MACRO
'--------------------
MACRO Gamma0_128(pn,presult)
'-- Destroys edx, xmm0, xmm1, xmm2
! mov edx, pn
! movq xmm0, [edx]
! movdqa xmm1, xmm0
! movdqa xmm2, xmm0
ROR8_128(xmm0,1) : ROR8_128(xmm1,8) : SHR8_128(xmm2,7)
! pxor xmm0, xmm1
! mov edx, presult
! pxor xmm0, xmm2
! movq [edx], xmm0
END MACRO
'--------------------
MACRO Gamma0_64(pn,presult)
'-- Destroys edx, mm0, mm1, mm2
! mov edx, pn
! movq mm0, [edx]
! movq mm1, mm0
! movq mm2, mm0
ROR8_64(mm0,1) : ROR8_64(mm1,8) : SHR8_64(mm2,7)
! pxor mm0, mm1
! mov edx, presult
! pxor mm0, mm2
! movq [edx], mm0
END MACRO
'--------------------
MACRO Gamma0(pn,presult)
Copy8XtoY(pn,xx) : Copy8XtoY(pn,yy) : Copy8XtoY(pn,zz)
ROR8(xx,1) : ROR8(yy,8) : SHR8(zz,7)
XOR8(xx,yy,zz)
Copy8XtoY(xx,presult)
END MACRO
'--------------------
MACRO Gamma1_128(pn,presult)
'-- Destroys edx, xmm0, xmm1, xmm2
! mov edx, pn
! movq xmm0, [edx]
! movdqa xmm1, xmm0
! movdqa xmm2, xmm0
ROR8_128(xmm0,19) : ROR8_128(xmm1,61) : SHR8_128(xmm2,6)
! pxor xmm0, xmm1
! mov edx, presult
! pxor xmm0, xmm2
! movq [edx], xmm0
END MACRO
'--------------------
MACRO Gamma1_64(pn,presult)
'-- Destroys edx, mm0, mm1, mm2
! mov edx, pn
! movq mm0, [edx]
! movq mm1, mm0
! movq mm2, mm0
ROR8_64(mm0,19) : ROR8_64(mm1,61) : SHR8_64(mm2,6)
! pxor mm0, mm1
! mov edx, presult
! pxor mm0, mm2
! movq [edx], mm0
END MACRO
'--------------------
MACRO Gamma1(pn,presult)
Copy8XtoY(pn,xx) : Copy8XtoY(pn,yy) : Copy8XtoY(pn,zz)
ROR8(xx,19) : ROR8(yy,61) : SHR8(zz,6)
XOR8(xx,yy,zz)
Copy8XtoY(xx,presult)
END MACRO
'--------------------
MACRO Copy8XtoY128(px,py)
'-- Destroys eax, edx, xmm0
! mov eax, px
! mov edx, py
! movq xmm0, [eax]
! movq [edx], xmm0
END MACRO
'--------------------
MACRO Copy8XtoY64(px,py)
'-- Destroys eax, edx, mm0
! mov eax, px
! mov edx, py
! movq mm0, [eax]
! movq [edx], mm0
END MACRO
'--------------------
MACRO Copy8XtoY(px,py)
'-- Destroys eax, ebx, ecx, edx
! mov eax, px
! mov edx, py
! mov ebx, [eax]
! mov ecx, [eax+4]
! mov [edx], ebx
! mov [edx+4], ecx
END MACRO
'--------------------
MACRO Add8XtoY128(px,py)
'-- Destroys eax, edx, xmm6, xmm7
! mov edx, py                ;edx --> y throughout (target)
! mov eax, px                ;eax --> x throughout
! movq xmm6, [edx]
! movq xmm7, [eax]
! paddq xmm6, xmm7
! movq [edx], xmm6
END MACRO
'--------------------
MACRO Add8XtoY64(px,py)
Add8XtoY(px,py)
END MACRO
'--------------------
MACRO Add8XtoY(px,py)
'-- Destroys eax, ecx, edx
! mov edx, py                ;edx --> y throughout (target)
! mov eax, px                ;eax --> x throughout
! mov ecx, [edx]             ;ecx = yLO
! add ecx, [eax]             ;ecx = yLO + xLO
! mov [edx], ecx             ;store to yLO
! mov ecx, [edx+4]           ;ecx = yHI
! adc ecx, [eax+4]           ;ecx = yHI + xHI + carry
! mov [edx+4], ecx           ;store to yHI
END MACRO

'====================
FUNCTION SHA512_Init(Ctx AS SHA512_CONTEXT) AS LONG
LOCAL p AS DWORD
p = varptr(Ctx.state(0))
Ctx.pstate = align(p,%ALIGNMENT)
Ctx.@pstate[0] = &h6A09E667F3BCC908&&
Ctx.@pstate[1] = &hBB67AE8584CAA73B&&
Ctx.@pstate[2] = &h3C6EF372FE94F82B&&
Ctx.@pstate[3] = &hA54FF53A5F1D36F1&&
Ctx.@pstate[4] = &h510E527FADE682D1&&
Ctx.@pstate[5] = &h9B05688C2B3E6C1F&&
Ctx.@pstate[6] = &h1F83D9ABFB41BD6B&&
Ctx.@pstate[7] = &h5BE0CD19137E2179&&
Ctx.k_array = codeptr(K_Array_Data)
p = varptr(Ctx.workspace)
p = align(p,%ALIGNMENT)
Ctx.s_array = p
Ctx.w_array = p+(8*8)
' allow for 16-byte alignment in SHA512_Compress128()
Ctx.pworkspace = p+((8*8)+(80*8))
EXIT FUNCTION
'============
#ALIGN 16
K_Array_Data:
! DD &hD728AE22,&h428A2F98, &h23EF65CD,&h71374491, &hEC4D3B2F,&hB5C0FBCF, &h8189DBBC,&hE9B5DBA5
! DD &hF348B538,&h3956C25B, &hB605D019,&h59F111F1, &hAF194F9B,&h923F82A4, &hDA6D8118,&hAB1C5ED5
! DD &hA3030242,&hD807AA98, &h45706FBE,&h12835B01, &h4EE4B28C,&h243185BE, &hD5FFB4E2,&h550C7DC3
! DD &hF27B896F,&h72BE5D74, &h3B1696B1,&h80DEB1FE, &h25C71235,&h9BDC06A7, &hCF692694,&hC19BF174
! DD &h9EF14AD2,&hE49B69C1, &h384F25E3,&hEFBE4786, &h8B8CD5B5,&h0FC19DC6, &h77AC9C65,&h240CA1CC
! DD &h592B0275,&h2DE92C6F, &h6EA6E483,&h4A7484AA, &hBD41FBD4,&h5CB0A9DC, &h831153B5,&h76F988DA
! DD &hEE66DFAB,&h983E5152, &h2DB43210,&hA831C66D, &h98FB213F,&hB00327C8, &hBEEF0EE4,&hBF597FC7
! DD &h3DA88FC2,&hC6E00BF3, &h930AA725,&hD5A79147, &hE003826F,&h06CA6351, &h0A0E6E70,&h14292967
! DD &h46D22FFC,&h27B70A85, &h5C26C926,&h2E1B2138, &h5AC42AED,&h4D2C6DFC, &h9D95B3DF,&h53380D13
! DD &h8BAF63DE,&h650A7354, &h3C77B2A8,&h766A0ABB, &h47EDAEE6,&h81C2C92E, &h1482353B,&h92722C85
! DD &h4CF10364,&hA2BFE8A1, &hBC423001,&hA81A664B, &hD0F89791,&hC24B8B70, &h0654BE30,&hC76C51A3
! DD &hD6EF5218,&hD192E819, &h5565A910,&hD6990624, &h5771202A,&hF40E3585, &h32BBD1B8,&h106AA070
! DD &hB8D2D0C8,&h19A4C116, &h5141AB53,&h1E376C08, &hDF8EEB99,&h2748774C, &hE19B48A8,&h34B0BCB5
! DD &hC5C95A63,&h391C0CB3, &hE3418ACB,&h4ED8AA4A, &h7763E373,&h5B9CCA4F, &hD6B2B8A3,&h682E6FF3
! DD &h5DEFB2FC,&h748F82EE, &h43172F60,&h78A5636F, &hA1F0AB72,&h84C87814, &h1A6439EC,&h8CC70208
! DD &h23631E28,&h90BEFFFA, &hDE82BDE9,&hA4506CEB, &hB2C67915,&hBEF9A3F7, &hE372532B,&hC67178F2
! DD &hEA26619C,&hCA273ECE, &h21C0C207,&hD186B8C7, &hCDE0EB1E,&hEADA7DD6, &hEE6ED178,&hF57D4F7F
! DD &h72176FBA,&h06F067AA, &hA2C898A6,&h0A637DC5, &hBEF90DAE,&h113F9804, &h131C471B,&h1B710B35
! DD &h23047D84,&h28DB77F5, &h40C72493,&h32CAAB7B, &h15C9BEBC,&h3C9EBE0A, &h9C100D4C,&h431D67C4
! DD &hCB3E42B6,&h4CC5D4BE, &hFC657E2A,&h597F299C, &h3AD6FAEC,&h5FCB6FAB, &h4A475817,&h6C44198C
END FUNCTION

'====================
FUNCTION SHA512_Compress128(Ctx AS SHA512_CONTEXT) AS LONG
'-- Requires SSE2
'-- In macros, EBX is considered always available; ESI & EDI are
'   preserved around use
#REGISTER NONE
LOCAL i, x, xx, t0, t1, pstate, result AS LONG
LOCAL s_array, w_array, k_array AS LONG
LOCAL aa, bb, cc, ddd, ee, ff, gg, hh AS LONG
s_array = Ctx.s_array : w_array = Ctx.w_array : k_array = CTX.k_array
'-- Local vars aa-hh overlay s_array&&(0-7)
aa = s_array : bb = s_array+8 : cc = s_array+16 : ddd = s_array+24
ee = s_array+32 : ff = s_array+40 : gg = s_array+48 : hh = s_array+56
xx = Ctx.pworkspace : t0 = xx+16 : t1 = xx+32 : result = xx+48
i = Ctx.pdata
pstate = Ctx.pstate
! push ebx
! push esi
! push edi
'-- Copy current state into s_array&&()
'poke$ s, peek$(Ctx.pstate, %HASHLEN)
! mov esi, pstate
! mov edi, s_array
! movdqa xmm0, [esi]
! movdqa xmm1, [esi+16]
! movdqa xmm2, [esi+32]
! movdqa xmm3, [esi+48]
! movdqa [edi], xmm0
! movdqa [edi+16],xmm1
! movdqa [edi+32],xmm2
! movdqa [edi+48],xmm3
'-- Copy target data into w&&(0-15) w/64-bit little-to-big endian conversion
! mov esi, i                 ;i = Ctx.pdata = unaligned
! mov edi, w_array
! mov ecx, %BLOCKSIZE
'-- 64-bit BSWAP * 2 /loop
#ALIGN 16
BSwapCopyTop:
! sub ecx, 16
! movdqu xmm0, [esi+ecx]
! sub ecx, 16
! movdqu xmm2, [esi+ecx]
! movdqa xmm1, xmm0
! movdqa xmm3, xmm2
! psllw xmm0, 8
! psllw xmm2, 8
! psrlw xmm1, 8
! psrlw xmm3, 8
! por xmm0, xmm1
! por xmm2, xmm3
! pshufhw xmm0, xmm0, &b00011011
! pshufhw xmm2, xmm2, &b00011011
! pshuflw xmm0, xmm0, &b00011011
! pshuflw xmm2, xmm2, &b00011011
! movdqa [edi+ecx+16], xmm0
! movdqa [edi+ecx], xmm2
! test ecx, ecx
! jnz BSwapCopyTop
'-- Fill w&&(16-79)
' for i = 16 to 79
'    @w[i] = Gamma1(@w[i-2]) + @w[i-7] + Gamma0(@w[i-15]) + @w[i-16]
' next i
! mov esi, 16                ;edi = w_array from above
#ALIGN 16
TopLoop1:
! mov ebx, esi
! sub ebx, 2
! lea eax, [edi+ebx*8]       ;x = w+((i-2)*8)
! mov x, eax                 ;x --> w[i-2]
Gamma1_128(x,result)         'result = Gamma1(@w[i-2])
! mov ebx, esi
! mov edx, result
! sub ebx, 7
! lea eax, [edi+ebx*8]       ;eax --> @w[i-7]
'! mov x, eax                ;x --> w[i-7]
'Add8XtoY128(x,result)       'result = result + @w[i-7]
! movq xmm6, [edx]
! movq xmm7, [eax]
! paddq xmm6, xmm7
! movq [edx], xmm6
! mov ebx, esi
! sub ebx, 15
! lea eax, [edi+ebx*8]
! mov x, eax                 ;x --> w[i-15]
Gamma0_128(x,xx)
Add8XtoY128(xx,result)       'result = result + Gamma0(@w[i-15])
! mov ebx, esi
! mov edx, result
! sub ebx, 16
'Add8XtoY128(x,result)       'result = result + @w[i-16]
! lea eax, [edi+ebx*8]       ;eax --> @w[i-16]
! movq xmm6, [edx]
! movq xmm7, [eax]
! paddq xmm6, xmm7
! movq [edx], xmm6
! lea eax, [edi+esi*8]       ;x = w+(i*8)
! mov edx, result
'Copy8XtoY128(result,x)      '@w[i] = @result
! movq xmm0, [edx]
! movq [eax], xmm0
! inc esi
! cmp esi, 79
! jng TopLoop1
'for i = 0 to 79 (esi = byte offset i*8, edi = rounds remaining)
! xor esi, esi
! mov edi, 80
#ALIGN 16
TopLoop2:
't0 = @hh + Sigma1&&(@ee) + Chh(@ee, @ff, @gg) + @CTX.k_array[i] + @w[i]
Copy8XtoY128(hh,t0)
Sigma1_128(ee,result)
Add8XtoY128(result,t0)
Chh128(ee,ff,gg,result)
Add8XtoY128(result,t0)
! mov ebx, k_array
! mov edx, t0
! lea eax, [ebx+esi]
'Add8XtoY128(x,t0)
! movq xmm6, [edx]
! movq xmm7, [eax]
! paddq xmm6, xmm7
! movq [edx], xmm6
! mov ebx, w_array
! lea eax, [ebx+esi]
'Add8XtoY128(x,t0)
! movq xmm6, [edx]
! movq xmm7, [eax]
! paddq xmm6, xmm7
! movq [edx], xmm6
Sigma0_128(aa,t1)
Maj128(aa,bb,cc,result)
Add8XtoY128(result,t1)
'Copy8XtoY64(gg,hh)
'Copy8XtoY64(ff,gg)
'Copy8XtoY64(ee,ff)
'Copy8XtoY64(ddd,ee)
'-- aa, cc, ee, gg = aligned
! mov edx, gg
! mov ecx, ff
! mov ebx, ee
! mov eax, ddd
! movq xmm3, [edx]
! movq xmm2, [ecx]
! movq xmm1, [ebx]
! movq xmm0, [eax]
! movq [edx+8], xmm3
! movq [edx], xmm2
! movq [ecx], xmm1
! movq [ebx], xmm0
Add8XtoY128(t0,ee)
'Copy8XtoY64(cc,ddd)
'Copy8XtoY64(bb,cc)
'Copy8XtoY64(aa,bb)
'@aa = t0 + t1
! mov ecx, cc
! mov ebx, bb
! mov eax, aa
! mov edx, t0
! movq xmm3, [ecx]
! movq xmm2, [ebx]
! movq xmm1, [eax]
! movq xmm0, [edx]
! movq [ecx+8], xmm3
! movq [ecx], xmm2
! movq [ebx], xmm1
! movq [eax], xmm0
Add8XtoY128(t1,aa)
'next i
! add esi, 8
! dec edi
! jnz TopLoop2
'for i = 0 to 7 : Ctx.@pstate[i] = Ctx.@pstate[i] + @s[i] : next i
! mov esi, s_array           ;esi --> s_array&&(0) (aligned)
! mov edi, pstate            ;edi --> Ctx.State(0)
! movdqa xmm0, [edi]
! movdqa xmm1, [edi+16]
! movdqa xmm2, [edi+32]
! movdqa xmm3, [edi+48]
! paddq xmm0, [esi]
! paddq xmm1, [esi+16]
! paddq xmm2, [esi+32]
! paddq xmm3, [esi+48]
! movdqa [edi], xmm0
! movdqa [edi+16], xmm1
! movdqa [edi+32], xmm2
! movdqa [edi+48], xmm3
'-- Burn context's temp values (poke$ Ctx.pworkspace, nul$(%WORKSPACESIZE))
! mov ecx, %WORKSPACESIZE
! pxor xmm0, xmm0
! lea edi, [esi+ecx]         ;point to end of workspace
! pxor xmm1, xmm1
! neg ecx
BurnTop:
! movdqa [edi+ecx], xmm0
! movdqa [edi+ecx+16], xmm0
! add ecx, 32
! jnz BurnTop
! pop edi
! pop esi
! pop ebx
END FUNCTION

'====================
FUNCTION SHA512_Compress64(Ctx AS SHA512_CONTEXT) AS LONG
'-- Requires MMX
'-- In macros, EBX is considered always available; ESI & EDI are
'   preserved around use
#REGISTER NONE
LOCAL i, x, xx, yy, zz, pstate, t0, t1, result AS LONG
LOCAL s_array, w_array, k_array AS LONG
LOCAL aa, bb, cc, ddd, ee, ff, gg, hh AS LONG
s_array = Ctx.s_array : w_array = Ctx.w_array : k_array = CTX.k_array
'-- Local vars aa-hh overlay s_array&&(0-7)
aa = s_array : bb = s_array+8 : cc = s_array+16 : ddd = s_array+24
ee = s_array+32 : ff = s_array+40 : gg = s_array+48 : hh = s_array+56
xx = Ctx.pworkspace : yy = xx+8 : zz = xx+16
t0 = xx+24 : t1 = xx+32 : result = xx+40
i = Ctx.pdata
pstate = Ctx.pstate
! push ebx
! push esi
! push edi
'-- Copy current state into s_array&&()
'poke$ s, peek$(Ctx.pstate, %HASHLEN)
! mov esi, pstate
! mov edi, s_array
! movq mm0, [esi]
! movq mm1, [esi+8]
! movq mm2, [esi+16]
! movq mm3, [esi+24]
! movq mm4, [esi+32]
! movq mm5, [esi+40]
! movq mm6, [esi+48]
! movq mm7, [esi+56]
! movq [edi], mm0
! movq [edi+8], mm1
! movq [edi+16],mm2
! movq [edi+24],mm3
! movq [edi+32],mm4
! movq [edi+40],mm5
! movq [edi+48],mm6
! movq [edi+56],mm7
'-- Copy target data into w&&(0-15) w/64-bit little-to-big endian conversion
! mov esi, i
! mov edi, w_array
! mov ecx, %BLOCKSIZE
#ALIGN 4
BSwapCopyTop:
! sub ecx, 4
! mov eax, [esi+ecx]
! sub ecx, 4
! bswap eax
! mov edx, [esi+ecx]
! mov [edi+ecx], eax
! bswap edx
! test ecx, ecx
! mov [edi+ecx+4], edx
! jnz BSwapCopyTop
'-- Fill w&&(16-79)
' for i = 16 to 79
'    @w[i] = Gamma1(@w[i-2]) + @w[i-7] + Gamma0(@w[i-15]) + @w[i-16]
' next i
! mov esi, 16                ;edi = w from above
#ALIGN 8
TopLoop1:
! mov ebx, esi
! sub ebx, 2
! lea eax, [edi+ebx*8]       ;x = w+((i-2)*8)
! mov x, eax                 ;x --> w[i-2]
Gamma1_64(x,result)          'result = Gamma1(@w[i-2])
! mov ebx, esi
! sub ebx, 7
! lea eax, [edi+ebx*8]
! mov x, eax                 ;x --> @w[i-7] (x = w+((i-7)*8))
Add8XtoY64(x,result)         'result = result + @w[i-7]
! mov ebx, esi
! sub ebx, 15
! lea eax, [edi+ebx*8]
! mov x, eax                 ;x --> w[i-15]
Gamma0_64(x,xx)
Add8XtoY64(xx,result)        'result = result + Gamma0(@w[i-15])
! mov ebx, esi
! sub ebx, 16
! lea eax, [edi+ebx*8]
! mov x, eax                 ;x --> @w[i-16]
Add8XtoY64(x,result)         'result = result + @w[i-16]
! mov edx, result
! lea eax, [edi+esi*8]       ;x = w+(i*8) (@w[i])
'Copy8XtoY64(result,x)       '@w[i] = @result
! movq mm0, [edx]
! movq [eax],mm0
! inc esi
! cmp esi, 79
! jng TopLoop1
'for i = 0 to 79 (esi = byte offset i*8, edi = rounds remaining)
! xor esi, esi
! mov edi, 80
#ALIGN 8
TopLoop2:
't0 = @hh + Sigma1&&(@ee) + Chh(@ee, @ff, @gg) + @CTX.k_array[i] + @w[i]
Copy8XtoY64(hh,t0)
Sigma1_64(ee,result)
Add8XtoY64(result,t0)
Chh64(ee,ff,gg,result)
Add8XtoY64(result,t0)
! mov ebx, k_array
! lea eax, [ebx+esi]
! mov x, eax
Add8XtoY64(x,t0)
! mov ebx, w_array
! lea eax, [ebx+esi]
! mov x, eax
Add8XtoY64(x,t0)
Sigma0_64(aa,t1)
Maj64(aa,bb,cc,result)
Add8XtoY64(result,t1)
'Copy8XtoY64(gg,hh)
'Copy8XtoY64(ff,gg)
'Copy8XtoY64(ee,ff)
'Copy8XtoY64(ddd,ee)
! mov edx, gg
! mov ecx, ff
! mov ebx, ee
! mov eax, ddd
! movq mm3, [edx]
! movq mm2, [ecx]
! movq mm1, [ebx]
! movq mm0, [eax]
! movq [edx+8], mm3
! movq [edx], mm2
! movq [ecx], mm1
! movq [ebx], mm0
Add8XtoY64(t0,ee)
'Copy8XtoY64(cc,ddd)
'Copy8XtoY64(bb,cc)
'Copy8XtoY64(aa,bb)
'@aa = t0 + t1
! mov ecx, cc
! mov ebx, bb
! mov eax, aa
! mov edx, t0
! movq mm3, [ecx]
! movq mm2, [ebx]
! movq mm1, [eax]
! movq mm0, [edx]
! movq [ecx+8], mm3
! movq [ecx], mm2
! movq [ebx], mm1
! movq [eax], mm0
Add8XtoY64(t1,aa)
'next i
! add esi, 8
! dec edi
! jnz TopLoop2
'for i = 0 to 7 : Ctx.@pstate[i] = Ctx.@pstate[i] + @s[i] : next i
! mov eax, s_array           ;eax --> s_array&&(0)
! mov edx, pstate            ;edx --> Ctx.State(0)
! mov xx, eax                ;xx --> s_array&&(0)
! mov yy, edx                ;yy --> Ctx.state0
! mov edi, 8
! mov esi, 7
Add8XtoY64(xx,yy)
#ALIGN 4
TopLoop3:
'advance pointers
! add xx, edi                ;xx --> s[i]
! add yy, edi                ;yy --> pcurrent_state[i]
Add8XtoY64(xx,yy)
! dec esi
! jnz TopLoop3
'-- Burn context's temp values (poke$ Ctx.pworkspace, nul$(%WORKSPACESIZE))
! mov edi, s_array
! xor eax, eax
! mov ecx, (%WORKSPACESIZE\4)
! cld
! rep stosd
! pop edi
! pop esi
! pop ebx
! emms
END FUNCTION

'====================
FUNCTION SHA512_Compress32(Ctx AS SHA512_CONTEXT) AS LONG
'-- Uses 32-bit code only
'-- In macros, EBX is considered always available; ESI & EDI are
'   preserved around use
#REGISTER NONE
LOCAL i, x, xx, yy, zz, pstate, t0, t1, result AS LONG
LOCAL s_array, w_array, k_array AS LONG
LOCAL aa, bb, cc, ddd, ee, ff, gg, hh AS LONG
s_array = Ctx.s_array : w_array = Ctx.w_array : k_array = CTX.k_array
'-- Local vars aa-hh overlay s_array&&(0-7)
aa = s_array : bb = s_array+8 : cc = s_array+16 : ddd = s_array+24
ee = s_array+32 : ff = s_array+40 : gg = s_array+48 : hh = s_array+56
xx = Ctx.pworkspace : yy = xx+8 : zz = xx+16
t0 = xx+24 : t1 = xx+32 : result = xx+40
pstate = Ctx.pstate
'-- Copy current state into s_array&&()
poke$ s_array, peek$(pstate, %HASHLEN)
'-- Copy target data into w&&(0-15) w/64-bit little-to-big endian conversion
i = Ctx.pdata
! push ebx
! push esi
! push edi
! mov esi, i
! mov edi, w_array
! mov ecx, %BLOCKSIZE
#ALIGN 4
BSwapCopyTop:
! sub ecx, 4
! mov eax, [esi+ecx]
! sub ecx, 4
! bswap eax
! mov edx, [esi+ecx]
! mov [edi+ecx], eax
! bswap edx
! test ecx, ecx
! mov [edi+ecx+4], edx
! jnz BSwapCopyTop
'-- Fill w&&(16-79)
' for i = 16 to 79
'    @w[i] = Gamma1(@w[i-2]) + @w[i-7] + Gamma0(@w[i-15]) + @w[i-16]
' next i
! mov esi, 16                ;edi = w from above
#ALIGN 4
TopLoop1:
'@w[i] = Gamma1(@w[i-2]) + @w[i-7] + Gamma0(@w[i-15]) + @w[i-16]
! mov ebx, esi
! sub ebx, 2
! lea eax, [edi+ebx*8]       ;x = w+((i-2)*8)
! mov x, eax                 ;x --> w[i-2]
Gamma1(x,result)             'result = Gamma1(@w[i-2])
! mov ebx, esi
! sub ebx, 7
! lea eax, [edi+ebx*8]
! mov x, eax                 ;x --> @w[i-7] (x = w+((i-7)*8))
Add8XtoY(x,result)           'result = result + @w[i-7]
! mov ebx, esi
! sub ebx, 15
! lea eax, [edi+ebx*8]
! mov x, eax                 ;x --> w[i-15]
Gamma0(x,xx)
Add8XtoY(xx,result)          'result = result + Gamma0(@w[i-15])
! mov ebx, esi
! sub ebx, 16
! lea eax, [edi+ebx*8]
! mov x, eax                 ;x --> @w[i-16]
Add8XtoY(x,result)           'result = result + @w[i-16]
! lea eax, [edi+esi*8]       ;x = w+(i*8)
! mov x, eax                 ;x --> @w[i]
Copy8XtoY(result,x)          '@w[i] = @result
! inc esi
! cmp esi, 79
! jng TopLoop1
'for i = 0 to 79 (esi = byte offset i*8, edi = rounds remaining)
! xor esi, esi
! mov edi, 80
#ALIGN 4
TopLoop2:
't0 = @hh + Sigma1&&(@ee) + Chh(@ee, @ff, @gg) + @CTX.k_array[i] + @w[i]
Copy8XtoY(hh,t0)
Sigma1(ee,result)
Add8XtoY(result,t0)
Chh(ee,ff,gg,result)
Add8XtoY(result,t0)
! mov ebx, k_array
! lea eax, [ebx+esi]
! mov x, eax
Add8XtoY(x,t0)
! mov ebx, w_array
! lea eax, [ebx+esi]
! mov x, eax
Add8XtoY(x,t0)
Sigma0(aa,t1)
Maj(aa,bb,cc,result)
Add8XtoY(result,t1)
Copy8XtoY(gg,hh)
Copy8XtoY(ff,gg)
Copy8XtoY(ee,ff)
Copy8XtoY(ddd,ee)
Add8XtoY(t0,ee)
Copy8XtoY(cc,ddd)
Copy8XtoY(bb,cc)
Copy8XtoY(aa,bb)
Copy8XtoY(t0,aa)
Add8XtoY(t1,aa)
'next i
! add esi, 8
! dec edi
! jnz TopLoop2
'for i = 0 to 7 : Ctx.@pstate[i] = Ctx.@pstate[i] + @s[i] : next i
! mov eax, s_array           ;eax --> s_array&&(0)
! mov edx, pstate            ;edx --> Ctx.State(0)
! mov xx, eax                ;xx --> s_array&&(0)
! mov yy, edx                ;yy --> Ctx.state0
! mov edi, 8
! mov esi, 7
Add8XtoY(xx,yy)
#ALIGN 4
TopLoop3:
'advance pointers
! add xx, edi                ;xx --> s[i]
! add yy, edi                ;yy --> pcurrent_state[i]
Add8XtoY(xx,yy)
! dec esi
! jnz TopLoop3
'-- Burn context's temp values (poke$ Ctx.pworkspace, nul$(%WORKSPACESIZE))
! mov edi, s_array
! xor eax, eax
! mov ecx, (%WORKSPACESIZE\4)
! cld
! rep stosd
! pop edi
! pop esi
! 
pop ebx END FUNCTION '==================== FUNCTION SHA512_Buffer(BYVAL DataBuffer AS BYTE PTR, BYVAL Length AS DWORD, BYVAL HashBuffer AS BYTE PTR) EXPORT AS LONG '-- Expects parameter Hash to point to buffer of correct size of %HASHLEN bytes (512bits\8) REGISTER i AS DWORD LOCAL lastbuff$, ctx AS SHA512_CONTEXT, pfunction, pstate AS LONG i = Length AND (%BLOCKSIZE-1) lastbuff$ = peek$((DataBuffer+Length)-i, i) lastbuff$ = lastbuff$ + SHA512_MakePadding(Length) SHA512_Init ctx if HasSSE2&() then pfunction = codeptr(SHA512_Compress128) elseif HasMMX&() then pfunction = codeptr(SHA512_Compress64) else pfunction = codeptr(SHA512_Compress32) end if ctx.lendata = Length ctx.pdata = DataBuffer i = Length AND (NOT %BLOCKSIZE-1) do while i > 0 call dword pfunction SDECL (ctx) ctx.pdata = ctx.pdata + %BLOCKSIZE i = i - %BLOCKSIZE loop ctx.pdata = strptr(lastbuff$) ctx.lendata = len(lastbuff$) do while ctx.lendata > 0 call dword pfunction STDCALL (BYREF ctx) ctx.pdata = ctx.pdata + %BLOCKSIZE ctx.lendata = ctx.lendata - %BLOCKSIZE loop '-- Copy current state from s&() to Hash 'for i = 0 to (%HASHLEN\8)-1 : @Hash[i] = [email protected][i] : next i pstate = ctx.pstate ! push esi ! push edi ! mov esi, pstate ;esi -> ctx.state(0) ! mov edi, HashBuffer #IF %RETURN_LITTLE_ENDIAN ! xor ecx, ecx LoopTop: ! mov edx, [esi+ecx*4+4] ! mov eax, [esi+ecx*4] ! bswap edx ! bswap eax ! mov [ecx*4+edi], edx ! inc ecx ! mov [ecx*4+edi], eax ! inc ecx ! test ecx, (%HASHLEN\4) ! jz LoopTop #ELSE ! mov ecx, (%HASHLEN\4) ! cld ! rep movsd #ENDIF ! pop edi ! 
pop esi END FUNCTION '==================== FUNCTION SHA512_File(File_Name$, Hash$) EXPORT AS LONG '-- Returns 0 on success or PB (not OS) error code '-- Parameter Hash$ is resized here before return REGISTER i AS LONG, bytesleft AS DWORD LOCAL buffer$, padding$ LOCAL ctx AS SHA512_CONTEXT, phash AS QUAD PTR LOCAL ecode, pfunction, pstate, infile, lastpass, maxstring AS LONG '-- If file not found, return PB error code if isfile(File_Name$) = 0 then function = 53 : exit function end if buffer = string$(%FILE_BUFFERSIZE, 0) maxstring = %FILE_BUFFERSIZE ctx.lendata = %BLOCKSIZE SHA512_Init ctx if HasSSE2&() then pfunction = codeptr(SHA512_Compress128) elseif HasMMX&() then pfunction = codeptr(SHA512_Compress64) else pfunction = codeptr(SHA512_Compress32) end if infile = freefile open File_Name$ for binary lock shared as infile base=0 if err then goto SHA_File_Error bytesleft = lof(infile) padding = SHA512_MakePadding(bytesleft) do 'Resize if necessary & flag final buffer if bytesleft =< maxstring then maxstring = bytesleft buffer = string$(maxstring, 0) incr lastpass end if get infile,, buffer : if err then goto SHA_File_Error if lastpass then buffer = buffer + padding ctx.pdata = strptr(buffer) for i = 1 to (len(buffer)\%BLOCKSIZE) call dword pfunction STDCALL (BYREF ctx) ctx.pdata = ctx.pdata + %BLOCKSIZE next i bytesleft = bytesleft - maxstring loop until lastpass close infile : if err then goto SHA_File_Error '-- Copy current state from s&() to Hash$ 'for i = 0 to 7 : @Hash[i] = [email protected][i] : next i Hash$ = string$(%HASHLEN,0) phash = strptr(Hash$) pstate = ctx.pstate ! push esi ! push edi ! mov esi, pstate ;esi -> ctx.state(0) ! mov edi, phash #IF %RETURN_LITTLE_ENDIAN ! xor ecx, ecx LoopTop: ! mov edx, [esi+ecx*4+4] ! mov eax, [esi+ecx*4] ! bswap edx ! bswap eax ! mov [edi+ecx*4], edx ! inc ecx ! mov [edi+ecx*4], eax ! inc ecx ! test ecx, (%HASHLEN\4) ! jz LoopTop #ELSE ! mov ecx, (%HASHLEN\4) ! cld ! rep movsd #ENDIF ! pop edi ! 
pop esi Exit_SHA_File: function = ecode EXIT FUNCTION '============ SHA_File_Error: if err then ecode = errclear else ecode = -1 end if RESUME Exit_SHA_File END FUNCTION '========================= FUNCTION SHA512_MakePadding(BYVAL TotalBytes AS DWORD) AS STRING '-- Creates the necessary string to append to targeted data buffer REGISTER i AS LONG, padBytes AS LONG LOCAL buffBits AS QUAD, padding$ LOCAL pbyte1, pbyte2 AS BYTE PTR buffBits = TotalBytes * 8 padding$ = nul$(16) pbyte1 = strptr(padding$)+8 : pbyte2 = varptr(buffBits) '-- Reverse bytes during copy for i = 0 to 7 @pbyte1[i] = @pbyte2[7 - i] next i padBytes = %BLOCKSIZE - ((TotalBytes+17) AND (%BLOCKSIZE-1)) function = chr$(&h80) + nul$(padBytes) + padding$ END FUNCTION '=================== FUNCTION HasSSE2() AS LONG ! mov eax, 1 ! cpuid ! xor eax, eax ! test edx, &h04000000 ;bit 26 ! setnz al ;rem to force downgrade to MMX ! mov function, eax END FUNCTION '=================== FUNCTION HasMMX() AS LONG ! mov eax, 1 ! cpuid ! xor eax, eax ! test edx, &h800000 ;bit 23 ! setnz al ;rem to force downgrade to 32-bit ! mov function, eax END FUNCTION '-- END SHA512a.INC ---------------------------------------------------
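For readers who want a quick way to try the include file, here is a minimal calling sketch for SHA512_Buffer, written only from the declaration above. It assumes SHA512a.INC is in the include path and defines %HASHLEN. The input string, the variable names (msg, hash, hexText), and the hex-formatting loop are illustrative and not part of the original code. The byte order of the result depends on how %RETURN_LITTLE_ENDIAN is set in the include file, so no expected output is shown.

```powerbasic
#COMPILE EXE
#INCLUDE "SHA512a.INC"

FUNCTION PBMAIN() AS LONG
  LOCAL msg AS STRING, hash AS STRING, hexText AS STRING
  LOCAL i AS LONG

  msg  = "abc"                 'illustrative input only
  hash = NUL$(%HASHLEN)        'caller supplies a %HASHLEN-byte output buffer
  SHA512_Buffer STRPTR(msg), LEN(msg), STRPTR(hash)

  'Format each hash byte as two hex digits
  FOR i = 1 TO LEN(hash)
    hexText = hexText + HEX$(ASC(hash, i), 2)
  NEXT i

  #IF %DEF(%PB_CC32)
    PRINT hexText              'console compiler
  #ELSE
    MSGBOX hexText             'PBWIN
  #ENDIF
END FUNCTION
```

SHA512_File follows the same pattern, except that it takes a file name and a string variable, resizes the string itself, and returns 0 on success or a PB error code.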