Announcement

Collapse
No announcement yet.

Rnd2 disscussion

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • David Roberts
    replied
    It is looking good John.

    We must make sure that if folk use Rnd2WC.inc then Rnd2TidyUp is required whether Case 7 is used or not.

    We may finally have arrived at grandma's house.
    Yeah, but we didn't tell her when though, did we?

    Suppose we have an application which uses random numbers but insists on using sequences that have never seen the light of day before.

    Enter two new macros ...
    Code:
    Macro Rnd2SaveSeq = Rnd2(9)  ' Save current sequence
    Macro Rnd2LoadSeq = Rnd2(10) ' Load saved sequence
    and two new Cases ...
    Code:
    Case 9
      Open "Rand.dat" For Binary As #1
        Put$ #1, StoreState
      Close #1
     
    Case 10
      Open "Rand.dat" For Binary As #1
        Get$ #1, 60, StoreState
      Close #1
    In use ...
    Code:
    Rnd2SetUp ' If Rnd2WC.inc [b]and[/b] Case 7 are being used
     
    If IsFile("Rand.dat") = %False Then
      Rnd2SaveSeq
    End If
     
    Rnd2LoadSeq
     
    ' Do some work
    '-----------------------------------
    pStr = ""
    For i = 1 To 20
      pStr = pStr + Str$(Rnd2()) + $CrLf
    Next
    MsgBox pStr
    '-----------------------------------
     
    Rnd2SaveSeq
     
    Rnd2TidyUp ' If Rnd2wC.inc is being used
    Of course, all of the other generators may be used and provided that 'Rnd2SaveSeq' is used before doing so we can go back to the 'primary' generator with 'Rnd2LoadSeq' to maintain the protocol of "using sequences that have never seen the light of day before."

    I'm still kicking this around.

    Perhaps we should forget about anything new and put this on the back burner for 'Rnd2 revisited again'.

    Leave a comment:


  • John Gleason
    replied
    I'm thinking it's gonna be tough to improve on this. We may finally have arrived at grandma's house.

    I put all the code together into an include file using self-initialization and crypto availability, and am getting good speed too:
    Code:
    RND:           1.000  Reference 2571 ms
    Rnd2 [0,1):    0.722
    Rnd2 (-1,1):   0.728
    Crypto(-1,1):  1.056 x50
    Wheel(-1,1):   1.075 x50
    *****Integer Mode*****
    RND(1, 52):   1.585
    Rnd2(1, 52):  0.747
    RND(-2147483648, 2147483647):   1.663
    Rnd2(-2147483648, 2147483647):  0.789
    Rnd2WC.inc
    Code:
    'Rnd2WC.inc (Rnd2 WildWheel and Crypto)
    'Default configuration uses auto-initialization.
    'This include file is self-contained and doesn't require win32api.inc
    
    TYPE OSVERSIONINFO
      dwOSVersionInfoSize AS DWORD
      dwMajorVersion AS DWORD
      dwMinorVersion AS DWORD
      dwBuildNumber AS DWORD
      dwPlatformId AS DWORD
      szCSDVersion AS ASCIIZ * 128  ' Maintenance string for PSS usage
    END TYPE
    
    DECLARE FUNCTION FreeLibrary LIB "KERNEL32.DLL" ALIAS "FreeLibrary" (BYVAL hLibModule AS DWORD) AS LONG
    DECLARE FUNCTION GetProcAddress LIB "KERNEL32.DLL" ALIAS "GetProcAddress" _
          ( BYVAL hModule AS DWORD, lpProcName AS ASCIIZ ) AS LONG
    DECLARE FUNCTION GetVersionEx LIB "KERNEL32.DLL" ALIAS "GetVersionExA" (lpVersionInformation AS ANY) AS LONG
    DECLARE FUNCTION LoadLibrary LIB "KERNEL32.DLL" ALIAS "LoadLibraryA" (lpLibFileName AS ASCIIZ) AS LONG
    DECLARE FUNCTION QueryPerformanceCounter LIB "KERNEL32.DLL" ALIAS "QueryPerformanceCounter" _
          ( lpPerformanceCount AS QUAD) AS LONG
    DECLARE FUNCTION CryptAcquireContext LIB "advapi32.dll" ALIAS "CryptAcquireContextA" _
          ( hProv AS LONG, zContainer AS ASCIIZ, zProvider AS ASCIIZ, _
             BYVAL dwProvType AS DWORD, BYVAL dwFlags AS DWORD ) AS LONG
    DECLARE FUNCTION CryptReleaseContext LIB "advapi32.dll" ALIAS "CryptReleaseContext" _
          ( BYVAL hProv AS LONG, BYVAL dwFlags AS DWORD ) AS LONG
    DECLARE FUNCTION CryptGenRandom LIB "advapi32.dll" ALIAS "CryptGenRandom" _
          ( BYVAL hProv AS LONG, BYVAL dwLen AS DWORD, pbBuffer AS BYTE) AS LONG
    DECLARE FUNCTION RtlGenRandom( RandomBuffer AS BYTE, BYVAL RandomBufferLength AS DWORD ) AS LONG
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ' Rnd2 Macros
    MACRO Rnd2mize           = Rnd2(1)                       'WildWheel© seed the random number generator
    MACRO Rnd2mizeCrypto     = Rnd2(2)                       'cryptographically seed the random number generator
    MACRO Rnd2Long           = Rnd2(-2147483648, 2147483647) 'commonly used full LONG range
    MACRO Rnd2SetUp          = Rnd2(7)
    MACRO Rnd2TidyUp         = Rnd2(8)
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ' Rnd2 Additional Macros
    MACRO Rnd2LastNum        = Rnd2(0)                       'get last random generated.
    MACRO Rnd2Default        = Rnd2(3)                       'set or reset to default sequence
    MACRO Rnd2Redo           = Rnd2(4)                       'repeat from beginning of seq, or last bookmarked position
    MACRO Rnd2Mark           = Rnd2(5)                       'save position in a sequence
    MACRO Rnd2TogSign        = Rnd2(6)                       'turn on/off EXT random range -1 to 1 exclusive.
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    $MS_DEF_PROV            = "Microsoft Base Cryptographic Provider v1.0"
    %PROV_RSA_FULL          = 1
    %CRYPT_VERIFYCONTEXT    = &hF0000000
    %XP                     = 1
    %NULL                   = 0
    %VER_PLATFORM_WIN32_NT  = 2
    
    FUNCTION GetCryptoBytes( BYVAL Which AS LONG, BYVAL hProvProc AS LONG, x AS EXT ) AS LONG
      DIM BinaryByte(9) AS STATIC BYTE
      STATIC hexSize AS DWORD
      hexSize = 10
      IF Which = %XP THEN
        CALL DWORD hProvProc USING RtlGenRandom( BinaryByte(0), hexSize )
      ELSE
        CryptGenRandom( hProvProc, hexSize, BinaryByte(0) )
      END IF
      x = PEEK( EXT, VARPTR( BinaryByte(0)))
    END FUNCTION
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    FUNCTION Rnd2( OPT One AS LONG, Two AS LONG ) AS EXT
     #REGISTER NONE
     STATIC storeState AS STRING * 60   'mwc means "multiply with carry" random generator based on theory
                                        'by George Marsaglia and is the fastest generator known to man. A man. Me. ;)
     STATIC Buffer AS EXT
     STATIC os AS OSVERSIONINFO, strOS AS STRING
     STATIC CryptoXP AS LONG
     STATIC hProvProc, hLib AS DWORD
    
      !lea esi, storeState      ;all our variables are now in one declare: storeState
    '---------------------------------------------------------------------------------------------------------------
    '---------------- If initializing with CASE 7, comment out the below code --------------------------------------
      !cmp dword ptr[esi+00], 0 ;is mwcSoph 0? if so, this is 1st time thru & storeState must be filled with data
      !jne       passOneTime
         GOSUB gsDefaultState
         GOSUB gsStoreSeq
         ' Crypto initialization
         !push esi
         os.dwOSVersionInfoSize=SIZEOF(os)
         GetVersionEx os
         IF os.dwPlatformId = %VER_PLATFORM_WIN32_NT THEN
           strOS = TRIM$(STR$(os.dwMajorVersion)) & "." & TRIM$(STR$(os.dwMinorVersion))
           IF strOS > "5.0" THEN ' we have XP/Vista/Server 2003 or 2008
             CryptoXP = %XP
           ELSE
             CryptoXP = 0
           END IF
         ELSE
           CryptoXP = 0
         END IF
    
         IF CrypToXP = %XP THEN
           hLib = LoadLibrary( "advapi32.dll")
           hProvProc = GetProcAddress(hLib, "SystemFunction036")
         ELSE
           CryptAcquireContext( hProvProc, BYVAL %Null, BYVAL %Null, %PROV_RSA_FULL, %CRYPT_VERIFYCONTEXT )
           CryptAcquireContext( hProvProc, BYVAL %Null, $MS_DEF_PROV, %PROV_RSA_FULL, %CRYPT_VERIFYCONTEXT )
         END IF
         !pop esi
         ' End Crypto initialization
      passOneTime:
    '---------------- If initializing with CASE 7, comment out the above code --------------------------------------
    '---------------------------------------------------------------------------------------------------------------
    
      !lea eax, Two
      !mov ecx, [eax]
      !cmp ecx, 0
      !jne twoParams      ;if optional parameter Two passed, belt down to integer Rnd2 code section
      !lea eax, One
      !mov ecx, [eax]
      !cmp ecx, 0
      !je noParams        ;if no params, boogie down to noParams code
    
      'we got here so there must be 1 parameter
    
            SELECT CASE AS LONG One
    
              CASE 2
              reSeedCrypt:
                !push esi
                GetCryptoBytes(CryptoXP, hProvProc, Buffer)
                !pop esi
                !lea ecx, Buffer
                !mov eax, [ecx]
                !mov [esi+4], eax                 ; mwcRandom
                !mov eax, [ecx+4]
                !mov [esi+8], eax                 ; mwcCarry
                !movzx eax, Word Ptr[ecx+8]       ; the last two Of the 10 collected
                !mov ecx, 384                     ; 384 possible sophie-germain multipliers
                !mov edx, 0                       ; Clear edx For division
                !div ecx                          ; Random remainder (Mod 384) In edx now
                !lea ecx, multiplier              ; Pointer To 384 multipliers
                !mov ecx, [ecx+edx*4]             ; got Rnd soph germain multiplier now
                !mov [esi], ecx                   ; mwcSoph
                !Sub ecx, 1                       ; mwcSoph - 1
                !cmp [esi+8], ecx                 ; does mwcCarry = mwcSoph - 1
                !je mwcCarrySophFail              ; yes
                !cmp Dword Ptr[esi+8], 0          ; Is mwcCarry = 0
                !jne OK                           ; no, Not interested In mwcRandom now
                !cmp Dword Ptr[esi+4], 0          ; Is mwcRandom = 0
                !je reSeedCrypt                   ; we now have both mwcCarry & mwcRandom = 0
                !jmp OK                           ; mwcCarry = 0 but mwcRandom Is Not
              mwcCarrySophFail:
                !cmp Dword Ptr[esi+4], &hffffffff ; does mwcRandom = &hffffffff
                !je reSeedCrypt                   ; mwcCarry = mwcSoph - 1 & mwcRandom = &hffffffff
              OK:
                GOSUB gsStoreSeq
    
              CASE 7
                !lea esi, storeState      ;All our variables are now In one Declare: storeState
                GOSUB gsDefaultState
                GOSUB gsStoreSeq
                ' Crypto initialization
                !push esi
                os.dwOSVersionInfoSize=SIZEOF(os)
                GetVersionEx os
                IF os.dwPlatformId = %VER_PLATFORM_WIN32_NT THEN
                  strOS = TRIM$(STR$(os.dwMajorVersion)) & "." & TRIM$(STR$(os.dwMinorVersion))
                  IF strOS > "5.0" THEN ' we have XP/Vista/Server 2003 or 2008
                    CryptoXP = %XP
                  ELSE
                    CryptoXP = 0
                  END IF
                ELSE
                  CryptoXP = 0
                END IF
    
                IF CrypToXP = %XP THEN
                  hLib = LoadLibrary( "advapi32.dll")
                  hProvProc = GetProcAddress(hLib, "SystemFunction036")
                ELSE
                  CryptAcquireContext( hProvProc, BYVAL %Null, BYVAL %Null, %PROV_RSA_FULL, %CRYPT_VERIFYCONTEXT )
                  CryptAcquireContext( hProvProc, BYVAL %Null, $MS_DEF_PROV, %PROV_RSA_FULL, %CRYPT_VERIFYCONTEXT )
                END IF
                !pop esi
      
              CASE 8
                IF CryptoXP = %XP THEN
                  FreeLibrary hLib
                ELSE
                  CryptReleaseContext hProvProc, 0
                END IF
      
              CASE 1          ' Rnd2mize. tot possible unique sequences = &hffffffff * &hedffffff * 384
                              ' which = 6.6 * 10^21. This excess of initial sequences allows you to
                              ' freely use Rnd2mize/Rnd(1) at will to give WildWheel numbers.
                 GOSUB gsTSCplusQPC      'add time stamp counter and queryPerfCounter, both !ror 1
                 !mov ecx, 384           ;384 possible sophie-germain multipliers
                 !mov edx, 0             ;clear edx for division
                 !div ecx                ;random remainder (MOD 384) in edx now
                 !lea ecx, multiplier    ;codeptr to 384 multipliers
                 !mov ecx, [ecx+edx*4]   ;got rnd soph germain multiplier now
                 !mov [esi+00], ecx       ;save it
                 SLEEP 0                 'considered QPC too because rarely, this can slow 20x. Can be like idle priority.
                 GOSUB gsTSCplusQPC      'add time stamp counter and queryPerfCounter, after both !ror 1
                 !mov [esi+04], eax      ;save. Now make sure mwcCarry <> mwcSoph(from above) - 1 or 0.
                '---------------------
                 !cmp dword ptr[esi+08], 0        ;if value is already present, use it as a "memory" of previous loops
                 !ja short mwcCarNot0
                 SLEEP 0
                 GOSUB gsTSCplusQPC      'add time stamp counter and queryPerfCounter, both !ror 1
                 !mov [esi+08], eax      ;4 billion+ carrys now possible
                 !jnc short mwcCarNot0   ;ok if <> 0 after add, but if there's a carry, add a constant to be sure it's <> 0
                 !mov dword ptr[esi+08],&h1234abcd;it was zero! so make it your choice of a constant
               mwcCarNot0:
                 !mov ecx, [esi+00]
                 !sub ecx, 1             ;mwcSoph - 1
                 !cmp [esi+08], ecx      ;is carry >= mwcSoph - 1 ?
                 !jb  short mwcAllOk     ;if not, we're done
                 !and dword ptr[esi+08],&h7fffffff;make it less than mwcSoph - 1
               mwcAllOk:
                                ' Shuffle mwcRandom a bit more and shuffle mwcCarry
                 !mov eax, [esi+00]   ;'(mwcSoph * 2^31 - 1) & (mwcSoph * 2^32 - 1) are both prime (Sophie Germain prime).
                 !mov ecx, [esi+04]   ;'Get previous random
                 !mul ecx             ;'Multiply mwcSoph * mwcRandom
                 !ADD eax, [esi+08]   ;'Add previous carry
                 !adc edx, 0          ;'Add possible carry bit from low dword addition
                 !mov [esi+04], eax   ;'save our 32-bit random value for next round
                 !mov [esi+08],  edx  ;'saving carry for next round
                 GOSUB gsStoreSeq
    
              CASE 0            ' Repeat the last number generated
                !fld tbyte [esi+34]  ;load rndE
                !fstp FUNCTION       ;pop it to FUNCTION
    '           above 2 asm statements do this: FUNCTION = rndE ' will, of course, be zero if no rndE calculated yet
                EXIT FUNCTION
    
              CASE 6           'make it 50/50 + and -
                 !not dword ptr[esi+56] ;togScope = NOT togScope
    
              CASE <= -1       ' Guarrantees a non-duplicate sequence for each integer -1 thru -2147483648
                               ' so you can select your own repeatable sequences from the 2Gig available. This gives a
                               ' similar functionality to the PB statement "RANDOMIZE number" but without dupe sequences.
                !push esi
                POKE VARPTR(storeState), PEEK(DWORD, CODEPTR(multiplier) - (One MOD 384) * 4)
                POKE VARPTR(storeState) + 4, &h55b218da - One
                POKE VARPTR(storeState) + 8, &h3fe8700c
                !pop esi
                'the above 3 POKE statements do this:
    '            mwcSoph = PEEK(DWORD, CODEPTR(multiplier) - (One MOD 384) * 4)
    '            mwcRandom = &h55b218da - One 'assures maximum possible 2147483648 unique sequences
    '            mwcCarry  = &h3fe8700c
                GOSUB gsStoreSeq
                !jmp noParams
    
              CASE 3                ' Rnd2Default seq
                 GOSUB gsDefaultState
                 GOSUB gsStoreSeq
    
              CASE 4                ' Rnd2Redo  repeat from beginning of seq, or last bookmarked position
                                    ' read saved values from storeState string.
                 !mov ecx, [esi+44]         ;mwcSoph
                 !mov edx, [esi+48]         ;mwcRandom
                 !mov [esi+00], ecx         ;Save mwcSoph In storeState str.
                 !mov ecx, [esi+52]         ;mwcCarry to ecx
                 !mov [esi+04], edx         ;Save mwcRandom
                 !mov [esi+08], ecx         ;Save mwcCarry
                  'the above asm does this:
    '              mwcSoph    = PEEK(DWORD, VARPTR(storeState    ))
    '              mwcRandom  = PEEK( LONG, VARPTR(storeState) + 4)
    '              mwcCarry   = PEEK( LONG, VARPTR(storeState) + 8)
    
              CASE 5                ' Rnd2Mark is like a bookmark
                GOSUB gsStoreSeq
    
              CASE ELSE
                GOTO noParams          ' + number > 6, so just generate another EXT rnd
    
            END SELECT
    
          FUNCTION = 0 ' OK
          EXIT FUNCTION
    
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      twoParams:   ' both optional parameters passed so it's an integer range which can be quickly calculated...
                   ' note: if One=Two, then implied 'division' gets us back to One result
                               ' Get a 32-bit random value.
          !mov eax, [esi+00]   ;'(mwcSoph * 2^31 - 1) & (mwcSoph * 2^32 - 1) are both prime (Sophie Germain prime).
          !mov ecx, [esi+04]   ;'Get previous random
          !mul ecx             ;'Multiply mwcSoph * mwcRandom
          !ADD eax, [esi+08]   ;'Add previous carry
          !adc edx, 0          ;'Add possible carry bit from low dword addition
          !mov [esi+04], eax   ;'save our 32-bit random value for next round
          !mov [esi+08], edx   ;'saving carry for next round
                                'got random 32 bits
          !mov edx, eax        ;'<<<< hold 32-bit random value in EDX
          !mov ecx, One        ;'NOW, evaluate parameters. Get byref address to parameter One
          !mov eax, Two        ;'Get byref address to parameter Two
          !mov ecx, [ecx]      ;'dereference One.  Expected to be LOWER
          !mov eax, [eax]      ;'dereference Two
          !cmp ecx, eax        ;'is One > Two?
          !jl short Now1LT2    ;'jump over swap, already Two > One
          !xchg eax, ecx       ;'Had to switch them, now ecx has LOWER bound as it should
    
         Now1LT2:
          !SUB eax, ecx        ;'now eax is range    [ RANGE (before +1) ]
          !inc eax             ;'add 1 to range. Result is correct for two params, and 0 if max range
          !jz short doTheRnd   ;'jump if we incremented &hFFFFFFFF up to 0; we have maximum range
         'rngNotMax:             At this point ECX = lower,  EAX = range, EDX = random
          !mul edx             ;'random * range+1 == edx:eax. Use edx as result as if /(2^32).
          !add edx, ecx        ;'add lower bound to rand result in edx
        doTheRnd:
          !mov [esi+22], edx      ;'store dword result as signed long integer. Need this to load FPU.
          !fild dword ptr[esi+22] ;'signed integer load our result to the FPU
          !fld st(0)              ;'create duplicate of rndL in FPU
          !fstp tbyte [esi+34]    ;'pop extended result to rndE, in case of later Rnd2redo
          !fstp FUNCTION          ;'pop extended result for function result
          EXIT FUNCTION
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    'GOTO noParams   'doesn't get executed, but may help PB w/ alignment
      noParams:  'No parameters, so calc extd rnd2. (v.GDR20081106a)
        !mov eax, [esi+00]      'eax << soph
        !mul dword ptr[esi+04]  'edx:ead << soph * rand0
        !ADD eax, [esi+08]      'eax << eax + carry0 ; this is rand1
        !mov [esi+26], eax      'eq(low) << rand1
        !adc edx, 0             'edx << edx + cf ; this is carry1
        !mov ecx, edx           'ecx << edx ; holding our carry1
        !mul dword ptr[esi+00]  'edx:eax << rand1 * soph
        !ADD eax, ecx           'eax << eax + carry1 ; this is rand2
        !adc edx, 0             'edx << edx + cf ; this is carry2
        !mov [esi+04], eax      'mwcRandom << rand2
        !lea ecx, [esi+26]      'load address for eq ; this is fastest sequence for P5
        !mov [ecx+4], eax       'eq(high) << rand2
        !mov [esi+08], edx      'mwcCarry << carry2
        ''' end of optimized 2x thru mwc w/ saves to mwcRandom, mwcCarry, and eq.
    
      okQuadRnd:                'now we have a complete rand quad in eq
        !fild qword [esi+26]      ;float load eq which is now a complete quad random
        !cmp dword ptr[esi+56], 0 ;do we need to make it 50/50 + and - ?
        !jne short rndNeg         ;jump if we do.
        !fabs                     ;make it +
      rndNeg:
        '---------------------  'now take the quad and divide by 9223372036854775816, and it perfectly ranges from [0,1) or (-1,1)
        !fld tbyte [esi+12]     ;float load ...-36
        !fmulp st(1),st(0)      ;eq * ...-36 to range rnd to [0,1). This = eq / 9223372036854775816 but is much faster
        !fld st(0)              ;copy final answer
        !fstp tbyte [esi+34]    ;save copy
        !fstp FUNCTION          ;save FUNCTION
    
    EXIT FUNCTION
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     gsTSCplusQPC:       'Performs time stamp cntr + queryPerformCntr. gs prefix means this is a GOSUB.
       !cpuid            ;serialize
       !dw &h310f        ;read time stamp counter
       !ror eax, 1       ;smooth because some processors are always even low bit
       !mov [esi+22],eax ;save in rndL because QPC overwrites registers
       !push esi
        QueryPerformanceCounter PEEK(QUAD, VARPTR(storeState) + 26)
       !pop esi
       !mov eax, [esi+26];LO dword
       !ror eax, 1       ;smooth, low bit might always be even on some cpu's
       !add eax, [esi+22];smoothed QPC + saved smoothed TSC
     RETURN
    
     gsStoreSeq: ' 'save original starting point of sequence in storeState string.
                 !mov ecx, [esi+00]         ;mwcSoph
                 !mov edx, [esi+04]         ;mwcRandom
                 !mov [esi+44], ecx         ;Save mwcSoph In storeState str.
                 !mov ecx, [esi+08]         ;mwcCarry to ecx
                 !mov [esi+48], edx         ;Save mwcRandom
                 !mov [esi+52], ecx         ;Save mwcCarry
     RETURN
    
     gsDefaultState: 'Initialize generator and divisor factor. gs prefix means this is a GOSUB.
                 !mov dword ptr[esi+00], 4221732234   ; mwcSoph
                 !mov dword ptr[esi+04], &ha5b218da   ; mwcRandom can be any LONG except &hffffffff and 0
                 !mov dword ptr[esi+08], &h3fe8700c   ; mwcCarry can be any number except (mwcSoph-1) and 0
                 !mov dword ptr[esi+12], &hffffffef   ; factor1 space to hold the constant 1084202172485504433 * 10E-36 (yes I even made it 19 digits)
                 !mov dword ptr[esi+16], &hffffffff   ; factor1 ...-36 save in rndE. This is a binary image of the extended precision ...-36
                 !mov dword ptr[esi+20], &h3fbf       ; factor1 only 2 bytes, but just move whole dword for speed
             'rndL is dword ptr[esi+22] LONG
             '  eq is dword ptr[esi+26] QUAD
             'rndE is dword ptr[esi+34] EXT
       'storeState is dword ptr[esi+44] STRING * 12
         'togScope is dword ptr[esi+56] LONG
     RETURN
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    '#align 4
    multiplier: 'a bunch of Sophie Germain primes. Each makes its own unique sequence of ~2^63 LONG random values
    !DD 4055108289, 4183291290, 4066422669, 4010830218, 4144557798, 4047225099, 4169878863, 4156378278
    !DD 4068734223, 4013003148, 4128794349, 4044603045, 4147482834, 4050081738, 4169007204, 4084483623
    !DD 4182936234, 4061167764, 4116557445, 4184835774, 4098609075, 4000058700, 4005596580, 4131991143
    !DD 4026365904, 4082490609, 4170263943, 4064971044, 4192040679, 4069686423, 4112450355, 4116008373
    !DD 4051352658, 4131639393, 4026209880, 4143019908, 4057560153, 4153038378, 4178347353, 4101943515
    !DD 4163817693, 4126770675, 4122227184, 4150506573, 4124871525, 4097114355, 4171215009, 4094254353
    !DD 4185190458, 4184112513, 4187782989, 4037092584, 4114448259, 4096721880, 4003880118, 4035500259
    !DD 4080989598, 4090215738, 4104202098, 4144153608, 4027213065, 4112123319, 4029634383, 4188620745
    !DD 4003957254, 4158202674, 4165028370, 4101889029, 4064867064, 4056294705, 4117302630, 4094813610
    !DD 4089078504, 4072584339, 4075250574, 4144182519, 4020827805, 4077052605, 4012941570, 4114015830
    !DD 4015303260, 4012049835, 4031934513, 4123667379, 4025171265, 4149021864, 4020494469, 4152989853
    !DD 4141465314, 4050172164, 4130534940, 4124347128, 4155032220, 4123523313, 4038610005, 4066391700
    !DD 4052359893, 4138494750, 4046848368, 4015233183, 4065337650, 4181156010, 4149686553, 4115669703
    !DD 4080411408, 4029985884, 4072279314, 4136476293, 4102312674, 4148638644, 4020161274, 4056852945
    !DD 4084467288, 4090139205, 4152479904, 4129623354, 4189154793, 4042650633, 4113056934, 4070634510
    !DD 4172190345, 4012616748, 4092782529, 4042027470, 4034320863, 4017110193, 4128178095, 4005317820
    !DD 4121565819, 4160465475, 4093432608, 4094047308, 4092039654, 4132108680, 4160799915, 4109110719
    !DD 4190254803, 4063105479, 4123739478, 4086096945, 4113466908, 4169157873, 4036670034, 4035486873
    !DD 4154194098, 4074334704, 4006945965, 4119880785, 4050935955, 4131729105, 4170646809, 4191996963
    !DD 4055775498, 4029162399, 4118132214, 4116397584, 4121266560, 4102454433, 4146555864, 4103353149
    !DD 4119974010, 4080379233, 4192378968, 4061071950, 4104928533, 4042978743, 4188739878, 4066717740
    !DD 4017709695, 4027617453, 4110604308, 4107339654, 4076278878, 4077074274, 4097495403, 4179562659
    !DD 4187765853, 4187454249, 4015793904, 4083863454, 4078492929, 4166495943, 4101303048, 4149525330
    !DD 4095286830, 4078227909, 4189944624, 4010811645, 4032304584, 4151394078, 4044317298, 4136517915
    !DD 4198354635, 4192501860, 4073134869, 4060180830, 4076815050, 4190613315, 4142749785, 4122567564
    !DD 4071542523, 4024430004, 4122798648, 4041267495, 4006243575, 4092566124, 4141397349, 4175565558
    !DD 4159829190, 4173505479, 4084339563, 4085131608, 4081507743, 4069428324, 4011038568, 4092438129
    !DD 4005482298, 4020895359, 4127615184, 4162803795, 4038272028, 4123171464, 4199942199, 4067713245
    !DD 4129181838, 4021766328, 4141102845, 4002607668, 4051580310, 4082443044, 4078962945, 4072199883
    !DD 4180693749, 4040763375, 4025696004, 4066226853, 4013137770, 4084688994, 4081465923, 4185884010
    !DD 4184193840, 4095653625, 4071642489, 4003011123, 4021708860, 4038391383, 4003548888, 4016275635
    !DD 4051483344, 4052001093, 4131504594, 4129105653, 4187278653, 4058921709, 4167113355, 4106971188
    !DD 4074045393, 4069825200, 4009724565, 4120937589, 4119577560, 4151390115, 4000637598, 4088788530
    !DD 4014859458, 4003633353, 4192075623, 4009856424, 4048255155, 4100175633, 4129717695, 4012882215
    !DD 4119226824, 4122492603, 4074693864, 4062187338, 4022104890, 4186039455, 4191285474, 4165800789
    !DD 4047934929, 4045886208, 4028478450, 4098395724, 4095869853, 4004229753, 4110500373, 4188458055
    !DD 4093944063, 4122368673, 4136075109, 4024434645, 4145270010, 4121262090, 4051650480, 4076720613
    !DD 4057135713, 4053301650, 4074379569, 4103950185, 4146078999, 4029125490, 4036104003, 4122595203
    !DD 4173008610, 4155931704, 4048316175, 4178853645, 4049069715, 4187855514, 4193714559, 4132340133
    !DD 4001184978, 4087342068, 4038996009, 4032782589, 4103313705, 4057212699, 4094324010, 4117022988
    !DD 4016133978, 4057176333, 4081210119, 4183410330, 4054406019, 4008415374, 4131217578, 4049176725
    !DD 4033804230, 4154677353, 4194818769, 4057689999, 4065887250, 4083913149, 4160269749, 4148719650
    !DD 4086572148, 4079152770, 4198797849, 4025836533, 4121774838, 4114818903, 4193265369, 4005720123
    !DD 4172736744, 4113446385, 4153872675, 4022863908, 4169665353, 4080875223, 4148976378, 4158173325
    !DD 4012107315, 4146530883, 4042645638, 4189878099, 4075365840, 4053276279, 4112504730, 4144260888
    !DD 4102144035, 4181673825, 4171915968, 4123257354, 4032551355, 4054454535, 4132616253, 4057321905
    !DD 4174490559, 4165419468, 4169862234, 4116771594, 4009920498, 4164231630, 4163597154, 4181713095
    !DD 4000268439, 4077171264, 4045424718, 4116626304, 4052701140, 4140380880, 4027965249, 4102323183
    
    END FUNCTION

    Leave a comment:


  • David Roberts
    replied
    I have noticed that the relationship between RND and its integer mode varies with PB8 and cores used so I've changed the reference to its opposite number. eg RND(1,52) is a reference for Rnd2(1,52). The actual times have been dropped - mean nothing in general.
    Code:
               PB8      PB9
     
    Single   (0.821)  (0.765)
             (0.680)  (0.549)
             (0.640)  (0.550)
     
    Dual     (0.694)  (0.762)
             (0.512)  (0.610)
             (0.375)  (0.625)
    This highlights even more PB9 beating PB8 in single core mode especially with integer mode and PB8 beating PB9 in dual core mode, again especially with integer mode.

    I wonder why PB9 should behave so with dual core.

    Interestingly, there is nothing between PB9 Rnd2() single and dual core.

    The above conclusions cannot, of course, be carried over to other applications without a lot of testing.
    Last edited by David Roberts; 10 Nov 2008, 01:02 PM.

    Leave a comment:


  • David Roberts
    replied
    John, I am not looking for a Crypto by default function.

    The arguments against re-seeding diminish as the period increases and more so with the availability of 384 multipliers.

    I find myself less against Rnd2(1) within a loop. I cannot object, then to Rnd2(2) within a loop either. Users will do what they want whether I object or not. Folks with XP/Vista and so on will benefit by using Rnd2(2) within a loop because the faster Crypto code is twice as fast as Rnd2(1). Folks without XP and so on will benefit from Rnd2(1) within a loop because it is twice as fast as the slower Crypto code.

    In the latter case a function without Crypto makes sense and in the former case a function without Wheel makes sense.

    However, if a user wants to use Crypto in the classic seeding sense they cannot with a Wheel only function and vice versa.

    By having both Wheel and Crypto on board users have all the options available to them.

    Tidying up will be required in any event because the Crypto will be set up even if not used.

    If initialization is done in PBMain then a SetUp in PbMain is required as well.

    I can, however, see your point about having a function which requires nothing to be added to PBMain. In this case Crypto needs to be removed.

    With a Crypto version I'd still want Wheel to be included. A tidying up will be required whether Wheel is used or not so why not leave it in. Folks without XP and so on will have access to the faster method if they want to re-seed within a loop and need the extra speed.

    I have then, by argument, ended up with a Wheel only version and a Wheel/Crypto version.

    With the Wheel/Crypto version since we must add a tidying up then it is no big deal to add a set up as well ie initialization outside of Rnd2. I need to do some tests with and without.

    Added: Initialisation with PBMain is 6% faster. That is not a big jump but if we have to include a tidy up we may as well have a set up and get the 6%.
    Last edited by David Roberts; 10 Nov 2008, 12:13 PM. Reason: Spelling

    Leave a comment:


  • John Gleason
    replied
    Dave, that's a ton of work you've done on the crypto seed which now works everywhere, and my intention is not to choke it out of the include file. But to get the best of both worlds, I think the only way to do it is to have the minimal version in addition to the crypto version in the include file. I removed the crypto code from Rnd2min.inc not to permanently bury the crypto code, but rather to demo how the one-declare nearly all asm version can easily co-exist and run fast with max RND compatibility.

    Furthermore, the macros and function name "Rnd2" can use the crypto seed--which actually you and Gary preferred--by default. The minimum function version can go by the name... let's say "Rnd2w", a reference back to the WildWheel initialization. This makes the only difference between your .INC file and my proposed .INC file the addition of function Rnd2w. It will not affect Rnd2 in any way except # of code lines in the include file.

    Leave a comment:


  • David Roberts
    replied
    Just for the record this is what I'm now getting.

    The blocks are Rnd2(), Rnd2(1,52), Rnd2Long.

    Code:
                   PB8               PB9
     
    Single   267.17ms (0.821)  249.11ms (0.766)
             323.02ms (0.992)  262.79ms (0.766)
             309.81ms (0.952)  266.97ms (0.821)
     
    Dual     226.28ms (0.691)  245.66ms (0.758)
             240.23ms (0.733)  290.65ms (0.895)
             180.87ms (0.552)  301.53ms (0.928)
    I had some strange results and contradictions when Crypto was in object mode.

    In Single Core operation PB9 now beats PB8.

    In Dual Core operation PB8 has the edge.

    Your esi approach, John, gave a boost.

    No alignment tweaking has been considered yet but I have a feeling that no great gains will be got there now.

    Regardless of how we slice the cake all relatives are less than 1.000. This was never the aim; period and resolution was. To achieve the aim and beat RND in the speed stakes as well is quite an achievement and my hat goes off to John and Gary for their relentless pursuit in the asm domain. Well done, boys.

    Added: Previous nop tweaking with PB9/Dual which gave significant improvements is no longer beneficial. Thank goodness for that.
    Last edited by David Roberts; 10 Nov 2008, 07:56 AM.

    Leave a comment:


  • David Roberts
    replied
    .... and if you want to initialize from PBMain then remove the label passOneTime: and the code above it except '!lea esi, storeState' so it now looks like this:
    Code:
    !lea esi, storeState
    !lea eax, Two
    !mov ecx, [eax]
    !cmp ecx, 0
    .....
    .....
    Introduce Case 7:
    Code:
    Case 7
      !lea esi, storeState      ;All our variables are now In one Declare: storeState
      GoSub gsDefaultState
      GoSub gsStoreSeq
      ' Crypto initialization
      !push esi
      os.dwOSVersionInfoSize=SizeOf(os)
      GetVersionEx os
      If os.dwPlatformId = %VER_PLATFORM_WIN32_NT Then
        strOS = Trim$(Str$(os.dwMajorVersion)) & "." & Trim$(Str$(os.dwMinorVersion))
        If strOS > "5.0" Then ' we have XP/Vista/Server 2003 or 2008
          CryptoXP = %XP
        Else
          CryptoXP = 0
        End If
      Else
        CryptoXP = 0
      End If
     
      If CrypToXP = %XP Then
        hLib = LoadLibrary( "advapi32.dll")
        hProvProc = GetProcAddress(hLib, "SystemFunction036")
      Else
        CryptAcquireContext( hProvProc, ByVal %Null, ByVal %Null, %PROV_RSA_FULL, %CRYPT_VERIFYCONTEXT )
        CryptAcquireContext( hProvProc, ByVal %Null, $MS_DEF_PROV, %PROV_RSA_FULL, %CRYPT_VERIFYCONTEXT )
      End If
      !pop esi
    and add the macro 'Macro Rnd2SetUp = Rnd2(7)'

    In PBMain we now have

    InitializeTimer
    Rnd2SetUp
    .....
    .....
    .....
    Rnd2TidyUp

    Leave a comment:


  • David Roberts
    replied
    I think having a minimum code version like that below is useful because 1) It matches RND almost identically in usage, 2) The function is self-contained and portable, ie. it can stand alone without an include file if wanted, and 3) It's as fast as possible.
    Hmmm, yerrrs. In post #316 the Crypto code was effectively stunned within a #IF 0/#ENDIF block, in post #337 it was throttled in its sleep.

    I am, perhaps uncharacteristically, going to say no more.

    For those of you who would like a Crypto seed then read on.

    Add:
    Code:
    Static Buffer As Ext
    Static os As OSVERSIONINFO, strOS As String
    Static CryptoXP As Long
    Static hProvProc, hLib As Dword
    Add:
    Code:
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    $MS_DEF_PROV = "Microsoft Base Cryptographic Provider v1.0"
    %PROV_RSA_FULL = 1
    %CRYPT_VERIFYCONTEXT = &hF0000000
    %XP = 1
    
    Function GetCryptoBytes( ByVal Which As Long, ByVal hProvProc As Long, x As Ext ) As Long
      Dim BinaryByte(9) As Static Byte
      Static hexSize As Dword
      hexSize = 10
      If Which = %XP Then
        Call Dword hProvProc Using RtlGenRandom( BinaryByte(0), hexSize )
      Else
        CryptGenRandom( hProvProc, hexSize, BinaryByte(0) )
      End If
      x = Peek( Ext, VarPtr( BinaryByte(0)))
    End Function
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    At the head of Rnd2 between 'GOSUB gsStoreSeq' and passOneTime:

    Add:
    Code:
    ' Crypto initialization
    !push esi
    os.dwOSVersionInfoSize=SizeOf(os)
    GetVersionEx os
    If os.dwPlatformId = %VER_PLATFORM_WIN32_NT Then
      strOS = Trim$(Str$(os.dwMajorVersion)) & "." & Trim$(Str$(os.dwMinorVersion))
      If strOS > "5.0" Then ' we have XP/Vista/Server 2003 or 2008
        CryptoXP = %XP
      Else
        CryptoXP = 0
      End If
    Else
      CryptoXP = 0
    End If
     
    If CrypToXP = %XP Then
      hLib = LoadLibrary( "advapi32.dll")
      hProvProc = GetProcAddress(hLib, "SystemFunction036")
    Else
      CryptAcquireContext( hProvProc, ByVal %Null, ByVal %Null, %PROV_RSA_FULL, %CRYPT_VERIFYCONTEXT )
      CryptAcquireContext( hProvProc, ByVal %Null, $MS_DEF_PROV, %PROV_RSA_FULL, %CRYPT_VERIFYCONTEXT )
    End If
    !pop esi
    ' End Crypto initialization
    Add the following two CASEs
    Code:
    Case 2
    reSeedCrypt:
      !push esi
      GetCryptoBytes(CryptoXP, hProvProc, Buffer)
      !pop esi
      !lea ecx, Buffer
      !mov eax, [ecx]
      !mov [esi+4], eax                 ; mwcRandom
      !mov eax, [ecx+4]
      !mov [esi+8], eax                 ; mwcCarry
      !movzx eax, Word Ptr[ecx+8]       ; the last two Of the 10 collected
      !mov ecx, 384                     ; 384 possible sophie-germain multipliers
      !mov edx, 0                       ; Clear edx For division
      !div ecx                          ; Random remainder (Mod 384) In edx now
      !lea ecx, multiplier              ; Pointer To 384 multipliers
      !mov ecx, [ecx+edx*4]             ; got Rnd soph germain multiplier now
      !mov [esi], ecx                   ; mwcSoph
      !Sub ecx, 1                       ; mwcSoph - 1
      !cmp [esi+8], ecx                 ; does mwcCarry = mwcSoph - 1
      !je mwcCarrySophFail              ; yes
      !cmp Dword Ptr[esi+8], 0          ; Is mwcCarry = 0
      !jne OK                           ; no, Not interested In mwcRandom now
      !cmp Dword Ptr[esi+4], 0          ; Is mwcRandom = 0
      !je reSeedCrypt                   ; we now have both mwcCarry & mwcRandom = 0
      !jmp OK                           ; mwcCarry = 0 but mwcRandom Is Not
    mwcCarrySophFail:
      !cmp Dword Ptr[esi+4], &hffffffff ; does mwcRandom = &hffffffff
      !je reSeedCrypt                   ; mwcCarry = mwcSoph - 1 & mwcRandom = &hffffffff
    OK:
      GoSub gsStoreSeq
     
    Case 8
      If CryptoXP = %XP Then
        FreeLibrary hLib
      Else
        CryptReleaseContext hProvProc, 0
      End If
    John forgot to remove 'MACRO Rnd2mizeCrypto = Rnd2(2)'

    John is not using CASE 7 which, in my case, initializes in PBMain.

    Case 8 has been added and can be used via adding 'Macro Rnd2TidyUp = Rnd2(8)'

    Case 8 is required even though Case 7 is not used.

    In the case of XP/Vista/Server 03 or 08 the library should be unmapped and in other cases, since the CryptoAPI uses objects, they should be destroyed.
    Last edited by David Roberts; 10 Nov 2008, 06:12 AM. Reason: !push esi/!pop esi @ begin/end of Crypto initialization

    Leave a comment:


  • John Gleason
    replied
    Dave asked: Rnd2, the basic version, and Rnd2C the crypto version.
    Why? Post #327 removes the object oriented Crypto and replaces it with a procedural Crypto.
    I think having a minimum code version like that below is useful because 1) It matches RND almost identically in usage, 2) The function is self-contained and portable, ie. it can stand alone without an include file if wanted, and 3) It's as fast as possible.

    The speed increase from previous versions is okay, not great, but now this code has got to be just about at the speed limit.

    I've tested the heckjeebers out of it, and it matches the older include files exactly in output. The way to use queryPerfCnt and other PB commands without more declared variables is to PEEK and POKE them.

    Rnd2min.inc
    Code:
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ' Rnd2 Macros
    MACRO Rnd2mize           = Rnd2(1)                       'WildWheel© seed the random number generator.
    MACRO Rnd2mizeCrypto     = Rnd2(2)                       'cryptographically seed the random number generator
    MACRO Rnd2Long           = Rnd2(-2147483648, 2147483647) 'commonly used full LONG range
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ' Rnd2 Additional Macros
    MACRO Rnd2LastNum        = Rnd2(0)                       'get last random generated.
    MACRO Rnd2Default        = Rnd2(3)                       'set or reset to default sequence
    MACRO Rnd2Redo           = Rnd2(4)                       'repeat from beginning of seq, or last bookmarked position
    MACRO Rnd2Mark           = Rnd2(5)                       'save position in a sequence
    MACRO Rnd2TogSign        = Rnd2(6)                       'turn on/off EXT random range -1 to 1 exclusive.
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    FUNCTION Rnd2( OPT One AS LONG, Two AS LONG ) AS EXT
     #REGISTER NONE
     STATIC storeState AS STRING * 60   'mwc means "multiply with carry" random generator based on theory
                                        'by George Marsaglia and is the fastest generator known to man. A man. Me. ;)
      !lea esi, storeState      ;all our variables are now in one declare: storeState
      !cmp dword ptr[esi+00], 0 ;is mwcSoph 0? if so, this is 1st time thru & storeState must be filled with data
      !jne short passOneTime
         GOSUB gsDefaultState
         GOSUB gsStoreSeq
     passOneTime:
    
      !lea eax, Two
      !mov ecx, [eax]
      !cmp ecx, 0
      !jne twoParams      ;if optional parameter Two passed, belt down to integer Rnd2 code section
      !lea eax, One
      !mov ecx, [eax]
      !cmp ecx, 0
      !je noParams        ;if no params, boogie down to noParams code
    
      'we got here so there must be 1 parameter
    
            SELECT CASE AS LONG One
    
              CASE 1          ' Rnd2mize. tot possible unique sequences = &hffffffff * &hedffffff * 384
                              ' which = 6.6 * 10^21. This excess of initial sequences allows you to
                              ' freely use Rnd2mize/Rnd(1) at will to give WildWheel numbers.
                 GOSUB gsTSCplusQPC      'add time stamp counter and queryPerfCounter, both !ror 1
                 !mov ecx, 384           ;384 possible sophie-germain multipliers
                 !mov edx, 0             ;clear edx for division
                 !div ecx                ;random remainder (MOD 384) in edx now
                 !lea ecx, multiplier    ;codeptr to 384 multipliers
                 !mov ecx, [ecx+edx*4]   ;got rnd soph germain multiplier now
                 !mov [esi+00], ecx       ;save it
                 SLEEP 0                 'considered QPC too because rarely, this can slow 20x. Can be like idle priority.
                 GOSUB gsTSCplusQPC      'add time stamp counter and queryPerfCounter, after both !ror 1
                 !mov [esi+04], eax      ;save. Now make sure mwcCarry <> mwcSoph(from above) - 1 or 0.
                '---------------------
                 !cmp dword ptr[esi+08], 0        ;if value is already present, use it as a "memory" of previous loops
                 !ja short mwcCarNot0
                 SLEEP 0
                 GOSUB gsTSCplusQPC      'add time stamp counter and queryPerfCounter, both !ror 1
                 !mov [esi+08], eax      ;4 billion+ carrys now possible
                 !jnc short mwcCarNot0   ;ok if <> 0 after add, but if there's a carry, add a constant to be sure it's <> 0
                 !mov dword ptr[esi+08],&h1234abcd;it was zero! so make it your choice of a constant
               mwcCarNot0:
                 !mov ecx, [esi+00]
                 !sub ecx, 1             ;mwcSoph - 1
                 !cmp [esi+08], ecx      ;is carry >= mwcSoph - 1 ?
                 !jb  short mwcAllOk     ;if not, we're done
                 !and dword ptr[esi+08],&h7fffffff;make it less than mwcSoph - 1
               mwcAllOk:
                                ' Shuffle mwcRandom a bit more and shuffle mwcCarry
                 !mov eax, [esi+00]   ;'(mwcSoph * 2^31 - 1) & (mwcSoph * 2^32 - 1) are both prime (Sophie Germain prime).
                 !mov ecx, [esi+04]   ;'Get previous random
                 !mul ecx             ;'Multiply mwcSoph * mwcRandom
                 !ADD eax, [esi+08]   ;'Add previous carry
                 !adc edx, 0          ;'Add possible carry bit from low dword addition
                 !mov [esi+04], eax   ;'save our 32-bit random value for next round
                 !mov [esi+08],  edx  ;'saving carry for next round
                 GOSUB gsStoreSeq
    
              CASE 0            ' Repeat the last number generated
                !fld tbyte [esi+34]  ;load rndE
                !fstp FUNCTION       ;pop it to FUNCTION
    '           above 2 asm statements do this: FUNCTION = rndE ' will, of course, be zero if no rndE calculated yet
                EXIT FUNCTION
    
              CASE 6           'make it 50/50 + and -
                 !not dword ptr[esi+56] ;togScope = NOT togScope
    
              CASE <= -1       ' Guarrantees a non-duplicate sequence for each integer -1 thru -2147483648
                               ' so you can select your own repeatable sequences from the 2Gig available. This gives a
                               ' similar functionality to the PB statement "RANDOMIZE number" but without dupe sequences.
                !push esi
                POKE VARPTR(storeState), PEEK(DWORD, CODEPTR(multiplier) - (One MOD 384) * 4)
                POKE VARPTR(storeState) + 4, &h55b218da - One
                POKE VARPTR(storeState) + 8, &h3fe8700c
                !pop esi
                'the above 3 POKE statements do this:
    '            mwcSoph = PEEK(DWORD, CODEPTR(multiplier) - (One MOD 384) * 4)
    '            mwcRandom = &h55b218da - One 'assures maximum possible 2147483648 unique sequences
    '            mwcCarry  = &h3fe8700c
                GOSUB gsStoreSeq
                !jmp noParams
    
              CASE 3                ' Rnd2Default seq
                 GOSUB gsDefaultState
                 GOSUB gsStoreSeq
    
              CASE 4                ' Rnd2Redo  repeat from beginning of seq, or last bookmarked position
                                    ' read saved values from storeState string.
                 !mov ecx, [esi+44]         ;mwcSoph
                 !mov edx, [esi+48]         ;mwcRandom
                 !mov [esi+00], ecx         ;Save mwcSoph In storeState str.
                 !mov ecx, [esi+52]         ;mwcCarry to ecx
                 !mov [esi+04], edx         ;Save mwcRandom
                 !mov [esi+08], ecx         ;Save mwcCarry
                  'the above asm does this:
    '              mwcSoph    = PEEK(DWORD, VARPTR(storeState    ))
    '              mwcRandom  = PEEK( LONG, VARPTR(storeState) + 4)
    '              mwcCarry   = PEEK( LONG, VARPTR(storeState) + 8)
    
              CASE 5                ' Rnd2Mark is like a bookmark
                GOSUB gsStoreSeq
    
              CASE ELSE
                GOTO noParams          ' + number > 6, so just generate another EXT rnd
    
            END SELECT
    
          FUNCTION = 0 ' OK
          EXIT FUNCTION
    
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      twoParams:   ' both optional parameters passed so it's an integer range which can be quickly calculated...
                   ' note: if One=Two, then implied 'division' gets us back to One result
                               ' Get a 32-bit random value.
          !mov eax, [esi+00]   ;'(mwcSoph * 2^31 - 1) & (mwcSoph * 2^32 - 1) are both prime (Sophie Germain prime).
          !mov ecx, [esi+04]   ;'Get previous random
          !mul ecx             ;'Multiply mwcSoph * mwcRandom
          !ADD eax, [esi+08]   ;'Add previous carry
          !adc edx, 0          ;'Add possible carry bit from low dword addition
          !mov [esi+04], eax   ;'save our 32-bit random value for next round
          !mov [esi+08], edx   ;'saving carry for next round
                                'got random 32 bits
          !mov edx, eax        ;'<<<< hold 32-bit random value in EDX
          !mov ecx, One        ;'NOW, evaluate parameters. Get byref address to parameter One
          !mov eax, Two        ;'Get byref address to parameter Two
          !mov ecx, [ecx]      ;'dereference One.  Expected to be LOWER
          !mov eax, [eax]      ;'dereference Two
          !cmp ecx, eax        ;'is One > Two?
          !jl short Now1LT2    ;'jump over swap, already Two > One
          !xchg eax, ecx       ;'Had to switch them, now ecx has LOWER bound as it should
    
         Now1LT2:
          !SUB eax, ecx        ;'now eax is range    [ RANGE (before +1) ]
          !inc eax             ;'add 1 to range. Result is correct for two params, and 0 if max range
          !jz short doTheRnd   ;'jump if we incremented &hFFFFFFFF up to 0; we have maximum range
         'rngNotMax:             At this point ECX = lower,  EAX = range, EDX = random
          !mul edx             ;'random * range+1 == edx:eax. Use edx as result as if /(2^32).
          !add edx, ecx        ;'add lower bound to rand result in edx
        doTheRnd:
          !mov [esi+22], edx      ;'store dword result as signed long integer. Need this to load FPU.
          !fild dword ptr[esi+22] ;'signed integer load our result to the FPU
          !fld st(0)              ;'create duplicate of rndL in FPU
          !fstp tbyte [esi+34]    ;'pop extended result to rndE, in case of later Rnd2redo
          !fstp FUNCTION          ;'pop extended result for function result
          EXIT FUNCTION
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    'GOTO noParams   'doesn't get executed, but may help PB w/ alignment
      noParams:  'No parameters, so calc extd rnd2. (v.GDR20081106a)
        !mov eax, [esi+00]      'eax << soph
        !mul dword ptr[esi+04]  'edx:ead << soph * rand0
        !ADD eax, [esi+08]      'eax << eax + carry0 ; this is rand1
        !mov [esi+26], eax      'eq(low) << rand1
        !adc edx, 0             'edx << edx + cf ; this is carry1
        !mov ecx, edx           'ecx << edx ; holding our carry1
        !mul dword ptr[esi+00]  'edx:eax << rand1 * soph
        !ADD eax, ecx           'eax << eax + carry1 ; this is rand2
        !adc edx, 0             'edx << edx + cf ; this is carry2
        !mov [esi+04], eax      'mwcRandom << rand2
        !lea ecx, [esi+26]      'load address for eq ; this is fastest sequence for P5
        !mov [ecx+4], eax       'eq(high) << rand2
        !mov [esi+08], edx      'mwcCarry << carry2
        ''' end of optimized 2x thru mwc w/ saves to mwcRandom, mwcCarry, and eq.
    
      okQuadRnd:                'now we have a complete rand quad in eq
        !fild qword [esi+26]      ;float load eq which is now a complete quad random
        !cmp dword ptr[esi+56], 0 ;do we need to make it 50/50 + and - ?
        !jne short rndNeg         ;jump if we do.
        !fabs                     ;make it +
      rndNeg:
        '---------------------  'now take the quad and divide by 9223372036854775816, and it perfectly ranges from [0,1) or (-1,1)
        !fld tbyte [esi+12]     ;float load ...-36
        !fmulp st(1),st(0)      ;eq * ...-36 to range rnd to [0,1). This = eq / 9223372036854775816 but is much faster
        !fld st(0)              ;copy final answer
        !fstp tbyte [esi+34]    ;save copy
        !fstp FUNCTION          ;save FUNCTION
    
    EXIT FUNCTION
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     gsTSCplusQPC:       'Performs time stamp cntr + queryPerformCntr. gs prefix means this is a GOSUB.
       !cpuid            ;serialize
       !dw &h310f        ;read time stamp counter
       !ror eax, 1       ;smooth because some processors are always even low bit
       !mov [esi+22],eax ;save in rndL because QPC overwrites registers
       !push esi
        QueryPerformanceCounter PEEK(QUAD, VARPTR(storeState) + 26)
       !pop esi
       !mov eax, [esi+26];LO dword
       !ror eax, 1       ;smooth, low bit might always be even on some cpu's
       !add eax, [esi+22];smoothed QPC + saved smoothed TSC
     RETURN
    
     gsStoreSeq: ' 'save original starting point of sequence in storeState string.
                 !mov ecx, [esi+00]         ;mwcSoph
                 !mov edx, [esi+04]         ;mwcRandom
                 !mov [esi+44], ecx         ;Save mwcSoph In storeState str.
                 !mov ecx, [esi+08]         ;mwcCarry to ecx
                 !mov [esi+48], edx         ;Save mwcRandom
                 !mov [esi+52], ecx         ;Save mwcCarry
     RETURN
    
     gsDefaultState: 'Initialize generator and divisor factor. gs prefix means this is a GOSUB.
                 !mov dword ptr[esi+00], 4221732234   ; mwcSoph
                 !mov dword ptr[esi+04], &ha5b218da   ; mwcRandom can be any LONG except &hffffffff and 0
                 !mov dword ptr[esi+08], &h3fe8700c   ; mwcCarry can be any number except (mwcSoph-1) and 0
                 !mov dword ptr[esi+12], &hffffffef   ; factor1 space to hold the constant 1084202172485504433 * 10E-36 (yes I even made it 19 digits)
                 !mov dword ptr[esi+16], &hffffffff   ; factor1 ...-36 save in rndE. This is a binary image of the extended precision ...-36
                 !mov dword ptr[esi+20], &h3fbf       ; factor1 only 2 bytes, but just move whole dword for speed
             'rndL is dword ptr[esi+22] LONG
             '  eq is dword ptr[esi+26] QUAD
             'rndE is dword ptr[esi+34] EXT
       'storeState is dword ptr[esi+44] STRING * 12
         'togScope is dword ptr[esi+56] LONG
     RETURN
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    '#align 4
    multiplier: 'a bunch of Sophie Germain primes. Each makes its own unique sequence of ~2^63 LONG random values
    !DD 4055108289, 4183291290, 4066422669, 4010830218, 4144557798, 4047225099, 4169878863, 4156378278
    !DD 4068734223, 4013003148, 4128794349, 4044603045, 4147482834, 4050081738, 4169007204, 4084483623
    !DD 4182936234, 4061167764, 4116557445, 4184835774, 4098609075, 4000058700, 4005596580, 4131991143
    !DD 4026365904, 4082490609, 4170263943, 4064971044, 4192040679, 4069686423, 4112450355, 4116008373
    !DD 4051352658, 4131639393, 4026209880, 4143019908, 4057560153, 4153038378, 4178347353, 4101943515
    !DD 4163817693, 4126770675, 4122227184, 4150506573, 4124871525, 4097114355, 4171215009, 4094254353
    !DD 4185190458, 4184112513, 4187782989, 4037092584, 4114448259, 4096721880, 4003880118, 4035500259
    !DD 4080989598, 4090215738, 4104202098, 4144153608, 4027213065, 4112123319, 4029634383, 4188620745
    !DD 4003957254, 4158202674, 4165028370, 4101889029, 4064867064, 4056294705, 4117302630, 4094813610
    !DD 4089078504, 4072584339, 4075250574, 4144182519, 4020827805, 4077052605, 4012941570, 4114015830
    !DD 4015303260, 4012049835, 4031934513, 4123667379, 4025171265, 4149021864, 4020494469, 4152989853
    !DD 4141465314, 4050172164, 4130534940, 4124347128, 4155032220, 4123523313, 4038610005, 4066391700
    !DD 4052359893, 4138494750, 4046848368, 4015233183, 4065337650, 4181156010, 4149686553, 4115669703
    !DD 4080411408, 4029985884, 4072279314, 4136476293, 4102312674, 4148638644, 4020161274, 4056852945
    !DD 4084467288, 4090139205, 4152479904, 4129623354, 4189154793, 4042650633, 4113056934, 4070634510
    !DD 4172190345, 4012616748, 4092782529, 4042027470, 4034320863, 4017110193, 4128178095, 4005317820
    !DD 4121565819, 4160465475, 4093432608, 4094047308, 4092039654, 4132108680, 4160799915, 4109110719
    !DD 4190254803, 4063105479, 4123739478, 4086096945, 4113466908, 4169157873, 4036670034, 4035486873
    !DD 4154194098, 4074334704, 4006945965, 4119880785, 4050935955, 4131729105, 4170646809, 4191996963
    !DD 4055775498, 4029162399, 4118132214, 4116397584, 4121266560, 4102454433, 4146555864, 4103353149
    !DD 4119974010, 4080379233, 4192378968, 4061071950, 4104928533, 4042978743, 4188739878, 4066717740
    !DD 4017709695, 4027617453, 4110604308, 4107339654, 4076278878, 4077074274, 4097495403, 4179562659
    !DD 4187765853, 4187454249, 4015793904, 4083863454, 4078492929, 4166495943, 4101303048, 4149525330
    !DD 4095286830, 4078227909, 4189944624, 4010811645, 4032304584, 4151394078, 4044317298, 4136517915
    !DD 4198354635, 4192501860, 4073134869, 4060180830, 4076815050, 4190613315, 4142749785, 4122567564
    !DD 4071542523, 4024430004, 4122798648, 4041267495, 4006243575, 4092566124, 4141397349, 4175565558
    !DD 4159829190, 4173505479, 4084339563, 4085131608, 4081507743, 4069428324, 4011038568, 4092438129
    !DD 4005482298, 4020895359, 4127615184, 4162803795, 4038272028, 4123171464, 4199942199, 4067713245
    !DD 4129181838, 4021766328, 4141102845, 4002607668, 4051580310, 4082443044, 4078962945, 4072199883
    !DD 4180693749, 4040763375, 4025696004, 4066226853, 4013137770, 4084688994, 4081465923, 4185884010
    !DD 4184193840, 4095653625, 4071642489, 4003011123, 4021708860, 4038391383, 4003548888, 4016275635
    !DD 4051483344, 4052001093, 4131504594, 4129105653, 4187278653, 4058921709, 4167113355, 4106971188
    !DD 4074045393, 4069825200, 4009724565, 4120937589, 4119577560, 4151390115, 4000637598, 4088788530
    !DD 4014859458, 4003633353, 4192075623, 4009856424, 4048255155, 4100175633, 4129717695, 4012882215
    !DD 4119226824, 4122492603, 4074693864, 4062187338, 4022104890, 4186039455, 4191285474, 4165800789
    !DD 4047934929, 4045886208, 4028478450, 4098395724, 4095869853, 4004229753, 4110500373, 4188458055
    !DD 4093944063, 4122368673, 4136075109, 4024434645, 4145270010, 4121262090, 4051650480, 4076720613
    !DD 4057135713, 4053301650, 4074379569, 4103950185, 4146078999, 4029125490, 4036104003, 4122595203
    !DD 4173008610, 4155931704, 4048316175, 4178853645, 4049069715, 4187855514, 4193714559, 4132340133
    !DD 4001184978, 4087342068, 4038996009, 4032782589, 4103313705, 4057212699, 4094324010, 4117022988
    !DD 4016133978, 4057176333, 4081210119, 4183410330, 4054406019, 4008415374, 4131217578, 4049176725
    !DD 4033804230, 4154677353, 4194818769, 4057689999, 4065887250, 4083913149, 4160269749, 4148719650
    !DD 4086572148, 4079152770, 4198797849, 4025836533, 4121774838, 4114818903, 4193265369, 4005720123
    !DD 4172736744, 4113446385, 4153872675, 4022863908, 4169665353, 4080875223, 4148976378, 4158173325
    !DD 4012107315, 4146530883, 4042645638, 4189878099, 4075365840, 4053276279, 4112504730, 4144260888
    !DD 4102144035, 4181673825, 4171915968, 4123257354, 4032551355, 4054454535, 4132616253, 4057321905
    !DD 4174490559, 4165419468, 4169862234, 4116771594, 4009920498, 4164231630, 4163597154, 4181713095
    !DD 4000268439, 4077171264, 4045424718, 4116626304, 4052701140, 4140380880, 4027965249, 4102323183
    
    END FUNCTION

    Leave a comment:


  • David Roberts
    replied
    I wondered if CALL DWORD might somehow be used for that application.
    CALL DWORD still uses PB parameters.

    Code:
    Is there a way to call an API function, namely queryPerfCntr, which takes a quad variable, but without declaring a quad variable?
    In my case, yes, since the TSC and QPC are effectively one in the same thing so I'd simply use RDTSC.

    Leave a comment:


  • David Roberts
    replied
    Rnd2, the basic version, and Rnd2C the crypto version.
    Why? Post #327 removes the object oriented Crypto and replaces it with a procedural Crypto. There is no longer a need for an option to turn Crypoto off if non PB9/CC5 or it isn't required because it is no longer reliant upon PB9/CC5 and does not encumber the speed.

    For non-XP/Vista users Rnd2mize is about twice as fast as Rnd2mizeCrypto. For Xp/Vista users Rnd2mizeCrypto is about twice as fast as Rnd2mize. The Crypto method available is determined during the initialization whether that be via PBMain or embedded within Rnd2.

    My PBMain is simply
    Function PBMain() As Long

    Rnd2SetUp
    ......
    ......
    ......
    Rnd2TidyUp

    End Function

    similar to when Crypto was using objects.

    Leave a comment:


  • John Gleason
    replied
    I see two variants: (a) David's pre-initialization version, and (b) possibly a more limited self-initialization version.
    It's funny, that's exactly what I was thinking earlier today--Rnd2, the basic version, and Rnd2C the crypto version. Both can happily coexist in the include file, and in Rnd2C Rnd2mize will be taken over by the crypto seed. The syntax of the two could be identical, or at most vary by Rnd2C needing the initial CASE 7 call and cleanup macro. << The actual case numbers could be adjusted if needed.

    I got the single declare version going (minus the queryPerfCnt call) and I was surprised to see a pretty big improvement. I don't want to say the % because I need to see if the algo is doing everything correctly first. After a bit more testing, I'll post it.

    Leave a comment:


  • Gary Ramey
    replied
    It is true that byval is not going to work and fulfill our design goals. After John's "heads-up" I eventually found failures for several parameter scenarios. And, I learned that it was easy to drop unexpectedly into the Select Case section of the function.

    Regarding the idea a few posts back about packing all variables into a storestate string, I can't imagine this would be faster. Right now, isn't the compiler precomputing the variable addresses for us? Isn't that faster than the CPU microcode needing to real-time compute the esi+ addresses for each opcode that would have get data in/out of the string?

    At this point we seem to have a good prng, with options for random seeding, with good features, and which is quite a bit faster that RND. The speeds already exceed the needs for my present applications. Are we ready to pull it together, retest every option once more, re-verify ent & diehard, and wrap it up? I see two variants: (a) David's pre-initialization version, and (b) possibly a more limited self-initialization version.

    After my mistake on byval, I've switched back to a methodical checklist to validate: RND(), toggled RND(), RND(0), RND(1,52), RND(52,1), RND(-minint,0), RND(-minint,+maxint), RND(0,+maxint), RND(0,0), RND(20,20), RND(-20,-20), RND(-20,-10), RND(-10, -20), and a range of RND(-int).

    Leave a comment:


  • John Gleason
    replied
    Thanks Dave, I'd never see that function before. That proves it methinks. I've got a question: Is there a way to call an API function, namely queryPerfCntr, which takes a quad variable, but without declaring a quad variable? I want to do something like queryPerformanceCounter [esi+12] instead of queryPerformanceCounter eq. I wondered if CALL DWORD might somehow be used for that application.

    Leave a comment:


  • David Roberts
    replied
    John, if you have a look at PB9's 'IsMissing' function you'll see that it doesn't work with Byval either.

    Leave a comment:


  • John Gleason
    replied
    Originally posted by David Roberts View Post
    John, I've been a bit busy this last few days and came back to knock out the revised Crypto code and may have missed something. I don't recognise the code you changed and it is not clear what previous alterations are to remain.

    I have just GPF'd.
    Dave and Gary, re. post #328 revised code: It doesn't work correctly with anything except 2 parameters as I just discovered, and Dave discovered previously. I still think BYVAL has a good chance so I'm going to test it using the original "how many parameters were passed" determination code shown below:
    Code:
      !lea eax, Two
      !mov ecx, [eax]
      !cmp ecx, 0
      !jne twoParams      ;if optional parameter Two passed, belt down to integer Rnd2 code section
      !lea eax, One
      !mov ecx, [eax]
      !cmp ecx, 0
      !je noParams        ;if no params, boogie down to noParams code
    added: still testing BYVAL and it seems so far to be ok except for "repeat last random value" which I'm looking into. Here is a timing:
    `
    Code:
    PB9 OPTIMIZE SPEED
    10,000,000 iterations
    RND:           1.000  Reference 2601 ms
    Rnd2 [0,1):    0.709
    Rnd2 (-1,1):   0.691
    
    *****Integer Mode*****
    RND(1, 52):   1.594
    Rnd2(1, 52):  0.612
    RND(-2147483648, 2147483647):   1.645
    Rnd2(-2147483648, 2147483647):  0.627
    SIZE optimization was even faster.

    added2: Dang, BYVAL failed again. Gary, you said to abandon it but I couldn't let the speed go. Well, now I have to let it go because it just won't work even with all I've tried. There might be a way that I can't find, but I guess it's time to move on.
    Last edited by John Gleason; 7 Nov 2008, 02:23 PM. Reason: added timing & 2nd note

    Leave a comment:


  • David Roberts
    replied
    John, I've been a bit busy this last few days and came back to knock out the revised Crypto code and may have missed something. I don't recognise the code you changed and it is not clear what previous alterations are to remain.

    I have just GPF'd.

    Leave a comment:


  • John Gleason
    replied
    This relates to Gary's post several back re. BYVAL function--the problem was not being able to distinguish zero. Solution is to use !LEA like we did originally rather than !mov at the top compares. The other changes lower in the function are fine and remain the same. Speed? well you be the judge, but better watch your socks again.
    Code:
      !lea eax, Two       ;go back to this
    '  !mov eax, Two      ;don't use this
      !cmp eax, 0         ;now it's comparing if there's a valid address
      !jne twoParams      ;IF OPTIONAL parameter Two passed, belt down TO INTEGER Rnd2 CODE section
      !lea ecx, One       ;same as above for Two
    '  !mov ecx, One      ;comment out
      !cmp ecx, 0
      !je noParams        ;IF no params, boogie down TO noParams CODE
    Code:
    PB9 optimize size
    RND:           1.000  Reference 2799 ms
    Rnd2 [0,1):    0.636
    Rnd2 (-1,1):   0.635
    *****Integer Mode*****
    RND(1, 52):   1.560
    Rnd2(1, 52):  0.556
    RND(-2147483648, 2147483647):   1.565
    Rnd2(-2147483648, 2147483647):  0.626
    
    
    PB9 optimize speed
    RND:           1.000  Reference 2674 ms
    Rnd2 [0,1):    0.668
    Rnd2 (-1,1):   0.665
    *****Integer Mode*****
    RND(1, 52):   1.673
    Rnd2(1, 52):  0.629
    RND(-2147483648, 2147483647):   1.634
    Rnd2(-2147483648, 2147483647):  0.571

    Leave a comment:


  • David Roberts
    replied
    ON CRYPTO

    Here is one way to have a procedural Crypto as opposed to an object Crypto giving access to pre PB8/CC5 users.

    There is now no need for a %USECRYPTO = 0 or 1.

    Right, here we go.

    Remove: %USECRYPTO =

    Remove: Macro Rnd2goCrypto and Macro Rnd2stopCrypto

    Add: Declare Function GetCryptoBytes( ByVal Which As Long, ByVal hProvProc As Long, x As Ext) As Long

    Add:
    Code:
    $MS_DEF_PROV = "Microsoft Base Cryptographic Provider v1.0"
    %PROV_RSA_FULL = 1
    %CRYPT_VERIFYCONTEXT = &hF0000000
    %XP = 1
     
    Function GetCryptoBytes( ByVal Which As Long, ByVal hProvProc As Long, x As Ext ) As Long
      Dim BinaryByte(9) As Static Byte
      Static hexSize As Dword
      hexSize = 10
      If Which = %XP Then
        Call Dword hProvProc Using RtlGenRandom( BinaryByte(0), hexSize )
      Else
        CryptGenRandom( hProvProc, hexSize, BinaryByte(0) )
      End If
      x = Peek( Ext, VarPtr( BinaryByte(0)))
    End Function
    At the head of Rnd2 add:
    Code:
    Static os As OSVERSIONINFO, strOS As String
    Static CryptoXP As Long
    Static hProvProc, hLib As Dword
    In the initialization section add:
    Code:
    os.dwOSVersionInfoSize=SizeOf(os)
    GetVersionEx os
    If os.dwPlatformId = %VER_PLATFORM_WIN32_NT Then
      strOS = Trim$(Str$(os.dwMajorVersion)) & "." & Trim$(Str$(os.dwMinorVersion))
      If strOS > "5.0" Then ' we have XP/Vista/Server 2003 or 2008
        CryptoXP = %XP
      Else
        CryptoXP = 0
      End If
    Else
      CryptoXP = 0
    End If
     
    ' hProvProc is either an address or provider depending upon whether the user has XP/Vista/Server 03 or 08 or not respectively.
    If CrypToXP = %XP Then
      hLib = LoadLibrary( "advapi32.dll")
      hProvProc = GetProcAddress(hLib, "SystemFunction036")
    Else
      CryptAcquireContext( hProvProc, ByVal %Null, ByVal %Null, %PROV_RSA_FULL, %CRYPT_VERIFYCONTEXT )
      CryptAcquireContext( hProvProc, ByVal %Null, $MS_DEF_PROV, %PROV_RSA_FULL, %CRYPT_VERIFYCONTEXT )
    End If
    As you know, my initialization is done in PBMain via Case 7. There is some tidying up to do and I suggest this be done via Case 8 as follows:
    Code:
    Case 8
      If CryptoXP = %XP Then
        FreeLibrary hLib
      Else
        CryptReleaseContext hProvProc, 0
      End If
    So, we used to have Rnd2GoCrypto/Rnd2StopCrypto and I now have Rnd2SetUp/Rnd2TidyUp.

    Nearly there.

    In the Case 2 code replace GetCrypto.Bytes(rndE) with GetCryptoBytes(CryptoXP, hProvProc, rndE).

    With regard timing I was getting about a 15.8 x 1.000 Ref with Crypto and the faster generator and now I'm getting about 16.1 x 1.000 Ref so, nothing in it except, of course, all the other metrics are not encumbered by the presence of an object; which I was not expecting when I introduced Crypto. The bonus from this method is, as mentioned, Crypto is now available to users of the earlier compilers.

    PS: The faster generator takes about one quarter of the time of the slower generator.
    Last edited by David Roberts; 7 Nov 2008, 05:27 AM.

    Leave a comment:


  • John Gleason
    replied
    For the exponent ranges we're working with, the 2^-63 multiplication via FMULP is the same as taking the upper two bytes of rndE and subtracting 3F.
    Wow, I'd never have seen that. It's a real lesson in asm to me. I wish this following thought were so advanced.

    I'm in the process of tinkering with an idea that, before I spend too much time on it, I have to see if a significant speed up occurs. You may chuckle at this, but basically it is: What if we declare just one variable in the function? We're virtually all in asm anyway, so just make all the memory references relative to storeState, and make it encompass all 9 variables. Our whole declare then is:
    Code:
    STATIC storeState AS STRING * 60
    and we save 8 variables.

    At the top, we'll just !push esi: !lea storeState, esi then reference say mwcCarry with eg. !mov eax, [esi+8] instead of !mov eax, mwcCarry. You're probably now aren't ya?

    Leave a comment:

Working...
X