Announcement

Collapse
No announcement yet.

Pcg32 prng

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • David Roberts
    replied
    If you have downloaded the latest version since my last post you will need to do it again. Every time I upload I always download to make sure it will work for you. It failed. I was testing a new compiler and accidentally used it to compile the dll again. I knew something was wrong because the FRange was less than I was expecting according to post #7. Situation corrected and a new typical TestBed output posted above.

    I currently have nine versions of the C compiler, all of which will knock out 32-bit and 64-bit, and have developed a few add-ons for Paul Squires IDE one of which provides a quick way to switch between them - perhaps too quick. It can get a bit hairy sometimes but it keeps me occupied.

    My machine is a little faster with Win10 2004 and that is reflected in the SE throughput - now pushing 200MHz.

    I have also being looking at the latest generators from the Sebastian Vigna stable but Melissa O'Neill's PCG32 has nothing to worry about.

    Leave a comment:


  • David Roberts
    replied
    Update:

    RandomizePCG32 has been changed from
    Code:
    RandomizePCG32( Byval Seed As Quad, Byval Sequence As Quad )
    to
    Code:
    RandomizePCG32( szSeed as Stringz * 21, szSequence As Stringz * 21 )
    szSeed and szSequence may take numeric values of "0" to "18446744073709551615" ie 0 to 2^64-1 using a max of 21 characters including a null terminator.

    This is not me being clever - I came across the keyword ValULng, along with other Val variants, which converts a string to an unsigned 64-bit integer.

    FreeBASIC is 'Sub RandomizePGC32 Alias "RandomizePCG32"( Byref Seed As Zstring = "", Byref Sequence as Zstring = "" ) Export' so for cryptographic seeding we simply use RandomizePCG32 with Seed and Sequence defaulting to empty strings. I had forgotten that PB's dynamic strings are not the same as FB's dynamic strings but did not want to mess about with BSTR then remembered that PB's Stringz is compatible with FB's ZString.

    All the other features of RandomizePCG32 are the same as above.

    The new version is linked in the opening post, as a zipped folder, which includes TestBed.bas which considers the eight procedures available. PCG32.dll is now 24.5KB.

    Here is a typical output of TestBed.bas.
    Code:
    6 extended singles (32-bit granularity)
    
    .992647086502984
    .035648931749165
    .304272694280371
    .86826731543988
    .74727521976456
    .210632024332881
    
    6 doubles (53-bit granularity)
    
    .694024321588234
    .714896616228037
    .85373794823694
    .206337270386435
    .904211694257165
    .69395811801828
    
    6 Dwords
    
    1085861913
    1334470428
    1777458697
    2977381045
    3754155940
    650581345
    
    6 integral range (0,255)
    
    167
    238
    107
    212
    55
    83
    
    6 float range (0,255)
    
    100.40518755326
    218.529270758154
    209.626711781602
    36.5082135179546
    97.7995494520292
    117.802630844526
    
    Average of 10^8 single extended: .499923172740479
    
    Average of 10^8 Dwords: 2147731862.88096
    
    Average of 10^8 integral range (0,100): 49.99632717
    
    Average of 10^8 float range (0,100): 50.0052708656194
    
    Estimate of 'raw' throughput (Millions per second)
    
    DW: 272MHz
    SE: 198MHz
    D: 134MHz
    Range: 149MHz
    FRange: 107MHz
    PB's RND: 82MHz
    
    Back to start of single extended
    .992647086502984
    .035648931749165
    .304272694280371
    .86826731543988
    .74727521976456
    .210632024332881
    
    Burn 1000 single extended and take a snapshot
    .204452467849478
    .654834392713383
    .371247503207996
    .144226878182962
    .811277533881366
    .964529257267714
    
    Repeat burnt single extended
    .204452467849478
    .654834392713383
    .371247503207996
    .144226878182962
    .811277533881366
    .964529257267714
    
    All done - press a key
    Last edited by David Roberts; 2 Oct 2020, 08:51 AM.

    Leave a comment:


  • David Roberts
    replied
    I have managed to get PCG32FR from 76MHz to 110MHz, just short of a 45% increase. I am using an optimizing compiler and it is almost impossible to second guess what it gets up to even with our own asm code. If ever I use asm in a procedure I always write two versions, one with asm and the other with wholly BASIC. Not always but sometimes the wholly BASIC is faster. Time does not permit but one day I will disassemble the BASIC native code to see what the compiler is up to.

    Latest version uploaded.

    Leave a comment:


  • David Roberts
    replied
    I have been asked if it would be much of an effort to add a double with 53-bit granularity as opposed to 32-bit granularity. The reason for the request didn't seem to me to be particularly strong but since it was no effort it has been added, and called PCG32D. The throughput has not been halved, using two Dwords, as we call the engine twice from one function call. I would be interested what arguments others may have for using 53-bit granularity, my imagination is leaving me a little wanting.

    I have also included a floating-point range, called PCG32FR. Of course, we could write a function to do that using RND or a PCG32 value but that would probably slow things down. Having said that PCG32FR is not as fast as expected.

    The new declarations are:
    Code:
    Declare Function PCG32D Lib "PCG32.dll" Alias "PCG32D" As Double
    Declare Function PCG32FR Lib "PCG32.dll" Alias "PCG32FR" ( Byval First As Double, Byval Last As Double ) As Double
    The engine had 'Sequence Or 1' to ensure odd but this has been changed to 'Sequence' as the 'Or 1' is taken care of in RandomizePCG32 - I had forgotten to remove 'Or 1' in the engine. The throughput only gained 3% but is better than a poke in the eye with a sharp stick.

    I won't include another testbed but my results are as follows.
    Code:
    Estimate of 'raw' throughput (Millions per second)
    
    DW: 282MHz
    SE: 183MHz
    D: 111MHz
    Range: 180MHz
    Float Range: 76MHz
    RND: 82MHz
    As you can see 'D' is down but it is still 35% faster than RND. Float Range is disappointing but it is not exactly slow.

    Latest version uploaded.

    The dll is now a whopping 21KB!
    Last edited by David Roberts; 7 Apr 2019, 07:19 PM. Reason: Used RND(a,b) when I meant RND

    Leave a comment:


  • David Roberts
    replied
    Snapshot

    Two new functions: GetSnapshot; SetSnapshot

    GetSnapshot makes a copy of the current state vector.

    SetSnapshot returns the state vector to that taken at the last GetSnapshot.

    Just after the default RandomizePCG32 is executed a GetSnapshot is invoked.

    Latest version uploaded.

    Usage example:
    Code:
    #Compile Exe
    #Dim All
    #Include "Win32API.inc"
    
    Declare Function PCG32SE Lib "PCG32.dll" Alias "PCG32SE" As Double
    Declare Sub GetSnapshot Lib "PCG32.dll" Alias "GetSnapshot"
    Declare Sub SetSnapShot Lib "PCG32.dll" Alias "SetSnapshot"
    
    Function PBMain
    Local i As Long
    
    For i = 1 To 6
      Print PCG32SE
    Next
    Print
    
    SetSnapshot ' Back to start
    
    For i = 1 To 6
      Print PCG32SE
    Next
    Print
    
    For i = 1 To 1000
     PCG32SE
    Next
    
    GetSnapshot ' [1]
    
    For i = 1 To 6
      Print PCG32SE
    Next
    Print
    
    SetSnapshot ' Back to [1]
    
    For i = 1 To 6
      Print PCG32SE
    Next
    
    WaitKey$
    
    End Function

    Leave a comment:


  • David Roberts
    replied
    I finally got PractRand to test PCG32.dll, I have never tested a dll before.

    Here is a 1TB test, as you can see we have a small anomaly at 8GB. I have never seen a clean sweep to 1TB and not seen many with only one small anomaly, even Intel's RdRand will throw up two or three in a 1TB test.

    So, PCG32.dll passes with flying colours.

    If you ever use PractRand you may not be aware of the multithreaded switch. On my Intel i7 I get a 25% boost and with the 1TB test itself taking nearly 75 minutes it is a welcome improvement.

    Code:
    E:\PR64>PCG_Rng | Rng_test stdin32 -multithreaded
    RNG_test using PractRand version 0.94
    RNG = RNG_stdin32, seed = unknown
    test set = core, folding = standard (32 bit)
    
    rng=RNG_stdin32, seed=unknown
    length= 512 megabytes (2^29 bytes), time= 2.5 seconds
      no anomalies in 178 test result(s)
    
    rng=RNG_stdin32, seed=unknown
    length= 1 gigabyte (2^30 bytes), time= 4.9 seconds
      no anomalies in 192 test result(s)
    
    rng=RNG_stdin32, seed=unknown
    length= 2 gigabytes (2^31 bytes), time= 9.4 seconds
      no anomalies in 204 test result(s)
    
    rng=RNG_stdin32, seed=unknown
    length= 4 gigabytes (2^32 bytes), time= 18.0 seconds
      no anomalies in 216 test result(s)
    
    rng=RNG_stdin32, seed=unknown
    length= 8 gigabytes (2^33 bytes), time= 35.9 seconds
      Test Name                         Raw       Processed     Evaluation
      [Low1/32]BCFN(2+4,13-3,T)         R=  -8.2  p =1-1.0e-4   unusual
      ...and 228 test result(s) without anomalies
    
    rng=RNG_stdin32, seed=unknown
    length= 16 gigabytes (2^34 bytes), time= 71.2 seconds
      no anomalies in 240 test result(s)
    
    rng=RNG_stdin32, seed=unknown
    length= 32 gigabytes (2^35 bytes), time= 137 seconds
      no anomalies in 251 test result(s)
    
    rng=RNG_stdin32, seed=unknown
    length= 64 gigabytes (2^36 bytes), time= 281 seconds
      no anomalies in 263 test result(s)
    
    rng=RNG_stdin32, seed=unknown
    length= 128 gigabytes (2^37 bytes), time= 558 seconds
      no anomalies in 273 test result(s)
    
    rng=RNG_stdin32, seed=unknown
    length= 256 gigabytes (2^38 bytes), time= 1083 seconds
      no anomalies in 284 test result(s)
    
    rng=RNG_stdin32, seed=unknown
    length= 512 gigabytes (2^39 bytes), time= 2240 seconds
      no anomalies in 295 test result(s)
    
    rng=RNG_stdin32, seed=unknown
    length= 1 terabyte (2^40 bytes), time= 4490 seconds
      no anomalies in 304 test result(s)

    Leave a comment:


  • David Roberts
    replied
    Thanks, Dean.

    In testbed.bas the statements
    Code:
    ' Allow random Seed and Sequence.
    RandomizePCG32( 0, 0 ) ' We must use RandomizePCG32, there is no default state.
    can now be removed. The default is RandomizePCG32( 0, 0 ).

    We can, of course, invoke RandomizePCG32 at anytime, with RandomizePCG32( 123456, 654321 ) for example.

    Oh, I forgot to mention that there is no need to warm up the engine, 200 executions of PCG32DW are made at the end of the RandomizePCG32 procedure.

    I should also have mentioned that "C:\PBWin\Timers\MultipleTimersLite.inc" needs editing to wherever MultipleTimersLite.inc is located. It is another way to get folk to use that inc file, I get commission for every copy.

    Leave a comment:


  • Dean Gwilliam
    replied
    Thank you very much for all your work.
    Pcg32.prng looks to have a nice balance of everything.

    Leave a comment:


  • David Roberts
    started a topic Pcg32 prng

    Pcg32 prng

    *Really* minimal PCG32 code / (c) 2014 M.E. O'Neill / pcg-random.org
    Licensed under Apache License 2.0 (NO WARRANTY, etc. see website)

    PCG32.zip

    The unzipped dll is only 20.0KB.

    I could not port the source code from FreeBASIC to PowerBASIC because 64-bit unsigned integers are used and I don't want to mess about with 'workarounds'

    Originally posted by Lance Edmonds 2 May 2001
    64-bit unsigned variables are on the wish list already. I'll ask for a tick in your name to be added. Thanks!
    Not all ticks came to much.

    PCG32 is a 64-bit generator, that is with a period of 2^64, with a 32-bit output. Besides a seed input it has a sequence input both of which are 64-bit unsigned integers. PB cannot handle 64-bit unsigned integers so seed and sequence are input as Quads. Warning: Stick to 0 to 2^63-1 like glue, you may not get what you want otherwise.

    It is a quality generator and the FreeBASIC version passes PractRand to 1TB. The author and others have pushed it to 16TB with the original C code. I have yet to test the dll but, in the meantime, ran John Walker's ENT as I always do. The services of PractRand are not called upon on an ENT failure.

    Here is an ENT output:
    Code:
    E:\ENT>ent "100MB.dat"
    Entropy = 7.999998 bits per byte.
    
    Optimum compression would reduce the size
    of this 104857600 byte file by 0 percent.
    
    Chi square distribution for 104857600 samples is 261.08, and randomly
    would exceed this value 38.35 percent of the times.
    
    Arithmetic mean value of data bytes is 127.4956 (127.5 = random).
    Monte Carlo value for Pi is 3.141687360 (error 0.00 percent).
    Serial correlation coefficient is 0.000031 (totally uncorrelated = 0.0).
    No problems there.

    Sequence is converted internally to 'Sequence OR 1' as only odd sequences are allowed. Sequencing is not that common with PRNGs. A given seed can have a given sequence so for every seed value we can have upto 2^63 different sequences. With Rnd and most generators we have just the one sequence.

    At the moment we have three procedures: PCG32DW, RandomizePCG32, PCG32SE and PCG32R

    PCG32DW

    This simply outputs random Dwords. I doubt that there will be much call for this, it was written for PractRand testing and left in the code.

    RandomizePCG32( Seed, Sequence )

    We must use RandomizePCG32, there is no default state, I am working on that.

    We could have RandomizePCG32( 123456, 654321 ) and this is analogous to 'Randomize n' generating the same random numbers on each invocation.

    Randomize( 0, 654321) will deliver a random seed using CryptGenRandom. FB has not 'cottoned onto' BCryptGenRandom yet.

    Randomize( 123456, 0 ) will deliver a random sequence.

    Randomize ( 0, 0 ) will deliver both a random seed and a random sequence analogous to Randomize Timer, well not quite. Internally, a 64-bit unsigned integer is clipped to 2^63-1. With 2^63 sequences, that is over 9 million trillion, the chances are that we will be using a sequence that we have never used before and unlikely to ever use again.

    PCG32SE

    This outputs a Double random number with a granularity of 32 bits. The SE stands for 'Single extended', Single not as in Single precision but as only one 32-bit output is used. At this stage I am not providing 53-bit granularity which would require two 32-bit outputs. Had I done so I would have called it PCG32D.

    PCG32R( First, Last )

    This is analogous to RND(a, b)

    The following is a test bed. I am using MultipleTimersLite.inc. On one run I got:
    Code:
    DW: 274MHz
    SE: 174MHz
    Range: 177MHz
    RND:  82MHz
    So, SE is over twice the speed of RND. This is not as fast as CryptoRndII which uses a lot more memory via two buffers and streams random numbers into the buffers.

    Testbed.bas
    Code:
    #Compile Exe
    #Dim All
    #Include "Win32API.inc"
    #Include "C:\PBWin\Timers\MultipleTimersLite.inc"
    
    Declare Sub RandomizePCG32 lib "PCG32.dll" Alias "RandomizePCG32" ( Byval Seed As Quad, Byval Sequence As Quad )
    Declare Function PCG32DW Lib "PCG32.dll" Alias "PCG32DW" As Dword
    Declare Function PCG32SE Lib "PCG32.dll" Alias "PCG32SE" As Double
    Declare Function PCG32R Lib "PCG32.dll" Alias "PCG32R" ( Byval First As Long, Byval Last As Long ) As Long
    
    Function PBMain
    Local i, n as long
    Local tot as double
    
    InitializeTimers
    
    ' Allow random Seed and Sequence.
    RandomizePCG32( 0, 0 ) ' We must use RandomizePCG32, there is no default state.
    
    ' Knock out 10 DWORDs
    Print "10 DWORDS" : Print
    For i = 1 To 10
      Print PCG32DW
    Next
    Print
    ' Knock out 10 singles
    Print "10 SE" : Print
    For i = 1 To 10
      Print PCG32SE
    Next
    Print
    ' Knock out 10 range
    Print "10 Range" : Print
    For i = 1 To 10
      Print PCG32R(0,255)
    Next
    Print
    
    ' Knock out average of 10^8 DWORDs
    For i = 1 To 10^8
      tot += PCG32DW
    Next
    tot = tot/10^8
    Print "Average of 10^8 DWORDs: ";tot
    
    Print
    tot = 0
    ' Knock out average of 10^8 singles
    For i = 1 To 10^8
      tot += PCG32SE
    Next
    tot = tot/10^8
    Print "Average of 10^8 SE: ";tot
    
    Print
    tot = 0
    ' Knock out average of 10^8 range (0,100)
    For i = 1 To 10^8
      tot += PCG32R(0,100)
    Next
    tot = tot/10^8
    Print "Average of 10^8 range (0,100): ";tot
    
    Print
    Print "Estimate of 'raw' throughput (Millions per second)"
    Print
    
    StartTimer(0)
    For i = 1 To 10^8
      PCG32DW
    Next
    StopTimer(0)
    Print "DW: ";Using$( "###", int(100000/nTimetaken(0)) );"MHz
    
    StartTimer(0)
    For i = 1 To 10^8
      PCG32SE
    Next
    StopTimer(0)
    Print "SE: ";Using$( "###", int(100000/nTimetaken(0)) );"MHz"
    
    StartTimer(0)
    For i = 1 To 10^8
      PCG32R(0,100)
    Next
    StopTimer(0)
    Print "Range: ";Using$( "###", int(100000/nTimeTaken(0)) );"MHz"
    
    StartTimer(0)
    For i = 1 To 10^8
      tot = rnd
    Next
    StopTimer(0)
    Print "RND: ";Using$( "###", int(100000/nTimeTaken(0)) );"MHz"
    
    Waitkey$
    
    End Function
Working...
X