CryptoRndII

  • CryptoRndII

    I ported my CryptoRnd to FreeBASIC.

    For those not acquainted with CryptoRnd it uses Microsoft's BCryptGenRandom, so Windows Vista is the minimum OS required. Two buffers are created and random numbers are available once the first buffer has been populated, with the second buffer being populated in a separate thread of execution. When the first buffer has been exhausted we switch to the second buffer and then start to re-populate the first buffer. The buffers are actually split in two and each half is populated by its own thread of execution.
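The two-buffer scheme described above can be sketched roughly as follows. This is only an illustration in Python, not the FreeBASIC source: `os.urandom` stands in for BCryptGenRandom, `ThreadPoolExecutor` stands in for the Windows thread pool, the names `DoubleBufferedRng`, `fill_buffer` and `BUFFER_DWORDS` are made up for the example, and the split of each buffer into two halves is left out.

```python
import os
import struct
from concurrent.futures import ThreadPoolExecutor

BUFFER_DWORDS = 1024  # illustrative size only, not the DLL's buffer size


def fill_buffer() -> list:
    """Return BUFFER_DWORDS fresh 32-bit values from the OS generator."""
    raw = os.urandom(4 * BUFFER_DWORDS)  # stand-in for BCryptGenRandom
    return list(struct.unpack(f"<{BUFFER_DWORDS}I", raw))


class DoubleBufferedRng:
    """Serve numbers from one buffer while a worker refills the other."""

    def __init__(self) -> None:
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._active = fill_buffer()                    # ready before first use
        self._pending = self._pool.submit(fill_buffer)  # filled in the background
        self._index = 0

    def next_dword(self) -> int:
        if self._index == len(self._active):
            # Active buffer exhausted: switch to the spare one and start
            # repopulating a replacement in the background.
            self._active = self._pending.result()
            self._pending = self._pool.submit(fill_buffer)
            self._index = 0
        value = self._active[self._index]
        self._index += 1
        return value

    def close(self) -> None:
        """Wait for the outstanding refill and release the worker thread."""
        self._pending.result()
        self._pool.shutdown()
```

Usage would be along the lines of `rng = DoubleBufferedRng()`, repeated calls to `rng.next_dword()`, then `rng.close()`, which mirrors the initialize / generate / clean-up pattern of the DLL.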

    I recently learned of Thread Pooling, via Rick Kelly, so I wrote CryptoRndII. CryptoRnd, with PowerBASIC, has a Single throughput of about 45MHz. CryptoRndII, with PowerBASIC, has a throughput of about 300MHz. PowerBASIC's RND has a throughput of about 80MHz. To put that into further perspective my CMWC256 comes in at about 142MHz. RND is a PRNG whereas CryptoRndII is a CPRNG. The test machine has an Intel i7-3770K @ 3.50GHz.

    The output is Dword, Single, Double and Range. The Dword output was fed into PractRand and it passed up to one Terabyte of data.

    I may no longer be an active member here, but I have not deserted. I wanted to use CryptoRndII in some of my PowerBASIC apps, but that would have required a rewrite. That would not have been a formidable task, but the port of CryptoRnd to FreeBASIC was not exactly a 'walk in the park', so it was easier to create a dll. As with PowerBASIC, FreeBASIC creates small binaries - CryptoRndII.dll is only 20KB.

    There are two procedures which must be employed before and after random number generation. The first is InitializeCryptoBuffers. Without this the host application will hang on Windows 10 and generate a GPF on earlier Windows. The second is CleanUpCryptoRndII which closes open threads in the pool, the pool itself and the BCryptGenRandom handle.

    At the following link is a zipped folder, CryptoRndII.zip, which includes CryptoRndII.dll and TestBed.bas. TestBed.bas has the declares for all the public procedures and is a usage example as well as a test bed.

    CryptoRndII.zip
    Last edited by David Roberts; 1 Aug 2017, 11:44 AM.

  • #2
    Apologies: TestBed.bas uses my MultipleTimersQPC.inc for timing - I have it in my PBMain template. If you haven't got that then use whatever you use for millisecond timing.


    • #3
      Is your DLL for speed or better security?

      If our computers flunk your RdSeed test, posted below, is our code insecure when using a CSPRNG?
      My Intel machines have all flunked so should I use your code in all cases?
      https://en.wikipedia.org/wiki/CryptGenRandom

      Code:
      Function PBMain
      Local CPUManID As String * 12
      Local CPUManIDPtr As Byte Ptr
      Local RdSeed As Long
      
      ' Get Manufacturer's Id String
      CPUManIDPtr = VarPtr(CPUManID)
      ! mov eax, 0 ' Get Vendor Id
      ! cpuid
      ! mov eax, CPUManIDPtr
      ! mov [eax], ebx
      ! mov [eax + 4], edx
      ! mov [eax + 8], ecx
      If CPUManID = "GenuineIntel" Then
        ' Is RdSeed supported?
        ! mov eax, 7 ' Get extended features
        ! mov ecx, 0
        ! cpuid
        ! test ebx, &h040000 ' Bit 18 of EBX (CPUID leaf 7 reports RdSeed in EBX, not ECX)
        ! jz NoRdSeed
        RdSeed = -1
      NoRdSeed:
      End If
      
      If IsTrue RdSeed Then
        MsgBox "Your CPU supports RdSeed", , "RdSeed test"
      Else
        MsgBox "Your CPU does not support RdSeed", , "RdSeed test"
      End If
      
      End Function
      https://forum.powerbasic.com/forum/u...cussion-csprng
      Last edited by Mike Doty; 1 Aug 2017, 06:42 PM.
      The world is full of apathy, but who cares?


      • #4
        Is your DLL for speed or better security?
        It was not written for either.

        You may recall that I have posted quite a bit on random number generation from using AES, RND2 with John Gleason, Complementary-Multiply-With-Carry CMWC256, xorshift128+ and xoroshiro128+.

        However, they are all PRNGs and should not be used in cryptographic work.

        I then wrote three crypto generators using RtlCryptGenRandom, BCryptGenRandom and Intel's RdRand. These CPRNGs are slow compared with PRNGs. In general, all CPRNGs are slower than PRNGs.

        CryptoRnd was written to give the BCryptGenRandom method a bit more speed. The original speed was a pedestrian 3MHz. When I introduced buffering I managed to get 45MHz. I was pleased with that as we now had a CPRNG that compared favourably with RND speed-wise but could be used in cryptographic work.

        RdSeed was designed by Intel to seed PRNGs. RdSeed has multiplicative prediction resistance and is not intended to be used as a random number generator per se. RdRand has additive prediction resistance and was designed as a random number generator per se.

        I don't have RdSeed on my machine either, so I posted a 256-bit seed generator for PRNGs or encryption purposes, using CBC-MAC AES on BCryptGenRandom output to give "seed-grade entropy". That idea came from Intel, for folk who do not have RdSeed.

        When I learned about Thread Pooling I thought I would give it a whirl to see if I could get CryptoRnd to run a bit faster.

        What I did not expect was to end up with one of the fastest random number generators on my machine and it is a CPRNG!

        With threads we have a create/destroy cycle. With thread pooling we don't and that is where the speed comes in because thread creation does not come cheap. Thread pooling was not designed for what I am using it for, but, hey, I don't care what it was designed for.
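To illustrate the create/destroy point, here is a small sketch in Python. It uses Python's `ThreadPoolExecutor`, not the Windows thread pool API that CryptoRndII uses, and the function names are invented for the example. It only shows that a pool keeps reusing the same few workers whereas the naive approach creates a new thread per task; it does not measure speed.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def worker_name(_task_number: int) -> str:
    """A trivial task: report which thread ran it."""
    return threading.current_thread().name


def run_with_fresh_threads(tasks: int) -> list:
    """Create, run and destroy one thread per task."""
    names = []
    for n in range(tasks):
        t = threading.Thread(target=lambda: names.append(worker_name(n)))
        t.start()
        t.join()  # the thread is finished with after a single task
    return names


def run_with_pool(tasks: int, workers: int = 2) -> list:
    """Run the same tasks on a small pool of long-lived workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(worker_name, range(tasks)))


if __name__ == "__main__":
    print(len(set(run_with_fresh_threads(20))), "distinct threads without a pool")
    print(len(set(run_with_pool(20))), "distinct threads with a pool")
```

With the pool, the thread creation cost is paid once per worker rather than once per buffer refill.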

        So, use CryptoRndII if you require a cryptographic generator. As a bonus you will also get a blindingly fast generator but that was not the original design brief.

        I wrote "one of the fastest random number generators on my machine" because I have one which is faster using PCG. That is coming in at 500MHz with my FreeBASIC implementation and is five times faster than FreeBASIC's built-in Mersenne Twister. However, PCG is another story and is a PRNG so should not be used for cryptographic work.

        Of course, CryptoRndII cannot be used if we want to repeat a sequence. If we could repeat a sequence then it would not be cryptographic.

        Added: BTW, Mike, your link to Wikipedia refers to the generator found in Windows XP and earlier, going back to Windows 95. BCryptGenRandom was introduced in Windows Vista and we should be using that, if we can, as it includes the latest recommendations made by the NIST for generating cryptographic random numbers.
        Last edited by David Roberts; 1 Aug 2017, 09:35 PM.


        • #5
          FreeBASIC uses import libraries. The last update of its import library for BCrypt.dll was before Windows 8, and BCryptGenRandom saw an update then. I don't think that the use of BCryptGenRandom in CryptoRndII has been affected, but to make sure that the version of BCryptGenRandom used is that of the host machine I am now loading the dll myself. TestBed.bas has the throughput for Rnd added. There is an obvious delay waiting for the Rnd figure to show. New versions are at the link in the opening post.


          • #6
            As much as I like the speed of CPRNG generation using CryptoRndII, I believe that I saw a post in another forum advising NOT to use it in cryptographic functions. I'm using BCryptGenRandom for 32-bit DWORD values, and assuming that the WinAPI function IS safe to use for crypto. It generates about 4.5M values per second on a Win10 laptop (Intel i5-4300M @ 2.6GHz). Is this my best current option for Win7 and later OS?


            • #7
              Originally posted by Jerry Wilson
              I believe that I saw a post in another forum advising NOT to use it in cryptographic functions.
              Funnily enough, that may have been me. I wrote this on January 21 this year:

              I am currently reading 'Serious Cryptography: A Practical Introduction to Modern Encryption' by Jean-Philippe Aumasson, which was very recently published. From a quality perspective, CryptoRndII is top drawer and it is very fast. However, even though it is a CPRNG, as opposed to a PRNG, it seems that my implementation has compromised the security aspect and should not be used in cryptographic work. The implementation is about speed and I treated the cryptographic aspect as a bonus. It was not a bonus - the cryptographic aspect went 'out of the window'.
              CryptoRndII uses two 128KB buffers. Any unused numbers in the buffers are not exactly unpredictable. Of course, BCryptGenRandom uses a buffer but the intention is to use the numbers straight away and not leave them sitting there until requested.

              Used as intended BCryptGenRandom is safe for cryptographic purposes and I have not read anything that suggests otherwise.

              Originally posted by Jerry Wilson
              Is this my best current option for Win7 and later OS?
              Quality goes without saying but to make sure nothing untoward crept into CryptoRndII I gave it to PractRand to look at and it got to 1TB with only a few very minor anomalies. I would find a 'clean sheet' suspicious and would expect a few very minor anomalies even with quantum random numbers. For my PowerBASIC work, I don't have anything which gets close to it 'speedwise'.

              I do have a faster generator from pcg-random.org by Melissa O'Neill but that is written in FreeBASIC and I am using techniques which do not port easily into PowerBASIC. However, it is only marginally faster than CryptoRndII.


              • #8
                Update

                A new function has been added: CryptoSE. The SE stands for Single Extended. As with CryptoS a single Dword is used but the output is double precision, retaining the 32-bit granularity. CryptoS, being a classical single, has 24-bit granularity. CryptoD, of course, uses two Dwords for its 53-bit granularity. PowerBASIC's RND also uses a single Dword but outputs extended precision and, I assume, retains the 32-bit granularity.
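The three granularities can be illustrated with a short Python sketch. These are the textbook Dword-to-float mappings, offered as an assumption about what the terms mean; the DLL does this in asm and its exact bit selection is not shown in this thread. The function names are invented for the example.

```python
def to_single_24(dword: int) -> float:
    """24-bit granularity: only the top 24 bits survive, as in a classic Single."""
    return (dword >> 8) / 2**24


def to_double_32(dword: int) -> float:
    """32-bit granularity: a double holds all 32 bits of one Dword exactly."""
    return dword / 2**32


def to_double_53(hi: int, lo: int) -> float:
    """53-bit granularity: 27 bits from one Dword and 26 from another."""
    return ((hi >> 5) * 2**26 + (lo >> 6)) / 2**53


if __name__ == "__main__":
    # The largest possible outputs sit one step below 1.0 in each case.
    print(to_single_24(0xFFFFFFFF))
    print(to_double_32(0xFFFFFFFF))
    print(to_double_53(0xFFFFFFFF, 0xFFFFFFFF))
```

All three map into [0,1); the only difference is how many distinct values are possible, namely 2^24, 2^32 and 2^53.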

                CryptoSE is marginally faster than CryptoS but only by a few percent.

                New versions of CryptoRndII.dll and TestBed.bas are at the opening post's link.


                • #9
                  Update

                  I did not think that the following was possible but I have a few tricks to learn yet over at FreeBASIC.

                  CLASS objects have not been implemented yet in FreeBASIC but it does have Constructors and Destructors. I am using a Constructor to invoke InitializeCryptoBuffers and a Destructor to invoke CleanUpCryptoRndII. These two procedures are no longer exported and are no longer required in our PowerBASIC code.

                  Two new functions have been added: CryptoSX has the same granularity as CryptoSE but the generator's Dwords map into [-1,1) as opposed to [0,1); CryptoDX has the same granularity as CryptoD and that also maps into [-1,1). Of course, CryptoSX, for example, is effectively CryptoSE x 2 - 1 but we remain in the asm domain to keep the speeds up. SX and DX are slower than SE and D but they are still fairly fast when compared with PB's RND.
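As a sketch of the [-1,1) mapping, again in Python and with invented names rather than the DLL's asm: dividing the Dword by 2^31 and subtracting 1 gives the same result as doubling the [0,1) value and subtracting 1.

```python
def to_unit_32(dword: int) -> float:
    """Map a Dword into [0,1) with 32-bit granularity."""
    return dword / 2**32


def to_signed_32(dword: int) -> float:
    """Map a Dword into [-1,1) with the same 32-bit granularity."""
    return dword / 2**31 - 1.0


if __name__ == "__main__":
    for dword in (0, 2**31, 0xFFFFFFFF):
        # Both routes agree exactly because scaling by 2 is exact in binary.
        print(dword, to_signed_32(dword), 2 * to_unit_32(dword) - 1)
```

Note that a Dword of 0 maps to exactly -1 and a Dword of 2^31 maps to 0.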

                  In theory, the averages of SX and DX should be close to vanishing and we are testing that with 10^8 iterations.
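How close to vanishing is reasonable? A rough yardstick, assuming the averages are taken over 10^8 independent uniform values, is the standard error of the mean: (high - low) / sqrt(12 x n). A Python sketch, with an invented function name:

```python
import math


def standard_error_of_mean(low: float, high: float, n: int) -> float:
    """Standard error of the mean of n independent uniform values on [low, high)."""
    return (high - low) / math.sqrt(12 * n)


if __name__ == "__main__":
    n = 10**8  # assumed iteration count
    print("[0,1)  :", standard_error_of_mean(0.0, 1.0, n))   # roughly 2.9E-5
    print("[-1,1) :", standard_error_of_mean(-1.0, 1.0, n))  # roughly 5.8E-5
```

So averages within a few times 10^-5 of 0.5 for [0,1), or of 0 for [-1,1), are what one would expect from a sound generator at that sample size.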

                  The opening post's link has the update and a revised TestBed.bas.

                  Here is a typical TestBed output:
                  Code:
                  Throughput for CryptoS 321 MHz
                  Throughput for CryptoSE 338 MHz
                  Throughput for CryptoSX 284 MHz
                  Throughput for CryptoD 304 MHz
                  Throughput for CryptoDX 279 MHz
                  Throughput for CryptoR 255 MHz
                  Throughput for Rnd 82 MHz
                  
                  CryptoDW  2995288295
                  CryptoS  .8857191
                  CryptoSE  .506710216170177
                  CryptoSX -.754811347927898
                  CryptoD  .360919225287273
                  CryptoDX  3.51928905181831E-2
                  CryptoR  92
                  
                  Average CryptoS  .499994640519217
                  Average CryptoSE  .499999468525118
                  Average CryptoSX  3.27185713875713E-5
                  Average CryptoD  .499985550241817
                  Average CryptoDX -1.74738845383E-5
                  Average CryptoR 127.49351395
                  
                  Done. Press any key


                  • #10
                    STOP PRESS

                    Something is going wrong.

                    Here is CryptoSX. The first 16384 values are -1
                    Code:
                     16380  -1
                     16381  -1
                     16382  -1
                     16383  -1
                     16384  -1
                     16385   .818555422592908
                     16386   .959809088148177
                     16387  -.931993467267603
                     16388   5.66826555877924E-2
                     16389  -.809050042647868
                     16390  -.543150191195309
                    In fact all Crypto* are affected.

                    It looks like an issue with WaitForThreadpoolWorkCallbacks. It seems to behave differently when called within a Constructor but I cannot fathom out why yet. Of course, with 10^8 iterations we would not realize that the first 16KB were wrong. I don't know what 16KB has to do with it either.

                    Attached is PreviousCryptoRndII.zip which is the one where CryptoSE was added. No issues with that.

                    PreviousCryptoRdII.zip


                    • #11
                      The issue with WaitForThreadpoolWorkCallbacks is that I did not actually need it. There are 18 thread pool functions and, as a concept, thread pooling is far more powerful than CryptoRndII needs. CryptoRndII exploits one aspect of pooling, namely (from MSDN): "An application that creates and destroys a large number of threads that each run for a short time. Using the thread pool can reduce the complexity of thread management and the overhead involved in thread creation and destruction."

                      I am now getting random numbers from the first one requested and there are no issues when we switch buffers - it looks seamless even though it is not. However, I will continue testing a variety of scenarios until I reckon all is as it should be.


                      • #12
                        All is not as it should be. Using a Constructor in a dll was a bad move. I have not seen anyone at FreeBASIC do that but that does not put me off trying. The Destructor, on the other hand, is doing as it should and there is now no need to use CleanUpCryptoRndII in our PowerBASIC code.

                        The opening post's link now points to the original idea plus CryptoSE (post #8), CryptoSX and CryptoDX (both in post #9). CleanUpCryptoRndII is no longer exported. Ignore the link in post #10. A new TestBed.bas is included.

                        That is me back out of PowerBASIC for a while. I am playing with the latest FreeBASIC build and the latest gcc backend optimizing compiler.


                        • #13
                          I was just about to refer to this thread in another thread but decided to check it out, links working and so on, before doing so.

                          The OP link is a zipped folder, but it was in a bit of a mess. I am certain that TestBed.bas was in the original zip and not TestBed.exe, which is what I found. TestBed doesn't just test throughputs but also includes all the required declarations. Perhaps I updated TestBed and transferred the exe and not the bas. Anyway, we now have TestBed.bas which uses MultipleTimersLite.inc. I found a bak file for the latter - I must have been on the rum.

                          The dll was recompiled using a later compiler and is now coming in at a barmy 20KB!

                          I intended referring to this thread because CryptoRndII is thread safe: it does not have a state vector and, obviously, does not need to be seeded either.
