How to (simply) benchmark CPU performance

  • How to (simply) benchmark CPU performance

    Hi

    I sometimes get complaints about execution speed from my customers. So I had the idea to add a simple CPU benchmark into my application in order to find out how slow (fast) their processors are.

    My idea: count how many times a loop with a simple calculation can be executed in 0.05 s (short enough to be hardly noticeable). Of course I know that many other factors make a system slow or fast (e.g. disk access), but as a rough measure this would be a nice indicator.

    In order to get a more stable result I decided not to use TIMER, but QueryPerformanceCounter.
    Code:
    ' Requires #INCLUDE "WIN32API.INC" for the QueryPerformance* declarations
    FUNCTION RelativePerformance() AS SINGLE
    #REGISTER NONE
    LOCAL qFreq     AS QUAD
    LOCAL qOverhead AS QUAD
    LOCAL qStart    AS QUAD
    LOCAL qStop     AS QUAD
    LOCAL i         AS LONG
    LOCAL x         AS DOUBLE

      QueryPerformanceFrequency qFreq
      QueryPerformanceCounter   qStart   ' Intel suggestion. First use may be suspect
      QueryPerformanceCounter   qStart   ' So, whack it twice <smile>
      QueryPerformanceCounter   qStop
      qOverhead = qStop - qStart         ' Relatively small, could be neglected for my problem

      QueryPerformanceCounter qStart
      i = 0
      DO
        i = i + 1
        QueryPerformanceCounter qStop
      LOOP UNTIL (((qStop - qStart - qOverhead) * 1000 / qFreq) > 50)   ' 1/20 second
      FUNCTION = i                       ' number of loop passes completed in 50 ms
    END FUNCTION
    Hmm, this will not work at all. The problem seems to be the different possible values for 'qFreq'. It will have a much higher value on old systems than on a modern system, and the calculation will behave completely differently, making the results incomparable.

    Has anyone got an idea how to find the approximate CPU speed without this problem?

  • #2
    > ... I had the idea to add a simple CPU benchmark into my application in order to find out how slow (fast) their processors are
    I think either "my computer" or Start/Control Panel/System will tell you that.

    If you want to do it from your program, the GetSystemInformation() function will give you "processor speed" and a whole lot of other info. But that's hard to use because you need the iShellFolder interface. I thought there was another API to get this but I can't find it right now.

    However......
    > sometimes get complaints about execution speed from my customers.
    Looking at the CPU speed is not going to address those complaints. Assuming they are not complaining about execution being too fast, you'll just have to write faster code.

    MCM
    Last edited by Michael Mattias; 26 Feb 2009, 05:36 PM.
    Michael Mattias
    Tal Systems (retired)
    Port Washington WI USA
    [email protected]
    http://www.talsystems.com



    • #3
      If you can give yourself a second--say just after startup--this is accurate within a few % on my machine.
      Code:
      #COMPILE EXE
      #DIM ALL
      
      FUNCTION PBMAIN () AS LONG
          LOCAL t AS QUAD, t2 AS DOUBLE
          TIX t
          t2 = TIMER
          DO
             IF TIMER - t2 >= 1 THEN EXIT DO
          LOOP
          TIX END t
          ? "Your cpu speed is approx. " & FORMAT$(t, "0,") & " cycles per sec."
      
      END FUNCTION



      • #4
        Btw, if you haven't got PB9/5 you can do it as shown below.
        Code:
        #COMPILE EXE
        #DIM ALL
        
        FUNCTION PBMAIN () AS LONG
            LOCAL t, t3 AS QUAD, t2 AS DOUBLE
        '    tix t
            !dw &h310f
            !mov t,    eax
            !mov t[4], edx
            t2 = TIMER
            DO
               IF TIMER - t2 >= 1 THEN EXIT DO
            LOOP
            !dw &h310f
            !mov t3,    eax
            !mov t3[4], edx
            t = t3 - t
        '    tix end t
            ? "Your cpu speed is approx. " & FORMAT$(t, "0,") & " cycles per sec."
        
        END FUNCTION



        • #5
          One way is to write a complex math routine in a loop and then measure how long it takes to run, using precision timers (not TIMER). The API provides precision timing functions.

          Now benchmark the routine on different CPUs and use the fastest CPU you have as the "base" benchmark.

          Now let's say your base benchmark CPU is a 3 GHz CPU and it runs the code in 50 ms.

          If the code runs in 100 ms on another machine, then estimate (3 GHz) * (1/(100/50))

          or

          (3 GHz) times (1 divided by (measured run time divided by base run time))

          or

          3 x (1/(100/50))
          equals
          3 x (1/2) = 1.5 GHz

          If a CPU takes twice as long as your base CPU to run the code, then it should be half the speed of the base CPU.
          Chris Boss
          Computer Workshop
          Developer of "EZGUI"
          http://cwsof.com
          http://twitter.com/EZGUIProGuy
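
          The scaling rule in this post fits in a few lines. What follows is only a minimal sketch, written in Python rather than PowerBASIC so that it runs anywhere. The name `estimate_ghz` and its parameters are made up for illustration, and it assumes run time is inversely proportional to clock speed, which is a rough approximation at best.

```python
def estimate_ghz(base_ghz, base_ms, measured_ms):
    """Scale the reference clock speed by the ratio of run times."""
    if base_ms <= 0 or measured_ms <= 0:
        raise ValueError("run times must be positive")
    return base_ghz * (base_ms / measured_ms)


# Reference machine: 3 GHz, 50 ms. Machine under test: 100 ms.
print(estimate_ghz(3.0, 50.0, 100.0))  # prints 1.5
```

          The result is an "equivalent speed relative to the base machine", not the real clock rate of the tested CPU.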



          • #6
            Two machines with the same CPU speed may give the same execution speed with a particular application. With another application one machine may beat the pants off the other machine.
            Why? Depending upon what the application is doing it may be because the 'faster' machine has a larger L2 cache, a faster front side bus, faster RAM and so on.

            In general, isolating the CPU speed is a pointless exercise as far as an application's execution speed is concerned.

            > Looking at the CPU speed is not going to address those complaints. Assuming they are not complaining about execution being too fast, you'll just have to write faster code.
            Yep.



            • #7
              @Michael
              To be honest: I did not tell you everything...

              This code is running on systems in remote locations and sending regular reports to me. So I think the ".. 'my computer' or Start/Control Panel/System..." is not a feasible solution.

              As far as I know GetSystemInformation() will report processor type and speed. But it is hard to compare e.g. Pentiums, T7200s and Athlons, as 2 GHz means something different on each of them...

              "...you'll just have to write faster code..." Well said – BIG smile! But if you start a third-party .DLL and it takes 0.1 seconds to load on your own machine and 5.5 on one of your targets, and the manufacturer claims this is impossible... Well, then you have to prove to him that he is wrong... And understanding the customer's CPU speed is just one small part of that process.

              @John Gleason
              John, can you please explain to me the magic behind the TIX and/or assembler statements? I searched through my PB help and the forums, but found only calls to TIX...
              Will this count processor cycles? I think this will lead to the same problem as with GetSystemInformation(): an old Pentium 4 running at 3 GHz will outnumber an Athlon running at 2 GHz...


              I just found that you need to be very careful what you test!
              I added a multiplication and a counter (please excuse the ugly coding...) to the LOOP and ran this on three different machines. See the results in the attached file... -> The Pentium D that outnumbered my other processors becomes 50 times slower when using floating-point calculations...

              Hmm, with this I might have found a reason why this DLL loads so much slower...


              Code:
              #COMPILE  EXE
              #REGISTER NONE
              #DIM      ALL
              #TOOLS    OFF

              '%DebugView = 0 ' MyTrace produces no output for DebugView
              %DebugView = 1  ' MyTrace produces output for DebugView

              #INCLUDE "WIN32API.INC"

              SUB MyTrace (strTxt AS STRING)
                TRACE PRINT strTxt
                #IF %DebugView = 1
                  OutputDebugString BYCOPY("LMX-" & strTxt)
                #ENDIF
              END SUB

              FUNCTION PBMAIN() AS LONG
              #REGISTER NONE
              ' Note: x and y are LONG here, so 1.000001 is stored as 1 and x * y is an
              ' integer multiplication. Declare both AS DOUBLE for the floating-point run.
              DIM x AS LONG
              DIM y AS LONG
              LOCAL i AS LONG
              LOCAL t, t3 AS QUAD, t2, t4 AS DOUBLE
              LOCAL ix AS INTEGER
                FOR ix = 1 TO 100
              '    tix t
                  !dw &h310f
                  !mov t,    eax
                  !mov t[4], edx
                  x = 1
                  y = 1.000001
                  t4 = TIMER
                  i = 0
                  DO
                    ' Try to make sure that the timer tick has just changed value
                    ' before benchmarking starts.
                    ' Improves precision, but is not 100%: there are changes much
                    ' smaller than a whole tick! Grrrr!
                    t2 = TIMER
                  LOOP UNTIL t2 <> t4
                  t2 = TIMER
                  DO
                    IF TIMER - t2 >= 0.1 THEN EXIT DO
                    x = x * y
                    i = i + 1
                  LOOP
              '    tix end t
                  !dw &h310f
                  !mov t3,    eax
                  !mov t3[4], edx
                  t = t3 - t
              '    ? "Your cpu speed is approx." & FORMAT$(t/1000000000, "0,.00") & " cycles per sec."  & FORMAT$(t, "0,.00")
                  MyTrace "Loop passes in 0.1 sec: " & FORMAT$(i, "0,") & "  TSC ticks: " & FORMAT$(t, "0,")
                NEXT ix
              END FUNCTION



              • #8
                Walter, going back to your first post:
                > The problem seems to be different possible values for 'qFreq'
                QueryPerformanceCounter gives us the value of the counter at the time of the query, the counter having started close to system start. A subsequent query will then see a larger count. Comparing two machines' counts in a given interval will be problematic if the frequencies are different. Dividing each interval count by the machine's respective qFreq eliminates that problem, since cycles/(cycles/second) => seconds. You should not have a problem, then, using QPF.

                qFreq may vary between Windows sessions, but Microsoft guarantees that it will not change during a Windows session. It follows that cycle counts per given interval may vary between Windows sessions, but this too is eliminated by dividing by qFreq.

                With multiple core processors each core has its own counter and they are not guaranteed to be in sync. For a sufficiently small interval it is possible to get a negative cycle count. In this case we should use SetProcessAffinityMask. Added: In your case a better approach would be to test in a separate thread and use SetThreadAffinityMask.

                TIX, by the way, reads the Time Stamp Counter. On my machine the QPC and the TSC have the same frequency. With multiple core processors each core has its own TSC and, as with QPC, they are not guaranteed to be in sync.
                Last edited by David Roberts; 27 Feb 2009, 07:12 AM. Reason: SetThreadAffinityMask
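
                A small worked example of the normalisation described in this post: dividing a tick interval by the counter frequency gives a time that is comparable across machines. This is an illustrative Python sketch rather than PowerBASIC, so that it runs anywhere. `ticks_to_ms` and the two counter frequencies are made-up values, not measurements from this thread.

```python
def ticks_to_ms(start_ticks, stop_ticks, ticks_per_second):
    """Convert a counter interval to milliseconds: ticks / (ticks/s) = s."""
    return (stop_ticks - start_ticks) * 1000.0 / ticks_per_second


# Two hypothetical machines timing the same 50 ms interval.
slow_counter = ticks_to_ms(0, 50_000, 1_000_000)             # 1 MHz counter
fast_counter = ticks_to_ms(0, 150_000_000, 3_000_000_000)    # 3 GHz counter
print(slow_counter, fast_counter)  # prints 50.0 50.0
```

                The raw tick counts differ by a factor of 3000, yet both convert to the same 50 ms, which is why a loop's exit condition should be written in time units, not in ticks.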



                • #9
                  Slightly related, a lesson to remember:

                  Wikipedia - Turbo Pascal - Issue with CRT unit on fast processors

                  Bye!
                  -- The universe tends toward maximum irony. Don't push it.

                  File Extension Seeker - Metasearch engine for file extensions / file types
                  Online TrID file identifier | TrIDLib - Identify thousands of file formats



                  • #10
                    Walter, I see now what you're trying to do. My code above was simply attempting to find the clock speed of the tested computer's processor. The asm code I posted reads the time stamp counter of the cpu and tells you how many clock ticks have passed between readings by using two QUAD t[x] variables. As Dave mentioned, TIX does that too and puts the result in a single QUAD if you have PB9/5.
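
                    For readers without an assembler background, the arithmetic behind the two readings might look like this. It is a rough Python sketch (not PowerBASIC) under the assumption that the counter arrives as two 32-bit halves, the low half in EAX and the high half in EDX, which is how the posted asm stores them into `t` and `t[4]`. The function names and sample values are invented for illustration.

```python
def combine_edx_eax(eax, edx):
    """Build the 64-bit counter value from its low (EAX) and high (EDX) halves."""
    return ((edx & 0xFFFFFFFF) << 32) | (eax & 0xFFFFFFFF)


def elapsed_ticks(first, second):
    """Ticks between two readings, tolerant of 64-bit wrap-around."""
    return (second - first) & 0xFFFFFFFFFFFFFFFF


# The low half overflows between the two readings and carries into the high half.
start = combine_edx_eax(0xFFFFFFF0, 0x00000001)
stop = combine_edx_eax(0x00000010, 0x00000002)
print(elapsed_ticks(start, stop))  # prints 32
```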



                    • #11
                      >To be honest: I did not told you all...

                      Trust me, you're not the first.

                      But I still don't see why you are spending time trying to determine the CPU speed. If you have a remote user with speed complaints, you can always walk him thru getting that number from My Computer or Control Panel or wherever that is.

                      But where does that get you?

                      "Application runs slow on 400 MHz CPU." And? What are you going to do about it?

                      Without seeing the application I can't say that "upgrade to faster CPU" won't solve this problem: the user may be CPU bound, true; but he might be I-O bound, he might be RAM-bound (too little installed RAM forcing numerous swap-in/swap-out), he might be network-bound, he might be running the world's ugliest and most intrusive anti-virus software.....

                      So you can upgrade the CPU. However, if you go this route I'd suggest you have ready an explanation for your boss or client why that did NOT solve the problem.

                      MCM
                      Michael Mattias
                      Tal Systems (retired)
                      Port Washington WI USA
                      [email protected]
                      http://www.talsystems.com



                      • #12
                        Walter, PowerBASIC does include some tools to benchmark the execution speed of your code. Have a look at the following in the docs:

                        #TOOLS
                        PROFILE
                        TRACE
                        #OPTIMIZE
                        Bernard Ertl
                        InterPlan Systems



                        • #13
                          > But if you start a third-party .DLL and it takes on your own machine 0.1 seconds to load and on one of your targets 5.5 and the manufacturer claims this is impossible
                          FWIW, slowness loading a DLL dynamically sounds more like a memory-shortage problem (RAM or disk) than a CPU-speed problem.

                          Pneumonia and the common cold exhibit many of the same symptoms; however, the proper treatments are very different.
                          Michael Mattias
                          Tal Systems (retired)
                          Port Washington WI USA
                          [email protected]
                          http://www.talsystems.com



                          • #14
                            >PowerBASIC does include some tools

                            Well, doctor, yes it does, but I think we should also run this test...

                            Add Process Memory Usage Report to any program 1-12-04

                            If we get lots of swapping (page faults) then that points to a shortage-of-memory problem.

                            I'll bill your insurance carrier direct for the consultation.

                            MCM
                            Michael Mattias
                            Tal Systems (retired)
                            Port Washington WI USA
                            [email protected]
                            http://www.talsystems.com



                            • #15
                              Fwiw, here's a one-second floating-point benchmark using a few operators that seems to give consistent results:
                              Code:
                              #COMPILE EXE
                              #DIM ALL
                              
                              FUNCTION PBMAIN () AS LONG
                                  LOCAL t2 AS DOUBLE, f1, f2, f3 AS EXT, q AS QUAD
                                  t2 = TIMER
                                  f1 = .1234
                                  f2 = .4567
                              '   process set priority %REALTIME_PRIORITY_CLASS  'PB9/5
                                  DO
                                     f3 = f1 + f2
                                     f3 = f1 * f2
                                     f3 = f1 / f2
                                     f3 = f1 \ f2
                                     f3 = f1 MOD f2
                                     f3 = f1 - f2
                                     f1 = f1 + .0001
                                     f2 = f2 + .0001
                                     INCR q
                                     IF TIMER - t2 >= 1 THEN EXIT DO
                                  LOOP
                              '   PROCESS SET priority %normal_PRIORITY_CLASS
                                  ? "Your cpu ran the loop " & FORMAT$(q, "0,") & " times in one sec."
                              
                              END FUNCTION
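
                              The same "count loop passes in a fixed interval" idea, as a rough sketch in Python rather than PowerBASIC. The function name `count_iterations` is made up, and the workload is only loosely modelled on the loop above (the integer division and MOD lines are left out). Interpreter overhead dominates here, so the absolute counts say little about the CPU; only runs of the same script on different machines can be compared with each other.

```python
import time


def count_iterations(duration_s=1.0):
    """Run a small floating-point workload until duration_s has elapsed."""
    f1, f2 = 0.1234, 0.4567
    count = 0
    deadline = time.perf_counter() + duration_s
    while time.perf_counter() < deadline:
        f3 = f1 + f2
        f3 = f1 * f2
        f3 = f1 / f2
        f3 = f1 - f2
        f1 += 0.0001
        f2 += 0.0001
        count += 1
    return count


if __name__ == "__main__":
    print("The loop ran", count_iterations(1.0), "times in one second")
```

                              As in the original, the timer is read on every pass, so part of what gets measured is the cost of the timer call itself.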



                              • #16
                                Seems to me that the "speed" perceived by the user encompasses everything that the OS is doing. Perhaps it is also observed by the user to include slowdown in printing, mouse movement, screen repainting, key-to-screen update response time, etc.

                                As alluded to in earlier posts, the user's observations would include (in addition to RAM, cache, swapfile, etc.) ALL the running processes. These will differ from one machine to another, from one set of tasks to another, and from one time to another.

                                I don't see how, as shown above, testing one program's timing would:
                                1. produce meaningful (useable) results
                                2. enable you as the programmer to make relevant changes to your code.

                                Rather than focusing on the program code's speed, I would think it made more sense to run a monitor program that begins by recording the number and names of running processes and their utilization of memory, cache, and CPU, then performs a select series of tasks that do such things as load a DLL, read and write from disk, etc, and observes the changes to the memory, cache, and CPU resources used. This would create a relevant body of factual information, which would serve as the basis for analysis and (perhaps) conclusions. Once the actual cause for the slow response has been quantified, potential solutions could be suggested.

                                I would think that only under such a controlled test of the entire system would there be any value in a time/speed number. At the very least, you'd be more nearly measuring the things that the user was experiencing, which:
                                1. addresses their concern
                                2. lets them know you take their concern seriously, even if you can't improve their experience in your code
                                3. might help them justify upgrading to a faster platform...

                                What's that quote about knowing what you can change, what you can't, and being able to tell the difference? I think that applies here.
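
                                One small piece of such a monitor can be sketched quickly: timing each test task by wall clock and by CPU time, so that a task that mostly waits (disk, network, swapping) can be told apart from one that really computes. This is a Python sketch, not PowerBASIC, and far from the full monitor described above. `measure`, `cpu_task` and `waiting_task` are hypothetical names, and the sleep merely stands in for an I/O wait.

```python
import time


def measure(task):
    """Return (wall_seconds, cpu_seconds) for one call of task()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return wall_used, cpu_used


def cpu_task():
    sum(i * i for i in range(200_000))   # pure computation


def waiting_task():
    time.sleep(0.1)                      # stands in for disk or network waits


for name, task in (("cpu", cpu_task), ("waiting", waiting_task)):
    wall, cpu = measure(task)
    print(f"{name}: wall {wall:.3f} s, cpu {cpu:.3f} s")
```

                                A large gap between wall time and CPU time suggests the task is waiting on something other than the processor, which is exactly the case where a faster CPU would not help.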



                                • #17
                                  @John (and all others)

                                  In my case I have the advantage that our clients' systems do not run the usual junk of end-user programs. There is ONE main application, a remote program and the usual OS stuff (including network, virus scanners, ...). But I'm aware that even in such 'simple' environments big differences are possible!

                                  After having tried to find out about CPU speed and getting such different results, I tend to accept that measuring CPU speed is insufficient to predict system performance!

                                  While I think John's idea sounds good, I also think such a tool becomes quite complicated, and it will be very hard to produce reliable results. Just start Task Manager on a system and watch how CPU usage changes over time. And Task Manager does not even show everything! With modern (multi-core) processors this becomes even more complicated...

                                  I think we will go back to basics: I measure how long it takes to load this .DLL (0.1-0.2 s on my machine, 3 s or even 5.5 s at some customers). If it takes a lot longer (over 0.5 s) on the customer's machine, we will recommend setting up a new machine (they all have the resources for that...). We will install our software and show them the difference.
                                  We are in the lucky situation that they get this .DLL directly from the manufacturer, so they will not blame us...

                                  Thanks to all that tried to help and gave input on finding a solution.

                                  Walter



                                  • #18
                                    Computer speed test

                                    There is a really good speed test on the Internet and it is free.
                                    I have been using it for a long time to test the speed of the gaming machines I build.
                                    It will test everything and give you a rating, with a comparison against similar computers.

                                    http://www.pcpitstop.com
                                    Old QB45 Programmer



                                    • #19
                                      > I think we will go back to the basics: I do measure how long it takes to load this .DLL (on my machine 0.1-0.2s, some customers 3s, even 5.5 seconds..). If it takes a lot longer (over 0.5s) on the customers machine, we will recommend to set up a new machine (they all have the resources for that...). We will install our software and show them the difference.

                                      > We are in the lucky situation, that they do get this .DLL directly from the manufacturer, so they will not blame us...
                                      This is the kind of "consulting service" I have spent twenty-five plus years trying to eliminate from the information technology industry.

                                      First of all, as I pointed out earlier, slowness in the load may be related to something not specific to the hardware. But you did not test that, did you?

                                      Second of all, as I pointed out earlier, slowness in the load may be related to a shortage of memory, not a shortage of CPU speed. But you did not test that, did you?

                                      Third, as I pointed out earlier, slowness in the load/application may be related to the way your code is written. But you did not have that tested, either, did you?

                                      For you to recommend a new computer without making these tests is totally and absolutely unprofessional.

                                      And for you to justify taking these actions by noting you are "lucky" enough that you can blame failure on a "third party" if your so-called "recommendation" does not work out just boggles my mind.

                                      It's inexcusable.
                                      Michael Mattias
                                      Tal Systems (retired)
                                      Port Washington WI USA
                                      [email protected]
                                      http://www.talsystems.com

