Threads and Performance (there are limits)


  • Threads and Performance (there are limits)

    At times I have seen posts from people using multiple threads in their apps, sometimes dozens of them.

    I have commented on this before saying that one must be very careful to not use too many threads. Some may take this with a "grain of salt", but I finally found this warning in writing in the API docs which verifies this is a concern.

    Read the short topic on MSDN API docs:
    http://msdn.microsoft.com/en-us/libr...59(VS.85).aspx

    Notice the sentence:

    However, you must use caution when using multiple threads in an application, because system performance can decrease if there are too many threads.
    A simple statement, but worth remembering.

    So before using dozens of threads in your apps thinking "if one thread is good" then "many threads must be great", reconsider your approach.

    Now if the use of too many threads can slow performance, it is also reasonable to conclude that too many unnecessary context switches between threads can degrade performance as well. This is why one book I use as a reference on threads suggests keeping all GUI calls in the main thread (that means all DDT GUI commands and things like SendMessage calls). SendMessage from a thread forces a context switch to the thread (or primary thread) which created the window.

    So if you want the best performance possible, keep worker threads to a minimum.
    Chris Boss
    Computer Workshop
    Developer of "EZGUI"
    http://cwsof.com
    http://twitter.com/EZGUIProGuy

  • #2
    Chris,
    So if you want the best performance possible, keep worker threads to a minimum.
    That's no better than saying "More threads improves performance". The minimum number of threads to get the job done might not give the best performance.
    As with most things, it depends on the situation. Sometimes more threads are a good idea, sometimes they aren't.

    Paul.


    • #3
      The key is to determine the minimum number of threads needed to accomplish the task.

      The reason I bring this up is that I have seen some post code examples with dozens of threads with little thought of the consequences of using so many threads.

      The excellent book "Multithreading Applications in Win32, the complete guide to Threads" (authored by Jim Beveridge and Robert Wiener) discusses this quite well, including the easy mistake of using too many threads when none or just a few would suffice. At first glance some solutions make threads seem inviting, but one can create worse problems by using threads, or too many threads.

      Just because threads are easy to create with PB commands does not mean there are no consequences or serious considerations.

      I am not against the use of threads, but simply want to help new PB'ers understand the consequences of using them.

      Threads are very useful for, say, polling devices in the background (e.g. a serial port) and things like that. But often I see code where threads are manipulating the GUI (Forms/Controls). If the use of too many threads can degrade performance, using threads to manipulate the User Interface via DDT or API commands/functions (e.g. SendMessage) can cause even greater performance problems.

      The book I mentioned discusses this and "strongly" recommends doing all GUI stuff from the primary thread. You can do calculations and stuff for the GUI in a worker thread, but the primary thread should make all the GUI calls. Now of course if you create a window in a thread and provide its own message loop for that thread, you can manipulate that window from the thread with little performance loss, since they are in the same thread. But most code I see is not creating custom window classes in their own threads. It creates the GUI in the primary thread and then uses a thread to modify the GUI.

      Knowing that even Microsoft understands that too many threads can decrease performance helps one realize that threads should be used judiciously.

      So for new PB'ers, excited about using threads, it is a valuable principle to follow:

      "Use as few worker threads as possible"

      "If it can be done without a thread, then don't use a thread thinking it will improve performance"

      "Keep GUI stuff in the primary thread and use worker threads for non-GUI tasks"

      Some may disagree with this recommendation, but I think there is good evidence here that they are good principles to follow.
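
      As a rough illustration of the third principle, here is a minimal Python sketch (not PB code, and with no real window involved). The worker thread only calculates and hands its results over through a queue; the primary thread is the only one that touches the "GUI", which here is just a list standing in for a control. The names worker, update_gui and run_demo are made up for this example.

```python
import queue
import threading

def worker(jobs, results):
    """Worker thread: does the calculation only, never touches the GUI."""
    for n in jobs:
        results.put((n, n * n))   # hand the result to the primary thread
    results.put(None)             # sentinel: no more results

def update_gui(display, item):
    """Stand-in for a GUI call; only ever called from the primary thread."""
    display.append("%d squared is %d" % item)

def run_demo(jobs):
    results = queue.Queue()
    display = []                  # hypothetical "control" owned by the primary thread
    t = threading.Thread(target=worker, args=(jobs, results))
    t.start()
    while True:                   # primary thread: the only place GUI calls happen
        item = results.get()
        if item is None:
            break
        update_gui(display, item)
    t.join()
    return display
```

      In a real GUI program the primary thread would not sit in a blocking get(); it would check the queue from its message loop (for example on a timer or a posted message) so the window stays responsive.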
      Chris Boss
      Computer Workshop
      Developer of "EZGUI"
      http://cwsof.com
      http://twitter.com/EZGUIProGuy


      • #4
        Chris,
        Some may disagree
        I do!

        The key is to determine the minimum number of threads needed to accomplish the task.
        The ideal is to determine the number of threads that will accomplish the task effectively, taking into account the requirements of the user and the resources available. The minimum number is not a good target since almost any task can be accomplished with one thread, though not necessarily efficiently.

        Waiting for I/O devices such as serial ports, USB ports, mouse, keyboards and hard disks is an obvious use for a thread which will increase overall performance of a program.

        But multi cored CPUs open up lots of possibilities for breaking up programs into threads and gaining huge increases in performance. Many CPU intensive programs only use a fraction of the CPU power on a multi core computer. A single thread will only use a single core, leaving the other cores idle. Splitting the task into 2 threads or more could allow the work to be spread across all cores and double, triple or octuple the performance (octuple? The Intel i7-870 can run 8 threads at once.)
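
        A minimal sketch of that kind of split, in Python rather than PB and under stated assumptions: the work is a sum of squares chosen only for illustration, and split_range, sum_squares and parallel_sum_squares are hypothetical names. It shows the structure (one chunk per core, partial results combined at the end). Note that in standard CPython, pure-Python CPU work in threads does not actually run on several cores at once because of the interpreter lock, so this shows the shape of the technique, not the speedup.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks of near-equal size."""
    base, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks

def sum_squares(chunk):
    """The per-thread work: sum of squares over one chunk."""
    start, stop = chunk
    return sum(i * i for i in range(start, stop))

def parallel_sum_squares(n, threads=None):
    """Give each thread one chunk, then combine the partial sums."""
    threads = threads or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return sum(pool.map(sum_squares, split_range(n, threads)))
```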


        For non-intensive tasks, the overhead of switching between threads is only a few microseconds, so as long as you aren't trying to switch threads 10,000 times a second, you'll probably not notice any problem.

        system performance can decrease if there are too many threads
        For what it's worth, I would think that "System performance" in this case means the responsiveness of the system.
        If you have the typical 15ms time slice and open 20 threads of equal priority then the thread taking user input will only get seen for one in 20 time slices, about every 0.3s, which will make the system appear sluggish. The solution then is to reduce the priority of the busy threads so the user input thread gets preference.
        System performance is also affected by the extra stack used by each thread. Each thread is given its own stack of 1MB, so 500 threads will consume 500MB of RAM. That could slow you down a bit but, as most threads don't need that much stack space, it can be reduced to 128k, which is less of a problem as 500 threads would then consume a negligible 64MB of RAM.
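
        The arithmetic behind those two estimates, as a small Python sketch. It assumes simple round-robin scheduling of equal-priority busy threads on one core and takes 1MB as 1024KB; the function names are made up for this example.

```python
def input_latency_seconds(threads, time_slice_ms=15):
    """Rough wait before one of `threads` equal-priority busy threads runs again,
    assuming plain round-robin scheduling on a single core."""
    return threads * time_slice_ms / 1000.0

def total_stack_mb(threads, stack_kb=1024):
    """Total stack space set aside for `threads` threads, in MB (1MB = 1024KB)."""
    return threads * stack_kb / 1024.0

print(input_latency_seconds(20))   # 0.3  (seconds)
print(total_stack_mb(500))         # 500.0 (MB with the default 1MB stack)
print(total_stack_mb(500, 128))    # 62.5  (MB with a 128KB stack)
```

        The last figure comes out at 62.5MB, which is the "roughly 64MB" quoted above.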

        Paul.


        • #5
          I suspect this is poking a stick at something I posted the other week and one of the questions I raised in that thread was how to spawn them in, effectively, a simple queue to avoid the context switching. It turns out to be true a multitude of threads wasn't the best solution (in that case) and I now spawn (number of CPU)+1, the +1 being the control thread.
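
          For anyone wanting to picture that arrangement, here is a hedged Python sketch (not the PB code from that thread, which is not shown here). A fixed number of workers, one per CPU by default, pull tasks from a simple queue while the calling thread acts as the "+1" control thread. run_pool and its parameters are hypothetical names, and in standard CPython pure-Python CPU work will not actually spread across cores, so take it as a sketch of the queue structure only.

```python
import os
import queue
import threading

def run_pool(tasks, work, workers=None):
    """Run work(task) for every task using a fixed number of worker threads."""
    if workers is None:
        workers = os.cpu_count() or 1   # one worker per CPU; the caller is the "+1"
    todo = queue.Queue()
    for index, task in enumerate(tasks):
        todo.put((index, task))
    results = [None] * todo.qsize()     # one result slot per queued task

    def worker():
        while True:
            try:
                index, task = todo.get_nowait()
            except queue.Empty:
                return                  # queue drained: this worker is done
            results[index] = work(task)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:                   # the control thread just waits here
        t.join()
    return results
```

          Because each worker takes the next task only when it has finished the previous one, there are never more runnable threads than the chosen worker count, however many tasks are queued.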

          I'm working on solving NP complete problems and they take a lot of processing but even the simple CPU+1 solution doesn't consider what else the computer may be required to do at the same time so then we have to start messing with thread priorities to leave some "room" for Outlook or Word or Firefox.

          As with all things programming, each case will be different and not all answers will be wrong. The beauty of PB is it makes it easy to try different solutions. Ultimately, the best result will come from the most accurate (bug-free) code and the best algorithm. A poor algorithm won't be made any better just because it's running across multiple cores/CPUs/threads.
          Neil Croft (cissp)


          • #6
            One thing many don't take into consideration is that we tend to "test" our software on our own PC's with few programs running at the same time. I know when I am in a "deep" coding session, I turn off my anti-virus software and don't run any more programs than absolutely necessary.

            Even then when I am programming, I likely have at least 3 or 4 programs running (PB, the program in development, MS SDK docs, Windows Explorer).

            Now when I am not programming, I at times have a good number of programs running at the same time. Internet Explorer is a resource hog.

            The point is, when our software is being used by end users, they may have a good number of applications running at the same time.

            Now add to this that, with Windows XP and Vista, quite a lot of services are running in the background. Personally I turn off as many unneeded services as possible.

            If all software developers put twice as many threads in their apps as is really necessary, then multiplied across the number of applications and services running, it could produce a significant slowdown in system performance.

            I have another PC, with Windows 95. I used to have 95 on about a 200 MHz system. I upgraded it to a 500 (or 600) MHz system. Windows 95 uses a lot less stuff in the background and it runs much faster and is more responsive (to the user) than my Windows XP system with a 2.4 GHz CPU. Each new iteration of Windows takes a step backwards by using more and more resources/services. As CPU's get faster, the end user result is not faster, but often slower, because the "bog" of the "newer" OS slows things down.

            Software should be developed to be as optimized as possible for speed (and memory) despite the great increases of the hardware side of it.

            This is one reason I like PowerBasic so much, because I can write "lean and mean" (small and fast) applications. Much like Powerbasic, I am one of those who "counts" cycles (speed) and times the software I write. I actually get out my stop watch and time stuff.

            A proper understanding of how threads impact software and how to utilize them properly for optimal speed is important to me. I just wanted to share a little "commentary" on the subject for the many newbies who visit the forum.
            Chris Boss
            Computer Workshop
            Developer of "EZGUI"
            http://cwsof.com
            http://twitter.com/EZGUIProGuy


            • #7
              Threads are very useful for say polling devices in the background (ie. serial port) and things like that. But often I see code where Threads are manipulating the GUI (Forms/Controls).
              Polling is a bad idea in my book (but who is to say if something happened if you do NOT look for it????)
              Coming from a Serial Port (or some sort of device signaling me that something happened) I can TOTALLY agree that threads are sometimes the way to go.

              Other people have "Some Event" and then do something...OK I will byte...how do you know the event happened when you did not look for it??? (aka, Events themselves must be some sort of polling, because if I were deaf or mute, I would not know to turn around to pay attention to something unless I was tapped on the shoulder to turn around and look)

              Too many looks..."Yep that would make me look jumpy, and people wonder why I am looking around all the time"

              My realm of programming usually contains some sort of attached device (so I am CONSTANTLY wondering...."Did something happen? should I react to it?")
              Most programs are internal to their own box...(computer), but what if you have to talk to 2 or more boxes? (simple example...Computer = 1 box, Port = 1 box, attached device = yet another box) and then link the 3 to my program (that could be considered a rosetta stone to interpret the 3 and "Just make it work")
              Engineer's Motto: If it aint broke take it apart and fix it

              "If at 1st you don't succeed... call it version 1.0"

              "Half of Programming is coding"....."The other 90% is DEBUGGING"

              "Document my code????" .... "WHYYY??? do you think they call it CODE? "


              • #8
                I use multiple threads on most of my network scanning type tools. The number of threads can range anywhere from 5 to 100 depending on the application. Many things can affect thread performance (speed of the CPU, other applications running, OS version, etc.), so I make the max number of threads adjustable by the user, because one PC might see high CPU usage at 25 threads while another, newer PC might barely touch the CPU. So the key is not hard coding the max number of threads, but letting the user adjust it to fit each PC (as long as your program documentation explains the pros/cons and the suggested max range. My programs will also give an extra warning message if someone picks too high a number.)

                As for "keep the worker threads to a minimum", on applications like network scanners, that would not make any sense. Because the minimum would be using only one thread to ping each IP address, but since each ping can take a couple seconds to timeout using only one thread would take forever, and since waiting on the network to respond does not use the local CPU, I can have more than 50 threads pinging at the same time and barely see any CPU usage. I have other applications that the threads are actively processing and if I adjust them past 10 threads I see diminished overall speed on the application.
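
                To make the distinction concrete, here is a small Python sketch under stated assumptions. It is not my scanner code: fake_ping just sleeps to stand in for a ping waiting on the network, SUGGESTED_MAX is an arbitrary limit picked for the example, and check_threads and scan are hypothetical names. It shows a user-chosen thread count with a warning above the suggested range, and threads that spend their time waiting rather than using the CPU.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SUGGESTED_MAX = 100   # hypothetical upper end of the documented range

def check_threads(requested, suggested_max=SUGGESTED_MAX):
    """Return (threads_to_use, warning_or_None) for a user-chosen thread count."""
    if requested < 1:
        return 1, "thread count raised to 1"
    if requested > suggested_max:
        return requested, "more than %d threads may slow this PC down" % suggested_max
    return requested, None

def fake_ping(address, timeout=0.05):
    """Stand-in for a ping that gets no reply: waits out the timeout using no CPU."""
    time.sleep(timeout)
    return address, False

def scan(addresses, threads):
    """Ping every address using the user's chosen number of threads."""
    threads, warning = check_threads(threads)
    if warning:
        print("Warning:", warning)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fake_ping, addresses))
```

                With 20 addresses and 20 threads the whole scan takes about one timeout, where a single thread would need 20 timeouts back to back.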

                So I must agree with Paul. It is highly dependent on each application.

                Chris, I have one application that I default to 50 threads (user adjustable) and I have run it at over 200 with very little CPU usage and no diminished overall speed. So when you make it sound like "dozens" is way too many, I have to say it depends much on what the threads are doing. If they are waiting for outside responses like a network ping, you can get away with more than if they are all actively processing.
                "I haven't lost my mind... its backed up on tape... I think??" :D


                • #9
                  Chris,
                  What is the copyright date on the book you are quoting?

                  James


                  • #10
                    1998 was the latest version of the book.

                    It covers Windows NT so the basic principles have not changed with Windows XP/Vista.

                    Of course CPUs have changed and improved multithreading, including multiple CPU's.

                    I am sure a lot depends upon how much you do in Threads. If one doesn't do a lot in a thread and it calls Sleep (pass control to other threads), the CPU usage may not be high.

                    If one writes very task intensive code in threads, then many threads could cause some bottlenecks in speed.

                    Regardless though, the rule of using the minimum number of threads necessary for the task, will always produce the fastest executing code.

                    Now what is necessary could be 10 or 20 threads. It all depends upon what the tasks are. Yet, if one could accomplish the same thing with half as many threads (no matter how many) then you will get better performance.
                    Chris Boss
                    Computer Workshop
                    Developer of "EZGUI"
                    http://cwsof.com
                    http://twitter.com/EZGUIProGuy


                    • #11
                      Originally posted by Chris Boss
                      1998 was the latest version of the book.

                      It covers Windows NT so the basic principles have not changed with Windows XP/Vista.

                      Of course CPUs have changed and improved multithreading, including multiple CPU's.

                      I am sure a lot depends upon how much you do in Threads. If one doesn't do a lot in a thread and it calls Sleep (pass control to other threads), the CPU usage may not be high.

                      If one writes very task intensive code in threads, then many threads could cause some bottlenecks in speed.

                      Regardless though, the rule of using the minimum number of threads necessary for the task, will always produce the fastest executing code.

                      Now what is necessary could be 10 or 20 threads. It all depends upon what the tasks are. Yet, if one could accomplish the same thing with half as many threads (no matter how many) then you will get better performance.
                      While it might give good general guidelines it is way too dated to have much credibility with today's architecture.
                      If you are still writing for Win95 and P3's maybe...
                      I think you need to find a more current publication to support your thread ideas. And yes I saw the MSDN one liner, which appears to be just more generalizations.

                      If I'm writing for a QUAD core on VISTA64 I doubt very much I can overload the system unless I go completely overboard.
                      As others have said YMMV.

                      James


                      • #12
                        Originally posted by jcfuller
                        While it might give good general guidelines it is way too dated to have much credibility with today's architecture.
                        So what in the architecture has changed?

                        And don't go quoting chip generations and OS versions - they're minor improvements in implementation, not architectural changes.

                        Architecturally, nothing much has changed. The last major jump in the OS portion of the architecture was the change from Win9x to WinNT. (Which has better scheduling of threads, better resource protection across threads, and is generally better in every conceivable way for multi-threaded programming.)

                        The last major jump in the architecture of the CPU was arguably the Pentium Pro/Pentium II. (Although the Pentium 4 tried to make some minor changes, and failed miserably.)

                        Windows 7 won't change the way Windows handles threads. Windows 7 on a Core 2 Duo will likely take a similar number of cycles to switch thread contexts per pipeline that a Pentium II did. The penalties are pretty much the same, too - especially in scale.

                        For example, access to system RAM from the CPU may be faster by absolute measurement - but by scale (of clock cycles per access to L1 cache, L2 cache or RAM) access times follow a fairly consistent trend whether it's a 64MB Pentium II machine or an 8GB Core 2 Duo.

                        The only difference is that overall the newer machine is faster, and has more USABLE CPU pipelines.

                        (Hyperthreading, introduced with the P4, is the aberration in that trend. Each hyperthreaded core is not a real core, has huge context switching costs, and they combine to deliver very little in the way of a performance increase.)

                        From 1998 to 2009, the difference architecturally is minimal. The only difference practically - that is, in implementation - is that in 1998, a machine with multiple cores also meant a machine with multiple CPUs.
                        Now we're packing multiple cores into one CPU, which makes them much more cost-effective and therefore much more affordable.

                        But when it comes to how they behave, you are confusing implementation of the architecture with the architecture itself.

                        If you are still writing for Win95 and P3's maybe...
                        Only the retirement of Win9x matters in that statement.

                        I think you need to find a more current publication to support your thread ideas. And yes I saw the MSDN one liner, which appears to be just more generalizations.
                        Any edition of Peter Norton's Programmer's Guide to the PC will still give you perfectly valid advice on programming a parallel or serial port - and many PCs still have a parallel port.

                        The age of the book doesn't mean it's wrong if nothing's changed. And I maintain that nothing much has changed architecturally.

                        If I'm writing for a QUAD core on VISTA64 I doubt very much I can overload the system unless I go completely overboard.
                        OK, so write a thread that reads 20MB of file data into an array, sorts it, and then outputs it to another file.
                        Do you think Vista makes a difference to that operation?
                        Personally, I doubt it.

                        Do four cores make a difference?
                        Absolutely. In 1998 you could run one or maybe two of the threads, and you'd hit the RAM bandwidth limit of the machine, then the disk bandwidth limit (assuming high-performance disks on a decent bus like SCSI).
                        With a modern four core machine, you can now run a third or fourth thread before hitting either of those bandwidth limits. Probably disk first, but it depends on the chipset handling the RAM...

                        The basic principles that Chris refers to are still very much in effect - because the architecture hasn't changed.

                        Having more cores available on most machines doesn't mean that we've changed the architecture. That option was always there, it just required multi-CPU machines.

                        Threads are handled the same way they always were by the architecture. The fact that hardware which can handle more of them is now cheaply available is an implementation change, not an architectural one.

                        And even then, it's not a change you can rely upon. If I recall correctly, the lowest-spec Atom chips in netbooks are still single core, and that won't change for a year or so. Assuming multi-core everywhere will still bite you on new machines, let alone the installed base.

                        If you write code according to the principles Chris outlines, then the code will work very nicely on all machines - both now and in the future, and even on machines and OSes from the dim distant past of 1998.

                        If you write code assuming you have four cores, your software will likely roll over and die on a modern netbook, let alone a machine from 1998...
                        Hobby Programmer! Please be kind!


                        • #13
                          And don't go quoting chip generations and OS versions - they're minor improvements in implementation, not architectural changes.
                          Just because the chip designers were careful to present you, the programmer, with a familiar interface to their chip so your programs will still work does not mean that the underlying architecture is unchanged or has only had minor improvements.
                          The Romans had multistorey buildings with central heating and hot water on tap but you wouldn't argue that the difference between a Roman villa and a modern skyscraper is one of minor changes in implementation of the same basic architecture.

                          Paul.


                          • #14
                            I think the recommendations made are still quite valuable.

                            Every PC, no matter whether it has one CPU or four CPU's, still has some limitation. True, it may take longer to overload a particular PC with more CPU's, but at some point it will max out and performance will suffer.

                            The problem is "software is not an island", meaning it does not run by itself.

                            If every programmer felt they could push the hardware as much as they like without any backlash, then you would see real problems with even state of the art PC's.

                            A good example is the use of memory. Programmers started feeling that new computers have so much more memory that efficient use of memory was no longer a concern. All the software started requiring more and more memory. Now when you add up all the programs running at the same time on a PC, even with gigs of memory, you still see bottlenecks.

                            A Windows 95 PC with a 500 MHz CPU and 128 meg RAM will run better (or appear to, to the user) than some of the new 3 GHz CPU's with 2 gig RAM. Why ?

                            Because older software (which did a lot as far as the tasks) was designed to use less memory and resources.

                            We don't benefit from the improvements in hardware, because software keeps getting worse and worse (bigger, more bloated).

                            Threads are no different. If all programmers felt, "run as many threads as you like" then all the advances in multithreading CPU's will be lost.

                            It's kind of like cars.

                            In the 70's the "gas crunch" made manufacturers produce more fuel efficient cars. As things settled, they went back to making bigger, more gas eating cars.

                            I laugh when I see commercials on TV advertising the latest "hybrid" cars getting an "amazing" 35 miles per gallon.

                            I have a 1990 Geo Prizm with 334,000 miles on it that still gets 40 mpg.
                            I have a 1996 Saturn SL2 which gets in the high 30's mpg (close to 40).

                            Did I miss something there ?

                            They were making fuel efficient cars 20 years ago and they weren't hybrids nor expensive.

                            Computers/Software have the same problem.

                            Everybody wants to make it bigger and better with little concern about using the resources efficiently. There is no reason why programmers can't write software which efficiently uses the resources available.

                            Threads are abused. They are useful when used properly, but one should consider the ramifications of using too many.
                            Chris Boss
                            Computer Workshop
                            Developer of "EZGUI"
                            http://cwsof.com
                            http://twitter.com/EZGUIProGuy


                            • #15
                              Computers/Software have the same problem.

                              Everybody wants to make it bigger and better with little concern about using the resources efficiently
                              Amen, my brother.
                              Michael Mattias
                              Tal Systems (retired)
                              Port Washington WI USA
                              [email protected]
                              http://www.talsystems.com


                              • #16
                                Code:
                                with little concern about using the resources efficiently
                                My point is that minimising thread use may not be efficient. More threads is not inherently less efficient.

                                Paul


                                • #17
                                  semantics and generalizations


                                  The key is to determine the minimum number of threads needed to accomplish the task.
                                  Change to:

                                  The key is to determine the minimum number of threads needed to accomplish the task efficiently.

                                  wasn't there something about a 1/10th rule, I can't remember loaded windows and now my mind has to be rebooted


                                  • #18
                                    Originally posted by Ronald Robinson
                                    semantics and generalizations




                                    Change to:

                                    The key is to determine the minimum number of threads needed to accomplish the task efficiently.

                                    wasn't there something about a 1/10th rule, I can't remember loaded windows and now my mind has to be rebooted
                                    Change to: The key is to determine the optimum number of threads needed to accomplish the task efficiently.

                                    As it's apparent Intel aren't going to ramp clock speeds up much moving forward but are going to add cores to "improve" performance, it's up to programmers to develop parallel code wherever speed of execution is key.

                                    It may be that a single threaded solution is, by total cycles used, more "efficient" but if it means the user waits longer and the computer is effectively running at 50% (or 25% on a quad core) then it isn't efficient by any business measure. User time is expensive. Even if multi threading only knocks a third off the end to end execution time of a task, that is time given back to the end user.
                                    Neil Croft (cissp)


                                    • #19
                                      The "1/10 second rule" is the rule which says you don't want to block a GUI thread for that long; that is, "do something" in response to a user action which will take that long to complete and during which no other user actions can be processed.

                                      I'm sure you've seen the symptoms of violating this rule:
                                      - Click "do it" and until it's done your screen does not repaint if you move it, at which time you get a "white square" where that screen used to be.
                                      - Click "do it" and apparently nothing happens, so you click again. Suddenly the "do it" is done, but now it starts "doing it" again! (finally got around to processing your "re-click").

                                      Which is why this demo exists: GUI + Worker Thread + Abort Demo 11-24-07
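
                                      For readers without that demo to hand, the general pattern can be sketched in a few lines of Python. This is not the PB demo referenced above, just a hedged illustration: long_job and run_with_abort are made-up names, time.sleep stands in for real work, and a polling loop stands in for the GUI thread's message loop. The slow work runs in a worker thread, checks an abort flag between small steps, and the primary thread never blocks for anything close to 1/10 second.

```python
import threading
import time

def long_job(steps, abort, progress):
    """Worker: does the slow work in small steps, checking the abort flag between steps."""
    for i in range(steps):
        if abort.is_set():
            return                # user asked to stop: give up cleanly
        time.sleep(0.01)          # stand-in for one slice of real work
        progress.append(i + 1)

def run_with_abort(steps, abort_after_steps=None):
    """Start the worker, stay responsive, optionally abort; return steps completed."""
    abort = threading.Event()
    progress = []
    t = threading.Thread(target=long_job, args=(steps, abort, progress))
    t.start()
    # The primary thread stays free; here it polls the way a message loop would.
    while t.is_alive():
        if abort_after_steps is not None and len(progress) >= abort_after_steps:
            abort.set()           # e.g. the user clicked "Abort"
        t.join(timeout=0.005)     # short wait only, never a long block
    return len(progress)
```

                                      Calling run_with_abort(5) lets the job finish all 5 steps, while run_with_abort(100, abort_after_steps=3) stops it shortly after the third step.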

                                      MCM
                                      Michael Mattias
                                      Tal Systems (retired)
                                      Port Washington WI USA
                                      [email protected]
                                      http://www.talsystems.com
