Thread Synchronization - Is This Safe?

  • Thread Synchronization - Is This Safe?

    I need to serialize work across multiple threads, or across functions that can be called simultaneously by multiple threads (e.g. maintaining an embedded database). I was experimenting with a mutex (signaling) but also looked at using a critical section that is shared by multiple threads/functions. In my testing, the CS is MUCH faster than the mutex. In the target application, synchronized operations (e.g. create, read, write, view, delete) are very short (microseconds) but they have to be serialized. I'm trying to determine whether this approach is safe to use.

    Code:
    #COMPILE EXE
    #DIM ALL
    
    #INCLUDE "win32api.inc"
    
    
    '// We all hate global variables but they are useful for this test (and other stuff).
    GLOBAL mx     AS LONG               ' Global mutex
    GLOBAL cs     AS CRITICAL_SECTION   ' Global critical section shared by multiple threads
    GLOBAL gCnt   AS DWORD              ' Unsafe counter to tally operations
    GLOBAL tLoops AS LONG               ' Loop in threads to simulate serialized work operations
    
    
    '// Threadsafe printing so we can display a few readable results.
    SUB tPrint (BYREF sMsg AS STRING) THREADSAFE
      PRINT sMsg
    END SUB
    
    
    '// This is a non-threadsafe function that represents serialized work in multiple threads.
    SUB DoWork(BYREF sMsg AS STRING)
    '  sleep 0        ' Will SLEEP force context switches? Should not be important for this test
    '  tPrint sMsg    ' Use for very small numbers of operations
      INCR gCnt       ' Increment the unsafe work operation counter
    END SUB
    
    
    '// This represents multiple threads or multiple functions that can be called by multiple
    '   threads that must serialize work on global data. Each function either waits for a
    '   global mutex to be signaled or waits on the availability of a global critical section.
    THREAD FUNCTION myThread(BYVAL tNum AS LONG) AS LONG
      REGISTER n AS LONG
    
      '// Randomize the starting time for each thread.
      SLEEP RND(1, 100)
    
      '// Loop tLoops times doing some simple operation.
      FOR n = 1 TO tLoops
        EnterCriticalSection cs
    '    WaitForSingleObject mx, 999999
          DoWork STR$(tNum) & STR$(n)   ' Simulate some work by calling a non-threadsafe function
    '    ReleaseMutex mx
        LeaveCriticalSection cs
      NEXT n
      tPrint "Thread" & STR$(tNum) & " finished."
    END FUNCTION
    
    
    FUNCTION PBMAIN () AS LONG
      LOCAL tNum  AS LONG   ' Loop index: number of the worker thread being started
      LOCAL tCnt  AS LONG   ' Number of worker threads to start
      LOCAL ret   AS LONG
    
      '// Set loop params: number of threads (tCnt) and work operations per thread (tLoops).
      tCnt = 10
      tLoops = 50000
      PRINT "Expected safe work operations =" tCnt * tLoops
    
      '// Use random delays in the working threads.
      RANDOMIZE TIMER
    
      '// Initialize global synchronization vars.
      InitializeCriticalSection cs
      mx = CreateMutex ("", 0, "")
    
      '// Start the working threads. Pass thread number to each for reporting.
      PRINT
      PRINT "Press key to get results after threads have completed..."
      PRINT
      FOR tNum = 1 TO tCnt
        THREAD CREATE myThread(tNum) TO ret
        THREAD CLOSE ret TO ret
        RESET ret
      NEXT tNum
    
     '// After threads complete, press a key to get results.
      WAITKEY$
      PRINT
      PRINT "Final work operation count =" gCnt
    
    Exitmain:
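      WAITKEY$
    
      '// Release the synchronization objects (assumes all worker threads have finished).
      DeleteCriticalSection cs
      CloseHandle mx
    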
    END FUNCTION
    Thanks,
    Jerry

  • #2
    I didn't test your code; I just looked at it and didn't see an error.
    This program forces context switches with SLEEP and does not give threads time to allocate.
    Unremarking the %ThreadSafe line fixes both issues.

    Code:
    '%ThreadSafe=999
    #INCLUDE "win32api.inc"
    
    %Threads = 26
    GLOBAL gs AS STRING
    '------------------------------------------------------
    #IF %DEF(%ThreadSafe)
    SUB Writer(x AS LONG) THREADSAFE
     gs+=CHR$(x+64)
    END SUB
    
    #ELSE
    SUB Writer(x AS LONG)
     gs+=CHR$(x+64)
    END SUB
    #ENDIF
    '------------------------------------------------------
    THREAD FUNCTION MyThread(BYVAL x AS LONG) AS LONG
     SLEEP 10
     Writer x
    END FUNCTION
    '------------------------------------------------------
    FUNCTION PBMAIN () AS LONG
    
     LOCAL x AS LONG
     DIM hThread(1 TO %Threads) AS LONG
     FOR x = 1 TO %Threads
      THREAD CREATE MyThread(x) TO hThread(x)
     NEXT
     WAITFORMULTIPLEOBJECTS BYVAL %Threads, BYVAL VARPTR(hThread(1)), %TRUE, %INFINITE
    
     IF LEN(gs) <>  %Threads THEN
       ? USING$("&  (length #)",gs,LEN(gs)),%MB_ICONERROR,"Error"
     ELSE
       ? gs,,"Success"
     END IF
    
     FOR x = 1 TO %Threads
      THREAD CLOSE hThread(x) TO hThread(x)
     NEXT
    
    END FUNCTION


    • #3
      Test it many times. Without THREADSAFE it runs, but terminates without displaying results.
      Code:
      #INCLUDE "win32api.inc"
      %Threads = 26
      GLOBAL gs AS STRING
      
      SUB Writer(x AS LONG) THREADSAFE
       gs+=CHR$(x+64)
      END SUB
      
      THREAD FUNCTION MyThread(BYVAL x AS LONG) AS LONG
       SLEEP 10
       Writer x
      END FUNCTION
      
      FUNCTION PBMAIN () AS LONG
       LOCAL x,loops AS LONG
       DIM hThread(1 TO %Threads) AS LONG
      FOR loops = 1 TO 30
       FOR x = 1 TO %Threads
        THREAD CREATE MyThread(x) TO hThread(x)
       NEXT
       WAITFORMULTIPLEOBJECTS BYVAL %Threads, BYVAL VARPTR(hThread(1)), %TRUE, %INFINITE
       gs+=$CR
       FOR x = 1 TO %Threads
        THREAD CLOSE hThread(x) TO hThread(x)
       NEXT
      NEXT
       ? gs   ' display the accumulated results (assumed; the original listing is cut off here)
      END FUNCTION


      • #4
        Would it not be more straightforward to just write the "Dowork" procedure in a threadsafe manner using your favorite choice of methods?

        Just ask yourself, "what resource (which might be a variable) is to be protected, and where is it vulnerable?"

        Also remember the known limitation of the THREADSAFE directive! (The use of a semaphore prevents any re-entrant execution, even on the same thread of execution, or TOE.)
        Michael Mattias
        Tal Systems Inc.
        Racine WI USA
        mmattias@talsystems.com
        http://www.talsystems.com


        • #5
          Thanks for the feedback. The need here is for protection of a common "resource" - in this application, a many-to-one data buffer and an embedded database used for managing network sessions. Multiple threads may need access to the resources to perform multiple actions (e.g. create session, read session data, update session data, disable session).

          Fortunately, there is no reentrancy, however, actions need to be serialized - such as preventing a write to a session/record by one thread that has been closed by another. I can do this by supporting all of the actions in a single threadsafe function (critical section and branches to specific "work" code), but the function gets really large and painful to maintain. That is why I was thinking about using a single critical section that is shared by multiple functions - with focused code in each function.

          This approach may be a bit of an unusual approach but does seem to work in some preliminary testing (above test code). I'm hoping that experts can tell me if I'm making a mistake here (I'm kind of a weekend warrior when it comes to this stuff).
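A minimal sketch of the idea described above: one critical section shared by several small functions that all touch the same table. The names gcsTable, gActive(), gData(), DisableSession and UpdateSession are hypothetical and stand in for the real session table; this is an illustration under those assumptions, not the actual application code.

```powerbasic
#COMPILE EXE
#DIM ALL
#INCLUDE "win32api.inc"

GLOBAL gcsTable  AS CRITICAL_SECTION   ' one lock guarding the whole (hypothetical) session table
GLOBAL gActive() AS LONG               ' hypothetical: nonzero = session open, 0 = closed
GLOBAL gData()   AS LONG               ' hypothetical per-session payload

' Close a session. The write happens under the shared lock.
SUB DisableSession(BYVAL id AS LONG)
  EnterCriticalSection gcsTable
  gActive(id) = 0
  LeaveCriticalSection gcsTable
END SUB

' Update a session only if it is still open. The check and the write
' sit under the same lock, so a DisableSession call on another thread
' cannot run between them.
FUNCTION UpdateSession(BYVAL id AS LONG, BYVAL newValue AS LONG) AS LONG
  EnterCriticalSection gcsTable
  IF gActive(id) THEN
    gData(id) = newValue
    FUNCTION = %TRUE
  END IF
  LeaveCriticalSection gcsTable          ' reached on every path out of the function
END FUNCTION

FUNCTION PBMAIN () AS LONG
  DIM gActive(1 TO 10) AS LONG
  DIM gData(1 TO 10) AS LONG
  InitializeCriticalSection gcsTable
  gActive(1) = 1
  ? STR$(UpdateSession(1, 42))           ' session 1 is open here
  DisableSession 1
  ? STR$(UpdateSession(1, 43))           ' session 1 has been closed
  DeleteCriticalSection gcsTable
END FUNCTION
```

The point to watch with this layout is that every function must call LeaveCriticalSection on every exit path, and the "is it still open?" test has to be made inside the lock rather than before it.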


          • #6
            however, actions need to be serialized - such as preventing a write to a session/record by one thread that has been closed by another.


            Well, if you want to SERIALIZE - first come, first served - then you don't even need threadsafe, you just need a queue:
            Anonymous Pipe as Job Queue Demo 10-29-03


            (You'll probably need to add to the queue in a threadsafe fashion).

            Somewhere around here, too, is a demo of a FIFO queue using the built-in (PB/Win 10+) PB collection objects.

            All that said, it appears you want to do something like...
            Code:
              IF object_has_some_property THEN   ' e.g. "record that has not been closed" (threads don't "close objects"; procedures do)
                  Proceed
              ELSE
                  ' ???
              END IF
            .. which is not what I would refer to as simply "serialization." I'd actually refer to it as "a design challenge."

            MCM


            Michael Mattias


            • #7
              Just to expand on my note above:

              Multiple threads may need access to the resources to perform multiple actions (e.g. create session, read session data, update session data, disable session).
              Threads do not access resources... procedures access resources. "There are ways" to ensure that a certain operation occurs only in some specific thread context, but it's kind of weird how you have to do that.

              I don't see a problem with a single TOE handling all the operations:

              ....create session, read session data, update session data, disable session
              ====>
              Code:
              THREAD FUNCTION ProcessASession (BYVAL param AS LONG) AS LONG
               CALL CreateSession (param)
               CALL ReadSessionData (param)
               CALL UpdateSessionData (param)
               CALL DisableSession (param)
              
              END FUNCTION
              Each one of the specific operations will execute in the thread context of the "ProcessASession" function which called it.

              If you ensure your four "do the work" functions use only stack-based data AND are not going to contend for some resource you don't even have to code those procedures with any synchronization objects.

              But from what you have described, "something" here is sharing a resource and synchronization is needed. Insufficient detail provided.


              MCM
              Michael Mattias


              • #8
                Here is a very easy way by locking and unlocking a file.
                It also prevents other processes from accessing at the same time. See thread #2 and read the comments.
                https://forum.powerbasic.com/forum/u...er-made-easier


                • #9
                  Here is a very easy way by locking and unlocking a file.
                  And here is a wonderful* text explanation re why you do that when sharing a file

                  "Fundamentals Of Multi-User Programming." Article published in December 1995 issue of "BASICally Speaking" magazine discussing the principles of writing multi-user programs; code samples in BASIC. Rich Text format; placed in the Public Domain June 2005
                  http://www.talsystems.com/tsihome_ht...rogramming.rtf

                  MCM
                  * Selection of adjective not from unbiased or neutral observer.


                  Michael Mattias


                  • #10
                    MCM's code snippet may well be the best approach. I was just looking at a way to provide multiple services (individual functions) without the need to call two functions each time a specific service (function) is required. Sharing a critical section came to mind, but I'm not 100% sure that it's safe or has other downsides - other than unusual coding style.

                    But from what you have described, "something" here is sharing a resource and synchronization is needed..Insufficient detail provided.
                    The applications that I'm playing with are an IoT/M2M client framework (API) and an associated message relay server. The relay servers (load sharing pair) must each support at least 128,000 concurrent client connections (sessions) and must be able to relay at least 50,000 encrypted messages per second between clients. So, there is a need for speed in the design of the code - especially in the server. Multiple servers/clients can be run on the same host (VM not required) so I also tend to think about context switching and page faults (memory caching) quite a bit.

                    There is some file I/O (client database) in the relay server but the transaction rate is much lower than 50K/sec (only used for client registration/release). Record locking would be nice but file locking may be fine at these lower rates. I've been working with multi-user/threaded databases since I started using QuickBASIC and PDS 30 years ago, so have a pretty good handle on that stuff.

                    One of the "resources" being shared by multiple threads is a memory-resident table (array and index) in the relay server that tracks session data for each of the registered clients. It uses a lot of pointers and some ASM to minimize latency. Within the server, messages must be received, decrypted, authenticated, re-encrypted, and resent in under 20 usec under full load. In the middle of this is the need to handle routing and class-of-service restrictions for each message. The session table is, typically, accessed at least two times for each relayed message.

                    The FIFO data buffer is used to queue decrypted network messages and message metadata in the client API. The buffer must be able to support multiple inputs and a single output (e.g. multiple sessions used by a client application).
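A rough sketch of a many-writers, one-reader FIFO guarded by a critical section, along the lines of the buffer described above. The names QPut, QGet, gQ() and the capacity %QSIZE are assumptions made for illustration; the real buffer also carries message metadata and would be sized very differently.

```powerbasic
#COMPILE EXE
#DIM ALL
#INCLUDE "win32api.inc"

%QSIZE = 8                          ' hypothetical capacity

GLOBAL gcsQ   AS CRITICAL_SECTION   ' protects gQ(), gHead and gCount
GLOBAL gQ()   AS STRING             ' ring buffer storage
GLOBAL gHead  AS LONG               ' index of the oldest queued item
GLOBAL gCount AS LONG               ' number of items currently queued

' Append a message. Returns %TRUE on success, 0 if the queue is full.
FUNCTION QPut(BYVAL sMsg AS STRING) AS LONG
  EnterCriticalSection gcsQ
  IF gCount < %QSIZE THEN
    gQ((gHead + gCount) MOD %QSIZE) = sMsg
    INCR gCount
    FUNCTION = %TRUE
  END IF
  LeaveCriticalSection gcsQ
END FUNCTION

' Remove the oldest message. Returns %TRUE on success, 0 if the queue is empty.
FUNCTION QGet(BYREF sMsg AS STRING) AS LONG
  EnterCriticalSection gcsQ
  IF gCount > 0 THEN
    sMsg = gQ(gHead)
    gHead = (gHead + 1) MOD %QSIZE
    DECR gCount
    FUNCTION = %TRUE
  END IF
  LeaveCriticalSection gcsQ
END FUNCTION

FUNCTION PBMAIN () AS LONG
  LOCAL s AS STRING
  DIM gQ(0 TO %QSIZE - 1) AS STRING
  InitializeCriticalSection gcsQ
  CALL QPut("first")
  CALL QPut("second")
  WHILE QGet(s)                      ' drain in arrival order
    ? s
  WEND
  DeleteCriticalSection gcsQ
END FUNCTION
```

QGet does not block in this sketch; a reader that should sleep until data arrives would need an additional signal (for example an event object), which is left out here.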

                    While I've done quite a bit of reading on many programming topics over the past many years, this project is definitely a challenge for me. The assistance of experts in this forum is always appreciated.


                    • #11
                      If you are using hundreds of thousands of threads and speed is that critical, I would avoid using THREADSAFE, critical sections,
                      and anything else that will slow down other threads. This is an example of using a unique array element for each thread number.
                      With this function, all threads have finished when JOIN$(gs(),"") = STRING$(Threads,"D"); alternatively, use a global counter.
                      This also gets around needing WaitForMultipleObjects, with its limit of 64 handles per call.
                      The same technique can be used to update other global arrays, using the thread number as the element index.
                      You might even look at THREADED variables.

                      Code:
                      FUNCTION WorkerFunction(threadnum AS LONG) AS STRING 'THREADSAFE not needed with this global array
                       gs(threadnum) = "D" 'represent done status
                      END FUNCTION


                      • #12
                        If using hundreds of thousands of threads and speed is that critica.... ,
                        I think I would avoid using hundreds of thousands of threads, period.

                        I would avoid using THREADSAFE, Critical Sections and anything that will slow down other threads..
                        With a good synchronization object design, you only wait when you need to.... that is, a good design never waits needlessly!

                        FWIW, I still do high-level design work.

                        MCM
                        Michael Mattias


                        • #13
                          >I think I would avoid using hundreds of thousands of threads, period.
                          Not sure he has that ability with the project.
                          I'm guessing he is using Windows Server.

                          100,000 connection handles with PowerBASIC. I have never come anywhere close to this.

                          What I'm saying is that he is dealing with over 100,000 threads so normal processing will not be fast enough.
                          The code in Worker1 should be avoided because threads will be continually waiting.

                          No globals is ideal, but unique elements in global arrays will not need to be protected.
                          WaitForMultipleObjects with 100,000 threads will need a special routine.
                          Worker2 does not need synchronization objects.

                          The SLEEP statements in the worker functions were added to show Worker1 will often fail if not THREADSAFE.

                          Code:
                          #COMPILE EXE
                          #DIM ALL
                          #INCLUDE "win32api.inc"
                          GLOBAL gs AS STRING
                          GLOBAL gs2() AS STRING
                          '---------------------------------------------------------------------------------
                          FUNCTION PBMAIN () AS LONG
                           LOCAL x AS LONG
                           DIM hThread(1 TO 60) AS LONG
                           DIM gs2(1 TO 60)      AS STRING
                           FOR x = 1 TO 60
                            THREAD CREATE MyThread(x) TO hThread(x)
                           NEXT
                           WaitForMultipleObjects BYVAL 60, BYVAL VARPTR(hThread(1)),%TRUE,%INFINITE
                          '---------------------------------------------------------------------------------
                           ? gs
                           ? JOIN$(gs2(),"")
                          END FUNCTION
                          '---------------------------------------------------------------------------------
                          THREAD FUNCTION MyThread(BYVAL x AS LONG) AS LONG
                           Worker1 x
                           Worker2 x
                          END FUNCTION
                          '---------------------------------------------------------------------------------
                          SUB Worker1(x AS LONG) THREADSAFE
                           SLEEP 10
                           gs = gs + "This is line" + STR$(x) + $CR 'not optimized
                          END SUB
                          '---------------------------------------------------------------------------------
                          SUB Worker2(x AS LONG)
                           SLEEP 10
                           gs2(x) = "This is line"   + STR$(x) + $CR  'not optimized
                          END SUB


                          • #14
                            The relay server application uses less than 16 threads. Client network connections on the server use UDP (a DTLS variant) and a single socket with overlapped I/O. There can be multiple threads processing received messages from the socket (an application thread pool), with the "session table" I mentioned earlier used to keep state information for each session. There's also some "housekeeping" threads (e.g. session inactivity timeout detection) that operate on the same table. Hence, the need for thread synchronization.

                            My initial testing, using a critical section to protect the session table, seems to work well at around 100K transactions/sec (simulating 50K relayed messages/second). Right now, the relay server is running on a hefty quad-core Intel system with either Win7 Pro or Win10 Pro. Since I don't have to use thousands of sockets, those OS seem fine for testing.

                            The only waiting that happens in the server is the UDP socket waiting for a message to arrive. The socket uses asynchronous notification (callback or signalling) to hand the message off to "downstream" threads/functions that do the heavy lifting. Sockets with overlapped I/O are really fast!


                            • #15
                              THREADSAFE can be much faster than Critical_Section.
                              You are using much faster overlapped functions so I won't post my tcp client and server.

                              Code:
                              #INCLUDE "win32api.inc"
                              GLOBAL gCS AS Critical_Section
                              GLOBAL g AS LONG
                              '-------------------------------------------
                              FUNCTION PBMAIN () AS LONG
                               LOCAL x AS LONG
                               LOCAL q1,q2 AS QUAD
                              InitializeCriticalSection gCS
                               TIX q1
                               FOR x = 1 TO 1000000
                                ThreadSafeSub
                               NEXT
                               TIX END q1
                              
                               TIX q2
                               FOR x = 1 TO 1000000
                                CriticalSUB
                               NEXT
                               TIX END q2
                              DeleteCriticalSection gCS
                               ? USING$("q1=#,   q2=#,",q1,q2),,USING$("#,",g)
                              END FUNCTION
                              '-------------------------------------------
                              SUB CriticalSUB
                               EnterCriticalSection gCS
                               INCR g
                               LeaveCriticalSection gCS
                              END SUB
                              '-------------------------------------------
                              SUB ThreadSafeSub THREADSAFE
                               INCR g
                              END SUB


                              • #16
                                THREADSAFE can be much faster than Critical_Section.
                                Um, that's a little too broad a statement.

                                The THREADSAFE procedure director provided by PB ensures only one user of a procedure at a time regardless of calling thread and regardless of the number of instructions in that procedure.

                                The CRITICAL_SECTION will only cause a wait between calls to EnterCriticalSection() and LeaveCriticalSection(), which might only be a couple of lines of source code... or it could be (in theory) hundreds of lines; and it will not suspend a re-entrant call on the same thread of execution.

                                Can THREADSAFE be faster than using a CRITICAL_SECTION? Sure... because you can code EnterCriticalSection() and LeaveCriticalSection() across multiple procedures. But you can construct scenarios where using a CRITICAL_SECTION (or another synchronization object) can be faster than using THREADSAFE.

                                The THREADSAFE procedure director and the CRITICAL_SECTION are just too dissimilar to compare for speed or anything else, and in any case should never be considered alternatives to each other. The best or correct synchronization object to use is always "application-dependent."

                                MCM
                                Michael Mattias


                                • #17
                                  ThreadSafe is faster than Enter/Leave critical section even with the extra overhead of calling threadsafe sub or function.
                                  In this example the threadsafe sub is called 10,000,000 times and is still faster.
                                  Code:
                                  #INCLUDE "win32api.inc"
                                  GLOBAL gCS AS Critical_Section
                                  GLOBAL g AS LONG
                                  %Loops = 10000000
                                  '-------------------------------------------
                                  FUNCTION PBMAIN () AS LONG
                                   LOCAL x AS LONG
                                   LOCAL q1,q2 AS QUAD
                                  InitializeCriticalSection gCS
                                  
                                   TIX q1  'faster
                                   FOR x = 1 TO %Loops
                                    ThreadSafeSub
                                   NEXT
                                   TIX END q1
                                  
                                   TIX q2 'use critical section
                                   FOR x = 1 TO %Loops
                                    EnterCriticalSection gCS
                                    INCR g
                                    LeaveCriticalSection gCS
                                   NEXT
                                   TIX END q2
                                  
                                  DeleteCriticalSection gCS
                                   ? USING$("ThreadSafe=#,   Critical Section=#,",q1,q2),,USING$("#,",g)
                                  END FUNCTION
                                  '-------------------------------------------
                                  SUB ThreadSafeSub THREADSAFE
                                   INCR g
                                  END SUB


                                  • #18
                                    Michael M is correct. The statement is too broad. Critical Section is often faster.


                                    • #19
                                      I agree. I created another model and found that, while using THREADSAFE was often faster than a CS, varying some factors can change the performance, including:
                                      • Number of threads contending for the same resource
                                      • Amount of time that the target resource is "busy" (calling threads are blocked/queued)
                                      • Intermediate FUNCTIONS that pass multiple arguments and return results
                                      For my application, it seems that marshaling requests through an intermediate THREADSAFE function or embedding a global CS in the multiple worker threads provides a small and fairly equal performance benefit.


                                      • #20
                                        Just note that...
                                        The PB THREADSAFE procedure director and the Windows CRITICAL_SECTION synchronization object (using EnterCriticalSection() and LeaveCriticalSection()) do different things and therefore should not be performance-compared.

                                        Also note the THREADSAFE director is implemented (wrongly IMO and I have reported this) by PowerBASIC using a semaphore, meaning the code not only blocks attempts to access a procedure by another thread of execution but also blocks re-entrant calls on the same TOE and so can deadlock your process if not used very carefully.
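To illustrate the re-entrancy difference mentioned above, here is a small sketch showing that a Windows critical section can be entered again by the thread that already owns it, as long as each EnterCriticalSection is matched by a LeaveCriticalSection. The procedure name Nested and the counter gDepth are made up for the example. The equivalent recursion through a THREADSAFE procedure is deliberately not shown, since by the description above it would be expected to block.

```powerbasic
#COMPILE EXE
#DIM ALL
#INCLUDE "win32api.inc"

GLOBAL gcs    AS CRITICAL_SECTION
GLOBAL gDepth AS LONG               ' counts how many times the lock was entered

' Recursive procedure guarded by a critical section (hypothetical example).
SUB Nested(BYVAL level AS LONG)
  EnterCriticalSection gcs          ' the owning thread may enter again without blocking
  INCR gDepth
  IF level > 1 THEN CALL Nested(level - 1)
  LeaveCriticalSection gcs          ' one Leave for every Enter
END SUB

FUNCTION PBMAIN () AS LONG
  InitializeCriticalSection gcs
  CALL Nested(3)
  ? "Nested entries completed:" + STR$(gDepth)
  DeleteCriticalSection gcs
END FUNCTION
```

The lock is only released to other threads once the outermost LeaveCriticalSection has run, so unbalanced Enter/Leave pairs will still stall every other thread that needs it.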

                                        Michael Mattias
