Announcement

Collapse

Forum Guidelines

This forum is for finished source code that is working properly. If you have questions about this or any other source code, please post it in one of the Discussion Forums, not here.
See more
See less

Sidewinder Sort – 2 to 3 times faster than PB sort

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • PBWin/PBCC Sidewinder Sort – 2 to 3 times faster than PB sort

    Sidewinder Sort – 2 to 3 times faster than PB sort

    This is a Trie/Bucket sort

    About 2.4 times faster than PB sort.
    Over 3 times faster for string > 200 characters.
    The longer the strings, the wider the spread.

    Note: this is purely for those who like to play around with sort algorithms.
    On 100,000 strings, you’ll save a fraction of a second.

    This is a slight improvement over the “slot sort” I posted before.
    (same sort with a little tuning)

    The Burst Sort used to be the champ.
    It was about twice as fast as the Quick Sort.
    There are now variations of the Burst Sort that are 4 to 5 times faster than Quick Sort.
    These sorts are also Trie based.
    They are very complex sorts that focus on keeping the work in the CPU cache.
    http://www.cs.mu.oz.au/~rsinha/paper...Zobel-2006.pdf

    This sort uses recursive calls to build a Trie tree on the program stack.

    I had a request for information on how to convert it to an UCase sort.
    I tried, but couldn’t do it.

    Public domain – use at your own risk

    .
    Last edited by Stanley Durham; 9 Aug 2009, 12:48 PM.
    stanthemanstan~gmail
    Dead Theory Walking
    Range Trie Tree
    HLib ~ Free Data Container Lib ~ Arrays, Lists, Stacks, Queues, Deques, Trees, Hashes

  • #2
    Code:
    'pb 5/9
    'SidewinderSort5.inc
        '
        '   Sidewinder String Sort
        '
        '   !!! zero-based arrays only !!!
        '
        '   the sort moves sideways through the string array, one character depth at a time
        '   - now makes three passes
        '   - uses two temporary arrays
        '       pArr2: used to store string pointers in proper bucket order
        '           pointers copied back to original array after bucket built
        '       charArr(): hold character at current depth - character ASCIIZ value used as indes
        '           charArr(i) = Peek(Byte, @pArr[i] + depth)
        '           storing index avoids having to to do expensive operation twice; Peek(Byte, @pArr[i] + depth)
        '               Peek(Byte, @pArr[i] + depth) = get byte from string ptr in @pArr[i] at depth
        '
        '   move all null string to left side of array before recursive sort starts - final destination
        '   Passes:
        '       1. get count for each different kind of character, at a given depth, for each string in array segment
        '           1a. build bucket information - where each string belongs in array
        '       2. goes through array and stores each string ptr in temp array in proper bucket
        '           (this turned out to be faster than swapping string pointers in place)
        '       3. copies string pointers back into original array
        '               now in sorted order for current depth
        '           3a. recursively sort each bucket
        '               skip bucket zero (end-of-string) now in final position
        '               skip buckets with only one string - in final position
        '
        '   if 5 strings in the array start with "a" (lowest strings in array)
        '       on the first pass; those 5 string would be on the left side of the array
        '           we then recursively work on those 5 strings, next character down, before moving to the next lowest group in the array
        '
    '
    #If Not %Def(%WINAPI)
        #Include Once "WIN32API.INC"
    #EndIf
    '
    Type SidewinderSortBuckeyInfoT
        bucketSize(255) As Long
        bucketStart(255) As Long
        bucketEnd(255) As Long
    End Type
    '
    '
    'tix test macors
    '''Macro McSidewinderTixStart1 = Local qTix_Counter1 As Quad : Tix qTix_Counter1
    '''Macro McSidewinderTixEnd1
    '''    Tix End qTix_Counter1
    '''    #Debug Print "Tix #1 = "+ Format$(qTix_Counter1, "#,")
    '''End Macro
    '''Macro McSidewinderTixStart2 = Local qTix_Counter2 As Quad : Tix qTix_Counter2
    '''Macro McSidewinderTixEnd2
    '''    Tix End qTix_Counter2
    '''    #Debug Print "Tix #2 = "+ Format$(qTix_Counter2, "#,")
    '''End Macro
    '
    '
    Sub SidewinderSort(arr() As String)
        Register i As Long
        Register leftSide As Long
        Local arrUBound, rightSide, depth As Long
        Local pArr, pArr2 As Long Ptr
        Local dummy(), charArr() As Long
        '
        arrUBound = UBound(arr)
        '
        If arrUBound < 1 Then Exit Sub
        '
        ReDim charArr(arrUBound)
        ReDim dummy(arrUBound)
        '
        rightSide = arrUBound
        leftSide = 0
        depth = 0
        pArr = VarPtr(arr(0))
        pArr2 = VarPtr(dummy(0))
        'swap any null strings to far left - they will be in their final position
        For i = 0 To rightSide
            If @pArr[i] = %NULL Then
                If i <> leftSide Then Swap @pArr[i], @pArr[leftSide]
                Incr leftSide
            End If
        Next i
        SidewinderSortRecursive pArr, pArr2, charArr(), depth, leftSide, rightSide
    End Sub
    '
    '
    '-------------------------------------------------------------------------------------------------------------------------------------------
        ' internal use
    '-------------------------------------------------------------------------------------------------------------------------------------------
    '
    Sub SidewinderSortRecursive(ByVal pArr As Long Ptr, ByVal pArr2 As Long Ptr, charArr() As Long, ByVal depth As Long, ByVal leftSide As Long, ByVal rightSide As Long)
        Register i As Long
        Register char As Long
        Local lastBucketStart, lastBucketSize As Long
        Local buckets As SidewinderSortBuckeyInfoT
        '
        'Pass #1: get character counts
        '   10 strings begin with "A"
        '   5 strings begin with "B"
        '   ...etc
        '   determine how big a bucket we need for each character
        '       - at the current character depth of the array being sorted
        '       - in the current segment of the array being sorted
        '           - on the first call - we move through the whole array
        For i = rightSide To leftSide Step -1
            charArr(i) = Peek(Byte, @pArr[i] + depth)
            Incr buckets.bucketSize(charArr(i))
        Next i
        '
        '
        'build bucket info
        '   determine the start position of each bucket
        lastBucketStart = leftSide
        lastBucketSize = 0
        For char = 0 To 255
            If buckets.bucketSize(char) Then
                buckets.bucketStart(char) = lastBucketStart + lastBucketSize
                buckets.bucketEnd(char) = buckets.bucketStart(char) - 1
                lastBucketStart = buckets.bucketStart(char)
                lastBucketSize = buckets.bucketSize(char)
            End If
        Next char
        '
        '
        'Pass #2: store each string in proper bucket
        For i = rightSide To leftSide Step -1
            char = charArr(i)
            Incr buckets.bucketEnd(char)
            @pArr2[buckets.bucketEnd(char)] = @pArr[i]
        Next i
        '
        '
        'Pass #3: restore buckets to original array in sorted order
        For i = rightSide To leftSide Step -1
            @pArr[i] = @pArr2[i]
        Next i
        '
        '
        'recursively sort each bucket on next character down
        '   we skip zero - null on end of string
        '       those strings have reached their final destination
        For char = 255 To 1 Step - 1
            'if a bucket only has one string - no need to sort - string in final destination
            If buckets.bucketSize(char) > 1 Then
                SidewinderSortRecursive pArr, pArr2, charArr(), depth + 1, buckets.bucketStart(char), buckets.bucketEnd(char)
            End If
        Next char
    End Sub
    '
    '
    stanthemanstan~gmail
    Dead Theory Walking
    Range Trie Tree
    HLib ~ Free Data Container Lib ~ Arrays, Lists, Stacks, Queues, Deques, Trees, Hashes

    Comment


    • #3
      'test program
      Code:
      #PBForms CREATED V1.51
      'pbwin 9
      $TestSource = "SidewinderSortTest.bas"
      $TestTitle = "Sidewinder Sort Test"
      #Compile Exe "SidewinderSortTest.exe"
      #Dim All
      #Optimize Speed
      #Include Once "SidewinderSort5.inc"
      '
      #PBForms BEGIN INCLUDES
      #If Not %Def(%WINAPI)
          #Include Once "WIN32API.INC"
      #EndIf
      #Include Once "PBForms.INC"
      #PBForms END INCLUDES
      #PBForms BEGIN CONSTANTS
      %Dlg1    =  101
      %BtnTest = 1002
      %Lbx1    = 1001
      #PBForms END CONSTANTS
      Declare CallBack Function ShowDlg1Proc()
      Declare Function ShowDlg1(ByVal hParent As Dword) As Long
      #PBForms DECLARATIONS
      Global lbx As LBxI
      Global tmr As TimerI
      Function PBMain()
          ShowDlg1 %HWND_Desktop
      End Function
      '
      Sub BtnTest(ByVal hDlg As Long)
          Local i, j, arrUBound As Long
          Local a(), b() As String
          Local s As String
          Local pbTime, ssTime As Double
          Local pbTix, ssTix As Quad
          '
          'test settings
          Local Count As Long          : Count = 100000        'number of random strings to build
          Local shortestString As Long : shortestString = 10   'shortest random string
          Local longestString As Long  : longestString = 50    'longest random string
          Local lowestChar As Long     : lowestChar = 32       'ASC() smallest random character
          Local highestChar As Long    : highestChar = 127     'ASC() largest random character
          '
          'if Count <= 500 then random strings will be displayed
          '   any Count < 100,000 is to small to show any difference
          '
          Randomize
          arrUBound = Count - 1
          ReDim a(arrUBound)
          ReDim b(arrUBound)
          lbx.Clear(2)
          lbx.Add("fill arrays")
          For i = 0 To arrUBound
              s = ""
              For j = 1 To Rnd(shortestString, longestString)
                  s += Chr$(Rnd(lowestChar, highestChar))
              Next j
              'make sure we have a couple of NULL string in here
              If i = 100 Or i = 200 Then s = ""
              a(i) = s
              b(i) = s
          Next i
          '
          '
          'PB sort
          lbx.Add("")
          lbx.Add("PB Sort")
          tmr.Start()
          Tix pbTix
          Array Sort a()
          Tix End pbTix
          tmr.Stop()
          lbx.Add(tmr.Get())
          pbTime = tmr.GetTime
          '
          '
          'Sidewinder sort
          lbx.Add("")
          lbx.Add("Sidewinder Sort")
          tmr.Start()
          Tix ssTix
          SidewinderSort b()
          Tix End ssTix
          tmr.Stop()
          lbx.Add(tmr.Get())
          ssTime = tmr.GetTime()
          '
          '
          lbx.Add("")
          lbx.Add("see if anything out of order")
          For i = 0 To arrUBound
              If a(i) <> b(i) Then
                  ? "out of order at: " + Str$(i)
                  Exit For
              End If
          Next i
          '
          '
          lbx.Add("")
          lbx.Add("Tix PB: " + Format$(pbTix))
          lbx.Add("Tix SW: " + Format$(ssTix))
          lbx.Add("TIme PB: " + Format$(pbTime, "####.##########"))
          lbx.Add("Time SW: " + Format$(ssTime, "####.##########"))
          lbx.Add("Sidewinder Sort is " + Format$(pbTime/ssTime) + " times faster than PB sort")
          '
          '
          If Count <= 500 Then
              lbx.Add("")
              lbx.Add("====================")
              For i = 1 To Count
                  lbx.Add("")
                  lbx.Add("PB: = '" + a(i) +"'")
                  lbx.Add("SS: = '" + b(i) +"'")
              Next i
          End If
          '
          '
          lbx.Add("")
          lbx.Add("done...")
      End Sub
      '
      CallBack Function ShowDlg1Proc()
          Select Case As Long CbMsg
              Case %WM_InitDialog
                   lbx = Class "LBxC"
                   lbx.INI(Cb.Hndl, %Lbx1)
                   lbx.SetHorizontal(1000)
                   tmr = Class "TimerC"
              Case %WM_NCActivate
                  Static hWndSaveFocus As Dword
                  If IsFalse CbWParam Then
                      hWndSaveFocus = GetFocus()
                  ElseIf hWndSaveFocus Then
                      SetFocus(hWndSaveFocus)
                      hWndSaveFocus = 0
                  End If
              Case %WM_Command
                  Select Case As Long CbCtl
                      Case %BtnTest
                          If CbCtlMsg = %BN_Clicked Or CbCtlMsg = 1 Then
                              BtnTest(Cb.Hndl)
                          End If
                  End Select
          End Select
      End Function
      Function ShowDlg1(ByVal hParent As Dword) As Long
          Local lRslt  As Long
      #PBForms BEGIN DIALOG %Dlg1->->
          Local hDlg   As Dword
          Local hFont1 As Dword
          Dialog New hParent, $TestTitle, 67, 61, 341, 241, %WS_Popup _
              Or %WS_Border Or %WS_DlgFrame Or %WS_Caption Or %WS_SysMenu Or _
              %WS_MinimizeBox Or %WS_ClipSiblings Or %WS_Visible Or %DS_ModalFrame _
              Or %DS_3DLook Or %DS_NoFailCreate Or %DS_SetFont, _
              %WS_Ex_ControlParent Or %WS_Ex_Left Or %WS_Ex_LtrReading Or _
              %WS_Ex_RightScrollbar, To hDlg
          Control Add ListBox, hDlg, %Lbx1, , 5, 5, 330, 210, %WS_Child Or _
              %WS_Visible Or %WS_HScroll Or %WS_VScroll Or %LBS_Notify Or _
              %LBS_NoIntegralHeight, %WS_Ex_ClientEdge Or %WS_Ex_Left Or _
              %WS_Ex_LtrReading Or %WS_Ex_RightScrollbar
          Control Add Button,  hDlg, %BtnTest, "Test", 275, 220, 60, 15
          hFont1 = PBFormsMakeFont("Courier New", 9, 400, %FALSE, %FALSE, %FALSE, _
              %ANSI_CHARSET)
          Control Send hDlg, %Lbx1, %WM_SETFONT, hFont1, 0
      #PBForms END DIALOG
          Dialog Show Modal hDlg, Call ShowDlg1Proc To lRslt
      #PBForms BEGIN CLEANUP %Dlg1
          DeleteObject hFont1
      #PBForms END CLEANUP
          Function = lRslt
      End Function
      Class LBxC
          Instance meHDlg As Long
          Instance meID As Long
          Interface LBxI
              Inherit IUnknown
              Method INI(ByVal hDlg As Long, ByVal Id As Long)
                  meHDlg = hDlg
                  meID = Id
              End Method
              Method SetHorizontal(ByVal Count As Long)
                  Local hCntrl&
                  Control Handle meHDlg, meID To hCntrl&
                  SendMessage hCntrl&, %LB_SETHORIZONTALEXTENT, Count, 0
              End Method
              Method Clear(Opt doEventsCount As Long)
                  ListBox Reset meHDlg, meID
                  If VarPtr(doEventsCount) Then me.DoEventsCount(doEventsCount)
              End Method
              Method Add(ByVal s As String, Opt doEventsCount As Long) As String
                  ListBox Add meHDlg, meID, s
                  If VarPtr(doEventsCount) Then me.DoEventsCount(doEventsCount)
              End Method
          End Interface
          Class Method DoEventsCount(ByVal Count As Long)
              Local i As Long
              For i = 1 To Count
                  Dialog DoEvents
              Next i
          End Method
      End Class
      Class TimerC
          Instance startTime As Double
          Instance stopTime As Double
          Interface TimerI
              Inherit IUnknown
              Method Start()
                  startTime = Timer
              End Method
              Method Stop()
                  stopTime = Timer
              End Method
              Method Get() As String
                  Method = "    Time: " + Format$(stopTime - startTime, "#####.###############")
              End Method
              Method GetTime() As Double
                  Method = stopTime - startTime
              End Method
          End Interface
      End Class
      stanthemanstan~gmail
      Dead Theory Walking
      Range Trie Tree
      HLib ~ Free Data Container Lib ~ Arrays, Lists, Stacks, Queues, Deques, Trees, Hashes

      Comment

      Working...
      X