Announcement

Collapse
No announcement yet.

Sort blocks of memory of unknown size

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Sort blocks of memory of unknown size

    Hi all,

    imagine you have a big block of allocated memory of whatever (binary)
    You need to sort such memory by block of X bytes without knowing X at compile time but only at run-time

    For example you have 1Mb divided in blocks of 100 bytes or 1000 bytes or whatever size not known at compile time

    Imagine your data is stored at pDataStorage, you have lItems items each of ElementSize size.

    Quick and dirty solution would be to create a copy of the data in blocks of ElementSize into a dinamic string array, use ARRAY SORT ... and put back data into original place
    Code:
              Dim TempStr(1 To lItems) As String
              For Counter = 1 To lItems
                TempStr(Counter) = Peek$(pDataStorage + ElementSize * (Counter - 1), ElementSize)
              Next                                                                                                        
              Array Sort TempStr()
              For Counter = 1 To lItems
                Poke$ pDataStorage + ElementSize * (Counter - 1), TempStr(Counter)
              Next
    But this is a waste of time and memory. For few MB of memory this is not a problem and quite quick but for some hundred of MB ... no way.

    Ideal would be to have the possibility to define an ABSOLUTE array of STRINGS * ElementSize but this is not possible because Power Basic Compiler wants to know ElementSize at compile time. So you cannot declare something like:
    Code:
              ReDim TempStr(1 To lItems) As String * ElementSize AT pDataStorage
              Array Sort TempStr()

    Ideas?

    Thanks in advance
    Eros
    thinBasic programming language
    Win10 64bit - 8GB Ram - i7 M620 2.67GHz - NVIDIA Quadro FX1800M 1GB

  • #2
    You are asking how many angels can fit on the head of a pin but you don't know how big the pin is.

    If you don't know the size of the memory at compile time, it has to be available from "something" at run time.

    If you can learn these things at runtime, sorting is easy... but without them, you can't sort .. or do anything else for that matter.

    End of discussion.


    But if we go back to post 1...

    You need to sort such memory by block of X bytes without knowing X at compile time but only at run-time
    The rest of your post "sounds like" you might know enough. "Enough" is two things: record size; plus either number of records or total size.

    If you know enough, sorting is easy. Best technique depends on total size. (and also highly subjective).
    MCM

    Comment


    • #3
      Hi Eros,

      Well, if you are not allergic to C, then you could use qSort and optionally bSearch from msvcrt.dll.
      This sort is pretty well designed and, if more complex to use than PowerBASIC's ARRAY SORT, will give different possibility. It is of course important to declare those functions with CDECL.

      All you need: qSort(MemoryPointer, ItemCount, ItemSize, CODEPTR(SorterRoutine))
      Also, you design what you want in the sorting routine, so you have the power.

      Pierre

      Code:
      #COMPILE EXE '#Win 9.07#
      #DIM ALL
      #INCLUDE "Win32Api.inc"
      
      $AppName      = "C qSOrt and bSearch"
      '$CRT_DLLNAME = "msvcrt40.dll"
      $CRT_DLLNAME  = "msvcrt.dll"
      
      GLOBAL ItemSize AS DWORD
      
      DECLARE SUB QSort CDECL LIB "msvcrt.dll" ALIAS "qsort" _ 'CDECL parameters on the stack from right to left
      (BYREF bbase AS ANY, BYVAL num AS DWORD, BYVAL SIZE AS DWORD, BYVAL compare AS DWORD)
      
      DECLARE FUNCTION BSearch CDECL LIB $CRT_DLLNAME ALIAS "bsearch" _ 'CDECL parameters on the stack from right to left
      (BYREF KEY AS ANY, BYREF BASE AS ANY, BYVAL num AS DWORD, BYVAL WIDTH AS DWORD, BYVAL compare AS DWORD) AS DWORD
      '_____________________________________________________________________________
      
      FUNCTION Sorter CDECL(BYVAL ps1 AS DWORD, BYVAL ps2 AS DWORD) AS LONG
      
       FUNCTION = 2 - CompareString(%LOCALE_USER_DEFAULT, %SORT_STRINGSORT, _ %NORM_IGNORECASE
                                    BYVAL ps2, BYVAL ItemSize, BYVAL ps1, BYVAL ItemSize) 'Swap 1 and 2 for descending sort
       'CSTR_LESS_THAN = 1 (str 1 less than 2), CSTR_EQUAL = 2 (equal str), CSTR_GREATER_THAN = 3 (str 2 less than 1)
      
      END FUNCTION
      '_____________________________________________________________________________
      
      FUNCTION PBMAIN() AS LONG
       LOCAL pFound    AS ASCIIZ POINTER
       LOCAL sToSearch AS STRING
       LOCAL pMem      AS DWORD
       LOCAL ItemCount AS DWORD
       LOCAL index     AS LONG
      
       ItemCount = 26 'Alphabet
       ItemSize  = 16
      
       pMem = LocalAlloc(%LMEM_FIXED OR %LMEM_ZEROINIT, ItemCount * ItemSize) 'Here is our memory
      
       POKE$ pMem, SPACE$(ItemCount * ItemSize) 'Fill all with spaces if you need to do more test without zero characters but they are allowed
       FOR index = 0 TO ItemCount - 1
         POKE BYTE, pMem + index * ItemSize, 64 + 26 - Index 'Set every record from "Z" to "A"
       NEXT
      
       ? "Before qSort> " & PEEK$(pMem, 1), , $AppName 'Show "Z"
      
       QSort(BYVAL pMem, ItemCount, ItemSize, CODEPTR(Sorter)) 'Array first byte pointer, Array element count, Array element size
      
       ? "After qSort> " & PEEK$(pMem, 1), , $AppName 'Show "A"
      
       sToSearch = "E" & SPACE$(ItemSize - 1) 'Now let's search for the letter "E" and spaces...
       pFound = BSearch(BYVAL STRPTR(sToSearch), BYVAL pMem, ItemCount, ItemSize, CODEPTR(Sorter))
       IF pFound THEN
         'Show "bSearch found E at 0x2CE8E8, index is 4"
         ? "bSearch found " & RTRIM$(sToSearch) & " at 0x" & HEX$(pFound) & ", index is " & FORMAT$((pFound - pMem) / ItemSize), , $AppName
       ELSE
         'Show "bSearch failed for E" on error
         ? "bSearch failed for " & sToSearch, , $AppName
       END IF
      
       LocalFree(BYVAL pMem)
      
      END FUNCTION
      '_____________________________________________________________________________
      '

      Comment


      • #4
        Michael Mattias
        yes I know all what is needed at runtime otherwise ... it is not possible to go on.
        But because
        Code:
         ReDim TempStr(1 To lItems) As String * ElementSize AT pDataStorage
        with ElementSize defined at runtime is so straightforward I was thinking to have missed something in PB documentation.

        Pierre Bellisle
        Thanks a lot, I didn't thought about C Runtime.
        No I'm not allergic to it at all, I have some programs using it.
        In this case I need to have fine control in my program installation and redistribution so I will avoid this dependency.


        I'm going with a personal qSort function by chunks of memory blocks using peek/poke

        Thanks again
        Eros
        thinBasic programming language
        Win10 64bit - 8GB Ram - i7 M620 2.67GHz - NVIDIA Quadro FX1800M 1GB

        Comment


        • #5
          > I'm going with a personal qSort function

          That's what I was going to suggest. But maybe...

          > peek/poke

          You might be able do it with ARRAY SORT...CALL and avoid all of the memory-block-swapping. If at runtime you have (say) 250 segments to be sorted, create a numeric array of 250 elements and fill it with the integers 1-250. Tell ARRAY SORT...CALL to sort that array, and, inside the custom sort function, do the peeks and comparisons. "Is segment 1 larger than segment 2?" etc. etc. You'd end up with a sorted numeric array that would tell you the order in which you need to access the segments. After the SORT, if element 1 contains (say) the value 13, then the 13th segment is the first-sorted.

          Added: This is conceptually similar to the way PB sorts dynamic string arrays. The actual strings are never moved, PB simply swaps around the descriptors that point to the strings. The numeric array above is filled with "pointers" that point to segment numbers instead of memory locations. In fact, if you wanted to, you could pre-fill the numeric array with the actual memory locations of the segments (for PEEK) instead of integers, and get the same result. "Is the segment at location X larger than the segment at location Y?"
          "Not my circus, not my monkeys."

          Comment


          • #6
            Hi Eros,

            Q> "avoid this dependency."
            If I'm right, "msvcrt.dll" is included with every Windows system from 95 OSR2 up to 10...
            so there is no real big risk...

            Q> "using peek"
            Isn't it a little "waste of time and memory", the thing you said you want to avoid?
            Unless you use it only to swap data for sorting purpose.
            Note that you could still use CompareString() in your custom sort routine to avoid useless compare data copy,
            even if you use ARRAY SORT in the way Eric suggest.
            I hope you will come back with the result, always interesting to see how other guy are doing it...

            Pierre

            Comment


            • #7
              > with ElementSize defined at runtime is so straightforward I was thinking to have missed something in PB documentation.

              Understandable. Unfortunately "fixed-length string" means fixed at compile time. That allows the compiler to pre-calculate a ton of stuff, avoiding runtime computation and making fixed-length strings extremely fast.
              "Not my circus, not my monkeys."

              Comment


              • #8
                Re 'variable fixed length strings' ... you "might" be able to do something with FIELD variable types and the FIELD statement. (Which I have never used).

                These support the use of a variable for the 'record length.'

                Comment


                • #9
                  Originally posted by Pierre Bellisle View Post

                  Q> "avoid this dependency."
                  If I'm right, "msvcrt.dll" is included with every Windows system from 95 OSR2 up to 10...
                  so there is no real big risk...
                  Yes I saw something like that but also something like this https://blogs.msdn.microsoft.com/old...411-00/?p=1273

                  Originally posted by Pierre Bellisle View Post
                  I hope you will come back with the result, always interesting to see how other guy are doing it...
                  Yes, when I will have a decision I will post here.
                  thinBasic programming language
                  Win10 64bit - 8GB Ram - i7 M620 2.67GHz - NVIDIA Quadro FX1800M 1GB

                  Comment


                  • #10
                    Originally posted by Michael Mattias View Post
                    Re 'variable fixed length strings' ... you "might" be able to do something with FIELD variable types and the FIELD statement. (Which I have never used).

                    These support the use of a variable for the 'record length.'
                    Will investigate. Never used FIELD too.
                    thinBasic programming language
                    Win10 64bit - 8GB Ram - i7 M620 2.67GHz - NVIDIA Quadro FX1800M 1GB

                    Comment


                    • #11
                      If Raymond says it, then it's true.
                      It's a redistributable.
                      Thank for the link Eros.

                      Comment


                      • #12
                        I already have a solution but I'm searching for a clever (possibly very clever) idea

                        So I'm reading again and again PB help and scanning PB forum searching for something I still do not know about PB.

                        PB Help at DIM ... AT statement says:
                        "The most common use of an absolute array is when manipulating Visual Basic
                        arrays directly from a DLL. This involves obtaining a pointer to the array, the
                        element size, and the number of elements. With this information, an absolute
                        array can be dimensioned in PowerBASIC and the array memory manipulated
                        directly.
                        Another common use involves using a large dynamic or fixed-length
                        string memory block, overlaid with an absolute numeric array."

                        The RED part is true ONLY if "element size" is applied at compile time.
                        thinBasic programming language
                        Win10 64bit - 8GB Ram - i7 M620 2.67GHz - NVIDIA Quadro FX1800M 1GB

                        Comment


                        • #13
                          Deleted for clarity, see my post below.
                          Last edited by Eric Pearson; 2 Feb 2018, 09:24 AM. Reason: Brevity
                          "Not my circus, not my monkeys."

                          Comment


                          • #14
                            The RED part is true ONLY if "element size" is applied at compile time.
                            Why is this a problem? LONGs, INTEGERs, etc are known sizes at compile time. "Z" and fixed length strings, and UDT variables are declared in source code so are also known at compile time.

                            Using DIM AT to make a array of dynamic strings is not an option. A separate array of string headers could be created and populated with starting element size and pointers at run time. Any editing that changes length of an element would, of course, "shred" the original long string that we would have DIMed AT if we could have. But one long string can be rebuilt by concactinating the strings refered to by the modified headers in the array. ((and this is about how a string array works in first place.))

                            Cheers,

                            P.S. fly in string array ointment. Aren't there nuls after dynamic strings that we don't usually worry about???
                            Dale

                            Comment


                            • #15
                              He wants it to be done at runtime, and he thought that he could not.

                              As I understand it, dynamic strings need not be involved.
                              "Not my circus, not my monkeys."

                              Comment


                              • #16
                                Oops, I think I misunderstood along the way. I'll remove my previous post.

                                You are correct, Eros, you can't use DIM AT because you don't know the block size ahead of time and you can't possibly have a UDT for every possible length. If you get a runtime block size of 113 bytes (or whatever) the plan would fail.

                                I'm tellin' ya, ARRAY SORT CALL is one of my favorite tools.
                                "Not my circus, not my monkeys."

                                Comment


                                • #17
                                  Did not say they (dynamic strings) had to be involved; but other types sizes are known at compile time, so what will the unknown size thing be refered to AS????

                                  Something like -
                                  s$(x) = MID$(@Pointer, Start& , Count&)
                                  Pointer is varptr of long fixed string or strptr of long d string.

                                  Where Count& is unknown at compile time, and s$() is dynamic storage vars with length being variable at run-time, and Start& calculated for each "x"

                                  Cheers,
                                  Dale

                                  Comment


                                  • #18
                                    Yes yes, I know how to use DIM ... AT. I think is the statement I use more in my 150k lines of PB project.
                                    Using DIM .. AT with numbers is just a "no problem at all". It is always easy when you know you data size at compiler time. Isn't it?

                                    I avoid MID$ whenever possible: too slow, it copies data in/out from the cpu to memory so I cannot use in my project. I move memory only when absolutely needed.

                                    Anyway, thanks.
                                    thinBasic programming language
                                    Win10 64bit - 8GB Ram - i7 M620 2.67GHz - NVIDIA Quadro FX1800M 1GB

                                    Comment


                                    • #19
                                      "The most common use of an absolute array is when manipulating Visual Basic arrays directly from a DLL"

                                      I don't know who said that but I've used absolute arrays since the 1980s (yes they WERE available in GW-BASIC!) but NEVER used them to manipulate VB arrays (under Windows these are safearrays).

                                      If you want to "manipulate" safearrays the better way (IMO) is to use the functions provided by PowerBasic Inc. in file VBAPI32.INC, which is installed in your WinApi folder. You can, of course, write your own functions to do these things because safearrays are a standard feature of Windows... but why reinvent the wheel?

                                      Comment


                                      • #20
                                        Using one quick sorting function code from this forum is one solution.
                                        If you want to use ArraySort, you could build a pointer array to your records and then use the "Custom sort" option...
                                        Same strategy as in code above using pb function...

                                        Code:
                                        #COMPILE EXE '#Win 9.07#
                                        #DIM ALL
                                        #INCLUDE "Win32Api.inc"
                                        
                                        TYPE RecordPointer
                                          pRecord AS DWORD
                                        END TYPE
                                        
                                        $AppName = "ArraySort"
                                        
                                        GLOBAL ItemSize AS DWORD
                                        '_____________________________________________________________________________
                                        
                                        FUNCTION RecordArraySort(Param1 AS RecordPointer, Param2 AS RecordPointer) AS LONG
                                        
                                         FUNCTION = 2 - CompareString(%LOCALE_USER_DEFAULT, %SORT_STRINGSORT, _ %NORM_IGNORECASE
                                                                      BYVAL Param2.pRecord, BYVAL ItemSize, BYVAL Param1.pRecord, BYVAL ItemSize) 'Swap 1 and 2 for descending sort
                                         'CSTR_LESS_THAN = 1 (str 1 less than 2), CSTR_EQUAL = 2 (equal str), CSTR_GREATER_THAN = 3 (str 2 less than 1)
                                        
                                        END FUNCTION
                                        '_____________________________________________________________________________
                                        
                                        FUNCTION PBMAIN() AS LONG
                                         LOCAL pMem      AS DWORD
                                         LOCAL ItemCount AS DWORD
                                         LOCAL Index     AS DWORD
                                        
                                         ItemCount = 26 'Alphabet
                                         ItemSize  = 16
                                        
                                         pMem = LocalAlloc(%LMEM_FIXED OR %LMEM_ZEROINIT, ItemCount * ItemSize) 'Here is our memory
                                        
                                         FOR index = 0 TO ItemCount - 1 'Fill all alphabet
                                           POKE BYTE, pMem + index * ItemSize, 64 + 26 - Index 'Set every record from "Z" to "A"
                                         NEXT
                                        
                                         DIM RecordArray(0 TO ItemCount -1) AS RecordPointer 'Build a fixed lenght record pointer array
                                         FOR index = 0 TO ItemCount - 1
                                           RecordArray(index).pRecord = pMem + index * ItemSize 'Put a pointer to a record
                                         NEXT
                                        
                                         ? "Before Sort> " & PEEK$(RecordArray(0).pRecord, 1), %MB_TOPMOST, $AppName 'Show "Z"
                                         #IF %PB_REVISION <= &H0907
                                         ARRAY SORT RecordArray(), USING RecordArraySort()
                                         #ELSE
                                         ARRAY SORT RecordArray(), CALL RecordArraySort()
                                         #ENDIF
                                         ? "After Sort> " & PEEK$(RecordArray(0).pRecord, 1), %MB_TOPMOST, $AppName 'Show "A"
                                        
                                         LocalFree(BYVAL pMem)
                                        
                                        END FUNCTION
                                        '_____________________________________________________________________________
                                        '

                                        Comment

                                        Working...
                                        X