array descriptors
    array descriptors

    Has anyone come up with a way to make arrays completely portable, in the sense that they could be copied in and out of external media without having to be re-dimensioned (no. of entries + type declaration)? I can see the problem with UDTs, that the array descriptor knows the repeat length but not its structure - but no matter, that mapping can be applied later...

    Why does PB have to rebuild the array descriptor instead of just allowing it to be copied?

    Is there anything in V9 which will help with this - I'm taking V9 features on a need-to-know basis, wish I had time to do otherwise!

    #2
    Why does PB have to rebuild the array descriptor instead of just allowing it to be copied?
    Because a PB Array (capitalized) is a compile-time "thing" only... it is a way the compiler provides to easily access arrays (small a) in a logical fashion, an array (small a) being the common term for "multiple data items of the same size arranged in contiguous storage."

    I don't know what you are trying to do, but this sounds like a storage challenge, not a compiler usage challenge.

    If you wish to make table data portable, you can always design your own storage, which will look suspiciously like what the proprietary array descriptor probably looks like:
    Code:
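    TYPE DimensionType
         lbound   AS LONG               ' lower bound of this dimension
         nElement AS LONG               ' number of elements in this dimension
    END TYPE

    TYPE DescriptorType
       datatype     AS LONG             ' what kind of element is stored
       nDimension   AS LONG             ' how many entries of Dimension() are in use
       Dimension (1:%MAX_DIMENSIONS) AS DimensionType
    END TYPE

    LOCAL descriptor AS DescriptorType
    LOCAL hFile AS LONG
    LOCAL S AS STRING
    '   ... fill the UDT with info about the array here ...

    hFile = FREEFILE
    OPEN "myfileonanymedia" FOR BINARY AS hFile
    PUT  hFile, , descriptor            ' header first (note the empty record-position argument)
    '   S = (all data as a string)
    PUT$ hFile, S                       ' then the raw data
    CLOSE hFile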
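
    To make the round trip concrete, here is a minimal sketch of the same idea cut down to a one-dimensional LONG array. The SavedHeader type, its member names and the file name "array.dat" are all made up for illustration, and the layout is only one possible choice, not PB's internal descriptor:
    Code:
    #COMPILE EXE
    #DIM ALL

    TYPE SavedHeader                    ' hypothetical on-disk header, one dimension only
        elemType AS LONG                ' whatever ARRAYATTR(array(), 1) reported
        lowIdx   AS LONG                ' lower bound
        nElem    AS LONG                ' element count
    END TYPE

    FUNCTION PBMAIN () AS LONG
        LOCAL hdr AS SavedHeader
        LOCAL hFile AS LONG, i AS LONG
        LOCAL S AS STRING

        DIM src(1 TO 5) AS LONG
        FOR i = 1 TO 5 : src(i) = i * 10 : NEXT

        ' save: header, then the contiguous element bytes
        hdr.elemType = ARRAYATTR(src(), 1)
        hdr.lowIdx   = LBOUND(src)
        hdr.nElem    = ARRAYATTR(src(), 4)
        S = PEEK$(VARPTR(src(hdr.lowIdx)), hdr.nElem * 4)    ' 4 bytes per LONG
        hFile = FREEFILE
        OPEN "array.dat" FOR BINARY AS #hFile
        PUT  #hFile, , hdr
        PUT$ #hFile, S
        CLOSE #hFile

        ' load: read the header, then let PB build a fresh descriptor with DIM
        hFile = FREEFILE
        OPEN "array.dat" FOR BINARY AS #hFile
        GET  #hFile, , hdr
        GET$ #hFile, hdr.nElem * 4, S
        CLOSE #hFile
        DIM dst(hdr.lowIdx TO hdr.lowIdx + hdr.nElem - 1) AS LONG
        POKE$ VARPTR(dst(hdr.lowIdx)), S                     ' copy the raw data into the new array
        ? STR$(dst(3))                                       ' same value that was stored in src(3)
    END FUNCTION
    Nothing address-related is written to the file; the only things that travel are the type code, the bounds and the data itself.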
    MCM
    Michael Mattias
    Tal Systems (retired)
    Port Washington WI USA
    [email protected]
    http://www.talsystems.com

      #3
      Forgot: another reason for not 'copying' the array descriptor is because I am sure somewhere in the PB array descriptor is a starting address for the contiguous data... which is meaningless except in the current process.
      Michael Mattias

        #4
        Like Michael says, PB likely has to know at least the starting address, so I doubt there is a way to copy the descriptor. But as shown below, the descriptor can be created in about the same time it would take to copy it from a file, maybe less.
        Code:
        #COMPILE EXE
        #DIM ALL
        
        FUNCTION PBMAIN () AS LONG
        
            LOCAL arrStr AS STRING
            OPEN "c:\binaryArrayImage.dat" FOR BINARY AS #1 'media file
            arrStr = STRING$(10000000, 129)                 'this holds your array image
            DIM extVal(999999) AS EXT AT STRPTR(arrStr)     'absolute array makes descriptor for PB use in just several ticks
            ? STR$(extVal(11345))                           'here's a value
            PUT #1,, arrStr                                 'or PUT #1,, extVal() can be used
                                                            'now your array is stored as an exact image
            RESET extVal()                                  'RESET the array and...
            RESET arrStr                                    'RESET the string to prove this works
            SEEK #1, 1                                      'go back to beginning of array image data
            GET$ #1, 10000000, arrStr                       'read your array binary data image
            REDIM extVal(999999) AS EXT AT STRPTR(arrStr)   'again absolute array makes descriptor for PB use in just several ticks
            ? STR$(extVal(11345))                           'same value as above
            CLOSE #1                                        'done with the media file
        
        END FUNCTION

          #5
          Unless you create your own "stored" format as MCM has suggested, data on external media (disk/CD/DVD ...) is going to be devoid of "array descriptors". They aren't there!

          If you want to verify this, use a tool like WinHex and actually look at the data on disk as stored by almost any application that uses typical array data storage methods. It is without array descriptors. Dynamic string data will have a string terminator, and may have additional information if it is stored in packed string format (see the BINARY option in the JOIN$ help for an explanation).

          With fixed-length data, numerical or string, you could dump it into a memory block and possibly DIM ... AT, but as for saving any appreciable time ... nothing you will really see. Array descriptors for fixed-length numerical or string data are not necessarily the same as for a dynamic string array. Why? Because a fixed-length element can be found at a simple, calculated offset from the start of the data.

          Dynamic string arrays, on the other hand, require a string handle for each string, which stores the address where that string's data starts. Immediately preceding that address, PB currently stores the string length. This is not guaranteed to always be the case; it is just an observation of the current dynamic string arrangement. What is more, while you are working with the array PB may reclaim and adjust memory locations for the string data, and the string handle is then updated to the new memory address. Again, be aware that exactly how PB (or, for that matter, almost any compiler) manages string data is not published and is always subject to internal change by the publisher.
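
          A small probe makes these observations visible. This is only a sketch of what can be seen from the outside on a 32-bit PB build; the length prefix in particular is observed behaviour, not something the documentation promises, and the array and its contents are made up for the test:
          Code:
          #COMPILE EXE
          #DIM ALL

          FUNCTION PBMAIN () AS LONG
              DIM a(0 TO 1) AS STRING
              a(0) = "descriptor"
              a(1) = "x"

              ' the array proper holds one handle per element, not the text
              ? "handle slots are" & STR$(VARPTR(a(1)) - VARPTR(a(0))) & " bytes apart"

              ' the text lives elsewhere; STRPTR reports where it is right now
              ? "text of a(0) is at" & STR$(STRPTR(a(0)))

              ' observed, not documented: a length DWORD sits just before the text
              ? "stored length" & STR$(PEEK(DWORD, STRPTR(a(0)) - 4)) & ", LEN says" & STR$(LEN(a(0)))

              ' changing the string may move its data, so a saved address goes stale
              a(0) = a(0) & STRING$(100000, "z")
              ? "text of a(0) is now at" & STR$(STRPTR(a(0)))
          END FUNCTION
          Whether the last address differs from the earlier one depends on what the string engine decides to do, which is exactly why none of these values are worth writing to a file.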
          Rick Angell

            #6
            Originally posted by Richard Angell
            Unless you create your own "stored" format
            exactly what I'm doing.

            Thanks all!

              #7
              Perhaps a more pertinent question, what kind of data are you storing?
              Rick Angell

                #8
                Originally posted by Richard Angell
                Perhaps a more pertinent question, what kind of data are you storing?
                Not pertinent to my three questions, but... All sorts of stuff, basically the whole application state when I save it. Some udts, some udt arrays, mostly dynamic string arrays in which the elements contain UDT arrays.

                  #9
                  Unfortunately the mathematical theorists decreed many years ago "thou shall not have arrays in a data base, you must create a separate table", obviously forgetting that there are many common instances where the programmer always knows in advance the fixed size of the array, like months in the year.
                  So you need to create your own storage; PB will save and read back UDTs without any problems.

                    #10
                    Some udts, some udt arrays, mostly dynamic string arrays in which the elements contain UDT arrays.
                    Since UDT arrays are best stored as binary, and the structure is essentially hard-coded in your program in order to access them, it would seem to make little sense. Otherwise you need to load in BINARY mode and access "members" by pointer + calculated offset. Will this be faster? Doubtful. But portable without knowing the structure, using a generic routine that reads your special header info? Probably. As for dynamic strings, there is plenty of work to do by yourself, with no appreciable benefit readily visible ... is there? Writing out in BINARY mode to PB packed string format may help, but your reading in, unless you code in assembler, will likely not match the speed of PB's native routines, if ever. The last time I had to do the roll-your-own array descriptor bit was a quarter of a century plus ago, before BASIC got all the nice array management tools ... which PB is really optimized to handle well.

                    That said, if you are doing an all-in-one relational DB scheme, then of course you need to store information in a header (or headers) that gets updated as transactions are made and you save the whole enchilada.

                    But to address your first 3 questions, which have already been addressed by Michael et al.:

                    (1) Only for very special needs; otherwise nothing is really to be gained, BECAUSE array descriptors just clue the compiled library routine to calculate or reference where in memory the data is located. That memory address is not guaranteed to be the same in a Windows application at each invocation.

                    (2) From day one, most compilers have never guaranteed that their array descriptor table is going to remain the same eternally. Design time versus run time issues. To be truly portable, one would have to have an industry standard, but take the case of dynamic data. The content of the string handle "array" is the memory address where each string's data is located, which can not only be different each time the program is loaded, but will also change if you change the array data, add to it or delete from it. Once the address is determined during the data load, there is fast, pointer-direct access to the runtime dynamic string data.

                    (3) I cannot think of anything added in PBWin 9.0 that changes the situation, simply because the status quo has proven to be best for the vast majority of needs.
                    Rick Angell

                      #11
                      Rick, thanks again for your reply. What I'm bitching about is just the necessity of storing the ARRAYATTR info separately when really it is part of the whole array data structure. If I had access to the internals I would not use the current address when reloading the data, honest - at least, not very often.

                      As to how I will do it: for dynamic string arrays, given the array size and the VARPTR of the LBOUND element, the array can be stored in a BLOB. To extract it, I need the size, so that memory can be allocated, and the number of elements, so that I can DIM the array AT a suitable address. So I'm storing four separate things, the type, the size, the count, and the data, which are all attributes of the same array.

                      Rather than complicate the issue for UDT arrays, I think I'll just wrap 'em in a string and save the string in another BLOB, along with the length and something to say it's a string.
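
                      A minimal sketch of that wrapping, assuming a made-up UDT called PointType; the string carries the raw element bytes, and the element count falls out of its length:
                      Code:
                      #COMPILE EXE
                      #DIM ALL

                      TYPE PointType                      ' hypothetical UDT, for illustration only
                          x AS LONG
                          y AS LONG
                      END TYPE

                      FUNCTION PBMAIN () AS LONG
                          LOCAL proto AS PointType        ' only used to get the element size
                          LOCAL blob AS STRING
                          LOCAL sz AS LONG, n AS LONG

                          DIM p(0 TO 2) AS PointType
                          p(1).x = 11 : p(1).y = 22
                          sz = LEN(proto)                 ' bytes per element

                          ' wrap: copy the contiguous element data into a string
                          blob = PEEK$(VARPTR(p(0)), ARRAYATTR(p(), 4) * sz)

                          ' unwrap: count comes from the string length, then an
                          ' absolute array is laid over the string's data
                          n = LEN(blob) \ sz
                          DIM q(0 TO n - 1) AS PointType AT STRPTR(blob)
                          ? STR$(q(1).x) & STR$(q(1).y)   ' the values stored in p(1)
                      END FUNCTION
                      The string must not be reassigned while q() is in use, since the absolute array only borrows the string's memory.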

                      To have the descriptor and array in one contiguous block of memory would be useful, even better to let the compiler do it all for me: just point at the object in a buffer and have the array, string or whatever reconstituted, without having to write code to dig out the array type, element count, etc., then allocate memory and DIM the array:

                      Code:
                      <size> = STORE <object> AT <address> TO <result>
                      <enumerated type> = FETCH STATIC|LOCAL|GLOBAL <object> FROM <address> TO <result>
                      It's the kind of thing that would add significantly (IMO) to the usefulness of the compiler.

                      I never saw the point of making life easier for the compiler, or harder for the coder.

                        #12
                        Originally posted by John Petty
                        ..."thou shall not have arrays in a data base, you must create a separate table"
                        BLOBs to that!

                          #13
                          Originally posted by John Gleason
                          ...the descriptor can be created in about the same time it would take to copy it from a file...
                          equally one could just add up all the LENs of the string elements and add no.elements*4 for the lengths. The advantage being that no file is created.
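
                          As a sketch of that sum (the function name BlobSize and the 4-byte length prefix per element are just this post's convention, not a PB format):
                          Code:
                          #COMPILE EXE
                          #DIM ALL

                          ' bytes needed to hold every element as <4-byte length><text>
                          FUNCTION BlobSize (a() AS STRING) AS LONG
                              LOCAL i AS LONG, total AS LONG
                              FOR i = LBOUND(a) TO UBOUND(a)
                                  total = total + 4 + LEN(a(i))
                              NEXT
                              FUNCTION = total
                          END FUNCTION

                          FUNCTION PBMAIN () AS LONG
                              DIM s(1 TO 3) AS STRING
                              s(1) = "one" : s(2) = "" : s(3) = "three"
                              ? STR$(BlobSize(s()))       ' 3 * 4 bytes of lengths + 8 bytes of text = 20
                          END FUNCTION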

                            #14
                            Originally posted by Michael Mattias
                            Because a PB Array (capitalized) is a compile-time "thing" only...
                            so where does ARRAYATTR get its info?

                            Originally posted by Michael Mattias
                            somewhere in the PB array descriptor is a starting address for the contiguous data... which is meaningless except in the current process.
                            I would not object to a little redundancy.

                            Actually the best solution is to have the compiler do the whole job as per my post above.

                              #15
                              Otherwise you need to load in BINARY mode treating and access "members" by pointer + calculated offset
                              Not necessarily....

                              Memory Mapped Files instead of RANDOM disk file access 5-8-04

                              MCM
                              Michael Mattias

                                #16
                                To have the descriptor and array in one contiguous block of memory would be useful,
                                Um, I think that was the solution I proposed. From that structure you could write a FUNCTION to "do it all" in one shot:

                                Code:
                                FUNCTION LoadMyArray (szFilename AS ASCIIZ, arr() AS LONG) AS LONG   ' * see note below
                                ' ******************************
                                '   some code goes here
                                ' ******************************
                                END FUNCTION
                                * could actually return ANY type of array here.
                                Well.. OK, I know *I* could do it, because I have.
                                even better to let the compiler do it all for me
                                Real Men.....

                                (he said, knowing he's leaving for the day in about three minutes)

                                MCM
                                Michael Mattias

                                  #17
                                  Michael,

                                  I purposely left it out knowing you would put it on the table. Figured it was better than 9:1 odds.

                                  Chris,

                                  Why not try both ways and time it with the CPU's performance counter, then post it, (I think some others tried this in the last few years.) Make sure your data can handle all the potential memory management issues the compiled library code already provides for, as well as dynamic strings that do change their length and number in the program. IOW, a real test.
                                  Rick Angell

                                    #18
                                    Originally posted by Richard Angell
                                    Why not try both ways and time it with the CPU's performance counter, then post it, (I think some others tried this in the last few years.)
                                    Try what?

                                    Originally posted by Richard Angell
                                    Make sure your data can handle all the potential memory management issues the compiled library code already provides for, as well as dynamic strings that do change their length and number in the program. IOW, a real test.
                                    Sorry, I have no idea what you mean.

                                      #19
                                      Originally posted by Chris Holbrook
                                      Try what?
                                      Time one array load and access cycle using only PB's commands to load, set up and access the array, versus a datafile with an array info prefix of your construction doing the same thing using your own scheme.

                                      Sorry, I have no idea what you mean.
                                      Dynamic string array management keeps used space to a minimum as changes are made to the strings stored in it. So if a change produces a shorter string it can reclaim the unused space; if longer, it may move a string to unused space, freeing a larger group of bytes for the changed string to be moved into. This is what a string engine does for arrays so as not to have to allocate more memory when it is not necessary, IOW real-time memory optimization.
                                      Rick Angell

                                        #20
                                        Originally posted by Richard Angell
                                        Time one array load and access cycle using only PB's commands to load, set up and access the array, versus a datafile with an array info prefix of your construction doing the same thing using your own scheme.
                                        OK, I see what you mean. I have no need to do this, as a) I have no choice as to how I code the array-to-blob load and save, and b) in this instance, performance is of little concern.

                                        Originally posted by Richard Angell
                                        Dynamic string array management keeps used space to a minimum as changes are made to the strings stored in it.
                                        ... which is why I leave memory allocation to the string engine and locate my UDT arrays inside strings.
