Announcement

Collapse
No announcement yet.

Some good free indexing code?

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Some good free indexing code?

    I would like to create me a fast storage for cells with native code.
    btree or whatever.
    But adding/modifying and obtaining must be very fast.
    Also a large capacity would be nice.

    A memory mapped file comes to mind where indexes are stored in memory pointing to the mmf?
    hellobasic

  • #2
    Would a hash or COM dictionary object be okay for your needs rather than a btree implementation? I've posted a very fast hash source code in the source code forum and Jose has code to manipulate the COM Dictionary object.
    Paul Squires
    FireFly Visual Designer (for PowerBASIC Windows 10+)
    Version 3 now available.
    http://www.planetsquires.com

    Comment


    • #3
      Must be really fast and able to hold a few million cells.
      The requirements i posted are steep, a problematic combination of fast and lot.
      I have alternatives but some are as additional library, not really desired.

      A com interface where cells are instantiated separately may not be fast due memory allocation.
      Can you fill ~1.000.000 cells in less than ~ 1 a 2 seconds?
      hellobasic

      Comment


      • #4
        Describe data and indexing or sorting required.

        ie., is this a "load in sorted order (one time)" or is it a "load so I can find data by key (on demand)?"

        Describe data type (integer, string, etc) and if applicable the key type and if key is part of data or independently-created.

        If it's a simple key=data thing which must be maintained while progam does other things, then you can just store in a pair of PB synchronized arrays*. Use binary search to find data by key, example in source code forum, search "binary" in subject.

        * which could be mmf (memory) objects if use of STATIC or GLOBAL variables not desireable.



        MCM
        Michael Mattias
        Tal Systems (retired)
        Port Washington WI USA
        [email protected]
        http://www.talsystems.com

        Comment


        • #5
          Edwin
          Your speed requirements are pretty tough. If you look in the Café forum “Sorting 1 Petrabyte” you will find a speed example of my engine. The times are actually for a total reindex or building of an index from an existing file. If you are going to add 1 million records at a time then that’s probably the fastest way to go. The index structure is a B+ tree which can be read forwards or backwards with equal speed as well as obviously randomly, with easy and fast add or delete and reuse of deleted data space. Multi user of course with various locking options.
          Each table can have up to 25 indexes and each index can comprise up to 5 fields. The basic difference from say X base is that the table (data file) contains no information about its structure or indexes, just data, and numeric values can be stored in native binary format rather than converting everything to ASCII, in fact if the index fields are kept constant you can store different types of records within the same table as was done in many older languages ie RPG2 so Blobs don’t need a separate file and of course you can have arrays within a table.
          Happy to send you a copy of the DLL (115K) and usage instructions, free for personal use of course.
          John
          Last edited by John Petty; 5 Dec 2008, 10:23 AM.

          Comment


          • #6
            com interface where cells are instantiated separately may not be fast due memory allocation.
            Can you fill ~1.000.000 cells in less than ~ 1 a 2 seconds?
            Define "fill cell."
            Michael Mattias
            Tal Systems (retired)
            Port Washington WI USA
            [email protected]
            http://www.talsystems.com

            Comment


            • #7
              i use sqlite with an in-memory database. it works great. i use it to cache the results of repetitive sql server look-up queries, sorting the contents of text files read into memory, etc.

              -don
              Don Dickinson
              www.greatwebdivide.com

              Comment


              • #8
                sorry, didn't read the "native code" requirement. sqlite would require an extra dll so probably isn't what you want/need.
                -don
                Don Dickinson
                www.greatwebdivide.com

                Comment


                • #9
                  How about DVBtree?
                  It could be modified to use an array instead of disk.
                  Last edited by Mike Doty; 15 Dec 2008, 03:55 PM.
                  The world is full of apathy, but who cares?

                  Comment


                  • #10
                    There is no way Edwin will be able to fill 1,000,000 records in 1 to 2 seconds using a balanced btree algorithm (even if it is array based). There is just too many calculations that need to be done to ensure the tree remains balanced upon each insert (node splits, etc...).
                    Paul Squires
                    FireFly Visual Designer (for PowerBASIC Windows 10+)
                    Version 3 now available.
                    http://www.planetsquires.com

                    Comment


                    • #11
                      How about a virtual grid like PBVList?
                      The world is full of apathy, but who cares?

                      Comment


                      • #12
                        I notice Edwin has not clarified his request. But let me offer:

                        If you have 1,000,000 data items which need to be sorted every time the program runs, I'd suggest they be STORED sorted or indexed.

                        PB's PowerTree works real nice. I think it's only 99 bucks with no charge to redistribute as part of your larger application.

                        MCM
                        Michael Mattias
                        Tal Systems (retired)
                        Port Washington WI USA
                        [email protected]
                        http://www.talsystems.com

                        Comment


                        • #13
                          Good suggestion. He wants free and native code which makes this much harder.
                          The world is full of apathy, but who cares?

                          Comment


                          • #14
                            It was a request for the future, i may ever need it and was interested what would 'fly bye'

                            I can buy anything but i want plain code
                            However, don't make it a big deal yet, the limits still remain though.

                            This topic: maybe later to reopen?
                            hellobasic

                            Comment


                            • #15
                              >He wants free and native code

                              That's twice this term has come up in this thread.

                              No, not "free" ; that much I understood.

                              But since when does 'native code' mean "no DLLs?" What do you think is in a DLL, "non-native" code? Were it that it could not execute!

                              Besides, Real Men aren't afraid of DLLs.


                              MCM
                              Michael Mattias
                              Tal Systems (retired)
                              Port Washington WI USA
                              [email protected]
                              http://www.talsystems.com

                              Comment


                              • #16
                                When someone asks for code they want source code. If they ask for native code they probably mean "source code."
                                Thanks for keeping us honest that source is not the same as native.
                                The world is full of apathy, but who cares?

                                Comment


                                • #17
                                  If EXE size is not an issue Edwin, save the data (sorted) as RCDATA and just block-copy it in at runtime.

                                  No DLLs, and you already have the code to retrieve it.
                                  Michael Mattias
                                  Tal Systems (retired)
                                  Port Washington WI USA
                                  [email protected]
                                  http://www.talsystems.com

                                  Comment


                                  • #18
                                    Native as in native PowerBASIC code..
                                    However, let's drop it for a while.
                                    hellobasic

                                    Comment


                                    • #19
                                      Edwin,

                                      Take a look at SkipLists, at link.

                                      Regards,

                                      Marcel
                                      If something must done the proper way you must do it yourself.

                                      Comment


                                      • #20
                                        New link.

                                        Originally posted by Marcel Kollenaar View Post
                                        Edwin,

                                        Take a look at SkipLists, at https://www.cs.cmu.edu/~ckingsf/bioi.../skiplists.pdf

                                        Regards,

                                        Marcel
                                        If something must done the proper way you must do it yourself.

                                        Comment

                                        Working...
                                        X