Announcement

Collapse
No announcement yet.

Bringing a small lookup database into memory

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Bringing a small lookup database into memory

    Hi!

    I have an Access database with a simple lookup table
    in it. The structure looks like this (about 13,000 rows):

    field1,field2
    714551,949551
    714312,909312
    714313,909313
    714314,909314
    714315,909315

    Basically I do a lookup on field1
    to get field2.

    Is there a way to load this data
    into memory and do the lookups
    in memory versus doing the lookups
    against an Access database. It
    seems to me that the disk access
    against a database file is much more
    time intensive than doing it in memory.
    Would I bring this data into 2 big
    arrays where Field1 is in Array1 and
    Field2 is in Array2. An array search
    is basically sequential, correct? So,
    if I have 10,000 elements in the array
    I would potentially have to look at
    10,000 elements to find that last lookup
    value. This is unlike a database that offers
    indexing.

    I guess, which is fastest?
    a. Database file with indexes accessed via a file system
    b. Data loaded into an array doing a sequential search
    c. Or, is there some other way to do it in memory?

    Any ideas on this subject would be highly appreciated.
    Thanks in advance.

    Scott

    Scott Wolfington
    [url="http://www.boogietools.com"]http://www.boogietools.com[/url]

  • #2
    b. Data loaded into an array doing a sequential search

    Use this with indexing and you'll get the best speed, if that is
    what you want. If you load everything into an array, sort it and
    then loop through it to get the addresses for changes, like where
    does the items that start with a B begin, where does C begin, etc,
    you can achieve a fantastic speed, using ARRAY SCAN.



    ------------------

    Comment


    • #3
      Thanks for the tip Borje. The array is working
      great. I tried your suggestion about sorting and
      indexing the array "like where does the items
      that start with a B begin, where does C begin,
      etc." It worked, but I didn't really notice an
      improvement in performance. Maybe this is because
      I had to use a few more lines of code
      (ie. if I had 714551 as the number I was searching
      on I'd convert it to a string, grab the first
      character "7" using left$, look that up in my
      "list of indexes" array to see where the 7's start,
      then perform the ARRAY SCAN function with the index
      option). I think I'll end up searching the entire
      array each time.

      One more quick question. The data in my array is
      already sorted. Is the ARRAY function doing a sequential
      search behind the scenes or is it using a "sort routine"
      to find the item being searched for? It's VERY fast.

      Thanks a lot,
      Scott

      Scott Wolfington
      [url="http://www.boogietools.com"]http://www.boogietools.com[/url]

      Comment


      • #4
        Assuming that "equals" or not found" are the only conditions you are looking for (i.e., you are not looking for "greater than"),
        you can sort the array on the "search for" key and do a binary search. Much, much faster than a sequential search (e.g., ARRAY SCAN).

        I have some code around here somewhere to to a binary search of an array. I sent it to Alan Earnshaw for "The Code Corner" but he decided not to print it. If you're interested, I'll see if I can find it and post in over in the Source Code Forum.

        MCM

        Michael Mattias
        Tal Systems (retired)
        Port Washington WI USA
        [email protected]
        http://www.talsystems.com

        Comment


        • #5
          Originally posted by Michael Mattias:

          I have some code around here somewhere to to a binary search of an array. I sent it to Alan Earnshaw for "The Code Corner" but he decided not to print it. If you're interested, I'll see if I can find it and post in over in the Source Code Forum.
          Hey Michael,

          That would be GREAT!

          It sounds like that would work great for the lookup I'm
          doing in this particular array because the values are,
          indeed, unique. I have another array that I'll be doing
          lookups in as well but they won't be unique values. For
          example,

          Fld1,Fld2
          714551,949551
          714551,909551

          could occur. If this occurs then there is other processing
          I do. The binary search probably wouldn't work in this case,
          or would it? It would still be "an equals or not equals"
          situation I guess. Hmmm...

          I look forward to seeing your code. Thanks!

          Regards,
          Scott
          Scott Wolfington
          [url="http://www.boogietools.com"]http://www.boogietools.com[/url]

          Comment


          • #6
            I've been rather impressed with the ARRAY handling in PBCC and
            PB DLL. Recently loaded a 110MB string array into "memory",
            sorted it, and wrote it back to disk. Since I have only 64MB
            PB must have some nice memory algorithms to enable such code
            to work.

            I only mention it because it is another alternative to binary
            search in BASIC code. Assuming all of your Field1 values
            are numeric you could just load the entire 13,000 record
            database into an array and simply index to find Field2. (if
            there are alphabetics in Field1 it could be converted to a
            number using a base 36 number system - less if there are only
            a few different alpha characters, like just A or B).

            Here is a sample using your DB samples. You might test it to
            see how fast or slow it is.
            ----------------------------------------------------------------
            #COMPILE EXE
            FUNCTION PBMAIN() AS LONG
            DIM A(74312:714551) AS LONG
            A(714551)=949551
            A(714312)=909312
            A(714313)=909313
            A(714314)=909314
            A(714315)=909315
            i&=714313
            MSGBOX STR$(A(i&))
            END FUNCTION
            ------------------------------------------------------

            ------------------

            Comment

            Working...
            X