
ADO, read records faster?

  • ADO, read records faster?

    I wonder if it would be better to change my code to early bound instead of late bound.
    I am looking for some improvement in data-reading speed.
    The rebuild would take some time, which I want to avoid if the improvement is small.
    hellobasic

  • #2
    I don't know how you can speed up specifically the "read".

    By the time you are doing MoveNext the recordset has been created, and my best guess is Windows has some caching mechanism for that, too... i.e., I doubt you are going back to disk for each fetch.

    That is, the number of COM methods you execute or properties you get is probably very small relative to the time whatever DBMS or file system takes to actually generate the recordset.

    But, if I read the printed manual correctly (and it was a heck of a lot easier to get this out of the printed manual than the help file)... you accomplish "early binding" with the dispatch interface simply by including an INTERFACE definition block with a positive integer in angle brackets following each member... which is what the PB COM Browser does, and which is how I created Generic 'ADO' Connection and Query Tester (CC 5+/Win 9+) 11-02-08

    Now, if you want to talk about the difference between the dispatch interface and the direct interface, that may or may not be worth looking at... because as I understood the manual, once you put that number in the IDBIND interface definition, all the "look up property or method by name" work is handled at compile time and does not become a run-time consideration.

    Of course, you might also improve your performance by...
    - Changing DBMS or file systems
    - Changing the overall design of your application
    - Changing the specific access methods you use.

    Code not shown. DBMS/file system not specified. Application not described.

    MCM
    Michael Mattias
    Tal Systems (retired)
    Port Washington WI USA
    [email protected]
    http://www.talsystems.com



    • #3
      Ah yes, the newer IDBIND (or whatever it's called) may have been optimized for this.
      (I was thinking late bound, but the member IDs are stored/cached here..)

      - Changing the overall design of your application
      Yes, there is some overhead at another level.

      - Changing the specific access methods you use.
      That may be a transition to early-bound calls.
      We use forward-only, and thus a record will only be read once.
      hellobasic



      • #4
        When using Automation, all methods and properties are called through IDispatch.Invoke, which requires a dispID (dispatch identifier) as one of its parameters. When using early binding, the compiler retrieves the dispID from the interface declaration; when using late binding, it has to add a call to IDispatch.GetIDsOfNames to retrieve it at runtime. Therefore, early binding is a bit faster because you avoid the call to GetIDsOfNames.

        Direct interface calls are faster than Automation because the methods and properties are called directly, like a standard function (although using double indirection), and the parameters are strongly typed, not variants, and therefore don't need to be converted.

        The time you will save depends on how many calls your application will do.
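        The saving described above can be sketched outside COM. The following is a toy Python simulation (not real IDispatch; the class and counters are invented for illustration): late binding pays a name-to-dispID lookup on every call, while early binding resolves the ID once and reuses it.

        ```python
        # Toy simulation of IDispatch-style dispatch -- NOT actual COM code.
        class FakeDispatch:
            """Stand-in for an Automation server: methods are invoked by integer dispID."""
            def __init__(self):
                self._names = {"MoveNext": 1}                 # name -> dispID table
                self._methods = {1: lambda: "moved"}          # dispID -> implementation
                self.lookups = 0                              # counts name lookups

            def get_ids_of_names(self, name):                 # analogue of GetIDsOfNames
                self.lookups += 1
                return self._names[name]

            def invoke(self, dispid):                         # analogue of Invoke
                return self._methods[dispid]()

        obj = FakeDispatch()

        # Late bound: every call resolves the name first.
        for _ in range(1000):
            obj.invoke(obj.get_ids_of_names("MoveNext"))
        late_lookups = obj.lookups

        # Early bound: the dispID comes from the interface declaration,
        # i.e. it is resolved once and cached.
        obj.lookups = 0
        movenext_id = obj.get_ids_of_names("MoveNext")        # one-time resolution
        for _ in range(1000):
            obj.invoke(movenext_id)
        early_lookups = obj.lookups

        print(late_lookups, early_lookups)                    # 1000 vs 1
        ```

        The per-call cost that disappears is exactly the GetIDsOfNames round trip; the Invoke itself is identical in both cases.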
        Forum: http://www.jose.it-berater.org/smfforum/index.php



        • #5
          Zillions of field-data retrievals..
          I mean, a DB with ~6 fields and a result set of > 100000 (up to 500000 sometimes) records (consolidated!).

          I know it's worth exploring, but rewriting existing code is not that easy.
          The code needs to remain working, and I'll need to rewrite it in one pass.
          And if it does not increase speed.. ?

          Oh, btw, different database types are being used with this code part.

          Direct calls may require some Unicode conversion(?)
          hellobasic



          • #6
            Direct calls may require some unicode conversion(?)
            Of course, unless the parameter is a Variant, in which case PB will do the conversion for you. I hope that the new version of PB will support Unicode natively.
            Last edited by José Roca; 8 Jun 2009, 12:40 PM.



            • #7
              I mean, a DB with ~6 fields and a result set of > 100000 (up to 500000 sometimes) records (consolidated!).
              With result sets of 100K-500K rows, I seriously doubt the choice of retrieval method is going to be all that significant.

              Of course, if it's a PERCEPTION thing, maybe you could do something to get, say, one tenth of the records at a time, and then feed that info off and process it asynchronously (another thread of execution, or even another process).

              But the SELECT for all the rows is going to take whatever it takes, and you are not going to be able to start fetching until the recordset is ready for you. Maybe tuning your SQL to get ten little recordsets will work out better for you than getting one big one.
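              The "fetch in chunks, process asynchronously" idea above can be sketched as follows. This is a minimal Python illustration, with a generator standing in for the ten smaller SELECTs and a queue-fed worker thread standing in for the second thread of execution; all names here are invented for the sketch.

              ```python
              # Sketch: overlap chunked fetching with asynchronous processing.
              import queue
              import threading

              def fetch_chunks(total_rows, chunk_size):
                  """Stand-in for issuing several smaller SELECTs instead of one big one."""
                  for start in range(0, total_rows, chunk_size):
                      yield list(range(start, min(start + chunk_size, total_rows)))

              work = queue.Queue()
              processed = []

              def worker():
                  # Consumes chunks while the main thread keeps fetching.
                  while True:
                      chunk = work.get()
                      if chunk is None:          # sentinel: no more chunks coming
                          break
                      processed.extend(chunk)    # real code would do per-row work here

              t = threading.Thread(target=worker)
              t.start()
              for chunk in fetch_chunks(100, 10):   # "one tenth of the records at a time"
                  work.put(chunk)
              work.put(None)
              t.join()
              print(len(processed))   # 100
              ```

              The user perceives progress as soon as the first chunk arrives, even though the total elapsed time may be similar.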

              SQL not shown.

              MCM



              • #8
                It is possible that creation of the recordset takes up to 2 or 3 minutes.
                That is a rare occasion though; the server can be flawed, but more often the DB engine is prehistoric.
                I mean, I still get DBF-based engines today for applications which are updated regularly but whose drivers are not.

                In this case it was SQL Server, and it also took > 1 minute to come up with a recordset.
                That is not the point, but accumulating all the data took 25 minutes.
                Ok, that part is now rewritten, but imagine that you can bring this back from 4 to 2 minutes.
                This may not appear to be a problem to you as a user, but imagine I bring it back from 25 minutes to 10?
                Still way off, but what if direct calls can influence the process..?
                Rewriting this code to early binding to gain 5% is no good; too much work at this time.

                It's guessing...
                hellobasic



                • #9
                  In this case it was SQL server and also took > 1 minute to come up with a recordset.
                  That is not the point but accumulating all data took 25 minutes
                  It took 25 minutes to "accumulate" a half million rows with six columns once you started fetching? I mean, 500,000 'move next' operations with six 'move variant to a string array' operations each just does not take 25 minutes. No way.
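                  A rough sanity check of that arithmetic, in Python rather than PB/ADO (so no COM or variant-conversion overhead; the data is a fabricated in-memory stand-in for an already-fetched recordset): a loop doing 500,000 row advances with six per-field copies each completes in seconds, not minutes, which suggests the 25 minutes is being spent elsewhere.

                  ```python
                  # Sanity check: 500,000 "MoveNext" iterations x 6 field copies each.
                  import time

                  rows, cols = 500_000, 6
                  src = [("v",) * cols] * rows           # fake, already-fetched recordset

                  t0 = time.perf_counter()
                  out = []
                  for row in src:                         # the "MoveNext" loop
                      out.append([row[i] for i in range(cols)])   # six per-field copies
                  elapsed = time.perf_counter() - t0

                  print(f"{rows} rows x {cols} fields copied in {elapsed:.2f}s")
                  ```

                  Real ADO adds Invoke and variant-to-string costs per field, but even generous multiples of this loop's time stay far under 25 minutes; the bottleneck is likely the query, the network, or per-row work in the application.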

                  Show offending code. You have got to be doing something you did not mean to do in there.

                  Try running that same query against the demo I posted link to.
                  [ADDED]
                  Heck, just REPLACE your query function with mine! It's all set up to use as an #INCLUDE file!
                  Last edited by Michael Mattias; 8 Jun 2009, 04:11 PM.



                  • #10
                    >Try running that same query against the demo I posted link to.
                    Where is that..?

                    I think I'll write myself a simple test app and see what it does.

                    Note that these are servers; they are often accessible via the internet.
                    The DB here was at least 12GB (so I heard).
                    The query is not such an issue imo; the recordset may take up to 2 minutes, and I wouldn't mind.
                    Once I get access to the first record, that time is more important to me.
                    The reason is that queries in some cases cannot be optimized with certain drivers.
                    Time-outs and too many tasks make them abort; I am not talking about SQL Server, of course.
                    But the command time may differ per engine anyway.

                    The throughput may indeed be an issue here; we'll see..
                    hellobasic



                    • #11
                      >Where is that..?

                      Link in Post#2.
                      Repeated:

                      http://www.powerbasic.com/support/pb...ad.php?t=39036
                      Michael Mattias



                      • #12
                        Ok, today I got the chance to work this out.
                        I compared our main app with a conversion tool I once wrote.
                        Both use the same ADO technique, and the main app's retrieval time was over 4 minutes (one year) while the conversion tool did it in 40 seconds.
                        So.. it's somewhere else, since retrieving 100000 records in 40 secs is good enough for now.

                        Thanks!

                        hellobasic



                        • #13
                          I may be revisiting this shortly. I have an application coming up where I will regularly be getting 90K to 100K rows back. (It's a batch application so I won't be TOO concerned if it takes a couple of minutes).

                          While I have used the above query function 'for real' with the Jet (Access), Oracle (both MSDAORA and OraOLEDB.Oracle) and Sql/Server (sqloledb) OLE providers, ODBC data sources and the Jet provider against an Excel spreadsheet file, I have never had this many rows to return before.

                          (This app will be Oracle)

                          So we'll see what happens at those higher volumes....

                          MCM



                          • #14
                            The number of records should only influence the execution time needed to collect the data.
                            Actual data retrieval should be much less of an issue for the driver.

                            What concerns me a little is wrongly used parameters for the ADO connection,
                            like the server and local flags and such.
                            I am trying to understand how to get a real forward-only recordset.
                            During command execution, what you get for a result set really differs:
                            a full snapshot, a keyset (index to row), and maybe even a better situation.
                            I am not sure if there is an even better scenario than keyset.
                            (This is what I call it.)

                            Just like I said, I am interested in fast data retrieval, and it can be forward-only.
                            It may be that even the command time can be reduced by using better flags.
                            But in my case it's no fun to present the end-users with checkboxes to improve a connection.
                            It must be something that works for all situations, like I currently have.
                            (But I doubt I have a FO recordset at this time; I treat it that way though.)

                            If any of the wording I chose in this message is unclear.. let me know.
                            hellobasic



                            • #15
                              The type of cursor supported (e.g. forward-only) can be influenced by the OLE provider in use, but the sqloledb provider used with SQL/Server (assumed in use, code not shown) should support a forward-only cursor.

                              You can set the CursorType property of the recordset object to %adOpenForwardOnly. This has to be done before the recordset is opened; once the recordset is opened this property is read-only....

                              ... SO.....

                              ...to do this you may have to execute the SQL statement thru the Open method of the RecordSet object (with the CursorType property set as desired)... rather than via the Execute method of the Connection object as shown in my demo and perhaps (code not shown) what you are doing now.

                              There may be a way to set the cursor type for the recordset returned by the Connection.Execute method, but I can't find that in my ADO reference manual. Or maybe there's an option to do this using the Command object. (I'll let you look that one up yourself).
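                              The ordering rule above (CursorType is writable only while the recordset is closed) can be modeled with a toy class. This is a Python simulation for illustration only; the class, constants, and error are invented, and real code would use the ADODB.Recordset COM object.

                              ```python
                              # Toy model of the ADO rule: set CursorType BEFORE Open; afterwards
                              # the property is read-only. Simulation only -- not real ADO.
                              AD_OPEN_FORWARD_ONLY = 0
                              AD_OPEN_KEYSET = 1

                              class Recordset:
                                  def __init__(self):
                                      self._cursor_type = AD_OPEN_KEYSET   # some default
                                      self._open = False

                                  @property
                                  def cursor_type(self):
                                      return self._cursor_type

                                  @cursor_type.setter
                                  def cursor_type(self, value):
                                      if self._open:
                                          raise RuntimeError("CursorType is read-only once the recordset is open")
                                      self._cursor_type = value

                                  def open(self, sql, connection):
                                      # Real ADO would execute the SQL here via Recordset.Open.
                                      self._open = True

                              rs = Recordset()
                              rs.cursor_type = AD_OPEN_FORWARD_ONLY   # must happen BEFORE Open
                              rs.open("SELECT ...", None)

                              try:
                                  rs.cursor_type = AD_OPEN_KEYSET     # too late: already open
                              except RuntimeError as e:
                                  print("error:", e)
                              ```

                              This is also why opening through Recordset.Open matters: Connection.Execute hands you a recordset that is already open, past the point where CursorType can be changed.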

                              MCM



                              • #16
                                Perhaps to speed up the OVERALL application, you need to add one or more indices or even VIEWs to that DBMS....

                                SQL not shown. Current schema not shown.
                                Michael Mattias



                                • #17
                                  I can show whatever code and queries you want, but it's of no use since it is common code for all kinds of databases.
                                  Only those flags I mentioned could have some impact.
                                  The queries are written to be usable on a low-capacity server; for example, we try to use GROUP BY, but if the driver gets exhausted (during development) we go back to reading all records.
                                  Sorting is also one of the options we abandon right away.

                                  Again, this is not code for one specific in-house task; it will run on clients' computers and they set up the connection themselves.
                                  We write code for Exact and other applications, but under the hood we use a common ADO (or ADO.NET if that fails) custom COM interface.

                                  I would like to focus on the connection settings I mentioned earlier.
                                  Imo the overhead from using late-bound calls is not a serious problem, but it is something we could address.
                                  As we have seen, we can gain a lot by improving the parts which move the field data into memory and the communication between the two apps.
                                  That is the actual bottleneck.

                                  Reading 100000 records in 40 seconds is good, but I also know it could be somewhat better.
                                  5 minutes is way too long, and bringing this back to under 2 minutes would be great.

                                  Thanks for helping; maybe you have some thoughts about optimizing the forward-only strategy.
                                  This is my main concern.
                                  (A good connection may reduce the data-collection time.)
                                  hellobasic



                                  • #18
                                    >Only those flags i mentioned could have some impact.

                                    I believe that is a poor assumption on your part.

                                    >Reading 100000 in 40 seconds is good...

                                    No, it isn't.

                                    >... but i also know it could be somewhat better.

                                    Yes, it could.

                                    >5 minutes is way to[o] long ...

                                    Yes, it is.

                                    >and bringing this back to under 2 minutes would be great

                                    No, it would not be great; it would be better than 5 minutes.

                                    If you have a performance problem.. and you must, or you would not have even started this thread... you owe it to yourself and your users to look at every possible bottleneck in the system; to focus only on "late bound vs. early bound function calls" or "forward-only cursor" does everyone a disservice.

                                    First step of course is to identify WHERE the bottlenecks are.

                                    Might there be something you can do vis-a-vis late-bound vs. early-bound function calls, or cursor type? Sure, there might be.

                                    But might there not also be changes to the SQL used? Changes to the database? Changes to the way your network moves the data from the server to the user? Changes to the design of your application which would eliminate - or make more efficient - the retrieval of the data required by the application?

                                    (Hint: "Yes.")

                                    Tunnel vision is a Good Thing only when you are in a tunnel.


                                    MCM



                                    • #19
                                      You misunderstood; the parts worth looking at via this board are the things we can share, like ideas about the connection part and possibly the late-bound issue.
                                      However, I cannot share the code, and certainly not the other code (where the real bottlenecks are).
                                      These bottlenecks must be explored, but there is no use discussing them here unless it is a common part we could have a discussion about.
                                      So far I haven't had the time to gather this info anyway.

                                      The forward-only investigation I am talking about is simply a thing I always wanted to discuss, but note that everything currently is stable and I cannot simply make all kinds of changes without severe testing.

                                      Bringing the duration from 5 down to 2 minutes is a good thing... for the end-user; technically speaking it may not be so optimized, I know that.
                                      The first important thing is that the customer is happy.
                                      If this app changes each week they will kill me at some point.
                                      hellobasic



                                      • #20
                                        Edwin,

                                        How long does it take to display the same recordset in TSQL?
                                        You should get the same, or just a little bit longer, retrieval time from your application.

                                        Peter Redei

