Database with calculations

  • Database with calculations

    I have several million latitude/longitude pairs in a database table, and I need to sort them according to their distance from a target location, for example my house. It's easy enough to write an external program that performs Great Circle calculations and updates a DistanceFromTarget field in the database, but updating millions of single records takes a long time. I want to be able to specify a lat/lon target and have the database do the calculations when I perform a query.

    How would I approach that?
    "Not my circus, not my monkeys."

  • #2
    Hello Eric, is the target location variable? If the target location is static, then saving an extra field in the table (as you said) with the distance to the target will even let you sort the records by distance. That would be the simplest solution.

    If the target location is variable, newer versions of MySQL let you implement external custom functions. I'm not sure, but I think you can even use PowerBASIC to compile a C-style (CDECL) function to do the calculations and feed the result back to the database, unless MySQL runs interpreted code instead of using an actual external module (DLL). Unfortunately I only know the topic in theory and have never actually done it myself, so I can't provide you with an example.

    Added:
    Take a look at this:
    https://dev.mysql.com/doc/extending-...dding-udf.html
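    For illustration only, here is a minimal sketch of the simpler, non-compiled route: an ordinary stored function (not a loadable UDF) that computes the great-circle distance, which a query can then sort on. It assumes MySQL, degrees stored in the table, and invented names (f_gc_distance_nm, locations, placename); the target coordinates are made up as well.

    Code:
    DELIMITER //
    CREATE FUNCTION f_gc_distance_nm(lat1 DOUBLE, lon1 DOUBLE,
                                     lat2 DOUBLE, lon2 DOUBLE)
    RETURNS DOUBLE DETERMINISTIC
    BEGIN
      -- haversine great-circle distance; inputs in degrees,
      -- 3437.74677078493925 converts radians of arc to nautical miles
      RETURN 2 * ASIN(SQRT(POWER(SIN(RADIANS(lat2 - lat1) / 2), 2)
                           + COS(RADIANS(lat1)) * COS(RADIANS(lat2))
                           * POWER(SIN(RADIANS(lon2 - lon1) / 2), 2)))
             * 3437.74677078493925;
    END //
    DELIMITER ;

    -- sort by distance from a (hypothetical) target at query time
    SELECT placename, lat, lon,
           f_gc_distance_nm(lat, lon, 43.39, -87.87) AS dist_nm
    FROM   locations
    ORDER BY dist_nm;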
    www.pluribasic.com
    www.patreon.com/pluribasic



    • #3
      > variable?

      Yes, that's the goal.
      "Not my circus, not my monkeys."



      • #4
        How about input from and output to CSV? Import/export is fast - at least it is on SQL Server.



        • #5
          Without knowing which database engine you're using, it's impossible to recommend an approach.
          Some allow user-defined functions, others don't, and those that do have very different capabilities.

          Do you need all of those millions of locations with a distance, or is there a maximum number of records or a maximum distance you are interested in?


          ISTM that an initial filter on min/max lat/long to select all records within a rectangle around the location would let you reduce the number of complex GC calculations.
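          For illustration (table and column names are invented, and the deltas would be derived from the maximum distance of interest), that rectangle filter might look like:

          Code:
          -- cheap bounding-box pre-filter; no trig until the rectangle has done its work
          SELECT placename, lat, lon
          FROM   locations
          WHERE  lat BETWEEN @target_lat - @lat_delta AND @target_lat + @lat_delta
          AND    lon BETWEEN @target_lon - @lon_delta AND @target_lon + @lon_delta;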



          • #6
            Maybe you can define a VIEW... which is basically a standing query, with the DB doing all the work.

            If the "target location" is constant then this will work, but if it's variable it won't.

            Still, no reason you can't have the DB return the distance as a calculated column, just like you'd return an aggregate:

            Code:
               SELECT Point1, Point2, (Point2 - Point1) AS distance from tablename where condition order by distance desc
            Some [DBs] allow User Defined Function, others don't.
            Assuming your DB supports it, this was my thought, too*; but you could define a parameterized procedure returning the distance.

            As a function:
            Code:
            SELECT Point1, Point2, f_distance(Point1, Point2) as dist_between from tablename where condition order by dist_between
            OR
            As a procedure:
            Code:
               CALL procname (point1, point2, distance_between)
               /* distance_between is an output or I-O param */
            Just keep your condition such that you are not always doing the whole database.

            but updating millions of single records takes a long time.
            What updating? There's no need to update the DB with a distance from a run-time-specified point.

            Overall... unless it's just the desire to keep your question short, it sounds like you may be trying to do too much here.

            But in-line SQL in a SELECT statement, a PROCEDURE, or a FUNCTION can each accomplish your goal of "letting the DB do the work."

            MCM
            * given my druthers I'd use Oracle, which does allow the definition of a FUNCTION.

            Michael Mattias
            Tal Systems (retired)
            Port Washington WI USA
            [email protected]
            http://www.talsystems.com



            • #7
              > SELECT Point1, Point2, (Point2 - Point1) AS distance
              Unfortunately, it's actually more like:

              Select lat, lon, ? as lat2, ? as lon2, (2 * arcsin(SQR((SIN((lat - lat2) / 2)) ^ 2 + COS(lat) * COS(lat2) * (SIN((lon - lon2) / 2)) ^ 2)) * 3437.74677078493925) as distanceNM
              where lat/lon are in radians.

              Which probably doesn't work because you generally can't bind a parameter to a select column
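              One way to sidestep the binding question in MySQL, purely as a sketch, is to put the target into session variables and reference those in the SELECT instead of ? markers (this assumes the lat/lon columns are already stored in radians, as above; the table name and target values are invented):

              Code:
              SET @lat2 := RADIANS(43.39), @lon2 := RADIANS(-87.87);   -- hypothetical target

              SELECT lat, lon,
                     2 * ASIN(SQRT(POWER(SIN((lat - @lat2) / 2), 2)
                                   + COS(lat) * COS(@lat2)
                                   * POWER(SIN((lon - @lon2) / 2), 2))) * 3437.74677078493925 AS distanceNM
              FROM   locations
              ORDER BY distanceNM;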



              • #8
                The more I think about this, the more I like the idea of getting the several million lat/lon pairs and placenames into an array and doing the calculation and sort in PB (again prefiltering/bounds-checking if possible to reduce the number of subsequent GC calculations).

                Either with an array of distances and the Name/lat/lon as a TAGARRAY, or with a custom GC sort function.

                I'm pretty sure that would be faster than doing the calculations with the DB engine.



                • #9
                  You may like to consider a "sector" approach: https://docs.microsoft.com/en-us/arc...longitude-data



                  • #10
                    Is this what you are looking for:
                    https://daynebatten.com/2015/09/lati...-distance-sql/



                    • #11
                      Thanks all! After reading your replies I'm thinking that I already have a pretty good solution. I pre-filter with two pre-calculated "hemisphere" fields (N/S and E/W), then filter with a rectangle, do the math on the subset to convert the rectangle to a circle around the target, and insert the results into the database. I guess I just need to work on automating the process; making the database do the work isn't really going to improve anything.
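                      Purely for illustration, if one did want the engine to do that whole pass itself, the flow might look like the sketch below. It reuses the hypothetical f_gc_distance_nm function from the earlier sketch, and the hemisphere_ns/hemisphere_ew columns, the deltas, the hemisphere values and the nearby table are all invented stand-ins:

                      Code:
                      -- 1) cheap filters: pre-calculated hemisphere flags, then a lat/lon rectangle
                      -- 2) the great-circle distance trims the rectangle to a circle and is stored with the row
                      INSERT INTO nearby (placename, lat, lon, distance_nm)
                      SELECT placename, lat, lon, distance_nm
                      FROM (
                          SELECT placename, lat, lon,
                                 f_gc_distance_nm(lat, lon, @target_lat, @target_lon) AS distance_nm
                          FROM   locations
                          WHERE  hemisphere_ns = 'N' AND hemisphere_ew = 'W'
                          AND    lat BETWEEN @target_lat - @lat_delta AND @target_lat + @lat_delta
                          AND    lon BETWEEN @target_lon - @lon_delta AND @target_lon + @lon_delta
                      ) AS box
                      WHERE  distance_nm <= @max_nm;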

                      Thanks for the feedback!
                      "Not my circus, not my monkeys."



                      • #12
                        Hi Eric - You’re probably doing this already, but just in case you aren’t: make sure to use “BEGIN TRANSACTION” before doing any database inserts/updates and “END TRANSACTION” after your final database insert/update. I’ve found in my own experience that this makes a huge difference in the performance of DB writes. Don’t wrap individual DB updates between Begin/End, as that defeats the performance boost. Obviously, the latter doesn’t apply if you are making use of the rollback functionality of the DB.
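                        As a rough sketch of that batching (MySQL spells it START TRANSACTION ... COMMIT, other engines use BEGIN/END TRANSACTION; the table name and the ? placeholders bound by the calling program are illustrative):

                        Code:
                        START TRANSACTION;                 -- one transaction around the whole batch
                        INSERT INTO nearby (placename, lat, lon, distance_nm) VALUES (?, ?, ?, ?);
                        -- ... repeat for the rest of the batch ...
                        COMMIT;                            -- a single commit instead of one per row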



                        • #13
                          Which probably doesn't work because you generally can't bind a parameter to a select column
                          You cannot bind parameters to ANY database object. The DBOs must be fixed at preparation ("database internal compilation") time.


                          Michael Mattias
                          Tal Systems (retired)
                          Port Washington WI USA
                          [email protected]
                          http://www.talsystems.com



                          • #14
                            Make sure to use “BEGIN TRANSACTION” before doing any database inserts/updates and “END TRANSACTION” after your final database insert/update. I’ve found in my own experience that this makes a huge difference in the performance of DB writes.
                            That "may" be because some transaction-capable DBs, or user installation defaults (sometimes these install defaults are buried and not all that easy to find), will treat an entire session (user logon through user logoff) as a single transaction. In that case there can be a whole lot of journalling going on, and that takes time. Note also that some DBs will consider a successfully executed SELECT statement to be an automatic "END TRANSACTION, COMMIT" - again, sometimes controlled by user installation default. That would start a NEW transaction on the next operation, and there would be no journals to update.

                            So maybe just slipping in a SELECT would relieve the pressure.
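                            In MySQL, for example, the relevant session default can be inspected and changed like this (a sketch; other engines have their own knobs for the same behaviour):

                            Code:
                            SELECT @@autocommit;    -- 1 = every statement commits (and journals) on its own
                            SET autocommit = 0;     -- statements now accumulate until an explicit COMMIT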

                            Don’t wrap individual DB updates between Begin/End as that defeats the performance boost.
                            BEGIN TRANSACTION/END TRANSACTION should not be used as a "performance" tool; it is a data integrity tool!

                            Your best resource for enhancing the performance of DML statements is a good DBA. I'm not a DBA myself, but I have done work for several firms who provided access to their DBAs, even for outside consultants like I was.

                            (I remember the guys at Allstate Insurance in Northbrook IL were especially helpful to me. That's just a good word for them, as they took the time to help me "tune" my queries and other DML statements.)
                            Michael Mattias
                            Tal Systems (retired)
                            Port Washington WI USA
                            [email protected]
                            http://www.talsystems.com



                            • #15
                              > you generally can't bind a parameter to a select column
                              Originally posted by Michael Mattias

                              You cannot bind parameters to ANY database object. The DBOs must be fixed at preparation ("database internal compilation") time.
                              "Never say never".

                              It depends on the RDBMS and what you are doing in it. It is quite possible to pass a column name as a parameter to a stored procedure and have the SP build and execute a query based on it.
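                              A sketch of that in MySQL (the procedure and table names are invented): the column name arrives as an ordinary string parameter, and the procedure builds, PREPAREs and EXECUTEs the statement text itself:

                              Code:
                              DELIMITER //
                              CREATE PROCEDURE p_sort_by(IN sort_col VARCHAR(64))
                              BEGIN
                                -- splice the caller-supplied column name into the statement text, then run it
                                SET @sql = CONCAT('SELECT placename, lat, lon FROM locations ORDER BY ', sort_col);
                                PREPARE stmt FROM @sql;
                                EXECUTE stmt;
                                DEALLOCATE PREPARE stmt;
                              END //
                              DELIMITER ;

                              CALL p_sort_by('lat');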



                              • #16
                                Originally posted by Michael Mattias
                                BEGIN TRANSACTION/END TRANSACTION should not be used as a "performance" tool
                                "is a" <> "cannot be beneficial for other purposes".

                                If one side effect of a BEGIN TRANSACTION ... END TRANSACTION is a 1-2 orders of magnitude decrease in execution time (which I've seen in practice), it is well worth using as a "performance tool".



                                • #17
                                  It depends on the RDBMS and what you are doing in it. It is quite possible to pass a column name as a parameter to a stored procedure and have the SP build and execute a query based on it.
                                  I have done that, passing either table names OR column names to a procedure and using them in a query built in said procedure; but that is not the same as binding a parameter to a prepared statement... it's more like a MACRO than anything else, or passing an argument or setting an environment variable for use in a batch/command file.

                                  There is no parameter binding to database objects in any DML statement (SELECT, INSERT, UPDATE, DELETE, plus catalog functions such as CREATE TABLE). I don't want anyone thinking it is possible. Think about what PREPAREing a DML statement does and you'll say "Of course" after a little while.


                                  MCM

                                  Michael Mattias
                                  Tal Systems (retired)
                                  Port Washington WI USA
                                  [email protected]
                                  http://www.talsystems.com



                                  • #18
                                    I once sorted a file with millions of 100-byte records (on any key positions) in a few seconds using OptTech Sort, which easily won a challenge here. It might be of interest.

                                    See post #38. With newer SSDs and faster CPUs, the times would be dramatically reduced from this 2012 test:
                                    https://forum.powerbasic.com/forum/u...-external-sort



                                    • #19
                                      Ooh... I have OptTech Sort for MS-DOS. I $LINKED it into some PB/DOS programs I wrote... hmm, when was that? Well, more than two weeks ago, anyway.

                                      ISTR OptTech Sort was the Cadillac of third-party sorts at the time.

                                      Michael Mattias
                                      Tal Systems (retired)
                                      Port Washington WI USA
                                      [email protected]
                                      http://www.talsystems.com



                                      • #20
                                        I have used Opttech (two "t"s) Sort for DOS and Windows as well, for decades. Awesome product. All my current sorting needs are fulfilled by PB Collections, but I would not hesitate to use OTSort again.

