Server memory tricks


  • Server memory tricks

    I have a CGI.exe that produces results from a "database". The results are up to 60Mb in size and are returned to the PB client app that requested them.

    The results can be thought of as one big STRING that is parsed by the client.
    I decided to break the string up into packets of, say, 64k (give me a better number) and have the client request them one at a time (give me a better idea).
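
    Something like this is what I mean by packets (just a sketch; GetPacket is a made-up name and the 64k size is only my first guess):

    Code:
      ' Sketch: slice the big result string into fixed-size pieces on
      ' demand. Packets are numbered from 1; the last one may be short.
      FUNCTION GetPacket(sResult AS STRING, BYVAL lIndex AS LONG) AS STRING
          FUNCTION = MID$(sResult, (lIndex - 1) * 65536 + 1, 65536)
      END FUNCTION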

    So, while this time-consuming xfer is going on, 60Mb of server memory is tied up. So what, you say. Now imagine 100 or even 1000 users at once.

    Two problems arise:
    -Efficient usage of memory
    -Handling out of memory situations

    Assuming the IIS server will handle the out-of-memory situations, I want to make the best use of memory I can.

    I have two ways to store the data:
    -A continuous sequence of bytes in a BYTE array or STRING
    -A linked list of packets where each node is only 64k of continuous bytes

    Now, without knowing anything about server firmware, IIS, or memory management in general, it appears that there might be more 64k 'slots' available than 60Mb 'slots' when the loading gets high?

    If that is the case, then I can design my CGI.exe to hold the data in a linked list rather than a STRING. Each node might look something like the sketch below.
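
    A rough sketch of one node (the field names are mine; the point is that each node carries bookkeeping overhead on top of its 64k payload):

    Code:
      ' One linked-list node: a 64k block plus bookkeeping.
      TYPE PacketNode
          pData AS DWORD    ' pointer to this node's 64k block
          lLen  AS LONG     ' bytes actually used in the block
          pNext AS DWORD    ' pointer to the next node (0 = end of list)
      END TYPE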

    Do I have the concept right?

  • #2
    If you control both the server and client, you can probably get better results with other techniques than CGI.

    Read up on streaming servers such as TwonkyVision and ShoutCast, which stream data at high speed to multiple clients. I don't know how they work, but suspect they hit bandwidth limits before memory runs out.
    --pdf



    • #3
      Found some load testing results for icecast - the network bandwidth is definitely the limiting factor, before CPU or memory.

      --pdf



      • #4
        Originally posted by Mike Trader:
        Assuming the IIS server will handle the out-of-memory situations, I want to make the best use of memory I can.
        I'm not sure you're right about that assumption.



        • #5
          A linked list will actually be larger in memory than a continuous string, since each chunk needs a reference to the next one. A temporary disk file, or sending direct from disk, is how I would handle it. The disk access will never be as slow as the transfer speed over TCP.
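
          Roughly like this (untested sketch; SendToClient is a placeholder for however you transmit, and the temp file name is made up):

          Code:
            ' Stream a temporary results file in 64k chunks, so only one
            ' chunk is ever held in memory at a time.
            LOCAL hFile AS LONG, sChunk AS STRING
            hFile = FREEFILE
            OPEN "results.tmp" FOR BINARY AS #hFile
            DO WHILE LOC(hFile) < LOF(hFile)
                GET$ #hFile, 65536, sChunk    ' reads less at end of file
                SendToClient sChunk           ' placeholder transmit routine
            LOOP
            CLOSE #hFile
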
          kgpsoftware.com | Slam DBMS | PrpT Control | Other Downloads | Contact Me



          • #6
            Originally posted by Mike Trader:
            decided to break the string up into packets of say 64k (give me a better number) and have the client request them one at a time (give me a better idea)
            So, while this time consuming xfer is going on, 60Mb of server memory
            If you are sending it 64K at a time, there is no reason to reserve 60Mb... just go get it 64K at a time.

            Or, if it is inconvenient to get it 64K at a time... you could get all 60Mb, dump it all to a pipe object immediately, then read & send 64K at a time. For this approach you could start with this demo code...
            Anonymous Pipe as Job Queue Demo 10-29-03
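
            The drain side of that looks roughly like this (from memory, not compiled; it assumes the producer writes the result set into the pipe from another thread, as in the demo, and SendToClient is a placeholder):

            Code:
              ' Read from the pipe and send 64k at a time, so only one
              ' packet is buffered here at any moment. hPipeRead comes
              ' from the pipe setup shown in the demo.
              LOCAL dwDone AS DWORD, sBuf AS STRING
              sBuf = SPACE$(65536)
              DO
                  IF ISFALSE ReadFile(hPipeRead, BYVAL STRPTR(sBuf), 65536, dwDone, BYVAL %NULL) THEN EXIT DO
                  IF dwDone = 0 THEN EXIT DO          ' writer closed its end
                  SendToClient LEFT$(sBuf, dwDone)    ' placeholder transmit routine
              LOOP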

            MCM
            Michael Mattias
            Tal Systems (retired)
            Port Washington WI USA
            [email protected]
            http://www.talsystems.com



            • #7
              An Alternate Approach

              If you have the option, why not create a "temporary" DB table and fetch the rows from that a few at a time? The chunks can be as big or small as you like.

              If you have an SQL DB, it would be fairly easy to SELECT INTO and then turn around and read the results. Most SQL DBs can handle temporary tables easily as private dataspace (visible to your process only, even if the name is the same) and will manage them in memory or on disk as required, without any extra code. When the process connection is terminated, most will even drop the temporary table for you.
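
              For example (a sketch; the table and column names are made up, and the exact syntax varies by engine):

              Code:
                ' Stage the result set in a private temporary table, then
                ' page through it a few rows at a time.
                LOCAL sSQL AS STRING
                sSQL = "CREATE TEMPORARY TABLE MyResults AS " & _
                       "SELECT * FROM Orders WHERE CustID = 123"
                ' ...execute sSQL through your DB layer, then repeatedly:
                '    SELECT * FROM MyResults LIMIT 100 OFFSET n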

              If it is not an SQL engine then you could follow similar logic in your own code.

              Funny how approaches vary. If I knew the possibility existed to return large datasets, I would not even consider memory. I suppose it is "old school" thinking, from growing up in the days of K bytes of RAM, not gigs.

              My approach has always been to use resources wisely: just because you have gigs of RAM does not mean you should ignore the problem and simply use the memory. Obviously you recognize the issue, so you just need to find a solution.

              Food for thought.
              Mark Strickland, CISSP, CEH
              SimplyBASICsecurity.com



              • #8
                >If you have an SQL DB it would be fairly easy to SELECT INTO

                'SELECT INTO' is a syntax construct which, AFAIK, applies only to ESQL (Embedded SQL, using a precompiler, which is not available for PowerBASIC; as far as I know, that is, because if one IS available I'll buy a license today) or to a stored procedure.

                But even without that, there is no reason to store the entire result set in user memory; you can build 'packets' to send by fetching a row at a time until you have a size which is convenient to send, farm out the send (e.g. the pipe idea), and resume fetching.

                For that matter, you could dump each fetched row to the pipe immediately.

                MCM
                Michael Mattias
                Tal Systems (retired)
                Port Washington WI USA
                [email protected]
                http://www.talsystems.com



                • #9
                  INSERT INTO table

                  Picky, picky, picky. OK, Michael, you are correct, but you can still do the same thing.

                  Try this:

                  INSERT INTO table (col, col, ...)
                  (SELECT expression, expression, ...
                  FROM source_table ...)


                  Other DB-specific ways exist, but the thought was: don't use memory.
                  Mark Strickland, CISSP, CEH
                  SimplyBASICsecurity.com



                  • #10
                    Oh, I see what you mean now... build a temp table = the result set of the "master" query...

                    But still, we wouldn't want any less-experienced types trying to make 'SELECT INTO' work using nothing but the supported compiler syntax, would we? Surely that would be an abdication of our sworn and solemn responsibilities as old farts.

                    MCM
                    Michael Mattias
                    Tal Systems (retired)
                    Port Washington WI USA
                    [email protected]
                    http://www.talsystems.com



                    • #11
                      Yes. You guys are right. I need to use the HD, not tie up memory.
                      In which case, if I am going to create a temporary results table, I might as well just use an SQL database to start with and do a regular SELECT, pausing every 64k bytes while each packet is sent.

                      When the client responds, I can get the next 64k bytes worth of Rows and send that packet.

                      The disadvantage of this technique, with no in-memory buffer, is that I have no idea how many Rows will be returned!

                      I would have to do some kind of limiting from the client end so that it was not possible to request the entire contents of the database.

                      If the client never responds, I guess I need to have the SELECT timeout somehow.

                      SQLite would probably be a good choice for this, but I am not sure how it will handle multiple connections that are all part way through a transaction.


                      It is easy to write to with PB but:

                      "Like every tool, SQLite has its strengths and weaknesses. While being an ideal solution for small and/or mostly-read applications, it is not well suited for large-scale applications performing frequent writes. This limitation is due to SQLite’s single file based architecture, which doesn’t allow multiplexing across servers, or the usage of database-wide locks on writes."

                      Since the clients will be reading from the database, SQLite seems to be a good choice, but I remain unclear about how it will handle multiple connections on the server.


                      If I have a CGI.exe that responds to the client's HTTP request, opens a SQLite database, executes a SELECT query, sends the result set 64k at a time until the query has returned all the data, and then closes the database, what locks are in place?

                      If a second client executes a different query while the first client is still being sent data, I assume the server runs a second instance of the CGI.exe in a separate thread with its own memory, etc.

                      When that thread opens the same SQLite database, is it locked out and its query queued until the first client's query has returned all the data?

                      From this document it seems not, but how does it handle multiple connections?
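
                      For reference, the C API each CGI instance would be calling looks roughly like this from PB (untested sketch; sqlite3_open and sqlite3_busy_timeout are standard sqlite3.dll exports, and the file name is made up):

                      Code:
                        ' Each CGI instance opens its own connection. A busy
                        ' timeout makes a connection wait, rather than fail,
                        ' if another connection holds a write lock.
                        DECLARE FUNCTION sqlite3_open LIB "SQLITE3.DLL" ALIAS "sqlite3_open" _
                            (sFile AS ASCIIZ, hDb AS DWORD) AS LONG
                        DECLARE FUNCTION sqlite3_busy_timeout LIB "SQLITE3.DLL" ALIAS "sqlite3_busy_timeout" _
                            (BYVAL hDb AS DWORD, BYVAL lMs AS LONG) AS LONG

                        LOCAL hDb AS DWORD
                        IF sqlite3_open("results.db", hDb) = 0 THEN    ' 0 = SQLITE_OK
                            CALL sqlite3_busy_timeout(hDb, 5000)       ' wait up to 5s on a lock
                            ' ...run the SELECT here, sending 64k at a time...
                        END IF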



                      • #12
                        The disadvantage of this unbuffered (in memory) technique is that I have no idea how many Rows will be returned!
                        I will often precede a query like this with another query:
                        Code:
                          SELECT COUNT(*) FROM (SELECT <the big query, or at least the same WHERE clause>)
                        Now I have a row count BEFORE I actually run the query.
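
                        In PB terms (table and filter made up, just to show the pattern):

                        Code:
                          ' Build both statements around the identical WHERE clause.
                          LOCAL sWhere, sCount, sMain AS STRING
                          sWhere = " FROM MyTable WHERE CustID = 123"
                          sCount = "SELECT COUNT(*)" & sWhere    ' run this first for the row count
                          sMain  = "SELECT *" & sWhere           ' then run the actual query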

                        MCM
                        Michael Mattias
                        Tal Systems (retired)
                        Port Washington WI USA
                        [email protected]
                        http://www.talsystems.com



                        • #13
                          You beat me to it ...

                          Quick on the draw for an "old f*rt" you are Michael ...

                          COUNT(*) will do the trick

                          SQLite probably can do read-only access for multiple connections, but it is not designed for multi-user access. Paul Squires has started an open source server project for SQLite, but it is still in "alpha" mode.

                          You might want to consider PostgreSQL, since it has procedural stored procedures (versus pure SQL, which is non-procedural). You can write your own loop to walk through the record set and do what you want. I believe you can even write your own EXE as a stored procedure.

                          SQL opens up lots of "not so intuitive" techniques. When I first discovered you could join a table to itself for a strange SELECT I needed, it was a real revelation.

                          When the result set can vary from a few K to hundreds of megs, it does present more challenges. Be sure to explore some "out of the box" ideas.
                          Mark Strickland, CISSP, CEH
                          SimplyBASICsecurity.com



                          • #14
                            I should have been more precise and said:
                            I have no idea how many bytes will be returned.

                            In order to figure out how many packets are going to be sent, I would need that number.

                            Perhaps I could create an additional column called 'TotBytes' for each Row I insert and do the calculation at the time of the INSERT.

                            Then I could first do a query like
                            "SELECT sum(TotBytes) FROM MyTable WHERE the full set of conditions"

                            Then the full Query.

                            I can't think of any other way.



                            • #15
                              Unless your number of bytes per row varies wildly, I would think an estimated byte count would be plenty good enough... at least at the point where you are doing it.

                              Then again, you don't need to know the number of packets in advance... you can do that on the fly.
                              Code:
                              DO
                                 sRow = GetNextRow()                    ' hypothetical row fetch
                                 IF LEN(sRow) = 0 THEN EXIT DO          ' no more rows
                                 IF LEN(sRow) + LEN(sPacket) > 65536 THEN
                                    SendPacket sPacket                  ' ship what we have so far
                                    sPacket = ""                        ' start a new packet
                                 END IF
                                 sPacket = sPacket & sRow
                              LOOP
                              IF LEN(sPacket) THEN SendPacket sPacket   ' send the last partial packet
                              This sends packets of less than 64K, but always in complete rows, which may or may not be a requirement of your application.

                              If you don't need to send in multiples of rows, then in the above loop write to a disk file, counting bytes. When all rows are written to the work file, access the work file in 64K chunks and send those 'records.'

                              MCM
                              Michael Mattias
                              Tal Systems (retired)
                              Port Washington WI USA
                              [email protected]
                              http://www.talsystems.com



                              • #16
                                Kind of reminds me of working under MS-DOS... so little RAM, forcing one's imagination to kick in.

                                Of course, if you haven't done a lot of work under MS-DOS or other limited-RAM environments (e.g. OS/VS COBOL on 'legacy' IBM mainframes), thinking about reducing RAM requirements is something you probably have not done a whole lot of.
                                Michael Mattias
                                Tal Systems (retired)
                                Port Washington WI USA
                                [email protected]
                                http://www.talsystems.com

