Continuing; File IO

  • Continuing; File IO

    continuing http://www.powerbasic.com/support/pb...ead.php?t=3924



    i'm looking for some very fast code, probably using a combination of
    api and inline asm to read both small and large files from a single-pc.
    no need to worry about reading over networks, as i will implement
    this differently. on a respectable ide type drive, i would expect
    about 10mb/sec avg..

    line input won't be appropriate as files will be of many types:
    exe, com, doc etc... maybe a block read, like 16kb each block?
    if so, i'd implement a fine-tune function that allows a user to
    choose a blocksize most efficient with their hardware.

    i'll be basically scanning files for specific strings. so a block
    read may be faster when processing several hundred instr calls..

    thanks,

    - nathan

    [this message has been edited by nathan evans (edited june 26, 2001).]
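
    A minimal sketch of the block-read idea above, in PowerBASIC. The function name FileContains and its parameters are hypothetical, the block size is left to the caller, and nothing here has been benchmarked. It carries the last LEN(sFind) - 1 bytes of each block forward so that a match straddling two blocks is still found.

    ```powerbasic
    ' Hypothetical helper: returns -1 if sFind occurs anywhere in the file, 0 otherwise.
    ' lBlock is the caller-chosen block size in bytes (e.g. 16384).
    FUNCTION FileContains(BYVAL sFile AS STRING, BYVAL sFind AS STRING, BYVAL lBlock AS LONG) AS LONG
        DIM hFile AS LONG, lErr AS LONG
        DIM sBlock AS STRING, sTail AS STRING, sWork AS STRING

        IF LEN(sFind) = 0 OR lBlock < 1 THEN EXIT FUNCTION

        lErr = ERRCLEAR                          ' discard any stale error code
        hFile = FREEFILE
        OPEN sFile FOR BINARY ACCESS READ LOCK SHARED AS #hFile
        lErr = ERRCLEAR
        IF lErr THEN EXIT FUNCTION               ' could not open: report "not found"

        DO
            GET$ #hFile, lBlock, sBlock          ' reads up to lBlock bytes
            IF LEN(sBlock) = 0 THEN EXIT DO      ' nothing left to read
            sWork = sTail + sBlock               ' prepend the carried-over tail
            IF INSTR(sWork, sFind) THEN
                FUNCTION = -1
                EXIT DO
            END IF
            sTail = RIGHT$(sWork, LEN(sFind) - 1) ' keep enough bytes to span the boundary
        LOOP

        CLOSE #hFile
    END FUNCTION
    ```

    The concatenation copies every block once, which costs some speed; it is kept here only because it makes the boundary handling easy to follow.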

  • #2
    Nathan,

    The fastest way I can think of is to open the file and read all of
    it into memory in one swoop, provided you have enough memory.

    Code:
    FUNCTION PBMAIN AS LONG
       DIM Buffer AS STRING
       DIM t AS DOUBLE
       t = TIMER                              ' seconds since midnight
       OPEN "C:\file.dat" FOR BINARY AS #1
       GET$ #1, LOF(1), Buffer: CLOSE #1      ' read the whole file into a dynamic string
    
       OPEN "C:\file2.dat" FOR BINARY AS #1
       SETEOF #1                              ' truncate any existing contents
       PUT$ #1, Buffer: CLOSE #1              ' write the buffer back out
       t = TIMER - t
       MSGBOX STR$(t)                         ' elapsed time in seconds
    END FUNCTION
    I wrote this straight into the BBS and haven't tested it yet, but it
    is probably the fastest way I can think of.

    ------------------
    -Greg

    [This message has been edited by Gregery D Engle (edited June 26, 2001).]
    [email protected]
    MCP,MCSA,MCSE,MCSD


    • #3
      I'll give it a go tomorrow.

      Thanks bud.

      - Nathan

      ------------------



      • #4
        This works for some files, but not others.

        From what I've tested, it works fine on EXE and BMP, but not on MP3.

        Possibly related to LOF..

        Thanks.



        • #5
          Nathan,

          That is interesting.. I will have to test my code at home. I
          doubt it is related to the LOF(1). In Win16, EOF was caused by
          simply writing chr$(0) to the end of the file. That is not the
          case with Win32.

          I will have to do more research. This is odd, because I've used
          this method for years.

          All mp3 files crash?



          ------------------
          -Greg
          [email protected]
          MCP,MCSA,MCSE,MCSD


          • #6
            I only tested it briefly.

            I tried it on EXE, TXT, BMP.. they all worked fine.
            Then I tried it on an MP3 file, and it returned 0 bytes.
            I tried it on a few more MP3 files and they returned 0 bytes too.

            It's really got to be rock solid code, and read everything, fast.


            Thanks.. I'm working on it right now myself also.

            - Nath.



            • #7
              Perhaps it has to do with the size of an MP3 file (normally about 4 MB).
              Nathan, how much RAM do you have?

              For higher speed, I think it's better to use the CreateFile, ReadFile and WriteFile APIs directly.


              ------------------
              E-Mail (home): [email protected]
              E-Mail (work): [email protected]

              [This message has been edited by Sven Blumenstein (edited June 28, 2001).]



              • #8
                512 MB RAM, running Win2K.

                I will convert some VB sources that use the ReadFile APIs over to PB.

                cheers.

                - Nath.




                • #9
                  Doesn't *seem* to be anything wrong with this Function, but it errors.

                  Error 481: Parameter mismatch with prior declaration



                  Code:
                  FUNCTION OpenFile(BYVAL szFile AS STRING) AS STRING
                  
                      DIM hOrgFile AS LONG, bBytes() AS BYTE
                      DIM nSize AS LONG, Ret AS LONG
                  
                      hOrgFile = CreateFile(szFile, GENERIC_READ, FILE_SHARE_READ OR FILE_SHARE_WRITE, BYVAL 0&, OPEN_EXISTING, 0, 0)    'Get Handle to File
                      nSize = GetFileSize(hOrgFile, 0)                  'Get File Size
                          IF nSize = 0 OR nSize = -1 THEN EXIT FUNCTION
                      
                      SetFilePointer hOrgFile, 0, 0, FILE_BEGIN         'Tell it to start at beginning of file
                      REDIM bBytes(1 TO nSize) AS BYTE                  'Create byte array
                      ReadFile hOrgFile, bBytes(1), UBOUND(bBytes), Ret, BYVAL 0&
                  
                      IF Ret <> UBOUND(bBytes) THEN EXIT FUNCTION
                      CloseHandle (hOrgFile)                            'Close handle (prevent memory leak)
                  
                      MSGBOX STR$(UBOUND(bBytes))
                  
                      'OpenFile = StrConv(bBytes, vbUnicode)
                  
                  END FUNCTION
                  [This message has been edited by Nathan Evans (edited June 28, 2001).]



                  • #10
                    oh, please tell me how to use the darn <code> tags :/

                    ------------------



                    • #11
                      Put [ code] before your code and [ /code] after it, leaving out
                      the space inside the brackets.

                      ------------------
                      -Greg
                      [email protected]
                      MCP,MCSA,MCSE,MCSD


                      • #12
                        OK, I've converted the code. Here's the PB code I've ended up with.
                        It reads the specified file into a BYTE array.

                        Code:
                        FUNCTION fOpenFile(szFile AS STRING) AS STRING
                        
                            DIM hOrgFile AS LONG, bBytes() AS BYTE
                            DIM nSize AS LONG, Ret AS LONG
                        
                            hOrgFile = CreateFile(BYCOPY szFile, %GENERIC_READ, %FILE_SHARE_READ OR %FILE_SHARE_WRITE, BYVAL 0&, %OPEN_EXISTING, 0, 0)    'Get Handle to File
                            nSize = GetFileSize(hOrgFile, 0)                  'Get File Size (-1 also covers a failed open)
                            IF nSize = 0 OR nSize = -1 THEN
                                CALL CloseHandle (hOrgFile)                   'Don't leak the handle on an early exit
                                EXIT FUNCTION
                            END IF
                            
                            SetFilePointer hOrgFile, 0, 0, %FILE_BEGIN        'Tell it to start at beginning of file
                            REDIM bBytes(1 TO nSize) AS BYTE                  'Create byte array
                            ReadFile hOrgFile, bBytes(1), UBOUND(bBytes), Ret, BYVAL 0&
                            CALL CloseHandle (hOrgFile)                       'Close handle on every path (prevent handle leak)
                        
                            IF Ret <> UBOUND(bBytes) THEN EXIT FUNCTION       'Short read
                        
                            MSGBOX STR$(UBOUND(bBytes))
                        
                            'OpenFile = StrConv(bBytes, vbUnicode)
                        
                        END FUNCTION
                        This takes around 120 ms to read an 8 MB file. I think that is pretty good?

                        Now that I've read the file in...

                        I need a fast method to search that byte array for several hundred strings.

                        I've experimented with the Boyer-Moore algorithm, but have so far had no luck.

                        BTW: I'm totally new to PowerBasic; i'm still getting used to all the variable names like %,$.


                        Thanks.
                        -Nath.
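
                        As a baseline for the search described above, here is a minimal PowerBASIC sketch that checks a list of search strings with the built-in INSTR. It is plain INSTR, not Boyer-Moore, and it assumes the file contents are held in a dynamic string (for example, read with GET$) rather than in a BYTE array. The name CountMatches and the sFind() array are hypothetical.

                        ```powerbasic
                        ' Hypothetical helper: counts how many of the strings in sFind()
                        ' occur at least once in sData.
                        FUNCTION CountMatches(sData AS STRING, sFind() AS STRING) AS LONG
                            DIM i AS LONG, lHits AS LONG

                            FOR i = LBOUND(sFind) TO UBOUND(sFind)
                                ' INSTR returns the 1-based position of the first match, or 0
                                IF INSTR(sData, sFind(i)) THEN INCR lHits
                            NEXT i

                            FUNCTION = lHits
                        END FUNCTION
                        ```

                        Each search string costs one pass over the buffer, so timing this against a few hundred strings gives a figure that any faster algorithm has to beat.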



                        • #13
                          Nathan --

                          > i'm still getting used to all the variable names like %,$

                          Off-topic for a moment... You don't have to use Type Identifiers in your PowerBASIC programs if you don't want to. Simply declare your variables like this...

                          Code:
                          FUNCTION Whatever AS LONG
                              DIM lResult AS LONG
                              DIM sString AS STRING
                          ...and so on. You are free to use Hungarian Notation if that's what you prefer, or any other system that you like.

                          Another helpful tip... use $DIM ALL at the top of all of your programs. That will require you to DIM all of your variables, and help avoid typos.

                          -- Eric

                          ------------------
                          Perfect Sync Development Tools
                          Perfect Sync Web Site
                          Contact Us: [email protected]



                          [This message has been edited by Eric Pearson (edited June 28, 2001).]
                          "Not my circus, not my monkeys."



                          • #14
                            Thanks.

                            I was wondering what VB's Option Explicit was in PB :P

                            -Nath.



                            • #15
                              Eric,

                              Off topic here, but I rarely use DIM ALL; I feel that it slows
                              down my programming. What is your opinion on using Hungarian
                              notation with the appropriate suffix, e.g. &, ???, $, !, instead of
                              declaring variables first?



                              ------------------
                              -Greg
                              [email protected]
                              MCP,MCSA,MCSE,MCSD


                              • #16
                                Originally posted by Nathan Evans:
                                I was wondering what VB's Option Explicit was in PB :P
                                Funnily enough, you can use OPTION EXPLICIT or you can use #DIM ALL.

                                Getting back to the problem earlier in this thread, where you said the code did not work for MP3 files: I notice that the posted code does not test for any error conditions, which suggests that your OPEN statement failed (maybe due to a lock on the file from some other application?). Unless you at least examine ERR after the OPEN statement (or use ON ERROR GOTO), you could be making invalid assumptions when the file length is found to be zero.

                                Also, you may want to consider adding ACCESS READ LOCK WRITE (or LOCK SHARED) to the OPEN statement. Without an ACCESS and LOCK clause the compiler will open the file with exclusive access (ACCESS READ WRITE LOCK READ WRITE), and this could lead to a run-time error if another app has the file locked with, say, LOCK WRITE.

                                I hope this helps!
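
                                The two suggestions above could be combined roughly as follows. This is a sketch only: ReadWholeFile and its parameters are hypothetical names, and it assumes ERRCLEAR is available in your compiler version to fetch and reset the run-time error code.

                                ```powerbasic
                                ' Hypothetical helper: reads all of sFile into sData.
                                ' Returns 0 on success, otherwise the PowerBASIC run-time error code.
                                FUNCTION ReadWholeFile(BYVAL sFile AS STRING, sData AS STRING) AS LONG
                                    DIM hFile AS LONG, lErr AS LONG

                                    sData = ""
                                    lErr = ERRCLEAR                     ' discard any stale error code
                                    hFile = FREEFILE
                                    OPEN sFile FOR BINARY ACCESS READ LOCK SHARED AS #hFile
                                    lErr = ERRCLEAR                     ' non-zero if the OPEN failed
                                    IF lErr THEN
                                        FUNCTION = lErr                 ' e.g. the file is locked by another program
                                        EXIT FUNCTION
                                    END IF

                                    GET$ #hFile, LOF(hFile), sData      ' whole file into a dynamic string
                                    lErr = ERRCLEAR                     ' non-zero if the read failed
                                    CLOSE #hFile
                                    FUNCTION = lErr
                                END FUNCTION
                                ```

                                With this shape, a zero-length result can be told apart from a failed OPEN, because the caller sees the error code instead of an empty string alone.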


                                ------------------
                                Lance
                                PowerBASIC Support
                                mailto:[email protected]


                                • #17
                                  Thanks for the tips everyone.

                                  Lance,
                                  I didn't try changing the file locks; however, I am now using the ReadFile et al. API.
                                  This can open an 8 MB file in around 110 ms on average. Is this good?

                                  I'll post my ReadFile based function when I'm on my dev box.

                                  However, I do know that none of the files I tried to open were in use.

                                  Thanks.

                                  - Nath.
