Opening a file with extended char in name

  • Opening a file with extended char in name

    I've been using this routine for a long time:

    Code:
    Function GrabFileAsString (ByVal FilName As String) As String
       'Opens the file and returns it in a string...
    
       Local InBuf As Long, InpLen As Quad, temp As String
    
       FilName = Trim$(FilName, $Dq)  'just to be sure...
    
       InBuf = FreeFile
       On Error Resume Next
       Open FilName For Binary As #InBuf
       If Err Then
          Function = ""     '
          Exit Function
       End If
       On Error GoTo 0
       InpLen = Lof(InBuf)
       Get$ #InBuf, InpLen, temp
       Close #InBuf
       Function = temp
    End Function
    Today it stumbled on the following filename:

    "Adding hotkey to Semen's - Josés shortcut code - PowerBASIC Peer Support Forums.mht"

    It seems to me that the file OPENs with no problem (the ERR test that follows the OPEN does not report an error).

    But LOF() is returning 0 for a 480K file. Thus temp incorrectly gets set as a 0-length string.

    Suspicious of the apostrophe and the accented e, I copied the file with a new name (no special chars) and it works fine, so I know it isn't the file content.

    But why does OPEN not have a problem, and LOF does? (Or is OPEN having a problem that I'm not catching?)

    I'm probably forgetting something obvious (to you)... Can anyone else see it?

    Thanks,
    -John


    P.S. I'm continuing to test: DIR$ returns the name with the accented e changed to (what appears to be) a comma...

    I have a sinking feeling that I'm back in the realm of ANSI vs OEM character sets, so let me make this point right away: I am not creating the filenames. I have no control over what's on the drive. If the file was saved via some other program or a web browser or a user with a mean streak and it contains non-standard characters, that's what I've got to deal with. I have to be able to recognize such situations. But HOW?
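
    The comma-like character fits the character-set suspicion. In the Windows-1252 (ANSI) code page "é" is byte &HE9, while in OEM code pages 437 and 850 it is byte &H82; displayed as Windows-1252, byte &H82 is the low quotation mark "‚", which is easy to mistake for a comma. A minimal sketch for checking this, assuming a hypothetical helper named HexDump (not part of the original routine) and an ordinary single-byte PowerBASIC string:

    ```powerbasic
    ' Hypothetical diagnostic helper: returns each byte of txt as two hex digits.
    Function HexDump (ByVal txt As String) As String
       Local i As Long, sOut As String
       For i = 1 To Len(txt)
          sOut = sOut & Hex$(Asc(txt, i), 2) & " "   'one byte -> two hex digits
       Next i
       Function = RTrim$(sOut)
    End Function
    ```

    Running the name returned by DIR$ through HexDump should show 82 at the accented position if the name came back in an OEM code page, or E9 if it came back as ANSI.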


    ... I still don't understand: should OPEN be reporting an error?

    -jhm

  • #2
    John, you don't say which Open you are using. I know that sometimes I use LOF and get bum results, so I just change to EOF and it seems to work. I believe it's when using Binary Open, but I don't recall just now.

    =================================================
    "We grow great by dreams.
    All big men are dreamers.
    They see things in the soft haze of a spring day
    or in the red fire of a long winter's evening.
    Some of us let these great dreams die,
    but others nourish and protect them."
    Woodrow Wilson
    =================================================
    It's a pretty day. I hope you enjoy it.

    Gösta

    JWAM: (Quit Smoking): http://www.SwedesDock.com/smoking
    LDN - A Miracle Drug: http://www.SwedesDock.com/LDN/


    • #3
      Are you trying to confuse me (more than I already am)??

      LOF() returns the length of the open disk file, while EOF() returns the end-of-file status of the open file (or comm channel).

      You're going to have to explain to me how EOF() is applicable here...



      As for which Open I'm using:

      Code:
         Open FilName For Binary As #InBuf
      You ARE trying to confuse me!!! Too much turkey, eh?? (Or not enough football!)


      • #4
        Originally posted by John Montenigro:
        "Are you trying to confuse me (more than I already am)??"

        This is probably a case of the blinder leading the blind, I guess {grin}. I was just speaking off the top of my head. Probably not a good thing given the contents of it. I just recall having a problem getting file length and was pretty sure that EOF/LOF was how I resolved it. When something doesn't work, PB is so fast I just try something, then something else, until something works. Every once in a while I Goto the Help file.

        "You're going to have to explain to me how EOF() is applicable here..."

        Explaining how anything works in programming is not in my (small) world of expertise. I'm a T&E man (Trial and Error until something works.)

        "You ARE trying to confuse me!!! Too much turkey, eh?? (Or not enough football!)"

        Almost never get enough football. (Turkey is another story.) I caught the Colorado/Nebraska game just by chance yesterday and it was a dandy. Down to the wire until the last minute or so. Gotta love the enthusiasm of those college kids.

        ====================================
        Be not angry that you cannot
        make others as you wish them to be,
        since you cannot make
        yourself as you wish to be.
        Thomas a Kempis
        ====================================
        Last edited by Gösta H. Lovgren-2; 29 Nov 2008, 05:24 PM.
        Gösta


        • #5
          OPEN does not have a problem. Opening FOR BINARY creates the file if it doesn't exist. When first created that file is size..... zero.

          If you want to open files with Unicode names, you'll have to use CreateFile.. or more accurately, CreateFileW()

          You can still use all your PB verbs to read or write that file because of OPEN HANDLE.

          Talk about it being your lucky day... here's a demo of opening a file with CreateFile and using OPEN HANDLE to get at it with PowerBASIC statements:
          Create and use a file which will be deleted on close.

          Change the CreateFile() to CreateFileW() and you are essentially done.

          For reference this might be handy for you to check the name of your file:
          Directory List with Non-ASCII (Unicode) characters in file names 5-31-08
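
          Because FOR BINARY creates a missing file, a name that fails to match yields a new empty file rather than an error. One possible guard, sketched here with a hypothetical helper named FileReadable (not taken from the linked demos), is to probe with OPEN ... FOR INPUT first, since INPUT mode reports error 53 (File not found) instead of creating anything:

          ```powerbasic
          ' Hypothetical helper: returns -1 (true) if FilName can be opened
          ' for reading, else 0. It never creates a file.
          Function FileReadable (ByVal FilName As String) As Long
             Local hFile As Long
             hFile = FreeFile
             On Error Resume Next
             Open FilName For Input As #hFile   'fails if the file is not found
             If Err = 0 Then
                Close #hFile
                Function = -1
             End If
             On Error GoTo 0
          End Function
          ```

          A routine such as GrabFileAsString could call this before its Open ... For Binary and report a failure when the result is 0, so that a mismatched name is flagged instead of silently producing an empty string. Note that it also returns 0 for a file that exists but cannot be opened, for example because of a sharing or permission error.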

          MCM
          Michael Mattias
          Tal Systems Inc. (retired)
          Racine WI USA
          http://www.talsystems.com


          • #6
            Thanks, Michael!

            If I understand what you're saying, here's what's happening in this situation:
            1. I use DIR$(mask$, ONLY %NORMAL) to get the name of an existing file.
            (Yes, I've defined the %NORMAL equate per the Helpfile.)
            2. The name of that existing file contains a non-ASCII character.
            3. I try to OPEN that file.
            4. Something I'm doing is causing PB's OPEN to NOT see it as an existing file.
            5. PB is therefore creating the file, and since I do an immediate GET, I'm getting back the contents of the empty file - nothing, with a length of 0.

            If that's correct, then there's a disconnect somewhere between reading the filename via DIR$ and trying to access it via OPEN/BINARY. It seems to me that in between, something changes, and I can't tell what.

            I've tried using the new DIR$ syntax that specifies either OEM or ANSI, but I haven't been able to get anything that works.



            I looked at the links briefly, and I'll need to check them out in more detail tomorrow. But I'm not entirely sure we're on the same page??

            This isn't a Unicode string - it just contains a char with an ASCII value greater than 127... and it seems there's a difference between the way it's seen by DIR$ vs when it's seen by OPEN. In between, the only thing I do is copy it to another variable, and pass it as a parameter to a FUNCTION. Could these steps somehow change the way the string is interpreted internally? I'm not doing any editing or string manipulation.


            Here's what I'm using temporarily to check the filename.

            Code:
            Function NonASCIIChars(ByVal txt As String) As Long
               'Returns TRUE if there are non-ASCII chars
               '  actually the TRUE return value will be the position of the first one...
               Local i, x As Long
               Function = 0   'presume all chars are OK, std
            
               For i = 1 To Len(txt)
                  x = Asc(txt, i)
                  If x > 127 Then  '128 to 255 are not standard ASCII
                     'msgbox "Invalid char at position " & str$(i) & " in: " & $crlf & txt
                     Function = i  'return the position of the first nonASCII char
                     Exit Function
                  End If
               Next i
            End Function
            However, I have no remedy except to move it to an "Exceptions" list. But this is an inadequate solution, because the existing file is not getting processed.

            -jhm


            • #7
              Okay John, I'll give it another shot. Instead of
              Code:
              FilName = Trim$(FilName, $Dq)  'just to be sure...
              Have you tried
              Code:
                         'Just trim spaces, no quotes
              FilName = Trim$(FilName)  'just to be sure...
              'now surround the name with quotes
              FilName = $dq & FilName & $dq
              It's later now and most of the turkey has cleared out of my system. {grin}.

              ========================================================
              "It ain't what they call you, it's what you answer to."
              W. C. Fields
              ========================================================
              Gösta


              • #8
                I tracked down that thread and saved it from IE. I had no trouble at all opening the file and getting its length, and the characters in the filename appear as they should. However, I can make your code fail on my system by calling the API function SetFileApisToOEM.

                Give this code a try (and of course point the file to the correct location) and see if it works on the filename with the extended character:

                Code:
                #COMPILE EXE
                #DIM ALL
                
                #INCLUDE "WIN32API.INC"
                
                DECLARE FUNCTION AreFileAPIsANSI LIB "KERNEL32.DLL" ALIAS "AreFileApisANSI" () AS LONG
                
                FUNCTION GrabFileAsString (BYVAL FilName AS STRING) AS STRING
                   'Opens the file and returns it in a string...
                
                   LOCAL InBuf AS LONG, InpLen AS QUAD, temp AS STRING
                
                   FilName = TRIM$(FilName, $DQ)  'just to be sure...
                
                   InBuf = FREEFILE
                   ON ERROR RESUME NEXT
                   OPEN FilName FOR BINARY AS #InBuf
                   IF ERR THEN
                      FUNCTION = ""     '
                      EXIT FUNCTION
                   END IF
                   ON ERROR GOTO 0
                   InpLen = LOF(InBuf)
                   GET$ #InBuf, InpLen, temp
                   CLOSE #InBuf
                   FUNCTION = temp
                END FUNCTION
                
                FUNCTION PBMAIN () AS LONG
                
                    LOCAL fName AS STRING
                    LOCAL fData AS STRING
                    LOCAL lRet  AS LONG
                    
                    lRet = AreFileApisANSI()
                    IF lRet = 0 THEN
                        SetFileApisToANSI()
                    END IF
                    
                    fName = "Adding hotkey to Semen's - Josés shortcut code - PowerBASIC Peer Support Forums.mht"
                    
                    fData = GrabFileAsString("C:\" + fName)
                    
                    ? FORMAT$(LEN(fData))
                    
                    OPEN "C:\" + fName FOR BINARY AS #1
                    ? FORMAT$(LOF(1))
                    CLOSE #1
                    
                    ? DIR$("C:\" + fName)
                
                END FUNCTION
                Last edited by Adam J. Drake; 30 Nov 2008, 01:07 AM.
                Adam Drake
                Drake Software


                • #9
                  Seems to me if PB DIR$() can find it, PB OPEN should open it without mucking about with the SetFileApisToANSI/OEM functions (which, as far as I know, is not a supported action).

                  But if all else fails, you can try making a copy of the file to a name OPEN can understand using either PB FILECOPY or WinAPI SHFileOperation or CopyFile and then opening the copy.

                  MCM


                  • #10
                    Adam,
                    I tried your code and it works fine, so I've started testing the API calls in my program. It's working great when I'm gathering filenames by feeding a wildcard to DIR$ (thank you!).

                    It's not working all the time when I work with filenames that were gathered into a file by another program. I've checked, and the other program does not modify the filename strings - the non-ASCII chars are intact in the file.

                    Using the debugger's Variable Watcher window, I can see that, before my program calls GrabFileAsString(), the non-ASCII char has been replaced with a vertical black bar. If I pass the string to the GrabFileAsString() function, it fails as described in the first post of this thread.

                    So I'm now tracing and re-examining how my program handles the filespec strings.

                    After I do some cleanup, I'll post my program...give me a day or so.

                    Thanks!
                    -John



                    Gosta,
                    The Trim$ of the $DQ was incidental. At one point I had to be sure that the parameter didn't have an extra $DQ appended. It's not part of this problem. But I do appreciate your eye-balling the code!

                    Thanks!
                    -John




                    Michael,

                    I can see copying the file to a simple name and then processing it as part of the "exceptions list" workaround, but I'd really like to figure out how to overcome the problem as it stands. There is a reason, and I just need to understand the cause... As usual, it's probably something I'm not doing correctly.

                    As mentioned above, now that I'm seeing a transformed string before the call, I'm shifting my focus up a level from the function with the OPEN/GET$, to the code just before the call...

                    I'll post again when I have better info...

                    Thanks!
                    -John


                    • #11
                      Brief update: Now have it all working without problems.

                      What was the problem? The ANSI/OEM stuff.

                      How did I resolve it?
                      In the FUNCTION that uses DIR$ to obtain filenames from a wildcard spec, I ensure we're set to OEM.
                      In the FUNCTION that parses a listfile to an array, I ensure we're set to ANSI.
                      In the FUNCTION that I first posted (GrabFileAsString()), I ensure we're set to OEM.

                      To make the changes easier, I modified Adam's code into the following:

                      Code:
                      Sub SetCharsTo(ByVal CharSet As String)
                         'accepts a string of A or O
                         Local lRet As Long
                      
                         If UCase$(Left$(CharSet, 1)) = "A" Then        'use this when parsing from a DOS or DIR$ result    ????
                            lRet = AreFileApisANSI()
                            If lRet = 0 Then
                               SetFileApisToANSI()
                            End If
                         ElseIf UCase$(Left$(CharSet, 1)) = "O" Then    'use this when parsing from a file made by external program  ????   (also when doing OPEN BINARY/GET$)
                            lRet = AreFileApisANSI()
                            If lRet <> 0 Then
                               SetFileApisToOEM()
                            End If
                         End If
                      End Sub
                      The call is made by either: SetCharsTo("ANSI") or SetCharsTo("OEM")
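
                      SetFileApisToOEM and SetFileApisToANSI switch the mode for the whole process, so every file call made afterwards is affected. A possible alternative, offered only as a sketch under stated assumptions, is to leave the process mode alone and convert an individual OEM-encoded name with the Windows function OemToCharBuffA. The wrapper name OemNameToAnsi and the local declaration name OemToAnsiBuff are hypothetical, and the sketch assumes the incoming name really is OEM-encoded:

                      ```powerbasic
                      ' Assumption: declared under a hypothetical local name so the sketch is
                      ' self-contained; OemToCharBuffA itself is exported by USER32.DLL.
                      Declare Function OemToAnsiBuff Lib "USER32.DLL" Alias "OemToCharBuffA" _
                         (ByVal pSrc As Dword, ByVal pDst As Dword, ByVal cch As Dword) As Long

                      Function OemNameToAnsi (ByVal sName As String) As String
                         Local sOut As String
                         If Len(sName) = 0 Then Exit Function     'nothing to convert
                         sOut = Space$(Len(sName))                'destination buffer, same length
                         Call OemToAnsiBuff(StrPtr(sName), StrPtr(sOut), Len(sName))
                         Function = sOut
                      End Function
                      ```

                      The conversion can be lossy for characters that exist in only one of the two code pages, so a name converted this way may still fail to match the file on disk.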

                      Monday afternoon I'll be re-testing extensively to see if I can further isolate the conditions of this, and to see if I can further reduce the calls to SetCharsTo()

                      Thanks for the API calls, Adam, and for troubleshooting my code. I appreciate it!

                      -John
