Question on limitations of recursive code


  • #21
    Didn't get much time this evening, so I'll have to follow up tomorrow or Thursday with Pierre's code, the CHKDSK, and Stuart's last post...


    But now that I've seen the program complete, I did go back to the last failure to observe and assess that csv file...

    So:
    1. The overall file length of the CSV was: 301,318,655 bytes

    2. The number of lines in the csv: 42,575
    (including the first line which contains the starting time)

    3. The length of the very last line was: 226,275 characters
    Yes, that is the correct last column number!

    4. That last line was line number 42,575, and showed a RecursionDepth of: 2639

    5. The last folder in the path on that line was:
    StayFocused Is A Great Way To Focus On Important Tasks, Uses Pomodoro Technique_files
    This folder name is 85 characters long.
    I'm calling this the "repeating folder name", because ... (read #6)

    6. The parent folder of that last folder was THE SAME FOLDER...

    7. In fact, ALL the preceding folders in the path were THE SAME FOLDER,
    (all the way back to a certain point that I'll call the "root" of that branch)

    8. Each LINE in the csv preceding the very last line showed one less repeat of the
    repeating folder, and one less RecursionDepth value.

    9. Going backwards through the csv, I looked for the first line in which the "repeating folder" occurs only once (which would be a "normal" path).
    Found line 39,947 which showed a RecursionDepth of: 19
    The length of the line containing only the "root" part of the path = 174 characters

    10. Subtracting the first non-repeating line from the very last line:
    42,575
    - 39,947
    ======
    2,628 repeating lines
    May be interesting, but not sure yet if it means much, so keep on looking for more measurable aspects.

    11. Ah HA!
    Noticed that the length of the "root" section of the path, plus the length
    of the "repeating folder name" ...
    174 + 86 = 260   (86 = the 85-character folder name plus, presumably, one "\" separator)
    Coincidence? ... that's the value of %MAX_PATH ...

    So, the program hits a path that's exactly %MAX_PATH long,
    then starts re-iterating over that same folder 2,628 more times,
    UNTIL... the stack limit is reached...
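
    As a cross-check on items 3 and 9 to 11, the lengths above can be combined to see whether they account for the 226,275-character last line. This is only a sketch: the variable names are made up here, and it assumes each repeat costs 86 characters (the 85-character folder name plus one "\").

    ```powerbasic
    #COMPILE EXE
    #DIM ALL

    FUNCTION PBMAIN() AS LONG
      LOCAL RootLen   AS LONG
      LOCAL RepeatLen AS LONG
      LOCAL nRepeats  AS LONG
      LOCAL PathLen   AS LONG
      LOCAL LeftOver  AS LONG

      RootLen   = 174              ' "root" part of the path (item 9)
      RepeatLen = 86               ' ASSUMPTION: 85-character name + one "\"
      nRepeats  = 42575 - 39947    ' repeating lines after the first "normal" one (item 10)

      ' First occurrence of the folder, plus one more copy per repeating line
      PathLen  = RootLen + RepeatLen + nRepeats * RepeatLen

      ' Whatever the last line holds beyond the path itself (item 3)
      LeftOver = 226275 - PathLen

      MSGBOX "Path part of last line: " & FORMAT$(PathLen) & $CRLF & _
             "Left over for the rest: " & FORMAT$(LeftOver)
    END FUNCTION
    ```

    Under those assumptions the path part works out to 226,268 characters, leaving 7 for whatever follows the path (the spaces, the RecursionDepth value, and possibly the CRLF). That would be consistent with the path growing by exactly one copy of the folder name per recursion level.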

    Too tired to debug the code right now, but I'd bet that a test for = should really be a test for >=

    OK, till tomorrow night!
    -John



    • #22
      Good detective job John.

      You are surely aware of the following, but, just in case...
      %MAX_PATH includes the terminating null character.
      This means that you have 259 valid characters available.
      If 174 + 86 gives you 260, then the last character will be missing.
      If you look at it, is that what you have on disk, or is it incomplete?
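
      A minimal sketch of the point about the terminating null, assuming the usual behaviour of fixed-length ASCIIZ variables (the buffer reserves one byte for the null, and an over-long assignment is silently truncated). The variable name zPath is made up for this example.

      ```powerbasic
      #COMPILE EXE
      #DIM ALL
      #INCLUDE "Win32Api.inc"

      FUNCTION PBMAIN() AS LONG
        LOCAL zPath AS ASCIIZ * %MAX_PATH   ' 260 bytes in total

        ' Stand-in for a path that is longer than the buffer can hold
        zPath = STRING$(300, "x")

        ' SIZEOF reports the allocated bytes, LEN the characters actually stored
        MSGBOX "Bytes allocated: "   & FORMAT$(SIZEOF(zPath)) & $CRLF & _
               "Characters stored: " & FORMAT$(LEN(zPath))
      END FUNCTION
      ```

      If the stored length comes back one less than the allocated size, that is the missing last character described above.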



      • #23
        Hi Pierre!

        The VERY LAST line in the CSV file is actually 226,275 characters long and ends with CRLF.

        The FIRST line in which the "repeating folder" appears only once (as we would expect) is exactly 260 characters from the D:\ to the last character of the folder name. Remember, the program does a PRINT# of that string to the file, then a few spaces, then the RecursionDepth value and ends with CRLF...

        I think that the "repeating folder" phenomenon is the result of FindNextFile getting stuck on the same folder because the path is 260 characters long the first time it hits.

        But I'm not a good enough detective to see WHY/HOW this could occur, at least not without some more study...

        The program does not test for length of path. If it did, maybe it would be able to avoid getting stuck in a repeating loop so many times.
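
        One way such a length test could look is sketched below. SearchSpecFits is a hypothetical helper, not part of the program under discussion, and the test path is a made-up stand-in; the only assumption is that the search string ends up in an ASCIIZ * %MAX_PATH buffer, so folder plus wildcard must fit in %MAX_PATH - 1 characters.

        ```powerbasic
        #COMPILE EXE
        #DIM ALL
        #INCLUDE "Win32Api.inc"

        ' Hypothetical helper: non-zero if sFolder + sPattern fits in an
        ' ASCIIZ * %MAX_PATH buffer (one byte is reserved for the null)
        FUNCTION SearchSpecFits(BYVAL sFolder AS STRING, BYVAL sPattern AS STRING) AS LONG
          FUNCTION = (LEN(sFolder) + LEN(sPattern) <= %MAX_PATH - 1)
        END FUNCTION

        FUNCTION PBMAIN() AS LONG
          LOCAL sFolder AS STRING

          ' Made-up 260-character folder path, including the trailing "\"
          sFolder = "D:\" & STRING$(256, "x") & "\"

          IF SearchSpecFits(sFolder, "*.*") THEN
            MSGBOX "Safe to pass to FindFirstFile"
          ELSE
            MSGBOX "Too long for the buffer: log it and skip instead of recursing"
          END IF
        END FUNCTION
        ```

        Calling a check like this before each recursive call would stop the descent at the first over-long path rather than at the stack limit.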

        I'll check in again tonight, after I get some more of the basement packed up.

        Thanks,
        -John

        Last edited by John Montenigro; 28 Aug 2019, 05:41 PM.



        • #24
          Just a WAG, but if it gets a string like
          "C:\My folder with a long name\.....\My Deep Directory\." (note the dot at the end, indicating the entry for the current directory),
          but this gets truncated to "C:\My folder with a long name\.....\My Deep Directory" (note the absence of the dot),
          will it recurse on itself?



          • #25
            Just went back to the source in the link above and noticed:
            LOCAL NextFile AS ASCIIZ * %MAX_PATH

            Based on Post #20 above, what happens if you change that to the following, so there is room to concatenate the "*."?

            ASCIIZ * 32767

            Added:
            %MAX_PATH includes the terminating null.

            So if StartFolderName ("C:\Deep .... Folder") is 260 characters long, then
            NextFile = StartFolderName + "*.*" will strip the trailing "\" and not concatenate the "*.*".

            So FINDFIRSTFILE will Find "C:\Deep .... Folder" again.
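
            The effect described above can be tried in isolation with a sketch like the following. The folder name is a made-up 260-character stand-in, and the sketch assumes that assigning to a fixed-length ASCIIZ variable truncates silently.

            ```powerbasic
            #COMPILE EXE
            #DIM ALL
            #INCLUDE "Win32Api.inc"

            FUNCTION PBMAIN() AS LONG
              LOCAL NextFile        AS ASCIIZ * %MAX_PATH
              LOCAL StartFolderName AS STRING

              ' 3 + 256 + 1 = 260 characters, ending in "\"
              StartFolderName = "C:\" & STRING$(256, "x") & "\"

              ' 263 characters are requested; the buffer cannot hold them all
              NextFile = StartFolderName + "*.*"

              MSGBOX "Requested: " & FORMAT$(LEN(StartFolderName) + 3) & " characters" & $CRLF & _
                     "Stored: "    & FORMAT$(LEN(NextFile)) & " characters" & $CRLF & _
                     "Tail: "      & RIGHT$(NextFile, 4)
            END FUNCTION
            ```

            If the tail shows neither the "\" nor the "*.*", the search string is just the folder's own path, which would explain the same folder being found again at every level.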



            • #26
              For the record,
              Remember that EnumDirTreeA / EnumDirTreeW could be used and are very versatile.
              Here is a minimal example, much more can be done...
              This is a console program under PB-Win...

              Code:
              #COMPILE EXE '#Win#
              #DIM ALL
              #INCLUDE "Win32Api.inc"
              
              DECLARE FUNCTION EnumDirTreeA LIB "DbgHelp.dll" ALIAS "EnumDirTree" _
              (BYVAL hProcess AS DWORD, BYREF RootPath AS ASCIIZ, BYREF InputPathName AS ASCIIZ, _
               BYREF OutputPathBuffer AS ASCIIZ, BYVAL cb AS DWORD, BYVAL DATA AS DWORD) AS LONG
              '______________________________________________________________________________
              
              FUNCTION EnumDirTreeCallBack(BYREF zFullPathFileName AS ASCIIZ, BYVAL hConsoleOut AS LONG) AS LONG
              
               'CharToOem(zFullPathFileName, zFullPathFileName)
               WriteConsole(hConsoleOut, zFullPathFileName & $CRLF , LEN(zFullPathFileName) + LEN($CRLF), BYVAL 0, BYVAL 0)
               'FUNCTION = %TRUE 'Will stop enum
              
              END FUNCTION
              '______________________________________________________________________________
              
              FUNCTION PBMAIN() AS LONG
               LOCAL hProcess      AS DWORD
               LOCAL hConsoleOut   AS DWORD
               LOCAL RootPath      AS ASCIIZ * %MAX_PATH
               LOCAL InputPathName AS ASCIIZ * %MAX_PATH
              
               AllocConsole
               hConsoleOut = GetStdHandle(%STD_OUTPUT_HANDLE)
              
               RootPath      = "D:\" ' Also valid RootPath = "C:\;D:\"
               InputPathName = "*"
              
               WriteConsole(hConsoleOut, "RootPath: " & RootPath & $CRLF, LEN("RootPath: ") + LEN(RootPath) + LEN($CRLF), BYVAL 0, BYVAL 0)
               WriteConsole(hConsoleOut, "InputPathName: " & InputPathName & $CRLF, LEN("InputPathName: ") + LEN(InputPathName) + LEN($CRLF), BYVAL 0, BYVAL 0)
               WriteConsole(hConsoleOut, $CRLF, LEN($CRLF), BYVAL 0, BYVAL 0)
              
               hProcess = OpenProcess(%PROCESS_QUERY_INFORMATION, 1, GetCurrentProcessId())
               IF hProcess THEN
                 EnumDirTreeA(hProcess, RootPath, InputPathName, BYVAL %NULL, CODEPTR(EnumDirTreeCallBack), hConsoleOut)
                 CloseHandle(hProcess)
               END IF
              
               WriteConsole(hConsoleOut, $CRLF, LEN($CRLF), BYVAL 0, BYVAL 0)
               WriteConsole(hConsoleOut, "Done.", 5, BYVAL 0, BYVAL 0)
               SLEEP 20000
               FreeConsole
              
              END FUNCTION
              '______________________________________________________________________________
              '



              • #27
                Give this a try - gets well over a million files on my machine...
                Work in progress...


                Code:
                #COMPILE EXE
                #DIM ALL
                
                #INCLUDE "WIN32API.INC"
                
                
                GLOBAL GL_hDlg          AS DWORD
                GLOBAL GL_hFont_1       AS DWORD
                
                GLOBAL GL_COUNTER       AS DWORD
                GLOBAL RECURSION_DEPTH  AS DWORD
                
                 ENUM CONTROLS SINGULAR
                  ENUM_CONTROLS_STARTING_NUMBER = 100
                '  TREEVIEW_1
                '  LABEL_1
                 END ENUM
                
                 ENUM MESSAGES SINGULAR
                  ENUM_STARTING_MESSAGESS_NUMBER = %WM_APP
                  START_APPLICATION
                  DIALOG_END
                 END ENUM
                
                '------------------------------------------------------------------------------
                 FUNCTION PBMAIN () AS LONG
                
                  OPEN "out.txt" FOR OUTPUT AS #1
                
                  FONT NEW "Courier New", 36, 0, %ANSI_CHARSET TO GL_hFont_1
                
                  MAIN_DIALOG
                
                  CLOSE #1
                
                 END FUNCTION
                '------------------------------------------------------------------------------
                 FUNCTION MAIN_DIALOG()AS LONG
                
                  LOCAL DIALOG_HEIGHT     AS LONG
                  LOCAL DIALOG_WIDTH      AS LONG
                  LOCAL DIALOG_STYLE      AS LONG
                  LOCAL DIALOG_STYLE_EX   AS LONG
                  LOCAL LISTBOX_STYLE     AS LONG
                  LOCAL LISTBOX_STYLE_EX  AS LONG
                
                  DIALOG_HEIGHT = 600
                  DIALOG_WIDTH  = 1000
                
                  DIALOG_STYLE    = %WS_POPUP OR %WS_CAPTION OR %WS_SYSMENU OR %WS_MINIMIZEBOX OR %WS_VISIBLE OR %DS_MODALFRAME OR %DS_3DLOOK OR %DS_NOFAILCREATE OR %DS_SETFONT OR %WS_THICKFRAME OR %DS_CENTER
                  DIALOG_STYLE_EX = %WS_EX_CONTROLPARENT OR %WS_EX_LEFT OR %WS_EX_LTRREADING OR %WS_EX_RIGHTSCROLLBAR
                
                  DIALOG NEW PIXELS, %HWND_DESKTOP, "Recursive File Catalog", 10, 10, DIALOG_WIDTH, DIALOG_HEIGHT, DIALOG_STYLE, DIALOG_STYLE_EX, TO GL_hDlg
                
                  'CONTROL ADD LABEL, GL_hDlg, %LABEL_1, "", 10, (DIALOG_HEIGHT/2), DIALOG_WIDTH-20, (DIALOG_HEIGHT/2), %WS_CHILD OR %WS_VISIBLE OR %SS_CENTER, %WS_EX_LEFT OR %WS_EX_LTRREADING
                  'CONTROL SET FONT GL_hDlg, %LABEL_1, hFont_1
                
                  'CONTROL ADD TREEVIEW, GL_hDlg, %TREEVIEW_1, "", 10, 10, DIALOG_WIDTH-20, DIALOG_HEIGHT-20
                
                  DIALOG SHOW MODAL GL_hDlg, CALL MAIN_DIALOG_CALLBACK
                
                
                 END FUNCTION
                '------------------------------------------------------------------------------
                 CALLBACK FUNCTION MAIN_DIALOG_CALLBACK
                
                  LOCAL DIALOG_WIDTH  AS LONG
                  LOCAL DIALOG_HEIGHT AS LONG
                
                    SELECT CASE AS LONG CB.MSG
                
                      CASE %WM_INITDIALOG
                        DIALOG POST GL_hDlg, %START_APPLICATION,0,0
                
                      CASE %START_APPLICATION
                        'CATALOG_FILES(EXE.PATH$, "*")
                        CATALOG_FILES("C:\", "*")
                        MSGBOX "done"
                        DIALOG END GL_hDlg
                
                      CASE %WM_SIZING
                        'DIALOG GET CLIENT GL_hDlg TO DIALOG_WIDTH, DIALOG_HEIGHT
                        'CONTROL SET SIZE  GL_hDlg, %TREEVIEW_1, DIALOG_WIDTH-20, DIALOG_HEIGHT - 20
                
                      CASE %DIALOG_END
                
                
                      CASE %WM_TIMER
                      CASE %WM_CLOSE
                      CASE %WM_DESTROY
                
                    END SELECT
                 END FUNCTION
                
                '------------------------------------------------------------------------------
                 FUNCTION CATALOG_FILES(BYVAL FILE_DRIVE_PATH AS STRING, FILE_EXTENTION AS STRING) AS LONG
                
                
                    LOCAL SEARCH_HANDLE      AS DWORD               ' Search handle
                    LOCAL STRUCT_FIND_DATA   AS WIN32_FIND_DATA     ' FindFirstFile structure
                    LOCAL CURRENT_PATH       AS ASCIIZ * %MAX_PATH  ' What to search for
                    LOCAL cFileName          AS ASCIIZ * %MAX_PATH  ' Name of the entry just found
                    LOCAL FILE_NAME          AS STRING
                    LOCAL FULL_FILE_NAME     AS STRING
                    LOCAL COUNTER            AS LONG
                    LOCAL STRUCT_SystemTime  AS SYSTEMTIME
                    LOCAL FILE_TAIL          AS STRING
                    LOCAL FILE_ATTR          AS LONG
                
                    INCR RECURSION_DEPTH
                
                    PRINT #1, ""
                    PRINT #1, STRING$(RECURSION_DEPTH, $TAB) + FILE_DRIVE_PATH
                
                    CURRENT_PATH  = FILE_DRIVE_PATH & "*." + FILE_EXTENTION
                    SEARCH_HANDLE = FindFirstFile(CURRENT_PATH, STRUCT_FIND_DATA)  'GET A SEARCH_HANDLE FOR FindNextFile API CALLS
                
                    IF SEARCH_HANDLE <> %INVALID_HANDLE_VALUE THEN                 'SEARCH SUCCEEDED: LIST THE FILES (NOT DIRECTORIES) IN THIS FOLDER
                        DO
                            IF (STRUCT_FIND_DATA.dwFileAttributes AND %FILE_ATTRIBUTE_DIRECTORY) <> %FILE_ATTRIBUTE_DIRECTORY THEN  'THIS IS A FILE NOT A DIRECTORY
                
                               cFileName = STRUCT_FIND_DATA.cFileName
                
                               PRINT #1, STRING$(RECURSION_DEPTH, $TAB) + FILE_DRIVE_PATH + cFileName
                
                
                            END IF
                        LOOP WHILE FindNextFile(SEARCH_HANDLE, STRUCT_FIND_DATA)
                
                        CALL FindClose(SEARCH_HANDLE)
                    END IF
                
                
                    CURRENT_PATH  = FILE_DRIVE_PATH & "*"
                    SEARCH_HANDLE = FindFirstFile(CURRENT_PATH, STRUCT_FIND_DATA)
                
                
                
                   'THIS IS JUST DIRECTORY NAMES THAT MAKE A RECURSIVE CALL TO THE FUNCTION
                    IF SEARCH_HANDLE <> %INVALID_HANDLE_VALUE THEN
                        DO
                            IF (STRUCT_FIND_DATA.dwFileAttributes AND %FILE_ATTRIBUTE_DIRECTORY) = %FILE_ATTRIBUTE_DIRECTORY AND (STRUCT_FIND_DATA.dwFileAttributes AND %FILE_ATTRIBUTE_HIDDEN) = 0 THEN  ' If dirs, but not hidden
                
                                'THIS IS INSIDE THE DIRECTORY LOOP HERE
                                'TEXT_TO_SCREEN STR$(TIMER)
                
                                IF STRUCT_FIND_DATA.cFileName <> "." AND STRUCT_FIND_DATA.cFileName <> ".." THEN          ' Not these..
                                  CALL CATALOG_FILES(FILE_DRIVE_PATH & RTRIM$(STRUCT_FIND_DATA.cFileName, CHR$(0)) & "\", FILE_EXTENTION)
                                END IF
                
                            END IF
                        LOOP WHILE FindNextFile(SEARCH_HANDLE, STRUCT_FIND_DATA)
                        CALL FindClose(SEARCH_HANDLE)  'WIN API FindClose
                    END IF
                
                   DECR RECURSION_DEPTH
                
                 END FUNCTION
                '------------------------------------------------------------------------------
                 FUNCTION XCATALOG_FILES(BYVAL FILE_DRIVE_PATH AS STRING, FILE_EXTENTION AS STRING) AS LONG
                
                '   LOCAL hRoot         AS DWORD
                '   LOCAL hParent       AS DWORD
                
                
                    LOCAL SEARCH_HANDLE      AS DWORD               ' Search handle
                    LOCAL STRUCT_FIND_DATA   AS WIN32_FIND_DATA     ' FindFirstFile structure
                    LOCAL CURRENT_PATH       AS ASCIIZ * %MAX_PATH  ' What to search for
                    LOCAL cFileName          AS ASCIIZ * %MAX_PATH  ' What to search for
                    LOCAL FILE_NAME          AS STRING
                    LOCAL FULL_FILE_NAME     AS STRING
                    LOCAL COUNTER            AS LONG
                    LOCAL STRUCT_SystemTime  AS SYSTEMTIME
                    LOCAL FILE_TAIL          AS STRING
                    LOCAL FILE_ATTR          AS LONG
                
                    LOCAL DEBUG_DW_FILE_ATTR AS LONG
                    LOCAL DEBUG_DW_FILE_PEEK AS LONG
                    LOCAL DEBUG_CASE_STRING  AS STRING
                    LOCAL DEBUG_BIN_STRING   AS STRING
                    LOCAL DEBUG_BIT_VAR      AS LONG
                
                
                
                   ' TREEVIEW INSERT ITEM GL_hDlg, %TREEVIEW_1, 0, %TVI_LAST, 0, 0, FILE_DRIVE_PATH TO hRoot
                
                    CURRENT_PATH  = FILE_DRIVE_PATH & "*." + FILE_EXTENTION
                    SEARCH_HANDLE = FindFirstFile(CURRENT_PATH, STRUCT_FIND_DATA)  'GET A SEARCH_HANDLE FOR FindNextFile API CALLS
                
                    IF SEARCH_HANDLE <> %INVALID_HANDLE_VALUE THEN                 'SEARCH SUCCEEDED: WALK THE ENTRIES, RECURSING INTO DIRECTORIES
                        DO
                
                
                
                          cFileName = STRUCT_FIND_DATA.cFileName 'DEBUG
                
                          IF cFileName = "."   THEN ITERATE DO   'BECAUSE WE HAVE NO USE FOR THESE?
                          IF cFileName = ".."  THEN ITERATE DO   'BECAUSE WE HAVE NO USE FOR THESE?
                
                          INCR GL_COUNTER
                
                          DEBUG_DW_FILE_ATTR = STRUCT_FIND_DATA.dwFileAttributes    'THIS IS A NUMBER AND 0 WOULD BE A NORMAL FILE AS WOULD 128 AND 16 A DIRECTORY BUT MULTI BITS COULD BE HIGH...
                          'DEBUG_DW_FILE_PEEK = %FILE_ATTRIBUTE_DIRECTORY 'THIS IS AN EQUATE TO DIRECTORY, 16
                
                
                          IF DEBUG_DW_FILE_ATTR = 0 THEN
                              PRINT #1,  STRING$(RECURSION_DEPTH, $TAB) +  FORMAT$(GL_COUNTER,"000000") + " F> " + FILE_DRIVE_PATH + cFileName '+ " (" + "IT IS A 0 FILE"
                               'DEBUG_CASE_STRING = "IT IS 0 A FILE"
                               DEBUG_CASE_STRING = ""
                               ITERATE DO
                          END IF
                
                          IF DEBUG_DW_FILE_ATTR = %FILE_ATTRIBUTE_NORMAL THEN  '128 (THE = 0 CASE IS ALREADY HANDLED ABOVE)
                               PRINT #1,   STRING$(RECURSION_DEPTH, $TAB) +  FORMAT$(GL_COUNTER,"000000") + " F> " + FILE_DRIVE_PATH + cFileName '+ " (" + "IT IS A 128 FILE"
                               'DEBUG_CASE_STRING = "IT IS A 128 FILE"
                               DEBUG_CASE_STRING = ""
                               ITERATE DO
                          END IF
                
                
                
                            DEBUG_BIN_STRING = BIN$(STRUCT_FIND_DATA.dwFileAttributes ) 'SHOW US THE BINARY STRING VALUE
                
                            'IF ISTRUE BIT(STRUCT_FIND_DATA.dwFileAttributes, 0) THEN  DEBUG_CASE_STRING = DEBUG_CASE_STRING + "READ-ONLY "   '1  'DON'T CARE? IT'S A FILE
                            'IF ISTRUE BIT(STRUCT_FIND_DATA.dwFileAttributes, 1) THEN  DEBUG_CASE_STRING = DEBUG_CASE_STRING + "HIDDEN "      '2
                            'IF ISTRUE BIT(STRUCT_FIND_DATA.dwFileAttributes, 2) THEN  DEBUG_CASE_STRING = DEBUG_CASE_STRING + "SYSTEM "      '4
                
                            'IF ISTRUE BIT(STRUCT_FIND_DATA.dwFileAttributes, 3) THEN
                            '      DEBUG_CASE_STRING = DEBUG_CASE_STRING + "VOLUME-LABEL"'8
                            'END IF
                
                            IF ISTRUE BIT(STRUCT_FIND_DATA.dwFileAttributes, 4) THEN
                                PRINT #1,  STRING$(RECURSION_DEPTH, $TAB) + FORMAT$(GL_COUNTER,"000000") + " D> " + FILE_DRIVE_PATH + cFileName '+ " (" + "IT IS A 16 DIRECTORY)"
                                PRINT #1,   STRING$(RECURSION_DEPTH, $TAB) +  "CALL RECURSIVE " + FILE_DRIVE_PATH & RTRIM$(STRUCT_FIND_DATA.cFileName, CHR$(0)) & "\" + "  -  " + FILE_EXTENTION
                                INCR RECURSION_DEPTH
                                CALL CATALOG_FILES(FILE_DRIVE_PATH & RTRIM$(STRUCT_FIND_DATA.cFileName, CHR$(0)) & "\", FILE_EXTENTION)
                                DECR RECURSION_DEPTH
                                PRINT #1,   STRING$(RECURSION_DEPTH, $TAB) + "EXIT CALL RECURSIVE "
                                'DEBUG_CASE_STRING = DEBUG_CASE_STRING '+ "DIRECTORY"   '16
                                DEBUG_CASE_STRING = ""
                                ITERATE DO
                            END IF
                
                
                
                
                
                            IF ISTRUE BIT(STRUCT_FIND_DATA.dwFileAttributes, 5) THEN  DEBUG_CASE_STRING = DEBUG_CASE_STRING + "ARCHIVED "    '32
                
                            'IF ISTRUE BIT(STRUCT_FIND_DATA.dwFileAttributes, 6) THEN  DEBUG_CASE_STRING = DEBUG_CASE_STRING + "NORMAL "      '64
                            IF ISTRUE BIT(STRUCT_FIND_DATA.dwFileAttributes, 7) THEN  DEBUG_CASE_STRING = DEBUG_CASE_STRING + "NORMAL "      '128
                
                            PRINT #1,  STRING$(RECURSION_DEPTH, $TAB) + FORMAT$(GL_COUNTER,"000000") + " F> " +  FILE_DRIVE_PATH + cFileName '+ " (" + "FINAL " + DEBUG_CASE_STRING + ")"
                
                        DEBUG_CASE_STRING = ""
                        LOOP WHILE FindNextFile(SEARCH_HANDLE, STRUCT_FIND_DATA)
                
                        CALL FindClose(SEARCH_HANDLE)
                    END IF
                
                
                
                
                
                    'THE WAY THIS WORKS IS WE LOOP A SECOND TIME AFTER ONLY COLLECTING FILES FROM THE PRIOR LOOP
                
                
                    CURRENT_PATH  = FILE_DRIVE_PATH & "*"
                    SEARCH_HANDLE = FindFirstFile(CURRENT_PATH, STRUCT_FIND_DATA)
                
                
                
                   'THIS IS JUST DIRECTORY NAMES THAT MAKE A RECURSIVE CALL TO THE FUNCTION
                    IF SEARCH_HANDLE <> %INVALID_HANDLE_VALUE THEN
                        DO
                            IF (STRUCT_FIND_DATA.dwFileAttributes AND %FILE_ATTRIBUTE_DIRECTORY) = %FILE_ATTRIBUTE_DIRECTORY AND (STRUCT_FIND_DATA.dwFileAttributes AND %FILE_ATTRIBUTE_HIDDEN) = 0 THEN  ' If dirs, but not hidden
                
                                'THIS IS INSIDE THE DIRECTORY LOOP HERE
                                'TEXT_TO_SCREEN STR$(TIMER)
                
                                cFileName = STRUCT_FIND_DATA.cFileName 'DEBUG
                
                                IF STRUCT_FIND_DATA.cFileName <> "." AND STRUCT_FIND_DATA.cFileName <> ".." THEN          ' Not these..
                                  CALL CATALOG_FILES(FILE_DRIVE_PATH & RTRIM$(STRUCT_FIND_DATA.cFileName, CHR$(0)) & "\", FILE_EXTENTION)
                                END IF
                
                            END IF
                        LOOP WHILE FindNextFile(SEARCH_HANDLE, STRUCT_FIND_DATA)
                        CALL FindClose(SEARCH_HANDLE)  'WIN API FindClose
                    END IF
                
                '  FUNCTION = gFileCount
                
                 END FUNCTION
