There is a bug in ARRAY SORT when doing a case-insensitive sort. For certain pairs of characters, ARRAY SORT puts the array in one order, but a manual sort that compares UCASE$ values with ">" would put it in the other. This could create a serious problem if you use a search routine (a binary search, say) that assumes the data is sorted a certain way and that compares with "<". Here's an example --
FUNCTION PBMAIN () AS LONG
    DIM arr() AS STRING
    DIM i AS INTEGER
    DIM msg AS STRING
    'create a 2-element array --
    REDIM arr(1 TO 2)
    arr(1) = CHR$(225)
    arr(2) = CHR$(200)
    GOSUB show_array
    'sort it --
    ARRAY SORT arr(), COLLATE UCASE
    GOSUB show_array
    'the sorted array should have element#1 < element#2, but when you test for it . . .
    IF UCASE$(arr(1)) > UCASE$(arr(2)) THEN
        MSGBOX "problem"
    ELSE
        MSGBOX "no problem"
    END IF
    EXIT FUNCTION
    '----------------------------------------------------------------------------------
show_array:
    msg = ""
    FOR i = 1 TO 2
        msg = msg & STR$(i) & " " & arr(i) & CHR$(10)
    NEXT
    MSGBOX msg
    RETURN
END FUNCTION
==================================================================
The problem shows up only when using COLLATE UCASE. If you do a case-sensitive sort, there is no problem.
And all the problems involve characters in the extended range (codes above 127), so maybe it's not much of a concern (assuming you're working in English). Still, it seems like the fix would be fairly easy for someone who knows the insides of ARRAY SORT. It would be nice if the PB people could get ARRAY SORT and the comparison operators coordinated -- one less thing for everyone else to have to consider.
For a listing of the ~600 combinations where ARRAY SORT works one way and "<" works the other way, run the following code . . .
FUNCTION PBMAIN () AS LONG
    DIM s() AS STRING
    DIM i AS INTEGER
    DIM j AS INTEGER
    DIM s1 AS STRING
    DIM s2 AS STRING
    DIM errors AS LONG
    DIM msg AS STRING
    DIM EOL AS STRING
    EOL = CHR$(13) & CHR$(10)
    'create an array of the 256 single-byte chars --
    REDIM s(0 TO 255)
    FOR i = 0 TO 255
        s(i) = CHR$(i)
    NEXT
    'sort it --
    ARRAY SORT s(), COLLATE UCASE
    'compare each char of the sorted array with all the chars after it, looking for
    'instances where the earlier char in the array is found to be ">" the later char --
    FOR i = 0 TO 255
        s1 = s(i)
        msg = msg & "-------------------------------------" & EOL
        FOR j = i + 1 TO 255
            s2 = s(j)
            IF UCASE$(s1) > UCASE$(s2) THEN
                errors = errors + 1
                msg = msg & STR$(i) & " " & STR$(j) & " " & s1 & " " & s2 & EOL
            END IF
        NEXT
    NEXT
    'now write msg to a .txt file for easy viewing (the file name is just an example)
    OPEN "sortdiff.txt" FOR OUTPUT AS #1
    PRINT #1, "mismatches found:" & STR$(errors)
    PRINT #1, msg
    CLOSE #1
END FUNCTION
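Since the case-sensitive sort seems to agree with the comparison operators, one possible workaround (a sketch only, not a confirmed fix) is to sort an array of upper-cased keys without COLLATE UCASE and let TAGARRAY reorder the original strings in step. The keys() array is a helper introduced just for this illustration, and the sketch assumes the plain sort really does match ">" for every character.

```powerbasic
FUNCTION PBMAIN () AS LONG
    DIM arr() AS STRING
    DIM keys() AS STRING   'hypothetical helper array of upper-cased copies
    DIM i AS LONG

    'same 2-element test data as the first example
    REDIM arr(1 TO 2)
    REDIM keys(1 TO 2)
    arr(1) = CHR$(225)
    arr(2) = CHR$(200)

    'build the keys with UCASE$, the same function the "<" / ">" tests use
    FOR i = 1 TO 2
        keys(i) = UCASE$(arr(i))
    NEXT

    'plain (case-sensitive) sort of the keys; arr() is rearranged to match
    ARRAY SORT keys(), TAGARRAY arr()

    'same check as before; assumption: the plain sort agrees with ">"
    IF UCASE$(arr(1)) > UCASE$(arr(2)) THEN
        MSGBOX "problem"
    ELSE
        MSGBOX "no problem"
    END IF
END FUNCTION
```

A search routine could then compare UCASE$ of the target against keys() directly, so the sort and the search both rely on the same ordering.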