Pierre, you're right... my logic was totally flawed. Looking again in light of Bob Zale's post, it seems to me that Bill's problem lay with his 'manual sort'. As the PowerBASIC documentation for UCASE$ clearly states, the function is only valid for ASCII characters from CHR$(0) to CHR$(127).
The following code raises the question of what the compiler does with the higher ASCII codes.
FUNCTION PBMAIN () AS LONG
    LOCAL c1, c2 AS STRING
    LOCAL i, j AS LONG
    c1 = CHR$(225)
    c2 = CHR$(220)
    i = (c1 > c2)                  ' compare the raw characters
    j = (UCASE$(c1) > UCASE$(c2))  ' compare them after UCASE$
    MSGBOX STR$(i)+" "+STR$(j)     ' -1 0 is displayed
END FUNCTION
If the value 225 is changed to 223, then -1 -1 is displayed.
If the value 220 is changed to 224, then -1 -1 is also displayed.
That's more or less what led me to my flawed solution in the first place.
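A minimal sketch for checking directly what UCASE$ does with the higher codes, rather than inferring it from comparisons. It loops over the codes 128 to 255 and reports every code that UCASE$ changes. The variable names are only illustrative, and no output is shown here because the result may depend on the compiler version.

```powerbasic
FUNCTION PBMAIN () AS LONG
    LOCAL i AS LONG
    LOCAL u AS LONG
    LOCAL msg AS STRING

    FOR i = 128 TO 255
        ' Code of the character that UCASE$ returns for code i
        u = ASC(UCASE$(CHR$(i)))
        ' Record only the codes that UCASE$ actually changes
        IF u <> i THEN
            msg = msg & STR$(i) & " ->" & STR$(u) & ";"
        END IF
    NEXT

    IF LEN(msg) = 0 THEN
        msg = "UCASE$ left every code from 128 to 255 unchanged"
    END IF
    MSGBOX msg
END FUNCTION
```

If the list comes back empty, UCASE$ is leaving the high codes alone on that compiler; if not, the pairs show exactly which codes are folded and to what.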
ARRAY SORT bug
-
Charles,
"The UCASE$ function simply adds 32 to the ASCII value
so that adding 32 to the 225 value wraps around to a value of 1."
Prior versions, which did not handle the international
character set, retain the same value of 225.
UCASE$ subtracts 32 from unaccented lowercase letters below 127,
but there is more to it; you may want to have a look at
this thread.
-
Originally posted by Bill Kadenhead:
"There is a bug with ARRAY SORT ... It would be nice if the PB people could get everything coordinated..."
"COLLATE cstring is used to specify an entirely new sorting order. This can be used for a variety of purposes, the most obvious of which is the case of international character sets. The collate string cstring must contain exactly 256 characters, one for each of the ASCII codes 0-255, in the order that they would be sorted (from lowest to highest, if an ascending sort were performed on them)."
Best regards,
Bob Zale
PowerBASIC Inc.
-
Hi Bill,
This is normal behaviour for <<ARRAY SORT SomeArray$(), COLLATE UCASE>>
because it will capitalize only the characters a to z,
not accented ones like é or à.
When sorting international characters, those above CHR$(127),
you could use <<ARRAY SORT SomeArray$(), COLLATE $String>>.
See the PowerBASIC help file about it.
Also this demo might help.
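A minimal sketch of the COLLATE string form mentioned above (this is not the linked demo). It assumes the collate string must hold exactly 256 characters, one per code 0-255, as the help file text quoted elsewhere in this thread says. The variable name sortOrder and the sample data are hypothetical. The characters are listed here in plain ascending code order, so this string is only a starting point; an international sort would rearrange it.

```powerbasic
FUNCTION PBMAIN () AS LONG
    LOCAL i AS LONG
    LOCAL sortOrder AS STRING   ' hypothetical name for the collate string
    LOCAL msg AS STRING
    DIM arr(1 TO 3) AS STRING

    ' Build a 256-character string, one character per code 0-255.
    ' Plain ascending order is an assumption made for illustration.
    FOR i = 0 TO 255
        sortOrder = sortOrder & CHR$(i)
    NEXT

    ' Sample data: two characters above CHR$(127) and one plain letter
    arr(1) = CHR$(225)
    arr(2) = CHR$(200)
    arr(3) = "a"

    ARRAY SORT arr(), COLLATE sortOrder

    ' Show each element with its character code after the sort
    FOR i = 1 TO 3
        msg = msg & STR$(i) & " " & arr(i) & STR$(ASC(arr(i))) & CHR$(13) & CHR$(10)
    NEXT
    MSGBOX msg
END FUNCTION
```

Rearranging sortOrder and rerunning is a quick way to see how a given collate string affects the result before relying on it in a search routine.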
-
Bill, you have not shown a bug here. The UCASE$ function simply adds 32 to the ASCII value so that adding 32 to the 225 value wraps around to a value of 1.
Just replace your statement:
IF UCASE$(arr(1)) > UCASE$(arr(2)) THEN
with:
MSGBOX STR$(ASC(CHR$(ASC(arr(1))+32)))+" " + STR$(ASC(CHR$(ASC(arr(2))+32)))
IF CHR$(ASC(arr(1))+32) > CHR$(ASC(arr(2))+32) THEN
and you'll see what I mean.
PowerBASIC is a very well developed compiler and I am extremely hesitant to refer to any anomaly I experience as a 'bug'.
-
ARRAY SORT bug
There is a bug with ARRAY SORT when trying to do a case-insensitive sort. For certain pairs of characters, the array will be sorted one way, but sorting manually, using a ">" comparison, would sort it the other way. This could create a serious problem if you use a search routine that assumes the data is sorted in a certain way (probably binary search) and that makes use of "<". Here's an example --
FUNCTION PBMAIN () AS LONG
    DIM arr() AS STRING
    DIM i AS INTEGER
    DIM msg AS STRING
    'create a 2-element array --
    REDIM arr(1 TO 2)
    arr(1) = CHR$(225)
    arr(2) = CHR$(200)
    GOSUB show_array
    'sort it --
    ARRAY SORT arr(), COLLATE UCASE
    GOSUB show_array
    'the sorted array should have element#1 < element#2, but when you test for it . . .
    IF UCASE$(arr(1)) > UCASE$(arr(2)) THEN
        MSGBOX "problem"
    ELSE
        MSGBOX "no problem"
    END IF
    EXIT FUNCTION
    '----------------------------------------------------------------------------------
show_array:
    msg = ""
    FOR i = 1 TO 2
        msg = msg & STR$(i) & " " & arr(i) & CHR$(10)
    NEXT
    MSGBOX msg
    RETURN
END FUNCTION
==================================================================
The problem shows up only when using COLLATE UCASE. If you do a case-sensitive sort, no problem.
And all the problems occur after ASCII 128, so maybe it's not much of a concern (assuming you're working in English). Still it seems like the fix would be fairly easy for someone who knows the insides of ARRAY SORT. It would be nice if the PB people could get everything coordinated -- One less thing for everyone else to have to consider.
For a listing of the ~600 combinations where ARRAY SORT works one way and "<" works the other way, run the following code . . .
FUNCTION PBMAIN () AS LONG
    DIM s() AS STRING
    DIM i AS INTEGER
    DIM j AS INTEGER
    DIM s1 AS STRING
    DIM s2 AS STRING
    DIM errors AS LONG
    DIM msg AS STRING
    DIM EOL AS STRING
    EOL = CHR$(13) & CHR$(10)
    'create an array of the 256 ascii chars --
    REDIM s(0 TO 255)
    FOR i = 0 TO 255
        s(i) = CHR$(i)
    NEXT
    'sort it --
    ARRAY SORT s(), COLLATE UCASE
    'compare each char. of the sorted array with all the other chars, looking for instances where the lower char in the array is found to be ">" the upper char --
    FOR i = 0 TO 255
        s1 = s(i)
        msg = msg & "-------------------------------------" & EOL
        FOR j = i + 1 TO 255
            s2 = s(j)
            IF UCASE$(s1) > UCASE$(s2) THEN
                errors = errors + 1
                msg = msg & STR$(i) & " " & STR$(j) & " " & s1 & " " & s2 & EOL
            END IF
        NEXT
    NEXT
    'now write msg to .txt file for easy viewing
    '(the file name below is a placeholder, not part of the original post)
    OPEN "sortdiff.txt" FOR OUTPUT AS #1
    PRINT #1, msg
    PRINT #1, "mismatches:" & STR$(errors)
    CLOSE #1
END FUNCTION