In the last few days I've been looking at spell check code. To speed up the
search I created the binary search code found below. If the search term
is not found I also want to know where the search term would have been
found so I can use nearby words as spelling suggestions.
I was surprised to find many links to the term, but far fewer examples. So
I wrote my own and here it is.
Primary Code:
The following single function does all the work. In the compilable version
further down the page, a linear search is included along with a speed test
that shows up to a speed improvement of 10X, depending on the position
of the search term.
The code is provided below, and is also available in the
gbSnippets PowerBASIC source code library (snippet# gbs_00396).
You can get the entire library by downloading gbSnippets or you
can view individual snippets online.
gbSnippets home page: http://www.garybeene.com/sw/gbsnippets.htm
Online source code listings: http://www.garybeene.com/power/code/
If you've already installed gbSnippets, you can ensure that your local
library is synchronized with the latest snippets on the gbSnippets server
by using the "Actions/Synchronize with gbSnippets Server" menu.
Finally, here's a direct link to the gbsnippets.pbr resource file use in this code:
To use the function for other data types, simply declare sWord as the preferred data type.
The following compilable example includes a linear search through the array,
as well as a speed comparison between the two.
search I created the binary search code found below. If the search term
is not found I also want to know where the search term would have been
found so I can use nearby words as spelling suggestions.
I was surprised to find many links to the term, but far fewer examples. So
I wrote my own and here it is.
Primary Code:
The following single function does all the work. In the compilable version
further down the page, a linear search is included along with a speed test
that shows up to a speed improvement of 10X, depending on the position
of the search term.
The code is provided below, and is also available in the
gbSnippets PowerBASIC source code library (snippet# gbs_00396).
You can get the entire library by downloading gbSnippets or you
can view individual snippets online.
gbSnippets home page: http://www.garybeene.com/sw/gbsnippets.htm
Online source code listings: http://www.garybeene.com/power/code/
If you've already installed gbSnippets, you can ensure that your local
library is synchronized with the latest snippets on the gbSnippets server
by using the "Actions/Synchronize with gbSnippets Server" menu.
Finally, here's a direct link to the gbsnippets.pbr resource file use in this code:
Code:
Function BinaryWordSearch(sWord As String, iPos&) As Long 'search for sWord$ in WordList(), which is a Global array 'return 1 if found, 0 otherwise 'iPos& is UBound(WordList) + 1 if searchterm > all values in array 'iPos& is -1 if searchterm < all values in array 'otherwise, iPos is position found, or position where search term should have been found (immediately below 1st item greater) Local Upper As Long, Lower As Long Lower = LBound(WordList) : Upper = UBound(WordList) 'test boundary values If sWord = WordList(Lower) Then iPos& = Lower : Function = 1 : Exit Function If sWord = WordList(Upper) Then iPos& = Upper : Function = 1 : Exit Function If sWord < WordList(Lower) Then iPos& = Lower - 1 : Function = 0 : Exit Function If sWord > WordList(Upper) Then iPos& = Upper + 1 : Function = 0 : Exit Function Do Until (Upper <= (Lower+1)) iPos& = (Lower + Upper) / 2 Select Case sWord Case > WordList(iPos&) : Lower = iPos& Case < WordList(iPos&) : Upper = iPos& Case WordList(iPos&) : Function = 1 : Exit Function End Select Loop End Function
The following compilable example includes a linear search through the array,
as well as a speed comparison between the two.
Code:
'Compilable Example: 'This example declares sWord as STRING, search for a match in a sorted string array. 'Also, be aware that in code such as this, which uses the greater than or less than 'symbols, that string comparisons are being made and that "0600" will come after "06" '(the numeric values of 600 and 6 are NOT used to make the comparison). #Compile Exe #Dim All #Include "Win32API.inc" Global hDlg as Dword, WordList() As String Function PBMain() As Long Local i As Long ReDim WordList(1000) For i = 0 to 700 : WordList(i) = Format$(i,"00000") : Next i 'these two lines leave out "00701" For i = 701 to 1000 : WordList(i) = Format$(i+1,"00000") : Next i Dialog New Pixels, 0, "Test Code",300,300,200,200, %WS_OverlappedWindow To hDlg Control Add Button, hDlg, 100,"Search", 50,10,100,20 Control Add TextBox, hDlg, 200,"00000", 50,35,100,20 Dialog Show Modal hDlg Call DlgProc End Function CallBack Function DlgProc() As Long If CB.Msg = %WM_Command AND CB.Ctl = 100 AND CB.Ctlmsg = %BN_Clicked Then Local sWord$, iPos&, i As Long Control Get Text hDlg, 200 To sWord$ If BinaryWordSearch(sWord$, iPos&) Then MsgBox sWord$ + " was found at array position " + Str$(iPos&) + "." Else MsgBox sWord$ + " was not found. It should have been at position " + Str$(iPos&) + "." End If If LinearWordSearch(sWord$, iPos&) Then MsgBox sWord$ + " was found at array position " + Str$(iPos&) + "." Else MsgBox sWord$ + " was not found. It should have been at position " + Str$(iPos&) + "." End If SpeedTest sWord$ End If End Function Function BinaryWordSearch(sWord As String, iPos&) As Long 'search for sWord$ in WordList(), which is a Global array 'return 1 if found, 0 otherwise 'iPos& is UBound(WordList) + 1 if searchterm > all values in array 'iPos& is -1 if searchterm < all values in array 'otherwise, iPos is position found, or position where search term should have been found (immediately below 1st item greater) Local Upper As Long, Lower As Long Lower = LBound(WordList) : Upper = UBound(WordList) 'test boundary values If sWord = WordList(Lower) Then iPos& = Lower : Function = 1 : Exit Function If sWord = WordList(Upper) Then iPos& = Upper : Function = 1 : Exit Function If sWord < WordList(Lower) Then iPos& = Lower - 1 : Function = 0 : Exit Function If sWord > WordList(Upper) Then iPos& = Upper + 1 : Function = 0 : Exit Function Do Until (Upper <= (Lower+1)) iPos& = (Lower + Upper) / 2 Select Case sWord Case > WordList(iPos&) : Lower = iPos& Case < WordList(iPos&) : Upper = iPos& Case WordList(iPos&) : Function = 1 : Exit Function End Select Loop End Function Function LinearWordSearch(sWord As String, iPos&) As Long 'search for sWord$ in WordList(), which is a Global array 'return 1 if found, 0 otherwise 'iPos& is UBound(WordList) + 1 if searchterm > all values in array 'iPos& is -1 if searchterm < all values in array 'otherwise, iPos is position found, or position where search term should have been found (immediately below 1st item greater) Local i As Long For i = 0 to UBound(WordList) Select Case sWord Case > WordList(i) 'no action, keep looping Case < WordList(i) : iPos& = i - 1 : Function = 0 : Exit Function Case WordList(i) : iPos& = i : Function = 1 : Exit Function End Select Next i iPos& = i End Function Sub SpeedTest (sWord As String) 'speed test - Binary Dim iStart As Long, iEnd As Long, i As Long, iPos& iStart = GetTickCount For i = 1 To 10000 : BinaryWordSearch(sWord,iPos&) : Next i iEnd = GetTickCount MsgBox "Binary: " + Format$((iEnd - iStart)/1000,3) & " seconds" 'speed test - Linear iStart = GetTickCount For i = 1 To 10000 : LinearWordSearch(sWord,iPos&) : Next i iEnd = GetTickCount MsgBox "Linear: " + Format$((iEnd - iStart)/1000,3) & " seconds" End Sub 'gbs_00396
Comment