Generic Find Text

Date:    Fri, 13 Jan 1995 13:58:23 EST
From:    John Dowdell <71333.42@COMPUSERVE.COM>
Subject: Re: Glossay?Hypertext??'sFIND FIND FIND
Roy Pardi writes on Jan 10 of a desire for a generic "find" function for textfields. Here's what I usually use myself, Roy... the entry comes from the KnowledgeBase:


There are many ways to achieve this... here is a generic example. Suppose there's a long, scrolling textfield on the screen, and we wish to step through each successive appearance of an arbitrary word in that field, automatically scrolling to and highlighting the next occurrence of the word. The following handler does all that:

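  on FindNext searchString, thisField
    global searchText, void
    -- First call (or first call after the search ran out): load the whole field
    if voidP(searchText) then set searchText = the text of field thisField
    set theLength = length(searchString)
    set target = offset (searchString, searchText)
    if target = 0 then
      alert "There are no more matches for " &QUOTE& searchString &QUOTE& " here."
      -- Clear the global so the next call starts again from the top
      set searchText = void
      exit
    end if
    -- Position in the full field = position in the remainder + characters already consumed
    set hitPosition = target + length(the text of field thisField) - length(searchText)
    -- Scroll to the hit by touching a character a little beyond it
    put EMPTY before char (hitPosition + 50) of field thisField
    hilite char hitPosition to (hitPosition + theLength - 1) of field thisField
    -- Drop everything up to and including this hit, and no further,
    -- so that an immediately adjacent occurrence is not skipped
    delete char 1 to (target + theLength - 1) of searchText
  end

It is called by a command such as the following, which can be triggered by clicking on the textfield itself:
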
  FindNext "magenta", "Color Theory"

This will search for the next occurrence of the word "magenta" in the textfield named "Color Theory", scroll to it, and highlight it for you. There are a great many variants on the above... custom formatting, going from field to field, whatever... but the above shows an easy, quick search engine.

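(In the above example, the global variable "searchText" holds the text in that field that remains to be searched; "void" is just an arbitrary uninitialized variable, useful for clearing things out; the "50" is an arbitrary value that depends on the size of the textfield on Stage. Because each hit is deleted from "searchText" along with everything before it, the difference between the field's length and the length of "searchText" is the number of characters already consumed... adding that to "target" gives the hit's position in the full field.)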
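
One consequence of keeping "searchText" in a global: it is only reloaded once it is void, so asking for a different word or a different field partway through a search picks up from the last hit rather than from the top. A minimal sketch of a reset to call before starting a fresh search follows. "ResetFind" is a hypothetical name, not part of the KnowledgeBase entry, and it assumes the same uninitialized "void" global that FindNext uses:

  on ResetFind
    global searchText, void
    -- Assumes "void" has never been assigned, exactly as in FindNext
    set searchText = void
  end ResetFind

Calling ResetFind and then FindNext "cyan", "Color Theory" would begin again at the first character of the field.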

Further, the powerful "list" datatype allows you to set up indexes to *significant* hotwords, rather than just doing the bulk text search outlined above... this can allow for very sophisticated behaviors, but requires a bit more familiarity with high-level scripting constructs.
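
As a rough illustration of the list idea, and only a sketch: the handler below builds a property list that maps each hotword to the position of its first appearance in a field. "BuildIndex" and "JumpTo" are hypothetical names, the list of hotwords is supplied by the caller, and the sketch assumes that string keys in a property list behave as expected in your version of Director:

  on BuildIndex hotwords, thisField
    -- hotwords is a linear list of strings, e.g. ["magenta", "cyan"]
    set theText = the text of field thisField
    set theIndex = [:]
    repeat with i = 1 to count(hotwords)
      set hotword = getAt(hotwords, i)
      -- offset returns 0 when the word is absent; record only real hits
      set place = offset(hotword, theText)
      if place > 0 then addProp theIndex, hotword, place
    end repeat
    return theIndex
  end BuildIndex

  on JumpTo hotword, thisField, theIndex
    -- getaProp returns void for a word that was never indexed
    set place = getaProp(theIndex, hotword)
    if voidP(place) then exit
    hilite char place to (place + length(hotword) - 1) of field thisField
  end JumpTo

The index is built once, for example with set colorIndex = BuildIndex(["magenta", "cyan"], "Color Theory"), and each later lookup avoids rescanning the text. It records only the first occurrence of each word, and it goes stale if the field's text changes.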


Roger's examples point up ways to use lists to achieve that final goal. I haven't benchmarked this against other algorithms (again, please refer to Roger's contributions for ways of setting up benchmarks), but have achieved good results myself with this.

I seem to recall that you were discussing the use of the "offset" function awhile back, Roy... hope I'm not reinventing the wheel by posting this. Again, there are many, many variations on hypertext engines, so this is definitely only grist for your own mill.

And although the topic of high-ASCII differentiation doesn't come up much on the list here, it's of great concern to a few developers on CompuServe. Their concern is that the Lingo "offset" function compares base letter to base letter... it matches despite case differences or diacritical marks. Here's a second KB entry that provides some useful routines for those to whom this differentiation is important within their text engines:

(Q) I would like to use the Lingo "offset()" function to search a textfield, but in Norwegian it does not distinguish between "A" and "A". [whoops! telecom software does not support that last character -- looks like a capital A with a ring atop -- jd]

(A) Yes, this is correct, and is described in the Lingo Dictionary's entry for "offset" as "the string comparison is not sensitive to case or diacritical marks." If you wish to check for case or accent, then it's easy to add this functionality... here's the core function:

  on CompareCase a, b
    if length(a) <> length(b) then return FALSE
    repeat with i = 1 to length(a)
      set charA = charToNum (char i of a)
      set charB = charToNum (char i of b)
      if charA <> charB then return FALSE
    end repeat
    return TRUE
  end CompareCase

This returns TRUE if the two string arguments are an exact match, or FALSE if they differ... it is sensitive to both case and accent marks. To expand this into a case-sensitive "offset" function, wrap the above core test within something like the following:

  on CaseOffset searchString, container
    set deletedChars = 0
    repeat while TRUE
      -- The stock offset finds candidates, ignoring case and diacritical marks
      set place = offset (searchString, container)
      if NOT place then return FALSE
      set tester = char place to (place + length(searchString) - 1) of container
      if CompareCase (tester, searchString) then
        return deletedChars + place
      else
        -- Discard everything through the start of the failed candidate and
        -- keep count, so later hits still map back to the original string
        set deletedChars = deletedChars + place
        delete char 1 to place of container
      end if
    end repeat
  end CaseOffset

Calling this would be syntactically identical to the stock "offset" function, and it will return a similar result... the only difference, again, is that this handler is sensitive to case and diacritical marks. You could further refine the above handler if you wish by running additional tests within CompareCase() along the lines of "if (charA <> charB) AND (charA + 32 <> charB) AND (charA <> charB + 32) then return FALSE"... this would remove the case-sensitivity for the unaccented letters A through Z, whose upper- and lowercase character codes differ by 32, while still preserving sensitivity to diacritical marks. Note that the same test also equates any other pair of characters whose codes happen to differ by 32.
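
The refinement just described can be sketched as a separate comparison handler. This is a variation rather than the KnowledgeBase code: "FoldCase" and "CompareAccent" are hypothetical names, and instead of the plus-or-minus-32 test it folds only the codes 65 through 90 ("A" to "Z") onto their lowercase counterparts, so that non-letters whose codes differ by 32 are not treated as equal:

  on FoldCase charCode
    -- Map "A" to "Z" (65-90) onto "a" to "z" (97-122); leave all other codes alone
    if charCode >= 65 AND charCode <= 90 then return charCode + 32
    return charCode
  end FoldCase

  on CompareAccent a, b
    if length(a) <> length(b) then return FALSE
    repeat with i = 1 to length(a)
      set charA = FoldCase(charToNum (char i of a))
      set charB = FoldCase(charToNum (char i of b))
      if charA <> charB then return FALSE
    end repeat
    return TRUE
  end CompareAccent

Substituting CompareAccent for CompareCase inside CaseOffset would give a search that ignores case for plain letters but still tells an accented character from its base letter. Accented letters are not folded here, so an uppercase accented character and its lowercase form still count as different.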

So, yes, the documentation is accurate and yes, if you wish to extend Lingo functionality to be sensitive to either case, accent marks, or both, then you are free to do so.