Full Text Search

Nymph provides a case-insensitive full-text search (FTS) engine based on the SciActive Tokenizer. It supports searches for sequential tokens, searches for exact tokens, "or" searches, and negation of tokens.

When an entity is saved to the database, it has a chance (through the getFTSText static function) to alter the text from its properties before FTS indexing (removing HTML tags, extracting text from Base64, etc.). Once the text is finalized, it is converted to lower case and broken apart into words, stop words are removed, and each remaining word is stemmed (running -> run, etc.). The stem and the original word are then turned into 32-bit tokens. The tokens, along with their position and whether they're a stem, are stored in the database.
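The indexing steps described above can be sketched roughly as follows. This is only an illustration of the pipeline, not the SciActive Tokenizer's actual code: the stop word list, the suffix-stripping `naiveStem`, the FNV-1a style `hash32`, the ASCII-only word splitting, and the `Token` shape are all assumptions made for the example. It also assumes that positions are counted after stop words are removed and that a stem token is only stored when it differs from the original word.

```typescript
// Illustrative sketch only. Every name here is hypothetical, not Nymph's API.
type Token = { token: number; position: number; stem: boolean };

// Tiny stand-in stop word list (a real list is much longer).
const STOP_WORDS = new Set(['a', 'an', 'and', 'the', 'is', 'of', 'to']);

// Naive suffix stripper standing in for a real stemming algorithm.
function naiveStem(word: string): string {
  for (const suffix of ['ing', 'ed', 's']) {
    if (word.endsWith(suffix) && word.length - suffix.length >= 3) {
      return word.slice(0, -suffix.length);
    }
  }
  return word;
}

// FNV-1a style hash over UTF-16 code units, yielding an unsigned 32-bit integer.
function hash32(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

function tokenize(text: string): Token[] {
  // Lower-case the text, then split it into words (ASCII letters and digits only).
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  const tokens: Token[] = [];
  let position = 0;
  for (const word of words) {
    // Stop words are dropped and do not take up a position (an assumption).
    if (STOP_WORDS.has(word)) continue;
    tokens.push({ token: hash32(word), position, stem: false });
    const stem = naiveStem(word);
    if (stem !== word) {
      // The stem shares the original word's position and is flagged as a stem.
      tokens.push({ token: hash32(stem), position, stem: true });
    }
    position++;
  }
  return tokens;
}
```

In this sketch, "Lighting" and "lights" both produce a stem token equal to the token for "light", which is how a search for one form can match the others.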

The following example queries show how to use the available features.

  • light - finds "light", "lights", "lighting", "lighted"
  • light mode - finds "light mode", "The distinct mode ... bright lighting around..."
  • 'light mode' - finds "light mode", "lighting mode", "lighting the mode", "lighted and moded"
  • "light mode" - finds "light mode", "light and mode", "light a mode"
  • light or dark - finds "this is light" or "dark is here"
  • "light mode" or "dark mode" - finds "light mode on" or "dark mode above"
  • -light - finds anything without "light", "lights", "lighting", "lighted"
  • -'light mode' - finds anything without "light mode", "lighting mode", "lighting the mode", "lighted and moded"
  • -"light mode" - finds anything without "light mode", "light and mode", "light a mode"
  • "light mode" or -"grayscale" - finds "light mode", "light and mode", "light a mode" or anything without "grayscale"
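To make the syntax in the examples above concrete, here is a minimal sketch of how such a query string could be split into terms. It is not Nymph's parser: the `Term` shape, the `parseQuery` function, and the way "or" is attached to the term that follows it are assumptions made for illustration, inferred only from the example queries.

```typescript
// Hypothetical term description. Not a Nymph type.
type Term = {
  kind: 'token' | 'sequence' | 'exact'; // bare word, '...' phrase, "..." phrase
  text: string;
  negated: boolean; // true when prefixed with "-"
  or: boolean; // true when joined to the previous term by "or"
};

function parseQuery(query: string): Term[] {
  // Optional "-", then a '...' phrase, a "..." phrase, or a bare word.
  const pattern = /(-?)(?:'([^']*)'|"([^"]*)"|(\S+))/g;
  const terms: Term[] = [];
  let pendingOr = false;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(query)) !== null) {
    const [, dash, single, double, bare] = match;
    // A bare, unnegated "or" is an operator, not a search term.
    if (bare !== undefined && dash === '' && bare.toLowerCase() === 'or') {
      pendingOr = terms.length > 0;
      continue;
    }
    const kind: Term['kind'] =
      single !== undefined ? 'sequence' : double !== undefined ? 'exact' : 'token';
    terms.push({
      kind,
      text: single ?? double ?? bare ?? '',
      negated: dash === '-',
      or: pendingOr,
    });
    pendingOr = false;
  }
  return terms;
}
```

For example, `parseQuery("-'light mode'")` yields a single negated `sequence` term, while `parseQuery("light mode")` yields two plain `token` terms.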

You can use the search feature with the search clause in a Nymph query:

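// Find SomeEntity entities whose `prop` text contains both "light" and "mode"
// (or words sharing their stems).
const entities = await nymph.getEntities(
  { class: SomeEntity },
  { type: '&', search: ['prop', 'light mode'] },
);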
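Because sequential and exact searches use quote characters inside the query itself, the query string needs care when written as a JavaScript or TypeScript literal. The sketch below uses a hypothetical `searchSelector` helper (not part of Nymph) that simply builds the selector shape shown above. Template literals keep both kinds of quotes intact without escaping.

```typescript
// Hypothetical helper, not part of Nymph: builds the selector shape used above.
type SearchSelector = { type: '&'; search: [string, string] };

function searchSelector(prop: string, query: string): SearchSelector {
  return { type: '&', search: [prop, query] };
}

// Backticks let the query contain both ' and " without escaping.
const sequential = searchSelector('prop', `'light mode'`);
const exactOrNegated = searchSelector('prop', `"light mode" or -"grayscale"`);

// Each selector would then be passed as the second argument, for example:
// nymph.getEntities({ class: SomeEntity }, exactOrNegated);
```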

Note that this is not a substring search, so a search for "light" will not match strings containing the words "slight", "blight", or "delighted".
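The difference between the two approaches can be shown with a small standalone comparison. This sketch does not use Nymph and ignores stemming and stop words. The `substringMatch` and `wordMatch` functions are hypothetical and only contrast matching on characters with matching on whole words.

```typescript
// Split text into lower-case words (ASCII letters and digits only, for brevity).
function splitWords(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Substring search: matches anywhere inside a word.
function substringMatch(text: string, query: string): boolean {
  return text.toLowerCase().includes(query.toLowerCase());
}

// Word-level search: matches only whole words, as a token-based index does.
function wordMatch(text: string, query: string): boolean {
  return splitWords(text).includes(query.toLowerCase());
}

const text = 'We were delighted by the slight change.';
console.log(substringMatch(text, 'light')); // true: "light" occurs inside "delighted"
console.log(wordMatch(text, 'light')); // false: no whole word equals "light"
```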
