IDOL Server can perform automatic spelling correction on terms that it believes were typed incorrectly. It uses intuitive rules to determine the most likely intended term, working from the existing documents in the IDOL index. It does not need any dictionary files.
You can specify certain misspellings that you know must correct to specific words, or specify that IDOL Server must never correct a particular word.
When IDOL Server stops, it stores any corrections that it has made in a human-readable file. This file allows it to rapidly perform the same correction at a later stage, and allows you to add further corrections.
For example, consider the following query:
action=Query&Text=san+fransisco&SpellCheck=True
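As a minimal sketch, the query above can be assembled programmatically. The host name and port below are placeholders (assumptions, not values from this documentation), and the URL layout of host, port, then action string is also an assumption about how the action is sent; only the action and its parameters come from the example.

```python
from urllib.parse import urlencode

# Hypothetical host and port: replace with the values for your own IDOL Server.
IDOL_HOST = "localhost"
IDOL_PORT = 9000


def build_spellcheck_query(text: str) -> str:
    """Build an action=Query URL with spelling correction enabled."""
    params = {"action": "Query", "Text": text, "SpellCheck": "True"}
    # urlencode encodes spaces as '+', which matches the example query.
    return f"http://{IDOL_HOST}:{IDOL_PORT}/{urlencode(params)}"


url = build_spellcheck_query("san fransisco")
print(url)
```

Building the parameter string with urlencode rather than by hand avoids encoding mistakes when the query text contains spaces or reserved characters.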
For this query, the server offers the obvious correction, and also shows the full query text complete with the correction. After you stop IDOL temporarily (using the Stop action, or by stopping the service), the server creates a file called prx.db in the main subdirectory of the Content component, with the following contents:
<?xml version="1.0" encoding="UTF-8" ?>
<PROXIMALS>
  <PROXIMAL ORIG="FRANSISCO" CORRECT="Francisco" />
</PROXIMALS>
You can add any further corrections to this file, and amend any existing ones. Then restart the server.
Ensure that the modified file is still valid XML, or IDOL Server cannot read your changes.
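One way to catch a broken edit before restarting is to parse the file with a standard XML parser. The following is a sketch only: the function name check_prx_xml is hypothetical, and the structural checks (a PROXIMALS root containing PROXIMAL elements with an ORIG attribute) are inferred from the example file rather than from a published schema.

```python
import xml.etree.ElementTree as ET


def check_prx_xml(data: bytes) -> list:
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(data)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    # Assumption: the layout matches the example file shown earlier.
    if root.tag != "PROXIMALS":
        problems.append(f"unexpected root element: {root.tag}")
    for child in root:
        if child.tag != "PROXIMAL":
            problems.append(f"unexpected element: {child.tag}")
        elif not child.get("ORIG"):
            problems.append("PROXIMAL element without an ORIG attribute")
    return problems


sample = b"""<?xml version="1.0" encoding="UTF-8" ?>
<PROXIMALS>
  <PROXIMAL ORIG="FRANSISCO" CORRECT="Francisco" />
</PROXIMALS>"""
print(check_prx_xml(sample))
```

To check a real file, read it in binary mode and pass the bytes to the function before you restart the server.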
The usual practice is to insert a list of common misspellings into this file to aid the process.
To specify a word that must not be corrected in query text, simply add the term to the prx.db file without the CORRECT attribute. For example:
<PROXIMAL ORIG="TELEFONE" />
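Both kinds of entry (a forced correction, and a term that must never be corrected) can be generated rather than typed by hand. This is an illustrative sketch: the helper add_entry is hypothetical, and writing ORIG in upper case is an assumption based only on the two examples shown here.

```python
import xml.etree.ElementTree as ET


def add_entry(root, orig, correct=None):
    """Append a PROXIMAL element; omit CORRECT to mark a term as never corrected."""
    # Assumption: ORIG is stored in upper case, as in the examples in this section.
    attrs = {"ORIG": orig.upper()}
    if correct is not None:
        attrs["CORRECT"] = correct
    return ET.SubElement(root, "PROXIMAL", attrs)


root = ET.Element("PROXIMALS")
add_entry(root, "fransisco", "Francisco")  # always correct to "Francisco"
add_entry(root, "telefone")                # never correct this term

# Serialize with an XML declaration (requires Python 3.8 or later).
xml_bytes = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))
```

Generating the elements through an XML library means attribute values are escaped correctly, so the result stays well-formed.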
There are several configuration parameters that affect the spelling correction operation, all in the [Server] section.
SpellCheckIncorrectMaxDocOccs (default=0)
The maximum number of documents in IDOL Server in which a term can occur before IDOL Server treats it as a correct term and does not attempt to find a spelling correction for it.
SpellCheckCorrectMinDocOccs (default=5)
The minimum number of documents in IDOL Server in which a term can occur and still appear as a spelling correction for another term.
SpellCheckMaxCheckTerms (default=0)
The maximum number of terms that a query can contain and still have spelling corrections suggested for the terms. A value of 0 means that there is no limit.
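The way these three thresholds interact can be sketched as simple predicates. This is only one reading of the descriptions above, not IDOL Server's implementation: the class and function names are hypothetical, and the treatment of boundary values (whether each limit is inclusive) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SpellCheckSettings:
    incorrect_max_doc_occs: int = 0  # SpellCheckIncorrectMaxDocOccs
    correct_min_doc_occs: int = 5    # SpellCheckCorrectMinDocOccs
    max_check_terms: int = 0         # SpellCheckMaxCheckTerms (0 = no limit)


def query_is_checked(num_terms: int, s: SpellCheckSettings) -> bool:
    """Corrections are suggested only if the query is within the term limit."""
    return s.max_check_terms == 0 or num_terms <= s.max_check_terms


def term_is_candidate_misspelling(doc_occs: int, s: SpellCheckSettings) -> bool:
    """Assumed reading: a term in more documents than the limit counts as correct."""
    return doc_occs <= s.incorrect_max_doc_occs


def term_can_be_suggested(doc_occs: int, s: SpellCheckSettings) -> bool:
    """A term must occur in at least this many documents to be offered as a correction."""
    return doc_occs >= s.correct_min_doc_occs


defaults = SpellCheckSettings()
print(term_is_candidate_misspelling(0, defaults))  # term absent from the index
print(term_can_be_suggested(4, defaults))          # too rare to be suggested
print(query_is_checked(50, defaults))              # no term limit by default
```

Under this reading, the default settings attempt a correction only for terms that occur in no documents, and offer as corrections only terms that occur in at least five.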