AgentBoolean Queries

You can think of an AgentBoolean query as the inverse of a standard query. Standard queries identify documents that match a query, while AgentBoolean queries find queries (agents) that match a document or piece of text.

The main use of AgentBoolean queries is to match documents against certain criteria, and then trigger an action each time IDOL Server receives a new matching document. You can create a set of AgentBoolean rules, each of which contains a Boolean or FieldText query that represents an agent or a category. IDOL Server stores these rules in the Agentstore component, in the same way that the Content component stores documents. You can then send a document to the Agentstore component as a query, which finds any agents or categories that match (that is, if you were to run a search with the matching agent or category rule, it would return the specified document).

You might create AgentBoolean rules that define categories. When you index new documents, you can use an AgentBoolean query to categorize them, and update the document with a category field before indexing.

You might create AgentBoolean rules for an agent that defines a topic that a user is interested in. In this case you can run an AgentBoolean query with a new document to find the agents that would match the document, and then alert interested users to new matching content.

For full details of how to set up and use AgentBoolean queries, refer to the IDOL Server Administration Guide.

Performance

The most important part of optimizing AgentBoolean performance is to ensure that the field containing the Boolean rules is configured as the AgentBooleanCacheField. This configuration uses additional memory but makes AgentBoolean queries much faster.

NOTE:

The same is true for FieldTextCacheField, AgentParamsCacheField, and AgentSecurityCacheField, where appropriate.

The second part of optimizing performance is to use Index fields to minimize the number of rules that IDOL must check.

To understand this process, remember that the usual query process still applies in AgentBoolean queries. AgentBoolean agents match when terms from the query Text match terms in Index fields. IDOL then checks AgentBoolean rules as a post-filter. The process runs more quickly if you can minimize the number of AgentBoolean agents that match, while guaranteeing that all correct rule matches return.

For example, consider the AgentBoolean rule penguin AND Antarctic. Any document that this rule matches must contain the word Antarctic.

To create this rule in IDOL, you create the following AgentBoolean agent:

#DREREFERENCE Penguin
#DREFIELD AGENTBOOL="penguin AND Antarctic"
#DRECONTENT
Antarctic
#DREENDDOC

In this example, you could equally use the word penguin as the DRECONTENT value. However, you should always use the rarest possible term, which reduces the number of times that IDOL has to check the rule; in this case, Antarctic is the rarer term.

For the AgentBoolean rule penguin OR Antarctic, it is no longer the case that every matching document must contain the word Antarctic, because a matching document might contain only the word penguin. The DRECONTENT must therefore include both terms.

In this case, you create the following AgentBoolean agent:

#DREREFERENCE Penguin
#DREFIELD AGENTBOOL="penguin OR Antarctic"
#DRECONTENT
penguin Antarctic
#DREENDDOC
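
The reasoning behind the two agents above can be sketched in Python. This is a minimal illustration, not part of IDOL: the function names choose_drecontent and build_agent_idx are hypothetical, the document frequencies are invented for the example, and the sketch only handles a flat rule whose terms are all joined by a single AND or OR operator.

```python
from typing import Dict, List


def choose_drecontent(operator: str, terms: List[str],
                      doc_freq: Dict[str, int]) -> List[str]:
    """Pick DRECONTENT terms for a flat rule joined by one operator."""
    if operator == "AND":
        # Every matching document contains all terms, so indexing only
        # the rarest term still guarantees that the rule gets checked.
        return [min(terms, key=lambda t: doc_freq.get(t.lower(), 0))]
    if operator == "OR":
        # A matching document might contain any single term, so every
        # term must be indexed.
        return list(terms)
    raise ValueError(f"unsupported operator: {operator}")


def build_agent_idx(reference: str, rule: str,
                    content_terms: List[str]) -> str:
    """Format an agent in the IDX layout used in this section."""
    return "\n".join([
        f"#DREREFERENCE {reference}",
        f'#DREFIELD AGENTBOOL="{rule}"',
        "#DRECONTENT",
        " ".join(content_terms),
        "#DREENDDOC",
    ]) + "\n"


# Invented document frequencies, for illustration only.
doc_freq = {"penguin": 5000, "antarctic": 1200}

and_terms = choose_drecontent("AND", ["penguin", "Antarctic"], doc_freq)
or_terms = choose_drecontent("OR", ["penguin", "Antarctic"], doc_freq)

print(build_agent_idx("Penguin", "penguin AND Antarctic", and_terms))
print(build_agent_idx("Penguin", "penguin OR Antarctic", or_terms))
```

In practice, you would take the term frequencies from your own index rather than hard-coding them, and nested rules need more careful analysis than this sketch provides.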

You can use the DocumentStats action against an IDOL Server containing an index of text documents to automatically determine the optimal DRECONTENT for any AgentBoolean rule.

action=DocumentStats&QueryStats=True&text=penguin AND Antarctic  
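
If you send this action from a script, the spaces in the rule text need to be URL-encoded. The following Python sketch builds the request URL using only the standard library. The host name, the port, and the helper name document_stats_url are placeholders chosen for this example, and the URL layout (the action string directly after the slash) is an assumption that you should check against your own deployment.

```python
from urllib.parse import quote, urlencode


def document_stats_url(host: str, port: int, text: str) -> str:
    """Build a DocumentStats request URL for the given rule text."""
    params = {
        "action": "DocumentStats",
        "QueryStats": "True",
        "text": text,
    }
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode(params, quote_via=quote)
    return f"http://{host}:{port}/{query}"


# Placeholder host and port; substitute the values for your server.
print(document_stats_url("localhost", 9100, "penguin AND Antarctic"))
```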

Wildcard Rules

When any of your AgentBoolean rules contain wildcard terms, you might not be able to find a minimal DRECONTENT value that always correctly matches the rule, because you cannot use wildcard terms in the DRECONTENT field.

In this case, you can mark the AgentBoolean agent with an AlwaysMatchType field so that IDOL checks the rule for every query.

#DREREFERENCE Penguin
#DREFIELD AGENTBOOL="penguin OR Antarc*"
#DREFIELD ALWAYSMATCH="1"
#DREENDDOC

If the wildcard term is not required as part of the DRECONTENT, this approach is not necessary. For example:

#DREREFERENCE Penguin
#DREFIELD AGENTBOOL="penguin AND Antarc*"
#DRECONTENT
penguin
#DREENDDOC

Again, you can use the DocumentStats action to determine the optimal DRECONTENT value. This action returns the following field to indicate whether you need to set the AlwaysMatchType field.

<autn:wildrequired>true</autn:wildrequired>
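
A script can read this field from the response to decide whether to add the always-match field to an agent. The following Python sketch shows one way to do that. The sample response is invented and heavily trimmed, its enclosing elements and namespace URI are placeholders rather than the real response layout, and the function name wild_required is hypothetical.

```python
import xml.etree.ElementTree as ET


def wild_required(response_xml: str) -> bool:
    """Return True if the response has a wildrequired element set to true."""
    root = ET.fromstring(response_xml)
    for element in root.iter():
        # ElementTree expands a prefixed name to "{namespace-uri}localname",
        # so compare only the part after the closing brace.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "wildrequired":
            return (element.text or "").strip().lower() == "true"
    return False


# Invented sample for illustration; the namespace URI is a placeholder.
sample = (
    '<autnresponse xmlns:autn="urn:example:autn">'
    "<responsedata>"
    "<autn:wildrequired>true</autn:wildrequired>"
    "</responsedata>"
    "</autnresponse>"
)

print(wild_required(sample))
```

Matching on the local element name keeps the sketch independent of the exact namespace URI that the server declares.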

Field Restrictions

All fields that are referenced in an agent must exist in the IDOL Server AgentStore index before you index the agent. For example, if the agent contains a rule to match a value in the field FieldName, then you must index a FieldName field as part of a dummy document.

Similarly, any fields that occur in documents that you send in TextParse queries must also exist in a dummy document for agents to use the fields.

The easiest way to create the dummy document is to generate an empty example of a document that you want to send for checking against the AgentBoolean rules. For example, you might index the following document before indexing your AgentBoolean agents:

#DREREFERENCE DummyDocument
#DRETITLE
#DREFIELD Location=""
#DREFIELD Author=""
#DREFIELD Type=""
#DRECONTENT
#DREENDDOC

After you index this dummy document, you can index an agent such as:

#DREREFERENCE Penguin
#DREFIELD AGENTBOOL="penguin:Type AND Antarc*:Location"
#DRECONTENT
penguin
#DREENDDOC
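
If your documents have many fields, you might prefer to generate the dummy document from a list of field names rather than write it by hand. The following Python sketch produces the same IDX layout as the dummy document shown earlier in this section. The function name dummy_document_idx is hypothetical, and the sketch does not index anything; it only builds the text.

```python
from typing import Iterable


def dummy_document_idx(field_names: Iterable[str],
                       reference: str = "DummyDocument") -> str:
    """Build an empty IDX document that declares each named field."""
    lines = [f"#DREREFERENCE {reference}", "#DRETITLE"]
    # One empty field per name, so that the field exists in the index.
    lines += [f'#DREFIELD {name}=""' for name in field_names]
    lines += ["#DRECONTENT", "#DREENDDOC"]
    return "\n".join(lines) + "\n"


print(dummy_document_idx(["Location", "Author", "Type"]))
```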

AgentBoolean Query Process

The first stage of an AgentBoolean query is to extract the text from the input. For a query with the TextParse action parameter set to True, IDOL Server uses the TextParseIndexType fields. For a non-TextParse query, it uses all the supplied text.

This text is used to perform an initial conceptual match against the agent Index fields. After this process has narrowed down the list of possible agents, IDOL Server checks any AgentBoolean field against the input text and (for TextParse queries) any FieldText and AgentSecurity fields.
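
The two stages can be illustrated with a toy model in Python. This is a simplified sketch and not how IDOL Server is implemented: the initial conceptual match is reduced to plain term overlap, each rule is a hand-written predicate instead of a parsed Boolean expression, FieldText and security checks are left out, and the names Agent and match_agents are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Agent:
    reference: str
    index_terms: Set[str]              # lowercased DRECONTENT terms
    rule: Callable[[Set[str]], bool]   # stand-in for the AgentBoolean rule
    always_match: bool = False         # stand-in for the always-match field


def match_agents(text: str, agents: List[Agent]) -> List[str]:
    """Return references of agents whose rule matches the input text."""
    # Naive whitespace tokenization; punctuation is not handled.
    terms = {word.lower() for word in text.split()}
    # Stage 1: keep agents that share an indexed term with the input,
    # plus any agent marked as always matching.
    candidates = [a for a in agents
                  if a.always_match or a.index_terms & terms]
    # Stage 2: check the Boolean rule as a post-filter.
    return [a.reference for a in candidates if a.rule(terms)]


agents = [
    Agent("PenguinAnd", {"antarctic"},
          lambda t: "penguin" in t and "antarctic" in t),
    Agent("PenguinOr", {"penguin", "antarctic"},
          lambda t: "penguin" in t or "antarctic" in t),
    Agent("PenguinWild", set(),
          lambda t: "penguin" in t or any(w.startswith("antarc") for w in t),
          always_match=True),
]

print(match_agents("A penguin colony", agents))
```

In this model, the AND agent never reaches the rule check for a document that lacks its indexed term, whereas the always-match agent is checked for every input. This is why minimal DRECONTENT values reduce the work done in the second stage.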

