CMS 10.0–10.5 Boosts and Exclusions


InSite Search returns results quickly because its Lucene engine searches prebuilt indexes rather than scanning each unit of content individually. Lucene uses a predefined formula to score and rank each matched page or other content source. Administrators can influence how indexes are built from within the CMS. "Boosts" and "exclusions" are two tools that administrators can use to customize search results.

Note
Boosts and exclusions do not affect Keymatch searches.

Boosts

Boosting is a way to adjust the search relevance of InSite Search results. Boost values are set on fields and schemas before they are indexed, and they tune search results by changing components of the scoring formula. Boost values affect the calculation of the total document score when ranking documents that match search criteria. When you boost a field, content containing that field ranks higher than other matching content, provided all other factors are equal.

Configuring a boost value is an index-time event in InSite Search, which means that any change to a boost value requires search content to be re-indexed. It is recommended that you verify the effect of these changes on search results in a test environment before making them in a live production environment.

Scoring

A boost value is configured as a float and defaults to 1.0. In the score calculation formula, the boost number is multiplied by other numeric factors, so the default value of 1.0 leaves the score unchanged. When the boost value is set to a number greater than 1.0, the final score increases. You can view the tallied score value within the Score element of a search result's XML data.

Assuming no other influences, a field with a boost value of 2.0 will rank higher than a field with a boost value of 1.5. For example, page type Alpha has a boost value of 2.0 on its Title field, whereas page type Beta has a boost value of 1.5 on its Title field. A user runs a query with Product as the search term. Page instances built from both the Alpha and Beta schemas are returned, because the term Product is present in the Title fields of all returned pages. If all else is equal, pages of type Alpha will rank higher in the result set than pages of type Beta because of the higher boost value applied to the field.
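The Alpha and Beta example can be sketched in a few lines of Python. This is a toy model, not the formula InSite Search or Lucene actually uses: the function name boosted_score and the base score of 0.8 are assumptions made for illustration, and the real calculation combines more factors than a single multiplication.

```python
# Toy model only: the real score calculation has more factors than this.
DEFAULT_BOOST = 1.0


def boosted_score(base_score: float, boost: float = DEFAULT_BOOST) -> float:
    """Multiply a field's base relevance score by its boost value."""
    return base_score * boost


# Hypothetical base score for the term "Product" matching a Title field.
base = 0.8

alpha = boosted_score(base, boost=2.0)  # page type Alpha
beta = boosted_score(base, boost=1.5)   # page type Beta
unboosted = boosted_score(base)         # default boost leaves the score unchanged

scores = {"Beta": beta, "Alpha": alpha}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # prints ['Alpha', 'Beta']
```

Because every other input is identical, the only thing separating the two page types is the boost value, which is the "all else equal" condition described above.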

However, it would be an oversimplification to say that high boost values always result in high ranks within the result set. Keep in mind that the outcome also depends on factors such as how often the search term occurs in the boosted field. It is the total score produced by the calculation formula that determines the ultimate ranking of content within the result set.
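To see why a higher boost does not guarantee a higher rank, consider the following sketch. It assumes, purely for illustration, that the score is the square root of the term frequency multiplied by the boost. The function field_score and the sample numbers are hypothetical and do not reproduce the actual InSite Search formula.

```python
import math


def field_score(term_frequency: int, boost: float = 1.0) -> float:
    """Illustrative score: dampened term frequency multiplied by the boost."""
    return math.sqrt(term_frequency) * boost


# The search term occurs once in a field boosted to 2.0 ...
high_boost_page = field_score(term_frequency=1, boost=2.0)

# ... and four times in a field boosted to only 1.5.
lower_boost_page = field_score(term_frequency=4, boost=1.5)

print(high_boost_page, lower_boost_page)  # prints 2.0 3.0
```

In this sketch the page with the lower boost still scores higher, because the term occurs more often in its field.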

Note
While you can set the boost value anywhere from 1.0 through 4.0, it is recommended that you stay within 1.0 through 2.0, as higher values tend to introduce additional variance in results.

Ranking

You might increase a field's boost value because the default score algorithm does not accurately reflect the "relevance order" of content units. For example, suppose there are five pages, each with a Title field. Prior to setting a boost value, the page ranking order is [x1, x2, x3, x4, x5]. If you increase the Title field boost of x5 at index time, then for the same search, x5 will move up in the search results ranking.
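The five-page example can be sketched as follows. The base scores and the helper rank are invented for illustration. The only point is that multiplying one page's score by a boost greater than 1.0 can move it past pages that previously outranked it.

```python
# Hypothetical base scores that produce the order [x1, x2, x3, x4, x5].
base_scores = {"x1": 5.0, "x2": 4.0, "x3": 3.0, "x4": 2.0, "x5": 1.8}


def rank(scores, boosts=None):
    """Return page names ordered by boosted score, highest first."""
    boosts = boosts or {}
    final = {page: score * boosts.get(page, 1.0) for page, score in scores.items()}
    return sorted(final, key=final.get, reverse=True)


before = rank(base_scores)
after = rank(base_scores, boosts={"x5": 2.0})  # x5: 1.8 * 2.0 = 3.6

print(before)  # prints ['x1', 'x2', 'x3', 'x4', 'x5']
print(after)   # prints ['x1', 'x2', 'x5', 'x3', 'x4']
```

How far x5 climbs depends on the gap between its score and those of the pages above it, so a boost moves content up without guaranteeing the top position.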

Recap

Lucene search has three distinct phases:

  1. Lucene builds an index based on defined content.
  2. It finds documents that match the user's search criteria.
  3. It calculates the score for each search result and assigns a rank.
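The three phases can be sketched with a small in-memory index. This is a simplified stand-in, not the Lucene API: the sample pages, the field_boosts mapping, and the functions build_index, find_matches, and rank are all hypothetical. The boost is applied while the index is built, which mirrors why a boost change requires re-indexing.

```python
from collections import defaultdict

# Hypothetical content: page id -> {field name: text}
pages = {
    "p1": {"Title": "Product overview", "Body": "All about our services"},
    "p2": {"Title": "Contact us", "Body": "Ask about any product"},
    "p3": {"Title": "Careers", "Body": "Join the team"},
}
field_boosts = {"Title": 2.0}  # fields not listed use the default of 1.0


def build_index(pages, boosts):
    """Phase 1: map each term to the pages containing it, weighted by boost."""
    index = defaultdict(lambda: defaultdict(float))
    for page_id, fields in pages.items():
        for field, text in fields.items():
            boost = boosts.get(field, 1.0)
            for word in text.lower().split():
                index[word][page_id] += boost  # boost is stored in the index
    return index


def find_matches(index, term):
    """Phase 2: look the term up in the index instead of scanning each page."""
    return index.get(term.lower(), {})


def rank(matches):
    """Phase 3: order the matched pages by score, highest first."""
    return sorted(matches.items(), key=lambda item: item[1], reverse=True)


index = build_index(pages, field_boosts)
results = rank(find_matches(index, "Product"))
print(results)  # prints [('p1', 2.0), ('p2', 1.0)]
```

Both p1 and p2 contain the term once, but p1 ranks first because its match is in the boosted Title field. Changing field_boosts has no effect until build_index runs again.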

Boosting fields and schemas influences the overall score formula. Boost values greater than 1.0 push content that contains the boosted fields toward the top of search results.

Exclusions

In addition to boosting content, administrators can exclude fields and schema types from the search index to narrow the scope of user searches.
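As a rough sketch of the idea, the following code skips excluded schemas and fields while gathering text to index, so that excluded content can never match a search. The page data, the field name InternalNotes, the schema name Footer, and the function indexable_text are hypothetical and do not reflect how exclusions are configured in the CMS.

```python
# Hypothetical pages, each built from a schema and holding named fields.
pages = [
    {"id": "p1", "schema": "Article",
     "fields": {"Title": "Product news", "InternalNotes": "draft product copy"}},
    {"id": "p2", "schema": "Footer",
     "fields": {"Title": "Product links"}},
]
excluded_fields = {"InternalNotes"}
excluded_schemas = {"Footer"}


def indexable_text(pages, excluded_fields, excluded_schemas):
    """Collect the text that would be indexed, skipping excluded content."""
    entries = {}
    for page in pages:
        if page["schema"] in excluded_schemas:
            continue  # the whole page is left out of the index
        kept = [text for name, text in page["fields"].items()
                if name not in excluded_fields]
        entries[page["id"]] = " ".join(kept).lower()
    return entries


searchable = indexable_text(pages, excluded_fields, excluded_schemas)
hits = [page_id for page_id, text in searchable.items()
        if "product" in text.split()]
print(searchable)  # prints {'p1': 'product news'}
print(hits)        # prints ['p1']
```

The Footer page never appears in results, and the word "draft" in the excluded field cannot be matched, even though both exist in the content.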