CMS 10.1–10.5 Search.Config Reference


The primary functions of the Search.config file are to configure indexing behavior, to add index types by defining indexing sources, and to set other InSite Search configurations. Two types of indexing libraries can be defined within InSite Search:

Main Content Index
This index stores page content from either a published CMS site or a site crawl indexing source. You can add multiple instances of each type of indexing source, but the resulting content is always stored in the location defined by the @indexLocation value.
Independent Indexes
These indexes consist of supplemental data that can be used on top of the main content index to enhance search results. Each independent index is stored in its own index location path on disk, as defined in its own separate configuration file.

Search.config Example

The example below displays generic code contained within Search.config.

Caution
The sample Search.config is only for reference and should not be used in place of the installed one.
Search.config example
<?xml version="1.0"?>
<configuration>
    <configSections>
        <section name="Search"
            type="Ingeniux.Search.Configuration.IndexingConfiguration, Ingeniux.Search"/>
    </configSections>
    <Search indexLocation="App_Data\LuceneIndex"
        synonymsLocation="[Drive]:\[path to DSS root directory]\published\iss-config\Synonyms.xml"
        indexingEnabled="true" queryMaxClauses="1024">
        <Hiliter startTag="&lt;strong&gt;" endTag="&lt;/strong&gt;"/>
        <Settings>
            <add name="defaultIndexingAnalyzer"
                value="Ingeniux.Search.Analyzers.StemmingIndexingAnalyzer, Ingeniux.Search"/>
            <add name="defaultQueryAnalyzer"
                value="Ingeniux.Search.Analyzers.StemmingQueryAnalyzer, Ingeniux.Search"/>
            <add name="QueryFieldsFileLocation"
                value="[Drive]:\[path to DSS root directory]\[subfolder(s)]\QueryFields.xml"/>
            <add name="DocumentBoostByFacetsFileLocation"
                value="App_Data\DocumentBoostByFacetsFileLocation.xml"/>
            <add name="GSearchFieldMapping"
                value="[Drive]:\[path to DSS root directory]\[subfolder(s)]\GSearchFieldMapping.xml"/>
        </Settings>
        <IndexingSources>
            <add name="CMSPublishedContent" type="Ingeniux.Runtime.Search.DssContentSearchSource"
                settingsFile="[Drive]:\[path to DSS root directory]\settings\SearchSource.config"/>
            <add name="KeyMatch" type="Ingeniux.Search.KeyMatchSearchDocumentSource"
                settingsFile="App_Data\KeymatchSource.config"/>
            <add name="SpellCheckDictionary" type="Ingeniux.Search.SpellCheckerSearchDocumentSource"
                settingsFile="App_Data\spellcheckerSource.config"/>
            <add name="SiteCrawlerSource" type="Ingeniux.Search.HtmlSiteSource"
                settingsFile="App_Data\sitecrawlerSource.config"/>
            <add name="Analytics" type="Ingeniux.Search.AnalyticsSearchDocumentSource"
                settingsFile="App_Data\analyticSource.config"/>        
        </IndexingSources>
        <SearchProfiles>
            <add name="Independent-search-profile-1">
                <Sources>
                    <add name="KeyMatch"/>
                    <add name="SpellCheckDictionary"/>
                    <add name="Analytics"/>
                </Sources>
            </add>
        </SearchProfiles>
    </Search>
</configuration>
        

Search Element Attributes

Configure <Search> element attributes as needed. Attributes include:

@indexLocation
The path to the Lucene index files.

Default value: App_Data\LuceneIndex

@indexingEnabled
If true, Lucene indexing is enabled.

Default value: false

@sideBySideDisabled
If false, side-by-side reindexing will be enabled. If true, side-by-side reindexing will be disabled.

Default value: false

If @sideBySideDisabled is not defined in Search.config, then side-by-side reindexing will be enabled.
@simpleQuery
If true, all indexable fields on a page are concatenated into a single Lucene field, which the query then executes against. For this setting to take effect, you must reindex your content.

Default value: false

Note
The @simpleQuery setting makes the query much leaner because it only executes searches against a single field. Once this setting is enabled, custom search implementations cannot query against individual fields as search facets.
@synonymsLocation
This value indicates the absolute path to the external Synonyms.xml file on your DSS server.

Default value: If this attribute is not defined, then the default location of synonyms.xml is the config folder in the root of the DSS project.

Before specifying a path, consider your InSite Search version and synonym configuration needs. See the Synonyms section for details.

Version Notes: InSite Search 2.9.8–2.10.14
In InSite Search 2.9.8–2.10.14, @synonymsLocation is not applicable, as synonyms are configured directly in Search.config.
@indexingQueueCapacity
When configured with a value, this attribute limits the number of batch operations that can be queued for processing.
@collectorMaxDocs
The maximum number of search items to return in a search. This attribute limits the number of search results, not the number of indexed items.

Default value: 10000

@indexSearchReaderCacheTimeout
Time in seconds before the index reader times out between two requests.

Default value: 15

@queryMaxClauses
This integer value determines the maximum size of the returned result set. Use this setting for performance tuning, mainly on sites that return large result sets.
@optimizeSchedule
Version Notes: InSite Search 2.14+
Value schedules Lucene optimization operations for InSite Search.

To add a single time value, you can choose from multiple time formats. For example, valid formats include 2:00pm, 14:00, 14:00:00, and 14:15:07.0000-07:00. To add more than one time value, use the cron format. For example, entering 0 0 2,14 * * ? will run the Lucene Optimize operation daily at 2am and 2pm.

See Scheduling Lucene Optimization Operations for details.

@optimizeTimeout
Version Notes: InSite Search 2.14+
Value determines a cutoff time to prevent the Lucene optimizer from running. For large search datasets, this setting helps to prevent the optimizer from running during a certain time of day. This attribute only accepts one time value.

You can choose from multiple time formats. For example, valid formats include 2:00pm, 14:00, 14:00:00, and 14:15:07.0000-07:00. This value doesn't accept cron formatting.

See Scheduling Lucene Optimization Operations for details.
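
As a rough illustration of how the attributes above combine, the following sketch shows a <Search> element that sets several of them at once. The values shown (5000, 30, the cron expression, and 6:00am) are assumptions chosen only for illustration, not recommendations, and @optimizeSchedule and @optimizeTimeout apply only to InSite Search 2.14+.

```xml
<!-- Sketch only: attribute values are illustrative assumptions. -->
<Search indexLocation="App_Data\LuceneIndex"
    indexingEnabled="true"
    sideBySideDisabled="false"
    simpleQuery="false"
    collectorMaxDocs="5000"
    indexSearchReaderCacheTimeout="30"
    optimizeSchedule="0 0 2,14 * * ?"
    optimizeTimeout="6:00am">
    <!-- Hiliter, Settings, IndexingSources, and SearchProfiles omitted for brevity -->
</Search>
```

In this sketch, the cron expression schedules optimization daily at 2am and 2pm, while @optimizeTimeout takes a single time value in one of the plain time formats.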

Hiliter Element Attributes

Configure the <Hiliter> element attributes within <Search> as needed. Attributes include:

@startTag
This start tag encapsulates (highlights) the search term within the returned matching fragment(s) from a search result. The default value is the opening strong tag.
@endTag
This end tag encapsulates the search term within the returned matching fragment(s) from a search result. The default value is the closing strong tag.
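
Because the tag values live inside XML attributes, any markup must be escaped, as the strong tags are in the Search.config example. The sketch below swaps in a mark element instead. The choice of mark is an assumption for illustration only.

```xml
<!-- Sketch only: wraps each matched search term in <mark>...</mark>. -->
<Hiliter startTag="&lt;mark&gt;" endTag="&lt;/mark&gt;"/>
```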

Settings Element Attributes

Configure settings as needed via the <add> elements within <Settings>, where the @name attribute specifies the setting name and @value specifies the setting value. For example:

<add name="IgnoreIdf" value="true" />

Settings include:

allowedFacetsforStatsProviders
Comma-delimited list of facet/field names for statistics.
IgnoreIdf
If true, InSite Search ignores inverse document frequency. If false, words that are common across the index rank lower.

Default value: true

IgnoreLengthNorm
If true, field length is ignored in search scoring. If false, longer fields rank lower.

Default value: true

HighlightFragSizeLimit
The maximum number of characters in a highlight fragment. If 0, there is no limit.

Default value: 0

DedicatedHighlightFragField
The dedicated field for highlight fragments. When this field doesn't exist, InSite Search falls back to the default fragment mechanism, in this order of precedence: dedicated field, then HiliteFields.xml, then default highlighting of all fields.
IncludeComponentContentInPage
If true, InSite Search references component content. If false, which is the default, content within components will not be indexed.

Default value: false

IncludeSpellcheckSuggestionsInQueryBuilding
If true, the spellchecker source provides suggestions to include in building search result queries. If false, the spellchecker source will not provide suggestions to include in building search result queries.

Default value: true

QueryFieldsFileLocation
This setting specifies the location of the file that lists the exclusive fields used to construct the term query. When undefined, the default location is Config\QueryFields.xml under the DSS project root.
HiliteFieldsFileLocation
This setting specifies the location of the file that lists the exclusive fields used to construct highlight fragments. When undefined, the default location is Config\HiliteFields.xml under the site root.
DocumentBoostByFacetsFileLocation
This setting specifies the location of the file that lists the boosts by facet values. When undefined, the default location is Config\DocumentBoostByFacets.xml under the site root.
indexSearchReaderCacheTimeout
This value, in seconds, is used to improve indexing performance while the site is under a heavy search load. The longer the timeout, the longer the index reader is reused. This improves performance at the cost of possibly returning outdated search content within that time range.

Default value: 15

SideBySideIndexingSideLimitMB
The size limit (MB) of the index to allow for side-by-side indexing. When the index size is over this limit, side-by-side indexing is disabled. If 0, then unlimited size.

Default value: 500

OmitLuceneMessagesLog
When set to true, the system will no longer generate Lucene messages logs in order to save disk space.

Default value: true

defaultIndexingAnalyzer
This setting must be configured in tandem with defaultQueryAnalyzer for the stemming analyzer to work properly.
defaultQueryAnalyzer
This setting must be configured in tandem with defaultIndexingAnalyzer for the stemming analyzer to work properly.
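
The settings described above might be combined as in the following sketch. The analyzer values are taken from the Search.config example. The numeric and Boolean values (200, true, 1000) are assumptions for illustration only.

```xml
<Settings>
    <!-- Stemming: the indexing and query analyzers are configured together. -->
    <add name="defaultIndexingAnalyzer"
        value="Ingeniux.Search.Analyzers.StemmingIndexingAnalyzer, Ingeniux.Search"/>
    <add name="defaultQueryAnalyzer"
        value="Ingeniux.Search.Analyzers.StemmingQueryAnalyzer, Ingeniux.Search"/>
    <!-- Illustrative value: cap highlight fragments at 200 characters. -->
    <add name="HighlightFragSizeLimit" value="200"/>
    <!-- Illustrative value: index content within components. -->
    <add name="IncludeComponentContentInPage" value="true"/>
    <!-- Illustrative value: allow side-by-side indexing for indexes up to 1000 MB. -->
    <add name="SideBySideIndexingSideLimitMB" value="1000"/>
</Settings>
```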

Indexing Source Element Attributes

Configure indexing sources as needed via the <add> elements within <IndexingSources>, where the @name attribute specifies the source name, @type specifies the index library class, and @settingsFile specifies the source configuration file. For example:

<add name="CMSPublishedContent" type="Ingeniux.Runtime.Search.DssContentSearchSource" settingsFile="C:\PublishedContent\settings\SearchSource.config" />

Each indexing source <add> element contains the following attributes:

@name
Any arbitrary, unique name for the indexing source.
@type
Search index class name.
Note
Search.config must have at least one main content index configured. Refer to the Index Library Classes table below for values.
@settingsFile
Filepath to source configuration file. The value must use an absolute path to SearchSource.config.
Index Library Class | Index Type | Notes
--- | --- | ---
Ingeniux.Runtime.Search.DssContentSearchSource | Main | This is for indexing (XML) CMS content, rendered via the DSS. It is not intended for Multi-Format Output (MFO) published content.
Ingeniux.Search.HtmlSiteSource | Main | This is used for crawling a URL endpoint in a way similar to a traditional web crawler.
Ingeniux.Search.AnalyticsSearchDocumentSource | Independent | When enabled, this creates a separate index that tracks queries made to the InSite Search instance.
Ingeniux.Search.KeyMatchSearchDocumentSource | Independent | This creates an index used for KeyMatch functionality. This index is based on a list of keywords provided as input.
Ingeniux.Search.SpellCheckerSearchDocumentSource | Independent | This index is used for search suggest functionality. A list of terms must be provided to populate this index.
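
To show how main and independent index types can sit side by side, the sketch below reuses entries from the Search.config example: two main content sources, whose content is stored in the @indexLocation index, plus one independent source. The bracketed path segments are placeholders to replace with values for your own environment.

```xml
<IndexingSources>
    <!-- Main content index sources: both store content in the @indexLocation index. -->
    <add name="CMSPublishedContent" type="Ingeniux.Runtime.Search.DssContentSearchSource"
        settingsFile="[Drive]:\[path to DSS root directory]\settings\SearchSource.config"/>
    <add name="SiteCrawlerSource" type="Ingeniux.Search.HtmlSiteSource"
        settingsFile="App_Data\sitecrawlerSource.config"/>
    <!-- Independent index: stored in its own location, defined in its settings file. -->
    <add name="KeyMatch" type="Ingeniux.Search.KeyMatchSearchDocumentSource"
        settingsFile="App_Data\KeymatchSource.config"/>
</IndexingSources>
```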

Search Profiles

When Search.config defines search profiles, InSite Search will only fetch results from sources specified within the profiles.

The direct child <add> elements of <SearchProfiles> specify each individual search profile. The @name attribute specifies the site name of the search profile.

Under the site name of the search profile, the direct child <add> elements of <Sources> specify each source included within the search profile. The @name attribute specifies the source name.

For example:

<SearchProfiles>
  <add name="Independent-search-profile-1">
    <Sources>
      <add name="KeyMatch"/>
      <add name="SpellCheckDictionary"/>
      <add name="Analytics"/>
    </Sources>
  </add>
</SearchProfiles>

Unique Fields

Configure unique fields for InSite Search analyzers. If your site uses a custom indexer, you can use the <UniqueFields> element to add additional unique fields for search analysis.

For example:

<UniqueFields>
  <add field="OriginalTitle" type="Ingeniux.Search.Analyzers.CaseSensitiveSearchIndexingAnalyzer" />
</UniqueFields>

For a unique field entry, <add> attributes include:

@field
Specify the name of the unique field (for example, OriginalTitle).
@type
Define the search analyzer type for this field. For example:

Ingeniux.Search.Analyzers.CaseSensitiveSearchIndexingAnalyzer

Synonyms

Synonym configuration differs depending on your CMS version and your InSite Search version. Follow the sections relevant to your implementation.

InSite Search 2.11+

In InSite Search 2.11+, the <Synonyms> element in Search.config is deprecated. Synonym configuration now resides in a separate XML file, synonyms.xml. Existing synonym entries are migrated to this new format on upgrade to InSite Search 2.11.23+. The path to this file can be relative to the DSS application or an absolute path to a file on the server. Alternatively, users can manage synonyms.xml as an asset to be updated and published in the CMS.

By default, synonyms.xml resides in the config folder of the DSS root directory. However, if you want to move the synonyms.xml outside this location, ensure you provide the absolute path to synonyms.xml in the synonymsLocation attribute of Search.config.

See Referencing External Synonyms for details to reference synonyms.xml from a different location than the DSS config folder.
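
For orientation, a <Search> element that points to an external synonyms file might look like the following sketch. The path is the placeholder used in the Search.config example, so replace the bracketed segments with values for your own server.

```xml
<!-- Sketch only: @synonymsLocation applies to InSite Search 2.11+. -->
<Search indexLocation="App_Data\LuceneIndex"
    synonymsLocation="[Drive]:\[path to DSS root directory]\published\iss-config\Synonyms.xml"
    indexingEnabled="true">
    <!-- Child elements omitted for brevity -->
</Search>
```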

Note
If you are upgrading to InSite Search 2.11+ and have synonyms configured in Search.config from an earlier InSite Search release, see Migrating Synonym Configuration for details.

InSite Search 2.9–2.10

In InSite Search 2.9.8–2.10.14, synonyms are configured in Search.config. You can add synonyms by adding an <S> element under <Synonyms> and then adding synonyms delimited by the comma character (,) to the @words attribute.

For example, your configured synonyms will look similar to this:

<Synonyms>
   <S words="organization,institution,company,firm,corporation"/>
   <S words="cat,kitty,feline,lion"/>
</Synonyms>
Note
For the sample configuration above, if the search term entered by an end user is corporation, then results are returned for all matches of organization, institution, company, firm and/or corporation in the indexed content.