CMS 10.6 Search.Config Reference

The primary functions of the Search.config file are to configure indexing behavior, to add index types by defining indexing sources, and to set other InSite Search configurations. Two types of indexing libraries can be defined within InSite Search:

Main Content Index: This index stores page content from either a published CMS site or a site crawl indexing source. You can add multiple instances of each type of indexing source, but the resulting content is always stored in the defined @indexLocation value.
Independent Indexes: These indexes consist of supplemental data that can be used on top of the main content index to enhance data results. Each independent index is stored in its own index location path on disk, as defined in its own separate configuration file.

Search.config Example

The example below displays generic code contained within Search.config.

Caution

The sample Search.config is only for reference and should not be used in place of the installed one.

Search.config example

<?xml version="1.0"?>
<configuration>
  <configSections>
    <section name="Search" type="Ingeniux.Search.Configuration.IndexingConfiguration, Ingeniux.Search" />
  </configSections>
  <Search indexLocation="App_Data\LuceneIndex" indexingEnabled="false" sideBySideDisabled="false" simpleQuery="false" indexingQueueCapacity="0" collectorMaxDocs="10000" indexSearchReaderCacheTimeout="15" queryMaxClauses="1024">
    <Hiliter startTag="&lt;strong&gt;" endTag="&lt;/strong&gt;" />
    <Settings>
      <!-- Search Facet Stats: List of facet/field names to provide stats for, comma delimited-->
      <add name="allowedFacetsForStatsProviders" value="" />
      <!-- Search Scoring: Ignore Inverted Document Frequency. When disabled, common words across index will rank lower-->
      <add name="IgnoreIdf" value="true" />
      <!-- Search Scoring as Indexing Time: When disabled, longer fields will rank lower-->
      <add name="IgnoreLengthNorm" value="true" />
      <!-- Number of characters limit on highlight fragments -->
      <add name="HighlightFragSizeLimit" value="0" />
      <!-- Using a dedicated field for highlight frags. When field doesn't exist, fallback to default frags mechanism-->
      <add name="DedicatedHighlightFragField" value="" />
      <!-- Whether to include references component content into page, default to false-->
      <add name="IncludeComponentContentInPage" value="false" />
      <!-- The spellchecker source to provide suggestion to include in search result query building-->
      <add name="IncludeSpellcheckSuggestionsInQueryBuilding" value="true" />
      <!-- The size limit (MB) for the index to allow for sbs. When index size is over this limit, sbs is disabled, 0 for unlimited -->
      <add name="SideBySideIndexingSideLimitMB" value="500" />
      <!-- The number of times to try and get the lucene lock before failing, defaults to 15 -->
      <!--<add name="MaxReaderFailureCount" value="15" />-->
      <!-- When set to true, no longer generate lucene messages logs to save disk space -->
      <!--<add name="OmitLuceneMessagesLog" value="true" />-->
      <!-- Specifies the location of file for listing the exclusive fields to construct term query.
					 When missing, use the default location at "Config" folder under site root -->
      <!--<add name="QueryFieldsFileLocation" value="F:\igxsites\search\Config\QueryFields.xml" />-->
      <!-- Specifies the location of file for listing the exclusive fields to construct highlight fragments.
					 When missing, use the default location at "Config" folder under site root -->
      <!--<add name="HiliteFieldsFileLocation" value="F:\igxsites\search\Config\HiliteFields.xml" />-->
      <!-- Specifies the location of file for listing the boosts by values of facets.
					 When missing, use the default location at "Config" folder under site root -->
      <!--<add name="DocumentBoostByFacetsFileLocation" value="F:\igxsites\search\Config\DocumentBoostByFacets.xml" />-->
      <!-- Stemming analyzers -->
      <!--<add name="defaultIndexingAnalyzer" value="Ingeniux.Search.Analyzers.StemmingIndexingAnalyzer, Ingeniux.Search" />
					<add name="defaultQueryAnalyzer" value="Ingeniux.Search.Analyzers.StemmingQueryAnalyzer, Ingeniux.Search" />-->
    </Settings>
    <IndexingSources>
      <!--<add name="CMSPublishedContent" type="Ingeniux.Runtime.Search.DssContentSearchSource" settingsFile="C:\PublishedContent\settings\SearchSource.config" />-->
      <!--<add name="SiteCrawl" type="Ingeniux.Search.HtmlSiteSource" settingsFile="App_Data\sitecrawlersource.config" />-->
	  <!--<add name="IntranetCartella" type="Cartella.Search.CartellaSearchSource" settingsFile="App_Data\cartellaSearchSource.config" />-->
      <!--<add name="SpellCheckDictionary" type="Ingeniux.Search.SpellCheckerSearchDocumentSource" settingsFile="App_Data\spellcheckerSource.config" />-->
      <!--<add name="Analytics" type="Ingeniux.Search.AnalyticsSearchDocumentSource" settingsFile="App_Data\analyticSource.config" />-->
      <!--<add name="Keymatch" type="Ingeniux.Search.KeyMatchSearchDocumentSource" settingsFile="App_Data\keymatchSource.config" />-->
    </IndexingSources>
    <SearchProfiles>
			<!--<add name="Intranet">
				<Sources>
					<add name="IntranetCartella" />
					<add name="IntranetPublicContent" />
					<add name="IntranetDirectory" />
				</Sources>
			</add>-->
    </SearchProfiles>
    <UniqueFields>
      <!--<add field="OriginalTitle" type="Ingeniux.Search.Analyzers.CaseSensitiveSearchIndexingAnalyzer" />-->
    </UniqueFields>
    <Synonyms>
      <!--Synonyms are deprecated from Search.config. Please find synonym definition in the file 'config\synonyms.xml'.-->
    </Synonyms>
  </Search>
</configuration>

Search Element Attributes

Configure <Search> element attributes as needed. Attributes include:

@indexLocation: The path to the Lucene index files.
Default value: App_Data\LuceneIndex
@indexingEnabled: If true, Lucene indexing is enabled.
Default value: false
@sideBySideDisabled: If false, side-by-side reindexing will be enabled. If true, side-by-side reindexing will be disabled.
Default value: false
If @sideBySideDisabled is not defined in Search.config, then side-by-side reindexing will be enabled.
@simpleQuery: If true, all indexable fields on a page are concatenated into a single Lucene field, which the query then executes against. For this setting to take effect, re-index your content.
Default value: false
Note
The @simpleQuery setting makes the query much leaner because it only executes searches against a single field. Once this setting is enabled, custom search implementations cannot query against individual fields as search facets.
@synonymsLocation: This value indicates the absolute path to the external Synonyms.xml file on your DSS server. If this attribute is undefined, then the default location of synonyms.xml depends on your InSite Search version.
Before specifying a path, consider your InSite Search version and synonym configuration needs. See the Synonyms section for details.
@indexingQueueCapacity: When configured with a value, indicates the limit on the number of batch operations to queue for processing.
@collectorMaxDocs: Value indicates the upper limit on how many search items to return in a search. This attribute controls the number of search results, not the number of items indexed.
Default value: 10000
@indexSearchReaderCacheTimeout: Time in seconds before the index reader times out between two requests.
Default value: 15
@queryMaxClauses: This integer value determines the maximum size of the returned result set. Use this setting for performance tuning, mainly on sites that return large result sets.
@optimizeSchedule: Version Notes: InSite Search 2.14+
Value schedules Lucene optimization operations for InSite Search.
To add a single time value, you can choose from multiple time formats. For example, valid formats include 2:00pm, 14:00, 14:00:00, and 14:15:07.0000-07:00. To add more than one time value, use the cron format. For example, entering 0 0 2,14 * * ? will run the Lucene Optimize operation daily at 2am and 2pm.
See Scheduling Lucene Optimization Operations for details.
@optimizeTimeout: Version Notes: InSite Search 2.14+
Value determines a cutoff time to prevent the Lucene optimizer from running. For large search datasets, this setting helps to prevent the optimizer from running during a certain time of day. This attribute only accepts one time value.
You can choose from multiple time formats. For example, valid formats include 2:00pm, 14:00, 14:00:00, and 14:15:07.0000-07:00. This value doesn't accept cron formatting.
See Scheduling Lucene Optimization Operations for details.
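As a rough illustration of how the attributes above combine, the sketch below sets several of them on the opening <Search> tag. Most values are copied from the sample Search.config, except that indexing is switched on. The @optimizeSchedule cron expression is the one quoted above. The @optimizeTimeout value of 06:00 is only an assumed cutoff written in the same style as 14:00, so adjust both to your environment.

```xml
<!-- Sketch only: optimizeTimeout="06:00" is an assumed value, not a documented default -->
<Search indexLocation="App_Data\LuceneIndex"
        indexingEnabled="true"
        sideBySideDisabled="false"
        simpleQuery="false"
        collectorMaxDocs="10000"
        indexSearchReaderCacheTimeout="15"
        queryMaxClauses="1024"
        optimizeSchedule="0 0 2,14 * * ?"
        optimizeTimeout="06:00">
  <!-- Hiliter, Settings, IndexingSources, and other child elements go here -->
</Search>
```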

Hiliter Element Attributes

Configure the <Hiliter> element attributes within <Search> as needed. Attributes include:

@startTag: This start tag encapsulates (highlights) the search term within the returned matching fragment(s) from a search result. The default value is the opening strong tag.
@endTag: This end tag encapsulates the search term within the returned matching fragment(s) from a search result. The default value is the closing strong tag.
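For instance, to wrap matched terms in a different element, both attributes would change together. The sketch below assumes an HTML mark element purely for illustration. As in the sample Search.config, the angle brackets are written as character entities so that the attribute values remain valid XML.

```xml
<!-- Assumption: the front end styles <mark>; the sample file uses <strong> -->
<Hiliter startTag="&lt;mark&gt;" endTag="&lt;/mark&gt;" />
```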

Settings Element Attributes

Configure settings as needed via the <add> elements within <Settings>, where the @name attribute specifies the setting name and @value specifies the setting value. For example:

<add name="IgnoreIdf" value="true" />

Settings include:

allowedFacetsForStatsProviders: Comma-delimited list of facet/field names for which to provide statistics.
IgnoreIdf: If true, ignores inverse document frequency. If false, words that are common across the index rank lower.
Default value: true
IgnoreLengthNorm: If true, ignores field length normalization, which is applied to search scoring at indexing time. If false, longer fields rank lower.
Default value: true
HighlightFragSizeLimit: Number indicates the total character limit on the highlight fragment element. If 0, then no limit.
Default value: 0
DedicatedHighlightFragField: Dedicated field for highlight fragments. When the field doesn't exist, InSite Search falls back to the default fragment mechanism (i.e., Dedicated field > HiliteFields.xml > Default highlighting of all fields).
IncludeComponentContentInPage: If true, InSite Search includes the content of referenced components in the page. If false, which is the default, content within components will not be indexed.
Default value: false
IncludeSpellcheckSuggestionsInQueryBuilding: If true, the spellchecker source provides suggestions to include in building search result queries. If false, the spellchecker source will not provide suggestions to include in building search result queries.
Default value: true
QueryFieldsFileLocation: This setting specifies the location of the file that lists the exclusive fields used to construct term queries. When undefined, the default location is Config\QueryFields.xml under the DSS project root.
HiliteFieldsFileLocation: This setting specifies the location of the file that lists the exclusive fields used to construct highlight fragments. When undefined, the default location is Config\HiliteFields.xml under the site root.
DocumentBoostByFacetsFileLocation: This setting specifies the location of the file that lists the boosts by values of facets. When undefined, the default location is Config\DocumentBoostByFacets.xml under the site root.
indexSearchReaderCacheTimeout: This value, in seconds, is used to improve indexing performance while the site is under a heavy search load. The longer the timeout, the longer the reader is reused, which improves performance at the cost of including outdated search content within that time range.
Default value: 15
SideBySideIndexingSideLimitMB: The size limit (MB) of the index to allow for side-by-side indexing. When the index size is over this limit, side-by-side indexing is disabled. If 0, then unlimited size.
Default value: 500
OmitLuceneMessagesLog: When set to true, the system will no longer generate Lucene messages logs in order to save disk space.
Default value: true
defaultIndexingAnalyzer: This setting must be enabled in tandem with defaultQueryAnalyzer for the stemming analyzer to work properly.
defaultQueryAnalyzer: This setting must be enabled in tandem with defaultIndexingAnalyzer for the stemming analyzer to work properly.
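A minimal sketch of a <Settings> block that enables stemming might look like the following. The two analyzer entries are the ones that appear commented out in the sample Search.config and are enabled together here. The HighlightFragSizeLimit value of 200 is an assumed number chosen only to show a non-default setting.

```xml
<Settings>
  <!-- Assumed value for illustration: limit highlight fragments to 200 characters (0 means no limit) -->
  <add name="HighlightFragSizeLimit" value="200" />
  <!-- Stemming: both analyzers are enabled in tandem, as in the sample file's commented entries -->
  <add name="defaultIndexingAnalyzer" value="Ingeniux.Search.Analyzers.StemmingIndexingAnalyzer, Ingeniux.Search" />
  <add name="defaultQueryAnalyzer" value="Ingeniux.Search.Analyzers.StemmingQueryAnalyzer, Ingeniux.Search" />
</Settings>
```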

Indexing Source Element Attributes

Configure indexing sources as needed via the <add> elements within <IndexingSources>, where the @name attribute specifies the source name, @type specifies the index library class, and @settingsFile specifies the source configuration file. For example:

<add name="CMSPublishedContent" type="Ingeniux.Runtime.Search.DssContentSearchSource" settingsFile="C:\PublishedContent\settings\SearchSource.config" />

Each indexing source <add> element contains the following attributes:

@name: Any arbitrary, unique name for the indexing source.
@type: Search index class name.
Note
Search.config must have at least one main content index configured. Refer to the Index Library Classes table below for values.
@settingsFile: Filepath to source configuration file. The value must use an absolute path to SearchSource.config.

Index Library Class	Index Type	Notes
Ingeniux.Runtime.Search.DssContentSearchSource	Main	This is for indexing (XML) CMS content, rendered via the DSS. It is not intended for Multi-Format Output (MFO) published content.
Ingeniux.Search.HtmlSiteSource	Main	This is used to crawl a URL endpoint in a similar way to a traditional web crawler.
Ingeniux.Search.AnalyticsSearchDocumentSource	Independent	When enabled, this creates a separate index that tracks queries made to the InSite Search instance.
Ingeniux.Search.KeyMatchSearchDocumentSource	Independent	This creates an index used for Keymatch functionality. This index is based on a list of keywords as input to the index.
Ingeniux.Search.SpellCheckerSearchDocumentSource	Independent	This index is used for search suggest functionality. A list of terms needs to be provided for this index to populate.
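To show how a main content index and an independent index can sit side by side, the sketch below enables one of each. Both entries are taken from the commented-out lines in the sample Search.config. The names and file paths are the sample's placeholders and would need to match your own installation.

```xml
<IndexingSources>
  <!-- Main content index: published CMS content rendered via the DSS -->
  <add name="CMSPublishedContent" type="Ingeniux.Runtime.Search.DssContentSearchSource" settingsFile="C:\PublishedContent\settings\SearchSource.config" />
  <!-- Independent index: Keymatch data stored in its own index location -->
  <add name="Keymatch" type="Ingeniux.Search.KeyMatchSearchDocumentSource" settingsFile="App_Data\keymatchSource.config" />
</IndexingSources>
```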

Search Profiles

When Search.config defines search profiles, InSite Search will only fetch results from sources specified within the profiles.

The direct child <add> elements of <SearchProfiles> specify each individual search profile. The @name attribute specifies the site name of the search profile.

Under the site name of the search profile, the direct child <add> elements of <Sources> specify each source included within the search profile. The @name attribute specifies the source name.

For example:

<SearchProfiles>
  <add name="Independent-search-profile-1">
    <Sources>
      <add name="KeyMatch"/>
      <add name="SpellCheckDictionary"/>
      <add name="analytics"/>
    </Sources>
  </add>
</SearchProfiles>
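The sketch below places a profile next to the indexing source it draws from. It assumes that a source name inside a profile refers to the @name of an entry in <IndexingSources>, which is how the IntranetCartella name is used in the sample Search.config. Treat that pairing as an assumption to verify against your installation.

```xml
<Search indexLocation="App_Data\LuceneIndex" indexingEnabled="true">
  <IndexingSources>
    <!-- Source definition copied from the sample file's commented entries -->
    <add name="IntranetCartella" type="Cartella.Search.CartellaSearchSource" settingsFile="App_Data\cartellaSearchSource.config" />
  </IndexingSources>
  <SearchProfiles>
    <add name="Intranet">
      <Sources>
        <!-- Assumed to match the @name of the indexing source above -->
        <add name="IntranetCartella" />
      </Sources>
    </add>
  </SearchProfiles>
</Search>
```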

Unique Fields

Configure unique fields for InSite Search analyzers. If your site uses a custom indexer, you can use the <UniqueFields> element to add unique fields for search analysis.

For example:

<UniqueFields>
  <add field="OriginalTitle" type="Ingeniux.Search.Analyzers.CaseSensitiveSearchIndexingAnalyzer" />
</UniqueFields>

For a unique field entry, <add> attributes include:

@field: Specify the original title of the unique field.
@type: Define the search analyzer type in this field. For example:
Ingeniux.Search.Analyzers.CaseSensitiveSearchIndexingAnalyzer

Synonyms

Synonyms can be configured directly in the CMS 10.6 UI. Users with administrator permissions can modify synonyms in Administration > InSite Search Configuration > Synonyms. When content is published and replicated to the DSS, the synonym configurations are replicated to Settings/synonyms.xml in the DSS published content path.

However, if you want to configure synonyms in synonyms.xml outside the CMS UI, you can do so.

Synonym configuration differs depending on your InSite Search version and configuration preferences. Follow the section that corresponds with your version.

Note

If your organization configured the synonyms.xml file for a previous Ingeniux CMS version, administrators can import this file into CMS 10.6. Once imported, your synonyms display in the Custom Synonyms tab. See CMS 10.6 Importing Synonyms for details.

InSite Search 2.14

In InSite Search 2.14, the <Synonyms> element is deprecated in Search.config. Instead, synonym configuration resides in a separate configuration XML file, synonyms.xml. If @synonymsLocation is not defined in Search.config, then InSite Search references synonyms.xml in the config folder of the DSS root directory by default.

Keep in mind that, to configure synonyms directly from the CMS UI, InSite Search must reference Settings\synonyms.xml in the DSS published content directory. Otherwise, UI synonym configurations will not carry over to the published content.

If you are using InSite Search 2.14 with CMS 10.6 synonyms configured in the UI, change the default location from config\synonyms.xml to Settings\synonyms.xml. Add the absolute path of Settings\synonyms.xml to @synonymsLocation in Search.config.
If you want to continue configuring synonyms in config\synonyms.xml without using the CMS UI, then the @synonymsLocation attribute can remain undefined in Search.config. The config\synonyms.xml path is the default synonym location in InSite Search 2.14.
If you want to store synonym configurations in a different location than config\synonyms.xml, add the absolute path of synonyms.xml to @synonymsLocation in Search.config. See Referencing External Synonyms for details.
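For the first case above, the attribute might be set as in the following sketch. The C:\PublishedContent directory is a hypothetical published content path borrowed from the sample Search.config, so substitute the absolute path used on your DSS server.

```xml
<!-- Hypothetical path: replace C:\PublishedContent with your DSS published content directory -->
<Search indexLocation="App_Data\LuceneIndex"
        indexingEnabled="true"
        synonymsLocation="C:\PublishedContent\Settings\synonyms.xml">
  <!-- Child elements omitted for brevity -->
</Search>
```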
