CMS 10.1–10.5 Search.Config Reference
The primary functions of the Search.config file are to configure indexing behavior, to add index types by defining indexing sources, and to set other InSite Search configurations. It's important to note the two types of indexing libraries that can be defined within InSite Search:
- Main Content Index
- This index stores page content from either a published CMS site or a site crawl indexing source. You can add multiple instances of each type of indexing source, but the resulting content is always stored in the location defined by the @indexLocation value.
- Independent Indexes
- These indexes consist of supplemental data that can be used on top of the main content index to enhance search results. Each independent index is stored in its own index location on disk, as defined in its own separate configuration file.
Search.config Example
The example below displays generic code contained within Search.config.
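As a minimal sketch, a Search.config assembled only from the elements and attributes described in this reference might look like the following. The nesting order and the attribute values are assumptions for illustration, not a shipped default; compare against the Search.config in your own DSS project before reusing any of it.

```xml
<!-- Illustrative sketch only: nesting and values are assumptions, not a shipped default. -->
<Search indexLocation="App_Data\LuceneIndex" indexingEnabled="true">
  <!-- Highlight tags are XML-escaped because they sit inside attribute values. -->
  <Hiliter startTag="&lt;strong&gt;" endTag="&lt;/strong&gt;" />
  <Settings>
    <add name="IgnoreIdf" value="true" />
  </Settings>
  <IndexingSources>
    <!-- At least one main content index must be configured. -->
    <add name="CMSPublishedContent"
         type="Ingeniux.Runtime.Search.DssContentSearchSource"
         settingsFile="C:\PublishedContent\settings\SearchSource.config" />
  </IndexingSources>
</Search>
```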
Search Element Attributes
Configure <Search> element attributes as needed. Attributes include:
@indexLocation
- The path to the Lucene index files.
Default value: App_Data\LuceneIndex
@indexingEnabled
- If true, Lucene indexing is enabled.
Default value: false
@sideBySideDisabled
- If false, side-by-side reindexing will be enabled. If true, side-by-side reindexing will be disabled.
Default value: false
If @sideBySideDisabled is not defined in Search.config, then side-by-side reindexing will be enabled.
@simpleQuery
- If true, all indexable fields on a page are concatenated into a single Lucene field, which the query will then execute against. For this setting to take effect, ensure that you reindex content.
Default value: false
Note: The @simpleQuery setting makes the query much leaner because it only executes searches against a single field. Once this setting is enabled, custom search implementations cannot query against individual fields as search facets.
@synonymsLocation
- This value indicates the absolute path to the external Synonyms.xml file on your DSS server.
Default value: If this attribute is not defined, then the default location of synonyms.xml is the config folder in the root of the DSS project.
Before specifying a path, consider your InSite Search version and synonym configuration needs. See the Synonyms section for details.
Version Notes: InSite Search 2.9.8–2.10.14
In InSite Search 2.9.8–2.10.14, @synonymsLocation is not applicable, as synonyms are configured directly in Search.config.
@indexingQueueCapacity
- When set, this value limits the number of batch operations that can be queued for processing.
@collectorMaxDocs
- The maximum number of search items to return in a search. This attribute controls the number of search results, not the number of items indexed.
Default value: 10000
@indexSearchReaderCacheTimeout
- Time in seconds before the index reader times out between two requests.
Default value: 15
@queryMaxClauses
- This integer value determines the maximum size of the returned result set. This setting is used for performance tuning, mainly for sites with large result sets.
@optimizeSchedule
- Version Notes: InSite Search 2.14+
This value schedules Lucene optimization operations for InSite Search.
To add a single time value, you can choose from multiple time formats. For example, valid formats include 2:00pm, 14:00, 14:00:00, and 14:15:07.0000-07:00. To add more than one time value, use the cron format. For example, entering 0 0 2,14 * * ? will run the Lucene Optimize operation daily at 2am and 2pm.
See Scheduling Lucene Optimization Operations for details.
@optimizeTimeout
- Version Notes: InSite Search 2.14+
This value determines a cutoff time to prevent the Lucene optimizer from running. For large search datasets, this setting helps to prevent the optimizer from running during a certain time of day. This attribute only accepts one time value.
You can choose from multiple time formats. For example, valid formats include 2:00pm, 14:00, 14:00:00, and 14:15:07.0000-07:00. This value doesn't accept cron formatting.
See Scheduling Lucene Optimization Operations for details.
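The scheduling attributes above could be combined as in the sketch below. The cron expression is the one quoted in this section; the @optimizeTimeout value is an arbitrary illustration in one of the accepted single-time formats, and how the cutoff interacts with the schedule is covered in Scheduling Lucene Optimization Operations.

```xml
<!-- Sketch: optimize daily at 2am and 2pm (cron format), with a single cutoff time. -->
<Search indexLocation="App_Data\LuceneIndex"
        indexingEnabled="true"
        optimizeSchedule="0 0 2,14 * * ?"
        optimizeTimeout="16:00">
  <!-- Child elements omitted for brevity. -->
</Search>
```

Because @optimizeTimeout does not accept cron formatting, only @optimizeSchedule carries a cron expression here.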
Hiliter Element Attributes
Configure the <Hiliter> element attributes within <Search> as needed. Attributes include:
@startTag
- This start tag encapsulates (highlights) the search term within the returned matching fragment(s) from a search result. The default value is the opening strong tag.
@endTag
- This end tag encapsulates the search term within the returned matching fragment(s) from a search result. The default value is the closing strong tag.
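As a sketch, swapping the default strong tags for mark tags might look like this. The choice of mark is an assumption for illustration; the angle brackets are escaped because XML does not allow a literal opening angle bracket inside an attribute value.

```xml
<!-- Sketch: wrap matched search terms in <mark>...</mark> instead of the default strong tags. -->
<Hiliter startTag="&lt;mark&gt;" endTag="&lt;/mark&gt;" />
```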
Settings Element Attributes
Configure settings as needed via the <add> elements within <Settings>, where the @name attribute specifies the setting name and @value specifies the setting value. For example:
<add name="IgnoreIdf" value="true" />
Settings include:
- allowedFacetsforStatsProviders
- Comma-delimited list of facet/field names for statistics.
- IgnoreIdf
- If true, ignores inverse document frequency. If false, words that are common across the index rank lower.
Default value: true
- IgnoreLengthNorm
- If true, field length is ignored in search scoring. If false, longer fields rank lower.
Default value: true
- HighlightFragSizeLimit
- The total character limit on the highlight fragment element. If 0, there is no limit.
Default value: 0
- DedicatedHighlightFragField
- Dedicated field for highlight fragments. If the field doesn't exist, InSite Search falls back to the default fragment mechanism. The order of precedence is: dedicated field > HiliteFields.xml > default highlighting of all fields.
- IncludeComponentContentInPage
- If true, InSite Search references component content. If false, which is the default, content within components will not be indexed.
Default value: false
- IncludeSpellcheckSuggestionsInQueryBuilding
- If true, the spellchecker source provides suggestions to include in building search result queries. If false, the spellchecker source will not provide suggestions to include in building search result queries.
Default value: true
- QueryFieldsFileLocation
- This setting specifies the location of the file that lists the exclusive fields used to construct the term query. When undefined, the default location is Config\QueryFields.xml under the DSS project root.
- HiliteFieldsFileLocation
- This setting specifies the location of the file that lists the exclusive fields used to construct highlight fragments. When undefined, the default location is Config\HiliteFields.xml under the site root.
- DocumentBoostByFacetsFileLocation
- This setting specifies the location of the file that lists boosts by facet value. When undefined, the default location is Config\DocumentBoostByFacets.xml under the site root.
- indexSearchReaderCacheTimeout
- This value, in seconds, is used to improve indexing performance while the site is under a heavy search load. The longer the timeout, the longer the reader is reused, which improves performance at the cost of including outdated search content within the time range.
Default value: 15
- SideBySideIndexingSideLimitMB
- The size limit (MB) of the index to allow for side-by-side indexing. When the index size is over this limit, side-by-side indexing is disabled. If 0, then unlimited size.
Default value: 500
- OmitLuceneMessagesLog
- When set to true, the system will no longer generate Lucene messages logs in order to save disk space.
Default value: true
- defaultIndexingAnalyzer
- This setting must be enabled in tandem with defaultQueryAnalyzer for the stemming analyzer to work properly.
- defaultQueryAnalyzer
- This setting must be enabled in tandem with defaultIndexingAnalyzer for the stemming analyzer to work properly.
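Several of the settings above could be combined in one <Settings> block, as in the sketch below. The values are illustrative assumptions, not recommendations; the setting names are taken from the list above.

```xml
<!-- Sketch: values are illustrative, not recommendations. -->
<Settings>
  <!-- Cap highlight fragments at 300 characters (0 means no limit). -->
  <add name="HighlightFragSizeLimit" value="300" />
  <!-- Index content that lives inside components (off by default). -->
  <add name="IncludeComponentContentInPage" value="true" />
  <!-- Allow side-by-side indexing for indexes up to 1000 MB. -->
  <add name="SideBySideIndexingSideLimitMB" value="1000" />
  <!-- Keep generating Lucene messages logs. -->
  <add name="OmitLuceneMessagesLog" value="false" />
</Settings>
```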
Indexing Source Element Attributes
Configure indexing sources as needed via the <add> elements within <IndexingSources>, where each <add> element defines one indexing source. For example:
<add name="CMSPublishedContent" type="Ingeniux.Runtime.Search.DssContentSearchSource" settingsFile="C:\PublishedContent\settings\SearchSource.config" />
Each indexing source <add> element contains the following attributes:
@name
- Any arbitrary, unique name for the indexing source.
@type
- Search index class name.
Note: Search.config must have at least one main content index configured. Refer to the Index Library Classes table below for values.
@settingsFile
- File path to the source configuration file. The value must be an absolute path to SearchSource.config.
Index Library Class | Index Type | Notes |
---|---|---|
Ingeniux.Runtime.Search.DssContentSearchSource | Main | This is for indexing (XML) CMS content, rendered via the DSS. It is not intended for Multi-Format Output (MFO) published content. |
Ingeniux.Search.HtmlSiteSource | Main | This is used for crawling a URL endpoint in a similar way to a traditional web crawler. |
Ingeniux.Search.AnalyticsSearchDocumentSource | Independent | When enabled, this creates a separate index that tracks queries made to the InSite Search instance. |
Ingeniux.Search.KeyMatchSearchDocumentSource | Independent | This creates an index used for Keymatch functionality. This index is based on a list of keywords as input to the index. |
Ingeniux.Search.SpellCheckerSearchDocumentSource | Independent | This index is used for search suggest functionality. A list of terms needs to be provided for this index to populate. |
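As a sketch, an <IndexingSources> block that pairs one main content index with one independent index from the table above might look like this. The KeyMatch source name and the KeyMatchSource.config path are hypothetical; only the class names come from the table.

```xml
<!-- Sketch: the KeyMatch name and its settingsFile path are hypothetical. -->
<IndexingSources>
  <!-- Main content index: CMS content rendered via the DSS. -->
  <add name="CMSPublishedContent"
       type="Ingeniux.Runtime.Search.DssContentSearchSource"
       settingsFile="C:\PublishedContent\settings\SearchSource.config" />
  <!-- Independent index: keyword-driven KeyMatch results. -->
  <add name="KeyMatch"
       type="Ingeniux.Search.KeyMatchSearchDocumentSource"
       settingsFile="C:\PublishedContent\settings\KeyMatchSource.config" />
</IndexingSources>
```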
Search Profiles
When Search.config defines search profiles, InSite Search will only fetch results from sources specified within the profiles.
The direct child <add> elements of <SearchProfiles> specify each individual search profile. The @name attribute specifies the site name of the search profile.
Under the site name of the search profile, the direct child <add> elements of <Sources> specify each source included within the search profile. The @name attribute specifies the source name.
For example:
<SearchProfiles>
  <add name="Independent-search-profile-1">
    <Sources>
      <add name="KeyMatch"/>
      <add name="SpellCheckDictionary"/>
      <add name="analytics"/>
    </Sources>
  </add>
</SearchProfiles>
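A profile could also mix a main content source with an independent one, as sketched below. This assumes that each source name in a profile matches the @name of an entry under <IndexingSources>; the profile name is hypothetical.

```xml
<!-- Sketch: assumes source names match @name values defined under <IndexingSources>. -->
<SearchProfiles>
  <add name="Main-site-profile">
    <Sources>
      <add name="CMSPublishedContent"/>
      <add name="KeyMatch"/>
    </Sources>
  </add>
</SearchProfiles>
```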
Unique Fields
Configure unique fields for InSite Search analyzers. If your site uses a custom indexer, you can use the <UniqueFields> element to add additional unique fields for search analysis.
For example:
<UniqueFields>
  <add field="OriginalTitle" type="Ingeniux.Search.Analyzers.CaseSensitiveSearchIndexingAnalyzer" />
</UniqueFields>
For a unique field entry, <add> attributes include:
@field
- Specify the name of the unique field (OriginalTitle in the example above).
@type
- Define the search analyzer type for this field. For example: Ingeniux.Search.Analyzers.CaseSensitiveSearchIndexingAnalyzer
Synonyms
Synonym configuration differs depending on your CMS version and your InSite Search version. Follow the sections relevant to your implementation.
InSite Search 2.11+
In InSite Search 2.11+, the <Synonyms> element is deprecated in Search.config. Instead, synonym configuration resides in a separate XML configuration file, synonyms.xml. Existing synonym entries are migrated to this new format on upgrade to InSite Search 2.11.23+. The file location can be configured as a path relative to the DSS application or as an absolute path to a file on the server. Alternatively, users can manage synonyms.xml as an asset to be updated and published in the CMS.
By default, synonyms.xml resides in the config folder of the DSS root directory. If you want to move synonyms.xml outside this location, ensure that you provide the absolute path to synonyms.xml in the @synonymsLocation attribute of Search.config.
See Referencing External Synonyms for details to reference synonyms.xml from a different location than the DSS config folder.
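As a sketch, pointing Search.config at a synonyms.xml stored outside the DSS config folder might look like the following. The D:\SearchConfig path is a hypothetical location chosen for illustration.

```xml
<!-- Sketch: the path is a hypothetical location outside the DSS config folder. -->
<Search indexLocation="App_Data\LuceneIndex"
        indexingEnabled="true"
        synonymsLocation="D:\SearchConfig\synonyms.xml">
  <!-- Child elements omitted for brevity. -->
</Search>
```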
InSite Search 2.9–2.10
In InSite Search 2.9.8–2.10.14, synonyms are configured in Search.config. You can add synonyms by adding an <S> element under <Synonyms> and then adding synonyms, delimited by the comma character (,), to the @words attribute.
For example, your configured synonyms will look similar to this:
<Synonyms>
  <S words="organization,institution,company,firm,corporation"/>
  <S words="cat,kitty,feline,lion"/>
</Synonyms>