Search.config Reference


The primary functions of the Search.config file are to configure indexing behavior, to add index types by defining indexing sources, and to set other InSite Search configurations. Note that two types of indexing libraries can be defined within InSite Search:

  • The Main Content Index is the index that stores page content from either a published CMS site or a site crawl indexing source. You can add multiple instances of each type of indexing source, but the resulting content is always stored in the location defined by the @indexLocation value.
  • Independent Indexes consist of supplemental data that can be used on top of the main content index to enhance search results. Each independent index is stored in its own index location path on disk, as defined in its own separate configuration file.

Search.config Example

The example below displays generic code contained within Search.config.

<?xml version="1.0"?>
<configuration>
    <configSections>
        <section name="Search"
            type="Ingeniux.Search.Configuration.IndexingConfiguration, Ingeniux.Search"/>
    </configSections>
    <Search indexLocation="App_Data\LuceneIndex"
        synonymsLocation="[Drive]:\[path to DSS root directory]\published\iss-config\Synonyms.xml"
        indexingEnabled="true" queryMaxClauses="1024">
        <Hiliter startTag="&lt;strong&gt;" endTag="&lt;/strong&gt;"/>
        <Settings>
            <add name="defaultIndexingAnalyzer"
                value="Ingeniux.Search.Analyzers.StemmingIndexingAnalyzer, Ingeniux.Search"/>
            <add name="defaultQueryAnalyzer"
                value="Ingeniux.Search.Analyzers.StemmingQueryAnalyzer, Ingeniux.Search"/>
            <add name="QueryFieldsFileLocation"
                value="[Drive]:\[path to DSS root directory]\[subfolder(s)]\QueryFields.xml"/>
            <add name="DocumentBoostByFacetsFileLocation"
                value="App_Data\DocumentBoostByFacetsFileLocation.xml"/>
            <add name="GSearchFieldMapping"
                value="[Drive]:\[path to DSS root directory]\[subfolder(s)]\GSearchFieldMapping.xml"/>
        </Settings>
        <IndexingSources>
            <add name="CMSPublishedContent" type="Ingeniux.Runtime.Search.DssContentSearchSource"
                settingsFile="[Drive]:\[path to DSS root directory]\settings\SearchSource.config"/>
            <add name="KeyMatch" type="Ingeniux.Search.KeyMatchSearchDocumentSource"
                settingsFile="App_Data\KeymatchSource.config"/>
            <add name="SpellCheckDictionary" type="Ingeniux.Search.SpellCheckerSearchDocumentSource"
                settingsFile="App_Data\spellcheckerSource.config"/>
            <add name="SiteCrawlerSource" type="Ingeniux.Search.HtmlSiteSource"
                settingsFile="App_Data\sitecrawlerSource.config"/>
            <add name="Analytics" type="Ingeniux.Search.AnalyticsSearchDocumentSource"
                settingsFile="App_Data\analyticSource.config"/>        
        </IndexingSources>
        <SearchProfiles>
            <add name="Independent-search-profile-1">
                <Sources>
                    <add name="KeyMatch"/>
                    <add name="SpellCheckDictionary"/>
                    <add name="Analytics"/>
                </Sources>
            </add>
        </SearchProfiles>
    </Search>
</configuration>
         
<Search> element attributes include:
  • @indexLocation: This is a path to the Lucene index files. Its default value is App_Data\LuceneIndex.
  • @indexingEnabled: If true, then Lucene indexing is enabled.
  • @sideBySideDisabled: If true, side-by-side re-indexing is disabled. Side-by-side re-indexing is enabled by default, including when this attribute is undefined.
  • @simpleQuery: If true, all indexable fields on a page are concatenated into a single Lucene field, which the query will then execute against. Disabled by default. Requires re-indexing content if enabled.
    Note
    The @simpleQuery option makes the query much leaner because it only executes searches against a single field. Once the simple query option is enabled, custom search implementations cannot query against individual fields as search facets.
  • @synonymsLocation: Value indicates the absolute path to the external Synonyms.xml file on your DSS server; the default location when undefined is the Config\ folder in the root of the DSS project. For example, Synonyms.xml could reside in Assets Manager and be published to the DSS server.
  • @indexingQueueCapacity: When configured with a value, sets the limit on the number of batch operations that can be queued for processing.
  • @collectorMaxDocs: Value indicates how many search items to return (i.e., upper limit) in a search. 10000 is the default if not designated. This attribute controls the number of search results, not the number indexed.
  • @indexSearchReaderCacheTimeout: Time in seconds before the index reader times out between two requests. Default value is 15.
  • @queryMaxClauses: This integer value sets the maximum number of clauses allowed in a single query. It is a performance-tuning setting, mainly for sites whose searches return large result sets.
  • Version Notes: ISS 2.14+
    @optimizeSchedule: Value schedules Lucene optimization operations for InSite Search.
    Note
    To add a single time value, you can choose from multiple time formats. For example, valid formats include 2:00pm, 14:00, 14:00:00, and 14:15:07.0000-07:00. To add more than one time value, use the cron format. For example, entering 0 0 2,14 * * ? will run the Lucene Optimize operation daily at 2am and 2pm.

    See Scheduling Lucene Optimization Operations for details.

  • Version Notes: ISS 2.14+
    @optimizeTimeout: Value determines a cutoff time to prevent the Lucene optimizer from running. For large search datasets, this setting helps to prevent the optimizer from running during a certain time of day. This attribute only accepts one time value.
    Note
    You can choose from multiple time formats. For example, valid formats include 2:00pm, 14:00, 14:00:00, and 14:15:07.0000-07:00. This value doesn't accept cron formatting.

    See Scheduling Lucene Optimization Operations for details.
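As a rough illustration of how several of the attributes above might be combined, the fragment below sketches a <Search> element. The attribute names come from the list above, and most values repeat the documented defaults or the example configuration. The pairing of @optimizeSchedule and @optimizeTimeout values is a placeholder assumption, not a recommended setting; child elements are omitted for brevity.

```xml
<!-- Sketch only: attribute names are taken from the list above; values are illustrative assumptions. -->
<Search indexLocation="App_Data\LuceneIndex"
    indexingEnabled="true"
    sideBySideDisabled="false"
    simpleQuery="false"
    collectorMaxDocs="10000"
    indexSearchReaderCacheTimeout="15"
    queryMaxClauses="1024"
    optimizeSchedule="0 0 2,14 * * ?"
    optimizeTimeout="16:00">
    <!-- Hiliter, Settings, IndexingSources, and SearchProfiles elements go here. -->
</Search>
```

Because @simpleQuery changes how fields are stored in the index, switching it to true would require re-indexing content, as noted above.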

Attributes of the <Hiliter> tag within <Search>:
  • @startTag: This start tag encapsulates (highlights) the search term within the returned matching fragment(s) from a search result. The default value is the opening strong tag.
  • @endTag: This end tag encapsulates the search term within the returned matching fragment(s) from a search result. The default value is the closing strong tag.
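For instance, matched terms could be wrapped in a different element by swapping both tags. The sketch below assumes the <mark> element suits your page markup; it is an illustrative choice, not a documented option.

```xml
<!-- Sketch: highlights matched terms with <mark> rather than the default <strong>. -->
<!-- The opening angle bracket must be escaped inside an attribute value; -->
<!-- the closing one is escaped as well to match the example configuration. -->
<Hiliter startTag="&lt;mark&gt;" endTag="&lt;/mark&gt;"/>
```

Whatever pair is chosen, the start and end tags should match so that the returned fragment remains valid markup.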
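Settings configured through <add> tags within <Settings> (each setting name below is supplied as the @name of an <add> tag, with its setting in @value):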
  • @allowedFacetsforStatsProviders: Comma-delimited list of facet/field names for stats.
  • @IgnoreIDf: If true, inverse document frequency is ignored in scoring. When disabled, words that are common across the index rank lower.
  • @IgnoreLengthNorm: If true, field length normalization is ignored in search scoring. When disabled, longer fields rank lower.
  • @HighlightFragSizeLimit: Number indicates the total character limit on the highlight fragment element. If 0, then no limit.
  • @DedicatedHighlightFragField: Dedicated field for highlight fragments. When field doesn't exist, then ISS falls back to default fragment mechanism (i.e., Dedicated field => HiliteField.xml => Default highlighting of all fields).
  • @IncludeComponentContentInPage: If true, InSite Search references content within components. If false, which is the default, content within components will not be indexed.
  • @IncludeSpellcheckSuggestionsInQueryBuilding: If true, the spellchecker source provides suggestions to include in building search result queries.
  • @QueryFieldsFileLocation: Specifies the location of the file that lists the exclusive fields used to construct term queries. When undefined, the default location is used, which is the Config folder under the DSS project root.
  • @HiliteFieldsFileLocation: Specifies the location of the file that lists the exclusive fields used to construct highlight fragments. When missing, the default location is used, which is the Config folder under the site root.
  • @DocumentBoostByFacetsFileLocation: Specifies the location of the file that lists the boosts by facet values. When missing, the default location is used, which is the Config folder under the site root.
  • @indexSearchReaderCacheTimeout: This value, in seconds, is used to improve indexing performance while the site is under a heavy search load. The longer the timeout, the longer the reader is reused, which improves performance at the cost of returning outdated search content within that time range. The default is 15 when @indexSearchReaderCacheTimeout is not specified.
  • @SideBySideIndexingSideLimitMB: The size limit (MB) of the index to allow for side-by-side indexing. When the index size is over this limit, side-by-side indexing is disabled. If 0, then unlimited size.
  • @OmitLuceneMessagesLog: When set to true, the system will no longer generate Lucene messages logs in order to save disk space. Default is true.
  • @defaultIndexingAnalyzer: This setting must be enabled in tandem with @defaultQueryAnalyzer for the stemming analyzer to work properly.
  • @defaultQueryAnalyzer: This setting must be enabled in tandem with @defaultIndexingAnalyzer for the stemming analyzer to work properly.
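Following the name/value pattern used for <Settings> in the example configuration, a handful of these settings might be combined as sketched below. The setting names come from the list above; the specific values (200, 2048, and the true/false choices) are illustrative assumptions, not recommendations.

```xml
<!-- Sketch only: names come from the list above; the values shown are illustrative assumptions. -->
<Settings>
    <!-- Stemming requires both analyzers to be configured together. -->
    <add name="defaultIndexingAnalyzer"
        value="Ingeniux.Search.Analyzers.StemmingIndexingAnalyzer, Ingeniux.Search"/>
    <add name="defaultQueryAnalyzer"
        value="Ingeniux.Search.Analyzers.StemmingQueryAnalyzer, Ingeniux.Search"/>
    <!-- Index content inside components (not indexed by default). -->
    <add name="IncludeComponentContentInPage" value="true"/>
    <!-- Cap highlight fragments at 200 characters; 0 would mean no limit. -->
    <add name="HighlightFragSizeLimit" value="200"/>
    <!-- Disable side-by-side indexing once the index exceeds 2048 MB; 0 would mean unlimited. -->
    <add name="SideBySideIndexingSideLimitMB" value="2048"/>
    <!-- Keep generating Lucene messages logs; the default, true, omits them. -->
    <add name="OmitLuceneMessagesLog" value="false"/>
</Settings>
```

Settings left out of the block fall back to the defaults described in the list above.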
Attributes of the <add> tag within <IndexingSources>:
  • @type: Search index class name.

    Note: Search.config must have at least one main content index configured. Refer to the Index Library Classes table below for values.

  • @name: Any arbitrary, unique name.
  • @settingsFile: Path to source configuration file. The settingsFile value must use an absolute path to SearchSource.config.
Index Library Class | Index Type | Notes
Ingeniux.Runtime.Search.DssContentSearchSource | Main | Indexes (XML) CMS content rendered via the DSS. It is not intended for Multi-Format Output (MFO) published content.
Ingeniux.Search.HtmlSiteSource | Main | Crawls a URL endpoint in a similar way to a traditional web crawler.
Ingeniux.Search.AnalyticsSearchDocumentSource | Independent | When enabled, creates a separate index that tracks queries made to the InSite Search instance.
Ingeniux.Search.KeyMatchSearchDocumentSource | Independent | Creates an index used for Keymatch functionality. This index is based on a list of keywords provided as input.
Ingeniux.Search.SpellCheckerSearchDocumentSource | Independent | Index used for search suggest functionality. A list of terms must be provided to populate this index.
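
Since multiple instances of each main indexing source type can be added, an <IndexingSources> block with more than one crawler might look like the sketch below. The class names come from the table above; the source names BlogCrawler and DocsCrawler and their settings file paths are hypothetical and used only for illustration.

```xml
<!-- Sketch only: BlogCrawler, DocsCrawler, and their settings file paths are hypothetical. -->
<IndexingSources>
    <!-- Main content source for published CMS content; settingsFile is an absolute path. -->
    <add name="CMSPublishedContent" type="Ingeniux.Runtime.Search.DssContentSearchSource"
        settingsFile="[Drive]:\[path to DSS root directory]\settings\SearchSource.config"/>
    <!-- Two crawler sources of the same type, each with a unique name and its own settings file. -->
    <add name="BlogCrawler" type="Ingeniux.Search.HtmlSiteSource"
        settingsFile="App_Data\blogCrawlerSource.config"/>
    <add name="DocsCrawler" type="Ingeniux.Search.HtmlSiteSource"
        settingsFile="App_Data\docsCrawlerSource.config"/>
    <!-- Independent index, stored in its own index location. -->
    <add name="KeyMatch" type="Ingeniux.Search.KeyMatchSearchDocumentSource"
        settingsFile="App_Data\KeymatchSource.config"/>
</IndexingSources>
```

In this arrangement, content from all three main sources would be stored in the location given by @indexLocation, while the KeyMatch index would be stored in the path defined in its own configuration file.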