# Ingeniux CMS URL Audit Report Instructions

## Purpose

Use these instructions with an Ingeniux CMS `UrlMap.xml` export to create an Excel-based URL audit report with visuals, issue detection, and cleanup recommendations.

The goal is to help content managers and site administrators identify URL risks caused by page moves, renames, canonical URL changes, outdated redirect history, duplicate URL collisions, and site tree restructuring.

## Important Ingeniux URL Management Context

URL history is not automatically bad.

In Ingeniux CMS/DSS, historical URL entries are often intentional and useful. They can preserve continuity after a page is renamed or moved and can support 301 redirects from old live URLs to the current URL.

The audit should distinguish between:

- healthy redirect history that protects users and SEO
- redirect accumulation that may be old enough or low-value enough to review for cleanup
- URL collision patterns that indicate naming conflicts and should be cleaned up

## Source Data Structure

The URL map contains a root `Site` element with settings such as:

- `HomePageID`
- `ForceLowerCaseURL`
- `AutoRedirectCanonicalURL`
- `UseStructuredURL`
- `UseAliasUrls`
- `URLExtension`
- `EnabledExtensions`

Each `Page` element may include:

- `ID`
- `Path`
- `Schema`
- Child `Moved` elements
- Child `Renamed` elements
- Optional `Canonical="true"` attribute on renamed paths

## Step 1: Parse and Normalize the Data

Create a normalized table where each row represents either:

1. A current page URL
2. A moved historical URL
3. A renamed historical URL

Recommended columns:

- Page ID
- Current Path
- Schema
- URL Entry Type
  - Current
  - Moved
  - Renamed
- Historical Path
- Is Canonical
- Current Path Depth
- Historical Path Depth
- Current Section
- Historical Section
- Section Changed
- Contains xID Suffix
- Contains Placeholder Pattern
- URL Issue Category
- Recommended Action
- Review Priority
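
The normalization can be sketched with Python's standard `xml.etree.ElementTree`. This is a minimal sketch, not the definitive export format: it assumes `ID`, `Path`, and `Schema` are attributes on `Page`, and that `Moved` and `Renamed` children carry the old URL in a `Path` attribute. The sample XML and the row keys are invented for illustration, so check them against your actual `UrlMap.xml`.

```python
import xml.etree.ElementTree as ET

# Invented sample. The attribute layout is an assumption, not the documented export format.
SAMPLE_XML = """
<Site HomePageID="x1" ForceLowerCaseURL="true">
  <Page ID="x10" Path="/knowledgecenter/blog/post-a" Schema="BlogPost">
    <Moved Path="/blogs/post-a" />
    <Renamed Path="/knowledgecenter/blog/post-a-draft" Canonical="true" />
  </Page>
  <Page ID="x55" Path="/searchresultsx55" Schema="SearchResults" />
</Site>
"""

def section(path):
    """Return the first path segment, or '' for the site root."""
    parts = [p for p in path.split("/") if p]
    return parts[0] if parts else ""

def normalize(xml_text):
    """Return one row per current, moved, or renamed URL."""
    rows = []
    root = ET.fromstring(xml_text)
    for page in root.iter("Page"):
        current = page.get("Path", "")
        base = {
            "page_id": page.get("ID"),
            "current_path": current,
            "schema": page.get("Schema"),
        }
        rows.append({**base, "entry_type": "Current",
                     "url_path": current, "is_canonical": False})
        for child in page:
            if child.tag in ("Moved", "Renamed"):
                rows.append({
                    **base,
                    "entry_type": child.tag,
                    "url_path": child.get("Path", ""),
                    "is_canonical": child.get("Canonical", "").lower() == "true",
                })
    for row in rows:
        # True when the URL's top-level section differs from the current one.
        row["section_changed"] = section(row["url_path"]) != section(row["current_path"])
    return rows

rows = normalize(SAMPLE_XML)
```

The remaining columns (depth, xID suffix, placeholder pattern, priority) can be derived from `url_path` and `current_path` in later passes.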

## Step 2: Create Summary Metrics

Generate a summary worksheet with:

- Total page count
- Count of current URLs
- Count of moved URLs
- Count of renamed URLs
- Count of canonical historical URLs
- Count of pages with more than one historical URL
- Count of pages with both moved and renamed history
- Count of current URLs with xID suffixes
- Count of historical URLs with xID suffixes
- Count of URL changes by schema
- Count of URL changes by top-level section
- Count of large section move clusters
- Count of cleanup tasks by priority
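
Several of these counts fall out of the normalized table directly. The sketch below assumes each row has been reduced to a hypothetical `(page_id, entry_type, is_canonical)` tuple. The sample rows are invented, and only a subset of the metrics is shown.

```python
from collections import Counter, defaultdict

# Hypothetical normalized rows: (page_id, entry_type, is_canonical)
rows = [
    ("x10", "Current", False),
    ("x10", "Moved", False),
    ("x10", "Renamed", True),
    ("x11", "Current", False),
    ("x11", "Renamed", False),
    ("x12", "Current", False),
]

def summarize(rows):
    by_type = Counter(entry_type for _, entry_type, _ in rows)
    # Collect the history entry types seen for each page.
    history_types = defaultdict(list)
    for page_id, entry_type, _ in rows:
        if entry_type != "Current":
            history_types[page_id].append(entry_type)
    return {
        "total_pages": len({page_id for page_id, _, _ in rows}),
        "current_urls": by_type["Current"],
        "moved_urls": by_type["Moved"],
        "renamed_urls": by_type["Renamed"],
        "canonical_historical_urls": sum(
            1 for _, t, canonical in rows if t != "Current" and canonical
        ),
        "pages_with_multiple_history": sum(
            1 for h in history_types.values() if len(h) > 1
        ),
        "pages_with_moved_and_renamed": sum(
            1 for h in history_types.values() if {"Moved", "Renamed"} <= set(h)
        ),
    }

metrics = summarize(rows)
```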

## Step 3: Detect URL Issues

### 1. Large-Scale Section Moves

Identify patterns where:

- many pages moved from one section to another
- an entire branch appears to have changed URL structure
- historical paths cluster under one old section while current paths cluster under a new section

Examples:

- `/blogs/...` → `/knowledgecenter/blog/...`
- `/section/...` → `/training/...`
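
One way to surface these clusters is to count historical-to-current pairs by top-level section. This is a sketch under stated assumptions: comparing only the first path segment is a simplification, and `min_cluster_size` is a hypothetical threshold, since what counts as "large" depends on the size of the site.

```python
from collections import Counter

def top_section(path):
    """Return the first path segment, or '' for the site root."""
    parts = [p for p in path.split("/") if p]
    return parts[0] if parts else ""

def section_move_clusters(history_pairs, min_cluster_size=3):
    """history_pairs: iterable of (historical_path, current_path) tuples."""
    counts = Counter()
    for old_path, new_path in history_pairs:
        old_sec, new_sec = top_section(old_path), top_section(new_path)
        if old_sec != new_sec:
            counts[(old_sec, new_sec)] += 1
    # Keep only section pairs that reach the (assumed) size threshold.
    return {pair: n for pair, n in counts.items() if n >= min_cluster_size}

# Invented sample data
pairs = [
    ("/blogs/a", "/knowledgecenter/blog/a"),
    ("/blogs/b", "/knowledgecenter/blog/b"),
    ("/blogs/c", "/knowledgecenter/blog/c"),
    ("/about/team", "/company/team"),
    ("/products/widget-old", "/products/widget"),
]
clusters = section_move_clusters(pairs)
```

Each key in `clusters` corresponds to one row of a section move cluster table, with the value as its URL count.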

#### Why This Matters

Large section moves can create substantial redirect growth. These redirects are often correct, intentional, and necessary when the old URLs were live, indexed, externally linked, or used in bookmarks/campaigns.

The report should call out large section moves as a maintenance warning, not as an automatic error.

#### Important Ingeniux Consideration

Redirect history is only created when the previous URL was live/publishable before the move or rename. This means old URL history can be important, but it may also become obsolete over time.

#### Recommended Action

Review large redirect clusters using external validation data:

- Google Analytics landing page data
- Google Search Console indexed URL data
- Server/CDN logs
- backlink data
- campaign URL records

Redirect cleanup may be appropriate when:

- old URLs were never indexed
- the site was not live at the time
- URLs were live for too short a period to matter
- analytics/logs show no remaining traffic
- the move occurred long ago and old URLs no longer provide value

Do not remove redirect history without validation.

---

### 2. xID-Suffixed URL Patterns

Flag URLs whose final segment ends with an Ingeniux xID suffix.

Examples:

- `/searchresultsx55`
- `/searchresults-x55`
- `/searchresults_x55`
- `/products/loremipsumx36`

Suggested detection pattern:

```regex
[-_]?x\d+$
```

Apply detection to:

- current page URLs
- moved historical URLs
- renamed historical URLs
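
A minimal sketch of the check, applying the suggested pattern to the final path segment only. It assumes paths are stored without a file extension; if your export includes one, strip it before matching. The pattern also matches ordinary slugs that happen to end in `x` plus digits, so treat hits as candidates for review rather than confirmed collisions.

```python
import re

# Suggested pattern from this section: optional dash/underscore, then x + digits.
XID_SUFFIX = re.compile(r"[-_]?x\d+$")

def has_xid_suffix(path):
    """Return True when the last path segment ends with an xID-style suffix."""
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    return bool(XID_SUFFIX.search(last_segment))

flagged = [
    p for p in (
        "/searchresultsx55",
        "/searchresults-x55",
        "/products/category/product-name",
    )
    if has_xid_suffix(p)
]
```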

#### Why This Matters

In Ingeniux CMS, xID suffixes are commonly introduced when there is a duplicate sibling page name or URL collision. Depending on DSS/publishing target settings, the xID may be appended directly or separated with a dash or underscore.

These URLs are functional, but they are usually not desirable as permanent public URLs.

#### Recommended Action

For current URLs containing xID suffixes:

- mark as high priority
- resolve duplicate sibling page names
- rename the page to create a clean slug
- validate redirect behavior before changing a live URL

For historical URLs containing xID suffixes:

- mark as medium priority
- validate whether those redirects are still needed
- use analytics, indexing, or logs before removing historical URL entries

---

### 3. Pages with Multiple Historical URLs

Flag pages that have more than one `Moved` or `Renamed` entry.

#### Why This Matters

Multiple historical URLs may indicate:

- repeated restructuring
- redirect accumulation
- legacy URL buildup
- possible redirect chain complexity

This is not automatically a problem, but it is worth reviewing to decide which historical entries still need to be kept.

#### Recommended Action

- Verify redirect behavior
- Confirm canonical URL behavior
- Check analytics and logs for old URL traffic
- Remove obsolete redirect history only if safe

---

### 4. Pages with Both Moved and Renamed Paths

Flag pages that include both `Moved` and `Renamed` entries.

#### Why This Matters

These pages changed both location and naming. This can create more complex redirect behavior than a simple rename or move.

#### Recommended Action

- Test old paths
- Confirm they redirect to the current path or intended canonical path
- Check for redirect chains or loops
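
Chains and loops can be screened offline before testing against the live site. The sketch below assumes the URL map has been flattened into a hypothetical `redirects` dictionary of old path to target path. It models the data only and does not replace requesting the old URLs and observing the actual responses.

```python
def trace_redirect(start, redirects, max_hops=10):
    """Follow `start` through `redirects` (old path -> target path).

    Returns (hops, status), where status is 'ok' (at most one hop),
    'chain' (more than one hop), or 'loop' (a path repeats).
    """
    seen = [start]
    path = start
    while path in redirects and len(seen) <= max_hops:
        path = redirects[path]
        if path in seen:
            return seen + [path], "loop"
        seen.append(path)
    status = "ok" if len(seen) <= 2 else "chain"
    return seen, status

# Invented sample data
redirects = {
    "/old": "/new",
    "/blogs/a": "/blog/a",
    "/blog/a": "/knowledgecenter/blog/a",
    "/p": "/q",
    "/q": "/p",
}
results = {path: trace_redirect(path, redirects)[1] for path in ("/old", "/blogs/a", "/p")}
```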

---

### 5. Canonical Renamed URLs

Flag renamed paths marked `Canonical="true"`.

#### Why This Matters

Canonical URL handling affects preferred public URLs, redirects, SEO, and duplicate-content behavior.

#### Recommended Action

- Confirm canonical URL aligns with the current content strategy
- Validate canonical redirects and metadata
- Make sure old canonical choices are still intentional

---

### 6. URL Structure Complexity

Calculate URL depth by counting path segments, but do not automatically flag normal structured content as a problem.

Examples of acceptable depth:

- `/knowledgecenter/blog/post-name`
- `/products/category/product-name`

Depth becomes worth reviewing only when:

- hierarchy appears repetitive
- URLs are unnecessarily long
- URL structure reflects accidental site tree nesting
- sibling content types use inconsistent structures
- content is difficult to maintain or understand
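
Depth and one of the review signals above (repetitive hierarchy) can be computed mechanically; the others need human judgment. In this sketch, `max_depth` is a hypothetical threshold chosen for illustration, and a flag means "worth a look", not "wrong".

```python
def path_segments(path):
    """Split a URL path into its non-empty segments."""
    return [s for s in path.split("/") if s]

def depth_review_flags(path, max_depth=5):
    """Return (depth, flags) for a URL path."""
    segments = path_segments(path)
    flags = []
    if len(segments) > max_depth:
        flags.append("deep")
    # A segment that appears twice often points to accidental site tree nesting.
    if len(set(segments)) < len(segments):
        flags.append("repeated-segment")
    return len(segments), flags

normal = depth_review_flags("/products/category/product-name")
nested = depth_review_flags("/training/courses/training/intro")
```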

#### Recommended Action

Review whether URL hierarchy provides meaningful organization. Simplify only where structure no longer provides value.

---

### 7. Placeholder or Temporary URL Patterns

Flag paths that include temporary or placeholder terms, such as:

- `loremipsum`
- `test`
- `sample`
- `temp`
- `copy`
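
A plain substring search for these terms would also flag words such as `latest` or `copyright`. The sketch below is one possible approach: it splits each path into tokens and matches whole tokens only, optionally followed by digits or an xID-style suffix. The trade-off is an assumption of this example, and it will miss run-together slugs such as `testpage`.

```python
import re

PLACEHOLDER_TERMS = ("loremipsum", "test", "sample", "temp", "copy")

# A token matches when it is a placeholder term, optionally followed by
# digits or an xID-style suffix (for example "test2" or "loremipsumx36").
TOKEN_PATTERN = re.compile(r"^(?:%s)(?:x?\d+)?$" % "|".join(PLACEHOLDER_TERMS))

def placeholder_terms_in(path):
    """Return the tokens in `path` that look like placeholder terms."""
    tokens = re.split(r"[^a-z0-9]+", path.lower())
    return [t for t in tokens if TOKEN_PATTERN.match(t)]

hits = placeholder_terms_in("/products/loremipsumx36")
```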

#### Recommended Action

- Rename current URLs where appropriate
- Validate redirects before changing live URLs
- Review and potentially remove obsolete historical URL entries after validation

---

## Step 4: Recommended Excel Worksheets

Create the following worksheets:

### 1. Executive Summary

Include:

- Key metrics
- High-priority issue count
- xID URL count
- large section move count
- top cleanup opportunities
- recommended next actions

### 2. URL Inventory

One row per current page.

Columns:

- Page ID
- Current Path
- Schema
- Top-Level Section
- Path Depth
- Has Moved History
- Has Renamed History
- Has Canonical History
- Historical URL Count
- Current URL Has xID Suffix
- Historical URL Has xID Suffix
- Has Section Move History
- Issue Summary
- Review Priority
- Recommended Action
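
The inventory is an aggregation of the normalized table down to one row per page. The sketch below fills only the columns that can be derived directly from the URL data. The input row keys (`page_id`, `entry_type`, and so on) are hypothetical names, and the resulting dictionaries could be passed to `csv.DictWriter` or to an Excel library of your choice.

```python
import re
from collections import defaultdict

XID_SUFFIX = re.compile(r"[-_]?x\d+$")

def build_inventory(rows):
    """rows: dicts with page_id, current_path, schema, entry_type, url_path, is_canonical."""
    pages = {}
    history = defaultdict(list)
    for row in rows:
        if row["entry_type"] == "Current":
            pages[row["page_id"]] = row
        else:
            history[row["page_id"]].append(row)

    inventory = []
    for page_id, page in pages.items():
        old = history[page_id]
        segments = [s for s in page["current_path"].split("/") if s]
        inventory.append({
            "Page ID": page_id,
            "Current Path": page["current_path"],
            "Schema": page["schema"],
            "Top-Level Section": segments[0] if segments else "",
            "Path Depth": len(segments),
            "Has Moved History": any(r["entry_type"] == "Moved" for r in old),
            "Has Renamed History": any(r["entry_type"] == "Renamed" for r in old),
            "Has Canonical History": any(r["is_canonical"] for r in old),
            "Historical URL Count": len(old),
            "Current URL Has xID Suffix": bool(XID_SUFFIX.search(page["current_path"])),
            "Historical URL Has xID Suffix": any(
                XID_SUFFIX.search(r["url_path"]) for r in old
            ),
        })
    return inventory

# Invented sample rows
sample_rows = [
    {"page_id": "x55", "current_path": "/searchresultsx55", "schema": "SearchResults",
     "entry_type": "Current", "url_path": "/searchresultsx55", "is_canonical": False},
    {"page_id": "x10", "current_path": "/knowledgecenter/blog/post-a", "schema": "BlogPost",
     "entry_type": "Current", "url_path": "/knowledgecenter/blog/post-a", "is_canonical": False},
    {"page_id": "x10", "current_path": "/knowledgecenter/blog/post-a", "schema": "BlogPost",
     "entry_type": "Moved", "url_path": "/blogs/post-a", "is_canonical": False},
]
inventory = build_inventory(sample_rows)
```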

### 3. Historical URL Detail

One row per current or historical URL.

Columns:

- Page ID
- Current Path
- Schema
- URL Entry Type
- URL Path
- Is Canonical
- Current Section
- URL Section
- Section Changed
- Contains xID Suffix
- Contains Placeholder Pattern
- Recommended Action

### 4. xID URL Cleanup

One row per xID-suffixed URL.

Columns:

- Page ID
- Current Path
- Schema
- URL Entry Type
- URL Path
- Priority
- Recommended Action

### 5. Section Move Clusters

One row per old-section to new-section move group.

Columns:

- Historical Section
- Current Section
- URL Count
- Priority
- Explanation
- Recommended Action

### 6. Issue Flags

One row per detected issue.

Columns:

- Page ID
- Current Path
- Issue Type
- Evidence
- Risk
- Recommended Action
- Priority

### 7. Cleanup Tasks

One row per action.

Columns:

- Priority
- Page ID
- Current Path
- Task Type
- Action
- Status
- Owner
- Notes

## Step 5: Required Visuals

### 1. URL History by Type

Chart:

- Current URLs
- Moved URLs
- Renamed URLs
- Canonical renamed URLs

### 2. xID URL Summary

Chart:

- Current URLs with xID suffixes
- Historical URLs with xID suffixes

### 3. URL Changes by Schema

Chart:

- moved and renamed counts by schema

### 4. URL Changes by Top-Level Section

Chart:

- historical URL count by section

### 5. Review Priority Summary

Chart:

- High
- Medium
- Low

### 6. Section Move Clusters

Chart:

- old section → new section redirect volume

## Step 6: Priority Rules

### High Priority

Flag as high priority if:

- current URL contains an xID suffix
- current public path contains placeholder text
- page has significant URL collision/naming issues

### Medium Priority

Flag as medium priority if:

- historical URL contains an xID suffix
- page has multiple historical URLs
- page has both moved and renamed history
- page is part of a large section move
- renamed URL is marked canonical and should be verified
- URL structure is unusually deep or unclear

### Low Priority

Flag as low priority if:

- page has no history
- URL is clean and shallow
- schema and path appear consistent
- historical URL behavior does not indicate cleanup risk
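
These rules reduce to a "highest matching tier wins" lookup. In the sketch below, the flag names are hypothetical labels for the conditions listed above, and each would be set by the corresponding detection step.

```python
HIGH_FLAGS = ("current_has_xid", "current_has_placeholder", "has_collision_issue")
MEDIUM_FLAGS = (
    "historical_has_xid",
    "multiple_history",
    "moved_and_renamed",
    "in_large_section_move",
    "canonical_renamed",
    "deep_or_unclear",
)

def review_priority(flags):
    """flags: dict mapping flag name -> bool. Returns 'High', 'Medium', or 'Low'."""
    if any(flags.get(name) for name in HIGH_FLAGS):
        return "High"
    if any(flags.get(name) for name in MEDIUM_FLAGS):
        return "Medium"
    return "Low"

priority = review_priority({"current_has_xid": True, "multiple_history": True})
```

A page that matches both a high and a medium rule is reported once, at the higher tier.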

## Step 7: Recommended Cleanup Actions

Use the following action types:

### Resolve URL Collision Naming

Use when URLs contain xID suffixes.

### Validate Section Move Redirects

Use when many URLs moved from one section to another.

### Confirm Canonical URL

Use when a renamed path is marked canonical.

### Validate Redirect Accumulation

Use when a page has multiple historical URLs.

### Review URL Naming

Use when slugs are vague, temporary, or contain placeholder values.

### Review URL Structure

Use when URL depth or hierarchy appears overly complex.

### No Immediate Action

Use when URL has no clear cleanup issue.

## Step 8: Manual Review Checklist

Before changing or deleting URL history, confirm:

- Was the old URL live?
- Was the old URL live long enough to be indexed?
- Does the old URL still receive traffic?
- Does the old URL have external backlinks?
- Does the old URL appear in search engine results?
- Does the current URL match intended site structure?
- Does the page appear in navigation?
- Would changing the URL require a redirect?
- Are canonical URL rules still needed?

## Step 9: Analytics and Validation

Pair the URL audit with:

- Google Analytics landing page data
- Google Search Console indexed URL and query data
- Server/CDN logs
- Internal site search reports
- CMS page publish status reports

Use external data to identify:

- old URLs still receiving traffic
- redirected URLs still being hit
- high-traffic pages with risky URL history
- URLs indexed by search engines that should be redirected or retired
- historical URLs that no longer need redirects
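
Pairing the audit with traffic data can be as simple as a lookup against an exported hit count. This sketch assumes a hypothetical `hits_by_path` dictionary built from analytics or server logs, and `min_hits` is an illustrative threshold. A path with no recorded hits is only a candidate for review, since backlinks and search indexing still need to be checked.

```python
def classify_historical_urls(historical_paths, hits_by_path, min_hits=1):
    """Split historical URLs into those still receiving traffic and cleanup candidates."""
    still_used, candidates = [], []
    for path in historical_paths:
        if hits_by_path.get(path, 0) >= min_hits:
            still_used.append(path)
        else:
            candidates.append(path)
    return still_used, candidates

# Invented sample data
historical = ["/blogs/a", "/blogs/b", "/searchresults-x55"]
hits = {"/blogs/a": 42, "/searchresults-x55": 0}
still_used, candidates = classify_historical_urls(historical, hits)
```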

## Step 10: AI Prompt for Report Generation

Use this prompt with the raw `UrlMap.xml` data:

```text
You are analyzing an Ingeniux CMS UrlMap.xml file.

Create a URL audit report that identifies URL cleanup opportunities, redirect accumulation, section move patterns, xID-suffixed collision URLs, and visualization-ready metrics.

Parse all Page entries and their current Path, Schema, Moved paths, Renamed paths, and Canonical flags.

Create the following outputs:

1. Executive summary metrics
2. URL inventory table
3. Historical URL detail table
4. xID URL cleanup table
5. Section move cluster table
6. Issue flags table
7. Cleanup task list
8. Visual chart recommendations

Detect:
- current URLs with xID suffixes using pattern [-_]?x\d+$
- historical URLs with xID suffixes using pattern [-_]?x\d+$
- pages with multiple URL history entries
- pages with both moved and renamed history
- large section moves
- canonical renamed URLs
- placeholder or temporary URL patterns
- schema types with URL churn
- URL structure complexity without automatically penalizing normal content hierarchy

Important:
- Do not treat URL history as automatically bad.
- Redirects are necessary when old URLs were live, indexed, or externally linked.
- Call out large move clusters as redirect maintenance warnings.
- Recommend removing redirect history only after validating analytics, indexing, logs, or backlink data.
- Treat current xID-suffixed URLs as high-priority cleanup candidates.
```

## Key Principle

Careful, validated URL cleanup protects:

- SEO
- bookmarks
- external links
- navigation
- analytics continuity
- user trust

Historical redirects are often valuable. Clean them up only when data shows they are no longer needed.
