Crawl budget managed; no infinite scroll or session IDs in URLs
Why it matters
Session IDs in query strings and unbounded faceted URLs multiply your crawlable surface: every extra filter multiplies the number of URL variants, and a per-visit session ID makes each variant look new again. Googlebot can then spend its allotted fetches on duplicate variants of the same listing before it reaches your actual product pages. Infinite scroll without a paginated fallback makes content that loads only on scroll effectively invisible to crawlers. The result: shallow index coverage, delayed discovery of new pages, and wasted crawl capacity.
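To make the multiplication concrete, here is a small arithmetic sketch. The `Facet` type, the `countFacetUrls` helper, and the facet names and counts are all hypothetical, and the model assumes each filter is either unset or set to exactly one value.

```typescript
// Hypothetical facet definition: a filter name and how many values it offers.
interface Facet {
  name: string;
  valueCount: number;
}

// Counts distinct listing URLs when each facet is either unset or set to one value.
// Each facet contributes (valueCount + 1) choices, so the total is their product.
function countFacetUrls(facets: Facet[]): number {
  return facets.reduce((total, facet) => total * (facet.valueCount + 1), 1);
}

// Illustrative numbers only, not measurements from a real site.
const exampleFacets: Facet[] = [
  { name: "color", valueCount: 10 },
  { name: "size", valueCount: 6 },
  { name: "brand", valueCount: 25 },
  { name: "sort", valueCount: 4 },
];

// 11 * 7 * 26 * 5 = 10010 URL variants of a single listing.
const listingVariants = countFacetUrls(exampleFacets);
console.log(`${listingVariants} crawlable variants of one listing`);
```

Four modest filters already yield about ten thousand URLs for one listing, before any session ID is appended.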
Severity rationale
High because crawl waste compounds over thousands of URLs and starves new content of indexation.
Remediation
Strip session IDs from URLs entirely and store session state in cookies. Add noindex to active filter combinations, or canonicalize facet pages to their unfiltered base. To apply the noindex rule, configure app/products/page.tsx:
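export async function generateMetadata({ searchParams }) {
  // searchParams is a Promise in Next.js 15+; await also works on the plain object used in earlier versions
  const params = await searchParams
  // Index only the bare listing URL; any query parameter (filters, and also ?page=) turns indexing off
  return { robots: { index: Object.keys(params).length === 0 } }
}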
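For the session ID half of the remediation, one possible approach is to normalize incoming URLs and redirect to the cleaned version. The sketch below is a plain helper, not part of any framework API: the `stripSessionParams` name and the `SESSION_PARAMS` list are assumptions, with the parameter names taken from the examples in this check.

```typescript
// Parameter names treated as session IDs (assumed list; extend for your stack).
const SESSION_PARAMS = new Set(["jsessionid", "sid", "phpsessid"]);

// Returns the URL without session-ID query parameters,
// or null when the URL contained none and needs no redirect.
function stripSessionParams(rawUrl: string): string | null {
  const url = new URL(rawUrl);
  // Collect matching keys first, then delete, so iteration is not disturbed.
  const sessionKeys: string[] = [];
  url.searchParams.forEach((_value, key) => {
    if (SESSION_PARAMS.has(key.toLowerCase())) sessionKeys.push(key);
  });
  for (const key of sessionKeys) url.searchParams.delete(key);
  return sessionKeys.length > 0 ? url.toString() : null;
}

const cleaned = stripSessionParams("https://yoursite.com/products?color=red&sid=abc123");
console.log(cleaned);
```

In a request handler, a non-null result would typically become the target of a permanent redirect, with the session itself carried in a cookie.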
Detection
- ID: crawl-budget
- Severity: high
- What to look for: Count all infinite scroll implementations, faceted navigation patterns, and URL parameters containing session IDs (JSESSIONID, sid, PHPSESSID, etc.). Enumerate every route that could generate duplicate crawlable URLs. Check whether faceted pages use noindex, canonical, or URL parameter handling to prevent crawl waste.
- Pass criteria: Zero infinite scroll patterns that create duplicate crawlable URLs. All faceted navigation routes use noindex or canonical tags. Zero session IDs appear in any URL query parameters. Report even on pass: "X faceted routes found, all with crawl controls; 0 session ID patterns detected."
- Fail criteria: At least 1 infinite scroll implementation creates duplicate URLs, or at least 1 session ID appears in URLs, or faceted pages are crawlable as separate URLs without noindex/canonical controls.
- Skip (N/A) when: The project has no pagination, infinite scroll, or faceted navigation patterns.
- Detail on fail: "Search results page uses faceted filters; 12 filter combinations crawlable as separate URLs without noindex" or "Session ID (JSESSIONID) appears in URL query parameters on 3 routes".
- Remediation: Crawl budget is finite. Manage faceted navigation with canonicals or noindex in app/products/page.tsx:

    // For faceted navigation, noindex filter combinations.
    // A static `metadata` export cannot see the request, so compute the flag per request.
    export async function generateMetadata({ searchParams }) {
      const params = await searchParams // a Promise in Next.js 15+, a plain object before
      // Assumption: every parameter except `page` is a filter; adjust to your parameter names
      const hasActiveFilters = Object.keys(params).some((key) => key !== 'page')
      return {
        robots: { index: !hasActiveFilters }, // noindex if filters active
      }
    }

  For pagination, give each page its own crawlable URL (for example https://yoursite.com/products?page=2) and link adjacent pages with ordinary anchors. Avoid placing 'rel-next' / 'rel-prev' keys under metadata.other: that option renders <meta> tags rather than <link rel="next"> elements, and Google no longer uses rel=next/prev as an indexing signal.
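Part of the "What to look for" step can be automated over a list of crawled or logged URLs. The following is a minimal sketch under stated assumptions: `auditUrls`, `CrawlAudit`, and the `FACET_PARAMS` list are hypothetical names, and the sketch only classifies URLs by query parameter. It does not check whether a faceted page actually carries noindex or canonical controls, which still requires inspecting the rendered page.

```typescript
// Session-ID parameter names from this check, compared case-insensitively.
const SESSION_ID_PARAMS = new Set(["jsessionid", "sid", "phpsessid"]);
// Hypothetical facet parameter names; replace with the filters your routes accept.
const FACET_PARAMS = new Set(["color", "size", "brand", "sort"]);

interface CrawlAudit {
  sessionIdUrls: string[]; // URLs carrying a session ID in the query string
  facetedUrls: string[];   // URLs carrying at least one facet parameter
}

function auditUrls(urls: string[]): CrawlAudit {
  const audit: CrawlAudit = { sessionIdUrls: [], facetedUrls: [] };
  for (const raw of urls) {
    // Gather lower-cased query parameter names for this URL.
    const keys: string[] = [];
    new URL(raw).searchParams.forEach((_value, key) => keys.push(key.toLowerCase()));
    if (keys.some((k) => SESSION_ID_PARAMS.has(k))) audit.sessionIdUrls.push(raw);
    if (keys.some((k) => FACET_PARAMS.has(k))) audit.facetedUrls.push(raw);
  }
  return audit;
}

const auditResult = auditUrls([
  "https://yoursite.com/products",
  "https://yoursite.com/products?color=red&size=m",
  "https://yoursite.com/search?q=shoes&JSESSIONID=ABC123",
]);
console.log(
  `${auditResult.facetedUrls.length} faceted URLs found; ` +
    `${auditResult.sessionIdUrls.length} session ID patterns detected`
);
```

Any entry in `sessionIdUrls` is a fail on its own; entries in `facetedUrls` are candidates to inspect for crawl controls.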
Taxons
History
- 2026-04-18·v1.0.0·Initial import from seo-advanced·automated