A sitemap that leaks pending, rejected, or suspended listings feeds search engines URLs that return 404, soft-404, or thin-content pages, which wastes crawl budget and can depress rankings for the listings that are actually approved. Worse, indexed-then-removed URLs can surface moderator-rejected content (spam, fraud, or policy violations) in Google results for days before re-crawl, exposing the directory to reputational risk and potential liability under platform-intermediary rules such as DSA Article 16 on notice-and-action mechanisms.
Medium because the leak is public-facing and wastes crawl budget but does not directly expose user data or credentials.
Filter the sitemap query to only approved rows and exclude drafts, pending, rejected, and suspended states at the SQL layer, not in post-processing. Edit app/sitemap.ts so the where clause is explicit:
const listings = await db.listings.findMany({
  where: { status: 'approved', published_at: { not: null } },
  select: { id: true, updated_at: true }
})
Add a regression test that seeds one row per status and asserts the sitemap length equals the approved count.
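The regression check described above could look roughly like the sketch below. It is not the project's actual test: it swaps the real database for an in-memory array so it runs without any setup, and every name in it (ListingRow, selectSitemapRows, buildSitemap, seedOnePerStatus, checkSitemapApprovedOnly) is hypothetical. In a real suite the same seed rows would go into a test database and the assertion would run against the output of app/sitemap.ts.

```typescript
// Sketch only: an in-memory array stands in for the listings table.
// All type and function names here are hypothetical.
type ListingStatus = 'draft' | 'pending' | 'approved' | 'rejected' | 'suspended'

interface ListingRow {
  id: number
  status: ListingStatus
  published_at: Date | null
  updated_at: Date
}

interface SitemapEntry {
  url: string
  lastModified: Date
}

// Mirrors the where clause { status: 'approved', published_at: { not: null } }.
function selectSitemapRows(rows: ListingRow[]): ListingRow[] {
  return rows.filter(row => row.status === 'approved' && row.published_at !== null)
}

function buildSitemap(rows: ListingRow[]): SitemapEntry[] {
  return selectSitemapRows(rows).map(row => ({
    url: `https://yourdomain.com/listings/${row.id}`,
    lastModified: row.updated_at
  }))
}

// One row per status (ids 1-5), plus an approved row that was never published (id 6).
// The suspended row keeps its published_at so only the status filter excludes it.
function seedOnePerStatus(): ListingRow[] {
  const now = new Date('2024-01-01T00:00:00Z')
  const statuses: ListingStatus[] = ['draft', 'pending', 'approved', 'rejected', 'suspended']
  const rows: ListingRow[] = statuses.map((status, index) => ({
    id: index + 1,
    status,
    published_at: status === 'approved' || status === 'suspended' ? now : null,
    updated_at: now
  }))
  rows.push({ id: 6, status: 'approved', published_at: null, updated_at: now })
  return rows
}

// Throws if anything other than the single approved, published row is listed.
function checkSitemapApprovedOnly(): void {
  const sitemap = buildSitemap(seedOnePerStatus())
  if (sitemap.length !== 1) {
    throw new Error(`expected 1 sitemap entry, got ${sitemap.length}`)
  }
  if (sitemap[0]?.url !== 'https://yourdomain.com/listings/3') {
    throw new Error(`unexpected sitemap URL: ${sitemap[0]?.url}`)
  }
}

checkSitemapApprovedOnly()
```

Seeding an approved-but-unpublished row and a suspended row that still has a publish date exercises both halves of the filter, so the check fails if either condition is dropped from the query.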
ID: directory-submissions-moderation.spam-prevention.sitemap-approved-only
Severity: medium
What to look for: Examine the sitemap generation logic (e.g., /sitemap.xml or dynamic sitemap routes). Check that the sitemap only includes listings with status = 'approved'. Pending, rejected, or suspended listings should not be in the sitemap.
Pass criteria: Enumerate all code paths that generate sitemap entries. The sitemap lists only approved, published listings, and pending, rejected, or suspended listings are excluded. Confirm this with at least 1 verified instance.
Fail criteria: Sitemap includes pending or rejected listings, or all listings are included regardless of status.
Skip (N/A) when: The project doesn't have a sitemap or has no moderation system.
Detail on fail: "Sitemap includes pending listings that aren't visible on the site. Search engines index pages that don't exist." or "Sitemap includes all listings regardless of status."
Remediation: Filter sitemap by status:
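// app/sitemap.ts (Next.js)
import type { MetadataRoute } from 'next'
// Assumed import path for the project's database client; adjust to match your setup.
import { db } from '@/lib/db'

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  // Allow-list approved, published rows in the query itself so pending,
  // rejected, and suspended listings never reach the sitemap.
  const listings = await db.listings.findMany({
    where: { status: 'approved', published_at: { not: null } },
    select: { id: true, updated_at: true }
  })

  return listings.map(listing => ({
    url: `https://yourdomain.com/listings/${listing.id}`,
    lastModified: listing.updated_at
  }))
}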