Your robots.txt should live at yourdomain.com/robots.txt and not block pages you want indexed. A surprising number of production sites ship with Disallow: / leftover from staging — run the Robots.txt Generator if you're unsure.
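A quick way to catch a leftover staging rule is to parse the file and ask whether the pages you care about are fetchable. This is a minimal sketch using Python's standard `urllib.robotparser`; the sample robots.txt strings and the `MUST_INDEX` list are made-up placeholders, and the standard-library parser follows the original robots.txt rules, so treat it as a sanity check rather than a model of Googlebot's exact matching.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: list[str], user_agent: str = "Googlebot") -> list[str]:
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


# Hypothetical examples: a staging leftover and a healthier production file.
STAGING_LEFTOVER = "User-agent: *\nDisallow: /\n"
HEALTHY = "User-agent: *\nDisallow: /cart/\n"

# Placeholder list of pages that must stay crawlable.
MUST_INDEX = ["https://yourdomain.com/", "https://yourdomain.com/services/"]
```

Calling `blocked_urls(STAGING_LEFTOVER, MUST_INDEX)` returns every URL in the list, which is the signal that `Disallow: /` shipped to production.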
Technical SEO Checklist.
Forty points across crawlability, indexation, performance, schema, internal architecture, mobile usability, and security. The technical surface that either amplifies your content work or quietly suppresses it.
01 · Crawlability
Sitemap at /sitemap.xml (or /sitemap-index.xml for large sites). Include only canonical URLs you want indexed — no redirects, no noindex pages, no parameterized URLs. Submit to Google Search Console and reference in robots.txt.
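The inclusion rules above can be sketched as a filter over crawl data. The `pages` record shape (`url`, `status`, `noindex`, `canonical`) is an assumption for illustration, not the export format of any particular crawler.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def sitemap_candidates(pages: list[dict]) -> list[str]:
    """Keep URLs that return 200, are indexable, self-canonical, and have no query string."""
    keep = []
    for page in pages:
        if page["status"] != 200 or page["noindex"]:
            continue  # redirects, errors, and noindex pages stay out
        if page["canonical"] != page["url"]:
            continue  # canonicalized elsewhere
        if urlsplit(page["url"]).query:
            continue  # parameterized URL
        keep.append(page["url"])
    return keep


def build_sitemap(urls: list[str]) -> str:
    """Serialize the URLs as a minimal sitemap <urlset> document."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")
```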
Every important page should be reachable via internal links from the homepage in 3 clicks or fewer. Orphan pages (no internal links pointing to them) often don't get crawled or rank poorly. Screaming Frog's 'Orphan Pages' report finds these.
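If you have the internal link graph from a crawl export, click depth is a breadth-first search from the homepage. A rough sketch, assuming `links` maps each page path to the paths it links to and `all_pages` is the full URL inventory (both hypothetical inputs). Pages the search never reaches are reported as orphans, which also catches pages linked only from other unreachable pages.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str = "/") -> dict[str, int]:
    """Breadth-first search from the homepage; returns the minimum clicks to reach each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def audit_depth(links: dict[str, list[str]], all_pages: set[str],
                home: str = "/", max_depth: int = 3) -> tuple[list[str], list[str]]:
    """Return (pages unreachable from home, pages deeper than max_depth)."""
    depths = click_depths(links, home)
    orphans = sorted(p for p in all_pages if p not in depths)
    too_deep = sorted(p for p, d in depths.items() if d > max_depth)
    return orphans, too_deep
```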
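Check Google Search Console → Settings → Crawl Stats. Large numbers of crawls to faceted navigation, sort parameters, or paginated archives waste budget. Google retired the URL Parameters tool in 2022, so prune with robots.txt Disallow rules for the parameter patterns, plus canonicals on the variants you leave crawlable. Noindex keeps a page out of the index, but Google still has to crawl the page to see the directive, so it does little for crawl budget on its own.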
If you've blocked specific bots (GPTBot, ClaudeBot, bad scrapers), test that your desired traffic still reaches you. Over-aggressive user-agent blocks sometimes catch Googlebot variants you didn't intend to block.
Consistency matters. If your canonical is https://site.com/, internal links shouldn't mix http://site.com, www.site.com, or trailing-slash variants. Crawlers follow what they find; inconsistencies create redirect hops.
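A small linter over extracted hrefs makes these variants easy to spot. This sketch assumes a canonical host of `site.com` and a no-trailing-slash policy for everything except the root; both are illustrative choices, so swap in your own rules.

```python
from urllib.parse import urlsplit


def link_issues(href: str, canonical_host: str = "site.com",
                trailing_slash: bool = False) -> list[str]:
    """Flag scheme, host, and trailing-slash variants on an absolute internal link."""
    parts = urlsplit(href)
    if not parts.netloc:
        return []  # relative link: inherits scheme and host from the page
    issues = []
    if parts.scheme != "https":
        issues.append("non-https scheme")
    if parts.netloc.lower() != canonical_host:
        issues.append("non-canonical host")
    # The root path ("" or "/") is exempt from the trailing-slash policy.
    if parts.path not in ("", "/") and parts.path.endswith("/") != trailing_slash:
        issues.append("trailing-slash mismatch")
    return issues
```

Each flagged link is a redirect hop waiting to happen; fixing the href at the source removes the hop.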
02 · Indexation
Run Screaming Frog or Sitebulb on the full site. Every URL in the sitemap should return 200. Redirects (301/302) indicate stale sitemap entries. 404s mean the sitemap, or a link or canonical somewhere on the site, still points at a page that no longer exists.
Self-referential canonicals are the default. A page at /blog/post-a should have <link rel='canonical' href='https://site.com/blog/post-a' />. Wrong canonicals cause de-indexation of the page they're on in favor of some other URL.
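Checking this across a crawl comes down to extracting the canonical from each page and comparing it with the URL you fetched. A minimal sketch with the standard-library HTML parser; the status labels are invented for this example, and the comparison is an exact string match, so normalize URLs first if your site mixes variants.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel:
            self.canonicals.append(attr.get("href"))


def canonical_status(html: str, page_url: str) -> str:
    """Classify a page as 'self', 'points elsewhere', 'missing', or 'multiple'."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonicals:
        return "missing"
    if len(finder.canonicals) > 1:
        return "multiple"
    return "self" if finder.canonicals[0] == page_url else "points elsewhere"
```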
Author pages, tag archives, internal search results, thank-you pages, and most paginated sequences should be noindex, follow. Keeps the index clean and prevents algorithmic dilution.
GSC's Page indexing report (formerly Coverage) splits URLs into Indexed and Not indexed, with a reason listed for each excluded group. The gap between sitemap-submitted and indexed should be small. Large 'Crawled - currently not indexed' or 'Discovered - currently not indexed' buckets need investigation.
Identical or near-identical pages under different URLs dilute rankings. Use canonicals for legitimate duplicates (print versions, filter variants). Use 301 redirects to merge pages with the same intent. Use content audit to find these.
Multi-region sites need hreflang tags to map equivalent pages across locales. Missing or one-directional hreflang means Google can show users the wrong variant. GSC's International Targeting report was retired in 2022, so validate hreflang with a crawler such as Screaming Frog or Sitebulb.
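The most common hreflang error is a missing return link: page A lists page B as an alternate, but B does not list A back. A rough check, assuming you have already extracted each page's hreflang annotations into a `{page_url: {lang: alternate_url}}` mapping (a hypothetical shape for this sketch):

```python
def missing_return_links(hreflang: dict[str, dict[str, str]]) -> list[tuple[str, str]]:
    """Return (source, target) pairs where target does not list source as an alternate."""
    missing = []
    for source, alternates in hreflang.items():
        for target in alternates.values():
            if target == source:
                continue  # self-reference is expected
            if source not in hreflang.get(target, {}).values():
                missing.append((source, target))
    return missing
```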
03 · Performance & Core Web Vitals
2.5 seconds or less is Google's threshold for 'good' LCP (Largest Contentful Paint). Optimize the hero image (WebP, responsive srcset, preload), inline critical CSS, defer non-critical JS. Test via PageSpeed Insights on the specific URL patterns users enter from.
INP (Interaction to Next Paint) replaced FID as a Core Web Vital in March 2024. It measures real interaction responsiveness; 200 ms or less counts as 'good'. Long-running JavaScript (analytics scripts, heavy third-party embeds) is the usual culprit. Break up long tasks with scheduler.yield() where the browser supports it, or setTimeout as a fallback.
Images without dimensions, ads/iframes without reserved space, and fonts swapping cause layout shift. Set width and height on every image. Reserve aspect-ratio slots for embeds. Use font-display: swap with size-adjust.
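Finding images without explicit dimensions is easy to automate on a page's HTML. A minimal sketch with the standard-library parser; it only looks at `width` and `height` attributes on `<img>`, so images sized purely through CSS `aspect-ratio` will show up as false positives.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Collects the src of every <img> missing a width or height attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        if not attr.get("width") or not attr.get("height"):
            self.missing.append(attr.get("src") or "(no src)")


def images_missing_dimensions(html: str) -> list[str]:
    audit = ImageAudit()
    audit.feed(html)
    return audit.missing
```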
WebP is supported everywhere that matters. AVIF adds additional compression (30-50% smaller) with wide browser support as of 2024. Use srcset to serve appropriate sizes. Most CMS platforms and frameworks handle this automatically now.
Google Fonts via <link> is acceptable with <link rel='preconnect'> and display=swap. Self-hosting and subsetting can be faster. Never use @font-face to load from a slow CDN without preconnect.
Analytics, chat widgets, A/B test platforms, heatmap trackers — each adds weight and main-thread work. Audit what's actually earning its keep every 12 months. Defer non-critical scripts; load chat widgets only after interaction.
04 · Schema & structured data
Single canonical LocalBusiness JSON-LD block, referenced by @id from other schemas. Use the Schema Generator to build valid markup for your specific category.
Breadcrumbs help Google understand site structure and show visibly in SERPs. Match the BreadcrumbList JSON-LD to the visible breadcrumb nav on the page. Missing breadcrumbs = missing rich result opportunity.
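Generating the JSON-LD from the same trail that renders the visible breadcrumb is one way to keep the two in sync. A sketch, assuming the trail is available as a list of `(name, url)` pairs; the function name is illustrative.

```python
import json


def breadcrumb_list(trail: list[tuple[str, str]]) -> dict:
    """Build a schema.org BreadcrumbList from an ordered (name, url) trail."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


def as_script_tag(data: dict) -> str:
    """Wrap the structured data in the script tag that goes into the page."""
    return '<script type="application/ld+json">' + json.dumps(data) + "</script>"
```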
Every blog post gets BlogPosting schema with author referenced via @id to your single canonical Person entity (typically on /about). Don't duplicate full Person data across 166 posts — reference by ID.
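Referencing by ID looks like this in practice: the post's `author` carries only an `@id`, and the full Person entity lives once on the about page. The `#person` and `#article` fragment identifiers below are assumed naming conventions, not something schema.org requires.

```python
import json

# Hypothetical @id of the single canonical Person entity defined on /about.
PERSON_ID = "https://site.com/about#person"


def blog_posting(url: str, headline: str, date_published: str) -> dict:
    """Build BlogPosting JSON-LD whose author is a reference, not a copy."""
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "@id": f"{url}#article",
        "headline": headline,
        "datePublished": date_published,
        "mainEntityOfPage": url,
        "author": {"@id": PERSON_ID},  # reference only; no duplicated Person fields
    }


def to_json_ld(data: dict) -> str:
    return json.dumps(data, indent=2)
```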
Google restricted FAQPage rich results in August 2023: they now appear mainly for well-known, authoritative government and health sites. Valid FAQPage markup does no harm elsewhere, but don't expect the rich result. Apply to /faqs hub and individual FAQ pages; skip for pages with 1-2 inline questions (Google ignores those anyway).
Test via Google's Rich Results Test (search.google.com/test/rich-results) and Schema.org Validator. Both should return zero errors. Warnings are often safe to ignore; errors suppress rich results.
05 · Internal architecture
Keep URLs to 3-4 segments max (/services/local-seo-management, not /services/local/seo/management/overview). Use dashes not underscores, all lowercase, short and descriptive. Avoid dates, IDs, session params, or CMS fingerprints in URLs.
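These rules are mechanical enough to lint. A rough sketch; the four-digit check is a crude stand-in for "dates or IDs in URLs" and the issue labels are made up for this example.

```python
import re
from urllib.parse import urlsplit


def url_issues(url: str, max_segments: int = 4) -> list[str]:
    """Flag URL-structure problems: depth, case, underscores, parameters, date-like parts."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    issues = []
    if len(segments) > max_segments:
        issues.append("too many segments")
    if parts.path != parts.path.lower():
        issues.append("uppercase characters")
    if "_" in parts.path:
        issues.append("underscores")
    if parts.query:
        issues.append("query parameters")
    if any(re.fullmatch(r"\d{4}", s) for s in segments):
        issues.append("date-like segment")
    return issues
```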
Main nav should hold 5-8 top-level items. Deeper categorization happens in section landing pages, footer, or in-body navigation. Overloaded header navs confuse users and bloat every page's internal-link graph.
Every internal link should use descriptive anchor text that matches the linked page's topic. 'Read the GBP optimization checklist' beats 'click here'. Helps users and feeds relevance signals to Google.
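Generic anchors are simple to surface from page HTML. A minimal sketch; the `GENERIC` phrase list is a starting assumption to extend for your own site, and image-only links (no text) are not flagged.

```python
from html.parser import HTMLParser


class AnchorAudit(HTMLParser):
    """Collects (href, text) for links whose anchor text is a generic phrase."""

    GENERIC = {"click here", "here", "read more", "learn more", "this page"}

    def __init__(self):
        super().__init__()
        self._href = None
        self._text = []
        self.generic = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())  # collapse whitespace
            if text.lower() in self.GENERIC:
                self.generic.append((self._href, text))
            self._href = None


def generic_anchors(html: str) -> list[tuple[str, str]]:
    audit = AnchorAudit()
    audit.feed(html)
    return audit.generic
```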
Group footer links into 3-5 categories (Services, Resources, Contact, Legal). Flat 40-link footers signal low information architecture. See homepage footer for a working example.
Related posts at the bottom of each blog post, related services on each service page, related FAQs on FAQ pages. These are ranking-signal goldmines — internal links between topically-related pages.
Visible breadcrumbs help users orient and help Google understand hierarchy. Match the visible breadcrumb to BreadcrumbList JSON-LD (see #20).
06 · Mobile usability
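Google retired the standalone Mobile-Friendly Test (search.google.com/test/mobile-friendly) in December 2023. Test with Lighthouse in Chrome DevTools or PageSpeed Insights instead: homepage, service page, blog post, FAQ, checkout. Google indexes mobile-first, so a page that fails on mobile is evaluated on that failing version for most queries.
Buttons, links, and form controls must be large enough to hit accurately on a touchscreen. The GSC Mobile Usability report that flagged 'Clickable elements too close' was retired in December 2023, so check tap targets with an accessibility audit (WCAG 2.2 sets a 24×24 CSS pixel minimum). Fix before it becomes a ranking issue.
Aim for 16px minimum body text and line-height 1.5+ for comfortable reading. Don't trust designers claiming 14px is 'fine': 16px is the browser default, and iOS Safari zooms the page when a focused form field uses text smaller than 16px.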
Test at 360px (small Android), 375px (iPhone), 414px (large phone). Any horizontal scroll indicates an element wider than the viewport — typically an image without max-width:100%, a wide table, or overflowing code blocks.
<input type='email'> triggers the email keyboard. <input type='tel'> triggers the numpad. autocomplete='email' / 'tel' / 'name' lets browsers autofill. Tiny details that meaningfully improve mobile form completion.
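A form audit along these lines can be scripted. This sketch guesses a field's purpose from its `name` attribute using a small `HINTS` table, which is a heuristic assumption and will miss fields with unusual names.

```python
from html.parser import HTMLParser

# Hypothetical name fragments mapped to the input type they should use.
HINTS = {"email": "email", "phone": "tel", "tel": "tel"}


class FormAudit(HTMLParser):
    """Flags inputs with a mismatched type or a missing autocomplete attribute."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        input_type = (attr.get("type") or "text").lower()
        for hint, expected in HINTS.items():
            if hint in name and input_type != expected:
                self.problems.append(f"{name}: type should be {expected}")
                break
        if input_type in {"text", "email", "tel"} and not attr.get("autocomplete"):
            self.problems.append(f"{name}: missing autocomplete")


def form_input_problems(html: str) -> list[str]:
    audit = FormAudit()
    audit.feed(html)
    return audit.problems
```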
07 · Security & HTTPS
HTTPS is non-negotiable in 2024. Let's Encrypt is free and takes 10 minutes. Mixed content (HTTP assets on HTTPS pages) triggers browser warnings and hurts trust signals. Use HSTS headers for extra protection.
Configure a 301 redirect from HTTP to HTTPS at the server or CDN level. Test by typing http://yoursite.com — should land on HTTPS. Missing redirects cause duplicate content + mixed-canonical issues.
Strict-Transport-Security: max-age=31536000; includeSubDomains tells browsers to always use HTTPS. It closes the HTTP-first window on repeat visits; the very first visit is only protected if the domain is also on the browsers' HSTS preload list. Set at the CDN level if possible.
Content-Security-Policy prevents XSS. X-Frame-Options prevents clickjacking. X-Content-Type-Options stops MIME sniffing. Test at securityheaders.com — aim for A grade minimum.
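For a quick local check before running an external scanner, compare a response's headers against the list above. A minimal sketch that takes headers as a plain dict, so it works with whatever HTTP client you already use; the `REQUIRED` list mirrors the headers named in this section and is not a complete security baseline.

```python
from typing import Optional

REQUIRED = [
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
]


def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """Return required headers absent from the response (names compared case-insensitively)."""
    present = {name.lower() for name in headers}
    return [h for h in REQUIRED if h.lower() not in present]


def hsts_max_age(value: str) -> Optional[int]:
    """Extract max-age (in seconds) from a Strict-Transport-Security value, if present."""
    for directive in value.split(";"):
        name, _, raw = directive.strip().partition("=")
        raw = raw.strip().strip('"')
        if name.strip().lower() == "max-age" and raw.isdigit():
            return int(raw)
    return None
```

A `max-age` below 31536000 (one year) is worth flagging, since that is the value recommended above.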
Common admin paths (/wp-admin, /admin, /login) get constant brute-force attempts. Require 2FA on every admin account. Consider IP allowlists or basic auth on CMS admin for extra protection.
Outdated WordPress plugins are the #1 source of hacked sites. Enable auto-updates for minor versions. Review majors quarterly. Remove unused plugins — every installed plugin is attack surface whether activated or not.
Technical checklists find issues. Audits prioritize them.
You can work the 40-point list yourself. If you want someone with 15 years of triage experience looking at your site to tell you which issues matter most for your specific business — that's the audit.