Local SEO Technical Crawlability Audit
Key Points
- If Google can't crawl and index your pages, those pages simply don't exist in search results — no matter how good your content is.
- The most common crawlability killers are accidental: a misconfigured robots.txt file or a leftover "noindex" tag from a developer test can wipe out months of SEO work.
- Google Search Console is free and gives you a direct line to exactly what Google sees (and can't see) on your site.
- 404 errors on important service pages don't just hurt that page — they waste Google's crawl budget and signal site neglect.
- Most crawlability fixes require no developer help when you're using WordPress with the right plugins.
Why This Matters for Your Business
Imagine spending three months building citations, collecting reviews, and writing optimized content — then finding out Google couldn't index half your site the whole time. It happens more often than you'd think.
Crawlability is the foundation of everything in local SEO. Before Google can rank your plumbing service page for "emergency plumber in Denver," it has to find the page, access it, and add it to its index. If anything blocks that process, all your other work is wasted effort.
For local businesses specifically, crawlability issues often concentrate on the most important pages: service pages, location pages, and contact pages. A dentist's office might have a perfectly set up Google Business Profile but have their "teeth whitening" service page accidentally blocked from indexing — and never know it.
A 2023 study by Ahrefs found that over 65% of small business websites have at least one significant crawlability issue. The good news: most are simple to fix once you know where to look.
Getting Started
Before diving into tools, start with the simplest check possible. Open a browser in incognito mode and type this into Google's search bar:
site:yourdomain.com
Replace "yourdomain.com" with your actual domain. Google will list pages it has indexed for that domain. The result count is an estimate rather than an exact figure, so treat it as a rough signal. If you expect 20 pages but only see 6, you have a problem. If you see 200 pages when you only have 20 real pages, you may have duplicate content issues.
Write down roughly how many pages Google has indexed. That number is your baseline.
Understanding Crawlability vs. Indexability
These two terms get confused, but the distinction matters.
Crawlability means Google's bot (called Googlebot) can physically access and read your page. If a page is blocked in your robots.txt file, Googlebot can't even see it.
Indexability means Google can add that page to its search index. Even if Googlebot can crawl a page, a "noindex" tag tells Google "don't include this in search results."
Both issues result in the same outcome — your page doesn't appear in search — but they're fixed differently.
Common Crawlability Blockers
Robots.txt problems: Your robots.txt file lives at yourdomain.com/robots.txt. Check it right now. You're looking for lines like Disallow: /services/ or Disallow: / because those block Googlebot from a whole section of your site or, in the second case, from the entire site. WordPress also has a "Discourage search engines from indexing this site" option in the Settings > Reading panel. In current WordPress versions it adds a noindex tag to every page; older versions added a Disallow: / rule to robots.txt instead. Either way, go to Settings > Reading in your WordPress dashboard and confirm the box is NOT checked.
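If you would rather test a list of URLs than read the rules by eye, the check can be sketched in Python with the standard library's robots.txt parser. This is a minimal sketch, not Google's own matcher: the sample rules and URLs are hypothetical placeholders, and Python's parser does not handle the * and $ wildcards the way Googlebot does, so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; paste in the text from yourdomain.com/robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /services/
"""

# Hypothetical pages you want Google to reach.
IMPORTANT_URLS = [
    "https://yourdomain.com/",
    "https://yourdomain.com/services/teeth-whitening/",
    "https://yourdomain.com/contact/",
]


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    for url in blocked_urls(ROBOTS_TXT, IMPORTANT_URLS):
        print("Blocked by robots.txt:", url)
```

With the sample rules above, only the teeth whitening page is reported, because it sits under the disallowed /services/ folder.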
Noindex tags: A developer might add a noindex tag during site construction to prevent a half-finished site from appearing in search. If that tag is never removed, your live pages stay hidden. In WordPress, the Yoast SEO or Rank Math plugin lets you check each page's indexability without touching any code.
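For anyone comfortable running a script, here is a rough sketch of what "checking for a noindex tag" means under the hood. It assumes you already have the page's HTML as text (for example, copied from View Source) and, optionally, the value of the X-Robots-Tag response header. The function and class names are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html, x_robots_tag=""):
    """Return True if the HTML or the X-Robots-Tag header value asks not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_directives = [d.strip().lower() for d in x_robots_tag.split(",")]
    # "none" is shorthand for "noindex, nofollow".
    blocking = {"noindex", "none"}
    return bool(blocking.intersection(parser.directives + header_directives))


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print("noindex found:", has_noindex(sample))
```

The header check matters because a noindex instruction can be sent in the HTTP response instead of the page source, where a quick look at the HTML would miss it.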
Broken internal links: If a page links to another page that no longer exists (a 404 error), Googlebot wastes time chasing dead ends. On small sites with limited crawl budgets, this can mean important pages get skipped.
Finding and Fixing 404 Errors
A 404 error means a page doesn't exist. When a real customer or Googlebot follows a link to a dead page, they hit a wall.
How to Find 404 Errors for Free
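Google Search Console (free): Open the Page indexing report (Indexing > Pages; older versions of Search Console called it Coverage) and look for "Not found (404)" in the list of reasons pages aren't indexed. Click it to see exactly which URLs Google found broken and when it last crawled them. Inspecting an individual URL also shows the referring page, where Google has recorded one.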
Screaming Frog SEO Spider (free up to 500 URLs): Download Screaming Frog, enter your domain, and run a crawl. Under the "Response Codes" tab, filter for 4xx errors. This shows every broken link on your site in minutes.
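For a quick spot-check of a single page alongside these tools, the same idea can be sketched in a few lines of Python: pull the internal links out of a page's HTML, request each one, and keep the ones that answer with a 404. This is only an illustration. The function names and the User-Agent string are invented for this example, and it checks one page at a time rather than crawling a whole site.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def internal_links(page_url, html):
    """Return the absolute, same-host links found in the page's HTML, sorted."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    links = set()
    for href in collector.hrefs:
        absolute = urljoin(page_url, href).split("#")[0]  # drop #fragment anchors
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc == host:
            links.add(absolute)
    return sorted(links)


def fetch_status(url):
    """Return the final HTTP status code for a URL, or None if the request failed."""
    request = Request(url, headers={"User-Agent": "crawl-audit-sketch"})
    try:
        with urlopen(request, timeout=10) as response:
            return response.status
    except HTTPError as error:
        return error.code
    except OSError:
        return None


def broken_links(page_url, html, get_status=fetch_status):
    """Return the internal links on the page that respond with a 404."""
    return [link for link in internal_links(page_url, html) if get_status(link) == 404]
```

The get_status parameter is there so the status lookup can be swapped out, for example with a dictionary of known results when you want to try the logic without making real requests.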
How to Fix 404 Errors
Once you've found dead pages, you have two options:
- Recreate the page if the content still matters (a service you still offer, a blog post that earned backlinks)
- Set up a 301 redirect pointing the dead URL to the most relevant live page
In WordPress, the Redirection plugin (free) lets you set up 301 redirects without touching code. You simply type in the old URL and the new destination, click save, and Google's next crawl will follow the redirect.
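If you have more than a handful of redirects to enter, it can help to plan them as a simple old-path-to-new-path table first and sanity-check it. The sketch below is one way to do that: it flags redirects that loop, that take more than one hop, or that end on a page that isn't live. The paths are hypothetical examples and the function names are invented here; nothing in it talks to the Redirection plugin itself.

```python
# Hypothetical redirect plan: old path -> new path.
REDIRECTS = {
    "/old-whitening/": "/services/teeth-whitening/",
    "/promo-2019/": "/old-whitening/",
}

# Hypothetical set of paths that currently exist on the site.
LIVE_PATHS = {"/", "/contact/", "/services/teeth-whitening/"}


def resolve_redirect(path, redirects, max_hops=10):
    """Follow a path through the redirect map and return (final_path, hops).

    Raises ValueError if the redirects loop or the chain is too long.
    """
    seen = [path]
    while path in redirects:
        path = redirects[path]
        if path in seen or len(seen) > max_hops:
            chain = " -> ".join(seen + [path])
            raise ValueError("Redirect loop or overly long chain: " + chain)
        seen.append(path)
    return path, len(seen) - 1


def audit_redirects(redirects, live_paths):
    """Return a list of human-readable problems found in the redirect plan."""
    problems = []
    for source in redirects:
        try:
            final, hops = resolve_redirect(source, redirects)
        except ValueError as error:
            problems.append(str(error))
            continue
        if hops > 1:
            problems.append(f"{source} takes {hops} hops to reach {final}; point it there directly")
        if final not in live_paths:
            problems.append(f"{source} ends at {final}, which is not a live page")
    return problems


if __name__ == "__main__":
    for problem in audit_redirects(REDIRECTS, LIVE_PATHS):
        print(problem)
```

In the sample plan, /promo-2019/ is flagged because it reaches the live page only by way of another redirect; pointing it straight at /services/teeth-whitening/ clears the warning.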
Using Google Search Console for Your Audit
Google Search Console is your most important free tool for crawlability. If you haven't verified your site yet, do that first at search.google.com/search-console.
The Coverage Report
Go to Indexing > Pages. Google now calls this the Page indexing report, though many guides still use its older name, Coverage. The older layout split pages into four categories (Valid, Valid with warnings, Error, and Excluded). The current report uses two:
- Indexed: Pages successfully indexed. These are good.
- Not indexed: Pages Google couldn't index, chose not to index, or was told not to. This covers what used to be listed as Error and Excluded. Each URL is grouped under a reason, such as "Not found (404)", "Blocked by robots.txt", or "Excluded by 'noindex' tag". Review these carefully: some exclusions are correct (thank-you pages, admin pages), others are mistakes that need fixing right away.
Click into each reason to see the specific URLs affected. For a local business site, your goal is to have all your service pages, location pages, and your contact page showing as indexed.
The URL Inspection Tool
If you're not sure whether a specific page is indexed, paste its URL into the URL Inspection Tool at the top of Search Console. It tells you the last time Google crawled it, whether it's indexed, and any issues found. You can also click "Request Indexing" to push Google to crawl the page sooner.
Checking XML Sitemaps
Your XML sitemap is a file that lists all the pages on your site, helping Google find and prioritize them. It typically lives at yourdomain.com/sitemap.xml.
Verify your sitemap is submitted in Search Console under Sitemaps. Check that it includes your important service and location pages, and that none of those URLs return errors when you open them in a browser.
In WordPress, Yoast SEO and Rank Math both generate and maintain sitemaps automatically, usually at yourdomain.com/sitemap_index.xml rather than /sitemap.xml.
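To confirm that your key pages are actually listed, you can compare the sitemap against your own list of must-have URLs. The following is a minimal sketch that assumes you have saved the sitemap's XML as text; the sample sitemap and URLs are placeholders, and the function names are invented for this example.

```python
import xml.etree.ElementTree as ET

# Standard namespace used by sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content; replace with the XML from your own sitemap.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yourdomain.com/</loc></url>
  <url><loc>https://yourdomain.com/contact/</loc></url>
</urlset>
"""

# Hypothetical pages that must appear in the sitemap.
KEY_URLS = [
    "https://yourdomain.com/contact/",
    "https://yourdomain.com/services/teeth-whitening/",
]


def sitemap_urls(sitemap_xml):
    """Return every <loc> value in a sitemap or sitemap index document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


def missing_from_sitemap(sitemap_xml, key_urls):
    """Return the key URLs that the sitemap does not list."""
    listed = set(sitemap_urls(sitemap_xml))
    return [url for url in key_urls if url not in listed]


if __name__ == "__main__":
    for url in missing_from_sitemap(SAMPLE_SITEMAP, KEY_URLS):
        print("Not in sitemap:", url)
```

If your site uses a sitemap index, the <loc> entries point to child sitemaps rather than pages, so run the check on each child file.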
Tools to Help
- Semrush Local SEO Tools - Complete local SEO toolkit
- Ahrefs - Rank tracking and competitor analysis
- Moz Local - Local SEO management platform
Next Steps
- Run site:yourdomain.com in Google and note how many pages are indexed
- Check yourdomain.com/robots.txt and confirm no important pages are blocked
- Verify your site in Google Search Console if you haven't already
- Review the Coverage report and fix any "Error" status pages first
- Download Screaming Frog and run a crawl to find all 404 errors
- Install the Redirection plugin in WordPress and set up 301 redirects for any dead URLs
- Confirm your XML sitemap is submitted in Search Console and includes your key pages
Common Mistakes to Avoid
Leaving "discourage search engines" enabled: This WordPress setting is meant for staging sites. It's the single most common crawlability mistake and takes 10 seconds to check.
Ignoring excluded pages in Search Console: "Excluded" doesn't always mean there's a problem, but it warrants a look. A service page showing as "Excluded by 'noindex' tag" is a serious problem disguised as a minor notice.
Building new pages without checking indexation: After you publish a new service page, use the URL Inspection Tool to confirm Google has indexed it. Don't assume it happened automatically.
Letting 404 errors accumulate: One or two 404s on minor pages aren't catastrophic. Dozens of 404 errors, especially on pages that earned backlinks or used to rank, actively hurt your site's authority.
Overlooking pagination and parameter URLs: E-commerce or appointment booking plugins sometimes generate thousands of unique URLs (filters, dates, session IDs). These can overwhelm your crawl budget. Make sure your robots.txt or canonical tags handle these correctly.
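To get a feel for how many of your URLs are really the same page with different parameters attached, you can group them by a stripped-down form. This is a rough sketch: the list of ignored parameters is an assumption made for illustration, so adjust it to match what your booking or shop plugin actually adds.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameters that do not change page content; adjust for your plugins.
IGNORED_PARAMS = {"sessionid", "date", "filter", "sort", "utm_source", "utm_medium", "utm_campaign"}


def canonical_form(url, ignored=IGNORED_PARAMS):
    """Return the URL with ignored query parameters and any #fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_variants(urls):
    """Map each canonical form to its variants, keeping only forms with duplicates."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_form(url)].append(url)
    return {canon: variants for canon, variants in groups.items() if len(variants) > 1}
```

Feed it the URL list exported from a crawl. Any group with many variants is a candidate for a canonical tag pointing at the stripped-down URL, or for a robots.txt rule covering the parameter.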
Frequently Asked Questions
Q: How do I know if my robots.txt is blocking Google? A: Paste the page's URL into the URL Inspection Tool in Google Search Console. If robots.txt is blocking the page, the result says so ("Blocked by robots.txt"). Google retired the standalone robots.txt Tester in 2023; its replacement, the robots.txt report under Settings, shows which robots.txt files Google found and any problems fetching them. You can also just open yourdomain.com/robots.txt in any browser and read it directly. Look for any "Disallow" rules that cover pages you want ranked.
Q: My site has 30 pages but Google only shows 12 indexed. Is that a problem? A: It depends on what the missing pages are. If the 12 indexed pages include all your service and location pages, the other 18 might be correctly excluded (privacy policy, thank-you pages, admin pages). Pull up the Coverage report in Search Console and check what's in the "Excluded" category to identify any service or location pages that should be indexed but aren't.
Q: How often should I run a crawlability audit? A: For most small business sites, a quarterly audit is sufficient. Run an additional check anytime you make major changes to your site — switching themes, adding new pages, migrating to a new host, or installing new plugins. Any of these can introduce new crawlability issues without obvious warning signs.