ChatGPT now has over 880 million monthly users. AI-referred traffic to websites grew 527% year-over-year. And according to Gartner's forecast, traditional search volume will decline 25% by 2026 as users shift to AI-powered alternatives.
The question is no longer "Should we care about AI search?" — it's "Can AI search engines even see our website?"
SEO Is Not Enough Anymore
Traditional SEO optimizes for Google's link-based ranking algorithm. You build backlinks, target keywords, and climb the SERPs. But when a user asks ChatGPT "What's the best CRM for small businesses?", there are no SERPs. The AI generates an answer by pulling from crawled content, structured data, and knowledge graphs.
This is where GEO (Generative Engine Optimization) comes in. GEO focuses on making your brand discoverable, understandable, and citable by AI engines — ChatGPT, Gemini, Claude, Perplexity, and DeepSeek.
But before you can optimize for AI answers, you need to know: does your website even pass the basic technical requirements?
That's what a GEO site audit tells you. And unlike AI visibility diagnosis (which requires querying multiple AI engines), a site audit checks things you can fix today — your robots.txt, your structured data, your metadata. These are the fastest wins in GEO.
The 4 Dimensions of a GEO Site Audit
A comprehensive GEO site audit evaluates your website across 4 technical dimensions, each with a weighted score:
Overall GEO Score = Crawler Access × 30% + Structured Data × 25% + Knowledge Graph × 20% + Content Basics × 25%
Let's break down each one.
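The weighting above is a straightforward weighted average. Here is a minimal sketch in TypeScript; the interface and function names are illustrative, not the actual API of any audit tool, and they assume each dimension has already been scored 0-100:

```typescript
// Hypothetical helper: combine the four dimension scores (each 0-100)
// using the weights described above. The weights sum to 1.0.
interface DimensionScores {
  crawlerAccess: number;   // weighted 30%
  structuredData: number;  // weighted 25%
  knowledgeGraph: number;  // weighted 20%
  contentBasics: number;   // weighted 25%
}

function overallGeoScore(d: DimensionScores): number {
  return Math.round(
    d.crawlerAccess * 0.30 +
    d.structuredData * 0.25 +
    d.knowledgeGraph * 0.20 +
    d.contentBasics * 0.25
  );
}
```

Note that crawler access carries the largest weight: if AI crawlers can't reach your site, the other dimensions barely matter.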
1. AI Crawler Access (30% of score)
What it checks: Whether your robots.txt file allows or blocks the 9 major AI crawlers — GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, Claude-Web, Google-Extended, PerplexityBot, Bytespider, and CCBot.
Why it matters: If AI crawlers can't access your site, your content simply doesn't exist to AI search engines. A recent study found that websites blocking GPTBot were cited 73% less often in ChatGPT responses compared to similar sites that allowed it.
The reality is alarming: about 5% of all domains block GPTBot, and among top news sites, the number jumps to 62%. Many website owners don't even realize their robots.txt is blocking AI crawlers — it may have been configured years ago before AI search existed, or a CDN like Cloudflare may have changed defaults to block AI bots automatically.
How to fix it: Open your robots.txt file and check for Disallow: / rules targeting AI user agents. If you want AI search engines to index your content, explicitly allow them. This is a 5-minute fix that can dramatically change your AI visibility.
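To see what this check looks like in practice, here is a simplified sketch that scans a robots.txt file for site-wide blocks against the 9 AI crawlers. It is not a full robots.txt parser (it ignores Allow overrides and multi-agent groups), and the function name is illustrative:

```typescript
// The 9 major AI crawler user agents discussed above.
const AI_CRAWLERS = [
  'GPTBot', 'ChatGPT-User', 'OAI-SearchBot', 'ClaudeBot', 'Claude-Web',
  'Google-Extended', 'PerplexityBot', 'Bytespider', 'CCBot',
];

// Simplified check: is this crawler disallowed from the whole site?
// Real robots.txt parsing has more rules (Allow lines, grouped agents);
// this only catches the common "Disallow: /" case.
function blocksCrawler(robotsTxt: string, userAgent: string): boolean {
  let applies = false; // inside a group naming this agent (or *)?
  let blocked = false;
  for (const raw of robotsTxt.split('\n')) {
    const line = raw.split('#')[0].trim(); // strip comments
    const idx = line.indexOf(':');
    if (idx === -1) continue;
    const field = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (field === 'user-agent') {
      applies = value === '*' || value.toLowerCase() === userAgent.toLowerCase();
    } else if (field === 'disallow' && applies && value === '/') {
      blocked = true;
    }
  }
  return blocked;
}
```

Running this against your own robots.txt for each entry in AI_CRAWLERS quickly surfaces which engines are locked out.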
2. Structured Data (25% of score)
What it checks: Whether your homepage contains JSON-LD structured data, and how many recommended Schema.org types are present — Organization, WebSite, Product, FAQPage, Article, and BreadcrumbList.
Why it matters: Structured data helps AI engines understand what your content is about, not just what it says. It defines entities, relationships, and context. According to a Data World study, GPT-4's accuracy jumped from 16% to 54% when content included structured data.
Without JSON-LD, AI engines have to guess what your page is about. With it, you're giving them a machine-readable map of your content. This is the difference between being accurately cited and being hallucinated about.
Important for 2026: AI crawlers like GPTBot and ClaudeBot cannot execute JavaScript. If your JSON-LD is injected client-side (via Google Tag Manager or React hydration), AI crawlers will miss it entirely. Embed it directly in server-rendered HTML.
How to fix it: Add at minimum an Organization and WebSite JSON-LD block to your homepage <head>. If you have products, articles, or FAQs, add those schema types too. Most CMS platforms have plugins that generate this automatically.
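If you prefer to generate the markup yourself, here is a sketch of the minimal Organization and WebSite blocks rendered as server-side HTML. The field values are placeholders for your own brand data:

```typescript
// Wrap a Schema.org object in the <script> tag AI crawlers can read.
function jsonLdScript(schema: object): string {
  return `<script type="application/ld+json">${JSON.stringify(schema)}</script>`;
}

// Placeholder values: substitute your real brand name, URL, and logo.
const organization = {
  '@context': 'https://schema.org',
  '@type': 'Organization',
  name: 'Example Inc',
  url: 'https://example.com',
  logo: 'https://example.com/logo.png',
};

const webSite = {
  '@context': 'https://schema.org',
  '@type': 'WebSite',
  name: 'Example Inc',
  url: 'https://example.com',
};

// Embed this directly in server-rendered HTML, inside <head>.
const headSnippet = jsonLdScript(organization) + '\n' + jsonLdScript(webSite);
```

Because the output is plain HTML, it survives even when JavaScript never runs, which is exactly what non-JS-executing crawlers like GPTBot need.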
3. Knowledge Graph Presence (20% of score)
What it checks: Whether your brand has entries in Wikidata, English Wikipedia, Chinese Wikipedia, and Baidu Baike — the 4 major knowledge sources that AI models rely on.
Why it matters: AI models use knowledge graphs as ground truth for factual answers. When ChatGPT says "Notion is a productivity tool founded in 2013", that information likely comes from Wikidata or Wikipedia, not from Notion's homepage.
Yext's 2026 research confirms that every major AI system — from ChatGPT to Apple Intelligence — uses Wikidata for factual grounding. Brands with Wikidata entries get cited more frequently and more accurately. Without one, AI engines may hallucinate facts about your company or ignore it entirely.
Unlike a Wikipedia article (which requires meeting strict notability guidelines), a Wikidata entry is more accessible. Think of it as a machine-readable birth certificate for your brand — it tells AI systems your official name, founding date, industry, website, and key properties.
How to fix it: Start by creating a Wikidata entry for your brand with comprehensive properties. For Wikipedia, ensure your brand meets notability requirements and consider contributing to relevant articles. For Chinese market visibility, Baidu Baike is equally important.
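Before creating an entry, it's worth checking whether one already exists. Wikidata exposes a public search API (action=wbsearchentities on its MediaWiki endpoint); this sketch just builds the query URL, and the function name is my own:

```typescript
// Build a query against Wikidata's public entity search API to check
// whether a brand already has an entry.
function wikidataSearchUrl(brand: string): string {
  const params = new URLSearchParams({
    action: 'wbsearchentities',
    search: brand,
    language: 'en',
    format: 'json',
  });
  return `https://www.wikidata.org/w/api.php?${params.toString()}`;
}
```

Fetching this URL returns a JSON list of matching entities; an empty result is a strong signal that your brand has no machine-readable identity yet.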
4. Content Basics (25% of score)
What it checks: fundamental content quality signals — HTTPS, <title> tag, meta description, Open Graph tags (title, description, image), H1 heading, body content length (>500 characters), blog/articles link, and FAQ/help link.
Why it matters: These are the raw materials AI engines use to understand and extract information from your site. A page with no title, no meta description, and 50 words of content gives AI nothing to work with. According to BrightEdge's analysis, 86% of citations in AI responses come from brand-managed sources — your website, your listings, your pages.
Content quality directly impacts whether AI engines can extract useful, citable information. Thin pages, missing metadata, and the absence of a blog or FAQ section all mean less content for AI to learn from.
How to fix it: Ensure every important page has a descriptive title, meta description, and OG tags. Add enough body content (500+ characters minimum). If you don't have a blog or FAQ section, consider adding one — these are content types AI engines particularly favor for generating answers.
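Several of these signals can be spot-checked with a few lines of code. The sketch below runs simplified checks against a raw HTML string; a real audit should use a proper HTML parser, and the regexes here are illustrative only:

```typescript
// Simplified content-basics checks on a raw HTML string.
// Regex-based for brevity; production code should parse the DOM.
function contentBasics(html: string) {
  return {
    hasTitle: /<title>[^<]+<\/title>/i.test(html),
    hasMetaDescription: /<meta[^>]+name=["']description["'][^>]*>/i.test(html),
    hasOgImage: /<meta[^>]+property=["']og:image["'][^>]*>/i.test(html),
    hasH1: /<h1[^>]*>/i.test(html),
    // Strip tags and count remaining visible characters (>500 threshold).
    longEnough: html.replace(/<[^>]*>/g, '').trim().length > 500,
  };
}
```

Each false value in the result maps directly to one of the fixes above.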
These Are the Fastest Wins in GEO
Here's what makes a site audit different from a full AI visibility diagnosis: you can act on every finding immediately.
Unblocking AI crawlers? 5 minutes. Adding JSON-LD? An afternoon. Fixing missing meta tags? An hour. Creating a Wikidata entry? A weekend project.
You don't need to wait for AI engines to re-crawl the internet, and you don't need to run a multi-month content campaign. These are technical fixes with direct, measurable impact on whether AI can even access and understand your site.
A full GEO strategy goes much further — diagnosing how AI engines actually talk about your brand, analyzing competitor citations, and creating AI-optimized content. But this technical foundation is where it all starts.
We Open-Sourced Our Audit Engine
When we looked for an open-source GEO audit tool to point people to, we couldn't find one. GEO is still a young field, and there's no widely accepted, transparent standard for how site audits should be scored. Most tools keep their methodology behind closed doors.
We think that's a problem. If the GEO ecosystem is going to mature, practitioners need an open, auditable scoring standard — one that anyone can inspect, run, and improve.
So we recently extracted the site audit engine from RankWeave and released it as an open-source npm package: rankweave-geo-audit.
```
npm install rankweave-geo-audit
```

```
import { audit } from 'rankweave-geo-audit';

const result = await audit({
  domain: 'example.com',
  companyName: 'Example Inc',
});

console.log(result.overallScore);    // 0-100
console.log(result.dimensions);      // 4 dimension scores
console.log(result.recommendations); // bilingual action items (EN/ZH)
```
The scoring algorithm, the weights, the checks — everything is transparent and open for review. If you think a dimension should be weighted differently, or a new AI crawler should be added to the checklist, open an issue or PR.
It's worth noting that this covers the technical audit portion of GEO — the foundation. Full AI visibility involves additional dimensions like cross-engine brand diagnosis, sentiment analysis, and citation tracking, which require querying live AI engines. If you want the complete picture, try a free GEO health check on RankWeave — no signup required.
What Should You Do Next?
- Run a quick audit — use RankWeave's free check or install the open-source package to see your score
- Fix your robots.txt — this is the single highest-impact, lowest-effort change you can make
- Add JSON-LD — at minimum, Organization and WebSite schema on your homepage
- Check your Wikidata entry — create one if it doesn't exist
- Audit your content basics — title, meta description, OG tags, and sufficient content
The shift to AI search is not coming — it's already here. The brands that act now on their technical GEO foundation will have a significant head start when AI search becomes the primary way people discover products and services.
Want to understand the full picture of GEO vs traditional SEO? Or learn how to get ChatGPT to recommend your brand? Explore more on the RankWeave blog.