ORIGINAL RESEARCH

The HTML Myth in AI Visibility

I had a clean theory: enterprise sites that score well on AI visibility must have cleaner HTML than the ones that don't. So I built a scanner and ran it across 492 enterprise sites I'd already benchmarked. The correlation was -0.007.

492

enterprise sites tested

-0.007

correlation (Pearson r)

0.1pt

top vs bottom A11y delta

77.0

average A11y floor

THE HEADLINE

Clean HTML does not predict AI citation.

Top quartile by AI visibility score (n=123): accessibility score 77.3.

Bottom quartile by AI visibility score (n=123): accessibility score 77.1.

A 0.1-point gap. Out of 100. Across 246 sites. The Pearson correlation across the full 492-site sample was −0.007: statistics-speak for "you imagined it."

Klaviyo, Monday, Braze, Zendesk and Yotpo all sit between 73 and 91 on accessibility. Some are heavily AI-cited. Some are completely invisible. The HTML doesn't predict it either way.

The lazy AI-SEO advice currently doing the rounds — “fix your semantic HTML and the LLMs will find you” — does not survive contact with 492 enterprise sites.

THE DATA

Top quartile vs bottom quartile

Top quartile by AI score (n=123)

AI Visibility (out of 100): 48.1
Accessibility (out of 100): 77.3

Bottom quartile by AI score (n=123)

AI Visibility (out of 100): 2.0
Accessibility (out of 100): 77.1

What this means:

AI visibility varies wildly across enterprise sites — from 2 to 97 out of 100. Accessibility does not. It clusters tightly around 77/100. Almost every modern enterprise site already has competent semantic HTML. The variance in AI citation is being driven by something the front-end markup cannot see.

BOTH SIDES OF THE LINE

Sites with great accessibility live on both sides of the AI line

If clean HTML predicted AI citation, the high-accessibility names would cluster on one side. They don't. Pick any 5 from the top end of the A11y scale and you find some heavily cited and others completely invisible.

AI-CITED · HIGH A11Y

  • Zapier · A11y 100
  • ZoomInfo · A11y 99
  • Forsters · A11y 96
  • Adyen · A11y 95
  • Avanade · A11y 95

AI-INVISIBLE · HIGH A11Y

  • Klaviyo · A11y 78 · AI 2/100
  • Monday.com · A11y 73 · AI 2/100
  • Braze · A11y 78 · AI 2/100
  • Segment · A11y 87 · AI 2/100
  • Zendesk · A11y 91 · AI 2/100

PER-AXIS AVERAGES

Where enterprise sites still fall short

These are real accessibility findings — even though none of them predict AI citation, several still represent failures of basic inclusive design across hundreds of enterprise sites.

Lang attribute 4.3/5
Document title 4.6/5
Heading hierarchy 13.4/20
Alt text coverage 12.6/20
ARIA landmarks 10.7/15
Link text quality 13.4/15
Form labels 7.5/10
Skip-to-content link (weakest) 1.6/5
Button accessible name 4.2/5

Skip-to-content links: 1.6/5

Most enterprise sites are missing them entirely. This is the clearest accessibility failure across the dataset — and the one with the lowest fix cost. Worth noting: it still does not change your AI visibility.
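If you want to check your own homepage for a skip link, the heuristic can be sketched with nothing but the Python standard library. The study's scanner uses BeautifulSoup; the specific rule below (an in-page anchor among the first few links whose text mentions "skip") is an assumption for illustration, not the published scorer.

```python
from html.parser import HTMLParser

class SkipLinkChecker(HTMLParser):
    """Stdlib-only sketch of a skip-to-content check.

    A page passes if one of its first few anchors is an in-page
    link (href="#...") whose visible text mentions "skip".
    """

    def __init__(self, max_links: int = 5):
        super().__init__()
        self.max_links = max_links   # skip links should appear early
        self.links_seen = 0
        self.in_candidate = False    # currently inside an <a href="#..."> tag
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag == "a" and not self.found:
            href = dict(attrs).get("href", "")
            self.links_seen += 1
            if self.links_seen <= self.max_links and href.startswith("#"):
                self.in_candidate = True

    def handle_data(self, data):
        if self.in_candidate and "skip" in data.lower():
            self.found = True

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_candidate = False

def has_skip_link(html: str) -> bool:
    checker = SkipLinkChecker()
    checker.feed(html)
    return checker.found
```

A passing page looks like `<a href="#main">Skip to content</a>` as the first focusable element; most of the 492 sites in the sample have nothing of the kind.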

KEY INSIGHTS

What the data tells us

The accessibility floor is high — and uniform

Across 492 enterprise sites the standard deviation on accessibility was just 13.2 points (mean 77.0). Almost every modern enterprise site has competent markup. Most of the differentiation people obsess over does not exist at the homepage HTML level.

AI visibility variance is enormous — and unrelated

AI visibility ranged from 2 to 97 out of 100 (sd 28.4). The two metrics move independently. Whatever drives citation is not visible in the front-end source code.

Named cases on both sides break the narrative

Klaviyo, Monday, Braze, Segment and Zendesk all sit between 73 and 91 on accessibility while scoring 2/100 on AI visibility. Forsters and Adyen sit at 95-96 on accessibility AND score 96-97 on AI. Same markup quality, opposite citation outcomes.

Per-axis weaknesses are real but unrelated to AI

Skip-to-content links (1.6/5), alt text (12.6/20), and landmark structure (10.7/15) are genuinely weak across the enterprise sample. Worth fixing for human users. Not worth doing in the name of AI visibility.

If it isn't the HTML, what is it?

The next study tests four candidate predictors that operate outside the markup: Wikipedia presence, third-party press mentions, schema.org coverage, and domain age. Working hypothesis: external citation graph and entity authority dominate. Findings to follow after the May benchmark.

CITE THIS RESEARCH

Stats you can use

All stats from this study. Link to this page as your source.

−0.007

Pearson correlation between AI visibility and accessibility across 492 enterprise sites

0.1pt

accessibility gap between top-quartile and bottom-quartile AI-cited sites (n=246)

77.3

top-quartile AI sites' accessibility score (out of 100)

77.1

bottom-quartile AI sites' accessibility score (out of 100)

77.0

average accessibility score across all 492 enterprise sites — the floor

1.6/5

average skip-to-content link score — the weakest accessibility axis

12.6/20

average alt-text coverage — consistent gap across all sectors

492 / 524

valid scans from 524 unique domains in the registry

METHODOLOGY

How we ran this study

Sample

524 unique enterprise domains drawn from the GTM Signal Studio AI Visibility Registry — a unified canonical list pulled from three editions of the Enterprise Benchmark (March, April, May 2026) plus the UK Law Firms 2026 spinoff. 492 returned valid HTML and were included in the correlation. 32 returned errors or empty bodies (typically heavy SPAs).

Accessibility scan

Custom Python scanner running BeautifulSoup over the homepage HTML. Scored 9 axes for a total of 100: lang attribute (5), document title (5), heading hierarchy (20), alt text coverage (20), ARIA landmarks (15), link text quality (15), form labels (10), skip-to-content link (5), and button accessible name (5). Scoring is consistent with how axe and Lighthouse evaluate these signals.
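To make the axis scoring concrete, here is a stdlib-only sketch of two of the nine axes, using the weights stated above (lang attribute = 5, alt text coverage = 20). The real scanner uses BeautifulSoup, and the exact pass/fail rules below (any non-empty `lang`, proportional credit for non-empty `alt`) are assumptions for illustration, not the published scorer.

```python
from html.parser import HTMLParser

class AxisScorer(HTMLParser):
    """Sketch of two of the nine accessibility axes."""

    def __init__(self):
        super().__init__()
        self.has_lang = False
        self.imgs = 0
        self.imgs_with_alt = 0

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "html" and a.get("lang"):
            self.has_lang = True          # lang axis: pass/fail, 5 points
        elif tag == "img":
            self.imgs += 1
            if a.get("alt"):              # counts only non-empty alt text
                self.imgs_with_alt += 1

def score(html: str) -> dict:
    p = AxisScorer()
    p.feed(html)
    # alt-text axis: proportional credit out of 20; full marks if no images
    alt = 20.0 if p.imgs == 0 else 20.0 * p.imgs_with_alt / p.imgs
    return {"lang": 5 if p.has_lang else 0, "alt_text": round(alt, 1)}
```

The remaining seven axes (headings, landmarks, link text, form labels, skip link, button names, title) follow the same pattern: parse the static homepage HTML, score each axis against its weight, sum to 100.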

AI visibility score

Pulled from the existing benchmark dataset. Each site was already scored 0-100 across Citation Presence, Entity Recognition, Content Structure, and Citation Breadth using Scanner v2.0 (multi-API: OpenAI, Gemini, Brave, Tavily). Most recent scan per domain was used.

Correlation

Pearson r between AI total score and accessibility total score across 492 paired observations. Computed in pure Python — no statistical package, no preprocessing, no outlier removal. Quartile means computed by sorting on AI score and taking the top and bottom 25%.
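The computation described above fits in a few lines of pure Python. This is a sketch of the method as stated (no stats package, no preprocessing, no outlier removal), not the study's actual script:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient in pure Python."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def quartile_means(pairs):
    """Sort (ai_score, a11y_score) pairs by AI score; return the mean
    accessibility score of the bottom 25% and the top 25%."""
    pairs = sorted(pairs)                 # sorts by AI score first
    q = len(pairs) // 4
    bottom = sum(a11y for _, a11y in pairs[:q]) / q
    top = sum(a11y for _, a11y in pairs[-q:]) / q
    return bottom, top
```

Run over the 492 paired observations, `pearson_r` gives the headline −0.007 and `quartile_means` the 77.1 vs 77.3 split.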

Limitations

Homepage HTML only: JS-rendered SPAs may score lower on accessibility than their rendered pages deserve, because the scanner does not run Playwright. We are measuring static markup quality, not the full rendered DOM. The correlation conclusion is robust to this caveat: the effect we're testing is so close to zero (−0.007) that any plausible JS-rendering correction cannot move it into significance.

If it isn't the HTML, what is it?

The AI Visibility Audit tests the things that actually move the needle — citation presence, entity recognition, third-party authority — not a markup checklist anyone could run themselves.