Methodology

How Citations.io measures AI visibility.

Sampled measurement, confidence tiers, and what every number on your dashboard actually represents.

TL;DR

Citations.io measures AI visibility by sampling answers from four AI engines (ChatGPT, Perplexity, Gemini and Claude) against your tracked prompt set. Each answer is parsed for brand mentions, competitor mentions, position in the answer, and cited sources. Scores are aggregated per prompt, per engine and per market, and labelled with a confidence tier based on sample size. We do not measure search rankings; we measure how often, where, and with what evidence AI engines surface your brand in synthesized answers.

[Diagram: your brand, sampled across ChatGPT, Perplexity, Gemini and Claude. Each engine is sampled independently; per-engine confidence is reported separately.]

Engines: ChatGPT · Perplexity · Gemini · Claude
Default cadence: daily (weekly on trial)
Confidence floor: 1–20 answers → directional read
Strong read: 101+ answers → decision-grade

Sampled measurement, not rank tracking

We do not measure search engine rankings. Instead, we sample answers from each tracked AI engine against your prompt set, then parse every answer for brand mentions, competitor mentions, position in answer, and cited sources. Every metric on your dashboard is an aggregation over those analysed answers.
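The parsing step described above can be illustrated with a minimal sketch. It assumes plain-text answers, whole-word case-insensitive name matching, and citations that appear as bare URLs; `ParsedAnswer`, `find_mentions` and `parse_answer` are hypothetical names chosen for illustration, and real entity and citation extraction is likely more involved.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ParsedAnswer:
    brand_mentions: int
    competitor_mentions: Dict[str, int]
    # Character offset of the first brand mention; None if the brand is absent.
    first_mention_offset: Optional[int]
    cited_sources: List[str] = field(default_factory=list)


# Assumption: citations show up as bare http(s) URLs in the answer text.
URL_PATTERN = re.compile(r"https?://[^\s)\]>]+")


def find_mentions(text: str, name: str) -> List[int]:
    """Return start offsets of whole-word, case-insensitive matches of name."""
    pattern = re.compile(r"(?<!\w)" + re.escape(name) + r"(?!\w)", re.IGNORECASE)
    return [m.start() for m in pattern.finditer(text)]


def parse_answer(text: str, brand: str, competitors: List[str]) -> ParsedAnswer:
    """Extract mention counts, first-mention position and cited URLs."""
    brand_offsets = find_mentions(text, brand)
    return ParsedAnswer(
        brand_mentions=len(brand_offsets),
        competitor_mentions={c: len(find_mentions(text, c)) for c in competitors},
        first_mention_offset=brand_offsets[0] if brand_offsets else None,
        # Strip trailing sentence punctuation that the URL pattern may capture.
        cited_sources=[u.rstrip(".,;") for u in URL_PATTERN.findall(text)],
    )
```

Here "position in answer" is approximated by the character offset of the first brand mention; a rank among all mentioned brands would be an equally reasonable reading.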

Confidence tiers

Sample size matters. Every metric carries a confidence label so you know how much to trust it:

Three tiers

Directional (1–20 analysed answers)
Built on a small number of analysed answers. Useful for spotting patterns, not for board reporting.
Moderate
Enough analysed answers for trends to be meaningful; absolute numbers still move with new samples.
Strong (101+ analysed answers)
A deep sample behind the number. Figures and trends are stable enough to make decisions on.
Recommendation
Report directional reads as 'early signal'; wait for moderate or strong confidence before recommending material spend.
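As a sketch, the tier labels can be expressed as a function of sample size. The 1–20 (directional) and 101+ (strong) boundaries are the ones stated on this page; treating 21–100 as moderate is an inference from those two figures, and the "no data" label and the function name `confidence_tier` are hypothetical.

```python
def confidence_tier(sample_size: int) -> str:
    """Map a count of analysed answers to a confidence label."""
    if sample_size < 1:
        return "no data"      # hypothetical label for an empty sample
    if sample_size <= 20:
        return "directional"  # 1-20 answers, as stated on this page
    if sample_size <= 100:
        return "moderate"     # assumed: the gap between the two stated bounds
    return "strong"           # 101+ answers, as stated on this page
```

For example, `confidence_tier(12)` gives `"directional"` and `confidence_tier(250)` gives `"strong"`.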

What counts as an analysed answer

An analysed answer is one verbatim response captured from a tracked engine for a tracked prompt, parsed for entities and citations, and stored with timestamp, market, and language metadata. Answers that fail content extraction are excluded from sample counts.
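One way to picture this definition is as a record plus a filter. The sketch below is illustrative only: the `CapturedAnswer` type, its field names, and the convention of marking a failed extraction with `None` are assumptions, not the actual storage schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class CapturedAnswer:
    engine: str               # e.g. "Perplexity"
    prompt: str               # the tracked prompt that was sent
    market: str               # market metadata, e.g. "GB"
    language: str             # language metadata, e.g. "en"
    captured_at: datetime     # timestamp of capture
    raw_response: str         # verbatim response from the engine
    extracted_text: Optional[str]  # None when content extraction failed (assumed convention)


def analysed_answers(captured: List[CapturedAnswer]) -> List[CapturedAnswer]:
    """Keep only answers whose content extraction succeeded."""
    return [a for a in captured if a.extracted_text]


def sample_size(captured: List[CapturedAnswer]) -> int:
    """Failed extractions do not count towards the sample."""
    return len(analysed_answers(captured))
```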

Engine coverage

We currently track ChatGPT (OpenAI), Perplexity, Google Gemini and Claude (Anthropic). Each engine is sampled independently; per-engine confidence is reported separately so you can see where the picture is clearer.
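Independent per-engine reporting can be sketched as a simple group-by. The example assumes each analysed answer has already been reduced to an (engine, brand mentioned) pair; `per_engine_summary` and the `mention_rate` metric name are hypothetical and stand in for whatever metrics the dashboard reports.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def per_engine_summary(
    answers: Iterable[Tuple[str, bool]]
) -> Dict[str, Dict[str, float]]:
    """Aggregate sample size and mention rate separately for each engine."""
    totals = defaultdict(lambda: [0, 0])  # engine -> [answers seen, answers with a mention]
    for engine, mentioned in answers:
        totals[engine][0] += 1
        totals[engine][1] += int(mentioned)
    return {
        engine: {"sample_size": n, "mention_rate": hits / n}
        for engine, (n, hits) in totals.items()
    }
```

Because each engine keeps its own sample size, an engine with only a few analysed answers can carry a weaker confidence label than one with a deep sample, even for the same prompt set.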

Refresh cadence

Default scan frequency is daily on paid plans and weekly on trial. You can trigger a re-scan manually from Settings → Brand at any time (subject to plan rate limits).

Limitations

AI answers are non-deterministic: the same prompt can yield different answers across samples. That is precisely why we measure with sample-based confidence rather than reporting a single "rank". Use directional readings to spot trends; wait for moderate or strong confidence before making material content or PR decisions.
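To see why sample size matters, consider the uncertainty around a mention rate estimated from repeated samples. The sketch below uses the Wilson score interval, a standard statistical technique for proportions; it is chosen here for illustration and is not described on this page as the method behind the confidence tiers. The function name `wilson_interval` is hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(hits: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 is roughly 95% coverage)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = hits / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width
```

With a brand mentioned in 5 of 10 answers, the interval runs from about 24% to 76%; with 100 of 200, it narrows to about 43% to 57%. The observed rate is 50% in both cases, but only the larger sample pins it down.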
