Bot Traffic

Statalog detects bot traffic automatically on every site. There is no setting to enable, no extra snippet to install, and no <noscript> pixel — the standard tracking script is enough. Bot hits are stored separately, never count toward your billable pageviews, and never pollute your human stats.

What's detected

The default tracker recognises every modern crawler that renders JavaScript — and the major bots in 2026 all do.

Search engine crawlers: Googlebot, Bingbot, AppleBot, YandexBot, DuckDuckBot, Baiduspider.

AI and LLM crawlers: GPTBot (OpenAI), ClaudeBot (Anthropic), Google-Extended, PerplexityBot, CCBot (Common Crawl), Amazonbot, anthropic-ai.

Social media previewers: Slackbot, Twitterbot, FacebookExternalHit, LinkedInBot, Discordbot.

SEO and monitoring tools: AhrefsBot, SemrushBot, MJ12bot, DotBot, UptimeRobot, Pingdom, and others.

Each request is matched against a curated User-Agent signature list at ingestion. Major search-engine bots are additionally cross-checked against published IP ranges where applicable, so spoofed User-Agent headers can't masquerade as Googlebot.

How bot hits behave

  • Excluded from all human-facing reports. Visitors, sessions, bounce rate, goals, funnels, time on page — every report uses the human-only view by default.
  • Excluded from billing. Bot hits never count toward your monthly pageview quota.
  • Available on the dedicated Bots page with a breakdown by bot family, page, and date so you can see exactly which crawler is hitting which URL.
  • Toggleable on individual reports if you want to see human + bot combined.

Why JavaScript-only is enough

Crawlers that don't execute JavaScript — Common Crawl, archive.org, legacy SEO bots — typically aren't actionable for site owners. The bots that matter for SEO and AI exposure tracking all render JavaScript:

Bot            Renders JS
Googlebot      Yes (headless Chrome, since 2019)
Bingbot        Yes (Edge engine, since 2019)
GPTBot         Yes
ClaudeBot      Yes
PerplexityBot  Yes
AppleBot       Yes
YandexBot      Yes
DuckDuckGo     Uses Bing's rendered index

So the standard JavaScript tracker captures the bots you actually want to know about — without the privacy-and-trust friction of asking customers to install an extra <noscript> pixel.

Excluding specific traffic

If you spot a User-Agent or IP range you'd rather drop entirely (a custom monitoring tool, a partner's scraper), use the IP / UA blocklist in your site settings — those requests are discarded before reaching ClickHouse and don't appear anywhere, including the Bots page.
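Conceptually, the blocklist is a pre-ingestion filter. A minimal sketch, assuming a CIDR list for IPs and substring matching for User-Agents (the actual matching rules in site settings may differ):

```python
import ipaddress

# Hypothetical blocklist entries, as they might be entered in site settings.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # e.g. a partner's scraper
BLOCKED_UA_SUBSTRINGS = ["MyMonitor/"]                       # e.g. a custom monitoring tool

def is_blocklisted(ip: str, user_agent: str) -> bool:
    """True means the request is dropped before any row is written."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in BLOCKED_NETWORKS):
        return True
    return any(s in user_agent for s in BLOCKED_UA_SUBSTRINGS)
```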

DNT (Do Not Track)

Statalog honours the DNT: 1 browser header unconditionally. Requests carrying it are discarded at ingestion regardless of bot status — no visitor ID is computed, no row is written.
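The rule is a simple short-circuit at the front of the ingestion path. A sketch (header handling simplified; real header lookups are case-insensitive):

```python
def should_ingest(headers: dict[str, str]) -> bool:
    """DNT: 1 short-circuits ingestion: no visitor ID is computed, no row is written."""
    return headers.get("DNT", "").strip() != "1"
```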

FAQ

Do I need to do anything to enable bot tracking? No. It's automatic on every site, on every plan.

Will bot hits use up my monthly pageview quota? No. Bot traffic is excluded from billable usage.

Can I see how often Googlebot crawls my site? Yes — open the Bots page in your dashboard. You'll see Googlebot, GPTBot, ClaudeBot and every other detected crawler, broken down by URL and date.

What if a brand-new bot's User-Agent isn't in your list yet? The signature list is updated regularly. If you spot uncategorised traffic that looks like a bot, report it and it'll be added in the next release. In the meantime, the IP/UA blocklist lets you exclude it manually.

Do you still offer the noscript pixel? No — earlier versions had an opt-in <noscript> pixel for catching non-JS crawlers. We removed it because the bots it caught (Common Crawl, archive.org, legacy SEO tools) aren't the ones customers act on, and the pixel created unnecessary friction. Modern search engines and AI crawlers all render JavaScript and are detected by the standard tracker.