unMTA

Bot Signatures

Label open and click tracking events generated by bots, security scanners, and automated link preview services so you can filter them out of engagement metrics.

Overview

Bot signatures let you identify open and click-tracking events that weren't generated by a real recipient. Email security gateways, inbox-preview services, AI crawlers, and mailbox providers that proxy images (such as Apple Private Relay or Google Image Proxy) all fetch tracking pixels and follow links automatically. Without a way to tell them apart from real recipients, those requests show up as opens and clicks and inflate your engagement metrics.

When a tracking request matches a bot signature, unMTA labels the event with a bot field identifying the kind of automation that produced it — but still records it. See Event Labeling below for how this shows up on Open and Click webhook payloads.

Bot signatures come in two layers, merged at match time:

Layer | Scope | Managed by
Default | Shared across all clusters | unMTA. The larger curated dataset (Apple Private Relay CIDRs, AI crawlers, security gateways, etc.), updated over time. Can be disabled individually per cluster.
Custom | Per cluster | You. Add your own rules to cover bots unMTA doesn't ship defaults for.

Custom rules are always additive. Disabling a default removes that entry from the default layer for the cluster in question.
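
The layering can be pictured as a small merge. This is an illustrative sketch only, not unMTA's implementation: the `Signature` type, its `id` field, and the `effective_signatures` helper are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    id: str       # hypothetical identifier, used here only to express disables
    type: str     # "user_agent", "ip_exact", or "ip_cidr"
    pattern: str
    source: str


def effective_signatures(defaults, disabled_ids, custom):
    """Return the signatures a single cluster would match against.

    Defaults are shared; a cluster can only switch individual defaults off.
    Custom rules are appended on top and are not affected by disables.
    """
    enabled_defaults = [s for s in defaults if s.id not in disabled_ids]
    return enabled_defaults + list(custom)


defaults = [
    Signature("d1", "user_agent", "GoogleImageProxy", "google"),
    Signature("d2", "ip_cidr", "17.0.0.0/8", "apple"),
]
custom = [Signature("c1", "ip_exact", "203.0.113.10", "custom")]

# This cluster has disabled default "d2" and added one custom rule.
merged = effective_signatures(defaults, {"d2"}, custom)
```

Here `merged` holds the Google default and the custom rule; the disabled Apple range is skipped for this cluster only.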

Match Types

Every signature — default or custom — has a type that determines how the request is matched:

Type | Matches | Example pattern
user_agent | Case-insensitive substring match against the request's User-Agent header. | GoogleImageProxy
ip_exact | Exact match against the request's source IP address (IPv4 or IPv6). | 203.0.113.10
ip_cidr | CIDR-block match against the request's source IP address (IPv4 or IPv6). | 17.0.0.0/8, 2001:db8::/32

A request is labeled if any signature matches. User-Agent matching runs first, then exact-IP lookups, then CIDR ranges — the first hit wins and its source becomes the event's bot label.
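
The matching order described here can be sketched in a few lines of Python using the standard `ipaddress` module. This is a minimal illustration under assumptions, not the actual matcher: the `(type, pattern, source)` tuple layout, the sample signatures, and the `match_source` function are hypothetical.

```python
import ipaddress

# Illustrative signatures as (type, pattern, source) tuples; not the real dataset.
SIGNATURES = [
    ("user_agent", "GoogleImageProxy", "google"),
    ("ip_exact", "203.0.113.10", "custom"),
    ("ip_cidr", "17.0.0.0/8", "apple"),
    ("ip_cidr", "2001:db8::/32", "custom"),
]


def match_source(user_agent, source_ip, signatures=SIGNATURES):
    """Return the source of the first matching signature, or None."""
    ip = ipaddress.ip_address(source_ip)
    ua = (user_agent or "").lower()

    # 1. User-Agent: case-insensitive substring match.
    for sig_type, pattern, source in signatures:
        if sig_type == "user_agent" and pattern.lower() in ua:
            return source

    # 2. Exact IP: compare parsed addresses so equivalent spellings match.
    for sig_type, pattern, source in signatures:
        if sig_type == "ip_exact" and ipaddress.ip_address(pattern) == ip:
            return source

    # 3. CIDR: membership test; an IPv4 address is never inside an IPv6 block.
    for sig_type, pattern, source in signatures:
        if sig_type == "ip_cidr" and ip in ipaddress.ip_network(pattern):
            return source

    return None
```

Because the User-Agent pass runs first, a request from inside `17.0.0.0/8` that also carries a `GoogleImageProxy` User-Agent would come back as `google` in this sketch.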

Default Signatures

unMTA ships a curated set of default signatures covering common sources of bot traffic. The list is updated over time as new bots, proxies, and scanners appear.

What's Covered

Categories currently covered include:

  • Mailbox-provider image proxies — Apple Private Relay, Google Image Proxy, Yahoo Mail Proxy
  • Link-preview services — Microsoft BingPreview, Teams URL Preview
  • Email security gateways — Proofpoint, Mimecast, Barracuda, Symantec, FireEye, MessageLabs, and others
  • AI and LLM crawlers — ChatGPT, Anthropic ClaudeBot, OpenAI GPTBot, Amazonbot
  • Generic HTTP clients — curl, wget, python-requests, Go-http-client, HeadlessChrome, and similar automation tooling
  • Apple Private Relay egress ranges — the CIDRs published by Apple
  • Other provider egress ranges — added as vendors publish authoritative lists (e.g. Microsoft Defender/SafeLinks, additional security vendors)

Disabling a Default Per Cluster

Defaults apply to every cluster automatically. If one of them is filtering real engagement for your audience, you can disable it for a specific cluster without affecting any other cluster.

  1. Navigate to the Bot Signatures page from the sidebar
  2. Open the Defaults tab
  3. Use the source filter or search box to find the signature
  4. Toggle the switch at the end of the row to disable it

Disabling is per-cluster — the same default can be on for one cluster and off for another. You can re-enable it at any time by toggling the switch back on.

Defaults themselves aren't editable — only their enabled state per cluster. If you need to match a bot unMTA doesn't already cover, add a custom rule.

Custom Signatures

Custom signatures are yours to manage. Use them to:

  • Flag events from an internal security scanner or link-checker you control
  • Cover a bot that unMTA's defaults don't handle yet
  • Tag a specific range of IPs that is consistently generating non-human opens

Adding a Custom Rule

  1. Navigate to the Bot Signatures page
  2. Click Add custom rule
  3. Choose the match type (User Agent, IP Address, or IP CIDR)
  4. Enter the pattern
  5. Optionally, enter a label to help you remember what the rule is for
  6. Click Save

Pattern Requirements

Match type | Requirement
user_agent | Any non-empty string up to 1000 characters. Matched case-insensitively as a substring of the User-Agent header.
ip_exact | A valid IPv4 or IPv6 address.
ip_cidr | A valid CIDR block. IPv4 prefixes must be in the range /0 to /32; IPv6 prefixes must be in the range /0 to /128.

For user-agent rules, prefer the most distinctive portion of the header — for example BadBot/ rather than Mozilla/5.0, which would also match real browsers.
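
If you generate custom rules programmatically, it can help to check patterns locally before submitting them. The sketch below mirrors the requirements above using Python's standard `ipaddress` module. The `validate_pattern` helper is hypothetical, and its handling of CIDR blocks with host bits set is an assumption, since that case is not specified here.

```python
import ipaddress

MAX_USER_AGENT_LENGTH = 1000  # limit from the requirements above


def validate_pattern(match_type, pattern):
    """Return True if the pattern meets the requirements for its match type."""
    if match_type == "user_agent":
        return 0 < len(pattern) <= MAX_USER_AGENT_LENGTH
    try:
        if match_type == "ip_exact":
            ipaddress.ip_address(pattern)  # raises ValueError if invalid
            return True
        if match_type == "ip_cidr":
            _, slash, prefix = pattern.partition("/")
            if not slash or not prefix.isdigit():
                return False  # require an explicit numeric prefix length
            # Assumption: blocks with host bits set (e.g. 10.1.2.3/8) are
            # accepted. ip_network rejects prefixes above /32 or /128.
            ipaddress.ip_network(pattern, strict=False)
            return True
    except ValueError:
        return False
    return False  # unknown match type
```

For example, `validate_pattern("ip_cidr", "17.0.0.0/33")` is rejected because the prefix exceeds the IPv4 maximum.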

Deleting a Custom Rule

  1. Open the Custom rules tab
  2. Click the menu button on the rule and select Delete
  3. Confirm

Deletion takes effect immediately in the cluster's configuration. MTAs stop matching on the rule once the next config sync completes.

Prefetch Detection

In addition to signature matching, unMTA flags pixel or click requests that arrive implausibly soon after the message was sent.

If the request arrives less than 5 seconds after the send and no user-agent or IP signature matched, the event is labeled with bot.source set to "prefetch". This catches link scanners and preview fetchers that don't match any known signature: an open or click within 5 seconds of delivery is almost always automation rather than a real recipient.

More specific signature hits always win over prefetch, so a fast Apple Mail Privacy Protection load is still labeled apple.
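
The precedence between signature hits and prefetch detection can be sketched as follows. This is an illustration under assumptions, not unMTA's code: the `label_source` function and its arguments are hypothetical, and only the 5-second threshold comes from the text above.

```python
from datetime import datetime, timedelta, timezone

PREFETCH_THRESHOLD = timedelta(seconds=5)  # threshold stated above


def label_source(signature_source, sent_at, requested_at):
    """Pick the label for a tracking hit.

    signature_source is the source of a matched signature, or None.
    """
    if signature_source is not None:
        return signature_source  # a signature hit always beats prefetch
    if requested_at - sent_at < PREFETCH_THRESHOLD:
        return "prefetch"
    return None  # no label: treated as human


sent = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
fast = sent + timedelta(seconds=2)

fast_apple = label_source("apple", sent, fast)    # signature wins
fast_unknown = label_source(None, sent, fast)     # falls back to prefetch
```

In this sketch a request arriving exactly at the 5-second mark is not treated as a prefetch, since the rule is "under 5 seconds".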

Event Labeling

When a signature (or prefetch detection) matches, unMTA adds a nested bot object to the Open or Click webhook payload:

{
  "type": "Open",
  "id": "...",
  "recipient": "...",
  "bot": { "kind": "proxy", "source": "apple" }
}

Human traffic emits the event without the bot field, so treat the field's presence — not its truthiness — as the signal.
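
A webhook consumer might read the label like this. The sketch assumes the payload arrives as a JSON string shaped like the example above; the `parse_bot_label` helper is a hypothetical name, not part of any unMTA SDK.

```python
import json


def parse_bot_label(raw_body):
    """Return (kind, source) if the event has a bot field, else None."""
    event = json.loads(raw_body)
    if "bot" not in event:  # presence check, not truthiness
        return None
    bot = event["bot"]
    return bot.get("kind"), bot.get("source")


labeled = parse_bot_label(
    '{"type": "Open", "bot": {"kind": "proxy", "source": "apple"}}'
)
unlabeled = parse_bot_label('{"type": "Open"}')
```

Testing with `"bot" in event` rather than `event.get("bot")` keeps the check tied to the field's presence, as recommended above.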

bot.kind — the field you bucket on

A stable, two-value enum that tells you whether the hit was a mailbox-provider proxy standing in for a real recipient, or automation with no human behind it:

kind | Meaning
proxy | A mailbox-provider image or link proxy fetched the URL on a real recipient's behalf. Most consumers count these as engagement.
automation | Security scanners, AI crawlers, generic HTTP clients, prefetchers, and customer-defined bots. No evidence of human intent.
(absent) | No signature matched and the hit wasn't a prefetch. Treat as human.

This is the field to filter on. It's stable against future change — if Google were to launch a prefetcher tomorrow, the corresponding signature would ship with kind: "automation" and every dashboard that buckets on kind would be automatically correct without any rule rewrites.
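
Bucketing on kind might look like the sketch below. It assumes events are already parsed into dictionaries shaped like the webhook payload; `engagement_bucket` and `count_engaged_opens` are hypothetical helper names, and whether to count proxy hits as engagement is left as a parameter because that choice is yours.

```python
def engagement_bucket(event):
    """Map an Open or Click event to 'human', 'proxy', or 'automation'."""
    if "bot" not in event:
        return "human"
    return event["bot"]["kind"]


def count_engaged_opens(events, count_proxy=True):
    """Count opens, excluding automation and optionally excluding proxies."""
    wanted = {"human", "proxy"} if count_proxy else {"human"}
    return sum(
        1
        for e in events
        if e.get("type") == "Open" and engagement_bucket(e) in wanted
    )


events = [
    {"type": "Open"},
    {"type": "Open", "bot": {"kind": "proxy", "source": "apple"}},
    {"type": "Open", "bot": {"kind": "automation", "source": "security"}},
    {"type": "Click", "bot": {"kind": "automation", "source": "ai"}},
]
```

Nothing in this sketch inspects `source`, so a new signature shipped under an existing kind would fall into the right bucket without changes.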

bot.source — diagnostic detail

The fine-grained identifier for the specific provider or category that matched. Use it for slicing dashboards ("how much of our automation traffic is AI crawlers vs Proofpoint?"), not for bucketing human-vs-bot. Current source values include:

source | Typical kind | Matched against
apple | proxy | Apple Mail Privacy Protection
google | proxy | Google Image Proxy / Gmail image caching
yahoo | proxy | Yahoo Mail image proxy / link preview
microsoft | proxy or automation | BingPreview and Skype/Teams URL Preview (proxy); Defender/SafeLinks scanners (automation)
security | automation | Email security gateways (Proofpoint, Mimecast, Barracuda, etc.)
ai | automation | AI/LLM crawlers (ChatGPT, ClaudeBot, GPTBot, Amazonbot)
crawler | automation | Generic HTTP clients (curl, wget, python-requests, etc.)
custom | automation (default) | A custom rule you added matched
prefetch | automation | No UA or IP signature matched, but the hit arrived under the prefetch threshold

The source-to-kind relationship is set per-signature, not fixed globally: a future microsoft signature could ship with either kind depending on what it matches.
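
For the diagnostic slicing described in this section, a per-source breakdown of automation traffic could be computed as below. This is a minimal sketch assuming events are parsed webhook payloads; `automation_by_source` is a hypothetical helper.

```python
from collections import Counter


def automation_by_source(events):
    """Count automation-kind events per bot.source."""
    return Counter(
        e["bot"]["source"]
        for e in events
        if "bot" in e and e["bot"].get("kind") == "automation"
    )


sample = [
    {"type": "Open", "bot": {"kind": "automation", "source": "security"}},
    {"type": "Click", "bot": {"kind": "automation", "source": "security"}},
    {"type": "Open", "bot": {"kind": "automation", "source": "ai"}},
    {"type": "Open", "bot": {"kind": "proxy", "source": "apple"}},
    {"type": "Open"},
]
breakdown = automation_by_source(sample)
```

The filter uses kind and the grouping uses source, which keeps the human-vs-bot decision independent of the source list.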

API Reference

For programmatic management of bot signatures, see the Bot Signatures API documentation.
