Bot Signatures

Label open and click tracking events generated by bots, security scanners, and automated link preview services so you can filter them out of engagement metrics.

Overview

Bot signatures let you identify open and click-tracking events that weren't generated by a real recipient. Email security gateways, inbox-preview services, AI crawlers, and mailbox providers that proxy images (such as Apple Private Relay or Google Image Proxy) all fetch tracking pixels and follow links automatically. Without a way to tell them apart from real recipients, those requests show up as opens and clicks and inflate your engagement metrics.

When a tracking request matches a bot signature, unMTA labels the event with a bot field identifying the kind of automation that produced it — but still records it. See Event Labeling below for how this shows up on Open and Click webhook payloads.

Bot signatures come in two layers, merged at match time:

Layer	Scope	Managed by
Default	Shared across all clusters	unMTA. The larger curated dataset (Apple Private Relay CIDRs, AI crawlers, security gateways, etc.) — updated over time. Can be disabled individually per cluster.
Custom	Per cluster	You. Add your own rules to cover bots unMTA doesn't ship defaults for.

Custom rules are always additive: they can only add matches, never suppress one. Disabling a default removes that entry from the default layer for the cluster where you disabled it.
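The merge of the two layers can be pictured with a minimal sketch. The `Signature` class, its field names, and the `effective_signatures` helper are hypothetical and only illustrate the behavior described above; they are not unMTA's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    # Hypothetical shape, for illustration only.
    id: str
    type: str      # "user_agent", "ip_exact", or "ip_cidr"
    pattern: str
    source: str
    kind: str      # "proxy" or "automation"


def effective_signatures(defaults, disabled_default_ids, custom):
    """Return the signatures one cluster matches against.

    Defaults the cluster has disabled are dropped; custom rules are
    always appended, so they can only add matches.
    """
    enabled = [s for s in defaults if s.id not in disabled_default_ids]
    return enabled + list(custom)
```

Because the disabled set is passed per call, the same default can be active for one cluster and inactive for another.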

Match Types

Every signature — default or custom — has a type that determines how the request is matched:

Type	Matches	Example pattern
`user_agent`	Case-insensitive substring match against the request's `User-Agent` header.	`GoogleImageProxy`
`ip_exact`	Exact match against the request's source IP address (IPv4 or IPv6).	`203.0.113.10`
`ip_cidr`	CIDR-block match against the request's source IP address (IPv4 or IPv6).	`17.0.0.0/8`, `2001:db8::/32`

A request is labeled if any signature matches. User-Agent matching runs first, then exact-IP lookups, then CIDR ranges. The first hit wins, and its source is reported in the event's bot field.
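The evaluation order above can be sketched as follows, using Python's standard `ipaddress` module. The dictionary keys (`type`, `pattern`, `source`) and the `match_signature` function are assumptions made for this example, not unMTA's internal implementation.

```python
import ipaddress


def match_signature(user_agent, source_ip, signatures):
    """Return the first matching signature, or None.

    Order follows the text above: User-Agent substrings first,
    then exact IPs, then CIDR ranges.
    """
    ua = (user_agent or "").lower()
    ip = ipaddress.ip_address(source_ip)

    # 1. Case-insensitive substring match on the User-Agent header.
    for sig in signatures:
        if sig["type"] == "user_agent" and sig["pattern"].lower() in ua:
            return sig

    # 2. Exact source-IP match (IPv4 or IPv6).
    for sig in signatures:
        if sig["type"] == "ip_exact" and ipaddress.ip_address(sig["pattern"]) == ip:
            return sig

    # 3. CIDR-block match; an IPv4 address never matches an IPv6 network.
    for sig in signatures:
        if sig["type"] == "ip_cidr":
            network = ipaddress.ip_network(sig["pattern"], strict=False)
            if ip in network:
                return sig

    return None
```

With this ordering, a request from an address inside `17.0.0.0/8` that also carries a `GoogleImageProxy` User-Agent is attributed to the User-Agent signature, because that check runs first.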

Default Signatures

unMTA ships a curated set of default signatures covering common sources of bot traffic. The list is updated over time as new bots, proxies, and scanners appear.

What's Covered

Categories currently covered include:

Mailbox-provider image proxies — Apple Private Relay, Google Image Proxy, Yahoo Mail Proxy
Link-preview services — Microsoft BingPreview, Teams URL Preview
Email security gateways — Proofpoint, Mimecast, Barracuda, Symantec, FireEye, MessageLabs, and others
AI and LLM crawlers — ChatGPT, Anthropic ClaudeBot, OpenAI GPTBot, Amazonbot
Generic HTTP clients — curl, wget, python-requests, Go-http-client, HeadlessChrome, and similar automation tooling
Apple Private Relay egress ranges — the CIDRs published by Apple
Other provider egress ranges — added as vendors publish authoritative lists (e.g. Microsoft Defender/SafeLinks, additional security vendors)

Disabling a Default Per Cluster

Defaults apply to every cluster automatically. If one of them is filtering real engagement for your audience, you can disable it for a specific cluster without affecting any other cluster.

Navigate to the Bot Signatures page from the sidebar
Open the Defaults tab
Use the source filter or search box to find the signature
Toggle the switch at the end of the row to disable it

Disabling is per-cluster — the same default can be on for one cluster and off for another. You can re-enable it at any time by toggling the switch back on.

Defaults themselves aren't editable — only their enabled state per cluster. If you need to match a bot unMTA doesn't already cover, add a custom rule.

Custom Signatures

Custom signatures are yours to manage. Use them to:

Flag events from an internal security scanner or link-checker you control
Cover a bot that unMTA's defaults don't handle yet
Tag a specific range of IPs that is consistently generating non-human opens

Adding a Custom Rule

Navigate to the Bot Signatures page
Click Add custom rule
Choose the match type (User Agent, IP Address, or IP CIDR)
Enter the pattern
Optionally, enter a label to help you remember what the rule is for
Click Save

Pattern Requirements

Match type	Requirement
`user_agent`	Any non-empty string up to 1000 characters. Matched case-insensitively as a substring of the `User-Agent` header.
`ip_exact`	A valid IPv4 or IPv6 address.
`ip_cidr`	A valid CIDR block. IPv4 prefixes must be in the range `/0`–`/32`; IPv6 prefixes must be in the range `/0`–`/128`.

For user-agent rules, prefer the most distinctive portion of the header — for example BadBot/ rather than Mozilla/5.0, which would also match real browsers.
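If you generate rules programmatically, the requirements above can be checked before submission with a sketch like the one below. `validate_pattern` is a hypothetical helper, and two of its choices are assumptions: it requires an explicit numeric prefix for `ip_cidr`, and it tolerates host bits set in the address. The server's exact validation may differ.

```python
import ipaddress


def validate_pattern(match_type, pattern):
    """Return True if the pattern meets the documented requirements."""
    if match_type == "user_agent":
        # Any non-empty string up to 1000 characters.
        return 0 < len(pattern) <= 1000

    try:
        if match_type == "ip_exact":
            ipaddress.ip_address(pattern)
            return True
        if match_type == "ip_cidr":
            _, slash, prefix = pattern.partition("/")
            # Assumption: a bare address without "/<prefix>" is not a CIDR block.
            if not slash or not prefix.isdigit():
                return False
            # Rejects prefixes above /32 (IPv4) or /128 (IPv6).
            ipaddress.ip_network(pattern, strict=False)
            return True
    except ValueError:
        return False

    return False  # Unknown match type.
```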

Deleting a Custom Rule

Open the Custom rules tab
Click the menu button on the rule and select Delete
Confirm

Deleting is immediate. The rule is removed from the cluster, and MTAs stop matching on it once the next config sync completes.

Prefetch Detection

In addition to signature matching, unMTA flags pixel or click requests that arrive implausibly soon after the message was sent.

If the time between the send and the tracking request is under 5 seconds and no user-agent or IP signature matched, the event is labeled with source: "prefetch" and kind: "automation". This catches link scanners and preview fetchers that don't match any known signature: an open or click that arrives within 5 seconds of the send is almost always automation, not a real recipient.

More specific signature hits always win over prefetch, so a fast Apple Mail Privacy Protection load is still labeled with source: "apple".
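The precedence between signature hits and prefetch detection can be sketched as below. The function name, its arguments, and the shape of `signature_match` are hypothetical; only the 5-second threshold and the resulting labels come from the text above.

```python
from datetime import datetime, timedelta

PREFETCH_THRESHOLD = timedelta(seconds=5)


def label_event(sent_at: datetime, requested_at: datetime, signature_match):
    """Return the bot label for a tracking request, or None for human traffic.

    signature_match is the matched signature (a dict with "kind" and
    "source") or None when no user-agent or IP signature matched.
    """
    # A signature hit is more specific, so it always wins over prefetch.
    if signature_match is not None:
        return {"kind": signature_match["kind"], "source": signature_match["source"]}

    # No signature matched: fall back to the timing heuristic.
    if requested_at - sent_at < PREFETCH_THRESHOLD:
        return {"kind": "automation", "source": "prefetch"}

    return None
```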

Event Labeling

When a signature (or prefetch detection) matches, unMTA adds a nested bot object to the Open or Click webhook payload:

{
  "type": "Open",
  "id": "...",
  "recipient": "...",
  "bot": { "kind": "proxy", "source": "apple" }
}

Human traffic emits the event without the bot field, so treat the field's presence — not its truthiness — as the signal.
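A webhook consumer might apply the presence check like this. It is a minimal sketch: `is_bot_event` is a hypothetical helper, and the payload is assumed to have already been parsed from JSON into a dictionary.

```python
import json


def is_bot_event(payload: dict) -> bool:
    """True when the event carries a bot label.

    Checks for the key's presence rather than evaluating its value,
    as recommended above.
    """
    return "bot" in payload


labeled = json.loads('{"type": "Open", "bot": {"kind": "proxy", "source": "apple"}}')
unlabeled = json.loads('{"type": "Click"}')
print(is_bot_event(labeled), is_bot_event(unlabeled))
```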

`bot.kind` — the field you bucket on

A stable, two-value enum that tells you whether the hit was a mailbox-provider proxy standing in for a real recipient, or automation with no human behind it:

`kind`	Meaning
`proxy`	A mailbox-provider image or link proxy fetched the URL on a real recipient's behalf. Most consumers count these as engagement.
`automation`	Security scanners, AI crawlers, generic HTTP clients, prefetchers, and customer-defined bots. No evidence of human intent.
(absent)	No signature matched and the hit wasn't a prefetch. Treat as human.

This is the field to filter on. It's stable against future change — if Google were to launch a prefetcher tomorrow, the corresponding signature would ship with kind: "automation" and every dashboard that buckets on kind would be automatically correct without any rule rewrites.
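Bucketing on `kind` might look like the sketch below. The helper names and the `count_proxy_as_engagement` flag are hypothetical; whether proxy hits count as engagement is a reporting decision on your side, and the default here simply mirrors the common choice noted in the table.

```python
from collections import Counter


def bucket(event: dict) -> str:
    """Map an Open or Click event to "human", "proxy", or "automation"."""
    if "bot" not in event:
        return "human"
    return event["bot"]["kind"]


def engagement_counts(events, count_proxy_as_engagement=True):
    """Split events into engaged and excluded totals, using kind only."""
    counts = Counter(bucket(e) for e in events)
    engaged = counts["human"]
    if count_proxy_as_engagement:
        engaged += counts["proxy"]
    return {"engaged": engaged, "excluded": sum(counts.values()) - engaged}
```

Because the logic never inspects `source`, a newly shipped signature falls into the right bucket without any change to this code.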

`bot.source` — diagnostic detail

The fine-grained identifier for the specific provider or category that matched. Use it for slicing dashboards ("how much of our automation traffic is AI crawlers vs Proofpoint?"), not for bucketing human-vs-bot. Current source values include:

`source`	Typical `kind`	Matched against
`apple`	`proxy`	Apple Mail Privacy Protection
`google`	`proxy`	Google Image Proxy / Gmail image caching
`yahoo`	`proxy`	Yahoo Mail image proxy / link preview
`microsoft`	`proxy` or `automation`	BingPreview and Skype/Teams URL Preview (`proxy`); Defender/SafeLinks scanners (`automation`)
`security`	`automation`	Email security gateways (Proofpoint, Mimecast, Barracuda, etc.)
`ai`	`automation`	AI/LLM crawlers (ChatGPT, ClaudeBot, GPTBot, Amazonbot)
`crawler`	`automation`	Generic HTTP clients (`curl`, `wget`, `python-requests`, etc.)
`custom`	`automation` (default)	A custom rule you added matched
`prefetch`	`automation`	No UA or IP signature matched, but the hit arrived under the prefetch threshold

The source ↔ kind relationship is set per-signature, not fixed globally — a future microsoft signature could ship with either kind depending on what it matches.
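For the diagnostic slicing described in this section, a breakdown of automation traffic by `source` can be sketched as follows. `automation_by_source` is a hypothetical helper that operates on already-parsed webhook payloads.

```python
from collections import Counter


def automation_by_source(events):
    """Count automation-labeled events per bot.source value."""
    return Counter(
        e["bot"]["source"]
        for e in events
        if "bot" in e and e["bot"]["kind"] == "automation"
    )
```

Filtering on `kind` first keeps the breakdown correct even for a source such as `microsoft`, which can appear under either kind.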

API Reference

For programmatic management of bot signatures, see the Bot Signatures API documentation.
