How To Block Unknown Bots Using Cloudflare WAF Rules

Unknown bots drain your server, steal your content, and probe for vulnerabilities. After months of testing, I discovered how to block unknown bots, but the Cloudflare free plan still has limitations. In this guide, I will explain the exact expression to block unknown bots, why some still bypass it, and how to verify it’s working.

This is Rule 5 in the complete Cloudflare WAF strategy. If you haven’t implemented the four core protection rules, start there, since they address the most common attacks. Then layer this rule on top. The guide at https://techsaa.com/cloudflare-waf-rules/ explains how to understand and deploy the first four rules.

Understanding cf.client.bot: How Cloudflare Identifies Good Bots

Before you implement this rule, you need to understand cf.client.bot. This field protects legitimate bots while you block unknown ones.

What it does: cf.client.bot tells you if a request came from a verified bot approved by Cloudflare (for example, Googlebot, Bingbot, widely used link‑preview services, and other verified agents). Cloudflare maintains a directory of verified bots and exposes this signal so you can allow them in custom rules.

How it works: Cloudflare verifies bots using multiple methods:

Reverse DNS validation (confirming IP matches the bot’s domain)
Network ownership checks (IP/ASN‑based validation)
Managed allowlists and additional internal signals

These verification methods back the cf.client.bot signal.
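
The reverse DNS method listed above is commonly done as a forward-confirmed reverse DNS check. The sketch below illustrates that general technique in Python; it is not Cloudflare's implementation. The function names and the hostname suffixes are assumptions chosen for illustration, and the resolver functions are injectable so the logic can be exercised without network access.

```python
import socket
from typing import Callable, Iterable, Tuple

# Illustrative suffixes only (an assumption); check each crawler's own documentation.
ALLOWED_SUFFIXES: Tuple[str, ...] = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Return the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> Iterable[str]:
    """Return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_forward_confirmed(
    ip: str,
    allowed_suffixes: Tuple[str, ...] = ALLOWED_SUFFIXES,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], Iterable[str]] = forward_lookup,
) -> bool:
    """True if the IP's PTR name has an allowed suffix and resolves back to the IP."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:  # no PTR record or lookup failure
        return False
    if not hostname.endswith(allowed_suffixes):
        return False
    try:
        return ip in forward(hostname)
    except OSError:  # forward lookup failed
        return False
```

A spoofed User-Agent fails this kind of check because the attacker does not control the reverse DNS zone of the crawler's IP space.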

Why this matters: When you write and not cf.client.bot in your rule, you’re saying: “Block/challenge this traffic unless Cloudflare recognizes it as a verified good bot.” This protects your search engine rankings while stopping unknown bots.
Current verified bots: Cloudflare’s directory includes search engines, social media/link preview crawlers, monitoring tools, SEO tools, and other categories; the set evolves over time. Consult the Verified Bots directory on Cloudflare Radar for the current list.

The Rule Number 5 Expression to Block Unknown Bots

Based on months of testing and Cloudflare’s recommendations, here’s the expression that works on the Cloudflare free plan. This is the updated and deployed Rule #5.

(
  not (
    http.request.uri.path eq "/robots.txt" or
    ends_with(http.request.uri.path, "/ads.txt") or
    http.request.uri.path contains "/sitemap" or
    http.request.uri.path eq "/wp-sitemap.xml" or
    http.request.uri.path contains "/feed/"
  )
)
and
(
  ip.src.asnum in {8075 16509 15169 14061 24940 63949 20473}
)
and
(
  lower(http.user_agent) contains "curl" or
  lower(http.user_agent) contains "wget" or
  lower(http.user_agent) contains "python-requests" or
  lower(http.user_agent) contains "scrapy" or
  lower(http.user_agent) contains "httpx" or
  lower(http.user_agent) contains "aiohttp" or
  lower(http.user_agent) contains "go-http-client" or
  lower(http.user_agent) contains "node-fetch" or
  lower(http.user_agent) contains "okhttp" or
  lower(http.user_agent) contains "libwww-perl" or
  lower(http.user_agent) contains "java/"
)
and not cf.client.bot
and (http.request.method eq "GET" or http.request.method eq "POST")

Using lower() makes UA matching case-insensitive in the Cloudflare Rules language. Because the rule inspects only GET and POST, OPTIONS requests never match, so CORS preflight requests are not broken (browsers don’t send cookies on preflight requests). HEAD requests are also outside the rule.

ip.src.asnum is the correct field to match autonomous system numbers (ASNs).
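
If you expect to adjust the ASN and User-Agent lists over time, it can help to generate the expression instead of editing it by hand. The following is a minimal sketch in Python that reproduces the expression above as a single line; build_expression and the list names are hypothetical helpers, not part of any Cloudflare tooling.

```python
from typing import Iterable

# Values copied from the expression above.
EXCLUDED_PATH_CLAUSES = [
    'http.request.uri.path eq "/robots.txt"',
    'ends_with(http.request.uri.path, "/ads.txt")',
    'http.request.uri.path contains "/sitemap"',
    'http.request.uri.path eq "/wp-sitemap.xml"',
    'http.request.uri.path contains "/feed/"',
]
HOSTING_ASNS = [8075, 16509, 15169, 14061, 24940, 63949, 20473]
AUTOMATION_UAS = [
    "curl", "wget", "python-requests", "scrapy", "httpx", "aiohttp",
    "go-http-client", "node-fetch", "okhttp", "libwww-perl", "java/",
]
METHODS = ["GET", "POST"]


def build_expression(
    path_clauses: Iterable[str],
    asns: Iterable[int],
    user_agents: Iterable[str],
    methods: Iterable[str],
) -> str:
    """Join the lists into one rule expression string (hypothetical helper)."""
    paths = " or ".join(path_clauses)
    asn_set = " ".join(str(asn) for asn in asns)
    # Tokens are lowercased because the rule compares against lower(http.user_agent).
    uas = " or ".join(
        f'lower(http.user_agent) contains "{ua.lower()}"' for ua in user_agents
    )
    meths = " or ".join(f'http.request.method eq "{m}"' for m in methods)
    return (
        f"(not ({paths})) "
        f"and (ip.src.asnum in {{{asn_set}}}) "
        f"and ({uas}) "
        "and not cf.client.bot "
        f"and ({meths})"
    )
```

Calling build_expression(EXCLUDED_PATH_CLAUSES, HOSTING_ASNS, AUTOMATION_UAS, METHODS) returns a string you can paste into the expression editor after reviewing it.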

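Action: Block. Matching requests are rejected at the Cloudflare edge. If you prefer a softer response, the Managed Challenge action applies the lightest viable check first and escalates only if needed.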

How Each Part Blocks Unknown Bots

Empty UA (http.user_agent eq ""): Real browsers typically send a UA; empties are common in automation or misconfigured clients. The expression above does not include this check, so add http.user_agent eq "" to the User-Agent group if you want empty UAs covered.
Automation UAs (curl, wget, python-requests, scrapy, httpx, aiohttp, go-http-client, node-fetch, okhttp, libwww-perl, java/): These are common HTTP client and scraping libraries and indicate programmatic traffic.
ASN + bot-like UA combination: A request must come from one of the listed hosting/cloud ASNs and carry a bot-like UA. Requiring both significantly increases confidence that the request is automated.
Path restrictions: The rule covers most site paths but skips /robots.txt, /ads.txt, sitemaps, and feeds so crawlers and feed readers keep working. /wp-json/ and admin-ajax.php are not excluded by the expression as written; add them to the path group if server-side integrations call them from the listed ASNs.
and not cf.client.bot: Ensures verified good bots (Googlebot, Bingbot, etc.) bypass the rule entirely.
Method restriction (http.request.method eq "GET" or http.request.method eq "POST"): Only GET and POST are inspected. OPTIONS (CORS preflight) and HEAD requests never match, which prevents accidental CORS failures.
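
To reason about which requests the rule would match before deploying it, you can model its conditions locally. The Python sketch below mirrors the expression shown earlier; the Request class and rule_matches function are hypothetical names, and verified_bot merely stands in for cf.client.bot, which only Cloudflare can actually determine.

```python
from dataclasses import dataclass

# Values copied from the rule expression.
HOSTING_ASNS = {8075, 16509, 15169, 14061, 24940, 63949, 20473}
AUTOMATION_UAS = (
    "curl", "wget", "python-requests", "scrapy", "httpx", "aiohttp",
    "go-http-client", "node-fetch", "okhttp", "libwww-perl", "java/",
)


@dataclass
class Request:
    path: str
    asn: int
    user_agent: str
    method: str
    verified_bot: bool = False  # assumption: stands in for cf.client.bot


def path_is_excluded(path: str) -> bool:
    """Mirror the 'not (...)' path group of the expression."""
    return (
        path == "/robots.txt"
        or path.endswith("/ads.txt")
        or "/sitemap" in path
        or path == "/wp-sitemap.xml"
        or "/feed/" in path
    )


def rule_matches(req: Request) -> bool:
    """Return True when every condition of the rule holds for this request."""
    ua = req.user_agent.lower()
    return (
        not path_is_excluded(req.path)
        and req.asn in HOSTING_ASNS
        and any(token in ua for token in AUTOMATION_UAS)
        and not req.verified_bot
        and req.method in ("GET", "POST")
    )
```

For example, rule_matches(Request("/", 16509, "okhttp/4.9.0", "GET")) is True, while the same request sent with HEAD, or from an ASN outside the list, is not matched.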

Why Bots Still Get Through (Cloudflare Free Plan Limitation)

I tested this rule thoroughly. Months of implementation showed me that 95% of unknown bots are blocked, but some still bypass the block rule.

Here’s why: The rule relies on static signals, namely User-Agent pattern matching and ASN heuristics. A sophisticated bot can originate from residential networks or spoof realistic browser UAs.
Root cause: On the Cloudflare free plan, you cannot use cf.bot_management.score. Cloudflare’s machine-learning bot score (1–99) is available only with Enterprise Bot Management.
The honest truth: Sophisticated bots can bypass a Free‑plan rule. For most WordPress sites, layering good authentication practices and origin hardening still mitigates real‑world risk. Upgrade only if the economics justify it.

Advantages of Blocking Unknown Bots

Server Performance: Cloudflare stops most bot traffic at the edge before it hits your origin.
SEO Protection: Verified search engines still crawl; unverified scrapers are challenged/blocked, reducing duplicate content risk.
Content Security: Common scraping libraries and many AI crawlers are intercepted.
Simple Deployment: Works on the Cloudflare free plan with Custom Rules (Free supports up to 5 rules).

How to Verify That Unknown Bots Are Actually Blocked

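Don’t rely solely on local curl. It can’t emulate verified bots, because only requests from a verified bot’s own IP space (for example, Google’s or Bing’s) are treated as verified, and a request from a home connection won’t match the rule’s ASN list. Instead: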

Confirm in the Custom Rules dashboard that Rule 5 is active and enabled, with Action set to Block.
Monitor the Activity (last 24h) column to see how many requests the rule is acting on.
Check Security > Events in the Cloudflare dashboard.
Look for unknown/automation UAs being blocked.
Verify good bots pass by confirming Client bot: true in event details.
Check the Action and Matched service columns to verify Rule 5 is working correctly.
Monitor daily for 7 days; adjust the UA and ASN lists as needed.

Tip: A response that is a Challenge Page will carry the header cf-mitigated: challenge.
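
When scripting checks, the status code and the cf-mitigated header can be combined into a rough outcome label. This is a minimal Python sketch; classify_response and its labels are hypothetical, and a 403 can also come from your origin, so treat the result as a hint rather than proof of which rule fired.

```python
from typing import Mapping


def classify_response(status: int, headers: Mapping[str, str]) -> str:
    """Label a response as challenged, blocked-or-forbidden, passed, or other."""
    # Header names are case-insensitive, so normalize before looking them up.
    normalized = {name.lower(): value.strip().lower() for name, value in headers.items()}
    if normalized.get("cf-mitigated") == "challenge":
        return "challenged"
    if status == 403:
        # Could be a Cloudflare block or a 403 from the origin itself.
        return "blocked-or-forbidden"
    if 200 <= status < 400:
        return "passed"
    return "other"
```

Feed it the status and headers from whichever HTTP client you use, then confirm the matched rule in Security > Events.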

Action Plan to Block Unknown Bots

Log into Cloudflare > Security > WAF > Custom Rules
Click Create Rule
Name it: Block Unknown Bots – Rule 5
Paste the expression above
Set Action: Block
Deploy to all paths (/)
Check Security > Events within 1 hour

Curl Verification Tests (Updated)

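1. A programmatic client UA should be blocked:

curl -s -o NUL -D - -A "okhttp/4.9.0" https://example.com/

This sends a GET request and prints only the response headers. curl -I would send a HEAD request, which this rule does not inspect. On macOS or Linux, replace NUL with /dev/null.

Expected Result: 403 when the request originates from one of the listed ASNs (for example, a cloud VM). From a home or office connection the ASN condition does not match, so expect a normal 200 OK.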

2. Normal browser UA should pass (unless another rule triggers)

curl -I -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121 Safari/537.36" https://yourdomain.com/

Expected Result: 200/301 (normal response)

3. CORS preflight (OPTIONS) should not be blocked:

curl -s -o NUL -D - -X OPTIONS "https://yourdomain.com/" -H "Origin: https://yourdomain.com" -H "Access-Control-Request-Method: GET"

Expected Result: a normal response from your origin, typically 200 OK (OPTIONS is not inspected by this rule). On macOS or Linux, replace NUL with /dev/null.

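4. UA-spoof sanity tests (Google/Bing user agents)

curl -I -H "User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" https://yourdomain.com/
curl -I -H "User-Agent: Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" https://yourdomain.com/

Expected Result: a normal response from this rule, unless another rule triggers. These user agents contain none of the automation strings, so Rule 5 does not match them, and a spoofed UA does not set cf.client.bot because the request does not come from the search engine’s IP space.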

Note: Replace https://example.com and yourdomain.com with your own site URL.
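
If you would rather script these checks than run curl by hand, the sketch below does the equivalent with Python's standard library. It sends GET requests, since the rule inspects only GET and POST. build_request, probe, and run_checks are hypothetical helper names, and as with curl, the automation UA is only blocked when the script runs from one of the listed ASNs.

```python
import urllib.error
import urllib.request
from typing import Dict, Tuple

BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/121 Safari/537.36"
)


def build_request(url: str, user_agent: str, method: str = "GET") -> urllib.request.Request:
    """Create a request with an explicit User-Agent and HTTP method."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent}, method=method)


def probe(req: urllib.request.Request, timeout: float = 10.0) -> Tuple[int, Dict[str, str]]:
    """Send the request; return (status, headers with lowercased names)."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, {k.lower(): v for k, v in resp.headers.items()}
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status and headers are still available.
        return err.code, {k.lower(): v for k, v in err.headers.items()}


def run_checks(site: str) -> None:
    """Send one GET per user agent and print the outcome (needs network access)."""
    for label, ua in (("automation UA", "okhttp/4.9.0"), ("browser UA", BROWSER_UA)):
        status, headers = probe(build_request(site, ua))
        print(f"{label}: status={status} cf-mitigated={headers.get('cf-mitigated', '-')}")
```

Call run_checks("https://yourdomain.com/") with your own site URL. Note that urllib follows redirects automatically, so a 301 shows up as the status of the final page.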

When to Upgrade to a Paid Plan

Your Free‑plan rule blocks most unknown bots. Consider upgrading if you face revenue loss or high infrastructure costs due to bots.

Pro/Business: Super Bot Fight Mode adds stronger bot controls and analytics beyond the Free toggle.
Enterprise: Bot Management exposes the ML score (cf.bot_management.score) and rich signals for precise, programmatic enforcement.

Final Word

This rule works. It stops most unknown bots from accessing your WordPress site. It doesn’t solve everything—nothing on the Cloudflare free plan does. Layer it with the four core WAF rules, server‑level security, strong passwords, and Two‑Factor Authentication.

Your mission: Allow good bots. Block unknown bots. Protect your site.

This rule accomplishes exactly that.
