How to Block Unknown Bots Using Cloudflare WAF Rules

Unknown bots drain your server, steal your content, and probe for vulnerabilities. After months of testing, I discovered how to block unknown bots, but the Cloudflare free plan still has limitations. In this guide, I will explain the exact expression to block unknown bots, why some still bypass it, and how to verify it’s working.

This is Rule 5 in the complete Cloudflare WAF strategy. If you haven’t implemented the four core protection rules, start there; they address the most common attacks. Then layer this rule on top. The guide at https://techsaa.com/cloudflare-waf-rules/ explains how to understand and deploy the first four rules.

Understanding cf.client.bot: How Cloudflare Identifies Good Bots

Before you implement this rule, you need to understand cf.client.bot. This field protects legitimate bots while you block unknown ones.

What it does: cf.client.bot tells you if a request came from a verified bot approved by Cloudflare (for example, Googlebot, Bingbot, widely used link‑preview services, and other verified agents). Cloudflare maintains a directory of verified bots and exposes this signal so you can allow them in custom rules.

How it works: Cloudflare verifies bots using multiple methods:
  • Reverse DNS validation (confirming IP matches the bot’s domain)
  • Network ownership checks (IP/ASN‑based validation)
  • Managed allowlists and additional internal signals

These verification methods back the cf.client.bot signal.
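The reverse DNS check in the list above can be illustrated with forward-confirmed reverse DNS: look up the IP's hostname, check that it belongs to the bot operator's domain, then resolve that hostname and confirm it maps back to the same IP. Cloudflare's internal implementation is not public, so the sketch below only shows the general technique. The function names and the suffix tuple are assumptions made for illustration.

```python
import socket
from typing import Callable, List, Sequence

# Hostname suffixes used here for Google's crawlers. This tuple is an
# illustrative assumption; check the bot operator's own documentation.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Return the PTR hostname for an IP address (raises OSError on failure)."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Return the IPv4 addresses that a hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_forward_confirmed(
    ip: str,
    allowed_suffixes: Sequence[str],
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """True if ip -> hostname -> ip round-trips inside an allowed domain."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record, so the claim cannot be confirmed
    if not hostname.endswith(tuple(allowed_suffixes)):
        return False  # hostname is outside the operator's domains
    try:
        return ip in forward(hostname)
    except OSError:
        return False  # hostname does not resolve
```

The lookups are injectable so the logic can be exercised without network access. With the defaults, is_forward_confirmed(ip, GOOGLE_SUFFIXES) performs real DNS queries.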

  • Why this matters: When you write and not cf.client.bot in your rule, you’re saying: “Block/challenge this traffic unless Cloudflare recognizes it as a verified good bot.” This protects your search engine rankings while stopping unknown bots.
  • Current verified bots: Cloudflare’s directory includes search engines, social media/link preview crawlers, monitoring tools, SEO tools, and other categories; the set evolves over time. Consult the Verified Bots directory on Cloudflare Radar for the most current list.

The Rule Number 5 Expression to Block Unknown Bots

Based on months of testing and Cloudflare’s recommendations, here’s the expression that works on the Cloudflare free plan. This is the updated and deployed Rule #5.

(
  (
    http.request.uri.path contains "/"
    and not (
      http.request.uri.path eq "/robots.txt" or
      ends_with(http.request.uri.path, "/ads.txt") or
      http.request.uri.path contains "/sitemap" or
      http.request.uri.path eq "/wp-sitemap.xml" or
      http.request.uri.path contains "/feed/"
    )
  )
  and
  (
    lower(http.user_agent) contains "curl" or
    lower(http.user_agent) contains "wget" or
    lower(http.user_agent) contains "python-requests" or
    lower(http.user_agent) contains "scrapy" or
    lower(http.user_agent) contains "httpx" or
    lower(http.user_agent) contains "aiohttp" or
    lower(http.user_agent) contains "go-http-client" or
    lower(http.user_agent) contains "node-fetch" or
    lower(http.user_agent) contains "okhttp" or
    lower(http.user_agent) contains "libwww-perl" or
    lower(http.user_agent) contains "java/" or
    (
      ip.src.asnum in {8075 16509 15169 14061 24940 63949 20473}
      and (
        lower(http.user_agent) contains "curl" or
        lower(http.user_agent) contains "wget" or
        lower(http.user_agent) contains "python-requests" or
        lower(http.user_agent) contains "scrapy" or
        lower(http.user_agent) contains "httpx" or
        lower(http.user_agent) contains "aiohttp" or
        lower(http.user_agent) contains "go-http-client" or
        lower(http.user_agent) contains "node-fetch" or
        lower(http.user_agent) contains "okhttp" or
        lower(http.user_agent) contains "libwww-perl" or
        lower(http.user_agent) contains "java/"
      )
    )
  )
)
and not cf.client.bot
and (http.request.method eq "GET" or http.request.method eq "POST")

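Using lower() makes UA matching case‑insensitive in the Cloudflare Rules language. Because the rule inspects only GET and POST, OPTIONS requests are never evaluated, so CORS preflight requests are not challenged (a preflight cannot solve a challenge, and browsers don’t send cookies on preflight requests). HEAD requests are not inspected either.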

ip.src.asnum is the correct field to match autonomous system numbers (ASNs).

Action: Managed Challenge (much safer than Block). Cloudflare will apply the lightest viable check first and escalate only if needed.

How Each Part Blocks Unknown Bots

  • Empty UA (http.user_agent eq ""): Real browsers typically send a UA; empties are common in automation or misconfigured clients. The expression as shown above does not include this check; add http.user_agent eq "" to the UA group if you want empty UAs challenged.
  • Automation UAs (curl, wget, python‑requests, scrapy, httpx, aiohttp, go-http-client, node-fetch, okhttp, libwww-perl, java/): These are common scraping libraries and indicate programmatic traffic.
  • ASN + bot-like UA combination: Hosting/cloud egress IPs combined with bot-like UAs significantly increase confidence that the request is automated. As written, this branch repeats the same UA list, so every request it matches is already matched by the UA conditions above it.
  • Path restrictions: The rule covers most site paths but skips /robots.txt, ads.txt, sitemaps, and feeds so crawlers and feed readers can still fetch them. /wp-json/ and admin-ajax.php are not in the exclusion list shown above; add them there if challenges break your site's own API or AJAX calls.
  • and not cf.client.bot: Ensures verified good bots (Googlebot, Bingbot, etc.) bypass the rule entirely.
  • Method filter (http.request.method eq "GET" or http.request.method eq "POST"): Only GET and POST are inspected, so OPTIONS (CORS preflight) and HEAD requests are never challenged by this rule.
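To reason about which requests the expression matches before deploying it, the logic can be mirrored in a small local model. The sketch below is only an approximation for thinking through test cases: Cloudflare evaluates the real expression at the edge, and the function and parameter names here are assumptions made for illustration.

```python
# UA substrings and ASNs copied from the Rule 5 expression.
AUTOMATION_UA_TOKENS = (
    "curl", "wget", "python-requests", "scrapy", "httpx", "aiohttp",
    "go-http-client", "node-fetch", "okhttp", "libwww-perl", "java/",
)
HOSTING_ASNS = {8075, 16509, 15169, 14061, 24940, 63949, 20473}


def path_is_excluded(path: str) -> bool:
    """Mirror of the 'and not (...)' path group in the expression."""
    return (
        path == "/robots.txt"
        or path.endswith("/ads.txt")
        or "/sitemap" in path
        or path == "/wp-sitemap.xml"
        or "/feed/" in path
    )


def has_automation_ua(user_agent: str) -> bool:
    """Case-insensitive substring match, like lower(http.user_agent) contains."""
    lowered = user_agent.lower()
    return any(token in lowered for token in AUTOMATION_UA_TOKENS)


def rule_matches(path: str, user_agent: str, method: str,
                 asn: int, verified_bot: bool) -> bool:
    """Return True if this local model of Rule 5 would challenge the request."""
    path_in_scope = "/" in path and not path_is_excluded(path)
    ua_hit = has_automation_ua(user_agent) or (
        asn in HOSTING_ASNS and has_automation_ua(user_agent)
    )
    return (
        path_in_scope
        and ua_hit
        and not verified_bot
        and method in ("GET", "POST")
    )
```

For example, rule_matches("/", "curl/8.5.0", "GET", 64500, False) is True, while the same request with method "OPTIONS" or path "/robots.txt" is False. The ASN value 64500 is a placeholder from the documentation range.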

Why Bots Still Get Through (Cloudflare Free Plan Limitation)

I tested this rule thoroughly. Months of implementation showed me that 95% of unknown bots are blocked, but some still bypass the Managed Challenge.

  1. Here’s why: The rule relies on static signals — User‑Agent pattern matching and ASN heuristics. A sophisticated bot can originate from residential networks or spoof realistic browser UAs.
  2. Root cause: On the Cloudflare free plan, you cannot use cf.bot_management.score — Cloudflare’s machine‑learning bot score (1–99) is available only on Enterprise Bot Management.
  3. The honest truth: Sophisticated bots can bypass a Free‑plan rule. For most WordPress sites, layering good authentication practices and origin hardening still mitigates real‑world risk. Upgrade only if the economics justify it.

Advantages of Blocking Unknown Bots

  • Server Performance: Cloudflare stops most bot traffic at the edge before it hits your origin.
  • SEO Protection: Verified search engines still crawl; unverified scrapers are challenged/blocked, reducing duplicate content risk.
  • Content Security: Common scraping libraries and many AI crawlers are intercepted.
  • Simple Deployment: Works on the Cloudflare free plan with Custom Rules (Free supports up to 5 rules).

How to Verify That Unknown Bots Are Actually Blocked

Your Custom Rules dashboard should show Rule 5 active and enabled. Monitor the Activity (last 24h) column to see how many unknown bots are being challenged.
Don’t rely solely on local curl: it can’t emulate verified bots (only requests from a verified bot’s own IP space, such as Google’s or Bing’s, are treated as verified). Instead:
  • Confirm Rule 5 is enabled with Action set to Managed Challenge.
  • Check Security > Events in the Cloudflare dashboard.
  • Look for unknown/automation UAs being challenged.
  • Verify good bots pass by confirming Client bot: true in event details.
  • Use the Action and Matched service columns to see which requests Rule 5 is challenging and which it lets through.
  • Monitor daily for 7 days; adjust the UA and ASN lists as needed.
Tip: A response that is a Challenge Page will carry the header cf-mitigated: challenge.
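The header check in the tip can also be scripted. The following is a minimal sketch using only the Python standard library; the function names and the result labels are assumptions made for illustration, and the only Cloudflare-specific detail it relies on is the cf-mitigated: challenge header mentioned above.

```python
import urllib.error
import urllib.request
from typing import Dict, Mapping, Tuple


def classify_response(status: int, headers: Mapping[str, str]) -> str:
    """Label a response as 'challenged', 'forbidden', 'allowed', or 'other'."""
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value.strip().lower()
                  for name, value in headers.items()}
    if normalized.get("cf-mitigated") == "challenge":
        return "challenged"
    if status == 403:
        return "forbidden"  # a 403 that is not a challenge page
    if 200 <= status < 400:
        return "allowed"
    return "other"


def fetch_status_and_headers(url: str, user_agent: str,
                             timeout: float = 10.0) -> Tuple[int, Dict[str, str]]:
    """Send a GET with the given User-Agent and return (status, headers).

    urllib follows redirects, so a 301 shows up as the final status.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": user_agent}, method="GET"
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, dict(response.headers.items())
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses raise HTTPError but still carry status and headers.
        return error.code, dict(error.headers.items())
```

A call such as classify_response(*fetch_status_and_headers("https://yourdomain.com/", "curl/8.5.0")) should return "challenged" once the rule is live; replace yourdomain.com with your own domain.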

Action Plan to Block Unknown Bots

  • Log into Cloudflare > Security > WAF > Custom Rules
  • Click Create Rule
  • Name it: Block Unknown Bots – Rule 5
  • Paste the expression above
  • Set Action = Managed Challenge
  • Click Deploy (the expression itself covers all paths under / except the exclusions)
  • Check Security > Events within 1 hour

Curl Verification Tests (Updated)

1. A programmatic client (curl's default UA) should be challenged:

  • curl -s -o NUL -D - https://yourdomain.com/
Expected Result: 403 + cf-mitigated: challenge

2. Normal browser UA should pass (unless another rule triggers)

  • curl -s -o NUL -D - -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121 Safari/537.36" https://yourdomain.com/

Expected Result: 200/301 (normal response)

3. CORS preflight (OPTIONS) should not be blocked:

  • curl -s -o NUL -D - -X OPTIONS "https://yourdomain.com/" -H "Origin: https://yourdomain.com" -H "Access-Control-Request-Method: GET"
Expected Result: 200 OK, or whatever your origin normally returns for OPTIONS (this rule does not inspect OPTIONS)

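4. UA-spoof sanity tests (Google/Bing user agents)

  • curl -s -o NUL -D - -H "User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" https://yourdomain.com/
  • curl -s -o NUL -D - -H "User-Agent: Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" https://yourdomain.com/

These requests come from your own IP, so Cloudflare will not treat them as verified bots (cf.client.bot stays false). Rule 5 itself does not match these UAs; if they are challenged, another rule is responsible.

Note: Replace yourdomain.com with your own domain. The commands send GET requests because the rule does not inspect HEAD (curl -I). NUL is the Windows null device; on Linux or macOS, use /dev/null.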

When to Upgrade to a Paid Plan

Your Free‑plan rule blocks most unknown bots. Consider upgrading if you face revenue loss or high infrastructure costs due to bots.

  • Pro/Business: Super Bot Fight Mode adds stronger bot controls and analytics beyond the Free toggle.
  • Enterprise: Bot Management exposes the ML score (cf.bot_management.score) and rich signals for precise, programmatic enforcement.

Final Word

This rule works. It stops most unknown bots from accessing your WordPress site. It doesn’t solve everything—nothing on the Cloudflare free plan does. Layer it with the four core WAF rules, server‑level security, strong passwords, and Two‑Factor Authentication.

Your mission: Allow good bots. Block unknown bots. Protect your site.

This rule accomplishes exactly that.
