15 March 2026

When nftables Becomes the Bottleneck: Our Experience with 50,000 CIDR Prefixes

When geo-blocking requires tens of thousands of CIDR prefixes, the choice between nftables and iptables can upend performance expectations.

[Image: iptables vs. Netfilter geo-IP]


At Managed Server SRL, we have spent over 15 years on Linux systems engineering, high-performance infrastructure management, and advanced consulting for mission-critical web environments. Over this time, we've managed thousands of servers, optimized complex stacks for high-traffic CMS and e-commerce platforms, and faced the typical challenges of professional hosting on a daily basis: security, performance, and infrastructure automation.

Having dealt with Linux systems and consultancy for years, like many in this sector we have always had an intimate relationship with Netfilter, iptables, and everything related to packet filtering in the Linux kernel. The kernel-level firewall is not simply a security tool: it is often a critical component of infrastructure stability, especially when operating public services exposed to the Internet.

When we started developing CFM 4 Linux (Centralized Firewall Manager), our internal tool for centralized management of firewall rules across server fleets, choosing the filtering backend was one of the first architectural decisions we had to make: which technology to use to efficiently manage large volumes of rules distributed across tens or hundreds of machines.

[Image: CFM 4 Linux dashboard]

And it was a decision that took us somewhere we didn't expect.

This article describes what we discovered. We think it will be useful to anyone considering nftables for high-volume rule scenarios, particularly geo-blocking, because "more modern" does not always mean "faster" or "more efficient" in every operational context.

The context: geo-blocking at scale

CFM manages firewall policies distributed across dozens of servers within infrastructures used for hosting, e-commerce, and high-traffic web applications. In these scenarios, the firewall is not just a security barrier, but also an operational tool for controlling the type of traffic reaching exposed services. Among the features most requested by our customers is geo-blocking: the ability to block (or allow) traffic based on the country of origin of the IP address.

The most common use case is not so much the selective blocking of a few countries, but the reverse model: allow only certain geographic areas and block everything else. This approach is very popular for sites and platforms that operate exclusively in specific markets—for example, Europe or Italy—and want to reduce attack surfaces, unwanted traffic, or mass scraping from other regions of the world.

When applying a policy of this type ("only allow these countries"), the number of CIDR prefixes to block grows rapidly. In practice, it is necessary to insert into the firewall all the IP ranges belonging to the excluded countries, which in most scenarios represent the vast majority of the global IP space.

[Image: map of blocked regions]

In concrete terms, this very easily means 50,000–100,000 CIDR prefixes to manage: all the IP ranges allocated to the 150 or more countries not included in the whitelist.

It's not a made-up number.

The GeoIP database we use — DB-IP Lite, integrated and normalized with data from the Regional Internet Registries (RIPE, ARIN, APNIC, LACNIC and AFRINIC) — typically produces datasets of this size:

  • about 51,000 IPv4 prefixes for an "only allow Europe" scenario

  • about 44,000 IPv6 prefixes for the same scenario

The total is therefore on the order of 95,000 CIDR prefixes that must be entered into the firewall as DROP rules.

And this is just the base case. In more restrictive scenarios (for example, allowing only one or two specific countries), the number of prefixes can grow further, because the whitelist becomes smaller and the blacklist much larger.

The architectural requirement for CFM was therefore clear from the beginning: apply these rules in a way that lets the firewall keep working without degrading the server's network performance. A modern web server can handle tens of thousands of connections per second, and any inefficiency in the packet filtering path quickly becomes visible as added latency or reduced throughput.

This seems like an obvious requirement.

But, as we will see in the next sections, once we reach the order of tens of thousands of CIDR rules, the differences between packet filtering technologies become much more apparent, and not always in the way you might expect.

The choice of nftables: everyone said it was better

When we designed CFM 4 Linux, we did what any serious team does: a lot of research. We read official documentation, technical articles, published benchmarks, mailing list discussions, and talks presented at Linux networking conferences.

And the message that emerged was always the same: nftables is the future, iptables is legacy technology.

This wasn't just marketing or community narrative. In recent years, the Netfilter project has clearly pushed in this direction: nftables was designed to progressively replace iptables, introducing a more modern, more consistent and (at least on paper) more efficient design.

The pros of nftables that we found cited everywhere were quite convincing:

  • Native sets with O(1) lookups via hash tables (or O(log n) via red-black trees), instead of the linear rule chains with O(n) complexity typical of iptables

  • Atomic operations across entire tables and chains, avoiding inconsistent intermediate states during rule updates

  • Unified syntax for IPv4 and IPv6 via inet tables, which eliminates rule duplication between iptables and ip6tables

  • A more flexible language, with support for maps, key concatenations, and more advanced data structures

  • Superior performance in published benchmarks, including the official one from Red Hat

This last point played a significant role in our decision.

The benchmark published by Red Hat in 2017 (Benchmarking nftables, Red Hat Developer) clearly showed that nftables sets maintain stable lookup times regardless of the number of elements, thanks to data structures more efficient than sequential rule scanning.

In other words: adding 10 elements or 100,000 elements to an nftables set should not significantly change rule evaluation time.

For a system like CFM, which had to handle tens of thousands of IP prefixes, it seemed the perfect solution.

So we implemented geo-blocking using a named nftables set with the interval flag, which allows CIDR ranges to be represented efficiently.

table inet cfm {
    set geo_blocked_v4 {
        type ipv4_addr
        flags interval
        elements = { 1.0.0.0/24, 1.0.4.0/22, 1.0.16.0/20, ... }
        # ~51,000 prefixes
    }

    chain cfm_input {
        type filter hook input priority 0; policy accept;
        ip saddr @geo_blocked_v4 drop comment "CFM:geo-allow-only"
    }
}

Conceptually it was a very clean solution:

  • all the networks to be blocked were contained in a single set

  • the input chain contained only one rule

  • the set could be updated atomically (see the sketch just below)

  • IPv4 and IPv6 could be managed in the same inet table
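
On the atomicity point: everything inside a single nft -f invocation is applied as one transaction, so an atomic reload is just a file. A minimal sketch (the file path and the element list are illustrative):

# /etc/cfm/geo-reload.nft -- applied all-or-nothing by `nft -f`
flush set inet cfm geo_blocked_v4
add element inet cfm geo_blocked_v4 { 1.0.0.0/24, 1.0.4.0/22, 1.0.16.0/20 }

If any line fails to parse or apply, the kernel keeps the previous state; the set is never left half-populated.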

The interval flag is essential in this context. Without it, an nftables set accepts only exact IP addresses, while in our case we had to represent entire CIDR blocks.

With flags interval, nftables internally uses a red-black-tree-based data structure, implemented in the kernel in nft_set_rbtree.c. This is necessary because the firewall must be able to handle:

  • address ranges

  • CIDR ranges

  • possible overlaps between prefixes

  • lookups based on whether an IP falls inside a range

In other words, the system is no longer doing a simple lookup in a hash table, but a search in a balanced interval tree.
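
The interval semantics are easy to observe from the CLI. A quick spot-check (file path and names as in the example above; the exact output format varies by nft version):

# load the ruleset, then ask which element matches a given address;
# with an interval set, nft resolves the query to the enclosing prefix
nft -f /etc/cfm/geo.nft
nft get element inet cfm geo_blocked_v4 '{ 1.0.0.10 }'
# reports the containing interval, e.g. 1.0.0.0/24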

On paper, this was still an extremely efficient structure.

And that's where things started to go wrong.

The disaster in production

The first deployment on a client server that needed to keep its online time-clocking system accessible only from Italy (Hetzner, CentOS 7, kernel 6.1, nftables) seemed to work. The rules were loaded, the set was populated, and nft list set inet cfm geo_blocked_v4 showed all 51,000 prefixes. All good.

Then we tried using the server.

SSH connections became unusable, with each command delayed by 5-10 seconds. Web pages wouldn't load. A simple curl https://www.google.com took over 5 seconds, and the figure was revealing: exactly 5.05 seconds, which corresponds to the default DNS resolution timeout.

DNS was broken. Not completely (queries did go out and come back), but with latency so high that it caused timeouts. And without functioning DNS, everything else collapses.

The diagnosis

We isolated the problem methodically:

  1. Emptied the geo set, keeping only the chains with ct state established,related accept → DNS responded instantly (0.02s)
  2. Reloaded the 51,000 prefixes into the set → DNS broken again (5+ seconds)
  3. Repeated the test multiple times → deterministic, reproducible result (the commands are sketched below)
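
For reference, a reconstruction of the kind of test loop we ran (resolver, domain and file paths are illustrative):

# with the 51,000-prefix set loaded
time dig +short example.com            # ~5s, at the edge of timeout

# empty only the geo set, leaving the rest of the ruleset untouched
nft flush set inet cfm geo_blocked_v4
time dig +short example.com            # ~0.02s

# reload the prefixes: the latency comes straight back
nft -f /etc/cfm/geo-elements.nft
time dig +short example.com            # 5+ seconds again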

The cause was unequivocal: the nftables set with flags interval and 51,000+ elements caused catastrophic per-packet overhead at the kernel level.

And it wasn't a loading or user-space problem. The set was loaded correctly into the kernel. The problem was the lookup time for every single packet traversing the chain that referenced the set.

Why DNS? Why everything?

This is a key point to understand. In Netfilter, when a packet hits a chain that matches against a set with flags interval, the kernel must traverse the red-black tree to determine whether the source IP falls within one of the ranges. With a 51,000-node rbtree, each lookup takes up to ~16 comparisons (log₂ 51,000 ≈ 15.6), but with the added complexity of managing overlapping ranges, the real per-packet cost is significantly higher.

The problem is not the single lookup; it's the millions of lookups per second that a normal server generates. Every outgoing DNS packet, every TCP ACK, every packet of an SSH connection, every fragment of a web page: each of these must pass through the set. Even packets belonging to already established connections (ESTABLISHED,RELATED) undergo the lookup if the ct state rule comes after the match on the set, or if the two rules live in different chains with different priorities.

This last point deserves further elaboration. In nftables, base chains with different priorities are all evaluated independently: an accept in a priority -10 chain (our safety chain) does not prevent the packet from also being evaluated in the priority 0 chain (where the geo set was). Only a drop is terminal, in the sense that it discards the packet; an accept ends evaluation of that chain only, and the packet continues its path through all the other base chains registered on the same hook. This means that even loopback packets (127.0.0.1), already accepted by the safety chain, were still evaluated against the set of 51,000 prefixes.
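
A minimal ruleset that illustrates the behavior (table, chain and set names are invented for the example):

table inet demo {
    set big_geo_v4 {
        type ipv4_addr
        flags interval
        elements = { 10.0.0.0/8 }
    }
    chain safety {
        type filter hook input priority -10; policy accept;
        iif "lo" accept comment "verdict for this chain only"
    }
    chain geo {
        type filter hook input priority 0; policy accept;
        ip saddr @big_geo_v4 drop comment "loopback packets are still evaluated here"
    }
}

The accept in safety ends that chain's evaluation, but the same packet is then handed to geo, where the set lookup happens anyway.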

Bug reports we should have read earlier

After discovering the problem, we looked for confirmation from the community and found it. We weren't alone.

Netfilter Bug #1735: "Adding nftables interval sets progressively gets slower and makes the nft CLI less responsive with each added set". Reported in January 2024, it describes how sets with flags interval progressively degrade performance: the first iteration takes 0.12 seconds, the fortieth takes 1.59 seconds, and memory consumption reaches 180 MB. But this bug concerns set loading time, not per-packet lookup; our problem was even worse.

Netfilter Bug #1439: "Atomically updating/reloading a large set with nft -f is excessively slow". It confirmed that large sets, specifically in the geo-IP scenario, were unworkable with atomic reload. This bug dates back to July 2020, six years ago.

OpenWrt Forum: “Nftables chokes on very large sets” — OpenWrt users with kernel 5.15 reporting strange behavior with large sets, including memory spikes during import.

OpenWrt Forum: “Some thoughts about nftables performance” — An in-depth discussion of nftables' actual performance versus expectations, with several users reporting similar issues to ours.

Recent kernels (6.19) introduced optimizations in the pipapo backend (nft_set_pipapo.c) with the .abort_skip_removal flag (source code on GitHub), but these mainly concern element removal time, not per-packet lookup.

The solution: chain-splitting with iptables

After realizing the nature of the problem (the interval set's red-black tree simply doesn't scale to 50,000+ prefixes under real-world network load), we had to find an alternative.

The solution we adopted is called chain-splitting, implemented with good old iptables-restore. The concept is simple but effective.

How it works

Instead of a single monolithic set, we group CIDR prefixes by first octet (the /8 block each prefix belongs to). For each /8 containing at least one prefix to block, we create a dedicated sub-chain:

*filter
:CFM_GEO - [0:0]
:CFM_GEO_01 - [0:0]
:CFM_GEO_02 - [0:0]
:CFM_GEO_05 - [0:0]
...

# Main chain: one jump per /8
-A CFM_GEO -s 1.0.0.0/8 -j CFM_GEO_01
-A CFM_GEO -s 2.0.0.0/8 -j CFM_GEO_02
-A CFM_GEO -s 5.0.0.0/8 -j CFM_GEO_05
...

# Sub-chain for the /8 = 1
-A CFM_GEO_01 -s 1.0.0.0/24 -j DROP -m comment --comment "CFM:geo"
-A CFM_GEO_01 -s 1.0.4.0/22 -j DROP -m comment --comment "CFM:geo"
...

COMMIT

The main chain CFM_GEO contains only the per-/8 jump rules: typically 100-150 of them (not all 256 possible /8 blocks are populated). When a packet arrives with source IP, say, 1.2.3.4:

  1. It traverses CFM_GEO: match on 1.0.0.0/8 → jump to CFM_GEO_01
  2. It traverses CFM_GEO_01: ~350 rules specific to that /8
  3. If nothing matches, control returns to CFM_GEO and continues with the remaining jump rules

Evaluations per packet: ~150 (jumps) + ~350 (sub-chain) = ~500

Compare that with the 51,000 entries of the monolithic set (or, more precisely, with the ~16+ rbtree comparisons, multiplied by the overhead of interval management, on every single packet). In practice, chain-splitting reduces the evaluations per packet by a factor of ~100x.

For IPv6 we use the same principle but group by /16 (the first two bytes of the address), since IPv6 prefixes are generally wider and fewer in number.
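
To make the mechanism concrete, here is a minimal sketch of a generator for the IPv4 case (file names and chain names are illustrative; CFM's real generator also handles IPv6, counters, and error handling):

#!/usr/bin/env bash
# Read one IPv4 CIDR prefix per line from prefixes_v4.txt and emit an
# iptables-restore payload split into one sub-chain per populated /8.
set -euo pipefail

octets=$(cut -d. -f1 prefixes_v4.txt | sort -un)

{
    echo '*filter'
    echo ':CFM_GEO - [0:0]'
    # declare one sub-chain per populated /8
    for o in $octets; do
        printf ':CFM_GEO_%03d - [0:0]\n' "$o"
    done
    # jump rules in the parent chain
    for o in $octets; do
        printf -- '-A CFM_GEO -s %d.0.0.0/8 -j CFM_GEO_%03d\n' "$o" "$o"
    done
    # DROP rules, each routed to the sub-chain of its first octet
    while read -r p; do
        printf -- '-A CFM_GEO_%03d -s %s -j DROP -m comment --comment "CFM:geo"\n' "${p%%.*}" "$p"
    done < prefixes_v4.txt
    echo 'COMMIT'
} > cfm_geo.rules

# review cfm_geo.rules, then load it; note that a plain iptables-restore
# replaces the tables named in the file:
# iptables-restore < cfm_geo.rules

With ~51,000 input prefixes this produces roughly 100-150 jump rules plus the per-/8 sub-chains described above.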

The results

After switching to chain-splitting:

Metric             nft interval set (51k)   iptables chain-split
DNS query          5.05s (timeout)          0.02s
curl google.com    5.10s                    0.05s
Interactive SSH    unusable                 normal
Rule loading       ~3s                      ~2s

This isn't a marginal or, let's say, "cosmetic" improvement. It's the difference between a lag-free, fully functional server and an unusable one.

Bear in mind that this iptables configuration was running on an extremely modest machine: a Hetzner instance with 2 vCPUs and just 4 GB of RAM, which also hosted an active PHP interpreter with WordPress, a Percona Server 5.7 database, Memcached, and of course the web server most loved by system administrators worldwide, NGINX, for an overall memory footprint of at least a couple of gigabytes, as shown in the following screenshot.

[Image: resource usage on the 2 vCPU / 4 GB VPS]

The performance paradox

There's a profound irony in this story. nftables was designed, and is universally promoted, as the more performant replacement for iptables. And for many use cases, it is. Hash-based sets (without flags interval) for exact IPs work great. The atomic operations are elegant. The unified IPv4/IPv6 syntax is a pleasure.

But the specific use case of geo-blocking with tens of thousands of CIDR prefixes—which is probably the most common use case for very large sets—is precisely where nftables fails catastrophically.

The reason is structural: sets with flags interval use a red-black tree, not a hash table. CIDR ranges can't be hashed directly because they require inclusion lookups (does IP 1.2.3.4 fall within 1.2.0.0/16?), and that type of lookup requires an ordered data structure. The red-black tree guarantees O(log n) per operation, but with 51,000 elements and the traffic of a production server, that logarithm is multiplied by millions of packets per second and becomes an intolerable bottleneck.
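
The distinction is visible directly from the CLI. Something like the following (table and set names invented, error text paraphrased) shows why the interval flag, and therefore the rbtree, is unavoidable for CIDR data:

nft add table inet demo
nft add set inet demo exact_v4 '{ type ipv4_addr; }'
nft add element inet demo exact_v4 '{ 1.2.0.0/16 }'      # rejected: the set has no interval flag
nft add set inet demo range_v4 '{ type ipv4_addr; flags interval; }'
nft add element inet demo range_v4 '{ 1.2.0.0/16 }'      # accepted, stored as an interval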

iptables, with its "dumb" approach of linear, trivial rules, doesn't have this problem, because chain-splitting reduces the set of rules evaluated per packet to a manageable subset. Is it less elegant? Absolutely. Is it outdated? Absolutely. Is it an antipattern, unsightly and unpleasant? Again: yes, yes, yes. Does it work? Absolutely.

The rest is just talk.

Lessons learned

1. Benchmarks are not your use case

The 2017 Red Hat benchmark tested lookup speed on sets with thousands of elements, but under controlled conditions. It didn't test what happens when a real server, with a local DNS resolver, active SSH connections, a web server, and various services, has to evaluate every packet against an interval set of 51,000 prefixes. Synthetic benchmarks measure maximum throughput; real-world benchmarks measure latency under mixed load.

2. “More modern” does not mean “better for every scenario”

nftables is objectively superior to iptables for most scenarios. But software engineering is about tradeoffs, and the choice of data structure for interval sets (rbtree vs. hash) is one that severely penalizes very large sets. This isn't a bug; it's an architectural consequence.

3. Always test with real data

If we had tested with 51,000 real prefixes right away instead of a few hundred test rules, we would have discovered the problem during development. The lesson is simple: if your system needs to handle N elements in production, test with N elements in development.
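
For nftables sets, generating a realistic volume of test data is trivial. A sketch that loads ~51,000 synthetic prefixes into a scratch set in one transaction (table and set names are illustrative):

# 200 x 256 = 51,200 non-overlapping /24 prefixes
{
    echo 'add table inet cfmtest'
    echo 'add set inet cfmtest geo_test { type ipv4_addr; flags interval; }'
    for a in $(seq 1 200); do
        for b in $(seq 0 255); do
            echo "add element inet cfmtest geo_test { $a.$b.0.0/24 }"
        done
    done
} | nft -f -

Had we pointed real traffic at a rule referencing a set like this during development, the latency problem would have surfaced immediately.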

4. Fallback is key

CFM now supports three backends (iptables, nftables, firewalld), and all three use chain-splitting for geoblocking. Having a multi-backend architecture allowed us to quickly isolate the problem and implement the solution without rewriting the entire system.

Conclusion

We're not writing this article to denigrate nftables; we use it daily and appreciate its qualities. We're writing it because, in our experience, the dominant narrative ("nftables is always faster than iptables") requires an important qualification: it depends on the use case.

If you're managing sets of a few thousand exact IPs (blacklists, rate limiting, fail2ban), nftables with hash-based sets is excellent. If you need to geo-block tens of thousands of CIDR prefixes, carefully consider chain-splitting with iptables before assuming that nftables is the right choice.

Sometimes the "old" solution works better. And in our business—production Linux systems engineering—what works trumps what's elegant, every single time.
