In this article, we will help you understand sizing, estimating performance, and licensing for F5's BIG-IP Virtual Edition (VE). If you're new to virtualizing BIG-IP on commodity hardware, you've probably noticed that you can't find a clear datasheet for the BIG-IP VE the way you can for the custom hardware appliances F5 sells. That's because it's impossible to provide an exact prescription: no two environments are the same, i.e., each has different CPU families and types, core speeds, hypervisor platforms, etc. Consequently, this will be the closest thing you'll find to an F5 VE datasheet. We're calling it a dynamic datasheet because technology is constantly changing. As we move through this article, we will outline both hard and soft limits while highlighting factors that can influence performance in your environment. For the purposes of this article, we assume you are deploying at least version 15.1, as the latest 15.1 releases are the minimum requirement for maximum performance in virtual environments at the time of writing.
Hard Throughput Limits – On-Premise aka “Private Clouds”
When deploying F5 BIG-IP VE in your own private cloud, you will face hard throughput limits that cannot be exceeded regardless of licensing, due to the nature of the underlying hypervisor. At a high level, KVM is the most performant hypervisor platform, with support for enhanced hardware offload such as Intel SmartNIC, Intel QAT, and SR-IOV on high-end 100Gbps NICs; VMware is the second most performant; and Hyper-V and others sit at the bottom of the pack.
| Hypervisor | Max Throughput | Additional Technology Support |
| --- | --- | --- |
| KVM | 85Gbps | SR-IOV, Intel SmartNIC 1.0/2.0, Intel QAT, Cisco eNIC |
| VMware | Up to 40Gbps | SR-IOV (required for 10Gbps to 40Gbps), TSO |
Hard Throughput Limits – Public Cloud
When deploying F5 BIG-IP VE in the major public clouds (for the purposes of this article, we will only look at Azure, AWS, and GCP), there are also "hard" throughput limits, as well as considerations around the instance type backing the VE. In all three of these public clouds, we suggest a hard limit of 20Gbps through a single F5 BIG-IP VE instance. That is likely to increase over time, but at the time of writing, that's our recommendation. Instance sizing will depend heavily on the modules you are using as well as the licensed throughput. In some cloud environments, you may need to build your own AMI/VHD image using the F5 image builder to support the most suitable instance type for your workloads. Our baseline suggestions for most instance deployments are as follows:
- For 5Gbps or below, we suggest "balanced" instance types.
- For 10Gbps to 20Gbps, we suggest "compute optimized" instance types to get access to higher-clocked CPUs.
Note – Environments requiring a substantial number of SSL TPS should always target compute optimized instances and may need to allocate more cores than are licensed for optimal performance. This will be discussed later in the article.
| Cloud | Max Throughput | Additional Technology Support |
| --- | --- | --- |
| AWS | 20Gbps | Enhanced Networking (SR-IOV) |
| Azure | 20Gbps | Accelerated Networking (SR-IOV) |
Backing Hardware Performance Considerations
Much like other Network Virtual Appliances (NVAs), F5 BIG-IP VE can be very sensitive to the hardware backing it. In general, you will want the most efficient CPUs (highest IPC) as well as the highest core speed (more MHz is better) to get maximum performance out of your VEs. Typically, we suggest running VEs on hypervisors that are optimized for NVA functions, denoted by high-throughput NICs, high-core-speed CPU SKUs, and CPUs that are as modern as possible. You typically do not want to run VEs on hypervisors optimized for density, also known as high-core-count SKUs, as these typically have much lower core speeds. Additionally, you want to ensure that your hypervisor configuration exposes as many CPU features as possible – this is typically an issue in mixed-family VMware clusters, where newer CPUs cannot be fully leveraged in order to preserve vMotion compatibility. We will further highlight how CPUs impact various F5 BIG-IP VE functions below:
F5 BIG-IP VE, like many other NVAs, leverages DPDK (Data Plane Development Kit) to abstract the underlying hardware layer and provide a standard interface for hardware acceleration or CPU-driven packet processing. Due to the architecture of DPDK, core speed becomes critical as throughput levels increase because of the FIFO ring buffer. Additionally, having modern CPUs and NICs enables DPDK to offload as many features as possible to hardware. For additional information on how DPDK works, please see the documentation – https://doc.dpdk.org/guides/prog_guide/overview.html
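To build intuition for why core speed matters as throughput rises, the following toy simulation (our illustration, not DPDK code; the rates and ring size are arbitrary) models a fixed-size RX ring: packets arrive at a constant rate, a single core drains what its speed allows each tick, and arrivals that find the ring full are tail-dropped.

```python
def simulate_rx_ring(arrival_pps, drain_pps, ring_slots, ticks=10_000):
    """Toy model of a fixed-size FIFO ring buffer (illustrative only).

    Each tick, a slice of the arriving packets is enqueued (tail-dropping
    whatever exceeds the free slots) and the core dequeues what it can.
    A faster core drains more per tick, so fewer packets are dropped.
    """
    backlog = dropped = serviced = 0.0
    arrive_per_tick = arrival_pps / ticks
    drain_per_tick = drain_pps / ticks
    for _ in range(ticks):
        accepted = min(arrive_per_tick, ring_slots - backlog)
        dropped += arrive_per_tick - accepted
        backlog += accepted
        served = min(backlog, drain_per_tick)
        backlog -= served
        serviced += served
    return serviced, dropped

# A core that keeps up with 1M pps drops nothing; a slower core cannot,
# no matter how large the ring is, because the backlog never drains.
fast = simulate_rx_ring(1_000_000, 1_200_000, ring_slots=512)
slow = simulate_rx_ring(1_000_000, 700_000, ring_slots=512)
print(f"fast core dropped {fast[1]:.0f} pkts, slow core dropped {slow[1]:.0f}")
```

The ring only absorbs short bursts; sustained load beyond what the core can drain turns directly into drops, which is why a high-clock CPU beats a high-core-count, low-clock one for this workload.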
F5 BIG-IP VE will leverage fixed-function CPU offload for many cryptographic operations (such as AES encryption and hashing) where possible. Since BIG-IP VE does not have a licensed SSL TPS (Transactions Per Second) limit, your SSL performance is capped purely by the capabilities of the underlying hardware and your SSL cipher string and key size configurations. SSL functions are also somewhat tied to the clock speed of the CPU as it processes these transactions through the system. In environments requiring high SSL performance, you will want to ensure your CPU has the latest fixed functions, which will typically be the following, ranked by performance:
| Function | Acceleration Options (fastest first) |
| --- | --- |
| Random Number (ephemeral key generation) | RDRAND, Software (No Accel) |
| AES | AES-NI, Software (No Accel) |
| GCM | AVX2, AVX, SSSE3, Software (No Accel) |
| SHA1 | SHA-NI, AVX2, AVX, SSSE3, Software (No Accel) |
| SHA2 | SHA-NI, AVX2, AVX, SSSE3, Software (No Accel) |
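On Linux, you can quickly check which of these instruction sets a host (or a guest, if the hypervisor passes them through) exposes by reading /proc/cpuinfo. This is a generic sketch of ours, not an F5 tool; on non-Linux systems it simply returns an empty set.

```python
import pathlib

def cpu_crypto_flags(cpuinfo_path="/proc/cpuinfo"):
    """Return which crypto-relevant CPU flags this Linux host advertises.

    Flag names follow /proc/cpuinfo conventions: "aes" means AES-NI,
    "sha_ni" means the SHA extensions, "rdrand" the hardware RNG.
    """
    wanted = {"rdrand", "aes", "avx", "avx2", "ssse3", "sha_ni"}
    try:
        text = pathlib.Path(cpuinfo_path).read_text()
    except OSError:
        return set()  # not Linux, or /proc unavailable
    flags = set()
    for line in text.splitlines():
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
    return wanted & flags

print(sorted(cpu_crypto_flags()))
```

If a guest is missing flags the host has (e.g. no `aes` inside a VM), revisit your hypervisor's CPU masking/compatibility settings before blaming the hardware.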
* Example data using 4k key sizes and a full CPU allocated to the VE, with estimated maximums
| Example CPU | Estimated RSA Sign/Second | Estimated RSA Verify/Second |
| --- | --- | --- |
| Xeon Gold 6226R (16 cores) | ~4000 | ~250K |
| AMD EPYC 7F52 (16 cores) | ~4500 | ~280K |
| Xeon E-2288G (8 cores) | ~2000 | ~150K |
| AMD EPYC 72F3 (8 cores) | ~2000 | ~140K |
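The numbers above are estimates; the usual way to gauge your own hardware is OpenSSL's built-in benchmark, which reports signs and verifies per second. Note that a single `openssl speed` run exercises one core, and its standard `-multi` option runs N parallel processes to approximate a whole VE's worth of cores (your ciphers, key sizes, and TLS stack will still differ from a raw benchmark).

```shell
# Single-core 4096-bit RSA sign/verify rates (runs ~10s per operation)
openssl speed rsa4096

# Approximate an 8-core VE by running 8 parallel benchmark processes
openssl speed -multi 8 rsa4096
```

Compare the reported sign/s against your expected SSL TPS to see whether the CPU, not the license, will be your ceiling.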
Advanced Modules – AWAF/APM
When using the Advanced WAF (AWAF) and/or APM modules, the BIG-IP will perform functions outside of the usual TMM (Traffic Management Microkernel) processes that can be very core-speed dependent. Additionally, provisioning a VE with more cores than licensed can help under certain conditions, such as heavy AWAF usage or instances where high volumes of APM authentication are being performed. As with SSL, you will typically want the newest CPUs with the most instruction sets, favoring core speed over core density.
Any time you do more advanced traffic inspection or processing via iRules or traffic policies, remember that the instance now has to burn additional clock cycles for every transaction that occurs (depending on the style of rule or policy). In this case, CPU clock speed will be king for keeping added latency and overhead to a minimum. In latency-sensitive environments or those with heavy iRule usage, we always suggest backing the instance with the highest-performing (IPC and clock speed) CPU available.
F5 BIG-IP VE Cloud-Specific Instance Recommendations
When running in cloud environments, the underlying hardware family is abstracted away, though it is often hinted at by the instance family naming. At the time of writing, the following instance families are what we suggest for higher-performance applications where throughput will be 10Gbps or higher, large numbers of SSL transactions will be taking place, and/or a high volume of WAF/APM processing is occurring on the instances. Please note that many of these instance types may require the creation of a custom image (via the F5 BIG-IP Image Builder Tool) to be selectable as an option.
| Cloud | Instance Family | Alt Families |
| --- | --- | --- |
| Azure | DSv5 | Dasv5, Dsv4, Dasv4, Fsv2 |
Licensing BIG-IP Virtual Edition – Throughput Limits
F5 prices their BIG-IP Virtual Edition licenses based on throughput limits. Below we break down the most common licenses and consumption models. For those with edge-case needs, we can always consult about licensing specific to your environment. In most cases, F5 BIG-IP VE is licensed by the amount of throughput and by the Good/Better/Best module bundle. This licensing model enables a fixed number of TMM cores based on the bundle and licensed throughput selected. The following modules come with each bundle:
- Good – Local Traffic Manager (LTM) module
- Better – LTM, Advanced Firewall Manager (AFM), and Domain Name System (DNS) modules. Note – the DNS module was formerly known as the Global Traffic Manager (GTM).
- Best – LTM, AFM, DNS, Access Policy Manager (APM), and Advanced Web Application Firewall (AWAF)
The F5 BIG-IP VE has a special low-cost LAB license that is great for testing but prohibited from any production use. The throughput limit on a LAB license is 10Mbps – past that, the system will drop traffic. The LAB license is only available with the Best module bundle. Still, certain functionality is impossible due to licensing around third-party components used inside the BIG-IP. Typically, this will be anything with a subscription component (Bot SDK, IPI, Threat Campaigns, etc.). When you need to test these add-on subscriptions, we recommend purchasing a 25Mbps license for your lab use instead of the 10Mbps LAB license.
Up to 10Gbps Licenses
In situations where 10Gbps is your maximum throughput requirement, you can purchase a license anywhere from 25Mbps all the way up to 10Gbps. Note that the license limit is NOT an aggregate of ingress and egress; you actually get the full bandwidth limit in each direction. For example, with a 10Gbps license, you get 10Gbps leaving your device as well as 10Gbps coming into your device (20Gbps total). The following table lists the throughput licenses and the allowable cores per module bundle:
| Throughput License | Good (TMM Cores) | Better (TMM Cores) | Best (TMM Cores) |
| --- | --- | --- | --- |
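To make the per-direction point concrete, here is a trivial check (a sketch of ours; the function name is not an F5 API): a workload fits a license as long as neither direction individually exceeds the licensed rate, even if the aggregate does.

```python
def fits_throughput_license(ingress_gbps, egress_gbps, licensed_gbps):
    """VE throughput licenses apply per direction, not to the aggregate."""
    return ingress_gbps <= licensed_gbps and egress_gbps <= licensed_gbps

# 9Gbps in + 9Gbps out (18Gbps aggregate) still fits a 10Gbps license
print(fits_throughput_license(9, 9, 10))   # True
# 11Gbps in one direction does not, even with idle egress
print(fits_throughput_license(11, 0, 10))  # False
```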
High Performance Licenses – Beyond 10Gbps
When you need more than 10Gbps of bandwidth through an F5 BIG-IP VE, the licensing changes over to "High Performance," which has no hard throughput limit but is instead licensed by the number of cores and the modules enabled. The actual throughput limit for a High Performance VE will depend on the backing hypervisor or cloud as well as the underlying hardware. There is only a single bundle for High-Perf VEs, the VNF Better bundle; however, that is typically not what we see consumed in most cases. Instead, we typically see High Perf VEs licensed with LTM plus another module, such as LTM + AWAF or LTM + APM. The different CPU license options you can buy for a high-performance VE are listed below:
Dedicated / Stand-Alone Modules
While you can technically buy all the modules standalone, we typically only recommend that for the DNS module. For example, while you can buy AWAF as a dedicated standalone license, you typically want to pair it with the LTM module for full functionality – and vice versa. You can buy the LTM module as a standalone license, i.e., the "Good" license, but you typically want to pair it with a security module like AWAF: if you're already terminating SSL at the LTM, it naturally lends itself to securing that now-decrypted HTTP traffic. Not to mention F5 has the best WAF out there 😉
Dedicated DNS License – For those who need dedicated high-performance DNS/GSLB functionality in a Virtual Edition, F5 licenses it by the volume of DNS requests it can handle, i.e., Requests Per Second (RPS), not by throughput like the other VE licenses. The most common dedicated VE BIG-IP DNS licenses are listed below:
| DNS License Type | Requests Per Second (RPS) |
| --- | --- |
| VE DNS 250K | 250,000 RPS |
| VE High Performance DNS | >250K RPS |
F5 BIG-IP Virtual Edition Tips and Tricks
Many of you may be familiar with the HTSplit functionality on BIG-IP hardware, where the data plane (TMM) runs on the physical cores and control/analysis plane functions run on the virtual (hyperthreaded) cores. This performance optimization is native to the hardware but not present in the Virtual Edition. You can emulate it by "overprovisioning" the number of CPU cores allocated to the instance. Many folks think the licensed core count applies to all functions, but in reality, it only applies to the number of TMM instances launched in the VE. When overprovisioning, be sure to monitor TMM and non-TMM CPU usage separately, as overall CPU utilization can be misleading if the TMM cores are under high contention. Overprovisioning CPU can be useful in the following situations, with reasons listed below:
- Heavy control plane usage – automation, large configurations, heavy monitoring, etc. – The operating system plane of the BIG-IP VE will indeed use the extra CPU cores for various functions, as we have seen time and time again. Still, you often do not need to double the number of CPU cores relative to the licensed limit; HTSplit may show 8 TMMs and 8 control plane cores on hardware, but you may only need to add 4 cores to an 8-TMM licensed VE. One example would be a 1Gbps Best bundle instance provisioned with 12 cores and 8 TMMs running, leaving 4 extra cores for non-TMM processes. This can be tuned per environment to ensure that no compute is "wasted" on the instance in question.
- Heavy SSL/TLS Throughput on older CPUs – When running older CPU families that may not be able to hardware accelerate all crypto functions, we have found that increasing the number of CPU cores can help an instance better handle the load. This has to do with how SSL/TLS processing occurs in software, how offloading instruction sets/software works in the Linux kernel, and how DPDK works under the hood. Depending on the instance’s load, the CPU’s age, and other module usage, we may “double up” on the licensed number of TMMs. i.e. Provision a 1Gbps Best bundle with 16 cores so that there are 8 TMMs and 8 extra cores for non-TMM processes.
- Heavy usage of AWAF and/or APM – In instances of heavy usage of the Advanced WAF or APM modules, many processes are not running inside TMM. One example would be the BD engine inside of Advanced WAF, and another would be many of the authentication processes used by APM when doing Kerberos or Active Directory functions. In these cases, we often “double up,” like with the Heavy SSL, to help ensure that control plane and analysis plane processes have additional overhead. We have often seen this significantly relieve CPU contention in cases of high throughput (3-10Gbps) Best bundles where high traffic levels are being processed in conjunction with AWAF/APM and heavy SSL/TLS usage.
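The three scenarios above can be pulled into one rough heuristic. This is our rule of thumb from this article, not an official F5 sizing formula, and any result should be validated against actual TMM vs. non-TMM CPU usage:

```python
def suggested_vcpus(licensed_tmm_cores, heavy_control_plane=False,
                    heavy_ssl_on_old_cpu=False, heavy_awaf_apm=False):
    """Rule-of-thumb vCPU allocation for a BIG-IP VE (illustrative only).

    Heavy SSL on older CPUs or heavy AWAF/APM usage: "double up" on the
    licensed TMM core count. Heavy control plane usage alone: roughly
    half again is often enough. Otherwise, match the licensed count.
    """
    if heavy_ssl_on_old_cpu or heavy_awaf_apm:
        return licensed_tmm_cores * 2
    if heavy_control_plane:
        return licensed_tmm_cores + max(licensed_tmm_cores // 2, 1)
    return licensed_tmm_cores

# The 1Gbps Best bundle examples above: 8 licensed TMMs -> 12 or 16 vCPUs
print(suggested_vcpus(8, heavy_control_plane=True))   # 12
print(suggested_vcpus(8, heavy_ssl_on_old_cpu=True))  # 16
```

Whatever starting point you pick, remember the license only caps TMM instances; the extra cores go to control and analysis plane processes.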
Putting it all together
As you can see, many facets of VE licensing, sizing, and performance must be considered. When starting to deploy Virtual Editions, we always recommend first looking at the desired traffic level in throughput, then looking at your needed SSL capacity. From there, you can start to decipher licensing requirements and where the instance should live – in addition to what configuration you will need from the backing Hypervisor to meet your goals. If you want to move to F5 BIG-IP Virtual Edition and need help purchasing, sizing, or deploying them into your environment, please don’t hesitate to Contact Us.