About the author:
Seagle Yang is Zenlayer’s Bare Metal Product Architect. He has over 20 years of experience in the industry as a Business Solutions Architect with expertise in system and network design and administration. Seagle loves taking on new technical challenges so he can keep learning and picking up new skills as he works to overcome them.
The data deluge and the bottleneck blues
The modern world runs on data. From real-time AI-powered applications to lightning-fast video streaming, our insatiable appetite for information is pushing the boundaries of technology. CPUs and GPUs have evolved rapidly, and compute performance has increased significantly in recent decades. However, a silent enemy lurks beneath the surface of this digital revolution: the bandwidth bottleneck. Traditional server network speeds are struggling to keep pace with the ever-growing deluge of data, creating a frustrating gridlock that hinders performance and stifles innovation.
This is where the 100G Server Revolution steps in. As a monumental leap forward in server network technology, 100 Gigabit Ethernet (100GbE) doesn’t just offer unprecedented bandwidth; it represents a shift in how businesses can leverage data to achieve their goals.
In this 3-part series, we’ll explore the rising business demands driving the need for faster servers, take a look at the technological advancements that make 100GbE possible, and dive into the design of these powerful machines, highlighting how this technology translates into real-world benefits.
Business value: 10G vs. 100G server showdown
In the data center arena where information reigns supreme, power is the lifeblood that keeps everything humming. While 10 Gigabit Ethernet (10G) servers have been the workhorse for years, 100 Gigabit Ethernet (100G) is emerging as a strong contender. But before you upgrade, let’s crunch the numbers and see how they stack up in terms of cost.
We’ll begin by defining the parameters:
TDP: Every CPU has a Thermal Design Power (TDP) rating, which essentially tells you how much heat it generates at peak performance. Because nearly all the electrical power a CPU draws ends up as heat, TDP serves as a practical proxy for peak CPU power consumption. For dual-socket configurations, expect the CPU TDP to double compared to a single-socket setup.
Peripherals: It’s not just the CPU that gobbles up power. System boards, memory, storage drives, and network adapters all eat up their fair share. On average, peripherals can draw between 140W and 180W at peak usage. 100G network adapters tend to be more power-hungry compared to their 10G counterparts (think 24W vs. 8W).
Server utilization: Server power consumption isn’t a constant hum. It fluctuates between peak periods (think busy hours) and non-peak periods (think off-hours). To account for this, we can use a representative value like 67% as an average utilization rate.
PSU efficiency: Modern servers come equipped with Power Supply Units (PSUs) with efficiency ratings typically around 80%. This means that for every 100 watts the server pulls from the wall, only about 80 watts actually reach the components; the rest is lost as heat during conversion. Put the other way around, a system that needs 80 watts draws 100 watts at the wall.
PUE impact: Data centers are complex ecosystems with their own power requirements for cooling and infrastructure. This is reflected in the Power Usage Effectiveness (PUE) metric. A lower PUE (ideally close to 1.0) indicates a more efficient facility. For this comparison, let’s assume a PUE of 1.6.
Monthly cost: Now that we’ve explored the key factors, let’s assess them in the real world. We’ll calculate the monthly operational cost for both 10G and 100G servers based on our assumptions. This will give us a clear picture of the financial implications associated with each network speed.
Formula for power-cost calculations
To determine how many kilowatt-hours (kWh) a system consumes in a month (730 hours):
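kWh = ((Peak Power * Server Utilization) / PSU Efficiency) * PUE * 730 / 1000
Here, Peak Power is the CPU TDP plus the peak peripheral draw, in watts, and PSU Efficiency is expressed as a fraction (0.8 for 80%).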
Example: power cost breakdown (1-socket server with 200W CPU)
Let’s take a sample 1-socket server with a 200W CPU. Here’s how we can estimate its monthly power cost:
- Peak power consumption: 200W (CPU) + 140W (peripherals) = 340W
- Average power consumption: 340W * 67% = 227.8W
- PSU efficiency adjustment: 227.8W / 0.8 (PSU efficiency) = 284.75W
- Monthly kWh consumption: 284.75W * 730 hours/month / 1000W/kW = 207.87 kWh
- Facility PUE impact: 207.87 kWh * 1.6 (PUE) = 332.59 kWh
- Total monthly power cost: 332.59 kWh * $0.30/kWh (assuming California electricity rates as of June 2024) = $99.78
- Power cost to generate 1 Mbps of traffic: Even with high network utilization (around 67% of the 10 Gbps bandwidth, or 6.7 Gbps), the monthly cost to generate 1 Mbps of traffic on this 10G server would be $99.78 / 6,700 Mbps = $0.0149 per Mbps.
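The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than a sizing tool: the function and parameter names are our own, and the defaults simply mirror the assumptions stated in this article (67% utilization, 80% PSU efficiency, a PUE of 1.6, 730 hours per month, and $0.30/kWh).

```python
HOURS_PER_MONTH = 730  # monthly hours used in this article


def monthly_kwh(peak_watts, utilization=0.67, psu_efficiency=0.80, pue=1.6):
    """Estimate facility-level energy use (kWh per month) for one server."""
    average_watts = peak_watts * utilization      # average draw of the components
    wall_watts = average_watts / psu_efficiency   # draw at the wall, including PSU losses
    facility_watts = wall_watts * pue             # add cooling and infrastructure overhead
    return facility_watts * HOURS_PER_MONTH / 1000  # watts over a month -> kWh


def cost_per_mbps(peak_watts, link_mbps, link_utilization,
                  price_per_kwh=0.30, **power_assumptions):
    """Monthly power cost divided by the average traffic generated, in Mbps."""
    monthly_cost = monthly_kwh(peak_watts, **power_assumptions) * price_per_kwh
    return monthly_cost / (link_mbps * link_utilization)


if __name__ == "__main__":
    peak = 200 + 140  # CPU TDP + peripherals, in watts
    kwh = monthly_kwh(peak)
    print(f"{kwh:.2f} kWh/month, ${kwh * 0.30:.2f}/month")
    print(f"${cost_per_mbps(peak, 10_000, 0.67):.4f} per Mbps")
```

Changing any single assumption (for example, a lower PUE or a different electricity rate) only requires passing a different keyword argument, which makes it easy to see how sensitive the result is to each input.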
Cost efficiency showdown
We can use the same formula to run the numbers for the various configurations below. Simply factor in the appropriate peak power for each setup, assuming all configurations use the same CPU. Keep in mind that a 2-socket configuration doubles the peak CPU TDP, and that 2-socket and 100G configurations also draw more peripheral power. Doing this will result in the following power-cost comparison chart:
Even if we assume a lower average network utilization of around 50% of the 100 Gbps bandwidth (or 50 Gbps), and the slightly higher power consumption introduced by the 100GbE optical module, the cost per Mbps can still be significantly lower compared to a 10G single-socket system (refer to the chart below for more detailed comparisons). This cost advantage becomes more evident when factoring in server hardware design. Some single-socket servers are now capable of achieving 100GbE throughput, eliminating the need for dual-socket configurations in some cases. This translates to even greater cost savings by reducing overall server footprint and power consumption.
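To make the comparison concrete, here is a rough Python sketch that applies the same formula to four configurations. The inputs are illustrative assumptions, not the values behind the chart: we assume 140W of peripherals for 1-socket systems and 180W for 2-socket systems (the two ends of the range quoted earlier), add the 24W vs. 8W adapter difference for 100G, and ignore the extra draw of the optical module. All names in the code are our own.

```python
def monthly_cost_usd(peak_watts, utilization=0.67, psu_efficiency=0.80,
                     pue=1.6, hours=730, price_per_kwh=0.30):
    """Monthly power cost using the formula and assumptions from this article."""
    kwh = peak_watts * utilization / psu_efficiency * pue * hours / 1000
    return kwh * price_per_kwh


CPU_TDP_W = 200             # same CPU in every configuration
NIC_100G_EXTRA_W = 24 - 8   # 100G adapter draw minus 10G adapter draw

# name -> (sockets, peripheral watts, average traffic in Mbps)
# ASSUMPTION: peripheral wattages are picked from the 140W-180W range for
# illustration only; traffic is 67% of 10 Gbps and 50% of 100 Gbps.
CONFIGS = {
    "1-socket 10G":  (1, 140, 6_700),
    "2-socket 10G":  (2, 180, 6_700),
    "1-socket 100G": (1, 140 + NIC_100G_EXTRA_W, 50_000),
    "2-socket 100G": (2, 180 + NIC_100G_EXTRA_W, 50_000),
}


def compare(configs=CONFIGS):
    """Return {name: (monthly cost in USD, monthly cost per Mbps in USD)}."""
    results = {}
    for name, (sockets, peripherals_w, traffic_mbps) in configs.items():
        peak_w = sockets * CPU_TDP_W + peripherals_w
        cost = monthly_cost_usd(peak_w)
        results[name] = (cost, cost / traffic_mbps)
    return results


if __name__ == "__main__":
    for name, (cost, per_mbps) in compare().items():
        print(f"{name:14s} ${cost:7.2f}/month  ${per_mbps:.4f} per Mbps")
```

Under these assumptions, both 100G rows come out several times cheaper per Mbps than the 10G rows: the extra adapter power adds only a few dollars per month, while the traffic in the denominator grows by a much larger factor.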
*Comparison notes and exclusions:
- These cost estimates primarily focus on power utilization, a significant operational expense for data centers.
- As data centers typically incur additional fixed costs like space/footprint surcharges and IP services, regardless of whether you choose 10G or 100G servers, these additional costs are not factored into the calculations presented here.
- While the initial cost of a dual-socket system or 100G network adapter might be higher compared to 10G configurations, the increased Mbps throughput of 100G can potentially offset this difference over time. In a scenario with 50% network utilization, the depreciation cost per Mbps generated by the 100G setup becomes a relatively minor factor (around 0.001x).
Server utilization: multiple ports bonding vs. single 100G
Technically, we can achieve 100GbE throughput by bonding multiple 10G or 25G ports to create a virtual 100G pipe. However, this approach comes with some performance overhead. Let’s take 4x25G as an example to explore the factors involved:
Multiple ports LACP bonding overhead
- LACP processing: The CPU needs to handle the Link Aggregation Control Protocol (LACP) overhead for managing the aggregation of the four links. This includes tasks like link negotiation, monitoring link health, and distributing traffic based on the chosen hash algorithm.
- Interrupt handling: Each of the four physical links can generate interrupts for the CPU when they receive or transmit data packets. This can increase the overall interrupt handling burden on the CPU compared to a single 100G port.
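- Packet distribution and ordering: A bond does not split individual packets across links. Instead, each outgoing packet is assigned to one member link based on the hash algorithm, usually so that all packets of a flow stay on the same link. The CPU might be involved in computing that hash for every packet, and with bonding modes that distribute traffic per packet rather than per flow, the receiving side might also need to cope with packets arriving out of order.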
Single 100G port overhead:
- Simpler processing: A single 100G port requires less CPU overhead for managing the link. There’s no LACP protocol overhead or the need for complex traffic distribution across multiple links.
- Reduced interrupts: The CPU only needs to handle interrupts from a single link, reducing the overall interrupt processing workload.
In general, you can expect the CPU overhead of 4x25G LACP bonding to be 10-20% higher compared to a single 100G port. A single 100G port offers a significant advantage in terms of CPU efficiency as it avoids LACP bonding overhead, letting you dedicate more CPU power to your applications compared to a system using bonded 4x25G ports.
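The hash-based traffic distribution mentioned above can be illustrated with a small Python sketch. It is a simplified stand-in, not the algorithm any particular NIC, driver, or switch uses: the CRC32 hash, the key format, and the function names are all assumptions chosen for readability.

```python
import zlib
from collections import Counter

NUM_LINKS = 4  # a 4x25G bond


def pick_link(src_ip, dst_ip, src_port, dst_port, num_links=NUM_LINKS):
    """Map a flow to one member link by hashing its addresses and ports.

    ASSUMPTION: CRC32 over a text key stands in for whatever hash policy
    the real bonding implementation is configured to use.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % num_links


def spread(flows, num_links=NUM_LINKS):
    """Count how many flows land on each member link."""
    return Counter(pick_link(*flow, num_links=num_links) for flow in flows)


if __name__ == "__main__":
    # 1,000 hypothetical client connections to one HTTPS service.
    flows = [("10.0.0.1", "10.0.1.1", 40000 + i, 443) for i in range(1000)]
    for link, count in sorted(spread(flows).items()):
        print(f"link {link}: {count} flows")
```

Two things follow from this kind of per-flow hashing. The hash has to be evaluated for outgoing traffic, which is part of the extra work a bond introduces, and every packet of a given flow always lands on the same member link, so a single flow cannot exceed the speed of one 25G link even though the bond totals 100G. A single 100G port has neither constraint.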
Final thoughts
100GbE represents a monumental leap forward in server network technology. It breaks limitations by delivering unparalleled bandwidth that empowers real-time applications like AI and high-res video streaming. But its benefits extend beyond speed. 100GbE enables significant cost savings through optimized server utilization and data transfer. Additionally, it simplifies network management by reducing CPU overhead compared to traditional bonded ports. This translates to smoother user experiences and frees up valuable processing power that applications need to run at peak performance.
Indeed, 100GbE servers unlock a new standard of data center performance, cost-efficiency, and agility for businesses. This article merely scratches the surface. To follow, we’ll take a deeper look into the intricate design and construction of these powerful machines, explore the specific features of our 100G server solutions, and highlight how they can propel your business to the forefront of the data-driven world.
Stay tuned!