Currently, four categories of business explicitly require low latency. First, financial and electronic trading, especially high-frequency trading of products such as futures. Second, high-definition video services carried over TCP, including 4K/8K live streaming and on-demand video, high-definition video conferencing, and future virtual reality (VR) services with extremely demanding real-time and bandwidth requirements. Third, certain cloud services, notably virtual machine migration, data hot backup, and real-time-sensitive services such as cloud desktops and cloud payments. Fourth, 5G mobile transport services: the latency budget reserved for the transport layer in 5G networks is very stringent, and new low-latency transmission technologies will be needed to meet it.
Extreme low-latency requirements of financial and electronic trading
Information asymmetry is the golden rule of profitable trading: lower latency means your information arrives earlier, your orders reach the trading center sooner, and you are more likely to profit. In developed financial and trading markets (especially the United States), high-frequency trading (HFT), also known as machine trading or algorithmic trading, has been prevalent for many years, covering futures, stocks, foreign exchange, and other assets. Statistics show that in 2009 HFT accounted for 61% of total US trading volume, climbing to 70% by 2012; in the United Kingdom it reached 77% in 2011.
Back in 2008, a research report titled "The Value of a Millisecond: Finding the Optimal Speed of a Trading Infrastructure," published by the US consulting firm TABB Group, estimated that if a trading company's processing time in the US electronic trading market (including transmission latency) is 5 milliseconds (ms) slower than its competitors', it will lose 1% of its profits; at 10 ms slower, the loss grows to 10%.
Figure 1: Schematic diagram of the latency performance of the New York-Chicago microwave relay circuit
The adage "time is money" is vividly demonstrated in high-frequency trading. As early as 2007, InformationWeek, in a report titled "Wall Street's Quest To Process Data At The Speed Of Light," claimed that in the US electronic financial trading market, a 1ms latency advantage was worth $100 million. A 2014 report disclosed by Virtu Financial, a US-based HFT company preparing for its IPO, showed that it had only one unprofitable trading day out of 1278 trading days over the previous five years.
The development of HFT has driven financial and electronic trading companies to pursue low latency to the extreme: on the one hand, these companies deploy their servers as close as possible to the servers of trading institutions (NYSE, NASDAQ, CME, etc.), preferably in the same data center; on the other hand, these companies' pursuit of low-latency transmission circuits has also reached a fever pitch.
Low-latency requirements of high-throughput services such as 4K/8K video and virtual reality
TCP has become the mainstream transport protocol on the Internet. TCP's acknowledgment mechanism ensures reliability, but it also limits throughput, which is bounded by three factors: bandwidth (BW), round-trip time (RTT), and packet loss rate (ρ), as shown in the following formula:

Throughput = min{ BW, W/RTT, (MSS/RTT)·(C/√ρ) }

where W is the effective window size, MSS is the maximum segment size, and C is a constant (≈1.22 in the standard loss-based model). Assuming sufficient bandwidth and good network quality, the packet loss rate can be disregarded, and latency becomes the decisive factor: if latency is too high, the client experience cannot be improved by adding bandwidth alone. This is figuratively called a "bandwidth black hole."
Currently, the TCP header advertises the window size in a 16-bit field, so without window scaling the maximum window W is 64KB (65,535 bytes); MSS (Maximum Segment Size) is typically 1460 bytes. In transport networks the packet loss rate can generally be taken as zero, so the last term can be ignored. Assuming a bandwidth (BW) of 10Gbps and a one-way latency of 10ms (round-trip time RTT of 20ms), the above formula gives a maximum TCP throughput of only about 26.2Mbps (65,535 bytes × 8 ÷ 20ms), far below the network bandwidth.
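The window-limited term of the formula is easy to verify numerically. The sketch below (function and constant names are illustrative, not from the original) reproduces the 26.2Mbps figure: without window scaling, TCP can keep at most one full window of unacknowledged data in flight per round trip, so throughput ≤ W/RTT regardless of link speed.

```python
# Window-limited TCP throughput: at most one window of unacknowledged
# data can be in flight per round trip, so throughput <= window / RTT.

MAX_WINDOW_BYTES = 65_535   # 16-bit advertised window, no scaling
RTT_S = 0.020               # 20 ms round trip (10 ms one-way)

def window_limited_throughput_bps(window_bytes: float, rtt_s: float) -> float:
    """Maximum achievable TCP throughput in bits per second."""
    return window_bytes * 8 / rtt_s

tput = window_limited_throughput_bps(MAX_WINDOW_BYTES, RTT_S)
print(f"{tput / 1e6:.1f} Mbps")  # ≈ 26.2 Mbps, no matter how fast the link is
```

Note that the 10Gbps link bandwidth never enters the calculation: once the window term is the binding constraint, extra bandwidth is simply unused.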
Considering actual network conditions, the industry generally believes that the throughput of real-time high-throughput services such as 4K/8K high-definition video needs to reach 1.5 times the actual bitrate to guarantee service quality. Therefore, the throughput requirement for 4K high-definition video is 30~45Mbps, and according to the above formula, the maximum tolerable round-trip time (RTT) is 12~17ms.
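The 12~17ms bound follows from inverting the same window-limited formula: RTT ≤ W × 8 / target throughput. A minimal sketch, assuming the unscaled 64KB window discussed above (names are illustrative):

```python
# Largest RTT at which a window-limited TCP flow still reaches a target
# throughput: invert throughput = W / RTT.

MAX_WINDOW_BYTES = 65_535  # unscaled 16-bit window

def max_rtt_s(window_bytes: float, target_bps: float) -> float:
    """Maximum tolerable RTT (seconds) for the given target throughput."""
    return window_bytes * 8 / target_bps

for mbps in (30, 45):
    print(f"{mbps} Mbps -> RTT <= {max_rtt_s(MAX_WINDOW_BYTES, mbps * 1e6) * 1e3:.1f} ms")
# 30 Mbps allows ~17.5 ms; 45 Mbps allows ~11.7 ms - the 12~17 ms range cited.
```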
The industry has proposed new technologies to address TCP's window limitation. For example, the window scaling option in RFC 7323 shifts the 16-bit advertised window left by up to 14 bits, extending the effective window to 30 bits (65,535 × 2^14 = 1,073,725,440 bytes, roughly 1GB); application-layer software can also work around the window limit by transmitting over UDP or by opening multiple parallel TCP connections. However, these solutions require end-to-end network upgrades and are difficult to deploy comprehensively in the short term. Given the existing network environment, reducing latency is the most direct and effective way to relieve the TCP window limitation, and IDC and CDN deployments should be planned with the latency requirements of high-throughput services such as 4K/8K high-definition video in mind.
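The window-scale arithmetic can be sketched in a few lines: the negotiated scale factor (at most 14 under RFC 7323) is applied as a left shift to the 16-bit advertised window, lifting the window ceiling from 64KB to just under 1GB.

```python
# RFC 7323 window scaling: the advertised 16-bit window is shifted left
# by a negotiated scale factor of at most 14.

BASE_WINDOW = 65_535   # maximum value of the 16-bit window field
MAX_SCALE = 14         # largest scale factor RFC 7323 permits

scaled = BASE_WINDOW << MAX_SCALE
print(scaled)          # 1073725440 bytes, just under 1 GB

# With this window, the 20 ms RTT path is no longer window-limited:
print(f"{scaled * 8 / 0.020 / 1e9:.0f} Gbps ceiling")  # far above 10 Gbps
```

This is why a scaled window removes the "bandwidth black hole" in principle; the practical obstacle the text notes is that both endpoints and any middleboxes must support the option.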
Low latency requirements for real-time cloud services
The development of cloud computing, big data, and the Internet of Things has moved an increasing number of businesses into the cloud. The cloud has become an unavoidable trend for communication networks, and data centers, as its physical carriers, are gradually becoming the center of gravity of network traffic. Data migration within data centers and between them (data center interconnection, DCI) is becoming increasingly frequent; we refer to this traffic as cloud communication services. Some cloud communication services have strict real-time requirements, and these services likewise demand low latency.
The most typical real-time cloud communication service is virtual machine migration, such as hot migration, which usually requires latency of less than 10ms. In addition, cloud data hot backup, cloud disaster recovery, high-throughput collaborative computing and other cloud communication services also have strict real-time requirements.
As more and more upper-layer services migrate to the cloud, strict latency requirements will be placed on the cloud network to meet user experience needs. For example, the best experience for cloud payment services requires latency below 10ms, and the best experience for cloud desktop services requires latency below 20ms.
Low latency requirements of 5G mobile communication
5G is currently in its early stages, but it has set very ambitious goals. The ITU has officially named the 5G standard IMT-2020. In 2014, China's IMT-2020 (5G) Promotion Group published a white paper entitled "5G Vision and Requirements," which outlined the key technical capabilities of 5G: user-experienced data rates of 0.1~1Gbps, connection density of one million devices per square kilometer, end-to-end latency on the millisecond (ms) level, traffic density of tens of Tbps per square kilometer, mobility of over 500 kilometers per hour, and peak rates of tens of Gbps (see Figure 2). The white paper identifies user-experienced data rate, connection density, and latency as the three basic performance indicators of 5G networks.
Figure 2: Key Capability Indicators for IMT-2020 (5G) as defined by ITU-R M. 2083
Compared to 4G/LTE, 5G has much stricter requirements for latency. In the future development of 5G, in-depth collaboration between technical experts in both wireless and bearer networks is needed to formulate clearer and more reasonable latency metrics to better support 5G development.
In conclusion, with the development of electronic transactions, high-definition video, cloud computing, and future 5G services, latency has become a crucial performance indicator for communication networks, and low latency will be a key competitive advantage for operators' network capabilities. In developed countries and international leased line markets, low-latency circuits have become a specialized product category, and latency has become a vital SLA and differentiating factor for major operators' leased line services.
For example, Pacnet, a submarine cable company owned by Telstra, categorizes its dedicated submarine cable transmission between two nodes into three levels based on latency performance, with differentiated pricing: Low Latency, Standard Latency, and Best Effort. Tata Communications announced the completion of its global low-latency network in 2012, primarily serving global financial companies and high-frequency trading firms. Verizon likewise established a dedicated division for financial companies (VFN: Verizon Financial Network) and in 2012 launched a dedicated low-latency network connecting six major financial and trading data centers in New York, New Jersey, and Chicago. The most representative example is the dedicated transmission line between the CME Chicago Cermak Road data center and the New Jersey Carteret data center, with a round-trip latency of 14.5ms, 40% lower than conventional circuits.