[CN notes] Router Design and Algorithms (Part 2)

莉森羊
6 min read · Dec 22, 2022

--

Previous CN notes: link

Table of contents:
Why is packet classification needed?
What are three established variants of packet classification?
What are the simple solutions to the packet classification problem?
How does fast searching using set-pruning tries work?
What’s the main problem with the set pruning tries?
What is the difference between the pruning approach and the backtracking approach for packet classification with a trie?
What’s the benefit of a grid of tries approach?
Describe the “Take the Ticket” algorithm.
What is the head-of-line problem?
How is the head-of-line problem avoided using the knockout scheme?
How is the head-of-line problem avoided using parallel iterative matching?
Describe FIFO with tail drop.
What are the reasons for making scheduling decisions more complex than FIFO?
Describe Bit-by-bit Round Robin scheduling.
Bit-by-bit Round Robin provides fairness; what’s the problem with this method?
Describe Deficit Round Robin (DRR).
What is a token bucket shaping?
In traffic scheduling, what is the difference between policing and shaping?
How is a leaky bucket used for traffic policing and shaping?

Why is packet classification needed?

As the Internet grows more complex, networks need to provide quality-of-service and security guarantees for their traffic. We need to handle packets based on multiple criteria, such as TCP flags and source addresses. This finer-grained handling of packets is called packet classification.

What are three established variants of packet classification?

  1. Firewalls — routers implement firewalls at the entry and exit points of a network to filter out unwanted traffic or enforce other security policies.
  2. Resource reservation protocols — e.g., DiffServ has been used to reserve bandwidth between a source and a destination.
  3. Routing based on traffic type — routing based on a specific traffic type helps avoid delays for time-sensitive applications.

What are the simple solutions to the packet classification problem?

  • Linear search — firewall implementations perform a linear search over the rule database and keep track of the best-matching rule. This can be reasonable for a few rules, but the time to search a large database that may contain thousands of rules can be prohibitive.
  • Caching — another approach is to cache results so that future searches run faster. This has two problems: (1) the cache hit rate can be high (80–90%), but we still have to perform a full search on every miss; (2) even with a high 90% hit rate, a slow linear search of the rule space on misses performs poorly.
  • Passing labels — labels are carried in the header and typically assigned at the edges, which saves time. MPLS and DiffServ use this technique.
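As a concrete illustration, a minimal linear-search classifier might look like the sketch below. The `Rule` fields, prefix-string matching, and cost values are simplifying assumptions for illustration, not a real router's rule representation:

```python
# Sketch of linear-search packet classification: scan every rule,
# keep the best (lowest-cost) match -- O(N) work per packet.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    src_prefix: str   # e.g. "10.0." matches any address starting with it
    dst_prefix: str   # "" matches every destination
    cost: int         # lower cost = higher priority

def matches(rule: Rule, src: str, dst: str) -> bool:
    return src.startswith(rule.src_prefix) and dst.startswith(rule.dst_prefix)

def classify(rules: list[Rule], src: str, dst: str) -> Optional[Rule]:
    best = None
    for rule in rules:  # linear scan over the whole database
        if matches(rule, src, dst) and (best is None or rule.cost < best.cost):
            best = rule
    return best

rules = [Rule("10.0.", "", 2), Rule("10.0.1.", "192.168.", 1)]
print(classify(rules, "10.0.1.5", "192.168.0.9").cost)  # → 1
```

The scan touches every rule even after a match is found, which is why the approach stops scaling once the database holds thousands of rules.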

How does fast searching using set-pruning tries work?

Build a trie on the destination prefixes, and at every leaf node "hang" the compatible source tries (or tries for whatever other dimension we are considering for packet classification). To match a packet, we first traverse the destination trie and then the source trie, keeping track of the lowest-cost matching rule.
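A highly simplified sketch of that two-dimensional lookup, using prefix-string dictionaries in place of real tries (the rule database and costs are invented for illustration). Note how the source rule with cost 5 is duplicated under both destination prefixes:

```python
# Sketch of set-pruning lookup: find the longest matching destination
# prefix, then pick the lowest-cost rule in its hung source table.
def longest_prefix(table: dict, addr: str):
    """Return the value stored under the longest key that prefixes addr."""
    best = None
    for prefix, value in table.items():
        if addr.startswith(prefix) and (best is None or len(prefix) > len(best[0])):
            best = (prefix, value)
    return best[1] if best else None

def best_rule(src_table: dict, src: str):
    """Lowest-cost rule among all matching source prefixes."""
    costs = [cost for prefix, cost in src_table.items() if src.startswith(prefix)]
    return min(costs) if costs else None

# dest prefix -> {source prefix -> rule cost}. The source rule ("1", 5)
# appears under both destinations: the memory blow-up discussed next.
db = {"00": {"1": 5, "10": 2}, "0": {"1": 5}}

def classify(dst: str, src: str):
    src_table = longest_prefix(db, dst)   # walk the destination "trie"
    return best_rule(src_table, src) if src_table else None

print(classify("001", "101"))  # → 2 (source prefix "10" has the lowest cost)
```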

What’s the main problem with the set pruning tries?

Memory explosion — because a source prefix can occur in multiple destination tries, the rules containing it end up replicated across them.

What is the difference between the pruning approach and the backtracking approach for packet classification with a trie?

Set pruning has a high cost in memory but a lower cost in time. Backtracking saves memory, since each rule is stored only once, but costs more in time.

What’s the benefit of a grid of tries approach?

It is a middle-ground approach that balances the memory and time costs by using precomputation with switch pointers, which are essentially shortcuts that let the search jump between source tries without restarting from the root.

Describe the “Take the Ticket” algorithm.

Each output line maintains a distributed queue for all input lines that want to send packets to it. When an input line wants to send a packet to a specific output line, it requests a ticket. The input line waits for the ticket to be served. At that point, the input line connects to the output line, the crosspoint is turned on, and the input line sends the packet.
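A toy model of the ticket mechanism for a single output line (the class and method names are assumptions; real hardware distributes this state across the fabric):

```python
# "Take the ticket" sketch: an output hands out numbered tickets and
# serves them in order; an input may send only when its number is up.
class OutputLine:
    def __init__(self):
        self.next_ticket = 0   # next ticket number to hand out
        self.serving = 0       # ticket currently allowed to send

    def request_ticket(self) -> int:
        t = self.next_ticket
        self.next_ticket += 1
        return t

    def can_send(self, ticket: int) -> bool:
        return ticket == self.serving

    def done(self):
        self.serving += 1      # advance to the next waiting input

out = OutputLine()
a, b = out.request_ticket(), out.request_ticket()
print(out.can_send(a), out.can_send(b))  # → True False
out.done()                                # input a finishes its packet
print(out.can_send(b))                    # → True
```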

What is the head-of-line problem?

While A sends its packet in the first iteration, the entire queues behind B and C sit waiting. We call this problem head-of-line (HOL) blocking, because the whole queue is blocked by the progress of the packet at the head of the queue.

How is the head-of-line problem avoided using the knockout scheme?

Break the packets up into fixed-size cells. Run the fabric k times faster than the input links, where k is the expected number of cells received by an output link. When that expectation is violated (more than k cells arrive for one output), randomly pick which k cells get through. The scheme is complex to implement.

How is the head-of-line problem avoided using parallel iterative matching?

All inputs send requests in parallel to all outputs they want to connect with. This is the request phase. In the grant phase, each output randomly picks an input from among its requestors. In the accept phase, each input that received grants randomly picks one output to send to. This way, all of the inputs are sending packets from the start.
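One iteration of the three phases could be sketched as follows (the function shape and the seeded RNG are assumptions, made so the example is deterministic; real PIM repeats the iteration over the still-unmatched ports):

```python
# Sketch of one parallel iterative matching (PIM) round.
# requests maps each input port to the set of output ports it wants.
import random

def pim_round(requests: dict[int, set[int]], rng=random.Random(0)):
    # Grant phase: each requested output randomly grants one requesting input.
    grants: dict[int, list[int]] = {}   # input -> outputs that granted it
    outputs = {o for outs in requests.values() for o in outs}
    for o in sorted(outputs):
        requesters = [i for i, outs in requests.items() if o in outs]
        if requesters:
            grants.setdefault(rng.choice(requesters), []).append(o)
    # Accept phase: each input with grants randomly accepts one output.
    return {i: rng.choice(granted) for i, granted in grants.items()}

matching = pim_round({0: {0, 1}, 1: {0}, 2: {1}})
print(matching)  # each input matched to at most one distinct output
```

Because every output grants at most once, the accepted pairs form a valid partial matching after a single round; further rounds fill in the ports left unmatched.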

Describe FIFO with tail drop.

Packets are sent to the output ports. The output ports are FIFO and any packets that overflow the buffer (tail of the queue) are dropped. This results in fast scheduling decisions but loss of packets.
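A minimal tail-drop queue sketch (the buffer capacity and class name are assumptions):

```python
# FIFO output queue with tail drop: enqueue at the tail unless the
# buffer is full, in which case the arriving packet is dropped.
from collections import deque

class TailDropQueue:
    def __init__(self, capacity: int):
        self.buf = deque()
        self.capacity = capacity
        self.dropped = 0

    def enqueue(self, pkt) -> bool:
        if len(self.buf) >= self.capacity:
            self.dropped += 1   # buffer full: drop at the tail
            return False
        self.buf.append(pkt)
        return True

    def dequeue(self):
        return self.buf.popleft() if self.buf else None

q = TailDropQueue(capacity=2)
for p in ["p1", "p2", "p3"]:
    q.enqueue(p)
print(q.dropped, q.dequeue())  # → 1 p1
```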

What are the reasons for making scheduling decisions more complex than FIFO?

  • To provide additional (router) support for congestion — since usage is growing faster than link speeds, congestion on the Internet is increasingly likely. Although most traffic is TCP-based (which has its own methods for handling congestion), additional router support can improve the throughput of sources by helping to handle congestion.
  • To promote fair sharing of links among competing flows — during a backup, such packets tend to flood the buffers of the output link. If we use FIFO with tail drop, this blocks other flows, causing a client's important connections to freeze. This gives users a suboptimal experience and indicates the need for change.
  • To provide quality of service (QoS) guarantees on measures such as delay and bandwidth — one way to achieve fair sharing is to guarantee a flow a certain bandwidth. Another is to guarantee a flow's delay through the router. This matters greatly for video streaming: without a delay bound, real-time video will not work properly. It is therefore important to find efficient scheduling algorithms that provide bandwidth and delay guarantees.

Describe Bit-by-bit Round Robin scheduling.

It gives bandwidth and delay guarantees. We compute a finishing round number for each packet, as if the scheduler served one bit from each active flow per round, and always send the packet with the smallest finishing round number, carried over from the previous round of the algorithm.
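A simplified sketch of the finishing-number bookkeeping, ignoring the global round-rate adjustment of real fair queuing (flow names and packet sizes are invented for illustration):

```python
# Bit-by-bit round robin sketch: each packet's finishing round number
# is the cumulative bits of its flow; send smallest finish number first.
import heapq

def schedule(flows: dict[str, list[int]]):
    """flows maps a flow name to its queued packet sizes (in bits).
    Returns packets in increasing order of finishing round number."""
    heap = []
    for name, sizes in flows.items():
        finish = 0
        for seq, size in enumerate(sizes):
            finish += size            # round in which the last bit is served
            heapq.heappush(heap, (finish, seq, name, size))
    order = []
    while heap:
        _, _, name, size = heapq.heappop(heap)
        order.append((name, size))
    return order

print(schedule({"A": [300, 100], "B": [200]}))
# → [('B', 200), ('A', 300), ('A', 100)]
```

The heap is exactly the priority queue the next question complains about: every send requires an O(log n) heap operation.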

Bit-by-bit Round Robin provides fairness; what’s the problem with this method?

It requires extra time complexity, such as keeping track of finishing times (which requires a priority queue). This extra complexity makes it hard to implement at gigabit speeds.

Describe Deficit Round Robin (DRR).

It avoids some of the time complexity of bit-by-bit round robin by using a per-flow deficit counter instead of computing finishing times, while still ensuring fairness.
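A compact DRR sketch under simplifying assumptions (a fixed quantum per round, packets represented as plain sizes):

```python
# Deficit Round Robin sketch: each round a flow's deficit grows by a
# quantum; it may send packets as long as the deficit covers their size.
from collections import deque

def drr(flows: dict[str, deque], quantum: int = 500):
    deficits = {name: 0 for name in flows}
    sent = []
    while any(flows.values()):
        for name, q in flows.items():
            if not q:
                continue
            deficits[name] += quantum   # this flow's credit for the round
            while q and q[0] <= deficits[name]:
                size = q.popleft()
                deficits[name] -= size  # spend credit on the packet
                sent.append((name, size))
            if not q:
                deficits[name] = 0      # idle flows keep no credit
    return sent

print(drr({"A": deque([700, 300]), "B": deque([400])}))
# → [('B', 400), ('A', 700), ('A', 300)]
```

Note the constant-time work per flow per round: no priority queue, which is what makes DRR practical at high line rates.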

What is a token bucket shaping?

Used in scenarios where we want bandwidth guarantees for flows in the same queue without separating them. It limits the burstiness of a flow by bounding both its average rate and its maximum burst size. The technique assumes one bucket per flow that fills with tokens at a rate of R per second, up to a maximum of B tokens; additional tokens are dropped. When a packet arrives, it can go through if there are enough tokens; otherwise it must wait for more tokens to fill the bucket.
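A deterministic token-bucket sketch; time is passed in explicitly rather than read from a clock, and the rate/burst values are invented (assumptions made for testability):

```python
# Token bucket sketch: tokens accrue at `rate` per second up to `burst`;
# a packet passes only if the bucket holds at least its size in tokens.
class TokenBucket:
    def __init__(self, rate: float, burst: float):
        self.rate, self.burst = rate, burst
        self.tokens = burst       # bucket starts full
        self.last = 0.0           # timestamp of the previous check

    def allow(self, now: float, size: float) -> bool:
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False              # not enough tokens: packet must wait

tb = TokenBucket(rate=100.0, burst=200.0)
print(tb.allow(0.0, 150), tb.allow(0.0, 100), tb.allow(1.0, 100))
# → True False True
```

The second packet fails because only 50 tokens remain at time 0; one second later the bucket has refilled by 100 tokens and the third packet passes.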

In traffic scheduling, what is the difference between policing and shaping?

Policing and shaping are both mechanisms for limiting the output rate of a link. They control the output rate by identifying traffic descriptor violations and responding to them in two different ways.

  • Policer: when the traffic rate reaches the maximum configured rate, excess traffic is dropped, or the packets' settings are changed (the packets are "marked"). The output rate appears as a sawtooth wave.
  • Shaper: a shaper typically retains excess packets in a queue or buffer and schedules them for later transmission. The result is that excess traffic is delayed rather than dropped, so the flow is shaped, or smoothed, whenever the data rate rises above the configured rate.

Traffic shaping and policing can work in tandem.

A figure illustrates the difference in output rate between the policing and shaping cases. The leaky bucket is an algorithm that can be used for both traffic policing and traffic shaping.

How is a leaky bucket used for traffic policing and shaping?

The leaky bucket algorithm is analogous to water flowing into a leaky bucket, with the water leaking out at a constant rate. A bucket of capacity b represents a buffer that holds packets, and the water corresponds to incoming packets. The leak rate r is the rate at which packets are allowed to enter the network; it is constant, regardless of the rate at which packets arrive.

An arriving packet is called conforming if adding it to the bucket does not cause an overflow; otherwise it is called nonconforming. Packets classified as conforming are added to the bucket, while nonconforming packets are dropped. Thus, if the bucket is full, newly arriving packets are discarded.

Regardless of the packet input rate, the output rate is constant, which results in an even distribution of the packets sent into the network. The algorithm can be implemented as a single-server queue.

A figure shows the analogy between the leaky bucket and an actual network: the faucet corresponds to the unregulated rate at which packets are sent into the bucket, and, as a result of applying the algorithm, the bucket's output is a constant stream of packets (droplets).
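The packet-level behavior described above can be sketched as follows (the capacity, leak rate, and per-tick drain model are simplifying assumptions):

```python
# Leaky bucket sketch: a bounded buffer of capacity b drains at a
# constant rate r; arrivals that would overflow are nonconforming.
class LeakyBucket:
    def __init__(self, capacity: int, rate: int):
        self.capacity, self.rate = capacity, rate
        self.queue: list[str] = []

    def arrive(self, pkt: str) -> bool:
        if len(self.queue) >= self.capacity:
            return False          # bucket full: nonconforming, dropped
        self.queue.append(pkt)    # conforming: buffered for later release
        return True

    def tick(self) -> list[str]:
        # Leak up to `rate` packets per tick, independent of arrivals.
        out, self.queue = self.queue[:self.rate], self.queue[self.rate:]
        return out

lb = LeakyBucket(capacity=2, rate=1)
print([lb.arrive(p) for p in ["p1", "p2", "p3"]], lb.tick())
# → [True, True, False] ['p1']
```

Whether this acts as a policer or a shaper depends on what happens to nonconforming packets: dropping them polices the flow, while buffering and releasing them later shapes it.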
