CN notes. Recap of the previous part: link
Table of contents:
• What is the drawback to using the traditional approach of having a single, publicly accessible web server?
• What is a CDN?
• What are the six major challenges that Internet applications face?
• What are the major shifts that have impacted the evolution of the Internet ecosystem?
• Compare the “enter deep” and “bring home” approach to CDN server placement.
• What is the role of DNS in the way CDN operates?
• What are the two main steps in CDN server selection?
• What is the simplest approach to selecting a cluster? What are the limitations of this approach?
• What metrics could be considered when using measurements to select a cluster?
• How are the metrics for cluster selection obtained?
• Explain the distributed system that uses a 2-layered system. What are the challenges of this system?
• What are the strategies for server selection? What are the limitations of these strategies?
• What is consistent hashing? How does it work?
• Why would a centralized design with a single DNS server not work?
• What are the main steps that a host takes to use DNS?
• What are the services offered by DNS, apart from hostname resolution?
• What is the structure of the DNS hierarchy? Why does DNS use a hierarchical scheme?
• What is the difference between iterative and recursive DNS queries?
• What is DNS caching?
• What is a DNS resource record?
• What are the most common types of resource records?
• Describe the DNS message format.
• What is IP Anycast?
• What is HTTP Redirection?
What is the drawback to using the traditional approach of having a single, publicly accessible web server?
- Users are located all across the globe, so users far from the server see high delay and frequent interruptions
- A sudden surge in demand, such as a viral video, will overload the server
- Single point of failure in the case of a natural disaster
What is a CDN?
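Basically, a CDN (Content Distribution Network) is:
- A network made up of multiple geographically distributed servers and/or data centers
- That holds copies of content (videos, as well as many other types of web content)
- And directs each user to the server or server cluster that can best serve the user's request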
What are the six major challenges that Internet applications face?
- Peering point congestion
- Inefficient routing protocols
- Unreliable networks
- Inefficient communication protocols
- Scalability
- Application limitations and slow rate of change adoption
What are the major shifts that have impacted the evolution of the Internet ecosystem?
- Increased demand for online content, especially videos
- Topological flattening of the Internet
Compare the “enter deep” and “bring home” approach to CDN server placement.
- Enter deep: place CDN server clusters deep inside the access networks of the world. This makes the distance between the user and the closest server cluster as small as possible. The downside is that so many clusters are difficult to manage and maintain.
- Bring home: place fewer, larger clusters at key points. There are fewer servers to maintain, but users experience higher delay and lower throughput.
What is the role of DNS in the way CDN operates?
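DNS is how the CDN intercepts and redirects requests. The client's local DNS server (LDNS) first queries the content provider's authoritative DNS server, which returns a hostname in the CDN's domain instead of an IP address. The LDNS then queries the CDN's authoritative DNS server, which returns the IP address of the CDN cluster / server that should serve the requested content, and the LDNS passes that address back to the client.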
What are the two main steps in CDN server selection?
- Mapping the client to a cluster
- Selecting a server from the cluster
What is the simplest approach to selecting a cluster? What are the limitations of this approach?
- Simplest approach: select the geographically closest cluster.
- Limitation: the CDN sees the IP address of the client's local DNS server (LDNS), not of the client, so it actually picks the cluster closest to the LDNS, which might not be the closest to the client.
- Limitation: the geographically closest cluster might not have the best end-to-end performance either.
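The LDNS limitation can be illustrated with a minimal sketch. The cluster names, the coordinates, and the `closest_cluster` helper are all made up for illustration; a real CDN would typically rely on an IP geolocation database rather than known coordinates.

```python
import math

# Hypothetical cluster locations as (latitude, longitude); illustrative only.
CLUSTERS = {
    "us-east": (40.7, -74.0),
    "eu-west": (51.5, -0.1),
    "asia-east": (35.7, 139.7),
}

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def closest_cluster(location):
    """Return the name of the cluster geographically closest to `location`."""
    return min(CLUSTERS, key=lambda name: haversine_km(CLUSTERS[name], location))

# The CDN only sees the LDNS address, so it geolocates the LDNS, not the client.
client_location = (48.9, 2.4)      # assumed: a client in western Europe
ldns_location = (37.8, -122.4)     # assumed: the client uses a resolver on the US west coast

print(closest_cluster(client_location))  # cluster that would suit the client
print(closest_cluster(ldns_location))    # cluster the CDN actually picks
```

With these assumed locations the two calls return different clusters, which is exactly the mismatch described above.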
What metrics could be considered when using measurements to select a cluster?
The end-to-end metrics to be considered for cluster selection are delay and bandwidth.
How are the metrics for cluster selection obtained?
- Active measurements: probe the clusters directly, e.g., with ping requests.
- Passive measurements: track the performance of ongoing traffic to infer current network conditions.
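One way to turn such measurements into a cluster choice is sketched below. This is only an assumption about how it could be done, not how any particular CDN does it: the `ClusterSelector` class, the smoothing factor `alpha`, and the subnet and cluster names are hypothetical. Each delay sample (active or passive) updates a smoothed estimate per subnet-cluster pair, and the cluster with the lowest estimate wins.

```python
class ClusterSelector:
    """Keeps a smoothed delay estimate for every (client subnet, cluster) pair."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha      # weight given to the newest sample
        self.delay_ms = {}      # (subnet, cluster) -> smoothed delay in ms

    def record(self, subnet, cluster, sample_ms):
        """Fold one delay measurement into the running estimate."""
        key = (subnet, cluster)
        old = self.delay_ms.get(key)
        if old is None:
            self.delay_ms[key] = sample_ms
        else:
            self.delay_ms[key] = (1 - self.alpha) * old + self.alpha * sample_ms

    def best_cluster(self, subnet):
        """Return the cluster with the lowest estimated delay, or None if unmeasured."""
        candidates = {c: d for (s, c), d in self.delay_ms.items() if s == subnet}
        if not candidates:
            return None
        return min(candidates, key=candidates.get)


selector = ClusterSelector(alpha=0.5)
selector.record("203.0.113.0/24", "east", 40)
selector.record("203.0.113.0/24", "west", 90)
print(selector.best_cluster("203.0.113.0/24"))
```

Bandwidth could be tracked the same way; delay alone keeps the sketch short.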
Explain the distributed system that uses a 2-layered system. What are the challenges of this system?
- The measurement-based cluster selection strategy requires a centralized controller with a real-time view of network conditions, which is difficult to achieve at the scale of today's networks.
- The model also needs measurement data for the different subnet-cluster pairs, so some clients will be deliberately routed to sub-optimal clusters to collect it.
What are the strategies for server selection? What are the limitations of these strategies?
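- A server could be assigned randomly. Not optimal, because a heavily loaded server could be picked by chance.
- Requests could be load balanced to the least-loaded server, but this is also not optimal: that server may not have the requested content and would need to fetch it first.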
What is consistent hashing? How does it work?
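A distributed hash table scheme used to balance load: each server is assigned roughly the same number of content IDs, and relatively few content IDs have to move when nodes join or leave the system. Servers and content IDs are hashed onto the same circular ID space, and each content ID is assigned to the server whose ID is its closest successor on the circle.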
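A minimal sketch of the idea, under some assumptions: the `HashRing` class and the server and content names are hypothetical, SHA-256 is just one possible hash function, and each server gets a single position on the ring (real deployments often give each server several virtual positions for better balance).

```python
import bisect
import hashlib

def ring_position(key):
    """Hash a server name or content ID onto the circular ID space."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

class HashRing:
    def __init__(self, servers=()):
        self._positions = []   # sorted ring positions of the servers
        self._owner = {}       # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        pos = ring_position(server)
        bisect.insort(self._positions, pos)
        self._owner[pos] = server

    def remove(self, server):
        pos = ring_position(server)
        self._positions.remove(pos)
        del self._owner[pos]

    def lookup(self, content_id):
        """Return the server at the first position at or after the content's hash."""
        if not self._positions:
            raise LookupError("ring has no servers")
        pos = ring_position(content_id)
        # Wrap around to the first server when the hash falls past the last one.
        i = bisect.bisect_left(self._positions, pos) % len(self._positions)
        return self._owner[self._positions[i]]


ring = HashRing(["server-a", "server-b", "server-c", "server-d"])
content_ids = [f"video-{i}" for i in range(500)]
before = {c: ring.lookup(c) for c in content_ids}
ring.remove("server-b")
after = {c: ring.lookup(c) for c in content_ids}
moved = [c for c in content_ids if before[c] != after[c]]
print(len(moved), "of", len(content_ids), "content IDs moved")
```

Only the content IDs that were owned by the removed server change owner; everything else stays where it was, which is the property that makes the scheme attractive when servers fail or are added.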
Why would a centralized design with a single DNS server not work?
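It introduces a single point of failure. A single server also could not handle the query volume of the whole Internet, and it would be far away from most clients.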
What are the main steps that a host takes to use DNS?
- The user host runs the client side of the DNS application
- The browser extracts the hostname and passes it to the client side of the DNS application
- The DNS client sends a query containing the hostname to a DNS server
- The DNS client eventually receives a reply, which includes the IP address for the hostname
- As soon as the host receives the IP address, it can initiate a TCP connection to the HTTP server located at that IP
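The steps above can be sketched with Python's standard library, where `socket.getaddrinfo` plays the role of the client side of DNS. The helper names (`extract_hostname`, `resolve`, `connect`) and the example URL are illustrative assumptions.

```python
import socket
from urllib.parse import urlparse

def extract_hostname(url):
    """What the browser does first: pull the hostname out of the URL."""
    return urlparse(url).hostname

def resolve(hostname, port=80):
    """Hand the hostname to the system resolver and collect the IP addresses."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return [info[4][0] for info in infos]

def connect(url, port=80, timeout=5.0):
    """Resolve the URL's hostname, then open a TCP connection to the first address."""
    hostname = extract_hostname(url)
    address = resolve(hostname, port)[0]
    return socket.create_connection((address, port), timeout=timeout)


print(extract_hostname("http://www.example.com/index.html"))
# connect("http://www.example.com/index.html") would need network access.
```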
What are the services offered by DNS, apart from hostname resolution?
- Mail server / host aliasing
- Load distribution
What is the structure of the DNS hierarchy? Why does DNS use a hierarchical scheme?
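- DNS uses a hierarchy to solve the scalability problem: no single server has to hold every mapping or answer every query.
- The hierarchy has root servers, top-level domain (TLD) servers, and authoritative servers. Local DNS servers are not strictly part of the hierarchy, but they are the servers that hosts send their queries to.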
What is the difference between iterative and recursive DNS queries?
- Iterative: the queried server replies with a referral to the next DNS server in the chain, and the querying host contacts each server itself until the name is resolved
- Recursive: each DNS server resolves the hostname on behalf of the requester by querying the next server itself, so the client doesn't have to submit more than one request
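The difference is easiest to see in a toy model. Everything below is hypothetical: the three in-memory "servers", the `example.edu` names, and the address (taken from the documentation range 192.0.2.0/24). No real DNS traffic is involved.

```python
# Toy name servers: each one either knows the answer or refers to another server.
SERVERS = {
    "root": {"referrals": {"edu": "tld-edu"}},
    "tld-edu": {"referrals": {"example.edu": "auth-example"}},
    "auth-example": {"answers": {"www.example.edu": "192.0.2.10"}},
}

def query(server, hostname):
    """One query to one server: returns ('answer', ip) or ('referral', next_server)."""
    zone = SERVERS[server]
    if hostname in zone.get("answers", {}):
        return "answer", zone["answers"][hostname]
    for suffix, next_server in zone.get("referrals", {}).items():
        if hostname == suffix or hostname.endswith("." + suffix):
            return "referral", next_server
    raise LookupError(f"{server} cannot resolve {hostname}")

def resolve_iterative(hostname, start="root"):
    """The requester follows every referral itself, contacting each server in turn."""
    server, contacted = start, []
    while True:
        contacted.append(server)
        kind, value = query(server, hostname)
        if kind == "answer":
            return value, contacted
        server = value

def resolve_recursive(server, hostname):
    """The queried server chases the referral on the requester's behalf."""
    kind, value = query(server, hostname)
    if kind == "answer":
        return value
    return resolve_recursive(value, hostname)


print(resolve_iterative("www.example.edu"))
print(resolve_recursive("root", "www.example.edu"))
```

In the iterative version the requester ends up talking to all three servers; in the recursive version it makes one call and the servers forward the query among themselves.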
What is DNS caching?
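- Saving hostname resolutions locally
- The idea of DNS caching is that, in both iterative and recursive queries, once a server receives a DNS reply containing a hostname-to-IP-address mapping, it stores this information in its cache before passing it on to the client
- Suppose host A (apricot.poly.edu) asks the local DNS server dns.poly.edu for the IP address of cnn.com. Some time later, another host B (kiwi.poly.fr) asks dns.poly.edu for the same hostname (cnn.com). Thanks to caching, the local DNS server can immediately send the response for cnn.com.
What is a DNS resource record?
- The format in which DNS servers store hostname-to-IP-address mappings (and other mappings). Each record is a four-tuple (Name, Value, Type, TTL), where the TTL says how long the record may be kept in a cache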
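A resource record and a cache of such records can be sketched together. A record is commonly described as a (Name, Value, Type, TTL) tuple, where the TTL bounds how long the record may stay cached. The `DnsCache` class, its injectable `clock`, and the address used for cnn.com (a placeholder from the documentation range) are assumptions made for this sketch.

```python
import time
from collections import namedtuple

# A resource record as a (Name, Value, Type, TTL) tuple; TTL is in seconds.
ResourceRecord = namedtuple("ResourceRecord", "name value type ttl")

class DnsCache:
    """Cache of resource records that forgets each record once its TTL has passed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._entries = {}       # (name, type) -> (record, expiry time)

    def put(self, record):
        expires = self._clock() + record.ttl
        self._entries[(record.name, record.type)] = (record, expires)

    def get(self, name, rtype="A"):
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None                      # cache miss: must query upstream
        record, expires = entry
        if self._clock() >= expires:
            del self._entries[(name, rtype)]
            return None                      # expired: treat as a miss
        return record


cache = DnsCache()
cache.put(ResourceRecord("cnn.com", "192.0.2.1", "A", 60))  # placeholder address
print(cache.get("cnn.com"))   # a later query from another host is served from cache
```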
What are the most common types of resource records?
- Type A: Name is a hostname and Value is its IP address
- Type NS: Name is a domain and Value is the hostname of an authoritative DNS server for that domain
- Type CNAME: Name is an alias hostname and Value is its canonical hostname
- Type MX: Name is the alias hostname of a mail server and Value is the canonical name of that mail server
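How the four record types work together can be shown with a small in-memory table. The records, the `example.com` names, and the addresses (documentation range) are invented for illustration, and `resolve_address` is a hypothetical helper, not part of any DNS library.

```python
# (Name, Value, Type) entries of a made-up zone; TTLs are left out for brevity.
RECORDS = [
    ("example.com", "dns.example.com", "NS"),
    ("dns.example.com", "192.0.2.53", "A"),
    ("www.example.com", "web1.example.com", "CNAME"),
    ("web1.example.com", "192.0.2.10", "A"),
    ("example.com", "mail.example.com", "MX"),
    ("mail.example.com", "192.0.2.25", "A"),
]

def lookup(name, rtype):
    """Return the values of all records matching the given name and type."""
    return [value for (n, value, t) in RECORDS if n == name and t == rtype]

def resolve_address(hostname, max_aliases=5):
    """Follow CNAME aliases until an A record is found; None if there is none."""
    for _ in range(max_aliases + 1):
        addresses = lookup(hostname, "A")
        if addresses:
            return addresses[0]
        aliases = lookup(hostname, "CNAME")
        if not aliases:
            return None
        hostname = aliases[0]     # continue with the canonical name
    return None


print(resolve_address("www.example.com"))                 # alias -> canonical -> IP
mail_host = lookup("example.com", "MX")[0]
print(mail_host, resolve_address(mail_host))              # mail server for the domain
```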
Describe the DNS message format.
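As a reminder of the standard layout (RFC 1035): queries and replies share one format, a 12-byte header (identification, flags, and four counts for the question, answer, authority, and additional sections) followed by those four sections. The sketch below builds a query with one question and reads the header back; the helper names and the default `query_id` are arbitrary choices for illustration.

```python
import struct

def encode_name(hostname):
    """Encode a hostname as length-prefixed labels ending with a zero byte."""
    out = b""
    for label in hostname.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(hostname, query_id=0x1234, qtype=1, recursion_desired=True):
    """Build a DNS query: 12-byte header plus one question (qtype 1 = A, class 1 = IN)."""
    flags = 0x0100 if recursion_desired else 0x0000   # RD bit
    # Header fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (16 bits each).
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(hostname) + struct.pack("!HH", qtype, 1)
    return header + question

def parse_header(message):
    """Decode the fixed 12-byte header of a DNS message."""
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    return {
        "id": ident,
        "is_response": bool(flags & 0x8000),        # QR bit
        "recursion_desired": bool(flags & 0x0100),  # RD bit
        "questions": qd,
        "answers": an,
        "authority": ns,
        "additional": ar,
    }


message = build_query("www.example.com")
print(len(message), parse_header(message))
```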
What is IP Anycast?
Assign the same IP address to multiple servers in different locations and announce it from each of them. BGP routing then delivers each client's packets to the closest server, where "closest" is determined by BGP path selection.
What is HTTP Redirection?
The server responds with a 3xx status code that tells the client to request the content from a different server. Useful for load balancing and doesn't require central coordination, at the cost of an extra round trip for the client.
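A minimal sketch of redirection used for load balancing, assuming the redirecting server knows the current load of its peers. The function names, the server hostnames, and the load numbers are hypothetical; only `http.HTTPStatus` comes from the standard library.

```python
from http import HTTPStatus

def build_redirect(location, status=HTTPStatus.FOUND):
    """Build a raw HTTP/1.1 redirect response pointing the client at `location`."""
    status = HTTPStatus(status)
    if not 300 <= status < 400:
        raise ValueError("a redirect needs a 3xx status code")
    lines = [
        f"HTTP/1.1 {int(status)} {status.phrase}",
        f"Location: {location}",
        "Content-Length: 0",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def redirect_to_least_loaded(path, load_by_server):
    """Pick the least-loaded server and redirect the client to the same path there."""
    target = min(load_by_server, key=load_by_server.get)
    return build_redirect(f"http://{target}{path}")


loads = {"cdn1.example.com": 0.92, "cdn2.example.com": 0.35}   # made-up load values
print(redirect_to_least_loaded("/video.mp4", loads).decode("ascii"))
```

The client then repeats its request against the URL in the `Location` header, which is where the extra round trip comes from.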