Smart Content Delivery Technology: CCN or CDN?

KyoungSoo Park ([email protected])

Department of Electrical Engineering KAIST

Talk Overview

Content distribution networks

HTTP CDN

Redundancy Elimination

Content-centric Networking (CCN)

Comparisons of CDN and CCN

2

Content Distribution Network (CDN)

Goals

Load balancing

Fast response

High availability

Handling flash crowd

[Figure: content on the origin server is copied to multiple replica servers]

3

Popular Contents for CDNs

4

Recent Trend: Large Contents

5

Current Traffic (U.S.A., 2011)

6

How Does a Web CDN Work?

7

[Figure: HTTP content distribution networks (CDNs); labels: DNS or HTTP redirection, origin server (content provider), cache miss?]

Core CDN Technology

External load balancing

Request redirection

Proximity, server load, content availability

Internal load balancing

Consistent hashing

Intra-CDN request routing & splitting

Server technology

High-performance caching proxy

Disk/memory trade-offs

• Custom file system & parallel disk access

8

External Load Balancing

9

Request Redirection

Problem: given a request, which CDN server should the content be retrieved from?

Choosing the best servers

Network geography (e.g., TCP)

CDN server load

Content availability

Existing techniques

DNS redirection / HTTP redirection / IP Anycast

10

DNS Redirection

DNS name mapping

Most widely-used

11

NY Times Logo Image

URL: http://graphics8.nytimes.com/images/misc/nytlogo379x64.gif

[kyoungso@opus]$ dig graphics8.nytimes.com

;; ANSWER SECTION:

graphics8.nytimes.com. 300 IN CNAME graphics478.nytimes.com.edgesuite.net.

graphics478.nytimes.com.edgesuite.net. 9250 IN CNAME a1116.x.akamai.net.

a1116.x.akamai.net. 20 IN A 204.153.49.136

a1116.x.akamai.net. 20 IN A 204.153.49.137

12

Note: short TTL (20 seconds) on the Akamai A records

DNS Redirection

[kyoungso@opus ~]$ ping 204.153.49.137 (Princeton, NJ, USA)

PING 204.153.49.137 (204.153.49.137) 56(84) bytes of data.

64 bytes from 204.153.49.137: icmp_seq=1 ttl=60 time=0.731 ms

64 bytes from 204.153.49.137: icmp_seq=2 ttl=60 time=0.321 ms

64 bytes from 204.153.49.137: icmp_seq=3 ttl=60 time=0.388 ms

[kyoungsoo@ndsl ~]$ dig graphics8.nytimes.com (Daejeon, Korea)

a1116.x.akamai.net. 20 IN A 61.111.58.33

a1116.x.akamai.net. 20 IN A 61.111.58.65

[kyoungsoo@ndsl ~]$ host 61.111.58.65

65.58.111.61.in-addr.arpa domain name pointer 61-111-58-65.kidc.net.

13

kidc.net: LG Telecom data center

DNS Redirection

Assumptions

Client’s local DNS server (LDNS) is close to client

LDNS does a name lookup

CDN DNS finds the best server for LDNS

Problem & workaround

Problem: what if the LDNS is far from the client?

• E.g., a KAIST client uses Princeton’s DNS server

Workaround: HTTP redirection

• Redirect again based on the actual client IP

14

HTTP Redirection

How does it work?

A client sends a request to a CDN server

CDN server responds with HTTP 302 redirection

• Better server for this client

• Based on actual client IP
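
The redirection step above can be sketched in a few lines. This is only an illustration, not how any particular CDN implements it: the prefix table, the host names under example.net, and the function name handle_request are all hypothetical. The server looks at the actual client IP and answers 200 if it is already the right server, or 302 with a Location header pointing to a better one.

```python
import ipaddress

# Hypothetical table: client prefix -> preferred CDN server (illustrative values only).
SERVER_BY_PREFIX = {
    ipaddress.ip_network("143.248.0.0/16"): "cdn-kr.example.net",
    ipaddress.ip_network("128.112.0.0/16"): "cdn-us.example.net",
}


def handle_request(client_ip, this_server, path):
    """Return (status, headers) for a request that arrived at this_server."""
    addr = ipaddress.ip_address(client_ip)
    best = this_server  # default: serve locally when no prefix matches
    for prefix, server in SERVER_BY_PREFIX.items():
        if addr in prefix:
            best = server
            break
    if best == this_server:
        return 200, {}
    # A better server exists for this client: redirect with HTTP 302.
    return 302, {"Location": f"http://{best}{path}"}
```

The extra round trip listed among the cons is visible here: a redirected client has to issue a second request to the server named in Location.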

15

[Figure: HTTP redirection for graphics8.nytimes.com; labels: DNS, CDN Server (Princeton), CDN Server (LG Telecom)]

HTTP Redirection

Pros & Cons

Accurate server selection (+)

Extra app-level RTT (-)

App-level processing overhead (-)

Transparent method?

16

IP Anycast

How does it work?

One IP mapped to multiple servers in different PoPs

IP Anycast allows routing to nearest server

17

[Figure: DNS returns 235.45.98.88; CDN Server1 (Korea) and CDN Server2 (USA) both use 235.45.98.88; a router sits between the client and the servers]

IP Anycast

Pros & Cons

No DNS abuse (+)

Accurate network geography (+)

Fault tolerance (+)

TCP is a stateful protocol (-)

Abrupt route change leads to download failure (-)

Requires ISP cooperation (-)

18

[Figure: longest prefix matching for destination x.y.z.w; routes x.y.z.w/16 and x.y.z.w/24]
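
Longest prefix matching, which the anycast figure relies on, can be illustrated with Python's standard ipaddress module. The route table below is made up for the example (the prefixes and the PoP names are assumptions, not taken from the slides): when two announced prefixes cover the same destination, the more specific one wins.

```python
import ipaddress


def longest_prefix_match(dest, routes):
    """Return the next hop of the most specific route covering dest, or None."""
    addr = ipaddress.ip_address(dest)
    best = None  # (network, next_hop) of the longest match seen so far
    for prefix, next_hop in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return None if best is None else best[1]


# Hypothetical routing table: a /16 and a more specific /24 inside it.
ROUTES = {
    "198.51.0.0/16": "pop-usa",
    "198.51.100.0/24": "pop-korea",
}
```

With this table, longest_prefix_match("198.51.100.7", ROUTES) selects the /24 route, while an address elsewhere in the /16 falls back to the shorter prefix.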

Internal Load Balancing

19

Internal Load Balancing

Problem: how to balance load among CDN servers?

Front-end distributes requests to back-end nodes

Internal request routing & splitting

Locality-aware request distribution (LARD)

Split request into n mini requests (CoBlitz)

Consistent hashing

Reliable service even with node churn

Can be applied globally with DNS
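
As a rough sketch of request splitting, one large request can be cut into n contiguous byte ranges that different back-end nodes fetch independently. The slides do not describe how CoBlitz actually chooses its ranges, so the even split and the function name split_request below are assumptions made for illustration.

```python
def split_request(total_bytes, n):
    """Split [0, total_bytes) into at most n contiguous (first, last) byte ranges."""
    base, extra = divmod(total_bytes, n)
    ranges, start = [], 0
    for i in range(n):
        # The first `extra` ranges get one additional byte so sizes differ by at most 1.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more pieces requested than bytes available
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def range_header(first, last):
    """Format one range as an HTTP Range header value."""
    return f"bytes={first}-{last}"
```

Each (first, last) pair uses inclusive offsets, which is the convention of HTTP Range headers.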

20

Consistent Hashing

How does it work?

Map each back-end server to an IDx

Calculate a request IDy = Hash(request)

Find a server IDk closest to IDy
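
The three steps above can be sketched as a small hash ring. This is a minimal illustration under stated assumptions: IDs are the first 32 bits of SHA-1, and "closest" is taken to mean the next server ID clockwise on the ring (wrapping around), which is one common reading. The class and function names are hypothetical.

```python
import bisect
import hashlib


def ring_id(key, bits=32):
    """Map a string to an ID on the ring (first `bits` bits of SHA-1)."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest[: bits // 8], "big")


class ConsistentHashRing:
    def __init__(self, servers):
        # Step 1: map each back-end server to an ID and keep them sorted.
        self._ring = sorted((ring_id(s), s) for s in servers)

    def lookup(self, request):
        # Step 2: compute the request ID; step 3: pick the next server ID clockwise.
        ids = [i for i, _ in self._ring]
        pos = bisect.bisect_left(ids, ring_id(request)) % len(self._ring)
        return self._ring[pos][1]

    def remove(self, server):
        # Node churn: only requests that mapped to this server move elsewhere.
        self._ring = [(i, s) for i, s in self._ring if s != server]
```

Removing one server leaves every request that was mapped to a different server where it was, which is the property the next slide lists under pros.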

21

[Figure: hash ring with server IDs 0001, 0011, 1011, 1100, 1101, 1110 and a request with ID=1010]

Consistent Hashing

Pros

Hashing distributes load among the servers

Request locality

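Minimal request reshuffling at node churn (only requests on the affected node move)

• Cf. static hashing, server = f(x) % N, which remaps most requests when N changes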

Cons

CPU overhead

Corner case can disrupt load balancing

• R1, R2, R3 are mapped to Node X

• Workaround: Highest Random Weight (HRW)
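
A minimal sketch of Highest Random Weight (also called rendezvous hashing), next to static hashing for comparison. Each (server, request) pair gets a pseudo-random weight and the request goes to the server with the highest one. The use of SHA-1 and the helper names are assumptions for illustration.

```python
import hashlib


def weight(server, request):
    """Pseudo-random weight for a (server, request) pair."""
    digest = hashlib.sha1(f"{server}|{request}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def hrw_lookup(servers, request):
    """HRW: choose the server whose weight for this request is highest."""
    return max(servers, key=lambda s: weight(s, request))


def static_lookup(servers, request):
    """Static hashing f(x) % N, shown only for comparison."""
    h = int.from_bytes(hashlib.sha1(request.encode()).digest()[:8], "big")
    return servers[h % len(servers)]
```

Because every request ranks the servers independently, removing a server only moves the requests whose top choice it was, and those requests spread over all remaining servers rather than piling onto one ring neighbor. The cost is computing one hash per server on each lookup.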

22

Redundancy Elimination (RE)

23

Network-wide Redundancy Elimination (RE)

Per-packet content-based caching

Many research papers

[SIGCOMM’00], [SIGCOMM’08], [SIGCOMM’09]

Basic idea

1. Fingerprint each packet (divide it into N fragments)

2. Send fingerprints (compressed, or encoded)

3. Destination looks up the fingerprints

4. Reconstruct the original packet from the fingerprint cache
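
The four steps above can be sketched as an encoder and decoder that share a fingerprint cache. This is a simplified illustration, not the scheme of any of the cited papers: it assumes fixed-size 64-byte fragments, truncated SHA-1 fingerprints, and sender and receiver caches that stay in sync; real systems typically use content-defined fragment boundaries and bounded caches.

```python
import hashlib

FRAGMENT_SIZE = 64  # assumed fixed fragment size, in bytes


def fingerprint(fragment):
    """Short fingerprint of one fragment (truncated SHA-1)."""
    return hashlib.sha1(fragment).digest()[:8]


def encode(packet, sender_cache):
    """Steps 1-2: fragment the packet and replace known fragments by fingerprints."""
    encoded = []
    for i in range(0, len(packet), FRAGMENT_SIZE):
        frag = packet[i : i + FRAGMENT_SIZE]
        fp = fingerprint(frag)
        if fp in sender_cache:
            encoded.append(("ref", fp))  # receiver is assumed to hold this fragment
        else:
            sender_cache[fp] = frag
            encoded.append(("raw", frag))
    return encoded


def decode(encoded, receiver_cache):
    """Steps 3-4: look up fingerprints and rebuild the original packet."""
    parts = []
    for kind, value in encoded:
        if kind == "raw":
            receiver_cache[fingerprint(value)] = value
            parts.append(value)
        else:
            parts.append(receiver_cache[value])
    return b"".join(parts)
```

A second packet that repeats a fragment from an earlier one is sent with an 8-byte reference in place of the 64-byte fragment.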

24

Network-wide Redundancy Elimination (RE)

25

Routers keep a cache of recent pkts

New packets get “encoded” or “compressed” w.r.t. cached pkts

Encoded pkts are “decoded” or “uncompressed” downstream

Slide borrowed from the SmartRE talk

Network-wide Redundancy Elimination (RE)

Pros: protocol independence

Can even cache HTTP uncacheable content

Fine-grained redundancy elimination (packet)

Cons: index explosion

Index can be too big (at high-speed networks)

Not sure if the reconstruction cost is small

Cache at a granularity larger than the packet?

Chunk-based RE

26

Chunk-based Caching

Transparently divide content into chunks

[Figure: users, CN, RN, and the content provider, with the ISP networks in between; chunk hash storage and (hash, chunk) storage; example chunk hashes 46273811, 34282346, 93472423, 82346754, 73649568]

1. Content request

2. Deliver content

3. Deliver chunk hashes

4. Look up content for the chunk hashes

5. Deliver cached chunks

6. Deliver content at cache miss

Amplify the bandwidth of the bottleneck link

27

Content-Centric Networking (CCN)

28

Content Centric Networking

Fundamental change to the current Internet

Specify what to deliver instead of how

Suppresses the redundancy in data delivery

One of the four projects funded by the NSF FIA program

• Named Data Networking (NDN)

• MobilityFirst

• Nebula

• Expressive Internet Architecture

The idea looks really cool! Is it realizable?

29

How Does CCN Work?

Interest (request) and Data (response)

One Interest matches one Data

No (IP) addresses!

30

CCN Names

Opaque, structured byte strings

/parc.com/van/cal/417.vcf/v3/s0/0x3fdc96a4…

represented as a sequence of components

component = byte count (n) + n bytes

8:parc.com 3:van 3:cal …
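
The "byte count + n bytes" representation above can be reproduced with a short helper. It only mirrors the textual form shown on the slide (count, colon, component); the actual CCN wire encoding is not given here, and the function names are hypothetical. The decoder assumes components contain no spaces.

```python
def encode_name(name):
    """Render a slash-separated name as 'count:component' tokens."""
    components = [c for c in name.split("/") if c]
    return " ".join(f"{len(c.encode())}:{c}" for c in components)


def decode_name(encoded):
    """Inverse of encode_name; checks each byte count against its component."""
    parts = []
    for token in encoded.split(" "):
        count, _, component = token.partition(":")
        if int(count) != len(component.encode()):
            raise ValueError(f"byte count mismatch in {token!r}")
        parts.append(component)
    return "/" + "/".join(parts)
```

For the first three components of the slide's example, encode_name("/parc.com/van/cal") gives "8:parc.com 3:van 3:cal".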

(Following slides (31-40) are from Van Jacobson’s talk)

31

Basic CCN Forwarding

Consumer ‘broadcasts’ an interest over any available communications media:

get ‘/parc.com/van/slides.pdf’

Interest identifies a collection of data – all data items whose name has the interest as a prefix

Anything that hears the interest and has an element of the collection can respond with it:

HereIs ‘/parc.com/van/slides.pdf/v6/p1’

<data>
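
The prefix rule above (an Interest matches every Data item whose name has the Interest as a prefix) can be sketched as a component-wise comparison. The store contents and the function names are hypothetical, and returning the first match in sorted order is an arbitrary choice made for the example.

```python
def components(name):
    """Split a slash-separated name into its components."""
    return [c for c in name.split("/") if c]


def matches(interest, data_name):
    """True if the interest's components are a prefix of the data name's."""
    i, d = components(interest), components(data_name)
    return d[: len(i)] == i


def respond(interest, store):
    """Return (name, data) for the first stored item matching the interest, or None."""
    for name in sorted(store):
        if matches(interest, name):
            return name, store[name]
    return None
```

Matching is done per component rather than per character, so an Interest for /parc.com/van/slides does not match /parc.com/van/slides.pdf/v6/p1.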

32

IP Packet Forwarding

33

IP Packet Forwarding

34

CCN Node Model

36

get /parc.com/videos/WidgetA.mpg/v3/s2

CCN Node Model

37

get /parc.com/videos/WidgetA.mpg/v3/s2

Advantages

Trivial any-to-any communications

Natural support for mobility

No IP address shortage, no NAT, no scalable address management

Optimal CDN solution

41

CDN vs. CCN: Different Layers

CDN: operates at the application layer

HTTP content distribution (even P2P)

CDN server: end system (atop TCP)

CCN: operates at the IP layer

Packet level content distribution

Need CCN routers

E2E Principle?

43

CDN vs. CCN: Efficiency

CDN: Cache hit at object level

One hit – 1-2GB movie download

Suboptimal network geography

• No direct support from routers

CCN: Cache hit at packet level

One hit – one packet download

Optimal network geography

• Download from the closest router

44

CCN: Practical Implications

1 Interest packet = 1 Data packet

ContentStore at 10G Router?

10 Gbps ≈ 14 million minimum-size IP packets / sec

Lookup speed (14 million lookups / sec?)

Index size (name + Data)

• 100-byte name x 10 billion entries = 1 TB

• What about the Data?

Pending Interest Table (PIT)

How big should it be?

Too small => no Data can be delivered
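
The numbers on this slide can be checked with a few lines of arithmetic. The packet rate is reproduced under the assumption, not stated on the slide, that it refers to minimum-size 64-byte Ethernet frames plus 20 bytes of preamble and inter-frame gap; the index size uses decimal units (1 TB = 10^12 bytes).

```python
LINK_BPS = 10e9               # 10 Gbps link
MIN_FRAME_BYTES = 64          # assumed: minimum-size Ethernet frame
WIRE_OVERHEAD_BYTES = 20      # assumed: 8-byte preamble + 12-byte inter-frame gap


def packets_per_second(link_bps, frame_bytes, overhead_bytes):
    """Maximum packet rate when every packet has the given size on the wire."""
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


def index_bytes(name_bytes, entries):
    """Memory needed for the names alone, ignoring the Data and any pointers."""
    return name_bytes * entries


pps = packets_per_second(LINK_BPS, MIN_FRAME_BYTES, WIRE_OVERHEAD_BYTES)
ns_per_lookup = 1e9 / pps     # time budget per ContentStore lookup at line rate
names_tb = index_bytes(100, 10_000_000_000) / 1e12
```

Under these assumptions the rate comes to about 14.9 million packets per second, which leaves roughly 67 ns per lookup, and the names alone take 1 TB.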

45

CCN: Practical Implications

Many security issues

Fill the PIT with bogus entries

Fill the ContentStore with bogus entries

Incur many expensive table lookups

Easily kill the router

Hard to solve in the near future

What if technology evolves?

46