Distributed Systems: Kademlia, BitTorrent and Magnetic Links
MC714 - Distributed Systems
Kademlia, BitTorrent and Magnetic Links
Islene Calciolari Garcia
Institute of Computing - Unicamp
First Semester of 2015
Assignment for submission
• Research a search/sharing method for peer-to-peer networks that has not been covered in class, and
• compare it with another one that has already been presented.
• Format: .pdf file (3 to 6 pages). Include references!
• Due date: March 31
• Only one person in the group needs to submit. The file must contain the names of the group members.
SEARCHING TECHNIQUES IN PEER-TO-PEER NETWORKS
Authors: Xiuqi Li and Jie Wu
Presenter: Zia Ush Shamszaman
ANLAB, ICE, HUFS
Source: Zia Ush Shamszaman, based on the work of Xiuqi Li and Jie Wu
CONCEPT OF P2P NETWORK
P2P networks are overlay networks on top of the Internet, where nodes are end systems in the Internet and maintain information about a set of other nodes (called neighbors) in the P2P network.
P2P networks offer the following benefits:
• They do not require any special administration or financial arrangements.
• They are self-organized and adaptive. Peers may come and go freely. P2P systems handle these events automatically.
• They can gather and harness the tremendous computation and storage resources on computers across the Internet.
• They are distributed and decentralized. Therefore, they are potentially fault-tolerant and load-balanced.
P2P NETWORK CLASSIFICATION-1/2
P2P networks can be classified based on the control over data location and network topology. There are three categories:
• Unstructured: In an unstructured P2P network such as Gnutella, no rule exists which defines where data is stored, and the network topology is arbitrary.
• Loosely structured: In a loosely structured network such as Freenet and Symphony, the overlay structure and the data location are not precisely determined.
• Highly structured: In a highly structured P2P network such as Chord, both the network architecture and the data placement are precisely specified.
P2P NETWORK CLASSIFICATION-2/2
P2P networks can also be classified as centralized or decentralized.
• In a centralized P2P network such as Napster, a central directory of object location, ID assignment, etc. is maintained in a single location.
• Decentralized P2P networks adopt a distributed directory structure. These systems can be further divided into:
  – Purely decentralized systems, such as Gnutella and Chord, in which peers are totally equal.
  – Hybrid systems, in which some peers, called dominating nodes or super-peers, serve the search requests of other regular peers.
Another classification of P2P systems is hierarchical vs. non-hierarchical, based on whether the overlay structure is a hierarchy or not.
• All hybrid systems and a few purely decentralized systems, such as Kelips, are hierarchical systems. Hierarchical systems provide good scalability, the opportunity to take advantage of node heterogeneity, and high routing efficiency.
• Most purely decentralized systems have flat overlays and are non-hierarchical systems. Non-hierarchical systems offer load balance and high resilience.
WHAT IS “SEARCHING” IN P2P?
• Searching means locating desired data.
• Most existing P2P systems support simple object lookup by key or identifier.
• Some existing P2P systems can handle more complex keyword queries, which find documents containing the keywords in the queries.
• More than one copy of an object may exist in a P2P system. There may be more than one document that contains the desired keywords.
• Some P2P systems are interested in a single data item; others are interested in all data items, or as many data items as possible, that satisfy a given condition.
• Most searching techniques are forwarding-based. Starting with the requesting node, a query is forwarded (or routed) to the desired node(s).
DESIRED FEATURES OF SEARCHING ALGORITHMS IN P2P SYSTEMS
• High-quality query results
• Minimal routing state maintained per node
• High routing efficiency
• Load balance
• Resilience to node failures
• Support of complex queries
QUALITY OF QUERY RESULT
• The quality of query results is application dependent. Generally, it is measured by the number of results and their relevance.
• The routing state refers to the number of neighbors each node maintains.
• The routing efficiency is generally measured by the number of overlay hops per query. In some systems, it is also evaluated using the number of messages per query.
• Different searching techniques make different trade-offs between these desired characteristics.
Kademlia: A Peer-to-peer Information System Based on the XOR Metric
Based on slides by Amir H. Payberah
Source: https://www.kth.se/social/upload/516479a5f276545d6a965080/3-kademlia.pdf
Kademlia Basics
• Kademlia is a key-value (object) store.
• Each object is stored at the k closest nodes to the object's ID.
• Distance between id1 and id2: $d(id_1, id_2) = id_1 \oplus id_2$ (bitwise XOR).
• If the ID space is 3 bits: $d(1, 4) = d(001_2, 100_2) = 001_2 \oplus 100_2 = 101_2 = 5$
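The XOR distance above can be checked with a few lines of Python. This is a minimal sketch; the function name `xor_distance` is chosen here for illustration and does not come from the slides.

```python
def xor_distance(id1: int, id2: int) -> int:
    """Kademlia distance between two IDs: the bitwise XOR, read as an integer."""
    return id1 ^ id2


# Worked example from the slide, in a 3-bit ID space.
d = xor_distance(0b001, 0b100)
print(format(d, "03b"), d)

# The metric is symmetric, and the distance from an ID to itself is zero.
print(xor_distance(4, 1) == xor_distance(1, 4))
print(xor_distance(6, 6))
```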
Kademlia Routing Table
[Figure: node P and its k-bucket list, with one k-bucket per distance range]
• k-bucket: each node keeps a list of references to nodes (contacts) at distance between $2^i$ and $2^{i+1}$, one list for each bit position $i$ of the ID.
• Each k-bucket has at most k entries.
• Bucket ranges shown in the figure: [1, 2), [2, 4), [4, 8), [8, 16), [16, 32), [32, 64), [64, 128), [128, 256)
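Since bucket i covers distances in [2^i, 2^(i+1)), the bucket for a contact is the position of the highest set bit of the XOR distance. The sketch below illustrates this; `bucket_index` is a hypothetical helper name, not something defined in the slides.

```python
def bucket_index(own_id: int, other_id: int) -> int:
    """Return i such that 2**i <= (own_id XOR other_id) < 2**(i + 1)."""
    distance = own_id ^ other_id
    if distance == 0:
        raise ValueError("a node does not keep itself in a k-bucket")
    # bit_length() is the position of the highest set bit, counted from 1.
    return distance.bit_length() - 1


# Example in a 3-bit ID space, seen from node 110.
own = 0b110
for other in (0b111, 0b100, 0b101, 0b000, 0b011):
    i = bucket_index(own, other)
    print(f"{other:03b} -> bucket {i}, range [{2**i}, {2**(i + 1)})")
```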
Kademlia Tuning Parameters
• B is the size in bits of the keys used to identify nodes and to store and retrieve data; in basic Kademlia this is 160, the length of a SHA-1 digest (hash).
• k is the maximum number of contacts stored in a k-bucket; this is normally 20.
• alpha (α) represents the degree of parallelism in network calls, usually 3.
• Other constants used in Kad:
  – tExpire = 86400 s, the time after which a key/value pair expires; this is a time-to-live (TTL) from the original publication date
  – tRefresh = 3600 s, after which an otherwise unaccessed bucket must be refreshed
  – tReplicate = 3600 s, the interval between Kademlia replication events, when a node is required to publish its entire database
  – tRepublish = 86400 s, the time after which the original publisher must republish a key/value pair
FIND_NODE in Kademlia
[Figure sequence: node P looks up target ID Q]
• P consults its k-bucket list: the nodes closest to Q in ID space are stored in the appropriate k-bucket, and P selects nodes from it (A, B and C in the figure).
• P sends FIND_NODE(Q) to A, B and C.
• A, B and C each find the k closest nodes to Q.
• A, B and C each return the k closest nodes to Q to P.
FIND_NODE in Kademlia, Update k-buckets
[Figure sequence: P has received responses from A, B and C, and learns of further nodes M, N and O; in later rounds it learns of nodes T, S, X and R]
• When P receives a response from a node, it updates the appropriate k-bucket for the sender's node ID.
• P issues up to α new requests to nodes it has not yet queried, taken from the set of nodes received in the responses. In the figure, P sends FIND_NODE(Q) to M, N and O.
• P repeats this procedure iteratively until the information received in round n−1 and in round n is the same.
• P resends the FIND_NODE to the k closest nodes it has not already queried ...
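The iterative procedure can be sketched as a small in-memory simulation. Everything below is an illustrative assumption rather than the slides' implementation: the `network` dictionary stands in for real nodes and RPCs, requests are issued one after another instead of in parallel, and the loop stops once the k closest known nodes have all been queried, which is a simplified form of the "no new information" rule. The toy values K = 3 and ALPHA = 2 are chosen to keep the example small.

```python
K = 3      # result size for this toy example (the slides give 20 as the usual k)
ALPHA = 2  # requests per round (the slides give 3 as the usual alpha)


def find_node(network, node_id, target, k=K):
    """Simulated FIND_NODE RPC: node_id returns the k contacts it knows closest to target."""
    return sorted(network[node_id], key=lambda n: n ^ target)[:k]


def iterative_lookup(network, start, target, k=K, alpha=ALPHA):
    """Return the k closest nodes to target that start can discover."""
    shortlist = set(find_node(network, start, target, k))  # from start's own table
    queried = {start}
    while True:
        closest = sorted(shortlist, key=lambda n: n ^ target)[:k]
        pending = [n for n in closest if n not in queried][:alpha]
        if not pending:  # every one of the k closest nodes has been asked
            return closest
        for peer in pending:  # a real node would send these in parallel
            queried.add(peer)
            shortlist.update(find_node(network, peer, target, k))
        shortlist.discard(start)


# Hypothetical 4-bit network: each node ID maps to the contacts it knows.
network = {
    0b0001: {0b0100, 0b1000},
    0b0100: {0b0001, 0b1000, 0b1100},
    0b1000: {0b0001, 0b1100, 0b1110},
    0b1100: {0b1000, 0b1110, 0b1111},
    0b1110: {0b1100, 0b1111},
    0b1111: {0b1110, 0b1100},
}
result = iterative_lookup(network, start=0b0001, target=0b1111)
print([format(n, "04b") for n in result])
```

Starting from node 0001, which only knows 0100 and 1000, each round moves the shortlist closer to the target until no unqueried node remains among the k closest.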
Let's Look Inside Kademlia
Node State
• k-bucket: each node keeps a list of information for nodes at distance between $2^i$ and $2^{i+1}$, for $0 \le i < 160$.
• Each list is sorted by time last seen.
[Figure: binary tree over a 3-bit ID space with leaves 000 to 111, and one node's buckets:]
• [1, 2): two first bits in common
• [2, 4): first bit in common
• [4, 8): no common prefix
Kademlia RPCs
• PING: Probes a node to see if it is online.
• STORE: Instructs a node to store a <key, value> pair.
• FIND_NODE: Returns information for the k nodes it knows about closest to the target ID. These can come from one k-bucket or more.
• FIND_VALUE: Like FIND_NODE, but if the recipient has stored the <key, value> pair, it just returns the stored value.
Store Data
• The <key, value> data is stored at the k closest nodes to the key.
Lookup Service
[Figure: a three-step lookup example (Step 1, Step 2, Step 3) in a 3-bit ID space, showing for the node involved at each step its view of the ID tree and its k-bucket ranges [1, 2), [2, 4), [4, 8)]
Maintaining k-bucket List (Routing Table)
• When a Kademlia node receives any message from another node, it updates the appropriate k-bucket for the sender's node ID.
• If the sending node already exists in the k-bucket:
  – It is moved to the tail of the list.
• Otherwise, if the bucket has fewer than k entries:
  – The new sender is inserted at the tail of the list.
• Otherwise, the node pings the k-bucket's least-recently seen node:
  – If the least-recently seen node fails to respond, it is evicted from the k-bucket and the new sender is inserted at the tail.
  – Otherwise, it is moved to the tail of the list, and the new sender's contact is discarded.
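The update rule can be sketched with an ordered dictionary whose first entry is the least-recently seen contact and whose last entry is the most recent. The class name `KBucket` and the `ping` callback are assumptions made for this example; a real node would send a PING RPC over the network instead of calling a local function.

```python
from collections import OrderedDict


class KBucket:
    """One k-bucket: head = least-recently seen contact, tail = most recently seen."""

    def __init__(self, k=20, ping=lambda node_id: True):
        self.k = k
        self.ping = ping  # hypothetical stand-in for the PING RPC
        self.contacts = OrderedDict()

    def update(self, node_id, address=None):
        """Apply the maintenance rule after receiving any message from node_id."""
        if node_id in self.contacts:
            self.contacts.move_to_end(node_id)        # already known: move to tail
        elif len(self.contacts) < self.k:
            self.contacts[node_id] = address          # room left: insert at tail
        else:
            oldest = next(iter(self.contacts))        # least-recently seen node
            if self.ping(oldest):
                self.contacts.move_to_end(oldest)     # still alive: keep it, drop sender
            else:
                del self.contacts[oldest]             # dead: evict, insert sender
                self.contacts[node_id] = address


# Example with k = 2 and an old contact that no longer answers pings.
bucket = KBucket(k=2, ping=lambda node_id: False)
for sender in (1, 2, 1, 3):
    bucket.update(sender)
print(list(bucket.contacts))
```

Keeping long-lived contacts and discarding the newcomer when the oldest node still answers is the behaviour the slide describes; the sketch does not add any policy beyond it.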
Maintaining k-bucket List (Routing Table)
• Buckets are generally kept fresh by the traffic of requests travelling through the nodes.
• When there is no traffic: each peer picks a random ID in the k-bucket's range and performs a node search for that ID.
Join
•Node P contacts an already participating node Q.
•P inserts Q into the appropriate kbucket.
•P then performs a node lookup for its own node ID.
Leave And Failure
•No action!
• If a node does not respond to the PING message, remove it from the table.
Kademlia vs. Chord
• Like Chord: when α = 1, the lookup algorithm resembles Chord's in terms of message cost.
• Unlike Chord: the XOR metric is symmetric, while Chord's metric is asymmetric.
Summary
[Figure: the three-step lookup example (Step 1, Step 2, Step 3) in a 3-bit ID space with k-bucket ranges [1, 2), [2, 4), [4, 8), repeated]
References
• Kademlia Specification. http://xlattice.sourceforge.net/components/protocol/kademlia/specs.html
• Petar Maymounkov and David Mazieres, "Kademlia: A Peer-to-Peer Information System Based on the XOR Metric", IPTPS '02. http://www.cs.rice.edu/Conferences/IPTPS02/109.pdf
• Daniel Stutzbach and Reza Rejaie, "Improving Lookup Performance over a Widely-Deployed DHT", INFOCOM '06. http://www.barsoom.org/~agthorr/papers/infocom-2006-kad.pdf
• Raul Jimenez, Flutra Osmani and Bjorn Knutsson, "Sub-Second Lookups on a Large-Scale Kademlia-Based Overlay", P2P '11. http://people.kth.se/~rauljc/p2p11/jimenez2011subsecond.pdf
The BitTorrent Protocol
Source: Prof. Sukumar Ghosh
What is BitTorrent?
• Efficient content distribution system using file swarming.
• Does not perform all the functions of a typical P2P system, like searching.
• The throughput increases with the number of downloaders via the efficient use of network bandwidth.
File sharing
To share a file or group of files, the initiator first creates a .torrent file, a small file that contains:
• Metadata about the files to be shared, and
• Information about the tracker, the computer that coordinates the file distribution.
Downloaders first obtain a .torrent file, and then connect to the specified tracker, which tells them from which other peers to download the pieces of the file.
How it works
The file to be distributed is split up into pieces and an SHA-1 hash is calculated for each piece
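Splitting a file into pieces and hashing each one can be sketched with Python's standard `hashlib`. The function names and the tiny piece length are assumptions for illustration only; real torrents use much larger pieces.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int) -> list:
    """Split data into fixed-size pieces and return the SHA-1 digest of each piece."""
    return [
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    ]


def verify_piece(index: int, piece: bytes, hashes: list) -> bool:
    """Check a downloaded piece against the hash published for that index."""
    return hashlib.sha1(piece).digest() == hashes[index]


# Toy example: 10 bytes split into pieces of 4 bytes (the last piece is shorter).
content = b"abcdefghij"
hashes = piece_hashes(content, piece_length=4)
print(len(hashes), [h.hex()[:8] for h in hashes])
print(verify_piece(2, b"ij", hashes))
print(verify_piece(0, b"abcX", hashes))
```

A downloader that holds the published hashes can verify each piece on its own, without waiting for the rest of the file.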
BT Components
The peers first obtain a metadata file for each object. The metadata contains:
• The SHA-1 hashes of all pieces
• A mapping of the pieces to files
• Piece size
• Length of the file
• A tracker reference
BT Components
• The tracker is a central server keeping a list of all peers participating in the swarm.
• A swarm is the set of peers that are participating in distributing the same files.
• A peer joins a swarm by asking the tracker for a peer list and connecting to those peers.
BitTorrent Lingo
• Seeder = a peer that provides the complete file.
• Initial seeder = a peer that provides the initial copy.
• Leecher = one who is downloading (not a derogatory term).
[Figure: an initial seeder, a seeder and two leechers]
Simple example
[Figure: seeder A holds pieces {1,2,3,4,5,6,7,8,9,10}; downloaders B and C each start with {} and are shown with growing piece sets: {1,2,3}, then {1,2,3,4} and {1,2,3,5}, then {1,2,3,4,5}]
Basic Idea
• As a leecher downloads pieces of the file, replicas of the pieces are created. More downloads mean more replicas available.
• As soon as a leecher has a complete piece, it can potentially share it with other downloaders.
• Eventually each leecher becomes a seeder by obtaining all the pieces; it then assembles the file and verifies the checksum.
Pipelining
• When transferring data over TCP, always have several requests pending at once (typically 5), to avoid a delay between pieces being sent.
• Every time a piece or a sub-piece arrives, a new request is sent out.
Piece Selection
• The order in which pieces are selected by different peers is critical for good performance.
• If an inefficient policy is used, peers may end up in a situation where they all hold the same set of easily available pieces, and none of the missing ones.
• If the original seed is prematurely taken down, then the file cannot be completely downloaded! What are "good policies"?
Piece Selection
• Small overlap is good.
• Large overlap is bad: it wastes bandwidth.
Piece selection
• Strict Priority
• Rarest First
  – General rule
• Random First Piece
  – Special case, at the beginning
• Endgame Mode
  – Special case
Random First Piece
• Initially, a peer has nothing to trade.
• It is important to get a complete piece ASAP.
• Select a random piece of the file and download it.
Rarest Piece First
• Determine the pieces that are most rare among your peers, and download those first.
• This ensures that the most commonly available pieces are left till the end to download.
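The rarest-first rule can be sketched by counting how many peers advertise each piece. The function name, the data layout (a dictionary from peer name to the set of pieces it holds) and the tie-breaking by piece index are assumptions made for this example.

```python
from collections import Counter


def rarest_first(have: set, peer_pieces: dict) -> list:
    """Order the pieces we still need from rarest to most common among our peers."""
    counts = Counter(piece for pieces in peer_pieces.values() for piece in pieces)
    wanted = [piece for piece in counts if piece not in have]
    # Fewest holders first; ties are broken by piece index to keep the output stable.
    return sorted(wanted, key=lambda piece: (counts[piece], piece))


# Hypothetical swarm: piece 4 is held by one peer, piece 3 by two, piece 2 by three.
peers = {
    "A": {1, 2, 3},
    "B": {1, 2},
    "C": {1, 2, 3, 4},
}
print(rarest_first(have={1}, peer_pieces=peers))
```

Pieces that no connected peer holds do not appear in the result, since they cannot be requested from anyone.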
Endgame Mode
Near the end, missing pieces are requested from every peer containing them.
This ensures that a download is not prevented from completion due to a single peer with a slow transfer rate.
Some bandwidth is wasted, but in practice, this is not too much.
BT: internal mechanism
• Built-in incentive mechanism (where all the magic happens):
  – Choking Algorithm
  – Optimistic Unchoking
Choking
• Choking is a temporary refusal to upload. It is one of BT's most powerful ideas for dealing with free riders (those who only download but never upload).
• The tit-for-tat strategy is based on game-theoretic concepts.
Choking
Reasons for choking:
  – Avoid free riders
  – Network congestion
A good choking algorithm caps the number of simultaneous uploads for good TCP performance.
More on Choking
Peers try out unused connections once in a while to find out if they might be better than the current ones (optimistic unchoking).
Optimistic unchoking
• A BT peer has a single "optimistic unchoke" to which it uploads regardless of the current download rate from it. This peer rotates every 30 s.
• Reasons:
  – To discover currently unused connections that are better than the ones being used
  – To provide minimal service to new peers
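One round of choking with an optimistic unchoke can be sketched as follows. This is an illustration under stated assumptions, not the reference algorithm: the number of regular upload slots (`regular_slots=3`) is a value picked for the example, peers are ranked only by the download rate observed from them, and the caller is expected to run the function again every 30 s so that the optimistic slot rotates.

```python
import random


def choose_unchoked(download_rates: dict, regular_slots: int = 3, rng=random) -> set:
    """Pick the peers to upload to in one round.

    download_rates maps a peer to the rate at which we currently download from it.
    The best `regular_slots` peers are unchoked (tit-for-tat), plus one randomly
    chosen peer from the rest (the optimistic unchoke).
    """
    by_rate = sorted(download_rates, key=download_rates.get, reverse=True)
    unchoked = set(by_rate[:regular_slots])
    choked = by_rate[regular_slots:]
    if choked:
        unchoked.add(rng.choice(choked))  # uploaded to regardless of its rate
    return unchoked


# Hypothetical rates in kB/s; "c" and "e" are new peers that have sent nothing yet.
rates = {"a": 50, "b": 10, "c": 0, "d": 30, "e": 0}
print(sorted(choose_unchoked(rates, regular_slots=2, rng=random.Random(0))))
```

The random slot is what gives a newcomer with nothing to trade a chance to receive its first pieces.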
Upload-Only mode
• Once download is complete, a peer can only upload. The question is, which nodes to upload to?
• Policy: Upload to those with the best upload rate. This ensures that pieces get replicated faster, and new seeders are created fast
Questions about BT
• What is the effect of bandwidth constraints?
• Is the Rarest First policy really necessary?
• Must nodes perform seeding after downloading is complete?
• How serious is the Last Piece Problem?
• Does the incentive mechanism affect the performance much?
Trackerless torrents
BitTorrent also supports "trackerless" torrents, featuring a DHT implementation that allows the client to download torrents that have been created without using a BitTorrent tracker.
Magnet Links
An Introduction
Karwan Jacksi
Faculty of Science, Computer Science Department, University of Zakho
22/04/2012
Faculty - Department Seminar
Source: Karwan Jacksi
Outline:
• Background
• Client-Server vs. Peer-to-Peer Model
• BitTorrent Protocol
• DHT Networks
• Peer Exchange
• Magnet Links
• History
• Use of Content Hashes
• Technical Description
• The Pirate Bay
Background
• Client-Server Model
– The server has to upload the file to all clients that request it.
– The server bandwidth is the bottleneck when many clients make concurrent requests.
– Too many requests congest and overload the server.
– Lacks robustness, since it has a single point of failure.
• Peer-to-Peer (P2P) Model
– Offers more than a single source for files to be downloaded.
– The number of peers from which pieces can be obtained increases as the number of concurrent peers increases.
– Bandwidth is used efficiently.
Background
• BitTorrent Protocol
– One of many P2P file sharing protocols in existence, e.g. Napster, Kazaa, etc.
– One of the few P2P protocols that have managed to attract millions of users.
– One of the most common protocols for transferring large files.
– Its power comes from splitting the file into several smaller pieces.
– Once a piece is obtained by a peer, it can be shared with other peers in the swarm.
– To download a file via BitTorrent, you need:
– Torrent file: a small metadata file with the .torrent extension.
• Contains: information about the files (e.g. names, size, etc.) and the URL of a tracker.
– Tracker: a navigation centre for the swarm, responsible for helping clients find each other in their swarm.
Background
• Distributed hash table (DHT)
– A class of decentralized distributed systems that provides a lookup service similar to a hash table.
– Most file sharing programs use a distributed hash table.
– The DHT network is used to find the IP addresses of peers present in a swarm, instead of those provided by a tracker.
– DHT allows searching for peers using queries based on the info hash and requires no interaction whatsoever with the tracker(s) of that torrent.
– Search engines use DHT networks to look up
– what search terms are the most popular, and
– which parts of the search engine most people use most frequently.
Background
• Peer Exchange (PEX)
– A communications protocol that augments the BitTorrent protocol.
– It allows a group of peers to collaborate in sharing a given file.
– In the original design of the BitTorrent protocol, peers in a "swarm" relied upon a central computer server, the "tracker", to find each other.
– PEX greatly reduces the reliance of peers on a tracker by allowing each peer to directly update others in the swarm as to which peers are currently in the swarm.
– By reducing dependency on a centralized tracker, PEX increases the speed, efficiency, and robustness of the BitTorrent protocol.
What is a Magnet Link?
– According to the original BitTorrent design, .torrent files are downloaded from torrent web sites (usually index sites).
– Upon downloading the file, the BitTorrent client calculates a 20-byte SHA-1 hash of the info key from the .torrent file, which it uses in the query made to the tracker to uniquely identify the torrent and find out the IP addresses of other peers sharing that torrent
– to which it will subsequently connect to download the contents referred to in the .torrent file.
– Magnet Links take that a step further: they contain, embedded as a parameter, not the link to a .torrent file but the info-hash value already calculated for that specific torrent file.
– Therefore, by clicking on a Magnet Link, your client gets the info-hash of the torrent passed to it, which it then uses to query the DHT network and find other peers which share that torrent.
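The info-hash computation can be sketched as follows. The slides only say that the client hashes the info key with SHA-1; the details filled in here are not from the slides: .torrent files use the bencoding format, so the sketch includes a minimal bencode encoder, and BitTorrent magnet links carry the hash as `urn:btih:` in the `xt` parameter. The `info` dictionary is a made-up example, not a real torrent, and a real client would hash the info bytes exactly as they appear in the file rather than re-encoding them.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoding of ints, bytes, str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])  # dictionary keys are emitted in sorted order
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> str:
    """Hex SHA-1 of the bencoded info dictionary (20 bytes, 40 hex digits)."""
    return hashlib.sha1(bencode(info)).hexdigest()


# Hypothetical single-file torrent with three pieces (placeholder hashes).
info = {
    "name": "example.txt",
    "length": 10,
    "piece length": 4,
    "pieces": b"\x00" * 60,
}
digest = info_hash(info)
print(digest)
print("magnet:?xt=urn:btih:" + digest)
```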
Background
• '.torrent' files
– For years, BitTorrent clients, trackers and indexers have relied on .torrent files to store information on the files shared with the popular P2P protocol.
– These files are stored by indexing sites and are used by BitTorrent clients to connect to the tracker sites.
– The files hold several types of data: a URL of the tracker site, the names of the files being shared, as well as hash codes of the files.
– All of this is used by the client to connect with peers that have the files in the torrent, or portions of them, and also to ensure that the downloaded data is accurate.
– This system has several disadvantages, some technical, but one of the biggest is that BitTorrent indexers have to store the .torrent files on their servers, which leaves them vulnerable to legal threats if the content shared happens to be infringing, even though the .torrent files contain no actual infringing data themselves.
Magnet Links
– Magnet links, though, are just links: they have no files associated with them, just data.
– The links are an evolving URI standard developed primarily to be used by P2P networks.
– They differ from URLs in that they don't hold information on the location of a resource but rather on the content of the file or files to which they link.
– Technically, magnet links are made up of a series of parameters containing various data in no particular order.
– In the case of BitTorrent:
– they hold the hash value of the torrent, which is then used to locate copies of the files among the peers.
– they may also hold file name data or links to trackers used by the torrent.
Magnet Links
– With magnet links, BitTorrent indexers don't have to store any file at all, just a few snippets of data, leaving the individual client apps to do all the heavy lifting.
– Magnet links can be copy-pasted as plain text by users and shared via email, IM or any other medium.
– For the indexer sites, the allure is clear: using magnet links makes it harder for them to be accused of any wrongdoing in court.
– Theoretically, magnet links should not have any disadvantages for the users over .torrent files either.
– They could also potentially make downloads faster, as they would enable the clients to download from peers which have identical files but with different names.
Magnet Links
– In practice though, since the technology is still being actively developed, some kinks still crop up.
– Up until very recently, many of the major BitTorrent clients didn't support magnet links at all.
– After the Pirate Bay introduced them, this is no longer a problem, but there are still things to work out.
– Indexer sites haven't agreed on a single link format, so it's up to the clients to support the various implementations.
Magnet Links
– And for the users, the experience isn't on par with using plain .torrent files yet.
– For example, magnet links on the Pirate Bay don't have any additional data on the torrent other than its content hash, so when the link is opened in uTorrent, for example, the torrent won't have a name or list the files in it.
– This leads to a second problem: without knowing the contents of the torrent, uTorrent starts downloading it directly to the default location, preventing users from selecting a custom location or selecting just some files in a multiple-file torrent.
– These are likely to be just temporary setbacks: the recently launched TorrIndex, the world's first magnet link-only BitTorrent indexer, is listing links which have additional information like tracker URLs and the torrent's name.
– And with broader support from BitTorrent clients and indexers, magnet links may replace .torrent files sooner than you might expect.
Magnet Links
– Magnet links don't require a tracker (since they use DHT), nor do they require you to download a separate file before starting the download, which is convenient.
– The main reason torrent sites are moving toward magnet links, apart from convenience to the user, is that these links (probably) free torrent sites like The Pirate Bay from legal trouble. Since The Pirate Bay won't be hosting files that link to copyrighted content (that is, the torrent files), it's more difficult to claim the site is directly enabling the downloading of copyrighted material.
Magnet Links
• History
– The standard was developed in 2002, partly as a "vendor- and project-neutral generalization" of the ed2k: and freenet: URI schemes used by eDonkey2000 and Freenet, respectively, and attempts to follow official Internet Engineering Task Force (IETF) URI standards as closely as possible.
– Applications supporting magnet links include μTorrent, aMule, BitComet, BitSpirit, BitTorrent, DC++, Deluge, FrostWire, gtk-gnutella, I2P, KTorrent, MLDonkey, Morpheus, qBittorrent, rTorrent, Shareaza, Transmission and Vuze.
Magnet Links
• Use of content hashes
– The most common use of magnet links is to link to a particular file based on a hash of its contents, producing a unique identifier for the file, similar to an ISBN or catalog number.
– Unlike traditional identifiers, however, content-based signatures can be generated by anyone who already has the file, and so do not need a central authority to issue them.
– This makes them popular for use as "guaranteed" search terms within the file sharing community, where anyone can distribute a magnet link to ensure that the resource retrieved by that link is the one intended, regardless of how it is retrieved.
Magnet Links
• Use of content hashes
– While it is theoretically possible that two files could have the same hash value (known as a "hash collision"), cryptographic hash functions are designed so that this event is a practical impossibility
– even if an expert is intentionally looking to find two files with the same hash value.
– Another advantage of magnet links is their open nature and platform independence:
– the same magnet link can be used to download a resource from any number of applications on almost any operating system.
– Because magnets are concise and plain-text, it is possible for users to simply copy-and-paste the links into emails or instant messages, a property not found in, for example, BitTorrent files.
Magnet Links
• Technical description
– Magnet links consist of a series of one or more parameters, the order of
UOZ–FS-CS
– Magnet links consist of a series of one or more parameters, the order of
which is not significant, formatted in the same way as the query
string on the end of many HTTP URLs.
– The most common parameter is "xt", meaning "exact topic", which is
generally a URN formed from the content hash of a particular file, e.g.:
magnet:?xt=urn:sha1:YNCKHTQCWBTRNJIV4WNAE52SJUQCZO5C
which refers to the Base32-encoded SHA-1 hash of the file in question.
– Note that although this refers to a particular file, a search must still be
carried out by the client application to determine where, if anywhere, it
can obtain that file.
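The "xt" value described above can be produced with the Python standard library alone. This is a minimal sketch under stated assumptions: the helper name `sha1_magnet` and the sample bytes are hypothetical, and only the `urn:sha1` form shown on the slide is covered.

```python
import base64
import hashlib


def sha1_magnet(data: bytes) -> str:
    """Build a magnet link whose "xt" is the Base32-encoded SHA-1 of the contents."""
    digest = hashlib.sha1(data).digest()             # 20 raw bytes (160 bits)
    b32 = base64.b32encode(digest).decode("ascii")   # 32 characters, no padding needed
    return f"magnet:?xt=urn:sha1:{b32}"


# Hypothetical contents, used only for illustration.
link = sha1_magnet(b"example file contents")
print(link)
```

Because 160 bits divide evenly into 5-bit Base32 symbols, the encoded hash is always exactly 32 characters long, which matches the length of the hash in the slide's example. The link still says nothing about where the file is; the client has to search for it.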
Magnet Links
• Technical description
– Other parameters defined by the draft standard are:
– "dn" ("display name"): a filename to display to the user, for convenience
– "kt" ("keyword topic"): a more general search, specifying search terms
rather than a particular file
– "mt" ("manifest topic"): a URI pointing to a "manifest", e.g. a list of
further items
– The standard also suggests that multiple parameters of the same type
can be used by appending ".1", ".2", etc. to the parameter name, e.g.:
magnet:?xt.1=urn:sha1:YNCKHTQCWBTRNJIV4WNAE52SJUQCZO5C&xt.2=urn:sha1:TXGCZQTH26NL6OUQAJJPFALHG2LTGBC7
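Since the parameters are formatted like an HTTP query string, a client could read them with an ordinary query-string parser and group the numbered variants ("xt.1", "xt.2") under their base name. The sketch below is one possible approach, not a reference implementation: `parse_magnet` is a hypothetical helper, and the "dn" value appended to the slide's example link is made up for illustration.

```python
from urllib.parse import parse_qsl

MAGNET_PREFIX = "magnet:?"


def parse_magnet(link: str) -> dict:
    """Group magnet parameters by base name, so "xt.1" and "xt.2" both land under "xt"."""
    if not link.startswith(MAGNET_PREFIX):
        raise ValueError("not a magnet link")
    params = {}
    for name, value in parse_qsl(link[len(MAGNET_PREFIX):]):
        base = name.split(".", 1)[0]   # drop the ".1", ".2" suffix if present
        params.setdefault(base, []).append(value)
    return params


# The slide's example link, plus a hypothetical display name.
example = (
    "magnet:?xt.1=urn:sha1:YNCKHTQCWBTRNJIV4WNAE52SJUQCZO5C"
    "&xt.2=urn:sha1:TXGCZQTH26NL6OUQAJJPFALHG2LTGBC7"
    "&dn=example.txt"
)
parsed = parse_magnet(example)
print(parsed["xt"])
print(parsed["dn"])
```

Parameter order does not affect the result, which is consistent with the statement that the order is not significant.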
Magnet Links
• The Pirate Bay
– The world's largest BitTorrent tracker is shutting down!
– As of January 2012, The Pirate Bay has switched to magnet links as the
default option and may eventually use magnet links exclusively.
– On February 28, 2012, The Pirate Bay switched entirely to magnet links.
– It has decided that there is no need to run a tracker anymore, so it will
remain down! "It's the end of an era, but the era is no longer up to date.
We have put a server in a museum already, and now the tracking can be
put there as well," The Pirate Bay announced.
Magnet Links
• The Pirate Bay
– More recently, technologies like Distributed Hash Table (DHT) and
Peer Exchange (PEX) have made trackers largely unnecessary, as they can
find peers without a tracker server. This makes the whole system more
stable and resilient to technical problems and, perhaps even more
importantly, much harder for anti-piracy organizations to attack.
– Instead, The Pirate Bay now features magnet links, allowing users to
access a torrent without downloading a torrent file from the site, which
makes the site even less susceptible to legal threats.
Magnet Links
• Advantages
– They do not need a central authority to issue them.
– They are open and platform independent: the same magnet link can be
used on any system that has an appropriate application.
– They are user-oriented and easy to use.
– All the system needs is an application that supports magnet links.
– Because they are user-oriented, resources are easy to share.
– Most magnet link applications have a search function.
• Disadvantages:
– Slower speed.
– Less control over speed and over the content being downloaded.
Torrents
• Advantages
– Faster connection.
– Easier to search for on the web.
• Disadvantages:
– Trackers are needed when downloading content. If the tracker is
down and there are no existing connections, the download may never
finish.
– Most torrent clients do not have a search function; torrents usually
have to be found on the web.
– If a torrent has been stored on the web for a long time, the tracker may
already have expired, and it is almost impossible to find existing seeds
or leechers.
References
• Further Development of BitTorrent Simulator in Erlang (Karwan Jacksi)
• Is P2P dying or just hiding? (Thomas Karagiannis, UC Riverside)
• Distributed algorithms for improving BitTorrent performance (Anil Can Akay)
• Incentives Build Robustness in BitTorrent (Bram Cohen)
• http://lifehacker.com/5411311/bittorrents-future-dht-pex-and-magnet-links-explained
• http://en.wikipedia.org/wiki/Distributed_hash_table
• http://en.wikipedia.org/wiki/Peer_exchange
• http://en.wikipedia.org/wiki/The_Pirate_Bay
• http://en.wikipedia.org/wiki/Secure_Hash_Algorithm