Distributed hash table: Difference between revisions

Content deleted Content added
OAbot (talk | contribs)
m Open access bot: hdl updated in citation with #oabot.
Reverted 1 edit by 112.215.244.250 (talk): Broke syntax
 
(5 intermediate revisions by 4 users not shown)
Line 11:
These systems differed in how they located the data offered by their peers. Napster, the first large-scale P2P content delivery system, required a central index server: each node, upon joining, would send a list of locally held files to the server, which would perform searches and refer the queries to the nodes that held the results. This central component left the system vulnerable to attacks and lawsuits.
 
Gnutella and similar networks moved to a [[query flooding]] model{{spaced ndash}} in essence, each search would result in a message being broadcast to every machine in the network. While avoiding a [[single point of failure]], this method was significantly less efficient than Napster. Later versions of Gnutella clients moved to a [[dynamic querying]] model which vastly improved efficiency.<ref>{{cite journal |last1=Richter, Stevenson|display-authors=et al |title=Analysis of the impact of dynamic querying models on client-server relationships |journal=Trends in Modern Computing |date=2009 |pages=682–701}}</ref>
 
Freenet is fully distributed, but employs a [[Heuristic (computer science)|heuristic]] [[key-based routing]] in which each file is associated with a key, and files with similar keys tend to cluster on a similar set of nodes. Queries are likely to be routed through the network to such a cluster without needing to visit many peers.<ref>{{citation |url=https://freenetproject.org/papers/lic.pdf |title=Searching in a Small World Chapters 1 & 2 |access-date=2012-01-10 |archive-date=2012-03-16 |archive-url=https://web.archive.org/web/20120316102141/https://freenetproject.org/papers/lic.pdf |url-status=dead }}</ref> However, Freenet does not guarantee that data will be found.
Line 17:
Distributed hash tables use a more structured key-based routing in order to attain both the decentralization of Freenet and Gnutella, and the efficiency and guaranteed results of Napster. One drawback is that, like Freenet, DHTs only directly support exact-match search, rather than keyword search, although Freenet's [[routing algorithm]] can be generalized to any key type where a closeness operation can be defined.<ref>{{citation |chapter-url=https://freenetproject.org/papers/ddisrs.pdf |title=A Distributed Decentralized Information Storage and Retrieval System |chapter=Section 5.2.2 |access-date=2012-01-10 |archive-date=2012-03-16 |archive-url=https://web.archive.org/web/20120316102156/https://freenetproject.org/papers/ddisrs.pdf |url-status=dead }}</ref>
 
In 2001, four systems&mdash;[[Content addressable network|CAN]],<ref name="Ratnasamy01">{{Cite journal |last1=Ratnasamy |first1=Sylvia |last2=Francis |first2=Paul |last3=Handley |first3=Mark |last4=Karp |first4=Richard |last5=Shenker |first5=Scott |date=2001-08-27 |title=A scalable content-addressable network |url=https://dl.acm.org/doi/10.1145/964723.383072 |journal=SIGCOMM Comput. Commun. Rev. |volume=31 |issue=4 |pages=161–172 |doi=10.1145/964723.383072 |issn=0146-4833}}</ref> [[Chord (peer-to-peer)|Chord]],<ref>[[Hari Balakrishnan]], [[M. Frans Kaashoek]], David Karger, [[Robert Tappan Morris|Robert Morris]], and Ion Stoica. [http://www.cs.berkeley.edu/~istoica/papers/2003/cacm03.pdf Looking up data in P2P systems] {{Webarchive|url=https://web.archive.org/web/20160519125101/http://www.cs.berkeley.edu/~istoica/papers/2003/cacm03.pdf |date=2016-05-19 }}. In [[Communications of the ACM]], February 2003.</ref> [[Pastry (DHT)|Pastry]], and [[Tapestry (DHT)|Tapestry]]&mdash;ignitedbrought DHTsattention asto a popular research topicDHTs.
A project called the Infrastructure for Resilient Internet Systems (Iris) was funded by a $12 million grant from the United States [[National Science Foundation]] in 2002.<ref>{{Cite news |title= New P2P network funded by US government |author= David Cohen |work= New Scientist |date= October 1, 2002 |url= https://www.newscientist.com/article.ns?id=dn2861 |access-date= November 10, 2013 |archive-date= April 6, 2008 |archive-url= https://web.archive.org/web/20080406123915/http://www.newscientist.com/article.ns?id=dn2861 |url-status= live }}</ref>
Researchers included [[Sylvia Ratnasamy]], [[Ion Stoica]], [[Hari Balakrishnan]] and [[Scott Shenker]].<ref>{{Cite news |title= MIT, Berkeley, ICSI, NYU, and Rice Launch the IRIS Project |work= Press release |publisher= MIT |date= September 25, 2002 |url= https://iris.pdos.csail.mit.edu/MITPressRelease1.doc |access-date= November 10, 2013 |url-status= dead |archive-url= https://web.archive.org/web/20150926070618/https://iris.pdos.csail.mit.edu/MITPressRelease1.doc |archive-date= September 26, 2015 }}</ref>
Line 36:
The structure of a DHT can be decomposed into several main components.<ref>Moni Naor and Udi Wieder. [http://www.wisdom.weizmann.ac.il/~naor/PAPERS/dh.pdf Novel Architectures for P2P Applications: the Continuous-Discrete Approach] {{Webarchive|url=https://web.archive.org/web/20191209032152/http://www.wisdom.weizmann.ac.il/~naor/PAPERS/dh.pdf |date=2019-12-09 }}. Proc. SPAA, 2003.</ref><ref>Gurmeet Singh Manku. [http://www-db.stanford.edu/~manku/phd/index.html Dipsea: A Modular Distributed Hash Table] {{webarchive|url=https://web.archive.org/web/20040910154927/http://www-db.stanford.edu/~manku/phd/index.html |date=2004-09-10 }}. Ph. D. Thesis (Stanford University), August 2004.</ref> The foundation is an abstract [[Keyspace (distributed data store)|keyspace]], such as the set of 160-bit [[string (computer science)|string]]s. A keyspace [[Partition (database)|partitioning]] scheme splits ownership of this keyspace among the participating nodes. An [[overlay network]] then connects the nodes, allowing them to find the owner of any given key in the keyspace.
 
Once these components are in place, a typical use of the DHT for storage and retrieval might proceed as follows. Suppose the keyspace is the set of 160-bit strings. To index a file with given {{Var serifmvar|filename}} and {{mvar|data}} in the DHT, the [[SHA-1]] hash of {{mvar|filename}} is generated, producing a 160-bit key {{mvar|k}}, and a message {{math|''put''(''k, data'')}} is sent to any node participating in the DHT. The message is forwarded from node to node through the overlay network until it reaches the single node responsible for key {{mvar|k}} as specified by the keyspace partitioning. That node then stores the key and the data. Any other client can then retrieve the contents of the file by again hashing {{mvar|filename}} to produce {{mvar|k}} and asking any DHT node to find the data associated with {{mvar|k}} with a message {{math|''get''(''k'')}}. The message will again be routed through the overlay to the node responsible for {{mvar|k}}, which will reply with the stored {{mvar|data}}.
 
The keyspace partitioning and overlay network components are described below with the goal of capturing the principal ideas common to most DHTs; many designs differ in the details.
Line 158:
* [[Oracle Coherence]]: an in-memory data grid built on top of a Java DHT implementation
* [[Perfect Dark (P2P)|Perfect Dark]]: a [[peer-to-peer]] [[file-sharing]] application from Japan
* [[Retroshare]]: a [[Friend-to-friend]] network<ref>[httphttps://retroshare.sourceforge.net/wiki/index.php/Frequently_Asked_Questions#4-1_How_does_RetroShare_know_my_friend.27s_IP_address_and_port.3F_Why_don.27t_I_need_a_static_IP_address.3F_What_is_DHT_for.3F Retroshare FAQ] {{Webarchive|url=https://web.archive.org/web/20130717094704/http://retroshare.sourceforge.net/wiki/index.php/Frequently_Asked_Questions#4-1_How_does_RetroShare_know_my_friend.27s_IP_address_and_port.3F_Why_don.27t_I_need_a_static_IP_address.3F_What_is_DHT_for.3F |date=2013-07-17 }} retrieved December 2011</ref>
* [[Jami (software)|Jami]]: a privacy-preserving voice, video and chat communication platform, based on a Kademlia-like DHT
* [[Tox (protocol)|Tox]]: an [[instant messaging]] system intended to function as a [[Skype]] replacement