The emergence of peer-to-peer (P2P) networking shows that technological determinism can be turned on its head: rather than socio-economic systems being determined by technological developments, sometimes technology can be determined by its users. Although in general the coupling goes both ways, P2P networking culminates the progression in computer architectures from mainframes through minicomputers, workstations, PCs, and client–server (master–slave) architectures, as information and communication technology (ICT), over a 50-year span, gradually adapted to the needs and behavior of its users. In contrast to the master–slave or client–server architecture, in a peer-to-peer network each node has the same status and can issue requests to, or respond to requests from, any other node. Centralized management of network traffic thus becomes functionally unnecessary, implying greater local autonomy and no centralized control over the content that such traffic carries.
At the research level, P2P networks benefit from a constructive tension between two quite different disciplinary domains. While P2P network research is increasingly drawn toward biology and physics in response to ever-increasing demands on network performance, it is also drawn toward social science, as the development of P2P applications is increasingly influenced by a convergence between social science and design, also known as “social computing”. Because this article is centered on P2P networks rather than on applications, the focus is on their technological, physical, and mathematical aspects.
Although P2P technology is still making waves as a “disruptive technology” in ICT (Subramanian & Goodman 2004), its potential has not been fully realized, primarily because none of the implementations underpinning the major P2P applications fully meets the requirements of a “true” P2P architecture. The key vision of P2P is to move away from any centralization of services, and from any single point of failure or control, toward a truly distributed environment that enables two or more individuals to collaborate to achieve a goal that is of value to at least one of them. To meet this vision, a P2P network must have a number of organizational as well as technological characteristics. While the provision of the resources and services that constitute the P2P network should ideally be shared by its community of users, the autonomy of individual users must be respected. P2P networks must be open to dynamic membership and connection topology, secure, scalable, robust to failures, and “autonomic” (self-configuring, self-optimizing, and self-healing).
Every major P2P application meets this vision better in some respects and worse in others. In Gnutella, for example, all peers have equal rights, but the search algorithm is extremely expensive in network traffic. Napster has a much faster search mechanism, but at the cost of relying on a centralized directory. Freenet removes the dependence on centralized directories and on single points of failure for file storage, but its search mechanism is even more complex than Gnutella’s.
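Gnutella’s traffic cost is easy to see in a sketch of TTL-limited query flooding. The following is a minimal illustration, not Gnutella’s actual implementation; the overlay, the fan-out of five, and the TTL of four are arbitrary assumptions, chosen only to show how every peer forwarding a query to all of its neighbors multiplies the message count at each hop.

```python
import random

def flood_query(overlay, start, ttl):
    """TTL-limited flooding, Gnutella-style: each peer forwards the query
    to all of its neighbors until the hop budget is exhausted. Duplicate
    deliveries still cost a message, which is where the traffic goes."""
    frontier, seen, messages = {start}, {start}, 0
    for _ in range(ttl):
        next_frontier = set()
        for peer in frontier:
            for neighbor in overlay[peer]:
                messages += 1                 # one query message per link used
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.add(neighbor)
        frontier = next_frontier
    return messages, len(seen)

# Hypothetical overlay: 1,000 peers, each pointing at 5 random others.
overlay = {i: random.sample([j for j in range(1000) if j != i], 5)
           for i in range(1000)}
print(flood_query(overlay, start=0, ttl=4))
```

Each extra hop of TTL multiplies the traffic by roughly the fan-out, which is why flooded search scales so poorly as the network grows.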
Current research into the performance of P2P networks as they grow to millions of nodes has led to the investigation of their underlying topological structure in terms of graph theory. Until relatively recently, the modeling of physical and nonphysical systems and processes was performed under the implicit assumption that the interaction patterns among their components could be embedded in a regular Euclidean lattice. Two mathematicians, Erdös and Rényi (1960), made a breakthrough in graph theory by describing a network with a complex topology as a random graph; Gnutella’s overlay is fundamentally of this kind. Many real-life complex networks, however, are neither completely regular nor completely random, and P2P implementations based on these two models, even just in file-sharing systems, have encountered significant problems.
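For concreteness, here is a minimal sketch of the Erdös–Rényi construction (the node count and edge probability are arbitrary assumptions): every possible edge is included independently with a fixed probability, which produces the homogeneous connectivity profile discussed next.

```python
import random

def erdos_renyi(n, p):
    """Erdös–Rényi random graph G(n, p): each of the n*(n-1)/2 possible
    edges is included independently with probability p."""
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if random.random() < p:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency

# Degrees cluster tightly around the mean (n - 1) * p: the "exponential"
# (homogeneous) profile described in the next paragraph.
graph = erdos_renyi(1000, 0.01)
degrees = sorted(len(neighbors) for neighbors in graph.values())
print(degrees[0], sum(degrees) / len(degrees), degrees[-1])
```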
An important model for several P2P networks (especially discovery systems) is the “small-world” model (Watts & Strogatz 1998), based on the observation that any two people on the planet are connected, on average, by a chain of about six acquaintance links (“six degrees of separation”). A prominent common feature of the random graph and small-world models is that their node connectivity distribution peaks at an average value and decays exponentially on either side. Such networks are called “exponential networks” or “homogeneous networks” because each node has roughly the same number of link connections (the sharp peak in the distribution). Because this most frequent number of links depends on the size of the network, it provides a measure of the network’s “scale.” Work by Barabasi (2002) has shown, however, that most networks in a wide range of contexts, from physics to biology to social science to the Internet, exhibit a “scale-free” distribution of node connectivity: one that decays smoothly from a maximum to a minimum, without an intermediate peak and independently of network scale. Scale-free networks are inhomogeneous: most nodes have very few connections, but a few nodes have very many. These “aristocratic” nodes organize the network so that it too can exhibit a small-world effect, but with a very different topology from the Watts and Strogatz small-world model.
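The contrast can be stated compactly. For a homogeneous network such as an Erdös–Rényi random graph the degree distribution is approximately Poisson, peaked at the mean degree, whereas a scale-free network follows a power law with no characteristic peak; the exponent range shown is the one typically reported for such networks and is added here for illustration, not taken from the text above.

```latex
% Homogeneous ("exponential") network: peaked at the mean degree <k>
P(k) = e^{-\langle k \rangle}\,\frac{\langle k \rangle^{k}}{k!}

% Scale-free network: power-law decay, no characteristic peak, no "scale"
P(k) \sim k^{-\gamma}, \qquad \text{typically } 2 < \gamma < 3
```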
Existing P2P networks show limited achievement of these goals mainly because none of them exploits the scale-free properties that arise so often in natural and social networks. In a scale-free network a search can succeed in a very small number of steps if it is directed through one or two relevant and well-connected hubs. The introduction of “super-peers” is one way to begin to introduce scale-free aspects into P2P networks. However, such attempts to date have had limited success, as most “natural” scale-free networks have a hierarchy of super-peers, not just one additional layer. In addition, to remain true to the P2P philosophy, super-peer networks in which the super-peers are static and physically reside on (clusters of) servers owned by a single organization must be avoided. Somewhat paradoxically, although naturally occurring scale-free networks demonstrate highly efficient communication and stability against fragmentation, they are also extremely vulnerable to organized attack on the super-connected few. Counters to this threat involve carefully monitoring the hubs and either repairing them or delegating their responsibilities if their key role is jeopardized. Thus, the environment supported by a P2P network ideally should be truly dynamic as well as distributed. Research indicates that P2P networks are becoming ever more relevant to social computing and are performing in ways ever closer to how social systems actually work.
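The hub-directed search described above can be sketched as a greedy, degree-biased walk. This is a minimal illustration, not any deployed protocol; it assumes each peer can see how well connected its neighbors are, and the toy overlay and node names are hypothetical.

```python
def hub_directed_search(overlay, start, target, max_hops=50):
    """Greedy degree-biased search: always forward the query to the
    best-connected unvisited neighbor. On a scale-free overlay this
    reaches a hub within a step or two, and the hub's many links put
    most of the network one further hop away."""
    current, visited = start, {start}
    for hops in range(1, max_hops + 1):
        if target in overlay[current]:
            return hops                   # target is one hop from here
        candidates = [n for n in overlay[current] if n not in visited]
        if not candidates:
            return None                   # dead end; real protocols backtrack
        current = max(candidates, key=lambda n: len(overlay[n]))
        visited.add(current)
    return None

# Hypothetical overlay in which "h" is the hub joining two clusters.
overlay = {"a": ["b", "h"], "b": ["a", "h"], "h": ["a", "b", "x", "y"],
           "x": ["h", "y"], "y": ["h", "x"]}
print(hub_directed_search(overlay, "a", "y"))   # 2 hops, via the hub "h"
```

The same bias that makes this search fast is also the vulnerability noted above: remove the hub “h” and the two clusters disconnect, which is precisely the organized-attack scenario.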
References:
- Aberer, K., & Hauswirth, M. (2002). An overview on peer-to-peer information systems. Workshop on Distributed Data and Structures (WDAS-2002). At http://lsirpeople.epfl.ch/hauswirth/papers/WDAS2002.pdf, accessed March 5, 2007.
- Adamic, L. A., & Huberman, B. A. (2000). Power-law distribution of the world wide web. Science, 287(5461), 2115.
- Barabasi, A. L. (2002). Linked: The new science of networks. New York: Perseus.
- Barabasi, A. L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509–512.
- Erdös, P., & Rényi, A. (1960). On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences, 5, 17–61.
- Kan, G. (2001). Gnutella. In A. Oram (ed.), Peer-to-peer: Harnessing the power of disruptive technologies. Cambridge, MA: O’Reilly, pp. 94–122.
- Subramanian, R., & Goodman, B. D. (2004). Peer-to-peer computing: The evolution of a disruptive technology. New York: Idea Group.
- Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of ‘small-world’ networks. Nature, 393, 440–442.