| A Survey on Peer-to-Peer Systems | | | | |
| 1. G. Satyavathy, Lecturer, Department of Computer | | | | Presence Information: Presence information is very |
| Science, Sri Ramakrishna College of Arts and Science | | | | important in P2P applications. It provides information |
| for Women, Coimbatore-641 044. | | | | about which peers and resources are available. This is |
| 2. Dr. M. Punithavalli, Director and Head, Department of | | | | relevant for the self-organization of the system. This |
| Computer Science, Sri Ramakrishna College of Arts | | | | information is also important for sharing |
| and Science for Women, Coimbatore-641 044. | | | | processor cycles because the system knows which |
| ABSTRACT | | | | processors are overloaded and which are not. The peers |
| In this survey, we propose a framework for | | | | act as information agents for the other peers. |
| analyzing peer-to-peer content distribution | | | | |
| technologies. Our approach focuses on nonfunctional | | | | Document management: document systems are usually |
| characteristics such as security, scalability, | | | | centrally organized, which allows shared storage, |
| performance, fairness, and resource management | | | | management and use of data. A great effort is |
| potential, and examines the way in which these | | | | necessary to create a centralized index of relevant |
| characteristics are reflected in, and affected | | | | documents. Experience shows that documents |
| by, the architectural design decisions adopted by | | | | created in a company are distributed among the |
| current peer-to-peer systems. Nowadays, | | | | desktop PCs without a central repository having any |
| peer-to-peer (P2P) systems have become an important | | | | knowledge of their existence. In this case, P2P |
| part of the Internet, and millions of users have been | | | | networks are very useful. |
| attracted to their structures and services. The | | | | |
| popularity of peer-to-peer systems has sped up | | | | Collaboration: P2P permits the management of documents |
| academic research, bringing together researchers from systems, | | | | at the level of closed working groups. |
| networking and theory. The most popular P2P | | | | |
| applications support file-sharing and content | | | | 5.2. Files |
| distribution, but new applications are emerging in different | | | | |
| fields; Internet telephony is an example. This paper | | | | A characteristic of file-sharing is that peers sometimes |
| discusses aspects of P2P systems such as | | | | act as clients when they download files and |
| characteristics, structures, protocols, drawbacks, | | | | sometimes as servers when they upload files (servents). |
| open problems and future fields of development. | | | | A central problem in P2P systems is searching for |
| | | | | the required contents or files (the lookup problem) [4]. In |
| Keywords: distributed systems, peer-to-peer, | | | | the context of file-sharing, three different models |
| algorithms, performance design, grid computing, | | | | have been developed: the flooding request model |
| peer-to-peer. | | | | (Gnutella) [16, 17], the centralized directory model |
| | | | | (Napster) and the document routing model (Freenet) [6, |
| 1. INTRODUCTION | | | | 7, 14]. |
| Computation in networks of processing nodes, each | | | | |
| holding a part of the inputs and/or resources initially, | | | | 5.3. Bandwidth |
| can be classified into centralized or distributed | | | | |
| computations. A centralized solution relies on one | | | | Network traffic is constantly rising, mainly due to |
| node being designated as the computer node that | | | | large volumes of multimedia data and file-sharing, so the |
| processes the entire application locally. In distributed | | | | effective use of bandwidth has become increasingly |
| computation, the processing steps of the application | | | | important. When data are centralized and a |
| are divided among the participating nodes. The goal in | | | | spontaneous increase in demand arises, the |
| such systems is to minimize communication and | | | | bandwidth becomes a bottleneck. The P2P approach |
| computation cost. Distributed systems can be further | | | | improves load balancing without any kind of additional |
| classified into a client-server model and a P2P model. | | | | administration, by taking advantage of transmission |
| In the client-server model, the server is the central | | | | routes which are not fully exploited. This concept is |
| registering unit, as well as the only provider of | | | | applied in the area of streaming. Shared use of the |
| content and services. A client only requests content | | | | bandwidth is also well exploited by splitting big files |
| or the execution of services, without sharing any of | | | | into smaller blocks which are downloaded by the |
| its own services. The client-server model can be flat | | | | requesting peers; BitTorrent [8] is an implementation |
| where all clients only communicate with a single | | | | using this principle. |
| server or it can be hierarchical for improved scalability. | | | | |
| For years, and still today, the client-server paradigm has been | | | | 5.4. Storage Space |
| the workhorse of most user applications. In | | | | |
| recent years a new paradigm has been | | | | With P2P storage networks, only a portion of the |
| emerging: peer-to-peer (P2P), mainly supporting | | | | disk space available on a desktop PC will be used. A |
| applications that provide file-sharing and content exchange | | | | P2P storage network is a cluster of computers, |
| (music, movies and programs), but which has also | | | | based on existing networks, which share all the |
| successfully implemented distributed computing and | | | | storage available in the network. Examples are PAST |
| Internet-based telephony. A refined definition of | | | | [18], Pasta [15], CFS [9], OceanStore [12], Farsite [1], |
| peer-to-peer is: "A Peer-to-Peer [P2P] system is | | | | and Intermemory [10]. |
| a self organizing system of equal, autonomous | | | | |
| entities (peers) which aims for the shared usage of | | | | |
| distributed resources in networked environment | | | | 5.5. Processor Cycles |
| avoiding central services" [21]. It is possible to say | | | | |
| that peer-to-peer is a system with completely | | | | There are requirements for high-performance |
| decentralized self-organization and resource usage. | | | | computing, and at the same time there is unused |
| Because of its design principles, completely decentralized and | | | | computing power; this is an incentive for using P2P |
| self-organizing (as opposed to the client-server paradigm), | | | | applications to bundle that computing power. In this |
| the peer-to-peer concept emerges as the design of | | | | way it is possible to obtain computing power more |
| the future. From the point of view of | | | | cheaply than a supercomputer can provide it. This is |
| peer-to-peer concepts there are different challenges, | | | | achieved by forming a cluster of independent, |
| e.g. resilient and scalable distributed systems and new | | | | networked computers, in which a single computer is |
| services. Statistics indicate that 50 per cent of | | | | transparent and all the networked nodes are merged |
| Internet traffic is due to peer-to-peer applications, in | | | | into a single logical computer. |
| some cases up to 75 per cent. The growth of the | | | | An example is SETI@home [2]. |
| Internet, in users and bandwidth, requires an increasingly | | | | |
| diverse wealth of applications. The client-server | | | | 6. APPLICATIONS BASED ON PEER-TO-PEER |
| paradigm requires great effort and resources to | | | | |
| meet these challenges. Internet-based applications | | | | Some applications based on P2P follow: |
| must meet three main requirements: | | | | |
| - Scalability. | | | | 6.1. Application-Layer Multicast |
| - Security and reliability. | | | | In the early days the size of the Internet, certainly |
| - Flexibility and quality of service. | | | | limited, permitted broadcasting a single packet to |
| It is difficult for client-server based applications to | | | | every possible node. In the present Internet, this |
| keep up with the evolution of the Internet. The client-server | | | | technique of broadcasting is very expensive. A selective |
| centralized approach is one of the main constraints | | | | broadcast, such as multicast, is now necessary. In |
| (a resource bottleneck); it is easily attacked and | | | | this field P2P technology, with its |
| difficult to modify due to its placement within the | | | | unstructured networks, has helped to reach unlimited scalability. |
| network infrastructure. All of the above | | | | |
| indicates that there is a paradigm shift from | | | | 6.2. GRID Computing |
| client-server schemes to peer-to-peer schemes. | | | | |
| | | | | The basic objective of GRID computing is to support |
| 2. UNSTRUCTURED PEER-TO-PEER SYSTEMS | | | | resource sharing among individuals and institutions |
| | | | | (organizational units), or resource entities within a |
| The first generation of peer-to-peer based file | | | | networked infrastructure. Grids are structured and |
| sharing used an unstructured approach. | | | | have standards, but lack the capacity for self-organization, |
| Napster [11] was one of these systems, with a strategy based | | | | fault tolerance and scalability. On the other hand, P2P |
| on a metaserver and servers for looking up the | | | | systems are self-organizing and fault tolerant, and react |
| location of data items; after that, the data was | | | | very well to transient populations of peers, but they lack |
| transferred directly between peers. Gnutella uses a | | | | standards. Research efforts in these |
| flooding technique: a query is sent to all the peers in | | | | fields aim to merge the best of the two worlds. Indeed, |
| the system until the required data or peer is found. | | | | the question of how the two concepts converge is |
| Peer-to-peer networks do not rely on a specific | | | | still open [3]. |
| infrastructure offering transport services. Based on | | | | |
| TCP or HTTP connections, a peer-to-peer system | | | | 7. SUMMARY: THE PRESENT AND THE FUTURE |
| forms an overlay structure focusing on content | | | | |
| allocation and distribution. In standard client-server | | | | A lot of work has been done and there is a lot of |
| systems content is stored and provided by a central | | | | work still to do in the field. It is possible to classify and |
| server. Peer-to-peer systems are highly decentralized: they | | | | summarize all the activities in applications and |
| locate the desired content at some peer and provide | | | | research, present and future. |
| the corresponding IP address of that peer to the | | | | |
| searching peer. The download of that content is | | | | 7.1. Applications |
| initiated using a separate connection. In a client-server | | | | |
| system the server provides services or content | | | | 7.1.1. The Present |
| (web server, time server); clients only request content | | | | From 2004 to the present. |
| or services from the server. In peer-to-peer systems | | | | Support for different forms of communication: |
| all resources are provided by peers, which play the role of | | | | - Telephony. |
| clients and/or servers; this is expressed by the term | | | | - Streaming. |
| servent (first syllable of the term server and the | | | | - Scalable and flexible naming systems. |
| second of the term client). Some systems in the first | | | | - Personal communications (e.g. e-mail). |
| generation of peer-to-peer systems | | | | - Inter-organization resource sharing. |
| used a centralized approach. The server is still | | | | - Context/content-aware routing. |
| available; however, contrary to the client-server | | | | |
| approach, this server only stores the IP addresses of peers | | | | 7.1.2. The Future |
| where some content is available, reducing the load on | | | | |
| the server (Napster [11] is an example). Gnutella 0.4 | | | | Future challenges for applications: |
| and Freenet were decentralized approaches that | | | | |
| replaced the centralized scheme presented | | | | - Video conferencing. |
| above. These schemes rely on flooding the | | | | - Distribution of learning material. |
| desired content identifier over the network, reaching | | | | - Location-based services in Mobile Ad Hoc Networks |
| a large number of peers. Peers which share that content | | | | (MANET), distributed and centralized. |
| will respond to the requesting peer. An important | | | | - Context-aware services. |
| drawback is the large amount of traffic generated by | | | | - Trustworthy computing. |
| flooding the request. To avoid this situation, Gnutella | | | | |
| 0.6 introduces a hierarchy of nodes called superpeers, | | | | 7.2. Drawbacks |
| which store the content available at the connected | | | | |
| peers together with their IP addresses. The main | | | | Reasons against peer-to-peer. |
| purpose of these superpeers is to reduce hops in the | | | | |
| search process, reducing the traffic in the | | | | 7.2.1. The Present |
| network. | | | | Up to the present. |
| The above schemes are unstructured peer-to-peer systems | | | | - Lawsuits against users. |
| because the content stored on a given node and its | | | | - Software patents. |
| IP address are unrelated and do not follow any | | | | - Intellectual property. |
| structure. Examples of unstructured peer-to-peer | | | | - P2P requires flat-rate access. |
| systems are Napster, Gnutella [11, ?], FastTrack, | | | | - End nodes still have low bandwidth. |
| eDonkey and Freenet. | | | | - Digital rights management. |
| | | | | - Best-effort service is insufficient for most applications. |
| 3. STRUCTURED PEER-TO-PEER SYSTEMS | | | | |
| The challenge of developing scalable unstructured | | | | 7.2.2. The Future |
| peer-to-peer applications attracted the attention of the research | | | | - Lack of trust. |
| community. Due to the advantages and possibilities of | | | | - Commercialization as the end of P2P. |
| decentralized self-organizing systems, researchers | | | | - P2P integrated into other topics. |
| focused on approaches for distributed, | | | | |
| content-addressable data storage, so-called | | | | 7.3. Research Focus |
| Distributed Hash Tables (DHTs). These were | | | | The present research efforts and the |
| developed to provide distributed indexing, scalability, | | | | research work still to do. |
| reliability and fault tolerance. Using a DHT, a data item | | | | |
| can be retrieved from the network with a complexity | | | | 7.3.1. Nowadays |
| of O(log N). The underlying network and the number | | | | Current research topics. |
| of peers in a structured approach can grow without significant | | | | - Semantic integration of different information types |
| impact on the efficiency of the distributed application; | | | | in the specific peer database. |
| this is in contrast to the previously described | | | | - Quality of service criteria (consistency, |
| unstructured peer-to-peer applications, which usually | | | | availability, security, reliability). |
| exhibit, at best, linear search complexity. Four of the | | | | - Legacy support in overlays. |
| most interesting and representative mechanisms for | | | | - P2P and non-request-reply interactions. |
| routing messages and locating data for structured | | | | - Highly adaptive DHTs. |
| content distribution systems are the following. Freenet [6, 7] is a | | | | - Overlay optimization. |
| loosely structured system that uses file and node | | | | - P2P signaling efficiency. |
| identifiers to produce an estimate of where a file may | | | | - Data dissemination. |
| be located, and a chain mode propagation approach | | | | - Resource allocation (mechanisms and protocols) and |
| to forward queries from node to node. Chord is a | | | | guaranteeing quality of service in P2P systems. |
| system whose nodes maintain a distributed routing | | | | - Self-determination of information source. |
| table in the form of an identifier circle on which all | | | | - Accounting incentives. |
| nodes are mapped and an associated finger table is | | | | - Realistic P2P simulators. |
| built. CAN is a system using an n-dimensional Cartesian | | | | - Decentralized reputation mechanisms. |
| coordinate space to implement the distributed | | | | - Semantic queries. |
| location and routing table; each node is responsible | | | | - Efficient P2P content distribution. |
| for a zone in the coordinate space. Tapestry (and | | | | - Content-based search queries, metadata. |
| Pastry and Kademlia [13]) are based on the Plaxton mesh | | | | - Reduction of signaling traffic. |
| data structure, which maintains pointers to nodes in | | | | - Data-centric P2P algorithms. |
| the network whose IDs match the elements of a | | | | - Content management. |
| tree-like structure of ID prefixes up to a digit position. | | | | - Application/data integration. |
| | | | | - Security: trust, authentication, transmission. |
| 4. SELF ORGANIZATION | | | | - Incentive market mechanisms. |
| The term self-organization can be taken to cover | | | | - Reliable messaging. |
| autonomy, self-maintenance, optimization, adaptability, | | | | - P2P in mobile cellular/ad hoc networks. |
| rearrangement, reproduction or emergence. | | | | |
| | | | | 7.3.2. Future Challenges |
| 4.1. Definitions | | | | - Anonymous but still secure e-commerce. |
| | | | | - Interoperability and/vs standards. |
| System: A system is a set of components that have | | | | - Real P2P for business information systems. |
| relations between each other and form a unified | | | | - Real-time P2P data dissemination. |
| whole. A system distinguishes itself from its | | | | - P2P file systems. |
| environment. | | | | - Concept of trust and dynamic security. |
| | | | | - Dynamic content update. |
| Complexity: This term is used to denote the | | | | - Distributed search mechanisms. |
| existence of system properties that make it | | | | - P2P technologies in MANET. |
| difficult to describe the semantics of a system's | | | | - Mobile P2P. |
| overall behavior in an arbitrary language, even if | | | | - Intelligent search. |
| complete information about its components and | | | | - Service differentiation. |
| interaction is known. | | | | - P2P-GRID integration. |
| | | | | Certainly there is a lot of work to do; this paper has |
| Feedback: The return to the input of a part of the | | | | no conclusions (nothing is over) because everything is just |
| output of a machine, system or process (as | | | | beginning. The field of applications is huge. There are |
| for producing changes in an electronic circuit that | | | | excellent readings [23, 3, 22] that should be used for |
| improve performance or in an automatic control | | | | research and teaching. |
| device that provides self-corrective action). | | | | |
| | | | | 8. REFERENCES |
| Emergence: Refers to unexpected global system | | | | [1] A. Adya, W. J. Bolosky, M. Castro, G. Cermak, R. |
| properties, not present in any of the | | | | Chaiken, and J. R. Douceur. FARSITE: |
| individual subsystems, that emerge from component | | | | Federated, Available and Reliable Storage for an |
| interactions [5]. | | | | Incompletely Trusted Environment, |
| | | | | 2002. |
| Complex Systems: Complex systems are systems | | | | [2] D. Anderson. SETI@home. Chapter 5, pp 67-76. |
| with multiple interacting components whose behavior | | | | O'Reilly, 2001. |
| cannot simply be inferred from the behavior of the | | | | [3] S. Androutsellis-Theotokis and D. Spinellis. A |
| components [20]. | | | | Survey of Peer-to-Peer Content Distribution |
| | | | | Technologies. ACM Computing Surveys, Vol. 36(4), |
| Criticality: An assembly in which a chain reaction is | | | | 2004. |
| possible is called critical, and is said to have obtained | | | | [4] H. Balakrishnan, M. F. Kaashoek, D. Karger, |
| criticality. | | | | R. Morris and I. Stoica. Looking up Data in P2P |
| | | | | Systems. Communications of the ACM, 46(2), 2003. |
| Hierarchy: In this context hierarchy is defined as a | | | | [5] J. L. Casti. Complexity. Encyclopaedia Britannica. |
| rooted tree. | | | | 2005. |
| | | | | [6] I. Clarke. Freenet's Next Generation Routing |
| Heterarchy: A heterarchy is a type of network | | | | Protocol. 2003. index. php?page=ngrouting. |
| structure that allows a high degree of connectivity. | | | | [7] I. Clarke, S. G. Miller, T. W. Hong, O. Sandberg, |
| By contrast, in a hierarchy every node is connected | | | | and B. Wiley. Protecting Free Expression Online with |
| to at most one parent node and zero or more child | | | | Freenet. IEEE Internet Computing, 6(1), pp 40-49, |
| nodes. In a heterarchy, however, a node can be | | | | 2002. |
| connected to any of the surrounding nodes. | | | | [8] B. Cohen. Incentives Build Robustness in |
| | | | | BitTorrent. Workshop on Economics of Peer-to-Peer |
| Stigmergy: Stigmergy defines a paradigm of indirect | | | | Systems, 2003. |
| and asynchronous communication mediated by an | | | | [9] F. Dabek, M. F. Kaashoek, D. Karger, R. Morris, |
| environment. | | | | and I. Stoica. Wide-area Cooperative Storage with |
| | | | | CFS. Proceedings of the 18th ACM Symposium on |
| Perturbation: A perturbation is a disturbance which | | | | Operating Systems Principles. pp 202-215, 2001. |
| causes an act of compensation, whereby the | | | | [10] A. Goldberg and P. Yianilos. Towards an Archival |
| disturbance may be experienced in a positive or | | | | Intermemory. Proceedings of the IEEE International |
| negative way. | | | | Forum on Research and Technology Advances in |
| | | | | Digital Libraries. pp 147-156, 1998. |
| 4.2. Characteristics of self-organization | | | | [11] A. Kim and L. Hoffman. Napster and other |
| | | | | Internet peer-to-peer applications. George Washington |
| Based on the above definitions, the self-organization of | | | | University, 2002, citeseer.ist.psu.edu/kim01pricing.html. |
| systems can be characterized as follows: | | | | [12] J. Kubiatowicz, D. Bindel, Y. Chen et al. |
| | | | | OceanStore: An Architecture for Global Scale |
| Self-determined Boundaries: The border between | | | | Persistent Storage. Proceedings of the 9th |
| system and environment is defined by the system | | | | International Conference on Architecture Support for |
| itself. | | | | Programming Languages and Operating Systems. |
| | | | | 2000. |
| Independence of identity and structure: The | | | | [13] P. Maymounkov and D. Mazieres. Kademlia: A |
| distinction between identity and structure makes it possible to | | | | Peer-to-Peer Information System Based on the XOR |
| explain flexibility and adaptability. | | | | Metric. International Workshop on Peer-to-Peer |
| | | | | Systems (IPTPS02), 2002. |
| Maintenance: A self-organizing system must try to | | | | [14] D. S. Milojicic, V. Kalogeraki, R. Lukose, K. |
| maintain itself. | | | | Nagaraja and J. Pruyne. Peer-to-Peer Computing. HP, |
| | | | | Technical Report, HPL-2002- |
| Feedback and heterarchy: If a system is perturbed, it | | | | [15] T. Moreton, I. Pratt, and T. Harris. Storage, |
| tries to restructure to maintain itself, so it needs | | | | Mutability and Naming in Pasta. 2002. pdf. |
| cross-linked relations with its neighborhood. | | | | [16] M. Ripeanu. Peer-to-Peer Architecture Case |
| | | | | Study: Gnutella Network. Proceedings of the IEEE 1st |
| Self-determined reaction to perturbation: A | | | | International Conference on Peer-to-Peer |
| self-organizing system reacts when a perturbation | | | | Computing, 2001. |
| occurs, but it needs metrics for detecting and | | | | [17] M. Ripeanu and I. Foster. Mapping the Gnutella |
| evaluating the perturbation. | | | | Network: Properties of Large-scale Peer-to-Peer |
| These characteristics of self-organizing systems can | | | | Systems and Implications for System Design. IEEE |
| be extended to P2P systems by establishing several | | | | Internet Computing, 6(1), 2002. |
| basic criteria such as boundaries, reproduction, | | | | [18] A. Rowstron and P. Druschel. Storage |
| mutability, organization, metrics and adaptivity; and | | | | Management and Caching in PAST, a Large-scale, |
| criteria for autonomy such as feedback, reduction of | | | | Persistent Peer-to-Peer Storage Utility. 18th ACM |
| complexity, randomness, self-organized criticality and | | | | SOSP01. 2001. |
| emergence. Besides the degree of conformance to | | | | [19] S. Saroiu, K. P. Gummadi and S.D. Gribble. |
| these criteria, every system has an identity or a main | | | | Measuring and analyzing the characteristics of Napster |
| purpose that is an essential characteristic of the system. | | | | and Gnutella hosts. Multimedia Systems, 9(2), 2003. |
| The identity of a P2P system is imposed from | | | | pp 170-184, Springer-Verlag. |
| outside (the developers) and it is not self-determined. | | | | [20] F. Schweitzer. Coordination of Decisions in |
| | | | | Spatial Multi-Agents Systems. International Workshop |
| 5. APPLICATION AREAS | | | | on Socio- and Econo-Physics. 2003. |
| Peer-to-peer is an alternative for managing different | | | | [21] R. Steinmetz and K. Wehrle. Peer-to-Peer- |
| types of resources such as information, files, bandwidth, | | | | Networking and -Computing. Informatik-Spektrum, |
| storage and processor cycles. | | | | 27(1). Springer. 2004. |
| | | | | [22] R. Steinmetz and K. Wehrle (Eds). Peer-to-Peer |
| 5.1. Information | | | | Systems and Applications. Lecture Notes in Computer |
| | | | | Science, LNCS 3485, Springer. 2005. |
| This section explains how P2P networks are | | | | [23] J. Van Der Merwe, D. Dawound, S. Mc Donald. |
| deployed in areas of information. | | | | |
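
The identifier circle and finger table that Section 3 attributes to Chord, and the O(log N) retrieval cost claimed there for DHTs, can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not the Chord protocol itself: the 16-bit identifier space, the SHA-1 naming, and the names `ident`, `between` and `ChordRing` are all hypothetical, and the ring is built with global knowledge, so joins, failures and stabilization are left out.

```python
import hashlib
from bisect import bisect_left

M = 16          # identifier bits; an assumption made for this sketch
RING = 2 ** M   # size of the identifier circle


def ident(name: str) -> int:
    """Map a peer or file name onto the identifier circle."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING


def between(x: int, a: int, b: int, closed_right: bool = False) -> bool:
    """True if x lies in the ring interval (a, b), or (a, b] if closed_right."""
    if closed_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero (a == b: whole ring but a)


class ChordRing:
    """Static ring built with global knowledge (no joins or failures)."""

    def __init__(self, node_ids):
        self.ids = sorted(set(node_ids))
        # Finger i of node n is the first node at or after n + 2**i.
        self.fingers = {
            n: [self.successor((n + 2 ** i) % RING) for i in range(M)]
            for n in self.ids
        }

    def successor(self, key: int) -> int:
        """First node whose identifier is >= key, wrapping around the circle."""
        i = bisect_left(self.ids, key)
        return self.ids[i % len(self.ids)]

    def lookup(self, start: int, key: int):
        """Route from node `start` to the node responsible for `key`.

        Returns (responsible_node, hops), where hops counts forwardings.
        """
        node, hops = start, 0
        while True:
            succ = self.fingers[node][0]
            if between(key, node, succ, closed_right=True):
                return succ, hops
            # Forward to the farthest finger that still precedes the key.
            nxt = next(
                (f for f in reversed(self.fingers[node]) if between(f, node, key)),
                succ,
            )
            node, hops = nxt, hops + 1


if __name__ == "__main__":
    ring = ChordRing(ident(f"peer-{i}") for i in range(64))
    key = ident("some-file.mp3")
    owner, hops = ring.lookup(ring.ids[0], key)
    print(f"key {key} is stored at node {owner}, reached in {hops} hops")
```

Each forwarding step jumps to the farthest finger that does not pass the key, which at least halves the remaining distance to the key's predecessor; with correct finger tables the hop count is therefore bounded by the number of identifier bits, whereas the flooding schemes of Section 2 may have to contact a large fraction of the peers.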