A Survey On Peer-To-Peer Systems

1. G. Satyavathy, Lecturer, Department of Computer Science, Sri Ramakrishna College of Arts and Science for Women, Coimbatore-641 044.
2. Dr. M. Punithavalli, Director and Head, Department of Computer Science, Sri Ramakrishna College of Arts and Science for Women, Coimbatore-641 044.

ABSTRACT
In this survey, we propose a framework for analyzing peer-to-peer content distribution technologies. Our approach focuses on nonfunctional characteristics such as security, scalability, performance, fairness, and resource management potential, and examines the way in which these characteristics are reflected in, and affected by, the architectural design decisions adopted by current peer-to-peer systems. Peer-to-Peer (P2P) systems have become an important part of the Internet, and millions of users have been attracted to their structures and services. The popularity of Peer-to-Peer systems has accelerated academic research, bringing together researchers from systems, networking and theory. The most popular P2P applications support file-sharing and content distribution, and new applications are emerging in different fields; Internet telephony is an example. This paper discusses issues of P2P systems such as characteristics, structures, protocols, drawbacks, open problems and future fields of development.

Keywords: distributed systems, peer-to-peer, algorithms, performance design, grid computing.

1. INTRODUCTION
Computation in networks of processing nodes, each initially holding a part of the inputs and/or resources, can be classified into centralized or distributed computations. A centralized solution relies on one node being designated as the computing node that processes the entire application locally. In distributed computation, the processing steps of the application are divided among the participating nodes. The goal in such systems is to minimize communication and computation cost. Distributed systems can be further classified into a client-server model and a P2P model. In the client-server model, the server is the central registering unit, as well as the only provider of content and services. A client only requests content or the execution of services, without sharing any of its own services. The client-server model can be flat, where all clients communicate only with a single server, or it can be hierarchical for improved scalability.

For years, and still today, the client-server paradigm has been the workhorse of most user applications. In recent years a new paradigm has been emerging: peer-to-peer (P2P), which mainly supports applications providing file-sharing and content exchange (music, movies and programs), but which has also been used successfully for distributed computing and Internet-based telephony. A refined definition of Peer-to-Peer is: "A Peer-to-Peer [P2P] system is a self organizing system of equal, autonomous entities (peers) which aims for the shared usage of distributed resources in networked environment avoiding central services" [21]. In other words, a peer-to-peer system is one with completely decentralized self-organization and resource usage. Because of these design principles (complete decentralization and self-organization, as opposed to the client-server paradigm), the peer-to-peer concept emerges as the design of the future. From the point of view of peer-to-peer concepts there are different challenges, e.g. resilient and scalable distributed systems and new services.

Statistics indicate that 50 per cent of Internet traffic is due to peer-to-peer applications, and in some cases up to 75 per cent. The growth of the Internet, in users and bandwidth, demands an increasingly diverse wealth of applications. The client-server paradigm requires great effort and resources to meet these challenges. Internet-based applications have three main requirements:
- Scalability.
- Security and reliability.
- Flexibility and quality of service.
It is difficult for client-server based applications to keep up with the evolution of the Internet. The centralized client-server approach is one of the main constraints: it is a resource bottleneck, it is easily attacked, and it is difficult to modify due to its placement within the network infrastructure. All of the above indicates that there is a paradigm shift from client-server schemes to peer-to-peer schemes.

2. UNSTRUCTURED PEER-TO-PEER SYSTEMS

The first generation of peer-to-peer file sharing used an unstructured approach. Napster [11] was one such system, with a strategy based on a metaserver and servers for looking up the location of data items; after the lookup, the data was transferred directly between peers. Gnutella uses a flooding technique: a query is sent to all the peers in the system until a peer holding the required data is found. Peer-to-peer networks do not rely on a specific infrastructure offering transport services. Based on TCP or HTTP connections, a peer-to-peer system forms an overlay structure focusing on content allocation and distribution. In standard client-server systems, content is stored and provided by a central server. Peer-to-peer systems are highly decentralized: they locate the desired content at some peer and provide the corresponding IP address of that peer to the searching peer. The download of that content is initiated using a separate connection.

In a client-server system the server provides services or content (web server, time server), and clients only request content or services from the server. In peer-to-peer systems all resources are provided by peers, which play the role of clients and/or servers; this is expressed by the term servent (the first syllable of the term server and the second of the term client). Some first-generation peer-to-peer systems used a centralized approach. A server is still present but, contrary to the client-server approach, this server only stores the IP addresses of peers where some content is available, which reduces the load on the server (Napster [11] is an example). Gnutella 0.4 and Freenet were decentralized approaches that replaced the centralized scheme presented above. These schemes rely on flooding the desired content identifier over the network, reaching a large number of peers. Peers which share the content respond to the requesting peer. An important drawback is the large amount of traffic generated by flooding the request. To avoid this, Gnutella 0.6 introduces a hierarchy of nodes called superpeers, which store the content available at the connected peers together with their IP addresses. The main purpose of these superpeers is to reduce the number of hops in the search process, and with it the traffic in the network.
The above schemes are unstructured peer-to-peer because the content stored on a given node and its IP address are unrelated and do not follow any structure. Examples of unstructured peer-to-peer systems are Napster, Gnutella [11, ?], FastTrack, eDonkey, and Freenet.

3. STRUCTURED PEER-TO-PEER SYSTEMS
The challenge of developing scalable unstructured peer-to-peer applications attracted the attention of the research community. Given the advantages and possibilities of decentralized self-organizing systems, researchers focused on approaches for distributed, content-addressable data storage, so-called Distributed Hash Tables (DHT). These were developed to provide distributed indexing, scalability, reliability and fault tolerance. Using a DHT, a data item can be retrieved from the network with a complexity of O(log N). The underlying network and the number of peers in a structured approach can grow without impact on the efficiency of the distributed application; this is in contrast to the previously described unstructured peer-to-peer applications, which usually exhibit, at best, linear search complexity. Four of the most interesting and representative mechanisms for routing messages and locating data in structured content distribution systems are the following. Freenet [6, 7] is a loosely structured system that uses file and node identifiers to produce an estimate of where a file may be located, and a chain-mode propagation approach to forward queries from node to node. Chord is a system whose nodes maintain a distributed routing table in the form of an identifier circle on which all nodes are mapped, together with an associated finger table. CAN is a system that uses an n-dimensional Cartesian coordinate space to implement the distributed location and routing table; each node is responsible for a zone in the coordinate space. Tapestry (and Pastry and Kademlia [13]) are based on the Plaxton mesh data structure, which maintains pointers to nodes in the network whose IDs match the elements of a tree-like structure of ID prefixes up to a digit position.

4. SELF ORGANIZATION
The term self-organization can cover autonomy, self-maintenance, optimization, adaptability, rearrangement, reproduction or emergence.

4.1. Definitions

System: A system is a set of components that have relations between each other and form a unified whole. A system distinguishes itself from its environment.

Complexity: This term is used to denote the existence of system properties that make it difficult to describe the semantics of a system's overall behavior in an arbitrary language, even if complete information about its components and interactions is known.

Feedback: The return to the input of a part of the output of a machine, system or process (as for producing changes in an electronic circuit that improve performance, or in an automatic control device that provide self-corrective action).

Emergence: Refers to unexpected global system properties, not present in any of the individual subsystems, that emerge from component interactions [5].

Complex Systems: Complex systems are systems with multiple interacting components whose behavior cannot simply be inferred from the behavior of the components [20].

Criticality: An assembly in which a chain reaction is possible is called critical, and is said to have obtained criticality.

Hierarchy: In this context hierarchy is defined as a rooted tree.

Heterarchy: A heterarchy is a type of network structure that allows a high degree of connectivity. By contrast, in a hierarchy every node is connected to at most one parent node and zero or more child nodes. In a heterarchy, however, a node can be connected to any of the surrounding nodes.

Stigmergy: Stigmergy defines a paradigm of indirect and asynchronous communication mediated by an environment.

Perturbation: A perturbation is a disturbance which causes an act of compensation, whereby the disturbance may be experienced in a positive or negative way.

4.2. Characteristics of self-organization

Based on the above definitions, self-organization of systems can be characterized as follows:

Self-determined boundaries: The border between system and environment is defined by the system itself.

Independence of identity and structure: The distinction between identity and structure makes it possible to explain flexibility and adaptability.

Maintenance: A self-organizing system must try to maintain itself.

Feedback and heterarchy: If a system is perturbed, it tries to restructure to maintain itself, so it needs cross-linked relations with its neighborhood.

Self-determined reaction to perturbation: A self-organizing system reacts when a perturbation occurs, but it needs metrics for detecting and evaluating the perturbation.

These characteristics of self-organizing systems can be extended to P2P systems by establishing several basic criteria such as boundaries, reproduction, mutability, organization, metrics and adaptivity; and criteria for autonomy such as feedback, reduction of complexity, randomness, self-organized criticality and emergence. Besides the degree of conformance to these criteria, every system has an identity or a main purpose that is an essential characteristic of the system. The identity of a P2P system is imposed from outside (by the developers) and is not self-determined.

5. APPLICATION AREAS
Peer-to-peer is an alternative for managing different types of resources such as information, files, bandwidth, storage and processor cycles.

5.1. Information

This section explains how P2P networks are deployed in the area of information.

Presence Information: Presence information is very important in P2P applications. It provides information about which peers and resources are available. This is relevant for the self-organization of the system. Presence information is also important for sharing processor cycles, because the system knows which processors are overloaded and which are not. The peers act as agents of information for the other peers.

Document management: Document systems are usually centrally organized, which allows shared storage, management and use of data. A great effort is necessary to create a centralized index of relevant documents. Experience shows that documents created in a company are distributed among the desktop PCs without any central repository knowing of their existence. In this case, P2P networks are very useful.

Collaboration: P2P permits management of documents at the level of closed working groups.

5.2. Files

A characteristic of file-sharing is that peers sometimes act as clients, when they download files, and sometimes as servers, when they upload files (servents). A central problem in P2P systems is searching for the required contents or files (the lookup problem) [4]. In the context of file-sharing, three different models have been developed: the flooding request model (Gnutella) [16, 17], the centralized directory model (Napster) and the document routing model (Freenet) [6, 7, 14].

5.3. Bandwidth

Network traffic is constantly rising, mainly because of large volumes of multimedia data and file-sharing, so the effective use of bandwidth has become considerably more important. When data are centralized and a spontaneous increase in demand arises, bandwidth becomes a bottleneck. The P2P approach improves load balancing without any additional administration, by taking advantage of transmission routes which are not fully exploited. This concept is applied in the area of streaming. Shared use of bandwidth is also exploited by splitting big files into smaller blocks which are downloaded by the requesting peers; BitTorrent [8] is an implementation using this principle.

5.4. Storage Space

P2P storage networks build on the observation that only a portion of the disk space available on a desktop PC is typically used. A P2P storage network is a cluster of computers, based on existing networks, which share all the storage available in the network. Examples are PAST [18], Pasta [15], CFS [9], OceanStore [12], Farsite [1], and Intermemory [10].

5.5. Processor Cycles

There is demand for high-performance computing while, at the same time, computing power sits unused; this is an incentive for using P2P applications to bundle that computing power. In this way it is possible to obtain computing power more cheaply than a supercomputer can provide it. This is achieved by forming a cluster of independent, networked computers, in which the individual computer is transparent and all the networked nodes are merged into a single logical computer. An example is SETI@home [2].

6. APPLICATIONS BASED ON PEER-TO-PEER

Some applications based on P2P follow.

6.1. Application-Layer Multicast
In the early days, the size of the Internet was limited and permitted broadcasting a single packet to every possible node. In the present Internet, this technique of broadcasting is very expensive. A selective broadcast, such as multicast, is now necessary. In this field P2P technology, with its unstructured networks, has helped to achieve high scalability.

6.2. GRID Computing

The basic objective of GRID computing is to support resource sharing among individuals and institutions (organizational units), or resource entities within a networked infrastructure. Grids are structured and have standards, but lack the capacity for self-organization, fault tolerance and scalability. P2P systems, on the other hand, are self-organizing and fault tolerant and react very well to transient populations of peers, but they lack standards. Research efforts in these fields aim to merge the best of the two worlds. Indeed, the question of how the two concepts converge is still open [3].

7. SUMMARY: THE PRESENT AND THE FUTURE

A lot of work has been done and a lot of work remains to be done in the field. The activities in applications and research, present and future, can be classified and summarized as follows.

7.1. Applications

7.1.1. The Present
From 2004 up to today: support for different forms of communication.
- Telephony.
- Streaming.
- Scalable and flexible naming systems.
- Personal communications (e.g. e-mail).
- Interorganization resource sharing.
- Context/content aware routing.

7.1.2. The Future

Challenges in the future of applications:
- Video conference.
- Distribution of learning material.
- Location-based services in Mobile Ad Hoc Networks (MANET), distributed and centralized.
- Context aware service.
- Trustworthy computing.

7.2. Drawbacks

Reasons against peer-to-peer.

7.2.1. The Present
Up to today:
- Lawsuits against users.
- Software patents.
- Intellectual property.
- P2P requires flat-rate access.
- Still low-bandwidth end nodes.
- Digital rights management.
- Best-effort service insufficient for most applications.

7.2.2. The Future
- Lack of trust.
- Commercialization as the end of P2P.
- P2P integrated into other topics.

7.3. Research Focus
The present research efforts and the research work still to be done.

7.3.1. Nowadays
Current points of research:
- Semantic integration of different information types in the specific peer-database.
- Quality of service criteria (consistency, availability, security, reliability).
- Legacy support in overlays.
- P2P and non-request-reply interactions.
- Highly adaptive DHTs.
- Overlay optimization.
- P2P signaling efficiency.
- Data dissemination.
- Resource allocation (mechanisms and protocols) and guaranteeing quality of service in P2P systems.
- Self-determination of information source.
- Accounting incentives.
- Realistic P2P simulators.
- Decentralized reputation mechanisms.
- Semantic queries.
- Efficient P2P content distribution.
- Content-based search queries, metadata.
- Reduction of signaling traffic.
- Data-centric P2P algorithms.
- Content management.
- Application/data integration.
- Security: trust, authentication, transmission.
- Incentive and market mechanisms.
- Reliable messaging.
- P2P in mobile cellular/ad-hoc networks.

7.3.2. Future Challenges
- Anonymous but still secure e-commerce.
- Interoperability and/vs standards.
- Real P2P for business information systems.
- Real-time P2P data dissemination.
- P2P file systems.
- Concept of trust and dynamic security.
- Dynamic content update.
- Distributed search mechanisms.
- P2P technologies in MANET.
- Mobile P2P.
- Intelligent search.
- Service differentiation.
- P2P-GRID integration.
Certainly there is a lot of work to do. This paper has no conclusions (nothing is over) because everything is just beginning. The field of applications is huge. There are excellent readings [23, 3, 22] that should be used for research and teaching.

8. REFERENCES
[1] A. Adya, W. J. Bolosky, M. Castro, G. Cermak, R. Chaiken, and J. R. Douceur. FARSITE: Federated, Available and Reliable Storage for an Incompletely Trusted Environment, 2002.
[2] D. Anderson. SETI@home. Chapter 5, pp 67-76. O'Reilly, 2001.
[3] S. Androutsellis-Theotokis and D. Spinellis. A Survey of Peer-to-Peer Content Distribution Technologies. ACM Computing Surveys, Vol. 36(4), 2004.
[4] H. Balakrishnan, M. F. Kaashoek, D. Karger, R. Morris and I. Stoica. Looking up Data in P2P Systems. Communications of the ACM, 46(2), 2003.
[5] J. L. Casti. Complexity. Encyclopaedia Britannica. 2005.
[6] I. Clarke. Freenet's Next Generation Routing Protocol. 2003. index. php?page=ngrouting.
[7] I. Clarke, S. G. Miller, T. W. Hong, O. Sandberg, and B. Wiley. Protecting Free Expression Online with Freenet. IEEE Internet Computing, 6(1), pp 40-49, 2002.
[8] B. Cohen. Incentives Build Robustness in BitTorrent. Workshop on Economics of Peer-to-Peer Systems, 2003.
[9] F. Dabek, M. F. Kaashoek, D. Karger, R. Morris, and I. Stoica. Wide-area Cooperative Storage with CFS. Proceedings of the 18th ACM Symposium on Operating Systems Principles. pp 202-215, 2001.
[10] A. Goldberg and P. Yianilos. Towards an Archival Intermemory. Proceedings of the IEEE International Forum on Research and Technology Advances in Digital Libraries. pp 147-156, 1998.
[11] A. Kim and L. Hoffman. Napster and other Internet peer-to-peer applications. George Washington University, 2002, citeseer.ist.psu.edu/kim01pricing.html.
[12] J. Kubiatowicz, D. Bindel, Y. Chen et al. OceanStore: An Architecture for Global Scale Persistent Storage. Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems. 2000.
[13] P. Maymounkov and D. Mazieres. Kademlia: A Peer-to-Peer Information System Based on the XOR Metric. International Workshop on Peer-to-Peer Systems (IPTPS02), 2002.
[14] D. S. Milojicic, V. Kalogeraki, R. Lukose, K. Nagaraja and J. Pruyne. Peer-to-Peer Computing. HP, Technical Report, HPL-2002-
[15] T. Moreton, I. Pratt, and T. Harris. Storage, Mutability and Naming in Pasta. 2002. pdf.
[16] M. Ripeanu. Peer-to-Peer Architecture Case Study: Gnutella Network. Proceedings of the IEEE 1st International Conference on Peer-to-Peer Computing, 2001.
[17] M. Ripeanu and I. Foster. Mapping the Gnutella Network: Properties of Large-scale Peer-to-Peer Systems and Implications for System Design. IEEE Internet Computing, 6(1), 2002.
[18] A. Rowstron and P. Druschel. Storage Management and Caching in PAST, a Large-scale, Persistent Peer-to-Peer Storage Utility. 18th ACM SOSP01. 2001.
[19] S. Saroiu, K. P. Gummadi and S. D. Gribble. Measuring and analyzing the characteristics of Napster and Gnutella hosts. Multimedia Systems, 9(2), pp 170-184, Springer-Verlag, 2003.
[20] F. Schweitzer. Coordination of Decisions in Spatial Multi-Agents Systems. International Workshop on Socio- and Econo-Physics. 2003.
[21] R. Steinmetz and K. Wehrle. Peer-to-Peer-Networking and -Computing. Informatik-Spektrum, 27(1). Springer. 2004.
[22] R. Steinmetz and K. Wehrle (Eds). Peer-to-Peer Systems and Applications. Lecture Notes in Computer Science, LNCS 3485, Springer. 2005.
[23] J. Van Der Merwe, D. Dawoud, S. McDonald.
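
As an illustration of the O(log N) lookup that Section 3 attributes to DHTs, the following is a minimal, single-process sketch of Chord-style routing over an identifier circle with finger tables. It is not taken from any of the surveyed systems: the identifier size M, the names ident, between, Ring and lookup, and the peer names are assumptions made for this example. Real Chord nodes also handle joins, failures and stabilization, which are omitted here.

```python
import hashlib
from bisect import bisect_left

M = 16              # identifier bits; an assumption for this sketch
SPACE = 1 << M      # size of the identifier circle


def ident(name: str) -> int:
    """Hash a peer name or content key onto the identifier circle."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % SPACE


def between(x: int, a: int, b: int, closed_right: bool = False) -> bool:
    """True if x lies clockwise between a and b: (a, b), or (a, b] if closed_right."""
    if closed_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b   # interval wraps past zero; a == b spans the circle


class Ring:
    """A static Chord-like ring simulated in one process (hypothetical helper)."""

    def __init__(self, node_ids):
        self.nodes = sorted(set(node_ids))
        # Finger i of node n is the first node at or after n + 2^i on the circle.
        self.fingers = {
            n: [self.successor((n + (1 << i)) % SPACE) for i in range(M)]
            for n in self.nodes
        }

    def successor(self, key: int) -> int:
        """Node responsible for key: the first node clockwise from key."""
        i = bisect_left(self.nodes, key)
        return self.nodes[i % len(self.nodes)]

    def lookup(self, start: int, key: int):
        """Route from node `start` using only finger tables.

        Returns (responsible_node, hops).
        """
        node, hops = start, 0
        while True:
            succ = self.fingers[node][0]
            if between(key, node, succ, closed_right=True):
                return succ, hops + 1
            # Forward to the farthest finger that still precedes the key.
            node = next(f for f in reversed(self.fingers[node])
                        if between(f, node, key))
            hops += 1


if __name__ == "__main__":
    ring = Ring(ident(f"peer-{i}") for i in range(1000))
    key = ident("some-file.mp3")
    owner, hops = ring.lookup(ring.nodes[0], key)
    print(f"key {key} is stored at node {owner}, reached in {hops} hops")
```

Each forwarding step in this sketch uses a strictly smaller finger than the previous one, so a lookup takes at most M + 1 hops regardless of where it starts. For comparison, the flooding schemes of Section 2 contact a number of peers that grows with the size of the network.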