TU Delft
 
Alexandru IOSUP
Peer-to-Peer Computing
Parallel and Distributed Systems

Peer-to-Peer Computing Research (2005-ongoing)
Printer-friendly version: Peer-to-Peer Computing Research by Alexandru Iosup, PDF [0.2MB]
Rationale
why and how is this work relevant?

Peer-to-Peer computing is a paradigm under which the participating entities in a distributed system (the peers) can use direct two-way (peer-to-peer) communication to provide and/or receive a service. The ability to communicate directly allows peers that want to provide and/or use a specific service (peers with a similar interest) to form groups (swarms). Because peers can both provide and receive service, peer-to-peer systems promise to use all the available resources, including the resources volunteered by peers, efficiently and cost-effectively. Because the peers can create any communication structure, peer-to-peer systems have the potential to scale well. Among the systems in use, the BitTorrent peer-to-peer file-sharing system has hundreds of millions of daily users world-wide.

Our work in peer-to-peer computing focuses on designing and understanding peer-to-peer file-sharing systems. Thus, BitTorrent has been the focus of several of our measurement and observation projects, and a platform for validating our new peer-to-peer algorithms and methods.

Note on funding and scope: The funding for this work was provided by the EU FP projects CoreGRID, P2P-Next, and i-Share, and by various NWO grants obtained by Dick Epema, Johan Pouwelse, and Henk Sips (all TU Delft). The work described here is part of a much larger body of research coordinated by the aforementioned three researchers; visiting their web pages will lead to more insights into the P2P topics investigated by the PDS group at TU Delft. The grid computing/P2P part of this research was done by Alexandru Iosup as part of his PhD thesis research, under the supervision of Dick Epema. The file-sharing part of this research stands under the much broader umbrella of the Tribler project.


People
who is part of the group?

Note on the operation of the group: Alexandru Iosup collaborated with the group of people described above over six years, starting while he was a PhD student and continuing after his appointment as an Assistant Professor. He did not coordinate the graduate students listed above, except for Boxun Zhang (as a temporary daily supervisor in the middle of his PhD track). His collaboration with U. Wisconsin-Madison, UCC, and UANL developed into a multi-year collaboration, but does not span the full six years covered by this research.


Main Research Questions
what do we try to achieve?
  1. What are the characteristics of global BitTorrent systems? The wide-spread use of P2P technology, particularly by file-sharing applications, makes the study of the characteristics of global BitTorrent systems important and timely. How many users do deployed P2P systems serve? What are their characteristics? How do BitTorrent systems operate and change over time? How present and important are specific P2P phenomena, such as free-riding? Do deployed P2P systems experience large-scale distributed systems phenomena such as churn and flashcrowds?
     
  2. Is there bias in P2P file-sharing measurements aimed at swarm-based systems? Since 2000, tens of empirical (measurement) studies have focused on P2P file-sharing systems. We are interested in understanding whether the results of these studies are comparable, and whether any particular study's results are meaningful and representative for a particular P2P scenario.

  3. How should Quality of Service and Quality of Experience be defined in P2P systems, and how good are they? Quality of Service and Quality of Experience refer to the capacity of the system to provide service according to the requirements of the application designer and of the user, respectively; for example, the ability to transfer files at a constant rate of 1GBps (service-level requirement) may be sufficient for online HD-movie watchers but insufficient for streaming the signal acquired from the Large Hadron Collider's detectors (both user-level requirements). Moreover, how do real P2P systems cope with large-scale distributed systems phenomena such as churn and flashcrowds? We are interested in creating good and comprehensive specifications of service- and user-level requirements for P2P systems. This goal is further complicated by the variety of systems that can be built on top of P2P technology.

  4. Do collaborations help with downloads in swarm-based file-sharing systems? The heterogeneity of peers in real P2P systems, both in terms of available resources and participation, may lead to poor Quality of Service for selected parts of the system and to poor Quality of Experience for some of the peers. We are interested in understanding if peers with similar interest or characteristics can collaborate in order to achieve better Quality of Experience.

  5. Are hybrid P2P-centralized systems useful? Is it profitable to employ P2P resources for content distribution, when a centralized system can be built for the same purpose? How can P2P technology be used in conjunction with traditional centralized technology? What are the potential areas of application for hybrid P2P-centralized technology?
     
     

 


Main Achievements
what did we do?
  1. Created the Peer-to-Peer Trace Archive, which provides anonymized P2P traces collected from deployed P2P systems to researchers and to practitioners alike. The P2P Trace Archive currently hosts over 20 traces, of various lengths and sizes (number of peers), taken from real P2P systems with various applications, from file-sharing to video-streaming to voice-over-IP.
     
  2. Evaluated the characteristics of various global BitTorrent systems. In 2006 we performed the largest BitTorrent measurement to that date (it remained the largest until 2009). We have created BTWorld, a project focusing on the continuous observation of all global BitTorrent systems using an infrastructure of reduced size; we have used BTWorld in a number of large-scale studies of BitTorrent, including the observation of over 10.5 million swarms (the largest number observed in a scientific study to date). Our studies have focused on a variety of topics, including peer geo-location, peer temporal patterns, and downloading behavior (closely linked with Quality of Service and Quality of Experience). We have studied various phenomena occurring in P2P file-sharing systems, such as: aliased media, which is the presence of very similar content in a variety of formats; the flashcrowd effect, that is, a spurt of growth in the size of swarms; super-seeders, which are BitTorrent peers who own and seed several files at the same time; collectors, which are BitTorrent peers who attempt to download several pieces of content at the same time, although the items are not part of an aliased media set; second-seeders, that is, the first BitTorrent peers to obtain full service (to complete the download of a content item seeded by a unique seeder); etc. We have also looked at the way peers make use of their resources, and in particular at the correlation between BitTorrent peer activity and Internet topology.
     
  3. Performed the first comprehensive study and created the first comprehensive model of flashcrowds in global BitTorrent systems. Our study is based on real traces taken from nearly 4 million swarms. The model we have proposed focuses on the arrival of flashcrowds, their magnitude, the duration of their peak period, and the growth and decay of swarms.
     
  4. Identified and evaluated various sources of bias in P2P file-sharing measurements aimed at swarm-based systems. We have analyzed a large number of sources of bias, and created a set of recommendations for P2P file-sharing measurements with reduced bias.
     
  5. Proposed several Quality of Service and Quality of Experience metrics, and studied their values in real and realistic settings for a variety of BitTorrent systems and applications.
     
  6. Designed 2Fast, a protocol for collaborative downloads in swarm-based file-sharing systems. We have shown that collaborative downloads can lead to significant improvements in service and alleviate the negative effects of peer heterogeneity.
     
  7. Were part of the team that designed Tribler, a P2P file-sharing system that is innovative in its social orientation while being fully BitTorrent-compatible. We have designed the first P2P file-sharing system based on a social paradigm, that is, based on the idea that social relations (friends, friends-of-friends) can help peers achieve better Quality of Experience. In particular, Tribler leverages the 2Fast protocol to achieve better performance for collaborative peers. Tribler is operational and has been downloaded over 150,000 times in its first five years of operation.

  8. Proposed a new way to visualize the evolution of global BitTorrent systems. The Hairy World bubbles movie was presented on February 25, 2005, at the IPTPS'05 workshop. A similar movie was also used as a visual abstraction for global BitTorrent systems.

    Each movie is about 170MB. A sample shot:
    The Hairy World bubble-based movie

 


Main Findings
what did we find?
  1. "A purely peer-to-peer architecture can be augmented with a social network component to achieve significantly improved user experience, such as faster downloads, ability to gossip reliably about new content, and get access to credible recommendations." [3][4][9]
     
  2. "Hybrid centralized/peer-to-peer architectures can be used to inter-operate grids. They lead to significantly improved performance (Quality of Service) while offering good provisions for preserving organizational structure." [6][8]
     
  3. "Hybrid centralized/peer-to-peer architectures can be used to build platforms for on-demand video streaming and collaborative environments. They lead to good performance (Quality of Service) and cost-control." [5]
     
  4. "It is worthwhile to re-investigate many of the existing algorithms, methods, and mechanisms proposed in the context of peer-to-peer systems, because the characteristics of global BitTorrent systems vary significantly across deployment and over time." [13]
     
  5. "The Peer-to-Peer Trace Archive is a valid and easy-to-use source of data for peer-to-peer studies." [9][12]
     
  6. "The swarms exhibiting flashcrowds account for a small fraction of the swarms present in a global BitTorrent system, under 1%, but affect a much larger fraction of users, from one quarter to two thirds of all the users." [14]
     
  7. "Aliased media, super-seeders, and collectors are P2P phenomena that occur in BitTorrent, despite good moderation and system efficiency." [1], see also Main Achievement point #2 for definitions of the terms.
     
  8. "Global BitTorrent systems have existed since 2005, when PirateBay users were served by over 6,500 Internet Organizations, the largest 500 Internet Organizations and 250 Autonomous Systems covered less than 90% of the peers, and the largest country participation was below 15% of the peers." [2]
     
  9. "For SuprNova, the largest global BitTorrent system in 2004 but now defunct, many users quit before completing a single download (before obtaining a complete file); of the users who obtained a complete file, most do not remain in the network long enough to seed more than 15% of the downloaded file. Users also quit after downloading one complete file from an aliased media set, wasting the resources consumed for downloading parts of the other files in the aliased media set." [1]
     
  10. "The preference of users for particular content can be characterized as both trivial---preference for localized versions of the same content for users of different languages---and non-trivial---users of a specific culture (even country) preferring or even being interested exclusively in specific types of content." [1]
     
  11. "The application-level bandwidth, defined as the total amount of data transferred by a peer over a period of time, is a service metric that separates users; its distribution per continent shows similar shape but different location." [2]
     
  12. "The distribution of application-level bandwidth for global BitTorrent systems changes greatly over time; in one case, it has doubled over a period of just one year." [2][13]
     
  13. "Different data sources and measurement techniques can lead to significantly different measurement results. Thus, using a variety of input traces when evaluating peer-to-peer systems is necessary." [11]
     
  14. "Collaborative downloading can significantly reduce the download time even for peers with small social networks, and reduce the implicit incentive to free-riding for ADSL users and users of other imbalanced download/upload networks." [4]
     
  15. "A comprehensive model for flashcrowds in BitTorrent needs to cover aspects such as major and minor flashcrowds, flashcrowd arrival time, flashcrowd magnitude, flashcrowd peak (steady) period, and swarm growth and decay." [14]
     
  16. "Most swarms exhibiting flashcrowds are short-lived; two thirds of them reach their half-life point, that is, half the peak size, in less than 48 hours after the flashcrowd starts forming." [14]
     
  17. "The performance of BitTorrent under flashcrowd conditions can be very poor, with the second-seeder (the first peer to complete a full download of the content) appearing only after the peak of the flashcrowd is reached." [14]
     
  18. "Flashcrowds are important in BitTorrent: flashcrowds appear in small fractions (0.3-2%) of swarms but can affect a significant fraction of peers (21-45%)." [14]

  19. "Flashcrowds arrive rapidly: most (70%) major flashcrowds start right after swarm creation." [14]

  20. "Flashcrowds are short: the average duration of the major flashcrowd is around 12 hours." [14]

  21. "The effect of flashcrowds in BitTorrent can be a seven-fold decrease in the performance of peers during flashcrowds versus peers in similarly-sized swarms without flashcrowds." [14]
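
The half-life notion used in finding 16 can be illustrated with a minimal sketch. It assumes, as one possible reading of the definition above, that the half-life point is the first moment after the peak at which the swarm size drops to half the peak size or less, measured from the moment the flashcrowd starts forming. The function name, the hourly sampling, and the sample data are hypothetical and are not taken from [14].

```python
from typing import Optional, Sequence


def half_life_hours(sizes: Sequence[int], step_hours: float = 1.0) -> Optional[float]:
    """Return the hours elapsed from the first sample until the swarm size
    first falls to half of its peak (or less) after the peak was reached.

    Assumption: sizes[0] is sampled when the flashcrowd starts forming and
    consecutive samples are step_hours apart. Returns None if the half-life
    point is never reached within the observed samples.
    """
    if not sizes:
        return None
    # Index of the (first) peak of the swarm size.
    peak_index = max(range(len(sizes)), key=lambda i: sizes[i])
    half_peak = sizes[peak_index] / 2
    # Scan only the samples that follow the peak.
    for i in range(peak_index + 1, len(sizes)):
        if sizes[i] <= half_peak:
            return i * step_hours
    return None


if __name__ == "__main__":
    # Hypothetical hourly swarm sizes for a single swarm.
    hourly_sizes = [10, 400, 900, 1000, 800, 600, 450, 300]
    print(half_life_hours(hourly_sizes))
```

Under this reading, a swarm counts as short-lived in the sense of finding 16 when the returned value is below 48.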
     

 


Publications
2010
[14] B. Zhang, A. Iosup, J. Pouwelse, and D. Epema, Identifying, Analyzing, and Modeling Flashcrowds in BitTorrent, IEEE Int'l. Conf. on Peer-to-Peer Computing (P2P'11). An extended version is available as TU Delft Technical Report PDS-2010-009.
keywords: flashcrowds, BitTorrent, trace archive, analysis, trace characterization, statistical modeling, flashcrowd model, performance analysis, performance, Quality of Service, peer-to-peer.
Article: Identifying, Analyzing, and Modeling Flashcrowds in BitTorrent, in IEEE P2P 2011, PDF [1.35MB]
info: Study based on real traces taken from nearly 4 million swarms. The proposed model focuses on the arrival of flashcrowds, their magnitude, the duration of their peak period, and the growth and decay of swarms.
 
 
[13] B. Zhang, A. Iosup, J. Pouwelse, and D. Epema, The Peer-to-Peer Trace Archive: Design and Comparative Trace Analysis, (under submission). An extended version is available as TU Delft Technical Report PDS-2010-003.
keywords: Peer-to-Peer Trace Archive, peer-to-peer, trace archive, analysis, general models, statistical modeling, Quality of Service, Quality of Experience.
info: The P2P Trace Archive currently hosts over 20 traces, of various lengths and sizes (number of peers), taken from real P2P systems with various applications, from file-sharing to video-streaming to voice-over-IP. Analyzed and modeled general properties of peers in global peer-to-peer systems. For file-sharing, proposed as a Quality of Experience metric the number of sessions needed to download a complete file, and showed that this varies across different global systems.
 
 
[12] B. Zhang, A. Iosup, J. Pouwelse, D. Epema, and H. Sips, Sampling Bias in BitTorrent Measurements, In Euro-Par Conference 2010, Aug 31-Sep 3, 2010 (accepted). Extended version as Technical Report PDS-2009-005.
keywords: measurement bias, trace-based analysis, BitTorrent, peer-to-peer.
info: Assessed various sources of bias in BitTorrent measurements. Formulated guidelines for measurements that avoid the sources of bias identified in this work.
 
 
[11] M. Wojciechowski, M. Capotă, J. Pouwelse, and A. Iosup, BTWorld: Towards Observing the Global BitTorrent File-Sharing Network, In ACM Workshop on Large-Scale System and Application Performance (LSAP'10), in conjunction with the ACM/IEEE Int'l. Symp. on High Performance Distributed Computing (HPDC'10) (accepted).
keywords: measurement, BitTorrent, peer-to-peer, traces, tens of millions of swarms.
info: Developed BTWorld, a tool for observing the global BitTorrent network. Observed hundreds of trackers, over 10.5 million swarms, and tens of millions of peers through the use of just two regular computers.
 
 
2009
[10] B. Zhang, A. Iosup, P. Garbacki, J. Pouwelse, A Unified Format for Traces of Peer-to-Peer Systems, In ACM Workshop on Large-Scale System and Application Performance (LSAP'09), in conjunction with the ACM/IEEE Int'l. Symp. on High Performance Distributed Computing (HPDC'09) (accepted).
keywords: standardization, peer-to-peer, traces, performance.
Article: A Unified Format for Traces of Peer-to-Peer Systems, in LSAP'09, PDF [200KB]
info: Designed a unified format for sharing peer-to-peer traces.
 
 
2008
[9] J.A. Pouwelse, P. Garbacki, J. Wang, A. Bakker, J. Yang, A. Iosup, D.H.J. Epema, M. Reinders, M. van Steen, and H.J. Sips, Tribler: A social-based peer-to-peer system, Concurrency and Computation: Practice and Experience, 2008, vol. 20(2), pp. 127-138, DOI 10.1002/cpe.1189, (accepted, journal).
keywords: peer-to-peer, file-sharing, Tribler, system, 2fast, collaborative downloads, recommendations.
Article: Tribler, PDF [200KB] | PS [3MB]
info: Journal paper on Tribler, a peer-to-peer file-sharing system. Tribler is based on a social paradigm and is BitTorrent-compatible.
 
 
[8] A. Iosup, D.H.J. Epema, T. Tannenbaum, M. Farrellee, M. Livny, Inter-Operating Grids through Delegated MatchMaking, In the Journal of Scientific Programming, Special Edition on Best Paper Award at SuperComputing 2007, IOS Press, vol. 16(2-3), pp. 233-253, 2008, DOI 10.3233/SPR-2008-0246, ISSN 1058-9244 (Print) 1875-919X (Online), (in print, journal).
keywords: grid computing, Delegated MatchMaking, scheduling, peer-to-peer/centralized hybrid architecture.
info: Extended version of the SC'07 paper of the same name.
 
 
[7] Javier Bustos-Jimenez, Nicolas Bersano, Satu Elisa Schaeffer, Jose Miguel Piquer, A. Iosup, and Augusto Ciuffoletti. Estimating the size of Peer-to-Peer networks using Lambert's W function. In Grid Computing: Achievements and Prospects, S. Gorlatch, P. Fragopoulou, T. Priol (Editors), Springer, New York, USA, pp. 61-72, 2008. ISBN 978-0-387-09456-4.
keywords: peer-to-peer, algorithm, network size, Lambert's W function.
info: Accurate estimation of the size of real P2P networks using little measurement (extended version).
 
 
2007
[6] A. Iosup, D.H.J. Epema, T. Tannenbaum, M. Farrellee, M. Livny, Inter-Operating Grids through Delegated MatchMaking, In the ACM/IEEE SuperComputing Conference on High Performance Networking and Computing (SC'07), Nov 10-16, 2007 (accepted, 20%). SC'07 Best Paper Award finalist (top 10% accepted papers).
keywords: grid computing, Delegated MatchMaking, scheduling, peer-to-peer/centralized hybrid architecture.
Article: Delegated MatchMaking, PDF [1.5MB]
info: Inter-operating grids with a solution that combines the hierarchical and the decentralized approaches.
 
 
2006
[5] A. Iosup, P. Garbacki, D.H.J. Epema, Provisioning and Scheduling Resources for World-Wide Data-Sharing Services, In The 2nd IEEE Int'l. Conference on e-Science and Grid Computing (e-Science), Dec 4-6, 2006, Amsterdam, NL. (accepted).
keywords: world-wide data sharing, flashcrowds, scheduling, provisioning, grid computing, peer-to-peer.
Article: PDF [110KB]
Presentation: e-Science'06 [PPT, 1.2MB] | e-Science'06 [PDF, 0.8MB]
info: We have investigated the problem of provisioning resources and deploying content servers in support of peer-to-peer content distribution networks. We have provided a model for peer-to-peer swarms that includes regular operation and flashcrowds. We have designed several algorithms for resource provisioning. Through trace-based simulation, we have shown the impact of using each algorithm on a variety of performance and Quality of Service metrics, and that the proposed algorithms offer the systems designer a good coverage of the trade-off between cost (number of allocated resources) and Quality of Service.
 
 
[4] P. Garbacki, A. Iosup, D.H.J. Epema, M. van Steen, 2Fast: Collaborative downloads in P2P networks, In The Sixth IEEE International Conference on Peer-to-Peer Computing (P2P), Sep 6-8, 2006, Cambridge, UK. (accepted, 21%). Best Paper Award.
keywords: 2Fast, P2P file-sharing, collaborative downloads, cooperative behavior, peer-to-peer.
Article: PDF [210KB]
info: We have built 2Fast, a protocol for collaborative downloads in P2P networks. 2Fast exploits collaborations to correct the imbalance between download and upload bandwidth; as a result, a peer may fully fill its download connection without free-riding, thus achieving considerable speedup over non-collaborative downloaders.
 
 
[3] J. Pouwelse, P. Garbacki, J. Wang, A. Bakker, J. Yang, A. Iosup, D.H.J. Epema, M. Reinders, M. van Steen, H. Sips, Chitraka: A Social-Based Peer-to-Peer System, In the 5th International Workshop on Peer-to-Peer Systems (IPTPS'06), 27-28 February, 2006, Santa Barbara, CA, USA (accepted, 27%). An extended version can be found as Technical Report TU Delft/PDS/2006-002.
keywords: peer-to-peer computing, file-sharing, social networking, performance.
Article: Chitraka: A Social-Based Peer-to-Peer System, at IPTPS'06 [PDF]
info: We propose a novel paradigm for P2P systems design, based on social principles. We use this new paradigm and the existing BitTorrent infrastructure in the design of Tribler*, a P2P file-sharing system.
* Tribler, aka Chitraka. The name was changed at the last minute before shipping the software package associated with this research, for reasons beyond the scope of this note.
 
 
[2] A. Iosup, P. Garbacki, J. Pouwelse, D.H.J. Epema, Correlating Topology and Path Characteristics of Overlay Networks and the Internet, In the 6th Int'l Workshop on Global and Peer-to-Peer Computing (GP2PC'06), in conjunction with the IEEE/ACM CCGrid'06. An extended version can be found as Technical Report PDS-2005-002.
keywords: Large-scale underlay/overlay network measurements, P2P measurements and analysis, The Delft BitTorrent Measurements 2, peer-to-peer.
Presentation: MultiProbe, GP2PC/CCGrid'06 [PPT, 2MB]
info: We have observed 2,000 BitTorrent swarms from the Pirate Bay global BitTorrent system, for 5 days. We report on the network characteristics of the peers, such as application-level bandwidth, latency, serving ISP and AS, etc.
 
 
2005
[1] A. Iosup, P. Garbacki, J.A. Pouwelse, D.H.J. Epema, Analyzing BitTorrent: Three Lessons from One Peer-Level View, In Proceedings of the 11th ASCI Conference, pp. 96-104, 6-8 June, 2005, Heijen, The Netherlands.
keywords: peer-to-peer, analysis, peer characteristics, quality of experience.
Article: PDF [330KB] | Presentation: PPT [5MB]
info: Peer-level analysis of a large BitTorrent dataset (obtained from the largest BitTorrent measurements at that moment). Proposed a novel Quality of Experience metric, which quantifies the fraction of files downloaded by peers before quitting.
 
 


                                                                                                                                                                                                                                             
Online reports
A peer-level view of a P2P network (added Aug 2005)
   
 
     

Last modified: Wed, 29 June, 2011 2:52 PM
The newest version of this page can be found at: http://www.pds.ewi.tudelft.nl/~iosup/research_p2p.html
Copyright © 1998-2005 Alexandru Iosup. All Rights Reserved.