Privacy in P2P based Data Sharing Muhammad Nazmus Sakib CSCE 824 April 17, 2013
Dec 26, 2015

Outline
Problem Description
Background
◦ Privacy
◦ P2P
Types of Privacy
◦ Location based
◦ Content based
Summary

Problem Description
Privacy concerns in P2P networks
◦ User’s ability to control disclosure of personal information
Our Goal
◦ Assess the current privacy exposures in existing networks
◦ Discuss the existing solutions to counter them

Privacy
The right of individuals to determine for themselves when, how, and to what extent information about them is communicated to others
Alan Westin, Columbia University

Overview of P2P
Distributed application architecture
Partitions tasks and workloads
Peers are both supplier and consumer
Little or no centralized control
Types
◦ Structured: uses a DHT (Distributed Hash Table); example: Kad
◦ Unstructured: organized in an ad hoc fashion; examples: Freenet, Gnutella
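
To make the structured case concrete: Kad is based on Kademlia, where a key is stored on the peers whose IDs are closest to it under XOR distance. The following is a minimal sketch of that lookup rule, assuming toy 8-bit IDs and a made-up peer list (real Kad IDs are 128 bits, and real lookups are iterative network queries rather than a local sort).

```python
from typing import List


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance between two IDs: bitwise XOR read as an integer."""
    return a ^ b


def closest_peers(key: int, peer_ids: List[int], k: int = 2) -> List[int]:
    """Return the k peer IDs closest to `key` under XOR distance."""
    return sorted(peer_ids, key=lambda pid: xor_distance(pid, key))[:k]


# Hypothetical toy network with 8-bit IDs (illustrative values only).
peers = [0b00010110, 0b10100001, 0b00011000, 0b11110000]
key = 0b00010100

print(closest_peers(key, peers))  # [22, 24]
```

Because every peer can compute the same distance, any peer can work out which peers are responsible for a key without central coordination.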

Types of Privacy
Location Privacy
◦ Controlling disclosure of IP address, geographic location, identity, etc.
Content Privacy
◦ Controlling disclosure of personal data files and user behavior

Location Privacy
The problem
◦ Gnutella, eDonkey
◦ Kazaa
◦ Skype + BitTorrent
Solutions
◦ Freenet
◦ OneSwarm
◦ I2P

Location Privacy: Problem
Gnutella/eDonkey
◦ The change from protocol v0.4 to v0.6 increased privacy vulnerability
◦ Users can be monitored by IP address, DNS name, software version, shared files, and queries

Location Privacy: Problem
Kazaa
◦ No support for anonymity
Skype + BitTorrent
◦ It is possible to determine the IP address and file-sharing usage of a particular user (Le Blond et al.)

Skype + BitTorrent
Finding the IP address
◦ Find the target person’s Skype ID
◦ Inconspicuously call this person
◦ Extract the callee’s IP address from packet headers
◦ Skype privacy settings fail to protect against this scheme
◦ Observe the mobility of Skype users

Skype + BitTorrent
Linking Internet usage
◦ Skype tracker: uses ten tracking clients to collect the IP addresses of 100,000 users daily
◦ Infohash crawler: determines the infohashes (file IDs) of the 50,000 most popular BitTorrent swarms
◦ BitTorrent crawler: collects the IP addresses participating in the 50,000 most popular swarms
◦ Verifier: attempts to initiate P2P communication with both applications to verify that the same user is indeed running both of them
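
At its core, the linkage is a join on IP address between the Skype data and the BitTorrent data; a matching address alone (for example, one shared behind a NAT) does not prove a single user, which is why a verification step follows. A minimal sketch of the join only, with made-up observations and hypothetical names throughout:

```python
from typing import Dict, Set


def link_candidates(skype_ips: Dict[str, str],
                    swarm_ips: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each Skype user to the infohashes whose swarms contain that user's IP."""
    linked: Dict[str, Set[str]] = {}
    for user, ip in skype_ips.items():
        hits = {infohash for infohash, ips in swarm_ips.items() if ip in ips}
        if hits:
            linked[user] = hits
    return linked


# Made-up observations: Skype ID -> IP, and infohash -> IPs seen in the swarm.
skype_ips = {"alice": "192.0.2.10", "bob": "192.0.2.20"}
swarm_ips = {
    "infohash-1": {"192.0.2.10", "203.0.113.5"},
    "infohash-2": {"192.0.2.10"},
}

print(link_candidates(skype_ips, swarm_ips))
```

The result lists only candidates: here "alice" is linked to both swarms and "bob" to none, and each candidate would still need to pass verification.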

Location Privacy: Solutions
Freenet
◦ Protects the anonymity of both producers and consumers
◦ Identical nodes collectively pool their storage space to store data files
◦ Dynamically replicated files are referred to in a location-independent manner
◦ Infeasible to discover the true origin or destination of a passing file
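
Location-independent naming can be illustrated with content hashing: the key is derived from the file's bytes, so it is the same whichever node holds a replica and says nothing about where the file came from. A minimal sketch, assuming SHA-256 as the hash and a plain dictionary as a stand-in for a node's storage; the `Node` class is hypothetical, and Freenet's actual key formats, encryption, and routing are more involved.

```python
import hashlib
from typing import Dict, Optional


def content_key(data: bytes) -> str:
    """Derive a key from the content itself, not from where it is stored."""
    return hashlib.sha256(data).hexdigest()


class Node:
    """Hypothetical storage node: a dictionary from content key to data."""

    def __init__(self) -> None:
        self.store: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = content_key(data)
        self.store[key] = data
        return key

    def get(self, key: str) -> Optional[bytes]:
        return self.store.get(key)


# The same file replicated on two nodes is addressed by one and the same key.
node_a, node_b = Node(), Node()
document = b"example file contents"
key_a = node_a.put(document)
key_b = node_b.put(document)

print(key_a == key_b)  # True
```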

Location Privacy: Solutions
Freenet: weaknesses
◦ The TTL value of packets can be used to gain knowledge about the source of a file
◦ Malicious nodes that completely surround a node can monitor its incoming and outgoing packets
◦ Slower performance than traditional P2P networks
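
The TTL weakness comes down to simple arithmetic: if messages start from a known TTL and each hop lowers it by one, an observer can estimate how many hops a message has already crossed. A minimal sketch under exactly those assumptions; the starting value of 10 and the function names are hypothetical, and real systems may randomize the counter precisely to blur this signal.

```python
ASSUMED_INITIAL_TTL = 10  # hypothetical starting value, not taken from any protocol


def hops_travelled(observed_ttl: int, initial_ttl: int = ASSUMED_INITIAL_TTL) -> int:
    """Hops crossed so far, assuming every hop decrements the TTL by exactly 1."""
    if not 0 <= observed_ttl <= initial_ttl:
        raise ValueError("observed TTL must lie between 0 and the initial TTL")
    return initial_ttl - observed_ttl


def neighbour_is_likely_origin(observed_ttl: int,
                               initial_ttl: int = ASSUMED_INITIAL_TTL) -> bool:
    """An undecremented TTL suggests the neighbour that sent the message created it."""
    return hops_travelled(observed_ttl, initial_ttl) == 0


print(hops_travelled(7))               # 3
print(neighbour_is_likely_origin(10))  # True
```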

Location Privacy: Solutions
OneSwarm
◦ Makes a trade-off between performance and anonymity: better performance than Freenet, better privacy than BitTorrent
◦ Privacy is controlled by the users
◦ Data is transferred through disposable addresses
◦ Prevents monitoring of user behavior

OneSwarm
Weaknesses
◦ A timing attack is possible with only two attacking nodes
◦ With 15% of peers attacking, 90% of peers become vulnerable
◦ Thwarting these attacks would raise response times above those of Freenet
◦ With 25% of peers attacking, 98% of peers can be monitored
◦ A TCP-based attack with only one attacker can identify the source of data

Location Privacy: Solutions
I2P (Invisible Internet Project)
◦ Network layer that allows pseudonymous communication
◦ Implemented through I2P routers
◦ End-to-end encryption
◦ P2P implementations: BitTorrent over I2P, iMule (Invisible eMule), I2Phex

I2P
Attacks
◦ Timpanaro et al. developed a large-scale monitoring architecture
◦ Their work shows that large-scale monitoring can compromise the anonymity I2P provides
◦ Still a better choice than Tor or Freenet

Content Privacy
Kazaa
Kad
Personal Health Information

Content Privacy
Kazaa: experiments by Good et al.
◦ Find out whether users are sharing personal files
◦ Find out whether the shared files are downloaded
Results (24-hour period)
◦ 156 distinct users shared their inbox
◦ 19 out of 20 users shared email files
◦ 9 users shared their web browser cache
◦ 5 users shared word processing documents
◦ 2 users shared financial documents
◦ Shared dummy files were downloaded by 4 distinct users

Content Privacy
Kad Network
◦ Dragonfly monitoring system: passively monitors sharing and downloading events
◦ Within 2 weeks: 5,000 private files related to 10 distinct keywords
◦ Honey files: 192 distinct attackers tried to download them; 45 attackers tried to hack into the honey accounts 125 times
◦ Solution: Numen, an eMule plugin

Content Privacy
Personal Health Information (PHI)
◦ El Emam et al. designed a system to download files from P2P networks
◦ Results: 0.4% of Canadian IP addresses and 0.6% of US IP addresses had PHI
Personal Financial Information (PFI)
◦ Same experiment: 1.7% of Canadian IP addresses and 4.7% of US IP addresses had PFI
Experiments performed over
◦ FastTrack (Kazaa)
◦ Gnutella
◦ eDonkey

Summary
A considerable number of privacy exposures are present in current P2P systems, for both location and content privacy
Several solutions have been proposed to provide anonymity, but very few address content privacy
Flaws are present in the existing solutions

Questions?