© 2001 Mixter
 Peer-to-peer and the future of distributed applications


Index


What is P2P

Goals and applications for P2P

How P2P works

Security issues and solutions

The Hacktivismo project



What is P2P

Definitions

Peers: Computers communicating with each other mutually while playing identical roles. Also called nodes.

Peer-to-peer: A decentralized technology of peers that communicate with each other while playing identical roles on a network.

Client: Part of a P2P application which lets the user direct the information exchange and which initiates connections.

Server: Part of a P2P application not directing the information exchange, providing access to data or resources, and accepting incoming connections.

Neighbouring Node: A node that is directly connected to the node in question.

Metadata: Data describing other data, e.g.: filenames, indexes, content labels.


Characteristics and features

Decentralization - leads to increased performance, scalability and disruptiveness.

As a network grows bigger, every centralized point becomes a weak point in its infrastructure. Examples on the internet: the DNS, or transatlantic peering points.

As opposed to the traditional client/server model of standalone ftp/http/mail/etc. servers, peers are constantly connected to the network and constantly exchange information.

Peer-to-peer...



Goals and applications for P2P


Current applications of distributed and P2P technology

Area                        Since ca.   Examples
File-Sharing                3 years     Gnutella, Napster*
Distributed Computing       4+ years    Seti@Home**, Distributed.net**, DistributedScience**
P2P Search Engine           1 year      OpenCOLA, Some bots/agents (see botspot.com)
P2P Communication           4+ years    ICQ*, IRC*, Eggdrop, Aimster
Edge Services**             2 years     Intel's upcoming edge services
Device Intercommunication   1 year      Jini, Bluetooth
Anonymity/Anti-Censorship   1 year      Freenet, Onion Routing, Hacktivismo, Red Rover

* = Does need central or hub servers
** = Distributed but not "real" P2P: peers don't communicate directly with each other

Note: DDoS would have a ** here, because a decentralized DDoS tool doesn't exist yet. Making a stealthy and reliable P2P DDoS network would probably be a problem because of the low but constant traffic of P2P nodes.


Possible future trends affecting P2P

Future goals of P2P

Anonymity, anti-censorship services, and decentralized information hosting. Increasingly deployed in totalitarian countries with excessive censorship of the internet.

Easier ways to find shared content and data, and a reduced risk of losing data (by HD crash, viruses, intrusions, etc.), through multiple copies and search indexes.

Open business-to-business and logistics networks, working through "supply" and "demand" messages of various types, offering fast exchange of computer resources and materials.

Transferring previously signed cybercash with true anonymity, using anonymous aliases for identifying transaction partners. Providers of anonymous cybercash might develop such systems to avoid liability for their customers' actions and to attract more people.

General access to storage space, CPU cycles, content, even rendering capabilities of video cards or other hardware. Distributed computing requires better virtual languages for executing untrusted code and offering access to resources securely.

Net infrastructure. Today, if 1000 people in Europe request the same web page in the US at the same time, it will travel across the Atlantic 1000 times. Transparent collaborative P2P routing might join technologies like multicasting, edge services, and transparent caching in the future to improve the efficiency of bandwidth usage on the net.


Conclusions

The key problems of public distributed P2P systems are: mutual trust, application security and data integrity.

Things you can't or shouldn't implement as P2P are systems where users have to receive the same data simultaneously, and systems in which the user conceptually has to communicate with a central party.

Peer-to-peer and decentralization can make old protocols interesting again, and put them to new uses, for example, HTML, XML and HTTP. A P2P-extended web could help end users to contribute actively with their own content.

Decentralization and peer-to-peer models probably have more applications than we can imagine. In any area where scalability and capacity increasingly play a role, decentralized systems will be implemented eventually.


How P2P works

Different types of decentralized P2P applications and protocols have different structures of user data, and different ways of using that data, but they all share an underlying infrastructure.


Components of a typical P2P application

The client component

The server component

The data component

The routing component

Propagation of messages in a decentralized network is usually done according to the source/destination fields of the application protocol:

Routing and P2P topology

Messages can be broadcast, which means each peer sends them to all hosts it is directly connected to. Examples are pings and file search requests.

Messages can also be routed, which means they will be sent only to a particular location, after traveling through a chain of other peers.

Since the topology of a P2P network is semi-random and cyclic, redundant messages would otherwise circulate. To prevent this, each peer can keep a cache of recently seen message identifiers (one unique identifier per message), and the P2P protocol can carry a TTL value that decreases at every hop.
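The duplicate cache and the TTL can be sketched in a few lines. This is a minimal in-memory illustration, not the wire protocol of any particular P2P system: the Peer class, its method names, the cache size and the default TTL of 7 are all assumptions made for the example, and direct method calls stand in for network connections.

```python
import uuid
from collections import OrderedDict


class Peer:
    """Toy peer; the 'neighbours' list stands in for open connections."""

    def __init__(self, name, cache_size=1024):
        self.name = name
        self.neighbours = []
        self.seen = OrderedDict()   # recently seen message IDs, oldest first
        self.cache_size = cache_size
        self.delivered = []         # payloads handed to the local application

    def connect(self, other):
        self.neighbours.append(other)
        other.neighbours.append(self)

    def _remember(self, msg_id):
        self.seen[msg_id] = True
        if len(self.seen) > self.cache_size:
            self.seen.popitem(last=False)   # forget the oldest identifier

    def _forward(self, msg_id, ttl, payload, exclude):
        for peer in self.neighbours:
            if peer is not exclude:
                peer.receive(msg_id, ttl, payload, sender=self)

    def originate(self, payload, ttl=7):
        """Broadcast a new message; ttl is the maximum number of hops."""
        msg_id = uuid.uuid4().hex           # unique identifier per message
        self._remember(msg_id)
        self._forward(msg_id, ttl, payload, exclude=None)

    def receive(self, msg_id, ttl, payload, sender):
        if msg_id in self.seen:
            return                          # duplicate that came around a cycle
        self._remember(msg_id)
        self.delivered.append(payload)
        if ttl > 1:                         # hops left: pass it on, decremented
            self._forward(msg_id, ttl - 1, payload, exclude=sender)
```

In a ring of four such peers, a message originated by one peer is delivered exactly once to each of the other three, and the copy that travels all the way around the cycle is dropped by the identifier cache. With ttl=1 only the direct neighbours receive it.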


Security issues and solutions

General security recommendations

The authenticity of peers in a distributed network cannot be trusted unless reputation can be assured from a source outside of the conventional communication of the distributed network. Problem: this leads to some centralization.

A secure distributed protocol should generally not let a remote peer make you:

As additional protection for the user and his anonymity, the following general policies should be configurable in distributed applications:


Stealth and anonymity of peers

Spying and malicious parties have two basic ways of monitoring traffic of a decentralized P2P network:

Traffic analysis is used to analyze content and addresses on a public network, and to determine that a particular protocol or form of communication is actually taking place.

Peer-to-peer SSL connections can help obscure the content from outside parties (each node exchanges P2P headers and payloads with direct neighbors through SSL-encrypted channels).

Traffic analysis can go beyond analyzing the content. For example, it can find data belonging to an encrypted P2P protocol if it often sends data packets of the same size. Padding of data packets to a random size can prevent such kinds of analysis.
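Random padding can be sketched as follows. The frame layout (a 2-byte length prefix, the payload, then random filler) and the upper bound of 255 padding bytes are assumptions chosen for the example. The padding has to be applied before the frame is encrypted, so that the length prefix itself is hidden from an observer.

```python
import os
import secrets
import struct

MAX_PAD = 255  # assumed upper bound on padding bytes per frame


def pad(payload: bytes, max_pad: int = MAX_PAD) -> bytes:
    """Return length prefix + payload + a random amount of random filler."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 2-byte length prefix")
    pad_len = secrets.randbelow(max_pad + 1)      # 0..max_pad, unpredictable
    return struct.pack("!H", len(payload)) + payload + os.urandom(pad_len)


def unpad(frame: bytes) -> bytes:
    """Recover the payload from a frame produced by pad()."""
    if len(frame) < 2:
        raise ValueError("frame too short")
    (length,) = struct.unpack("!H", frame[:2])
    if length > len(frame) - 2:
        raise ValueError("length prefix exceeds frame size")
    return frame[2:2 + length]
```

Sending the same payload twice then produces frames of (very likely) different sizes, while unpad() on the receiving side returns the original bytes. Padding of this kind blurs packet sizes; it does not hide timing or the total volume of traffic.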

Eavesdropping means determining who is talking to whom, and what data is exchanged. It consists of methods to subvert existing principles of anonymity.

Anonymity means that two parties can communicate while one or both cannot be identified by the other. With the common internet transport protocols alone this is impossible, but peer-to-peer can make it possible.

The TCP/IP headers are like an envelope which bears the addresses of source and destination and contains the user data. In a peer-to-peer system, we can discard one (when routing) or both (when broadcasting) of those addresses by routing our messages through our own P2P application level protocol, replacing source and destination addresses with virtual addresses or aliases, thus delivering only the now anonymized content.
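One way to picture alias-based delivery is a reverse-path table: every node remembers through which direct neighbour an alias was first seen, and routed messages follow that chain backwards. The sketch below is an illustration under assumptions, not the protocol of a specific application; the Node class, its method names and the aliases are hypothetical, and method calls stand in for connections between neighbours.

```python
import uuid


class Node:
    """Toy node that is addressed by an alias instead of an IP address."""

    def __init__(self, alias):
        self.alias = alias
        self.links = []     # direct neighbours: the only real addresses known
        self.route = {}     # alias -> neighbour through which it was first seen
        self.seen = set()   # identifiers of broadcasts already handled
        self.inbox = []     # (source alias, body) pairs delivered locally

    def link(self, other):
        self.links.append(other)
        other.links.append(self)

    def broadcast(self, body):
        msg_id = uuid.uuid4().hex
        self.seen.add(msg_id)
        for n in self.links:
            n.on_broadcast(msg_id, self.alias, body, via=self)

    def on_broadcast(self, msg_id, src_alias, body, via):
        if msg_id in self.seen:
            return
        self.seen.add(msg_id)
        self.route.setdefault(src_alias, via)   # remember the way back
        self.inbox.append((src_alias, body))
        for n in self.links:
            if n is not via:
                n.on_broadcast(msg_id, src_alias, body, via=self)

    def send(self, dst_alias, body):
        """Send a routed message to an alias learned from a broadcast."""
        self._route(self.alias, dst_alias, body)

    def on_routed(self, src_alias, dst_alias, body):
        if dst_alias == self.alias:
            self.inbox.append((src_alias, body))
        else:
            self._route(src_alias, dst_alias, body)

    def _route(self, src_alias, dst_alias, body):
        next_hop = self.route.get(dst_alias)
        if next_hop is not None:                # unknown alias: drop silently
            next_hop.on_routed(src_alias, dst_alias, body)
```

In a chain alice, bob, carol, a broadcast from alice lets carol answer to the alias "alice" although carol's routing table only points at bob. The message itself never carries alice's network address; only bob, as alice's direct neighbour, knows it.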

For senders and requesters of data to remain anonymous, the remaining problem is that each node always knows the IP addresses of its neighboring nodes, the nodes which are directly connected to it.

Therefore, if your neighboring nodes want, they can still monitor which files you request or send and correlate them with your IP address. A spy - a government agent, your boss, the RIAA, or whoever - could contribute his own hub servers to the P2P net, ensuring that many nodes directly connect to his node(s), and then still keep a log of who sends and requests what.

There are a few workable, but imperfect solutions to this dilemma:


Confidentiality and integrity of distributed data

A weak point of open P2P networks, in which anyone can contribute, is the integrity of the data exchanged. Even with cryptographic or other specific protection, such as in current distributed computing projects, it is questionable whether a user can be kept from reverse engineering his client or its traffic and manipulating it.

Providing data through a distributed network is not hard, but when it comes to offering authentic data, cryptographic signatures or other forms of guaranteeing authenticity become a necessity.

SSL-through-SSL using trusted peers, a technology developed by Hacktivismo, and first implemented by our team member Paul B., is one possible solution to this problem. It also solves the problem of protecting secrets while still being able to send them over untrusted peers.

Public keys or certificates must be exchanged between Peer 1 (Requesting Peer) and Peer 5 (Trusted Peer). While the protocol header of the P2P packet can and must be read by each peer, the payload part is encrypted with the trusted peer's key and stays encrypted while being routed by middlemen peers.

The privacy of an SSL-through-SSL transmission is moved beneath the application layer and limited to the payload, which makes it application independent.

Trusting SSL in a distributed environment

To prevent man-in-the-middle attacks against SSL, the trusted peer's key or certificate itself must be obtained through a channel that ensures authenticity, e.g. it must be downloaded from a secure, certified web site.

If certificates or public keys in a P2P network are neither distributed through out-of-band methods nor at least signed by a trusted CA, the identity of the key-holding peer cannot be trusted. Such a key can then at best be used for traffic encryption against the monitoring efforts of third parties *outside* of the peer-to-peer network.
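A simple form of out-of-band verification is to compare the fingerprint of the certificate a peer presents with a fingerprint obtained through a trusted channel. The sketch below assumes the certificate is available as raw DER bytes and that SHA-256 is used as the fingerprint hash; the function names are hypothetical and not part of any SSL library.

```python
import hashlib
import hmac


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(cert_der).hexdigest()


def is_pinned(cert_der: bytes, trusted_fingerprint: str) -> bool:
    """Check a presented certificate against an out-of-band fingerprint.

    trusted_fingerprint is assumed to be a hex string, optionally written
    with colons, that was obtained through a channel ensuring authenticity.
    """
    normalized = trusted_fingerprint.replace(":", "").strip().lower()
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(fingerprint(cert_der), normalized)
```

A peer would call is_pinned() on the certificate it received during the handshake and refuse to treat the other side as the trusted peer if the check fails. This only establishes that the key is the expected one; whether the holder of that key deserves trust remains a separate question.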


The Hacktivismo project


The Hacktivismo team has been brought together by the cDc. We are developing a distributed application with the goal of defeating censorship and surveillance technologies.

We have decided to do this because of growing international censorship, to give people access to free information, and to prevent legal actions against people in unfree countries who try to get access to politically uncomfortable material.

At a certain point of development, Hacktivismo will go open source and also publish detailed information and documentation.

In general terms, our program provides a decentralized peer-to-peer network for doing fully anonymous proxy downloads of files and web sites, instead of locally storing and sharing files.

Our goal is to make all traffic anonymous and stealthy enough to bypass any firewall or censorship system from the "inside", i.e. from totalitarian countries.

We are working with a handful of well-known developers and other people, as well as human rights groups, on the goal of finishing and safely distributing this application to the people who need it.