Randpeer - Research
We want to make fully distributed, totally serverless, fully scalable, globally searchable, bandwidth efficient, robust, encrypted, stealthy, preferably anonymous p2p systems.
We know our demands are pretty extreme, but here goes.
Totally serverless
We want to have NO servers at all for several reasons.- Servers cost a lot of money. Serverless networks on the other hand can handle millions of nodes for about 0 cost.
- Servers usually don't scale well. When a network becomes too popular one has to add more servers and then make those servers cooperate so one don't get a separate network per server. That is often very tricky and costly.
- Servers can stop functioning for several reasons. For instance power outages or simply because the one who created the network is not interested in it anymore and turns of the server. Serverless networks can go on forever as long as there are people using the network.
- Servers can be attacked with for instance massive "denial of service" attacks. Serverless networks might be much harder to bring down.
Fully distributed
Primitive p2p algorithms often use a centralistic approach where some nodes act as a kind of servers/bosses/leaders. We don't like that since those nodes often become weak spots and also those nodes often have to work too hard. So we prefer fully distributed solutions where all nodes are as equal as possible. Although we allow some nodes to be supernodes since all nodes are not created equal. Or if you will, some nodes are more equal than others. But note, there is a big difference between using 10-30% of the nodes as supernodes compared to using a single node at the top of a centralistic algorithm.Scalable
We try to find algorithms that work well in networks with at least 100 million nodes. So for instance for the search function we want a user to find what he looks for even if only one node out of 100 millions has it! That is what we call fully scalable and globally searchable systems.Bandwidth efficient
Since most Internet users in this world are NOT using multi megabit Internet connections that are always online we design our algorithms so they use as little bandwidth as possible. Besides, even those users with multi megabit bandwidth probably want to be able to run more than one communication application at a time.Robust
Most home users don't have computers that are online 24/7 so we want to be able to handle high rates of churn. (When nodes go on and offline pretty often.) Also home user nodes sometimes suddenly crash or behave strange. Another problem is that any given node can usually only talk directly to say 99% of the other nodes due to badly configured ISP routers or Internet congestion. So when designing p2p algorithms one has to take all these "Internet imperfections" into account and make the algorithms robust.Encryption and stealth
We don't want ISPs and others to be able to (easily) detect what application you are running. And they should definitely not be able to see what data you are transferring. So of course we make extensive use of encryption and some other tricks.Anonymity
Freedom of expression and freedom of information are necessary for democracy to work properly and is a basic human right. But from time to time those rights are attacked. Sometimes big corporations and big organisations "legally" harass ordinary citizens to quiet them (sue or threaten to sue them). And in some countries the government put people into jail for what they publish or read. And sometimes extremists beat up people that publish opposing opinions. So we need both anonymous downloading AND anonymous publishing/uploading. Although in our current research and designs we have put low emphasis on anonymity.
Randomised algorithms
P2p networking is chaotic. We can create order in chaos by using randomness. Because near perfect randomness means near perfect order.More or less by accident we stumbled on a special type of solutions to p2p problems, a special family of algorithms that is called "randomised algorithms". Some of them are also truly chaotic. In the centre of all those algorithms is a pseudo random number generator. Most of these algorithms are also of the type that is called "emergent networks". This kind of algorithms is by their nature very scalable. This means we usually don't have problems handling millions of nodes. Instead our algorithm problems usually begins when we have less than about 100 nodes. Thankfully it is easy to simulate less than 100 nodes, which makes it easy to test the algorithms. Of course we do simulate up to 4000 nodes too...
So our algorithms are pretty different from most of the algorithms that has been designed/invented by others.
Target applications
Since we are doing basic p2p algorithms research we keep many different "target applications" in mind when we identify problems and investigate algorithms. Applications such as filesharing, chat, instant messaging, Internet telephoning, Internet games, radio and TV (sent from home users computers), distributed calculation systems and many more applications.P2p problems
We have identified a number of problems in p2p networking and try to find algorithms that solve each of them. Occasionally we even find algorithms that solve several of them at once. Remember that we are doing it fully distributed and totally serverless before you say these problems are easy to solve.- Keeping the network together, avoiding netsplits.
- Finding the network when a node is going online without using any central server, not even DNS. Even when joining the network for the first time.
- Handling broadcasts. Among other things to prevent broadcasts from flooding the network with too much messages when many nodes broadcast at the same time and to broadcast in a bandwidth efficient way.
- Counting number of nodes online. And counting number of unique nodes per month.
- Do network wide statistical value aggregation. Like average (medium) values, median values, percentile values and maximum and minimum values.
- Network total numbers. Like total number of Megabytes online etc.
- Network time. That is, that all nodes agree on what date and time it is now. Within a minutes accuracy is acceptable but we think we can get it down to some seconds accuracy.
- Subnetting (or subgrouping if you prefer). That is, to announce and find separate subnets (groups of nodes) on "the mother net". For instance application subnets if you run several applications on the same mother net. Then under the application net you might want to run subnets for each service like one subnet for each chat room and each search database etc. (For instance one broadcast search database and one DHT database.) And be able to plug in new services and subnets later on.
- Announcing and finding items. That is, create distributed databases/lookup systems/search systems. Such as broadcast search and DHTs. Such search systems have a whole range of problems of their own.
- Transfer files in an efficient, robust and sabotage proof manner using such things as multi source download, hash trees/lists and perhaps even using advanced file swarming.
- Encryption and security. How to handle keys and identities and how and what to encrypt and sign etc.
Well, that was at least a partial list of the problems and algorithms we have been working on.