Ok, I'm going to attempt to explain why open source software is better than closed source. For the libre-software folks in the crowd, I'll be addressing copyleft and the four freedoms in a following post.
I'm going to explain this using an analogy, and what I hope is an apt one: that of a recipe. This is something everyone is familiar with and has probably worked with at one point or another.
Source code is a recipe for how to make a program. Depending on the language it is either "baked" (compiled) or "eaten as is" (interpreted).
Like a recipe the source code is just a list of things to use (resources) and instructions on how to use them (the program).
Just as many recipes need to be baked, and one can't easily identify what went into the recipe after baking, many modern programs are compiled, and what comes out of the compiler looks way more like cake than eggs, flour, sugar, vanilla, etc. Therefore it is very hard to work on a program after it is compiled. Just imagine trying to add more oil to a cake that came out too dry after it's baked. It just won't end well.
Ok, now that we have the analogy established, imagine a world where recipes were all legally protected secrets. The only food you could buy was pre-cooked or ready-to-eat. Hate the flavour? Too bad. Want to add blueberries? Sorry can't do that, or at least not in a meaningful way. Ovens would be for heating alone just as most people's computers are just for using a browser.
Worse still, if you did figure out how to make a brownie, somehow found the ingredients and the tools to use them, and then, GASP! used your oven for cooking, you'd probably promptly get sued by the local big brownie concern for stealing their secrets. And because the recipes are secret, it would be very hard, if not impossible, to prove that you hadn't.
In this world almost no one could help you with your brownies, as only a select few know how to cook or what cooking even is, other than "That thing specially trained people do for big companies".
This is the world of closed source. This is the world of the late 80's and early 90's before the open-source movement. There were a few small pools of hobbyists keeping programming for fun alive but mostly all the recipes had disappeared or were very old and stale.
Now imagine a world where everyone publishes their recipes. And because of this, the tools to use the recipes are readily available. If you didn't like Magoo's chocolate cake, you could download the recipe, fix it, and bake your own. Now, depending on the license that Magoo attached to the recipe, you may or may not be able to tell anyone about how you fixed it, and may or may not be able to sell the better cake you made. This is where software freedom and copyleft come in, which I'll talk about in a later posting.
In this world there would be lots of people cooking, sharing ideas on how to cook and how to cook better, and coming up with new and interesting things. People could also look over Magoo's recipes and say "Too much salt in cake #3, it should be 1 teaspoon not 1 tablespoon". And people could make sure Magoo wasn't including rat poison, or making a frosting of raw eggs, sugar and lard that'd go off in a day and lead to people getting sick and dying, thus making everyone safer.
Just imagine if VW's emission control software had been open source. People would have looked at it and said "WTF! What are you doing?" Now I know some of you are saying "Ah, but they could publish a good recipe and then bake the bad one". True, but it'd still be a lot easier to catch them as you could bake the recipe they published and then compare it to the pre-baked version. In the VW example the pre-baked version would somehow, mysteriously have way better mileage. And because people know how to cook, they'd know there are only a couple of ingredients that could be fiddled with to achieve that result.
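For the curious, here is a minimal sketch of what "bake the published recipe and compare it to the pre-baked version" can look like in practice. It is only an illustration: the function names and the byte strings standing in for binaries are made up, and in real life the comparison only works if the build is reproducible (same compiler, same settings), which is an assumption here.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of a blob of bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_published_recipe(built_from_source: bytes, shipped: bytes) -> bool:
    """True if the program we baked ourselves is identical to the shipped one."""
    return fingerprint(built_from_source) == fingerprint(shipped)


# Hypothetical stand-ins for real compiled programs.
my_build = b"cake baked from the published recipe"
honest_shipped = b"cake baked from the published recipe"
fiddled_shipped = b"cake baked from the published recipe, plus a secret ingredient"

print(matches_published_recipe(my_build, honest_shipped))   # True
print(matches_published_recipe(my_build, fiddled_shipped))  # False
```

If the fingerprints differ, you don't yet know what was changed, only that the cake in the box is not the cake in the recipe.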
VW is not the only one hiding things in their closed source software. Most programs that you find in "App Stores" are closed source and many of them do their best to take your personal information, often without permission. These activities would be plainly visible if people could look at the source code, as would many vulnerabilities or things like back doors in the program.
This is where we are hopefully heading. Many programs are now open source; many are still secret. We will probably never get to a 100% open source world. But as people learn more about the open source movement, and realize that programming is just a learned skill like cooking, instead of seeing it as a magical "something" that only rare geniuses can do, there will be more and more pressure for companies to open their source code, or for software repositories like "App Stores" to include a way to also download the source code for a program.
I am writing this blog entry to explain, to those who may not know, the three models of doing things on the Internet, and also why it is important to understand them, to pay attention to them, and to choose services and software that use the most appropriate model.
The three models are:
- Centralized
- Federated
- P2P - Peer-to-Peer
They all have their strengths and weaknesses and, more importantly, they all have an impact on your rights and freedoms.
Centralized
Centralized is the most common. This is your Google, Facebook, Pinterest, Twitter, Amazon, eBay, your bank, etc.
The centralized model has big servers run by private interests (the site owners or company), located in some place of their choosing, which you connect to using a browser or mobile app. Typically all data is stored on the remote server.
This model is perfect for things like banking or online shopping. Just like in the real world, you go to the place of business to shop or bank. It is also very appropriate for information-type websites: news, stocks, weather, sports scores, etc.
The important thing to remember about this model is that you do not control the server and therefore you do not control the data on the server. For the sites mentioned above, no biggie. For things like Facebook, Twitter, etc. that live and die on user-generated content (your stuff, your data) it's a huge biggie. Once your data is on their server it is usually considered "their data". The user agreements of such sites almost always stipulate that they can do whatever they want with what you upload.
The centralized model is also the easiest for the government to spy on, censor, control, and shut down. Because all the data on the server is owned by Company X, all the government has to do is legally compel Company X to hand it over. Encryption like HTTPS doesn't help here, because it only protects data in transit, not data sitting on the server. Governments can also just seize and shut down servers they don't like. And if Company X gets tired of running the server, it and all your data will just go poof and disappear from the Internet.
Considering all these things, it is easy to see that the centralized model is both the least free (as in your rights and freedoms) and the most fragile. A lot of service providers out there could be wiped off the Internet by one good flood or other disaster happening to their main server.
Federated
This model is less known and understood by the average person today, but it is actually the most common model used in the early days of the Internet. In this model, instead of one server (or server farm) owned by one company, there are many small servers that all talk to each other (federate) to provide a service. This model is used for E-mail, IRC, Usenet, XMPP, UUCP (yes, I know that is ancient and deprecated), and newer systems like pump.io and Tor. The strength of this system lies in the fact that no one owns the system. Sure, they may own a server or two, but no one owns the whole system. If a server goes down you just switch to another one.
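As a rough illustration of "if a server goes down you just switch to another one", here is a tiny sketch. The server names and the pretend outage are made up, and a real client would check reachability over the network rather than look in a set.

```python
def first_reachable(servers, is_up):
    """Return the first server for which is_up(server) is true, else None."""
    for server in servers:
        if is_up(server):
            return server
    return None


# Hypothetical servers in one federated network, and a pretend outage.
servers = ["chat.example.org", "chat.example.net", "chat.example.com"]
down = {"chat.example.org"}

print(first_reachable(servers, lambda s: s not in down))  # chat.example.net
```

To make the service unusable, someone would have to put nearly every entry in that list into the "down" set.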
This model is much harder to censor, shut down or control. Servers can live in different countries with different laws and governments. Typically the software to run these kinds of servers is small and easier to install and maintain. This means that anyone with a bit of work and understanding can set up a server and become part of the network of servers. If a government wanted to shut down the service they'd have to block access to every single server, or a majority of them, to make the system unusable. Not so easy. Spying-wise it is harder too. If the government compelled Google to hand over all E-mails (you can be pretty confident that they have/are) it doesn't get them any mails going from email@example.com to firstname.lastname@example.org.
Users typically use some sort of "client" software to connect to their server of choice and interact with the system as a whole. They don't have to worry about what server their friend is on because all servers in the system talk to one another. So email@example.com can email firstname.lastname@example.org, no problem, no worries. As you can see from that example, the part that comes after the @ actually refers to what server someone is on in the system. The same is true for XMPP addresses, SIP (proper VoIP) addresses, WebFinger addresses (pump.io), etc.
There is still the problem of your data on their server, but as a federated system passes the data from server to server, people running federated servers tend to act more like custodians of the data than owners of it. People tend to run these types of servers to offer a public service. OK, well, not the Googles of the world. But places like Riseup or Ostel.
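To make the "part after the @" point concrete, here is a minimal sketch that splits an address into who and where. The function name and the sample address are made up for illustration, and it ignores protocol-specific extras (an XMPP address, for example, can carry a resource after a slash).

```python
def split_address(address: str) -> tuple[str, str]:
    """Split 'user@server' into (user, server) at the last '@'."""
    user, sep, server = address.rpartition("@")
    if not sep or not user or not server:
        raise ValueError(f"not a user@server address: {address!r}")
    return user, server


# Hypothetical address; the second element is the server to deliver to.
print(split_address("sue@example.org"))  # ('sue', 'example.org')
```

The client only needs the second half to know which server to hand the message to; that server worries about the first half.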
Peer to Peer (P2P)
In this model the client software is also the server. Every client on the system can talk to and connect to every other client on the system. These systems are highly dynamic (clients coming and going all the time) and tend to be very connection and bandwidth heavy, because everyone has to help move everyone else's data around.
In a P2P system no one owns the data; it just lives out there, bouncing from client to client. This means that for most P2P systems you have to be willing to give resources to the network. You have to let the P2P network use some of your bandwidth and disk space.
As you can imagine, this is the hardest model to censor or shut down and also, if it is done right, the hardest to spy on. Because of this, many people see the P2P model as a freedom and privacy panacea. But the truth is this isn't the best model for all things. I don't want to be trading huge chunks of bandwidth and disk space just to see what the weather is going to be like tomorrow. Also, because of the dynamic nature of the network and the problem of where stuff is stored relative to who is online, the P2P model isn't really the best for "store and forward" applications like E-mail. Sure, there are things like Bitmessage, but if Bob isn't around for a day or two after Sue tries to send him a Bitmessage, her software will have to try sending it again. If they have really bad timing it could take months for Bob to get the message. Whereas in a federated system Sue would send the data to her server of choice, which would send it to Bob's server of choice, which would hold on to it till Bob came online.
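Here is a toy sketch of the "store and forward" idea from the Sue and Bob example. Everything in it (the class, the method names, the server names) is made up for illustration; real mail servers speak actual protocols and deal with retries, spam and much more.

```python
from collections import defaultdict, deque


class Server:
    """A toy federated server that holds mail until the recipient connects."""

    def __init__(self, name):
        self.name = name
        self.mailboxes = defaultdict(deque)  # user -> queued messages

    def accept(self, user, message):
        """Store a message for a local user, whether or not they are online."""
        self.mailboxes[user].append(message)

    def relay(self, other, user, message):
        """Forward a message to the recipient's server."""
        other.accept(user, message)

    def collect(self, user):
        """Hand over everything queued for the user and empty the mailbox."""
        queued = list(self.mailboxes[user])
        self.mailboxes[user].clear()
        return queued


# Hypothetical servers for Sue and Bob.
sues_server = Server("sue.example")
bobs_server = Server("bob.example")

# Sue sends while Bob is offline; his server keeps the message for him.
sues_server.relay(bobs_server, "bob", "Lunch on Friday?")

# Bob shows up days later and picks up his mail.
print(bobs_server.collect("bob"))  # ['Lunch on Friday?']
print(bobs_server.collect("bob"))  # []
```

The point is that Sue's software is done as soon as her server accepts the message. Nobody's client has to be online at the same time as anybody else's.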
People in remote locations or developing countries may not have the bandwidth or disk space to share. There are people in the area where I live for whom a P2P system could easily eat their monthly data allotment in a day or two.
Even though a P2P system that used good encryption for transfer and storage would be very hard to spy on, these systems are complicated beasties and are prone to other forms of attack: resource depletion; evil clients that say they'll forward data but then throw it away, thus vanishing it from the network; governments running a ton of clients to analyze the traffic flow and figure out who is talking to whom, or even who is who; and so on.
It is also important to note that many P2P systems like BitTorrent and Bitcoin do nothing to hide your IP address, so there is no anonymity. Many people are confused and think that P2P automatically means anonymous.
Which is Best
There really is no one best model. The important thing is to try to pick the services that are using the right model for the right job, and to be aware of the trade-offs:
- More rights but more resources (P2P) - Heavy on bandwidth, CPU time, and disk space, but no central server, just other people using the software.
- No rights but fast, easy and light on resources (centralized) - Where people running the service control everything: the rules, your data, who has access and how, etc.
- A bit of a mix (federated) - Where people running the many servers take the bandwidth and resource hit.
Things to watch out for are centralized sites that are trying to own and control your data, and a newer trend of big companies trying to push the workload onto users by using P2P technologies. Netflix has eyed this to take some of the load off their servers by making people watching a show also stream that show to other people watching it. Great for them, terrible for your bandwidth.
Pay attention to which model a service is using and you'll have a much better idea of how it affects your rights, freedoms, data, bandwidth, and disk space.