This is the fourth post in my series about what I believe is the future of our online experience and identity. In part one I talked about why I believe the future is in an open peer to peer social network, in part two I described how and why that network needs to be based first and foremost on domain names owned by the individuals that make up the network, and in part three I talked about the potential that lies in using URIs as the foundation for a distributed database that ties everything together. This fourth post talks about how, once everyone is on the same grid, we can connect the dots and make them talk. I was going to wait to post this one until I had something functional to show, but lots of people had questions about the last post, so without further ado I’ll just go ahead and post it now.
So let’s go, biology-style, from the molecule down to the atoms of the network.
Here’s the ultra-simplified, 5000-foot overview of the network:
At this level, we have lots of users, each with their own domain name. Looks pretty much like your standard peer to peer network, and it basically is. Each of the nodes in this network runs what I have started calling a “private server”, that is, a server that runs at an individual user’s domain name. Or, as in the red node above, at hosted application servers.
The private server is a piece of open source software that acts as the center of the user’s online experience. Email, instant messages, social networking, and personal Web sites all flow through this single point of presence. The Web piece of this server is the meat of the user application itself because it provides both the user’s personal center of communication and the user’s public facing Web presence. The user can use it to administer their public Web presence, send and receive messages, launch applications, and a bunch of other fun stuff which I’ll talk about another time.
Applications and Authentication
This network is fundamentally a peer to peer social network. However, it needs to allow for applications to be built on top of it as well. In fact, the main reason I set out to figure out how all of this should work is that I want to write applications on top of such a network.
However, authentication works a little differently in this world. With everyone having their own domain, server, and set of URIs, we have to trust above all else the user’s private server, which is really the only server that the individual should need to deal with for authentication. Thus, instead of a user registering to use an application, as is typically done with Web applications these days, we need to turn this concept around: the application now needs to register with the user. The private server Web client has the ability to launch an application by making a request to the application server, through the private server, to verify the user’s identity and launch the application. Going this route, there is no risk of a user’s account being compromised at the application level because the application acts more as an extension of the user’s own private server. Applications essentially offer the user an extended set of capabilities, similar to how the MySpace add-on companies plug in to the user’s MySpace page (except that there will be no MySpace to send C&Ds to apps it doesn’t like). This model is more akin to standard desktop/server security than normal Web security: the user is authenticated automatically because he is the one launching the application from a trusted location, and the application doesn’t need to worry about authentication; it can focus on what it’s supposed to do.
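To make the "application registers with the user" idea a bit more concrete, here is a minimal sketch of one way a launch could be vouched for. None of this is a finished design: the shared registration secret, the token layout, and the function names are all assumptions I made up for illustration. The private server signs a short-lived token saying "this user is launching this application", and the application only has to check the signature.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical: a secret the application and the user's private server
# agreed on when the application registered with the user.
REGISTRATION_SECRET = b"secret-issued-at-registration"


def issue_launch_token(user_domain: str, app_domain: str, secret: bytes,
                       now: Optional[float] = None) -> str:
    """Private server side: vouch that user_domain is launching app_domain."""
    issued = int(time.time() if now is None else now)
    payload = f"{user_domain}|{app_domain}|{issued}"
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"


def verify_launch_token(token: str, app_domain: str, secret: bytes,
                        max_age: int = 60,
                        now: Optional[float] = None) -> Optional[str]:
    """Application side: return the user's domain if the token checks out."""
    try:
        user_domain, token_app, issued, sig = token.split("|")
        issued_at = int(issued)
    except ValueError:
        return None  # wrong number of fields or a non-numeric timestamp
    payload = f"{user_domain}|{token_app}|{issued}"
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison of the signature bytes.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    current = time.time() if now is None else now
    if token_app != app_domain or current - issued_at > max_age:
        return None  # issued for another application, or too old
    return user_domain


token = issue_launch_token("alice.example", "blogapp.example", REGISTRATION_SECRET)
print(verify_launch_token(token, "blogapp.example", REGISTRATION_SECRET))
```

The point of the sketch is only that the application never sees a password; it trusts the private server's signature, which matches the "launched from a trusted location" model described above.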
Connecting the Dots
Let’s zoom in on one of those private server to private server connections:
When you look at the network at this level, you can see that the private server has another server sitting behind it which I like to call the storage server. It’s responsible for storing user data and running the guts of what the private server is actually capable of. In fact, the private server does nothing more than serve Web content and proxy requests to and from the storage server. All application logic and data storage is done on the storage server.
Yes, it actually would be possible to combine the private and storage servers, but there are two reasons I wanted to keep them separated. One is so that the private server can stay lightweight enough to run on mobile devices, and the other is that it’s easier to load balance and distribute the heavy lifting if it’s decoupled from the user interface and external endpoints.
The private server is really just a collection of protocol endpoints. Its job is to normalize all incoming traffic (HTTP, SMTP, and XMPP) to an XMPP message that is either delivered to a connected client or shuttled to the storage server which processes the messages and does the heavy lifting. It’s also capable of routing messages in the opposite direction and delivering them via instant message, email, and Web. In fact, if you only wanted to use today’s technology on top of this server, you could easily host a Web site, host email, and host IM on the private server of a domain (with all the data securely persisted on the storage server). It would essentially be a small business server, but at a personal level. The social network and application layer sits on top of that layer.
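Here is a rough sketch of what that normalization step could look like, using only Python's standard library. The function name and the `origin` extension element (with its `urn:example:` namespace) are hypothetical; I only want to show that whatever protocol the traffic arrived on, the storage server ends up seeing an ordinary XMPP message stanza.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for recording which endpoint received the traffic.
ORIGIN_NS = "urn:example:private-server:origin"
SUPPORTED_PROTOCOLS = {"http", "smtp", "xmpp"}


def to_xmpp_message(protocol: str, sender: str, recipient: str, body: str) -> str:
    """Wrap incoming traffic from any endpoint in an XMPP <message/> stanza."""
    if protocol not in SUPPORTED_PROTOCOLS:
        raise ValueError(f"unsupported protocol: {protocol}")
    message = ET.Element("message", {"from": sender, "to": recipient, "type": "normal"})
    ET.SubElement(message, "body").text = body
    # Keep track of the original protocol so a reply can go back the same way.
    ET.SubElement(message, "origin", {"xmlns": ORIGIN_NS, "protocol": protocol})
    return ET.tostring(message, encoding="unicode")


# An email arriving over SMTP becomes a message the storage server understands.
print(to_xmpp_message("smtp", "bob@bob.example", "alice@alice.example", "Lunch on Friday?"))
```

Routing in the opposite direction would be the mirror image: read the recorded protocol back off the stanza and hand the body to the matching endpoint.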
The XMPP Plumbing
As you can probably see, XMPP is the plumbing behind most of this. The private server is both an XMPP client and an XMPP server, and the storage server only speaks XMPP. It’s XMPP that connects private servers together, and it’s XMPP that connects the private servers to the storage servers that sit behind them. There are a few reasons for using XMPP:
- XMPP already has widely available clients for every platform. They’re instant messaging clients, which means that they obviously can’t use all of the functionality I’m talking about, but there are still a lot of people using XMPP-based instant messaging clients out there, so that’s a lot of people that the private server can interact with out of the gate.
- Its addressing is URI-based. As I talked about in part three, this is crucial. It allows every node in the system to communicate with every other node.
- It has a large, mature set of standards. SIP calls? Personal eventing? RPC? Message security? It’s all there in the Jabber specs, ready to be used.
- It defines discovery and publication mechanisms that are already in use today. Although I believe most of the requests will be done directly against objects that are already known to the consumer, a service discovery mechanism is very important to any decentralized system.
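As a small illustration of the discovery point, here is a sketch that builds a service discovery request of the kind defined in XEP-0030 (the `disco#items` query). The namespace is the real one from the spec; the addresses, the `weblog` node name, and the helper function itself are just examples I picked.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Namespace defined by XEP-0030 (Service Discovery) for listing items.
DISCO_ITEMS_NS = "http://jabber.org/protocol/disco#items"


def build_disco_items_query(requester: str, target: str,
                            node: Optional[str] = None,
                            request_id: str = "disco1") -> str:
    """Build an <iq/> stanza asking `target` which items it (or a node) holds."""
    iq = ET.Element("iq", {"type": "get", "from": requester,
                           "to": target, "id": request_id})
    attrs = {"xmlns": DISCO_ITEMS_NS}
    if node is not None:
        attrs["node"] = node  # ask about one branch instead of the root
    ET.SubElement(iq, "query", attrs)
    return ET.tostring(iq, encoding="unicode")


# Ask alice's private server what lives under her (hypothetical) weblog node.
print(build_disco_items_query("bob@bob.example/home", "alice.example", node="weblog"))
```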
Finally, let’s zoom in one more time, this time on the private server’s XMPP server endpoint; this is where it ties into the URI as a database ID concept:
This diagram shows a very simplified example of how a blogging application might view the XMPP server endpoint. As you can see, there isn’t just one endpoint, there are an infinite number of endpoints. Well, to be precise, there are just as many endpoints as there are records on the storage server. When a request comes in to the private server for a URI, it asks the storage server if such a thing exists, gets it, and then serves it.
Here is where it starts to look like a distributed database. If you look at it, you can see that the first level objects below the base URI are top level objects such as a weblog, a contact list, and a calendar. At the level below that, you get sub-objects that belong to the upper-level object. For example, a blog post belongs to a blog, and trackbacks belong to a blog post. You could also say that people (the root) have a one-to-many relationship with blogs, which have a one-to-many relationship with posts, and so on. This just turns a database record into a URI resource, and changes the language of choice from SQL into XMPP. XMPP has built-in querying and discovery capabilities, so you have the ability to see what lies below any one of these levels.
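To show what I mean by a record becoming a URI resource, here is a toy in-memory stand-in for the storage server. The class, its method names, and the example paths are all hypothetical, and a real storage server would answer these questions over XMPP rather than through Python method calls. Each record is keyed by its full URI, and "what lies below this level" is just a lookup of direct children.

```python
from typing import Dict, List, Optional


class StorageServer:
    """Toy record store where every record is addressed by a URI."""

    def __init__(self, base_uri: str) -> None:
        self.base_uri = base_uri.rstrip("/")
        # The root record represents the person who owns the domain.
        self._records: Dict[str, dict] = {self.base_uri: {"type": "person"}}

    def put(self, path: str, record: dict) -> str:
        """Store a record below the base URI; its parent must already exist."""
        uri = f"{self.base_uri}/{path.strip('/')}"
        parent = uri.rsplit("/", 1)[0]
        if parent not in self._records:
            raise KeyError(f"parent resource does not exist: {parent}")
        self._records[uri] = record
        return uri

    def get(self, uri: str) -> Optional[dict]:
        """Return the record at this URI, or None if there is no such thing."""
        return self._records.get(uri.rstrip("/"))

    def children(self, uri: str) -> List[str]:
        """List the URIs exactly one level below the given URI."""
        prefix = uri.rstrip("/") + "/"
        return sorted(u for u in self._records
                      if u.startswith(prefix) and "/" not in u[len(prefix):])


store = StorageServer("http://alice.example")
store.put("weblog", {"type": "weblog"})
store.put("contacts", {"type": "contact-list"})
store.put("weblog/first-post", {"type": "post", "title": "Hello"})
store.put("weblog/first-post/trackback-1", {"type": "trackback"})

print(store.children("http://alice.example"))
print(store.get("http://alice.example/weblog/first-post"))
```

The one-to-many relationships fall out of the path structure: a post cannot be stored until its blog exists, and a trackback cannot be stored until its post exists.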
Another interesting artifact of organizing data this way is that not only does a URI identify where to locate something, but that endpoint also becomes useful immediately by the larger Internet. To refer someone to a specific contact, or a specific calendar event, or anything else, the URI is all you need. When they go there with their browser, the HTTP server element of the private server can return the HTML representation of that resource by requesting it from the storage server (using URL re-writing). The default HTML representation can easily be configured by wrapping an XSLT around the underlying resource’s XML representation, which allows for extensive customization as well.
Here’s a link to the next post in the series: Reinventing the Internet, part five: Decentralized network, centralized identity