Unix 2.0

I quite like the Web 2.0 revolution, both from the interactive, crowd-sourcing, social aspect, and the technical aspect. It’s the technical side of things that I’ve found particularly interesting lately, especially after reading about some of the possibilities of Google Chrome and Google Gears and apps like AjaxTerm or ChipTune. There’s a whole lot of development thought and effort going into moving applications onto the web, and it’s probably pure hubris to try to guess how it will all end up, but hey, why would that stop a blogger?

I’ve always quite liked the Unix philosophy (UHH ownership notwithstanding), and there are some really interesting parallels between it and aspects of the AJAX revolution.

One of the most elegant features of Unix’s X Windowing System is the separation between the X Server and X clients — which in turn allows your applications, as X clients, to easily be pointed at varying X servers — so you can have them display on a different computer, or a virtual desktop, without major challenges. On top of the basic X system, there are a number of extensions servers can support to provide more powerful or faster access to the display/hardware, and there are a number of toolkits, such as gtk or qt, that make it easier to do standard things with X (like create menus and toolbars). And being a standard protocol, that means you can run an X server on your Windows or OS X machine, and display programs running on a remote Unix box just as if they were running locally.

The major downside with this, though, is that the X servers are largely “dumb”: they’ll provide standard extensions which help with some things, but otherwise it’s up to the client to do all the work. If you’re running the server and client on the same machine, that doesn’t matter. But if the client’s an ocean away, that can introduce a lot of lag between simple UI actions and their responses, which makes for a bad user experience.

And Web 2.0 apps work kinda similarly. Rather than an X server, you have a browser; rather than an X client, you have a website; rather than extensions, you have plugins; and rather than toolkits you have Javascript libraries. And it doesn’t matter whether you’re running Windows or OS X or Unix, or which of those the website’s using — it’s all compatible.

But there are two additional features Web 2.0 has over X: the first is that you can specify your layout with a powerful (and, with CSS, themeable) language, in the form of HTML; and the second is you can actually push smarts to the client machine without requiring the user to install plugins/extensions, by making use of Javascript and dynamic HTML.

I think that benefit alone makes a good case for considering the Unix desktop of the future to be web-based: in theory at least you can duplicate the current features of X, while at the same time gaining some quite interesting benefits. (And since I’ve already nominated 2007 as the year of the Linux desktop, I’ve got no reason to worry about discarding current technology as obsolete :)

But what does that mean? I think the aforementioned AjaxTerm provides a good example of what existing Unix apps could be like if converted to web apps. Compare it to the equivalent standard Unix application, xterm. Normally, you would invoke the command (eg from an existing shell, or double clicking an icon), it would find the $DISPLAY setting, connect to the X server it references, and start sending data to open a window and display the output of the commands you’re running in the terminal.
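To make the $DISPLAY step a bit more concrete, here is a rough Python sketch of how a client might pick that setting apart. The [host]:display[.screen] form and the 6000 + display TCP port are the usual X conventions; the names Display, parse_display and tcp_port are made up for illustration, and this is not how xterm itself is written.

```python
import re
from typing import NamedTuple


class Display(NamedTuple):
    host: str     # empty string means "the local machine"
    display: int  # display number, the part after the colon
    screen: int   # screen number, 0 when omitted


def parse_display(value: str) -> Display:
    """Split a DISPLAY value of the form [host]:display[.screen]."""
    match = re.fullmatch(r"([^:]*):(\d+)(?:\.(\d+))?", value)
    if match is None:
        raise ValueError(f"not a DISPLAY value: {value!r}")
    host, display, screen = match.groups()
    return Display(host, int(display), int(screen or 0))


def tcp_port(d: Display) -> int:
    """X servers reached over TCP conventionally listen on 6000 + display."""
    return 6000 + d.display
```

So parse_display(":0") describes display 0 on the local machine, while a value with a hostname in front points the same program at an X server somewhere else, which is exactly the indirection a web-based equivalent would need to reproduce.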

Now suppose you want the same experience with AjaxTerm — that is, you’re logged in somewhere, and you want a new terminal window to appear, and you either type “ajaxterm” at a prompt, or double click the “ajaxterm” icon. In order for that to work:

  1. Your program needs to start providing content via a URL that’s accessible to the client.
  2. Your program needs to contact the client’s browser, and tell it to open a new window, pointing at that url.
  3. Your program needs to supply CSS and JavaScript files necessary to provide the correct user experience.
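Those three steps can be sketched with nothing but the Python standard library. This is only an illustration under some assumptions: the page contents, the names PAGES, AppHandler, start_app and main are invented here, the "terminal" is a static placeholder rather than anything AjaxTerm actually serves, and it assumes the browser runs on the same machine as the program.

```python
import threading
import webbrowser
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical stand-in for the app's UI. Step 3: the program itself
# supplies the HTML, CSS and JavaScript the browser will need.
PAGES = {
    "/": ("text/html",
          "<link rel='stylesheet' href='/app.css'>"
          "<pre id='term'></pre><script src='/app.js'></script>"),
    "/app.css": ("text/css", "#term { background: black; color: white; }"),
    "/app.js": ("application/javascript",
                "document.getElementById('term').textContent = '$ ';"),
}


class AppHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path not in PAGES:
            self.send_error(404)
            return
        content_type, text = PAGES[self.path]
        body = text.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the invoking terminal quiet


def start_app():
    """Step 1: serve on a free loopback port and return (server, url)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), AppHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_address[1]}/"


def main():
    server, url = start_app()
    webbrowser.open_new(url)  # step 2: ask the user's browser for a new window
    try:
        input("Press Enter to quit\n")
    finally:
        server.shutdown()
```

Calling main() is the moral equivalent of typing “xterm”: no daemon has to be running beforehand, because the program brings its own web server along and picks whatever port happens to be free.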

None of those seem terribly difficult things to do, at least in theory. If you’re already in a browser, and doing the equivalent of clicking a link, they’re actually pretty easy. But doing it with the same features as xterm does create some challenges, particularly in avoiding either needing a daemon running prior to the invocation of ajaxterm, or having the connection between the program and the display be public (ie, DISPLAY=:0, versus http://example.com/rootshell).
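One possible way to keep that connection private, offered as a guess rather than a known solution, is to put a long random token in the URL and refuse anything that doesn’t present it, loosely in the spirit of the cookie that xauth manages for X. The function names below are hypothetical.

```python
import hmac
import secrets


def make_private_path() -> str:
    """Return a URL path containing an unguessable random token."""
    return "/" + secrets.token_urlsafe(32) + "/"


def is_authorised(request_path: str, private_path: str) -> bool:
    """True only if the request path sits under the private path.

    compare_digest avoids leaking how many leading characters matched.
    """
    prefix = request_path[:len(private_path)]
    return hmac.compare_digest(prefix.encode("utf-8"),
                               private_path.encode("utf-8"))
```

The program would generate the path once at startup, hand the full URL only to the browser it opens, and check every incoming request with is_authorised. That doesn’t make the URL secret from anyone who can watch the traffic, so it only really helps alongside a loopback-only or encrypted connection.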

The next question is whether the technology is advanced enough to actually provide a good user experience, but with demos such as the aforementioned ChipTune and the ubiquitous Google Maps, that seems easy to answer in the affirmative.

How hard it is to make good web apps is the next challenge — there are certainly plenty of horrible web applications out there to demonstrate it’s not trivially easy. There are a few challenges:

  1. Designing good HTML and CSS based interfaces
  2. Writing basic user interface code
  3. Coding in JavaScript
  4. Communicating with the web server
  5. Avoiding web security issues (XSS, etc)
  6. Avoiding limitations imposed by web sandboxes

I’ve not had much luck finding good ways to design nice HTML+CSS sites; though to be fair I have spent years deliberately trying to avoid acquiring that skillset. And for webapp development, there’s the added difficulty that most of the Javascript libraries are fairly new — so they’re still in development themselves, and hoping for a Glade to your preferred Javascript Gtk (script.aculo.us, YUI, etc) seems to be a bit forlorn.

That Javascript is essentially a specialist web language naturally doesn’t help those of us who aren’t already specialist web programmers, but there seem to be a few effective solutions to avoiding coding everything in Javascript (Google Web Toolkit and Pyjamas compile Java and Python (resp) to Javascript, and from what I gather Rails and Pylons and similar will let you create a whole bunch of standard UI elements, without having to touch Javascript directly).

Communicating with the web server and dealing with security issues is either easy (you’re not doing anything fancy, and the browser keeps you and your users safe), or potentially hard (if you’re trying to work around browser limitations or set up collaborative sites). But ultimately, whatever you’re trying to do is either no harder than it would be any other way (eg, designing multiuser programs), or just needs some help from a plugin, and Google Gears seems to be doing a particularly good job covering the bases there.

So congratulations, we just argued fairly convincingly that it’s possible to write applications for the web — who would have thought? The real question is whether we can make them “Unixy”, or for that matter “open sourcey”.

One nice aspect of Unix applications (and particularly open source ones) is that they’re not really tied down to a particular machine. If you’ve got one machine with an app installed, you can generally quite happily copy it to another machine and, as long as you’ve got the necessary libraries installed, just run it. And heck, this is what vendors and distributors expect, and if you want an app, the normal way to go about using it is to install your own copy locally and then use it.

Webapps are generally the exact opposite of that: they’re not only generally tied into other applications, and require configuration changes and a daemon (apache, tomcat, etc) running before you can use them, but they’re often not distributed at all, and only run by the company that developed them. If you want to use the app, you send your data to them, they store it, and you just get to see what they let you see. There are two reasons for that. One is just business: if you control the app, you control the users, and maybe you can make money that way. The other is that serving a webapp is actually hard: you need to set up and configure a web server, work out a security policy to prevent it being accessed by random people, provide storage for the data, and manage updates and such. One of those, at least, is solvable.

But having a forced separation between your screen and your data is useful too — if your data’s not actually on your laptop, it’s not as big a problem if your laptop gets stolen, or broken. And when you have to explicitly contact a server to write anything at all (browsers not providing local storage for webapps, generally) you get that feature for free. If you could make it an option where exactly your data gets put — so that the webapp that is Gmail, eg, could be told to access and store your emails on google’s servers, or on your own computer, or even on Amazon’s S3; you’d have a really powerful system, that suddenly feels not only a lot more free than current webapps, but also gives users a lot more freedom than current open source desktop apps.
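Here is a small Python sketch of what “make it an option where your data gets put” could look like from the application’s side. Everything in it is hypothetical: Storage, MemoryStorage, LocalStorage and Mailbox are invented names, and MemoryStorage merely stands in for a remote service, since a real one would need network calls and credentials that this post doesn’t cover.

```python
import os
from abc import ABC, abstractmethod


class Storage(ABC):
    """The only thing the app knows about where its data lives."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class MemoryStorage(Storage):
    """Stand-in for a remote service (the provider's servers, say)."""

    def __init__(self):
        self._items = {}

    def put(self, key, data):
        self._items[key] = data

    def get(self, key):
        return self._items[key]


class LocalStorage(Storage):
    """Keeps everything in a directory on the user's own machine."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        # Hex-encode keys so they can never escape the root directory.
        return os.path.join(self.root, key.encode("utf-8").hex())

    def put(self, key, data):
        with open(self._path(key), "wb") as f:
            f.write(data)

    def get(self, key):
        with open(self._path(key), "rb") as f:
            return f.read()


class Mailbox:
    """A toy webmail backend that neither knows nor cares which store it got."""

    def __init__(self, storage: Storage):
        self.storage = storage

    def save(self, message_id: str, text: str) -> None:
        self.storage.put(message_id, text.encode("utf-8"))

    def load(self, message_id: str) -> str:
        return self.storage.get(message_id).decode("utf-8")
```

The design choice that matters is that Mailbox takes its storage as a parameter, so the user rather than the vendor decides which implementation gets plugged in.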

The other downside of webapps is that they run on browsers. Which, compared to regular Unix, is lame: you don’t get multitasking, protected memory, process control, ease of debugging — heck, half the time you don’t even get to avoid your windows being adorned with various bits of spyware. In theory, Chrome fixes all that, though sadly it’s still Windows only. (OTOH, there’s a Mozilla hacking tutorial at lca this year, so maybe that’ll help Firefox pick up some of the features)

If you could assume a Chrome-like design from your browser, you could then take that a little further, and corral your webapps under different user-ids, so that you could ensure that a malicious Facebook application running under one user-id (aj-fun, say) can’t exploit a vulnerability in your Javascript interpreter to access your netbanking passwords, cookies, or so forth, stored under another user-id (aj-serious, eg). Most Unix systems could handle that easily (a single user Unix desktop system can generally cope with anywhere from 20,000 to 4 billion user-ids, with only a few dozen already allocated to the system), and if the browser could separate different windows/sites into different processes like Chrome claims to, and provide a hook for privilege changes via sudo, we’d have a pretty good step in securing both the web generally, and users’ desktops as well.
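As a sketch of what that sudo hook might boil down to, the Python below maps a site to a local account and builds (but does not run) the command line that would start a per-site browser process under it. The site table, the aj-misc fallback account and the webapp-browser program name are all assumptions made for illustration; only the aj-fun and aj-serious account names and the use of sudo come from the idea above.

```python
from typing import List
from urllib.parse import urlsplit

# Hypothetical policy: which local account each site's webapp runs under.
SITE_ACCOUNTS = {
    "apps.facebook.com": "aj-fun",
    "netbank.example.com": "aj-serious",
}
DEFAULT_ACCOUNT = "aj-misc"  # assumed catch-all for unlisted sites


def account_for(url: str) -> str:
    """Pick the account for a URL based on its hostname."""
    host = urlsplit(url).hostname or ""
    return SITE_ACCOUNTS.get(host, DEFAULT_ACCOUNT)


def launch_command(url: str, browser: str = "webapp-browser") -> List[str]:
    """Build a sudo command line that runs the browser as the site's account."""
    return ["sudo", "-u", account_for(url), "--", browser, url]
```

Passing the resulting list to something like subprocess.run would then leave ordinary Unix file permissions to keep the fun account away from the serious account’s cookies, assuming sudo has been configured to allow the switch.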

Additionally, you’d want to make the browser a little more invisible too — moving it more along the lines of a window manager than an application in its own right, at least when running webapps. This is close to what Chrome does with its Application Windows but might go a little beyond that.

Add all that up, and what do you get?

First, you change the system architecture: your Unix desktop now has to offer apps:

  • a kernel and X drivers to deal with the raw hardware
  • a window manager and/or browser that supplies appropriate chrome for switching between tabs/pages
  • a rendering engine that will handle HTML and CSS (probably in multiple layers, that mostly already exist)
  • a Javascript VM/interpreter
  • any plugins necessary for apps to do more than base W3C standards allow (eg, Gears, Flash)

And the system with the apps you want to actually use, needs:

  • a way of serving URLs to clients (eg, http)
  • appropriate background daemons to support apps (apache, tomcat)

Development changes from just picking a language (C, python, etc) and a toolkit (Gtk, wxWidgets, etc) to needing:

  • a language for the application (C, python, Java, etc)
  • a language for the UI (Javascript, or Java+GWT, Python+Pyjamas, Rails, etc) and possibly a toolkit (if it’s not already implied by the language)
  • a support framework/infrastructure for serving URLs (tomcat, Pylons internal server, etc)
  • a protocol for communicating data between the application and the UI (the “X” in AJAX, basically)
  • an HTML- and CSS-based design for your user interface
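For the protocol item in that list, here is a minimal Python sketch of the application side of such an exchange. It swaps in JSON where the “X” in AJAX would suggest XML, purely because it is shorter to show, and the message fields (action, text, ok, lines) and the handle_request name are invented for the example.

```python
import json


def handle_request(raw: str, state: dict) -> str:
    """Apply one UI request to the application state and return the reply."""
    try:
        request = json.loads(raw)
        action = request["action"]
    except (ValueError, KeyError, TypeError):
        # Not JSON at all, or not an object with an "action" field.
        return json.dumps({"ok": False, "error": "malformed request"})

    if action == "append":
        state.setdefault("lines", []).append(str(request.get("text", "")))
        return json.dumps({"ok": True, "count": len(state["lines"])})
    if action == "list":
        return json.dumps({"ok": True, "lines": state.get("lines", [])})
    return json.dumps({"ok": False, "error": "unknown action"})
```

The Javascript half would send strings like {"action": "append", "text": "ls"} and redraw from the reply. Nothing about the function depends on which web framework delivers the request, which is roughly the separation the list is asking for.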

Unfortunately, the latter list is currently way too complicated. If you could successfully simplify it just a little (by just having to choose one language and a toolkit, and having good standards for frameworks/infrastructure), you could start writing apps for the web just as easily as you write apps for the Unix desktop, with exactly the same user experience for people on Unix desktops, but the added benefit that it actually works just as well for people on OS X or Windows, and just as well for people in a different country.

And since Unix desktop hackers are already halfway used to this, with the separation of X servers and X clients, it’s a relatively small step to a real brave new world.

So that’s my theory. If you want it summed up more pithily, I’m now claiming not only that 2007 was the year of the Linux desktop, but that if we’re lucky, 2009 or 2010 will be the year the Linux desktop is completely superseded by the web desktop. :)
