Wednesday, 16 May 2012

Introducing: The CrO4 Browser

Please note that this post is about a very early work in progress. You'll have to wait at least a few months before CrO4 is public-ready. If you can't wait, CrO4 is open-source, so check out the GitHub page and ask how you can help:
https://github.com/leighpauls/cro4

This is a very high-level overview of how I got to the idea of CrO4, and it disregards many of the measures which are (or will be) in place to actually make this idea technically viable. Those will be topics for many posts to come in the near future.

I've been holding off publishing anything about this project for a couple of months now, but I've presented it at my class's independent project symposium, and I've been spending enough time on it that people are getting curious. So, without further ado,

CrO4 Browser: Many devices, One Web, One Browser


Like the CrO4 ion, CrO4 is one chrome instance with many peripheral browsers

The State of the Art:

I'm that guy who's got a desktop, a work/school laptop, a laptop for use around the house, a tablet, and a smartphone (I've probably missed a few). Now, the average user won't have that many devices, but it's not unreasonable to expect the typical person to have a laptop, a tablet, a smartphone, and maybe a more powerful desktop at work.

Now, each of these devices has a browser which uses the same web, and each of them shows the web in (roughly) the same way as all of the others. Furthermore, the user will likely choose a device based on where they are, rather than what they want to do. For example, you browse on your phone while walking or waiting for a train, you use your tablet while on the train or sitting on the couch, and you use your laptop while at a desk, but in any of those scenarios you might want to check any of Reddit, email, Facebook, etc.

I find that my choice of device usually comes down to using the largest device that isn't made impractical by the environment I'm in. So, I'll use my tablet on the train, but I'll use the phone instead if it's seriously packed. However, I would never bother to browse on my phone while sitting in front of my desktop. Let's call this the Largest Practical Device Rule.

So, now that I live in this world where I've got a device built for every scenario, I must be in browsing nirvana, right? Well, not quite.

The problem:

The point to take away from the Largest Practical Device Rule is that the device you'll be using depends on where you are, which is in many cases independent of what you want to do. What you want to do is far more frequently dependent on what you were just doing, possibly on another device.

Let's look at a scenario which has happened far too frequently to me:

I'm at work and I'm trying to figure out how some system works. This typically means that on my desktop I've got:
  • 2 internal documentation pages open
  • A very long RFC scrolled down to some particular position
  • A query half-entered into my bug tracker/source control page
Now I've got to catch the last train/bus home. When I get on the train and want to pick up where I left off on my tablet, I'll have to:
  • Re-locate those internal docs again, on what is typically a horribly organized internal wiki
  • Find the RFC, and try to re-locate the spot I was scrolled to
  • Restart with entering the query all over again
  • Do all of this on a device with less power and sloppier control than my desktop, where I've already done it once
This is just plain dumb. As I've said before, these devices use the same web to display the same pages, but each of them has no idea what the others are doing. What I want is to be able to walk away from my laptop, and pick up exactly where I left off on my phone/tablet/desktop. No repeating searches for content, no ssh/rdp bull-crap back to the other device, and if I'm filling out a form or doing anything else with a dynamic page, that should already be reflected in the next device I happen to pick up and start browsing on.

So, given what's available right now, I've only found 2 partial solutions:

1: Browser-level Tab Re-opening

Both Chrome and Firefox have facilities for opening URLs on one device which are (or were) open on another. This is functionally equivalent to syncing the browsers' history, as it really only allows you to re-visit a URL that you've already been to. You'll still have to scroll down that RFC, and you'll still need to re-enter that query.

Similar URL-sending systems, like Chrome-to-Phone, fall into this category as they are functionally equivalent for the purposes of this problem.

It's worth noting that if every web developer used the URL hash (the fragment after the #) as hyper-actively as Google does for search, then this would actually be a viable solution. But they don't, and they won't; so it isn't.
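To make the URL hash idea concrete, here is a minimal sketch of a page keeping its view state in the fragment, so that a synced URL alone would be enough to restore it on another device. The state keys (`scroll`, `q`) and the example URL are made up for illustration; this is not how any particular site encodes its state.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def with_state(url: str, state: dict[str, str]) -> str:
    """Return the URL with the view state stored in its fragment."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(fragment=urlencode(state)))


def read_state(url: str) -> dict[str, str]:
    """Recover the view state from a URL produced by with_state."""
    fragment = urlsplit(url).fragment
    return {key: values[0] for key, values in parse_qs(fragment).items()}


# Hypothetical state: a scroll offset and a half-entered query.
synced = with_state(
    "https://example.com/rfc.txt",
    {"scroll": "4120", "q": "status:open owner"},
)
print(synced)
print(read_state(synced))
```

If every page did this on every scroll and keystroke, tab syncing would carry the whole state across devices. The catch is that each site has to opt in, which is exactly the part that isn't going to happen.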

2: Google Docs-Style Web Apps

Ideally, every website in the world would just work like Google Docs, where you're essentially using a virtual desktop which never turns off. Whenever you log back in, regardless of where you are, the work you were doing has been automatically saved and it's always just magically there, waiting for you to get back to work.

Unfortunately, Google Docs was made by a collection of damn good engineers working for years, and that level of quality and time commitment is completely impractical to expect from every website within any reasonable (or even infinite) amount of time.

Same as the first point: don't, won't; isn't.

The Complete Solution:

So I asked, "Why not take the auto-updating magic used by Google Docs, and find a way to apply it to every site on the web?" After experimenting with a few different ways of implementing this, I decided that the best place for this functionality was the browser level, by taking a split-browser approach, similar to what Amazon Silk and Opera Mini do.

The basic concept of a split browser is that you've got some amount of the browser's work offloaded to a remote server. In the case of Silk and Mini, this amounts to the browser requesting content through Amazon/Opera's servers as an explicit proxy. The proxy server then decides to do things like compress or reduce the quality of images, does a little bit of HTML re-writing to make the page a little prettier on the smaller screen, and then sends it along in some sort of byte-code that's easier to turn into a rendered page than raw HTML/JavaScript/CSS.

Where my concept diverges from the current use of Split Browsers, is:
  1. Modern Split browsers exist explicitly to mitigate technical challenges faced by browsing on lower powered devices (namely, processor speed and mobile bandwidth), not to add any new features.
  2. They are still designed with one device:one browser in mind. They've relocated resources from the user's location to the cloud, but have failed to leverage the fact that each instance of the cloud service doesn't have to be fundamentally tied to only a single end client.
  3. They use weird JavaScript execution, but some still ends up on the client. Opera Mini uses some sort of inbred child of JavaScript split between the client and server, which breaks many pages on lower end devices. Silk uses a more elegant solution, but still puts some execution on the client side. Essentially, they don't provide the web in its full glory.
What I've done is take the split browser concept to the extreme. CrO4 moves the entire page execution to the cloud, and leaves the client as a thin interface and display. The benefit to this is that from a page execution standpoint, there is no difference between having 1 client or 10 clients, all controlling and displaying the same page.

The server browser is actually a complete, normal browser which is listening for changes in its own DOM. Those changes are encoded and relayed to all of the client browsers after some sanitization filtering (removing scripts, re-writing URL references, etc.). The client browsers capture all input events which could be triggered by the user, then send those forward to the server in a similar fashion. All of this happens continuously in real-time.
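The relay loop described above can be modelled in a few lines. This is a toy sketch, not CrO4's actual code: the DOM is a plain dict, `ServerPage` and `Client` are hypothetical names, the only sanitization shown is dropping script nodes (URL re-writing is left out), and there is no network involved.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    """A thin client: it only mirrors whatever nodes the server sends."""
    name: str
    view: dict[str, dict] = field(default_factory=dict)

    def apply(self, node_id: str, node: dict) -> None:
        self.view[node_id] = node


class ServerPage:
    """Stands in for the full browser that runs the page in the cloud."""

    def __init__(self) -> None:
        self.dom: dict[str, dict] = {}
        self.clients: list[Client] = []

    def attach(self, client: Client) -> None:
        # A newly attached device is brought up to the current page state.
        self.clients.append(client)
        for node_id, node in self.dom.items():
            self._send(client, node_id, node)

    def mutate(self, node_id: str, node: dict) -> None:
        # Every DOM change is relayed to every attached client.
        self.dom[node_id] = node
        for client in self.clients:
            self._send(client, node_id, node)

    def handle_input(self, node_id: str, value: str) -> None:
        # An input event forwarded by any client; page logic runs here only.
        self.mutate(node_id, {**self.dom[node_id], "value": value})

    def _send(self, client: Client, node_id: str, node: dict) -> None:
        # Sanitization step: scripts never leave the server.
        if node["tag"] == "script":
            return
        client.apply(node_id, dict(node))


page = ServerPage()
laptop, phone = Client("laptop"), Client("phone")
page.attach(laptop)
page.mutate("query", {"tag": "input", "value": ""})
page.mutate("tracker", {"tag": "script", "src": "t.js"})
page.handle_input("query", "status:open")  # typed on the laptop
page.attach(phone)  # the phone is picked up later
print(phone.view)
```

The phone ends up with the half-entered query even though it attached after the typing happened, and the server code never needs to know how many clients are attached.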

Since JavaScript is executed entirely on the server, there's no need to do messy work, like trying to split up execution across multiple environments.

The end state is that all of your devices are updated in real-time with each other, so when you close your laptop and pick up your phone, you've truly just picked up the exact same browser.

3 comments:

  1. Great job Leigh keep the code coming =)

  2. This sounds very interesting. Any updates on this Leigh?

    Replies
    1. I've stopped working on it. I'll be making a post soon on the wide variety of reasons why. I'm taking another (less intrusive) approach to unifying the work/thought process. I'll be making a write up on that idea once I've got some more work done on the algorithms behind it.
