Thursday, 24 January 2013

CrO4 - Why it didn't happen

After a few months of experimenting and tinkering with CrO4, I eventually stopped work on it. There were 3 main reasons for this (from least to most important):

  1. Developing a feature which crosses the WebKit/Chromium project boundary sucks. Transitioning from a JavaScript Chrome plugin and Node.js server prototype into something that had practical performance and scalability required hacking that functionality into the browser itself. Unfortunately, this code straddles the interface between WebCore (the DOM and HTML code in WebKit) and Chromium itself.
    Hopping this boundary meant multi-hour build times, and trying to navigate through code that is heavily infested with C macros and interacts with auto-generated C++ code (most of Chromium's WebKit glue is C++ generated by a few dozen Perl scripts which parse the hundreds of IDL files in WebCore). Answering a question like "who calls this" usually involved a 45-minute build to create some scenario observable in GDB, because navigating that code manually is totally impractical.
  2. HTML is incredibly stupid. I can't express just how amazingly stupid the amount of legacy and related bullshit there is in the modern browser. Think of the most menial task you can do in a browser, like setting a font, or moving a text box to the middle of the screen. Now think of all the completely different ways that you can do that in HTML. Then add CSS into the mix. The amount of duplicated and obscure functionality in the modern page renderer is mind-boggling, simply because of how important it has been to keep adding features to browsers without breaking existing pages. This both augments point #1, and makes synchronizing a given page across multiple browsers even more impractical.
  3. The UI of CrO4 wasn't a one-size-fits-all solution. CrO4 worked well for long reads which took multiple sittings, and editing documents/forms while on the go, but it didn't do well with stream-based services. When I look at my phone in the morning, I really don't care about what was half-way down my Facebook feed, or what post #125 on my Reddit front page was last night. The inertia of removing that cruft canceled out the value of not losing my place on other services. CrO4 is just too intrusive.
But, my initial problem of lowering the barrier to seamless transitions between devices still exists. Reading Reddit on my browser, and then walking to the train and looking at my phone still presents me with duplicate information and a discontinuity in my user experience. Looking for a flight at work, and then continuing my search from home still requires me to basically start from scratch.

So, what have I learned from CrO4?

  • Browsers are an engineering quagmire. There is a lot of value for this problem waiting to be taken from within the browser, but HTML (and its accompanying technologies) has acquired so much technical debt that this approach isn't going to yield anything from a one-man personal project in a reasonable amount of time.
  • Don't build a new basement under an unstable building. CrO4 was an attempt to add new functionality to existing infrastructure by treating the platform of the browser as an abstraction that could be swapped out for free. This happens in networking all the time (Ethernet, WiFi, and 3G all do this quite well), but my mistake (well, one of them) was not to consider the robustness and interface complexity of the layer of abstraction directly above the one I'm replacing.
  • Building up and out is easier than re-modeling. Through working on CrO4 I've developed a great deal of respect for Linus Torvalds' hard-line attitude of kernel fixes not breaking user-space programs, even if those user-space programs depended on "wrong" kernel features ("wrong" is different from "broken"). Re-building existing infrastructure means that you need to make sure that all the existing functionality of the world keeps working, whereas building on top of it (i.e., a new web service) or building in parallel to it (i.e., a new mobile platform) provides a huge "clean slate" degree of freedom.

And, the most important lesson:

  • I don't know how to magically make all web services multi-device friendly without changing the service. I was careful not to word this "It is impossible to magically make..." because I'm still willing to bet that there is some umbrella solution to this problem, but I've conceded that I've only grazed the edge of that solution if it does exist. As far as I'm concerned (for now) the solution to the multi-device discontinuity problem rests in the distributed hands of the developers of the services that users want to use.

So where do I go from here? 

One consequence of not being able to use the browser event model and a single central execution environment (CrO4's server-browser DOM+JavaScript VM) is that most services will need to have multiple executables running over many devices.

In an ideal world, this collection of devices could just be considered a composite of identical state machines which form one simple state machine over multiple devices, but in reality, it means that we have multiple independent systems connected through changing variable time delays. In less mathematical terms, it's a bitch of a communications problem to solve.

The general name for the problem of synchronizing multiple agents, all of whom are modifying the state of some document, where each agent's document needs to have the changes of all the other agents applied to it (in some sort of manner that makes sense for that particular document format), is Operational Transformation (OT).
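To make the idea concrete, here is a minimal sketch of the core OT step for the simplest possible document: a plain string that two agents insert text into at the same time. The `Insert` type, the `site` tie-breaker, and the `transform` function are hypothetical names invented for this illustration; they are not taken from any particular OT library, and a real implementation would also need deletes, operation history, and a network protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    """Hypothetical operation: insert `text` at index `pos`."""
    pos: int
    text: str
    site: int  # hypothetical agent id, used only to break ties


def apply(doc: str, op: Insert) -> str:
    """Apply an insert operation to a document string."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `other` has already been applied."""
    # If the other insert landed before ours (or at the same spot, from a
    # lower-numbered site), our position shifts right by its length.
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


base = "hello world"
a = Insert(0, ">> ", site=1)  # agent 1 prepends a marker
b = Insert(5, ",", site=2)    # agent 2 adds a comma after "hello"

# Each agent applies its own edit first, then the transformed remote edit.
doc_1 = apply(apply(base, a), transform(b, a))
doc_2 = apply(apply(base, b), transform(a, b))
assert doc_1 == doc_2 == ">> hello, world"
```

Both agents end up with the same document even though they applied the edits in different orders, which is the property everything else in OT is built around.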

The problem with OT is that it is difficult to do, and for most services, weighing the costs of implementing OT (substantial complexity and developer time) against the benefits it provides (multi-device synchronization and offline editing ability) usually ends in a clear case for not using OT.

But...

If I could provide OT in a package that lowers the implementation costs to a level where app and service developers see OT as a worthwhile investment, it may solve my original problem (seamless experience over multiple devices). This is where my latest project, Cortex OPT, comes into play.

More to come on this topic soon.

Wednesday, 16 May 2012

Introducing: The CrO4 Browser

Please note that this post is about a very early work in progress. You'll have to wait at least a few months before CrO4 is public-ready. If you can't wait, CrO4 is open-source, so check out the GitHub page and ask how you can help:
https://github.com/leighpauls/cro4

This is a very high-level overview of how I got to the idea of CrO4, and it disregards many measures which are/will be in place to actually make this idea technically viable, which will be topics for many posts to come in the near future.

I've been holding off publishing anything about this project for a couple months now, but I've presented it at my class's independent project symposium, and I've been spending enough time on it that people are getting curious. So, without further ado,

CrO4 Browser: Many devices, One Web, One Browser


Like the CrO4 ion, CrO4 is one chrome instance with many peripheral browsers

The State of the Art:

I'm that guy who's got a desktop, a work/school laptop, a laptop for use around the house, a tablet, and a smartphone (I've probably missed a few). Now, the average user won't have that many devices, but it's not unreasonable to expect the typical person to have a laptop, a tablet, a smartphone, and maybe a more powerful desktop at work.

Now, each of these devices has a browser which uses the same web, and each of them shows the web in (roughly) the same way as all of the others. Furthermore, the user will likely use a device based on where they are, rather than what they want to do. For example, you browse on your phone while walking or waiting for a train, you use your tablet while on the train or sitting on the couch, and you use your laptop while at a desk. But in any of those scenarios you might want to check any of Reddit, email, Facebook, etc.

I find that my choice of device usually comes down to using the largest device that isn't made impractical by the environment I'm in. So, I'll use my tablet on the train, but I'll use the phone instead if it's seriously packed. However, I would never bother to browse on my phone while sitting in front of my desktop. Let's call this the Largest Practical Device Rule.

So, now that I live in this world where I've got a device built for every scenario, I must be in browsing nirvana, right? Well, not quite.

The problem:

The point to take away from the Largest Practical Device Rule is that the device you'll be using depends on where you are, which is in many cases independent of what you want to do. What you want to do is far more frequently dependent on what you were just doing, possibly on another device.

Let's look at a scenario which has happened far too frequently to me:

I'm at work and I'm trying to figure out how some system works. This typically means that on my desktop I've got:
  • 2 internal documentation pages open
  • A very long RFC scrolled down to some particular position
  • A query half-entered into my bug tracker/source control page
Now I've got to catch the last train/bus home. When I get on the train and want to pick up where I left off on my tablet, I'll have to:
  • Re-locate those internal docs again, on what is typically a horribly organized internal wiki
  • Find the RFC, and try to re-locate the spot I was scrolled to
  • Restart with entering the query all over again
  • Do all of this on a device with less power and sloppier control than my desktop, where I've already done it once
This is just plain dumb. As I've said before, these devices use the same web to display the same pages, but each of them has no idea what the others are doing. What I want is to be able to walk away from my laptop, and pick up exactly where I left off on my phone/tablet/desktop. No repeating searches for content, no ssh/rdp bull-crap back to the other device, and if I'm filling out a form or doing anything else with a dynamic page, that should already be reflected in the next device I happen to pick up and start browsing on.

So, given what's available right now, I've only found 2 partial solutions:

1: Browser-level Tab Re-opening

Both Chrome and Firefox have facilities for opening URLs on one device which are currently (or were) open on another. This is functionally equivalent to syncing the browsers' history, as it's really only allowing you to re-visit a URL that you've already been to. You'll still have to scroll down that RFC, and you'll still need to re-enter that query.

Similar URL-sending systems, like Chrome-to-Phone, fall into this category as they are functionally equivalent for the purposes of this problem.

It's worth noting that if every web developer used the URL hash fragment as hyper-actively as Google does for search, then this would actually be a viable solution. But they don't, and they won't; so it isn't.
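For illustration, this is roughly what "using the hash" means: the page writes its transient state into the part of the URL after the `#`, so that re-opening the URL elsewhere restores it. The sketch below uses Python's standard `urllib.parse`; the `with_state` and `read_state` helpers, the `scroll` and `q` keys, and the example URL are all made up for this example and do not describe how any particular site encodes its state.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def with_state(url: str, state: dict) -> str:
    """Return `url` with `state` encoded into its fragment (the part after '#')."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(fragment=urlencode(state)))


def read_state(url: str) -> dict:
    """Recover the state dictionary from a URL produced by with_state()."""
    fragment = urlsplit(url).fragment
    return {key: values[0] for key, values in parse_qs(fragment).items()}


# Hypothetical page state: a scroll offset and a half-entered query.
state = {"scroll": "4120", "q": "crash on login"}
shared_url = with_state("https://bugs.example.com/search", state)

# Any device that opens shared_url can restore the same state.
assert read_state(shared_url) == state
```

The catch, as noted above, is that every site would have to do this deliberately for every piece of state worth keeping.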

2: Google Docs-Style Web Apps

Ideally, every website in the world would just work like Google Docs, where you're essentially using a virtual desktop which never turns off. Whenever you log back in, regardless of where you are, the work you were doing was automatically saved and it's always just magically there. Waiting for you to get back to work.

Unfortunately, Google Docs was made by a collection of damn good engineers working for years, and that level of quality and time commitment is completely impractical to expect from every website within some reasonable amount of (or even infinite) time.

Same as the first point: don't, won't; isn't.

The Complete Solution:

So I asked, "Why not take the auto-updating magic used by Google Docs, and find a way to apply it to every site on the web?" After experimenting with a few different ways of implementing this, I decided that the best way to do this was to implement this functionality at the browser level, by taking a split-browser approach, similar to what Amazon Silk and Opera Mini do.

The basic concept of a split browser is that you've got some amount of the browser's work offloaded to a remote server. In the case of Silk and Mini, this amounts to the browser requesting content through Amazon/Opera's servers as an explicit proxy. The proxy server then decides to do things like compress/reduce quality of images, and do a little bit of HTML re-writing to make the page a little prettier on the smaller screen, and then sends it along in some sort of byte-code that's easier to turn into a rendered page than raw HTML/JavaScript/CSS.

Where my concept diverges from the current use of split browsers is:
  1. Modern Split browsers exist explicitly to mitigate technical challenges faced by browsing on lower powered devices (namely, processor speed and mobile bandwidth), not to add any new features.
  2. They are still designed with one device:one browser in mind. They've relocated resources from the user's location to the cloud, but have failed to leverage the fact that each instance of the cloud service doesn't have to be fundamentally tied to only a single end client.
  3. They use weird JavaScript execution, but some still ends up on the client. Opera Mini uses some sort of inbred child of JavaScript split between the client and server, which breaks many pages on lower end devices. Silk uses a more elegant solution, but still puts some execution on the client side. Essentially, they don't provide the web in its full glory.
What I've done is take the split browser concept to the extreme. CrO4 moves the entire page execution to the cloud, and leaves the client as a thin interface and display. The benefit to this is that from a page execution standpoint, there is no difference between having 1 client or 10 clients, all controlling and displaying the same page.

The server browser is actually a complete, normal browser which is listening for changes in its own DOM, which are then encoded and relayed to all of the client browsers after some sanitization filtering (removing scripts, re-writing URL references, etc.). The client browsers capture all input events which could be triggered by the user, then send those forward to the server in a similar fashion. All of this happens continuously in real-time.
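As a rough illustration of that data flow (and not the actual CrO4 code, which lives inside the browser), the relay can be sketched as below. Everything here is a stand-in invented for the example: the `ServerBrowser` class, the JSON message shape, the in-memory "inbox" lists playing the role of network connections, and the regex-based `sanitize`, which is far cruder than what a real filter would need to be.

```python
import json
import re

# Crude stand-in for script removal; real HTML sanitization needs a parser.
SCRIPT_RE = re.compile(r"<script\b.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def sanitize(html: str) -> str:
    """Strip <script> elements before the markup is sent to a thin client."""
    return SCRIPT_RE.sub("", html)


class ServerBrowser:
    """Hypothetical model of the server side: one page, many thin clients."""

    def __init__(self):
        self.clients = []    # each client is modeled as a list of received messages
        self.input_log = []  # input events forwarded by clients

    def attach(self):
        """Register a new client and return its (in-memory) inbox."""
        inbox = []
        self.clients.append(inbox)
        return inbox

    def on_dom_change(self, node_id: str, html: str) -> None:
        """Encode a DOM change once and relay it to every attached client."""
        message = json.dumps({"node": node_id, "html": sanitize(html)})
        for inbox in self.clients:
            inbox.append(message)

    def on_client_input(self, event: dict) -> None:
        """Receive an input event from any client; the page runs only here."""
        self.input_log.append(event)


server = ServerBrowser()
phone, laptop = server.attach(), server.attach()
server.on_client_input({"type": "click", "target": "submit"})
server.on_dom_change("main", "<p>Saved</p><script>track()</script>")
assert phone == laptop  # every client sees the same sanitized update
```

Because the page only ever executes in one place, adding another client is just one more inbox in the list.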

Since JavaScript is executed entirely on the server, there's no need to do messy work, like trying to split up execution across multiple environments.

The end state is that all of your devices are updated in real-time with each other, so when you close your laptop and pick up your phone, you've truly just picked up the exact same browser.

Friday, 27 April 2012

Google's Zerg Rush

I just discovered that searching for "zerg rush" in Google activates an Easter Egg mini-game where a bunch of Google's 'O' "zerglings" try to destroy your "base", and you have to kill them off before they destroy the whole page.

Back in high school, a friend and I had a blast writing SCAR scripts for Flash games, so that we could hold world-wide high scores on Neopets games. (Don't try to extract any reason beyond that; we just thought that was a good way to spend our free time in the lab while stuck at school.)

For nostalgia's sake, I decided that I needed to have the highest score of anyone I knew. So I wrote a small Chrome extension to play the game for me, and achieve a score that I feel is sufficiently high. This fit in nicely with my latest project, which revolves around simulated user input to a browser (blog posts on that topic are on their way).

Check it out on GitHub at https://github.com/leighpauls/googlezergrush

Wednesday, 4 April 2012

TwitchErorr Live!

I just published TwitchErorr, the HTTP error creating server that I wrote about in my last post.

Check it out at http://twitcherorr.nodester.com/

Get the source and run it from wherever you want at: https://github.com/leighpauls/TwitchErorr

Monday, 2 April 2012

TwitchErorr - HTTP Response Codes On Demand

A few months ago at work, I needed an external server to throw HTTP 500 errors at me under my own control to test a feature on a transparent HTTP proxy.

It seemed like such a simple tool. Simple enough that you'd expect there to be dozens of implementations available. Just a server that would take a number as a GET parameter, and reply with that error number. Perhaps my Google-fu just sucks, but I couldn't find one anywhere. So I wrote a quick App Engine script to throw 500 errors back at me and figured this must be some really unusual task.

Today, however, I needed a similar server, except it had to throw any 50X error. So I modified that script to do it, and now I'm wondering if anyone else out there could use this server. And if I need something twice, then that probably means that someone else out there is going to need it at least once.

So, I gave it a smart-ass name and I've posted it on GitHub (check out TwitchErorr).

Right now, it's just a crappy App Engine Python script that can only throw 50X errors, but I'd rather re-write it in Node.js and be able to throw any logical HTTP server-side error. I only wrote it in App Engine because I didn't have a Heroku account set up at the time and I needed that test done soon, but since this blog post is longer than the script itself, perhaps this deserves a proper re-write. (I might hack that out a little later tonight)

It's currently live at http://twitcherorr.appspot.com/error=501, just replace the parameter with any 50X number. More documentation and features are coming soon.
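For anyone who just wants the idea without the hosting, the whole thing fits in a few lines. The following is a stand-alone sketch using only Python's standard `http.server`, not the App Engine script from the repository; the `status_for`, `ErrorHandler`, and `serve` names, the port number, and the choice to answer 400 for anything that isn't a 50X request are assumptions made for this example.

```python
import re
from http.server import BaseHTTPRequestHandler, HTTPServer

# Matches paths shaped like the live URL above, e.g. "/error=501".
ERROR_PATH = re.compile(r"^/error=(50\d)$")


def status_for(path: str) -> int:
    """Return the 50X code requested by the path, or 400 if it is malformed."""
    match = ERROR_PATH.match(path)
    return int(match.group(1)) if match else 400


class ErrorHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Reply with whatever status code the caller asked for.
        code = status_for(self.path)
        body = f"{code}\n".encode()
        self.send_response(code)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the error server on localhost until interrupted."""
    HTTPServer(("localhost", port), ErrorHandler).serve_forever()
```

Calling `serve()` and then requesting `http://localhost:8080/error=503` should come back with a 503 status.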

Friday, 18 November 2011

Launch Lessons

Today marks the official launch of TwitchTetris. I've progressed to a point where all of the required functionality is in place. Notable additions include:
  • Configurable Controls
  • Configurable Auto-Repeat
  • A few T-spin recognition bug fixes
  • Changes in the color scheme to accommodate a wider range of monitor color temperatures.
  • General UI improvements
It's an awesome feeling to have taken a project from a blank canvas to completion, and then finally share it with the world. I'm in an odd state of relief, anticipation, and excitement.

TwitchTetris has taken up the better part of my spare time, class time, and sleeping time for the past 3 months, so I thought I'd share some of the lessons that I've learned during the development of the game.


Lesson #1: Acquire Nerves of Steel
Before posting it to the world, I shared the game with my class's Facebook page (about 100 engineers), and with /r/indiegaming on Reddit. These were essentially the two least flaming-prone communities that I could think of publishing a soft-release to, as /r/indiegaming sees lots of games in all kinds of stages of development, and my class, all having gone through Engineering Co-ops, is fairly seasoned in the world of QA and constructive criticism. And the criticism I received from both sources was overwhelmingly constructive.

But, even when you get intelligent constructive criticism and recommendations about something that you've been working on for months, you feel like crap. It's like someone gave you a sincere recommendation on how to make your face less ugly. No matter how well intentioned it is, you feel your heart sink a bit.

My first response to the criticism, constructive or negative, was to get defensive. Across the board, my first thought was always:

The problem isn't a flaw in my app, clearly, you just happen to be an idiot who doesn't know how to use the app.


And the first thing I'd want to do is post a reply, informing the commenter about the correct way to play the game. Obviously, this is the worst thing I could have possibly done.

The only way to increase the quality of your app is to take all of the feedback in stride, and address each problem as a new technical requirement. It's tough to stay objective about feedback directed towards work that you've put a lot of yourself into, but fixing the problems as they are brought up is the only way you're going to turn criticisms into compliments.

Lesson #2: Your Own Opinion is Irrelevant
I had a good friend in High School who was a very gifted writer (of English, not code) help me proof-read and perfect the myriad of essays and reports required for university and scholarship applications. One tip she gave me for writing was:

Always proofread by reading your sentences in reverse order. When you read forward, you're just reading out of your own memory of writing that paragraph and everything will make sense to you. Read the sentences in the wrong order, and you'll notice what doesn't make sense.


It makes a lot of sense: when you read backwards, all of the extraneous ideas that you formed in your head stop putting themselves in front of what you've written, and you'll be able to focus on the wording and syntax of the document. What I've found is that the same clouds of preconception about your own work apply to user experience design in the same way that they do to writing.

You'll probably run and use your own app hundreds of times before you manage to get to a state that you would call 'complete', from a user experience point of view. Every time you do that, you'll get a little more used to the 'oddness' of the incomplete app, to the point where all of the little quirks of your app just seem natural. In fact, the technical term for this is Classical Conditioning.

Unfortunately, that 'oddness' that you no longer notice is very noticeable to your users, and as I mentioned above, they will let you know about it. As of now, I'm unaware of any kind of 'reading backwards' applied to UX design, so the only conclusion that I can draw is that a user's opinion of what seems broken in your app should always trump your own opinion.

Lesson #3: Every Assumption You Make is Wrong
I know that this point tends to be cliche, but I felt like it needed to be mentioned. So instead of rambling on about the concept of assumption, I'll just leave you with a list of invalid assumptions which I tried, and failed, to get past the users in earlier versions of TwitchTetris:

  • Users will read instructions located somewhere on the page
  • Users can accept a slightly different control scheme from what they might be used to.
  • Users will read instructions located very closely to the thing they're working with
  • Users will wait for more than 5 seconds before assuming the page isn't going to load, and try refreshing
  • Users will read instructions located directly on top of the thing they're working with
  • Users will understand that you need a device with a keyboard to play a game with keyboard input

Thursday, 10 November 2011

TwitchTetris, the Beginning of TwitchCode

> HELLO, WORLD!
> Press Any Key to Continue...

I've created this blog to publish the projects that I'm working on or have finished, as well as to start discussion on whatever interesting ideas I stumble upon. Today marks the release of the first project I'm setting into the wild: TwitchTetris.

Currently, Tetris games on the web tend to fall into one of 2 categories:

  1. High-Quality Flash games with obtrusive ads.
  2. Low to medium quality games in Flash or HTML5 with noticeable game-play problems.
My problem is that when I want to play Tetris, I want to play it now, and I want it to play crisply and responsively on every platform. I don't want to wait for some video ad to finish before I can start playing, and there must not be any noticeable lag in the controls. And most of all, if the game does not follow the Tetris Guidelines (yes, Tetris has its own regulatory agency), I'm not going to play it.

In short, I'm very particular in my choice of Tetris games, and as of now, no web-based implementation has met my requirements.

Enter TwitchTetris.

My first design decision was to not use the cluster-fustication that is Flash. Without even touching on the user experience problems of plugin dependencies, Flash implementations vary in quality and performance by operating system and browser. Tetris is one of the fastest games around, so differences in Flash run-times become very noticeable around level 12 of the game.

This leaves HTML5/Canvas as the sole option.


By making this call, I knew that excanvas wouldn't be able to keep on par with the performance that I needed for the game. I basically had to accept the fact that I was telling IE users to get bent if they wanted to play the game.


I have no regrets.

So after doing a quick survey of the available libraries for writing simple games using HTML5, I settled on JawsJS. It's a simple game development layer that is still early in development, but it was far enough along for me to make a half-decent game with it.

The projects that have been developed in JawsJS to date were considerably less complex than what TwitchTetris turned out to be, and I had to do a bit of creative rendering to keep it running at the speeds I needed, but overall I think that JawsJS was the right direction for me.

As a whole, there are a lot of points where JawsJS could see some serious improvement, and if game development in HTML5 starts to take the place of Flash, I hope that this library sees some serious development from the community.

So, then after weeks and weeks of development happening between lectures and labs at school, TwitchTetris is finally ready to be deployed. I'm using Google App Engine as a backend/hosting solution just because it's at the center of the cross-section of easy-to-use, and reliable.

My design theory was "no bullshit": meaning no ads which hinder the experience of the gamer and no extra "features" added on to the game to make it needlessly complex.

I came up with a "Teletype Console" theme for displaying the level and score information. The theme allowed me to create a relatively simple backdrop of green on dark green so that the action of the game really draws your attention with more dynamic colors. It also puts some movement into the side panels of the game display, so the game feels more dynamic.

So, head over to http://www.twitchtetris.com/, play a few games, try to set a high score, and let me know what you think!