Monday, July 11, 2005

Google Maps Information

I've been getting a slow but steady stream of requests for more information on how to work with Google Maps. I would love to respond to each of these individually, but honestly haven't been doing much with it since I wrote my first few articles on the subject. Fortunately, I just ran across the Google Maps Mania blog. It seems to be doing a pretty good job of aggregating all of the stuff going on with Maps at the moment. I'm really quite amazed at the array of different things people are building with it (and I doubt there's any way I could keep up with it all anyway). Happy mapping to all!

Wednesday, June 22, 2005

Another Word or Two on Memory Leaks

Ok, I promised to explain in more detail how to get rid of memory leaks once you've found them. Though I haven't had time to gather all of the information and examples I would have liked, I have run across a few external resources that might be of help.

The first of these is a new Microsoft Technical Article that discusses the various forms that IE memory leaks can take in some detail. Particularly interesting is the fact that it covers an even more obscure type of leak that doesn't even involve a DOM element. It's definitely worth a read.

A bit more information on JavaScript closures can be found on Eric Lippert's blog (which I highly recommend) here.

For a nice, straightforward library that does an excellent job helping you avoid the problem altogether, take a look at Mark Wubben's Event Cache. I particularly like the fact that if you follow a simple set of rules, then you cannot easily leak elements.

On Another Note

I suggested earlier that the slowdown associated with leaking large amounts of memory in IE might be associated with hash tables or something similar getting full and therefore more inefficient. Eric Lippert left the following comment, which makes perfect sense to me and seems more likely to characterize the problem:

The symbol tables are very search-efficient. What's more likely is that the non-generational mark and sweep garbage collector is getting more and more full, and therefore taking longer and longer to walk each time a collection happens. A generational GC, like the .NET framework's GC, solves this problem by not GCing long-lived networks of objects very often.

And don't worry, I haven't forgotten about Drip at all. As time allows, I will be adding the features that I mentioned earlier. Of course, if anyone else wants to play with the source and make their own additions, please feel free!

Monday, June 06, 2005

Drip 0.2

Happy Monday morning to everyone (or, depending upon where you may be, evening). This is just a quick note to announce Drip 0.2! Here is a quick list of changes in this version:
  • The main window is now resizeable.
  • The property list is sorted.
  • Property lists are now separate from the leak dialog. You can double-click on an element to see its properties. And you can double-click on any object property to see its properties. Think of it as a poor-man's expandable property list.
  • The source is also available here.
My current list of definitely known issues is as follows:
  • Still need to hook node.cloneNode() to catch all possible leaks.
  • Still need to hook new windows as they are created.
  • It sometimes reports that leaks are coming from about:blank rather than their actual source.
And my current list of possible issues is:
  • A couple of people have mentioned crashes occurring, which I have not yet been able to reproduce. If anyone having such a problem has a chance to build the source and catch this in a debugger, that would be wonderful.
  • I've also heard mention of issues with deeply-nested frames. My demo leak test page should exhibit this issue, but seems to work fine. Again, any help appreciated.
As always, please let me know of any other issues you discover, suggestions, and (even better) patches. And I haven't forgotten about my promise to provide a solid overview of how to deal with leak issues. I'm still doing a bit of research on the subject, but this will be forthcoming soon!

Saturday, June 04, 2005

Drip Redux

Wow. Thanks for all the excellent feedback on Drip. It was really just a tool that I needed for myself, but I'm glad that it may prove useful for others as well.

There were a lot of comments, both here and on Slashdot, so I'm going to try to put as many of my thoughts and responses as possible in this post. As such, it may be a bit of a grab-bag.

Exacerbating the problem

The first point I want to make is in response to one or two comments here, and many on Slashdot: I am not particularly concerned about whether I am exacerbating the problem by helping developers to "work around" IE's issues. Don't get me wrong; I find it just as unfortunate as everyone else that these problems exist in the first place. It is truly awful that developers using such a high-level tool as a web browser have to take memory allocation issues into account, particularly given that they're not really given the tools to deal with them effectively (window.CollectGarbage() doesn't count, since it won't really fix the problem).

Anyone who's spent a significant amount of time developing software has to realize that they will always be dealing with inadequacies of their tools and platforms. This has always been the case. It doesn't mean that vendors shouldn't fix their mistakes, but it does mean that you can't usually bitch at your customers for their choice of platform. If you are going to make software development your profession, then you must generally accept this responsibility. Certainly there are cases where you can dictate the details of the client's platform, but this is not the case for most vendors.

I also want to point out two things about this specific problem. First, IE's memory leak issues stem largely from the underlying model that allows scripting languages to interface with native COM objects (that is, every object accessible to a scripting language is a COM object deriving from IDispatch). While imperfect, this model is also quite efficient -- and given that it was developed in the mid-'90s, it was not an unreasonable compromise at the time. Second, IE is not the only browser with this problem. Mozilla had fairly severe memory leak issues until recently, and I've been told that Safari does as well. So let's not use this as an excuse to jump all over Microsoft.

When do leaks matter?

This is another point that I think bears some discussion. If you've spent a little time pointing Drip at existing sites, you've probably found that most of them exhibit no issues at all. That's because most sites don't use enough complex DHTML (with complex object graphs and the like) to create the specific sort of circular references that cause leaks. Most of the sites that do show a few leaks seem to be leaking PARAM objects passed to Java and/or Flash components. I've gotten mixed reports on when this happens, and when it causes a significant leak, so the jury's still out on whether this matters.

On the other hand, I saw one comment to the effect that Google Maps leaks a lot of elements. This is exactly the sort of application that is in danger of leaking enough to matter. If you look at the Maps code, you'll discover that they've done an excellent job of abstracting the components that comprise the application, and it's quite easy to follow (if you de-obfuscate it, anyway). And I believe that the fact that it leaks so much is actually an indication that its developers have done a good job. The problem is that the very abstractions that make a code base of that size manageable make it really easy to create leaks. Because there are a lot of references among all of its objects, and most DOM elements are wrapped in one way or another, even a single leak can cause the entire reference graph to leak. Nasty, huh?

How do I fix leaks?

This is a pretty complex question. So I've decided to punt this to a forthcoming post. There are a lot of resources out there on this subject, but I hope to gather as much of it as possible into one post so that I can provide a reasonable framework for finding and dealing with them.

What now?

I've gotten a lot of helpful suggestions and a couple of bug reports. What I would like to do now is to list all of the fixes and enhancements that I can think of, and solicit advice on how to prioritize them. Once I've had another pass at the code, I will release the source as well so that you can all help maintain it! This is my current list:

  • Deal with deeply nested frames. This is a real issue for a lot of sites -- apparently Drip hooks only one level of nested frames and fails to hook deeper windows.
  • Hook the cloneNode() method. This is simply an oversight on my part, but it's necessary to catch all possible leaks.
  • Resizable window. This was just me being lazy. I've gotten really used to constraint-based layout in the Java world, and to be honest, I just didn't want to deal with doing this by hand in MFC.
  • Sorted and expandable element properties. 'Nuff said.
  • Hook new windows (via window.open). I think this is feasible, and will do my best.
  • Anything else you guys can think of!

Tuesday, May 31, 2005

Drip: IE Leak Detector

To anyone still following this site, my apologies for taking a millennium or two between posts recently. Things have been a bit crazy of late, but I have something to introduce that will hopefully make up for the radio silence:

Drip -- an Internet Explorer leak detector

Over the last few months, a number of people have written to me or left comments asking questions about their memory leak issues with DHTML (or AJAX or whatever-you-want-to-call-it-this-week) applications. Unfortunately, there's not much I could offer in the way of advice that most people don't already know. Get rid of closures, unhook your event handlers, etc. This advice just isn't all that helpful when you've got a giant mess of (often inherited) JavaScript, and spotting leak scenarios by eye can be maddeningly subtle.

I did, however, find it quite surprising that no one had ever built a leak detector for Internet Explorer (or apparently for any other browser with leak problems; Mozilla has some, but they seem to be more for developers working on Mozilla itself, and the browser does a pretty good job of cleaning up leaks anyway). So I built one.

What it Does

It's a pretty simple application. Basically, it lets you open an HTML page (or pages, in succession) in a dialog box, mess around with it, then check for any elements that were leaked.

The interface is currently rather spartan. Here's what the main app looks like:

[Sorry, I lost the image somewhere along the way]

On the top you'll notice what looks like a crude version of Explorer's navigation bar. You've got the standard back and forward buttons, the URL box, and the 'go' button. These behave exactly as you might expect. To the right of it, however, is a 'check leaks' button, which will be grayed out when you first run the app. In order to try it out, you will first need to go to an HTML page (preferably one that you suspect leaks). The test page at [sorry I lost this page] will work. When you load this page, the 'check leaks' button will become enabled. Click it to see the following report:

[Yet again, I lost the image somewhere along the way]

This simple page leaks two DOM elements, a DIV and a BUTTON. These two elements are displayed in the top list, along with their source documents (useful if you've loaded more than one document between leak tests, or if you have more than one frame), the number of outstanding references on them, and their ID and CLASS attributes.

If you click on one, you'll see a list of its enumerable attributes in the bottom list. A particularly useful attribute for identifying the elements is 'innerHTML'.

Blowing Memory

Back to the main dialog for a moment. You might also have noticed the interestingly-titled 'blow memory' button. Its function is simple: to constantly reload a page as fast as it can, and to report the process' memory usage in the list box below. This is a helluva lot easier than pressing F5 for hours to determine how fast a page leaks memory.

How it Works

Fortunately, Internet Explorer's architecture made this app fairly easy to build. It's basically a simple MFC app with a browser COM component in it. The strategy for catching leaked elements is as follows:

  • When a document has been downloaded, sneakily override the document.createElement() function so that the application is notified of all dynamically-created elements.
  • When the document is fully loaded, snag a reference to all static HTML elements.
  • To detect leaks: navigate to a blank HTML page (so that IE attempts to release all of the document's elements),
  • force a garbage-collection pass (by calling window.CollectGarbage()),
  • and look at each element to see if it has any outstanding references (by calling AddRef() and Release() in succession on it).

Within the leak dialog, each element's attributes are discovered and enumerated using the appropriate IDispatch/ITypeInfo methods.
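
To make the first step above concrete, here's a rough sketch of what the createElement() override can look like, expressed as script injected into the page. The window.external.recordElement() call is purely hypothetical -- a stand-in for however the host application actually gets notified:

// Remember the browser's own createElement(), then replace it with a
// wrapper that reports each dynamically-created element to the host.
var gOriginalCreateElement = document.createElement;
document.createElement = function(tagName) {
  var element = gOriginalCreateElement(tagName);
  window.external.recordElement(element);  // hypothetical host callback
  return element;
};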

Caveats

This is basically an alpha release. The interface more or less blows, and I may have left glaring holes in the leak-detection strategy or in the code itself. It seems to work for me, but I would really like for anyone using it to keep an eye out for any problems so that I can fix them. And please don't hesitate to contact me, of course, if you have any ideas, praise, criticism, or even rants to offer. I really want this to help people to stop dealing with these god-awful leaks, and since Microsoft doesn't seem inclined to fix this design flaw, we can at least try to make it more bearable.

What Next?

Obviously, I would like any feedback I can get. There are definitely some interface quirks I need to iron out. And I would like to do more to help determine the actual cause of each leak. There are a few things that I would like to find out, and if anyone has any pointers, please share them:

  • Can you perform similar tricks with Safari/KHTML or Opera? (I know you can with Mozilla, but since it doesn't really leak much, that seems rather pointless)
  • Does anyone know if it's possible to enumerate variables on one of IE's JavaScript closures? (meaning the stack frame hanging off of the function reference)
  • How about enumerating expandos on IE DOM objects from C++? (I only seem to get built-in properties from ITypeInfo)

I'm sure other questions will come up in the near future. Oh, and I will be releasing the source before too long, as soon as I get a few things cleaned up.

Happy leak hunting!

Thursday, April 07, 2005

More Maps

This wasn't intended to be the Google Maps blog. Really. But hey, since they just released the Keyhole mode, I thought it might be worth going over the structure of the data.

Actually the integration of the Keyhole images is quite straightforward. The only things that had to change were (a) the functions that map between longitude/latitude pairs and pixel coordinates and (b) the tile URL generator.

The first of these is very simple. It would appear that the Maps team simply adhered to the coordinate system already in use by Keyhole, as it's completely different from the Maps coordinate system. The mapping functions are as follows:

// Initialize the zoom levels.
//
var gZoomLevels = new Array();
for (var zoom = 0; zoom < 15; zoom++)
  gZoomLevels[zoom] = Math.pow(2, 25 - zoom) / 360;

// Compute the longitude and latitude for a given pixel coordinate.
//
function getLngLat(pixelX, pixelY, zoom) {
  return new Point(
    pixelX / gZoomLevels[zoom] - 180,
    180 - pixelY / gZoomLevels[zoom]
  );
}

// Compute the pixel coordinate for a given longitude and latitude.
//
function getPixelPos(longitude, latitude, zoom) {
  return new Point(
    (longitude + 180) * gZoomLevels[zoom],
    (180 - latitude)  * gZoomLevels[zoom]
  );
}
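
As a quick sanity check (assuming Point simply stores its two constructor arguments as x and y), the two functions invert one another:

// Pixel (0, 0) is the top-left corner of the world at any zoom level.
var lngLat = getLngLat(0, 0, 5);                  // Point(-180, 180)
var pixel  = getPixelPos(lngLat.x, lngLat.y, 5);  // back to Point(0, 0)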

Nothing too surprising there. The part that seems to have people scratching their heads is the image URL format. At first glance, it appears to be a wacky series of characters with lots of q's, r's, s's, and t's. Well, there is a method to this madness. First, have a look at the de-obfuscated function below:

// Compute the URL of the tile with the given coordinates.
//   (note that these coordinates are not the same as in the
//    two functions above:  they must be divided by the tile
//    size, which is 256).
//
function tileUrl(x, y, zoom) {
  var range = Math.pow(2, 17 - zoom);

  // Wrap-around x coordinate.
  //
  if ((x < 0) || (x > range - 1)) {
    x = x % range;

    // The mod operator isn't quite the same on a computer as it is
    //   in your math class (negative isn't handled correctly).
    //
    if (x < 0)
      x += range;
  }

  // Build the quadtree path, working our way down from 2^16
  //   to the current zoom level.
  //
  var Ac = "t";
  for (var pow = 16; pow >= zoom; pow--) {
    // Drop to the next zoom level.
    //
    range = range / 2;

    if (y < range) {
      if (x < range) {
        // Upper-left quadrant.
        //
        Ac += "q";
      } else {
        // Upper-right quadrant.
        //
        Ac += "r";
        x -= range;
      }
    } else {
      if (x < range) {
        // Lower-left quadrant.
        //
        Ac += "t";
        y -= range;
      } else {
        // Lower-right quadrant.
        //
        Ac += "s";
        x -= range;
        y -= range;
      }
    }
  }

  return "http://kh.google.com/kh?" + "t=" + Ac;
}

Ok, so my comments were a bit of a giveaway. Basically, it looks like Keyhole stores its data in a quadtree structure, and the URL describes the 'path' through the tree to find a given tile. A quadtree, as many of you may remember from your undergraduate graphics class, is a nice little structure for efficiently storing and accessing large tile arrays. It's particularly effective when not all areas need to be stored with the same level of detail (as is clearly the case with Keyhole).

I won't go into detail on the structure here, as others have already done a better job of explaining it (just do a Google search on 'quadtree'). For our purposes, suffice it to say that q, r, s, and t refer to the four quadrants at each depth in the tree. Thus, a sequence of these characters uniquely identifies a tile at some depth.
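
For example, tile (0, 0) falls in the upper-left quadrant at every depth, so its path is just a string of q's, one character longer for each level below the root:

tileUrl(0, 0, 16);  // "http://kh.google.com/kh?t=tq"
tileUrl(0, 0, 14);  // "http://kh.google.com/kh?t=tqqq"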

Tuesday, March 15, 2005

Ajax Buzz

I've always found it quite interesting how giving something a name seems to make it "real". Case in point? AJAX. When I first heard this term, I thought to myself, "who really needs a specific term for a mishmash of technologies that's been around for a long time?" Well, as is often the case, I misunderestimated the importance of having a mental "handle" for an otherwise complex concept. And just like LAMP, it's not the specific technologies that are important (the 'P' in LAMP is heavily overloaded, just as the 'XML' in AJAX isn't as important as the idea of communicating directly with the server).

I don't know if I particularly care for the appellation -- particularly the XML bit -- but that no longer really matters. The name has probably stuck now that the WSJ has an article that opens with the words "Meet Ajax, the technology powerhouse." So it's settled: the concept has a name now.

And I believe it is a good thing that some kind of name has stuck. For some time now, I've found that everyone has their own ideas about what "rich" means to a web application (or if you're more into Flash, an internet application); and it just takes too long to try and explain "it's like the difference between Hotmail and Outlook".

So now that we have a name, and an "AJAX" application has a specific meaning, what do we do now? Certainly there is more than one way to approach the problem. Should I use some sort of framework, or bang out all of the JavaScript by hand? How should I encode my server calls? How should I handle back button support? What about URL-encoding my application's state? There is a host of issues like these that need solutions (probably many solutions, depending upon your needs). It's now time for us to begin gathering the existing solutions to these problems, and to build new ones, so that we can move the Web far beyond its current state.

Should be fun!

Saturday, February 12, 2005

Making the Back Button Dance

It seems that I keep discovering nifty things about Google Maps. I didn't notice anyone commenting on the details of the way it interacts with the browser history, but there's one really groovy trick it plays that adds a lot to usability. We already know that clicking the back button causes it to display the results of your previous search. That's certainly helpful, as you don't lose your search history. If you play with it a bit, however, you may notice that it also moves the map to your previous search location. This applies to all of your search history, as you move forwards and backwards amongst your results. Try it out.

But how?

If you dig through the DOM a bit (e.g. using Mozilla's DOM Inspector), you will see that there are three input fields in the hidden IFrame that the application uses to communicate with the server. At first glance, it doesn't appear that they're used for much of anything (I myself first dismissed them as an unfinished attempt to make the application state easy to link to). If you look at the application code, however, you'll see that it stores the current map state (latitude, longitude, and zoom level) in these fields every time it changes.

Here's the tricky part. When a search result comes back from the server, the HTML in which it's wrapped contains yet another set of three empty fields. When you click back, though, the browser replaces the IFrame contents with its previous state, including the values that were stored in these fields. You've probably seen this before in regular web applications, when the browser tries to maintain the state of forms when it goes back to a cached version.

So you've clicked the back button, the hidden IFrame contains its previous state, and the right panel now displays your previous search results. The application also reaches into these three fields to get the previous map state, and pans/zooms as necessary to display the map in its previous state as well (you'll find the code that does this in the application's loadVPage() method). Voila!
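
Put together, the trick amounts to something like the following sketch. The field ids match the hidden inputs that come back in the search response; the map.panAndZoomTo() call is a hypothetical stand-in for whatever loadVPage() actually does:

// Called every time the map state changes, so the browser's form-state
// restoration will carry these values through back/forward navigation.
function saveMapState(frameDoc, lat, lng, zoom) {
  frameDoc.getElementById('centerlat').value = lat;
  frameDoc.getElementById('centerlng').value = lng;
  frameDoc.getElementById('zoom').value = zoom;
}

// Called when the IFrame's contents change (e.g. on back/forward).
function restoreMapState(frameDoc) {
  var lat  = parseFloat(frameDoc.getElementById('centerlat').value);
  var lng  = parseFloat(frameDoc.getElementById('centerlng').value);
  var zoom = parseInt(frameDoc.getElementById('zoom').value, 10);
  if (!isNaN(lat) && !isNaN(lng) && !isNaN(zoom))
    map.panAndZoomTo(lat, lng, zoom);  // hypothetical map method
}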

Reclaiming the back button

I believe that, as more developers build web applications that make good use of DHTML, we will find tricks like this to be invaluable in maintaining the user experience. Hopefully, users will be able to depend upon the browser history to work as advertised -- something we can't often say about most web apps these days.

Friday, February 11, 2005

Still more fun with maps

Well, that was certainly an unexpected flood of responses. Apparently I wasn't the only one that found Google Maps interesting -- I was amazed at some of the creative hacks that some commenters created. After sifting through this deluge, I decided to summarize some of the findings, clarify a few points, and add a few other comments on implementation details. This is kind of a grab bag of points, so please bear with me.

Cut Google some slack on Safari

I noticed a lot of people complaining, both here and on Slashdot, about the lack of Safari support. And I'm not completely unsympathetic, as I'm running a shiny new Mac Mini at home that I love. However, as anyone who has ever tried to build an even remotely complex web application can attest, it just ain't easy. And please don't blather on about who implements 'web standards' better -- no one really implements them, and even if they did, you'd still be outta luck if you wanted to do anything interesting in DHTML.

If you take some time to dig through Google's JavaScript, you'll find that there is proto-Safari support all over the place. They're clearly working on it, and you really can't ask for much more than that.

The route overlay

There's one really interesting facet of the route display that I totally failed to notice the first time around -- I was doing everything in Mozilla and simply didn't notice that they were actually using Microsoft's VML to render the route on IE. It may be non-standard, but you have to admit that it's very fast and effective! And switching between client-side and server-side rendering in one code base is a pretty cool hack.

Decoding polylines and levels

Because I only noticed the server-side route rendering the first time around, I also failed to check whether the route's polylines were being decoded on the client. Well, as several commenters pointed out, they are. In fact, the decoding loop is fairly simple (just have a look at decodePolyline() for the details). I originally assumed this stream was encoded so that you could just grab it and send it back to the server for rendering (thus making the image server stateless, and the rendered route cacheable). Since the points are decoded on the client, however, it appears that the encoding also serves to keep the size reasonable -- encoding all those points as XML would get pretty fat.
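
For the curious, here's a sketch of how that decoding appears to work -- a reconstruction, not Google's actual code. Each latitude/longitude delta is sign-folded, chopped into 5-bit groups, and stored as printable characters offset by 63; the deltas accumulate from point to point:

function decodePoints(encoded) {
  var points = [];
  var index = 0, lat = 0, lng = 0;

  // Reads one variable-length value and unfolds its sign.
  function nextDelta() {
    var b, shift = 0, result = 0;
    do {
      b = encoded.charCodeAt(index++) - 63;
      result |= (b & 0x1f) << shift;
      shift += 5;
    } while (b >= 0x20);
    return (result & 1) ? ~(result >> 1) : (result >> 1);
  }

  while (index < encoded.length) {
    lat += nextDelta();
    lng += nextDelta();
    // Assumes the same Point helper as above, in (lng, lat) order.
    points.push(new Point(lng * 1e-5, lat * 1e-5));
  }
  return points;
}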

I also glossed over the fact that there's another stream associated with the route called 'levels'. This is an interesting trick that allowed them to encode the route points for different zoom levels in the same stream (necessary because, when you're rendering on the client, there's really no easy way to go back to the server for a new route every time the zoom level changes).

Flow of control for form submit and IFrame

Although it's something of a wacky implementation detail, it's interesting to note how the search form at the top of the application works. It is actually contained in a FORM element, although it has three separate DIVs whose visibilities are swapped as you click on the different search links. However, it can't be submitted directly, as that would cause the entire application to reload. Instead, the event handler for the form's submit button suppresses the reload, gathers the search parameters, and calls the application's search() method. This method builds a query URL and sets a hidden IFrame's 'src' parameter, which causes it to gather a new chunk of XML from the server.
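
In sketch form, with all of the names below standing in for the real, obfuscated ones:

// Suppress the normal submit so the page doesn't reload, and hand the
// query off to the application instead.
searchForm.onsubmit = function() {
  doSearch(queryInput.value);
  return false;
};

function doSearch(query) {
  // Pointing the hidden IFrame at the query URL pulls down a new chunk
  // of XML -- and pushes an entry onto the browser history for free.
  hiddenFrame.src = buildQueryUrl(query);
}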

As I believe I mentioned in the previous article, requesting the XML via this IFrame has the additional benefit that it ties the browser's history perfectly to the application's state. Although it would be nice if someone would fix Mozilla such that the history titles are correct (this works properly in IE).

asynchronousTransform()

It turns out that Google Maps communicates with the server both through the hidden IFrame and the XMLHttpRequest object. I mentioned in the last article that it transforms the downloaded XML using XSL. Well, that XSL is not actually hard-coded into the application. Instead, whenever it needs to perform a transform, it requests the XSL via XMLHttpRequest, performs the transform, and caches the XSL itself so that it won't need to download it again.

Permalink and feedback

One of the potential downsides to building an application that runs entirely within a single page is that the address bar never changes, so it's not possible for the user to create links to the application's current state. Of course, many 'normal' web applications don't really work properly when you try to link to their internal pages, but people have still come to expect it to work most of the time.

Google's solution to this problem was to create the 'link to this page' anchor in the right panel. When you click on it, it refreshes the entire application with a URL that encodes the entire application state. Pretty nifty, as it gives you the important parts of the behavior everyone likes to call 'REST' without having to break the application up into a million little pieces.

You may have noticed that the 'feedback' link also encodes the application's state. The map application actually updates the hrefs of both of the stateful links every time the application state changes.
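
That updating boils down to something like the following sketch (the parameter names come from the search URLs discussed in the previous article; the element references are my own guesses):

function updateStatefulLinks(query, lat, lng, span, zoom) {
  var stateUrl = 'http://maps.google.com/maps?q=' + encodeURIComponent(query) +
                 '&sll=' + lat + ',' + lng + '&sspn=' + span + '&z=' + zoom;
  linkToThisPage.href = stateUrl;
  feedbackLink.href = feedbackBaseUrl + encodeURIComponent(stateUrl);
}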

Profiler

Often you can tell quite a bit from code that is left lying around unexecuted. In this case, it appears that the Google Maps team may have had performance problems in some methods (or may simply have been trying to head them off). You can tell this because there are some prologue/epilogue functions being called in various methods that definitely smell like a hand-rolled profiler. And if you look at the list of methods that contain these hooks, it definitely makes sense:

Polyline.decodePolyline()
Polyline.decodeLevels()
Polyline.getVectors()
Page.loadFromXML()
Map.getVMLPathString()
Map.createRawVML()
Map.createVectorSegments()
Map.createImageSegments()
Map.drawDirections()

If this is indeed the case, I think it's worth noting that Mozilla's profiler actually does a reasonably good job. And although performance on one browser is not perfectly indicative of performance on the others, the Mozilla results have been roughly indicative of results on IE in my experience.
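
Hand-rolled prologue/epilogue profilers of this sort tend to look something like the following (a generic sketch, not Google's code):

var gProfile = {};

function profileEnter(name) {
  var entry = gProfile[name] || (gProfile[name] = { calls: 0, total: 0 });
  entry.start = new Date().getTime();
}

function profileExit(name) {
  var entry = gProfile[name];
  entry.calls++;
  entry.total += new Date().getTime() - entry.start;
}

// Each instrumented method then begins with profileEnter('name') and
// ends with profileExit('name').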

Tiles & longitude/latitude mapping

The mapping of tiles to longitude/latitude pairs is relatively straightforward. The code for doing transformations in both directions can be found in the functions getBitmapCoordinate(), getTileCoordinate(), and getLatLng(), and several commenters have picked apart the transforms in more detail. As mentioned before, the tiles' image source URLs encode the longitude, latitude, and zoom level.

Several commenters also suggested that the entire map was pre-rendered at every zoom level, so that the web server could simply deliver the tiles without further consideration. While I believe this is partially true, I am also fairly certain that some areas will be viewed far more often than others. Clearly, Google is working from vector data at some level, and it would probably make far more sense to render tiles on the fly and cache them. This would also make a big difference when it came to dealing with updates to the vector data, especially if those updates could be localized such that the entire tile cache need not be invalidated.

Crazy 'bookmarklet'

Finally, I have to give kudos to those who've been working on the 'bookmarklet' that reaches into the running application and makes it dance. It's completely novel to me that you can add bookmarks of the form 'javascript:' and have them run in the context of the current page. It makes perfect sense, of course -- I just never thought of it. It appears that the current state of the art of this hack can be found here.

The only unfortunate part about this (through no fault of the authors) is that it's likely to be brittle in the face of updates to Google Maps -- the JavaScript has to reach into global variables within the running application, which will probably change when Google's code obfuscator is run. It does give us a glimpse of a possible future, however, when even web pages publish public APIs for developers to use. Very interesting!

Wednesday, February 09, 2005

Mapping Google

By now, many of you will have gone and tried out the new Google Maps application. By and large, you have to admit that it's pretty damned slick for a DHTML web application -- even my wife was impressed, and that's not easy with geek toys. So, in the spirit of Google Suggest and GMail, I've decided to have a quick peek under the hood to figure out what makes it tick.

Not quite like GMail

The first thing I noticed is that it doesn't quite work like GMail. Whereas GMail uses XMLHttp to make calls back to the server, Google Maps uses a hidden IFrame. Each method has its benefits, as I'll discuss below, but this difference of approach does seem to imply that it may not be the same team doing the work.

The Graphics

Probably the most striking thing about Google Maps is the very impressive (for DHTML, anyway) graphics. Now, I'm sure that many of you old JavaScript hacks out there have known this sort of thing was possible for a long time, but it's very cool to see it (a) actually being used for something real, and (b) where normal users will see it.

For those to whom the implementation is less than obvious, here's a quick breakdown. The top and side bars are (more or less) simply HTML. The center pane with the map, however, is a different beast. First, let's address the map itself. It is broken up into a grid of 128x128 images (basically like an old tile-based scrolling console game). The dragging code is nothing new, but the cool trick here is that each of these images is absolutely positioned -- and the 'infinite' scrolling effect is achieved by picking up tiles that are off-screen on one end and placing them down on the other end. The effect is kind of like laying track for a train by picking up track from behind it.
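
Here's a sketch of the recycling trick for one row of tiles, dragging leftward (the tile array, tile width, and URL helper are my own scaffolding, not Google's code):

function recycleTiles(tiles, viewLeft, tileWidth) {
  for (var i = 0; i < tiles.length; i++) {
    var tile = tiles[i];
    var left = parseInt(tile.style.left, 10);
    if (left + tileWidth < viewLeft) {
      // This tile fell off the left edge: jump it to the right end of
      // the strip and point it at the image for its new position.
      left += tiles.length * tileWidth;
      tile.style.left = left + 'px';
      tile.src = tileUrlForPosition(left);  // hypothetical helper
    }
  }
  // The same dance happens in reverse when dragging right, and again
  // for the vertical axis.
}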

Google map, with tiles outlined

The push-pins and info-popups are a different matter. Simply placing them is no big deal; an absolutely-positioned transparent GIF does the trick nicely. The shadows, however, are more interesting. They are PNGs with 8-bit alpha channels. Personally, I didn't even realize you could depend upon the browser to render these correctly, but apparently (at least with IE6 and Mozilla), you can. And they actually render pretty quickly -- for proof, check out the overlaid route image (at the end of the article), which is often as big as the entire map view.

The pushpin, with its two images outlined

Communicating with the Server

There are two ways in which Google Maps has to communicate with the server. The first is to get map images, and the second is to get search results. It turns out that getting map images is remarkably easy -- all you have to do is set an image tile's URL. Because the coordinate system is known and fixed (each tile represents a known area specified in longitude and latitude, at a given zoom level), the client has all the information it needs to set tile URLs. Each tile URL is of the following form:

http://mt.google.com/mt?v=.1&x={x tile index}&y={y tile index}&zoom={zoom level}

I'm not sure what the 'v' argument specifies, but it never seems to change. The others are fairly self-explanatory. One nice side effect of this is that the images have fixed URLs for a given chunk of the earth's surface, so they get cached. If you're doing most of your searches in one region, then the app can be quite snappy once everything gets cached.
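
So the client-side 'fetch' for a map tile is nothing more than string concatenation and an image assignment -- a sketch based on the URL format above:

function mapTileUrl(tileX, tileY, zoom) {
  return 'http://mt.google.com/mt?v=.1&x=' + tileX +
         '&y=' + tileY + '&zoom=' + zoom;
}

// Scrolling a new area into view is then just:
tileImage.src = mapTileUrl(12, 25, 13);  // example indices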

Doing searches is another matter. Clearly, you can't 'submit' the entire page, because that would destroy your map and other context. Google's solution is to submit a hidden IFrame, then gather the search results from it. Let's say, for example, that you simply wanted to go to Atlanta. You type 'Atlanta' in the search area, and the following HTTP GET is made:

http://maps.google.com/maps?q=atlanta&z=13&sll=37.062500%2C-95.677068&sspn=37.062500%2C80.511950&output=js

There are a couple of things to notice here. The 'question' is passed in the 'q' parameter (much like Google). The other arguments are 'z' for zoom, 'sll' for latitude & longitude (your current focus, I believe), and 'sspn' to specify the span/size of your viewing area. What's interesting is what comes back:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="content-type" content="text/html; charset=UTF-8"/>
    <title>Google Maps - atlanta</title>
    <script type="text/javascript">
    //<![CDATA[
    function load() {
      if (window.parent && window.parent._load) {
        window.parent._load({big chunk of XML}, window.document);
      }
    }
    //]]>
    </script>
  </head>
  <body onload="load()">
    <input id="zoom" type="text" value=""/>
    <input id="centerlat" type="text" value=""/>
    <input id="centerlng" type="text" value=""/>
  </body>
</html>

This HTML is loaded into the hidden IFrame which, when loaded, will punt a big chunk of XML back up to the outer frame's _load() function. This is kind of a cool trick, because it saves the outer frame from having to determine when the IFrame is done loading.

I mentioned before that there was some advantage to be had by using a hidden IFrame over making direct XMLHttp requests. One of these is that the IFrame's state affects the back button. So every time you do a search, it creates a new history entry. This creates an excellent user experience, because pressing the back button always takes you back to the last major action you performed (and the forward button works just as well).
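
The outer frame's side of the handshake is then a single well-known entry point. _load() is the real function name from the HTML above; everything inside it here is my own guesswork:

function _load(xml, frameDoc) {
  // Parse the XML payload and render the search results (hypothetical
  // helpers standing in for the real, obfuscated code).
  var page = parseResponse(xml);
  showResults(page);
  // frameDoc is the IFrame's own document -- handy for reading back
  // its hidden state fields during back/forward navigation.
}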

Big Hunks of XML

Ok, so now the outer frame's code has a big chunk of XML. What can it do with that? Well, it turns out that Google Maps depends upon two built-in browser components: XMLHttpRequest and XSLTProcessor. Oddly enough, even though it doesn't use XMLHttpRequest for making calls to the server, it does use it for parsing XML. I'll get to the XSLT later.

Here's an example of the XML response that comes back from the 'Atlanta' request above:

<?xml version="1.0"?>
<page>
  <title>atlanta</title>
  <query>atlanta</query>
  <center lat="33.748889" lng="-84.388056"/>
  <span lat="0.089988" lng="0.108228"/>
  <overlay panelStyle="/mapfiles/geocodepanel.xsl">
    <location infoStyle="/mapfiles/geocodeinfo.xsl" id="A">
      <point lat="33.748889" lng="-84.388056"/>
      <icon class="noicon"/>
      <info>
        <title xml:space="preserve"></title>
        <address>
          <line>Atlanta, GA</line>
        </address>
      </info>
    </location>
  </overlay>
</page>

Nothing surprising here -- we have a title, query, center & span, and the location and name of the search result. For a slightly more interesting case, let's look at the response when searching for 'pizza in atlanta':

<?xml version="1.0" ?>
<page>
  <title>pizza in atlanta</title>
  <query>pizza in atlanta</query>
  <center lat="33.748888" lng="-84.388056" />
  <span lat="0.016622" lng="0.017714" />
  <overlay panelStyle="/mapfiles/localpanel.xsl">
    <location infoStyle="/mapfiles/localinfo.xsl" id="A">
      <point lat="33.752099" lng="-84.391900" />
      <icon image="/mapfiles/markerA.png" class="local" />
      <info>
        <title xml:space="preserve">Kentucky Fried Chicken/Taco Bell/<b>Pizza</b> Hut</title>
        <address>
          <line>87 Peachtree St SW</line>
          <line>Atlanta, GA 30303</line>
        </address>
        <phone>(404) 658-1532</phone>
        <distance>0.3 mi NW</distance>
        <description>
          <references count="9">
            <reference>
              <url>http://www.metroatlantayellowpages.com/pizzaatlanta.htm</url>
              <domain>metroatlantayellowpages.com</domain>
              <title xml:space="preserve">Atlanta<b>Pizza</b> Guide-Alphabetical Listings of Atlanta<b>...</b></title>
            </reference>
          </references>
        </description>
        <url>http://local.google.com/local?q=pizza&near=atlanta&amp;latlng=33748889,-84388056,11825991348281990841</url>
      </info>
    </location>
    { lots more locations... }
  </overlay>
</page>

Again, nothing too surprising when you think about the data that's going to show in the map pane. But how do the results get shown in the search result area to the right? This is where things get a little wacky. The JavaScript actually uses the XSLTProcessor component I mentioned earlier to apply an XSLT to the result XML. This generates HTML which is then shown in the right panel. We've come to expect this sort of thing on the server, but this is the first time I've ever seen it done on the client (I'm sure it saves Google lots of cycles, but personally I didn't even know XSLTProcessor existed!)
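
The transform itself is a straightforward use of Mozilla's XSLTProcessor, something like this sketch (the caching variable and function names are mine):

var gXslCache = {};

function transformToPanel(xmlDoc, xslUrl, panelElement) {
  var xsl = gXslCache[xslUrl];
  if (!xsl) {
    // Fetch the stylesheet once and cache it for later transforms.
    var req = new XMLHttpRequest();
    req.open('GET', xslUrl, false);  // synchronous, for brevity
    req.send(null);
    xsl = gXslCache[xslUrl] = req.responseXML;
  }

  var processor = new XSLTProcessor();
  processor.importStylesheet(xsl);
  var fragment = processor.transformToFragment(xmlDoc, document);
  panelElement.innerHTML = '';
  panelElement.appendChild(fragment);
}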

Driving Directions

There's one last case to discuss, and that's driving directions. This works just like other searches, including the XSLT used to show the results, with one exception: the result XML includes a tag containing two opaque values that encode the geometric route to be taken. This data appears to be base-64 encoded (or something similar, anyway). Remember the giant transparent PNG I mentioned earlier for rendering routes? This data is used to render that sucker. The data looks like this:

<polyline numLevels="4" zoomFactor="32">
  <points>k`dmEv`naOdGC??EtD??@|DAxL??hEFjJ@ ...</points>
  <levels>BBB???BB?BB??@??@?????BB??@?????@? ...</levels>
</polyline>

This polyline data is then used to request the route PNG from the server using a URL like so:

http://www.google.com/maplinedraw?width=324&height=564&path=sRS?k@fB@?}As@e@CGAIA}@BwCEu@Bs@?E_@cACS@a@PaC ...

The route overlay

In Summary

That's about it. I hope that demystifies this application a bit; the real magic, of course, is all the work going on to enable this on the back-end. The fact that Google's servers can handle all of these image requests, route finding, line drawing, searches, and the like so quickly is truly impressive. I also want to point out that their map renderer (or the one they purchased) works much better than all the other ones I've seen on Mapquest, Mapblast, and the like. That alone makes it worth using, if only so you can actually read the map!

I also think it bears noting that Google is pulling out all the stops to build rich web apps, no matter how weirdly they have to hack the browser to make them go. And I strongly believe that this is a trend that is here to stay -- XHTML Strict/CSS/etc be damned. At the end of the day, what really matters to users is compelling apps that let them get their work done quickly.

Sunday, January 02, 2005

DHTML Leaks Like a Sieve

Most web browsers leak memory like a bloody sieve

You heard me: Like a sieve. Gobs and gobs of memory. Necessarily. "Gee, that's funny", you might say, "My browser doesn't seem to leak noticeably". And you're probably right. However, the design of both major browsers (Internet Explorer and Mozilla) leaks memory necessarily (to be honest, I'm not sure about Safari and Opera, but it wouldn't surprise me).

Let's take a moment to think about that "necessarily" part. What I mean by this is that these browsers are not poorly implemented (I'm not going to pass judgment on that), but rather that their design leads inexorably to memory leaks. The reason you don't usually see this when browsing is that (a) most individual pages are simple enough so as not to exhibit the leak and (b) the few sites that use truly heavy JavaScript often work very hard to get around these leaks.

If you've ever built a really complex site using lots of JavaScript, you've probably seen this problem. You may even have some idea of where it comes from. Well, I'm writing this explanation because (a) I think I fully understand the problem and (b) there are a lot of confused (or simply wrong) explanations out there.

The Joy of Automatic Garbage Collection

The problem is not JavaScript. Nor is it really the DOM. It is the interface between the two. JavaScript uses (sometime after Netscape 2.0, I think) a fully garbage-collected memory allocator. For anyone unfamiliar with the term, this simply means that memory can never be truly leaked, even when objects reference each other circularly (e.g. A->B->A). Both Internet Explorer and Mozilla are built on roughly similar component models for their native object layer (i.e. the DOM). Internet Explorer uses the native Windows COM model, while Mozilla uses a very similar XPCOM model. One of the things these two models have in common is that objects allocated within them are not garbage collected -- they are reference counted. Again, for those unfamiliar with the vagaries of memory management systems, this means that objects might not get freed if they take part in a circular reference (as above).

Now the designers of these browsers have gone to some trouble to keep their COM layers (I'll refer to both as COM for simplicity) from leaking during normal usage. If you're careful, this is not too difficult -- you simply have to be vigilant about potential circular references and use various hacks to refactor them out of existence. And of course their JavaScript garbage collectors can't really leak at all. Where things start to go sour is when you have circular references that involve both JavaScript objects and COM objects. Let me use an example to illustrate this point. Let's say you have a JavaScript object 'jsComponent' with a reference to an underlying DIV. And the DIV contains a reference to the jsComponent object. It might look something like this:

var someDiv = document.getElementById('someDiv');
jsComponent.myDomObject = someDiv;
someDiv.myComponent = jsComponent;

What's wrong with this? What basically appears to happen is that jsComponent holds a reference to someDiv. In a reference-counted memory manager, this means that it has a reference count of at least 1, and thus cannot be freed. Now someDiv also holds a reference to jsComponent (because jsComponent cannot be freed if it is still accessible via someDiv, or things could go really bad). Because COM objects cannot truly participate in garbage collection, they must create a 'global' reference to myComponent (I'm not sure what the actual implementation looks like under the hood, because I haven't dug through the source for either browser, but I imagine it's similar to the semantics of Java's NewGlobalRef() JNI call). Thus begins the deadly embrace: someDiv's reference count will stay at 1 as long as jsComponent is not freed, but jsComponent will not be freed until someDiv drops its global reference to it. Game over: that memory is irretrievable without human intervention.

At this point, you might be asking yourself how common circular references of this sort really are. First, I would argue that they ought to be relatively common, because building any sort of reusable component framework in JavaScript requires this sort of structure to tie the component and DOM layers together. There are many schools of thought on this subject, though, and if that were the only problem, it wouldn't be so bad. Unfortunately, there are two very common cases that make this problem much more notable.

Event Handlers

Perhaps the most common manifestation is in DOM event handlers. Most event handlers take the form " onclick='this.doSomething()' ". This doesn't really pose a problem. However, if the event handler references a JavaScript object in any way (as in the aforementioned component scenario), then it serves as a back-reference. This is why, in many posts and articles I've read about avoiding memory leaks, the statement "don't forget to unhook all of your event handlers" is often made.
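
Using the jsComponent/someDiv example from above, that advice boils down to something like this:

// Break the element -> handler -> component links by hand before the
// page goes away, so the reference counts can fall to zero.
window.onunload = function() {
  someDiv.onclick = null;          // drop the handler reference
  someDiv.myComponent = null;      // drop the expando back-reference
  jsComponent.myDomObject = null;  // and the JS -> DOM reference
};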

Closures

A much more subtle (and therefore nasty) situation where circular references occur is in JavaScript closures. For those not familiar with the concept, a closure binds stack variables to an object created in a local scope. You may well have used them before without realizing it. For example:

function foo(buttonElement, buttonComponent) {
  buttonElement.onclick = function() {
    buttonComponent.wasClicked();
  };
}

At first glance, it may appear that this method of hooking events on a button avoids the circular reference problem. After all, the button's 'onclick' event doesn't directly reference any JavaScript object, and the button element itself contains no such reference. So how does the event find its way back to the component? The answer is that JavaScript creates a closure that wraps the anonymous function and the local variables (in this case, 'buttonElement' and 'buttonComponent'). This allows the code in the anonymous function to call buttonComponent.wasClicked().

Unfortunately, this closure is an implicitly created object that closes the circular reference chain containing both the JavaScript button component and the DOM button element. Thus, the memory leak exists here as well.
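
One common mitigation -- a sketch, and only a partial fix -- is to null out any locals the handler doesn't actually need, so that the closure's scope no longer pins them:

function foo(buttonElement, buttonComponent) {
  buttonElement.onclick = function() {
    buttonComponent.wasClicked();
  };
  // The closure captures 'buttonElement' even though the handler never
  // uses it; clearing the local breaks the element -> closure -> element
  // cycle. A cycle through buttonComponent remains if it also holds a
  // reference to the element, so unhooking on unload is still needed.
  buttonElement = null;
}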

So Now What?

Unfortunately, there really is no easy way around this problem. If you want to build complex reusable objects in JavaScript, you are probably going to have to deal with this at some point. And don't be tempted to think "It's probably not that bad; I'll just ignore it" -- neither Internet Explorer nor Mozilla frees these objects even after a page is unloaded, so the browser will just blow more and more memory. In fact, I first noticed this problem in my own work when I saw IE blowing around 150 megabytes of memory!

The only real option I know of is to be extremely careful to clean up potential circular references. This can be a little tricky if you're writing components for others to use, because you have to ensure that they call some cleanup method in the 'onunload' event. But by understanding the root cause of the problem, you at least have some hope of getting your leaks cleaned up before your users start complaining!
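
One way to keep that discipline manageable is to register components as they're created and sweep them all in a single onunload handler. A minimal sketch, with property names following the jsComponent example above:

var gLiveComponents = [];

function registerComponent(component) {
  gLiveComponents.push(component);
}

window.onunload = function() {
  for (var i = 0; i < gLiveComponents.length; i++) {
    var c = gLiveComponents[i];
    if (c.myDomObject) {
      c.myDomObject.myComponent = null;  // break the DOM -> JS edge
      c.myDomObject.onclick = null;      // and any event handlers
      c.myDomObject = null;              // break the JS -> DOM edge
    }
  }
  gLiveComponents = null;
};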

One Further Note

When I said at the beginning of this article that the design of most web browsers necessarily leaks, I was probably making too strong a statement. While their reference-counting and mark-and-sweep garbage collection implementations do not "play well" together, there is lots of research on this subject out there, and I'm sure a way could be found to fix this problem without throwing away most of the implementation.