Spying On Digg Spy

So by now I’m sure anyone reading here knows I’m a HUGE fan of all things AJAX. That said, my true love is for the sites and apps that use it to add that extra special touch. Digg Spy is a great example of this.

For those of you who may not know, Digg Spy is a real-time view of the actions that occur throughout Digg: story submissions, diggs, comments, and reporting of stories. In Digg Spy, these actions are displayed in a list that refreshes itself every few seconds, giving the user an overview of everything happening on Digg in real time.

Here we’re going to look inside Digg Spy and explain its inner workings: starting with what’s going on behind the scenes, then inferring and designing a probable database schema, and finally coming up with a possible implementation of the server-side and client-side code needed to create something similar.

We’re going to focus on JavaScript, thus concentrating on the client-side portion of the application, and leave the back-end portion for a future article. Keep reading for the tutorial.

The Tools

As always, the best starting place is the Prototype JS library. The browser will be Mozilla Firefox with the Firebug extension installed (a great way to inspect the AJAX requests being made by Digg).

Getting The Scoop

First off, we need to find out what’s going on behind the scenes of Digg Spy.

Thanks to our trusty Firebug, we will start by eavesdropping on the communication between client and server. This will give us a better idea of what’s in store for us.

Eavesdropping AJAX

Head over to the Digg Spy page, open your Firebug console (remember to enable AJAX tracing by flagging Show XMLHttpRequests in the Firebug Options menu) and wait a few seconds. You should see a series of AJAX requests sent to the Digg.com server at regular intervals.

The requested page (http://digg.com/spy_update) responds to the client with a list of the 26 most recent actions that occurred on Digg. The updates don’t start immediately because the page loads with a few hidden entries that serve as an initial cache.

If you look closely at the URL in Firebug you should see something similar to this:

http://digg.com/spy_update?timestamp=1151659811&showtop=2&showitems=1&showdiggs=1&showburies=1&showcomments=1&showtop=2&maxitems=26

As you can see, the callback page accepts quite a few GET parameters that are used to filter the response. The names are pretty straightforward, but the ones we are interested in are timestamp and maxitems. The timestamp value represents the time of the request expressed in seconds since the Unix Epoch (1 January 1970), while maxitems indicates the maximum number of entries the server should return.
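For reference, a Unix timestamp like the one in the URL can be computed in JavaScript with a single line:

```javascript
// Current Unix timestamp: whole seconds elapsed since
// 1970-01-01 00:00:00 UTC
var timestamp = Math.floor(new Date().getTime() / 1000);
```

We will need exactly this value later, when our client builds its own requests.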

If you paste the requested URL into your browser you will discover that the data returned by the server is JSON: more specifically, an array of Digg Spy items.

In the returned code, every item in the array is an object that looks more or less like the following:

{
  "type":"dig",
  "itemid":"738265",
  "uid":"bomg31",
  "date":"2006-11-25 01:14:05",
  "timestamp":"1154333645",
  "plk":"http://digg.com/programming/How_NOT_To_Use_AJAX",
  "title":"How NOT To Use AJAX",
  "url":"http://migraineheartache.com/wp/2006/11/30/how-not-to-use-ajax/",
  "dig_count":"11",
  "area":"Upcoming",
  "userimage":"/userimages/bomg31/small.jpg"
}

Object literal representing an entry for Digg Spy

This is an object that represents a Digg Spy entry, expressed as a JavaScript object literal. The fields in the object are pretty self explanatory, and they quite simply map to the information displayed in a row of the Digg Spy.
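To make this concrete, here is how such an entry might be consumed in code. JSON.parse, available natively in modern browsers, is shown here as a safer alternative to eval; the field values are taken from the sample above, trimmed to a few properties:

```javascript
// Parse the raw response text into an object and read a few fields
var responseText = '{"type":"dig","itemid":"738265","uid":"bomg31",' +
  '"title":"How NOT To Use AJAX","dig_count":"11","area":"Upcoming"}';
var entry = JSON.parse(responseText);

// Build a human-readable summary from the entry's fields
var summary = entry.uid + ' dugg "' + entry.title + '" (' +
  entry.dig_count + ' diggs, ' + entry.area + ')';
```

Note that numeric-looking values such as dig_count arrive as strings, so they need converting before any arithmetic.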

The Digg Spy Database Schema

Now that we know what data feeds Digg Spy we can attempt to infer the schema of the database that runs behind it (well, at least a simplified version of it.)

Common sense dictates that Digg Spy probably relies on a database table (or view) that works as a temporary buffer for the latest actions that occur on the website. The table is probably very simple, a dozen columns at most, and it’s updated on every INSERT or UPDATE that occurs in other tables of interest, for example the Stories or Comments tables. The rows are probably inserted by some sort of trigger (see Triggers in SQL Server or MySQL for more information) or by a common procedure called from the application code. I’ll let you choose the most appropriate solution based on your needs.

The Server Side

We can now proceed to the server-side code that returns the JSON string needed to operate our implementation of Digg Spy.

To do this we are going to create a simple page that outputs a list of recent actions just like Digg’s spy_update. We are going to accept two parameters via the GET querystring, maxitems and timestamp, and use them to generate the SQL query needed to fetch the data from the database. Here is some pseudo-code:

// Set the variables to the values sent in the GET querystring
var entryCount = GET["maxitems"]
var timestamp = GET["timestamp"]
// Validate the values to avoid SQL injection attacks
if entryCount and timestamp are valid
    var query = "SELECT TOP " + entryCount + " * FROM Actions WHERE timestamp > " + timestamp

    var results = execute the query
    for each row in results
        generate JSON for the row
    end for
end if

Pseudocode that generates the JSON response
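Assuming a JavaScript-capable server environment, the pseudo-code could be fleshed out roughly as follows. The Actions table is stood in for by an in-memory array, and all names here are illustrative, not Digg’s actual schema:

```javascript
// In-memory stand-in for the Actions table
var actions = [
  { who: 'bigdogg',   timestamp: 1154333645, what: 'Commented' },
  { who: 'spidercat', timestamp: 1154333650, what: 'Voted' },
  { who: 'funnypig',  timestamp: 1154333660, what: 'Commented' }
];

// Validate the GET parameters, then build the JSON response
function spyUpdate(maxitems, timestamp) {
  var entryCount = parseInt(maxitems, 10);
  var since = parseInt(timestamp, 10);
  // Reject non-numeric input instead of splicing it into a query
  if (isNaN(entryCount) || isNaN(since)) return null;

  // Keep only the actions newer than the client's timestamp
  var rows = [];
  for (var i = 0; i < actions.length; i++) {
    if (actions[i].timestamp > since) rows.push(actions[i]);
  }
  // Cap the result at entryCount rows
  rows = rows.slice(0, entryCount);
  return JSON.stringify({ entries: rows });
}
```

In a real implementation the filtering and capping would of course happen in the SQL query itself, as the pseudo-code shows.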

The JSON response generated in the loop is similar to the one we looked at while inspecting Digg’s spy_update response: an object literal containing an array of all the records returned by the query, like so (note: each element in the array is an object):

{entries: [
    {who:"bigdogg",when:"2006-07-30",where:"Homepage",what:"Commented",url:"http://migraineheartache.com"},
    {who:"spidercat",when:"2006-07-31",where:"Post",what:"Voted",url:"http://migraineheartache.com/wp/2006/11/30/how-not-to-use-ajax/"},
    {who:"funnypig",when:"2006-07-31",where:"Post",what:"Commented",url:"http://migraineheartache.com/wp/2006/11/30/how-not-to-use-ajax/"}
]}

Three entries for our Digg Spy clone

The properties in each object of the array depend on the data you decided to store in the Actions table.

That’s it; this very simple script is all we need to feed the needed data to the client side JavaScript.

The Client side

The bulk of the action takes place in the client, and this is the part we are going to analyze in a little more detail.

Now that we have a server side endpoint that provides us with fresh and up to date data on the actions, all we have to do is fetch it with AJAX and display it to the client.

The first thing to do is create a function that periodically calls the server-side page. We could implement the logic from scratch, but the Prototype library comes to our aid with PeriodicalExecuter, an object that provides the logic for calling a given function repeatedly at a given interval.

The following JavaScript code wires the function that makes the AJAX request to the PeriodicalExecuter.

var requestSpyData = function() {
  // The URL of the server-side processing script
  var url = '/spy_page';

  // The seed is required to avoid client-side caching
  var pars = 'seed=' + Math.random();

  // The number of entries to be requested
  pars += '&maxitems=' + maxEntries;

  // The current Unix timestamp (seconds since the Epoch, UTC)
  var timestamp = Math.floor(new Date().getTime() / 1000);
  pars += '&timestamp=' + timestamp;

  // Make the asynchronous request to the server and
  // execute the processResponse method on success
  currentRequest = new Ajax.Request(url, {
    method: 'get',
    parameters: pars,
    onSuccess: processResponse
  });
};
new PeriodicalExecuter(requestSpyData, updateInterval);

Requesting periodic updates from the server

When we call the Ajax.Request constructor we store a reference to the request instance so we can cancel it if needed (for example with a stop or pause button).
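PeriodicalExecuter handles the scheduling for us, but the underlying start/stop mechanics are worth seeing in plain JavaScript. This is a minimal, library-free sketch, not Prototype’s actual implementation:

```javascript
// A minimal polling controller: calls `callback` every
// `intervalSeconds` seconds until stop() is invoked
function Poller(callback, intervalSeconds) {
  this.callback = callback;
  this.interval = intervalSeconds * 1000;
  this.timerId = null;
}

Poller.prototype.start = function() {
  if (this.timerId !== null) return; // already running
  var self = this;
  this.timerId = setInterval(function() { self.callback(); }, this.interval);
};

// This is what a "pause" button would be wired to
Poller.prototype.stop = function() {
  if (this.timerId !== null) {
    clearInterval(this.timerId);
    this.timerId = null;
  }
};
```

Something like `new Poller(requestSpyData, updateInterval).start()` would then replace the PeriodicalExecuter line above.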

We now have to implement the processResponse function, the core of our Digg Spy implementation. This function takes the JSON response from the server, parses it, and injects the fetched entries into the designated <ul> list.

var processResponse = function(response) {
  // Evaluate the JSON response
  try {
    var result = eval('(' + response.responseText + ')');
  } catch(ex) {
    // Handle the error gracefully
    return;
  }

  // I use an array (buffer) to store the XHTML code and inject it into the
  // DOM element with innerHTML because it's way faster than DOM manipulation.
  // array.push() is also faster than plain string concatenation.
  var buffer = [];

  // Generate the list from scratch
  for (var i = 0; i < result.entries.length; i++) {
    // Generate the XHTML for every row and push it onto the buffer
  }

  // Get a reference to the list element that contains the spy
  // rows and inject the generated XHTML into it
  var spyElement = $(listElementId);
  spyElement.innerHTML = buffer.join('');
};

Processing the response and generating the list
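The row-generation step left as a comment above could look like the following. The markup and field names are illustrative, matching the sample entries from our own server sketch rather than Digg’s actual markup:

```javascript
// Turn an array of entry objects into a single XHTML string,
// one <li> per entry, using an array buffer and a single join
function buildRows(entries) {
  var buffer = [];
  for (var i = 0; i < entries.length; i++) {
    var e = entries[i];
    buffer.push('<li>' + e.who + ' ' + e.what + ' on ' + e.where + '</li>');
  }
  return buffer.join('');
}
```

A real implementation would also HTML-escape the field values before concatenating them into markup.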

One thing to notice in this code is that the list is regenerated from scratch on every update. This is not optimal: every refresh discards the previously displayed entries and shows only the latest batch, which is especially noticeable when a poll returns few or no new updates.

We can solve this problem by adding the following branch condition to the processResponse function:

// Check if the list of entries already contains elements
if ($(listElementId) && $(listElementId).hasChildNodes()) {
  // Reference the list
  var list = $(listElementId);
  // Clean any empty text nodes
  Element.cleanWhitespace(list);
  // Remove the last X nodes from the list, where X is
  // equal to the number of entries received from the server
  var lastNodeIndex = list.childNodes.length - 1;
  var nodesCount = Math.min(list.childNodes.length, result.entries.length);
  var firstNodeIndex = list.childNodes.length - nodesCount;
  for (var i = lastNodeIndex; i >= firstNodeIndex; i--) {
    list.removeChild(list.childNodes[i]);
  }
  // Generate only the new rows and prepend them to the list
  for (var i = result.entries.length - 1; i >= 0; i--) {

  }
} else {
  // The list is empty, generate the list from scratch
}

Updating only the most recent actions
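The node-trimming arithmetic is easy to get wrong, so here it is isolated as a pure function (the function name is mine):

```javascript
// Given the current list length and the number of incoming entries,
// compute which trailing nodes to drop so that, once the new rows are
// prepended, the list keeps a constant length
function nodesToRemove(listLength, incomingCount) {
  var count = Math.min(listLength, incomingCount);
  return {
    first: listLength - count, // index of the first node to remove
    last: listLength - 1,      // index of the last node to remove
    count: count
  };
}
```

For a full list of 26 rows receiving 3 new entries, this removes the nodes at indices 23 through 25, exactly what the loop above does.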

A simple demo

I hope this general outline was enough to get you started. If you want to take a look at a simple client-side implementation check out the following JavaScript file.
