Bounty -- Babynames

This site takes 150 MILLION HITS a month, is powered by Drupal and just keeps on going! If you ever had any doubts about Drupal being able to take a serious load, then fear no more. Coupled with the Boost! module, some good discipline about getting the MySQL servers to do as much of the work as possible, and a rigorously enforced pre-render strategy, this site is more than able to deal with large numbers of users, from an average day to a spike at tea-time when the adverts come on the telly!

Our role on this site was Lead PHP Developer, and boy was it enjoyable. There were so many things going on here that it's hard to know where to start.

From a request-response point of view, we used the Boost! module to speed up as much of the anonymous user content as possible. If you haven't heard of Boost! then go check it out. In a nutshell, Drupal writes each rendered page out as a static HTML file, and Apache mod_rewrite rules serve that file directly on later requests without touching PHP. If a request comes in and the cached page doesn't exist, it gets created. If the cached page exists, back it goes! For more details see the Boost! home page.
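
The check-then-create behaviour described above can be sketched in a few lines. This is only an illustration in Python, not how Boost! itself is written: the names `cache_path` and `serve`, the `.html` naming scheme and the `render` callback are all assumptions made for the example, and a real deployment would also have to reject hostile paths and expire stale files.

```python
from pathlib import Path


def cache_path(cache_root: Path, url_path: str) -> Path:
    """Map a URL path such as '/names/eric' to a file under cache_root.

    The '<path>.html' layout is an assumption for this sketch.
    """
    relative = url_path.strip("/") or "index"
    return cache_root / (relative + ".html")


def serve(cache_root: Path, url_path: str, render):
    """Return (html, was_cache_hit) for an anonymous request.

    `render` stands in for the expensive full page build.
    """
    target = cache_path(cache_root, url_path)
    if target.exists():
        # Cached page exists: back it goes, no rendering needed.
        return target.read_text(encoding="utf-8"), True
    # Cached page is missing: build it once and store it for next time.
    html = render(url_path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return html, False
```

The first request for a path pays the full rendering cost; every later request for the same path is a plain file read, which is the part a web server can do very cheaply.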

For those times when Drupal does get asked to do something, we also used memcached to keep as much of the big low-hanging fruity things in RAM for extra slick response times. The memcached homepage will tell you more. I tried to get them to use CouchDB, having used Erlang, but from a knowledge-pool perspective it's a lot easier to find PHP guys that have used memcached. You win some, you lose some.
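
The usual way to keep expensive results in RAM is the "look in the cache first, compute only on a miss" pattern. The sketch below shows that pattern in Python under stated assumptions: `TinyCache` is a hypothetical in-process stand-in so the example runs without a memcached server, and `cached`, the key names and the 300 second expiry are invented for illustration rather than taken from the site.

```python
import time


class TinyCache:
    """In-process stand-in for a memcached client (get/set with expiry)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            # Entry is too old: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else time.monotonic() + ttl
        self._store[key] = (value, expires_at)


def cached(cache, key, compute, ttl=300):
    """Return the cached value for key, calling compute() only on a miss."""
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, ttl)
    return value
```

Note that this simple version treats a stored `None` as a miss, so results that can legitimately be empty need a sentinel value instead.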

For my own part, I was responsible for the design and implementation of the name basket, the name charts (graphs), and lots of other ancillary stuff on the site. For example, on this page, the name Eric shows a heat map and a bar chart on the lower right-hand side. There's a story there... we had a CSV file given to us from an AS400 system that had about ten million rows of data! Seriously. It contained every name for every region for the last fifty years that Bounty have ever collected!

I took this data and wrote a MySQL stored procedure that boiled it down into a set of data for each name. Then, using the (somewhat limited) file handling functions available, I generated about 5000 files that I called '.json' files. These contain the actual chart data as a JSON object. When the browser renders the page, it makes an AJAX call back to the server for the relevant file and uses JSChart to display it. If the MySQL servers had to do what the stored procedure did on demand, I think the site would last about ten seconds at peak time!
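
To make the pre-render idea concrete, here is a rough sketch of the same boil-down step written in Python rather than as a MySQL stored procedure, so it is a swapped-in technique and not the code that ran on the site. The column names (`name`, `region`, `year`, `count`), the JSON layout and the function names are all assumptions; the real AS400 export and chart format are not shown here.

```python
import json
from collections import defaultdict
from pathlib import Path


def aggregate(rows):
    """Collapse (name, region, year, count) rows into one summary per name."""
    charts = defaultdict(
        lambda: {"by_year": defaultdict(int), "by_region": defaultdict(int)}
    )
    for row in rows:
        count = int(row["count"])
        chart = charts[row["name"]]
        chart["by_year"][row["year"]] += count
        chart["by_region"][row["region"]] += count
    return charts


def write_chart_files(charts, out_dir: Path) -> int:
    """Write one '<name>.json' file per name and return how many were written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, chart in charts.items():
        payload = {
            "name": name,
            "by_year": dict(sorted(chart["by_year"].items())),
            "by_region": dict(sorted(chart["by_region"].items())),
        }
        target = out_dir / (name.lower() + ".json")
        target.write_text(json.dumps(payload), encoding="utf-8")
    return len(charts)
```

Feeding `aggregate` from `csv.DictReader` streams the input one row at a time, so only the per-name totals are held in memory. The heavy lifting happens once, offline, and at request time the server only has to hand back a small static file.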

All in all a most enjoyable project.