Google Scalability Conference Report
While the travel to Seattle was horrible (missed my connection and got stuck at Washington Dullas for one night. Mental note: never ever Lufthansa again!) the conference and city after-all was quite nice. Not that I had much time for exploring Seattle – but it looked beautiful while I was hasting through my too-see list in a few hours. (So green!) I’ve not even had the time to meet up with some Apache folks over there. (Sorry, Henry!) The conference itself was well organized (what else would you expect from Google) and covered some quite interesting topics. Unfortunately it was not as technical as one would have hoped. Still there were some lessons learned that surely should be kept in mind while designing large scale systems. The session are supposed to be available on google video/youtube at some stage. So far I could only find one of them.
The sessions from Google itself where mostly introductions to MapReduce and BigTable. They were interesting but no big surprises. They did even talk briefly about hadoop as an open source implementation. Unfortunately there is no comparison between hadoop and the Google in-house implementation yet …though they agreed that it might be interesting to see the results. A little annoying was that every remotely technical question was answered with “Sorry, can’t talk about it”. What I found most amazing though is that according to them all Google developers have access to the production machines.
The session about the Lustre File System was quite interesting. Turns out it is one of the top 5 clustered file systems. It’s full posix compliant and supports cross-site deployments. People run it in highly critical installations. Throughput they achieve is quite impressive. Setting up lustre properly can take some time though. That’s where the company behind lustre provides consulting for.
A session about test selection talked about scaling with the number of test that gets accumulated over the releases. Running all test for a release will take longer and longer with the increasing number of tests. In order minimize risk for emergency releases they developed a technique to statistically map code changes to testcases and requirements. So based on testing history they can then select certain tests and still are quite confident about the QA status. Not sure I really buy into this though. ;)
For me the most interesting session was indeed the session from the Amazon guys about infinite data storage. They emphasized on that if something can go wrong – it will go wrong. And one should plan for that. According to them you only truly scale with more data centers across the world. And you gotta be prepared what happens if a data center just dies! Not just is unreachable – but dies. They had a couple of approaches until they finally arrived where they are now. From mainframes over distributed databases. It all just didn’t work. Especially they were bashing distributed databases. According to them it just happens that nodes run out of sync. For some bizarre reasons – but they do. And then it gets messy. Two-phase-commit was almost a “don’t say that word” for them. They came up with their own system as they believe there was (and still is) no product one can buy or open sources that fits their needs yet. Their contract is that every page (even with one data center down) should be delivered within less than 100ms. A machine or even a data center down should not even be noticeable to the user. While reads are easier to scale writes are not. Still they require their writes (e.g. to the shopping card) to return without any blocking. Their call was that versioning is the key to true scalability in this area. A quorum based process makes sure nodes get synchronized properly across the cluster. On application level they handle merge conflicts if there really are. They have nothing what one would call a master. Any machine can go down – at any time. Load is balanced and data partitioned via consistent hashing. They did not go down the low-end machine route like Google did – but no big investments into machines either. Again developers have access to the production environment and are responsible for the contracts (like the 100ms). Definitely worth a watch once it’s up on Google video.
Unfortunately I missed the session from the YouTube guys. So I hope the rest of the sessions will be online soon too. I’ll give an update here once I found out.
There are a couple of nice write-ups of the conference as mentioned here.
Update: Sylvain found them. Definitely worth watching the YouTube one. If you don’t know what MapReduce/BigTable is about also check out the Google talks.
- MapReduce, BigTable, and Other Distributed System Abstractions for Handling Large Datasets (Google)
- Scaling Google for every user (Google)
- SCTP’s Reliability and Fault Tolerance (University of British Columbia)
- Using MapReduce on Large Geographic Datasets (Google)
- YouTube Scalability (YouTube)
- Building a Scalable Resource Management System for Grid Computing (Platform Computing Corp.)
- Lessons in Building Scalable Systems (Google)
- Lustre File System (Lustre)
- Challenges in Building an Infinite Scalable Datastore (Amazon)
- Scalable Test Selection Using Source Code Deltas (Symantec)
- VeriSign’s Global DNS Infrastructure (Verisign)