Book Review – Scalability Rules to Live By

Having listened to Marty Abbott and Michael Fisher methodically dissect scalability bottlenecks and hash out fault isolation in large-scale enterprise systems, I picked up "Scalability Rules: 50 Principles for Scaling Web Sites" with high hopes and wasn't disappointed.

These fifty rules, some seemingly obvious and others quite specific, serve as a hybrid guide to the technical, organizational and managerial concerns of scalability in enterprise systems. Each rule is broken down into What, When to Use, How to Use, Why and Key Takeaways; I enjoyed this focus on specifics more than their earlier book, The Art of Scalability, which concentrates on people, processes and technology rather than on the rules of scaling. There is indeed some vendor-specific consultant-speak in 50 Principles for Scaling Web Sites, such as the AKF Scale Cube, but overall I found the book to be fairly technology agnostic.

The use of the term "scaling web sites" was one of the key reservations I had about the title of this book. Not all enterprise systems are web-centric, and something about scaling the middle tier and avoiding big SOA mistakes would have made a more technically accurate title, though probably not one as lucrative as "web". The rules begin with simpler maxims like Don't Over Engineer the Solution, Design Scale Into the Solution (D-I-D Process) and Simplify the Solution 3 Times Over, and then get into specifics like Reduce DNS Lookups, Reduce Objects Where Possible and Use Homogenous Networks. I found that the authors address the fallacies of distributed computing in a well-rounded fashion as they proceed into latency and boundary-crossing concerns. The advice includes distributing work by splitting reads and writes and by splitting different as well as similar things, along with horizontal scalability: Design Your Solution to Scale Out and Not Just Up, with axioms like Use Commodity Systems (Goldfish Not Thoroughbreds), a time-tested approach in most large-scale distributed clusters.
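To make the idea of splitting reads and writes concrete, here is a minimal sketch of my own, not taken from the book. It assumes one primary database and a list of read replicas; the class name `ReadWriteRouter` and the host names are hypothetical, and the read/write classification is deliberately naive.

```python
import itertools


class ReadWriteRouter:
    """Route writes to one primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._read_cycle = itertools.cycle(replicas or [primary])

    def target_for(self, sql):
        # Naive classification (an assumption for illustration):
        # any statement that starts with SELECT counts as a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_cycle) if is_read else self.primary


# Hypothetical host names, used only for the demonstration.
router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
targets = [
    router.target_for("SELECT name FROM users WHERE id = 1"),
    router.target_for("UPDATE users SET name = 'a' WHERE id = 1"),
    router.target_for("SELECT name FROM users WHERE id = 2"),
]
```

Reads rotate across the replicas while every write goes to the primary. A real deployment would also have to account for replication lag, which this sketch ignores.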

This highly recommended reading for Tech-Ops, developers and architects goes on to recommend Scaling Out Data Centers, Design to Leverage the Cloud, and the info-sec concerns of firewalls. Health monitoring is covered by Rule 16, "Actively Use Log Files", and the functional DRY principle is discussed as Don't Duplicate Your Work, which deliberates on seemingly counter-intuitive ideas like Don't Check Your Work. I am glad that some well-known (but not always well-practiced) notions like limiting redirects also made it into this chapter. Caching is heavily endorsed in chapter 6, while chapter 7 takes a solemn note on learning from mistakes and includes one of my favorite rules, "Failing to Design for Rollback Is Designing for Failure".

This ~250 page book is divided into 13 chapters and adheres to Martin Fowler's cover-to-cover reading constraints. Chapter 8 deliberates on databases and examines relational integrity, the cost of foreign key constraints, the Right Type of Database Locks, Multi-phase Commits and avoidance of "Select for Update". Interestingly, it also contains fillers like Rule 35, Don't Select Everything, which was surprising, though the SQL snippets throughout this chapter perhaps try to make up for missing specifics like snapshot isolation. Chapter 9 is focused on Fault Tolerance Design and Graceful Failures, with tips on graceful degradation and fault isolation. Rules include Design Using Fault Isolative "Swim Lanes", Never Trust Single Points of Failure (identify them), Avoid Putting Systems in Series and Ensure You Can Wire On and Off Functions. The next chapter then touches on the scalability Achilles heel, the one and only infamous functional arch-nemesis: "state".
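As an illustration of wiring functions on and off with graceful degradation, here is a small sketch of my own rather than an example from the book. The names `FeatureSwitches`, `render_product_page` and the `recommendations` flag are all hypothetical; the idea is only that a non-essential feature can be disabled without taking the whole page down.

```python
class FeatureSwitches:
    """In-memory on/off switches; a real system would load these from shared config."""

    def __init__(self, **flags):
        self._flags = dict(flags)

    def is_on(self, name):
        # Unknown features are treated as off.
        return self._flags.get(name, False)

    def turn_off(self, name):
        self._flags[name] = False


def render_product_page(product, switches, fetch_recommendations):
    """Build the page; recommendations are optional and may be wired off."""
    page = {"product": product, "recommendations": []}
    if switches.is_on("recommendations"):
        try:
            page["recommendations"] = fetch_recommendations(product)
        except Exception:
            # Degrade gracefully: serve the core page and wire the feature off
            # so later requests skip the failing dependency.
            switches.turn_off("recommendations")
    return page


def broken_recommender(product):
    # Stand-in for a failing downstream service.
    raise TimeoutError("recommendation service unavailable")


switches = FeatureSwitches(recommendations=True)
page = render_product_page("book-42", switches, broken_recommender)
```

After the failed call the page still renders with an empty recommendation list, and the switch stays off until someone turns it back on.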

Having worked with various large-scale clients (which we are reminded of throughout the book), the authors at AKF Partners have achieved a level of understanding and insight into system bottlenecks that is evident in their writing. Chapter 10 deals with avoiding "state", starting with Rule 40, "Strive for Statelessness". Next they recommend Maintain Sessions in the Browser When Possible and Make Use of a Distributed Cache for States, guidance which applies to web service design just as it does to the UI layer for avoiding state gotchas. Chapter 11 delves into Asynchronous Communication and Message Buses, where it promotes asynchronous communication and message bus scalability with the AKF Scale Cube for message buses (surprise!). However, rules like Avoid Overcrowding Your Message Bus leave the reader wanting more concrete examples than generalities like "Physical fitness, for example, if taken to an extreme over long periods of time can actually depress the immune system of the body...". The authors then continue to a Miscellaneous Rules bucket with items like Be Wary of Scaling Through 3rd Parties; Purge, Archive, and Cost-justify Storage; Remove Business Intelligence from Transaction Processing (which should have been a database rule); and Design Your Application to Be Monitored, which could be merged with Rule 16.
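One common way to keep session state in the browser is a signed cookie, so that any stateless server can verify it without a shared session store. The following is a minimal sketch of my own under that assumption, not code from the book; `SECRET_KEY`, `encode_session` and `decode_session` are hypothetical names, and the cookie is signed but not encrypted, so it should not hold secrets.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key for the demonstration; a real deployment would load it from config.
SECRET_KEY = b"demo-secret-change-me"


def encode_session(data):
    """Serialize session data and append an HMAC-SHA256 signature."""
    payload = base64.urlsafe_b64encode(json.dumps(data, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def decode_session(cookie):
    """Return the session data, or None if the signature does not match."""
    payload, _, signature = cookie.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = encode_session({"user": "u1", "cart": [3, 7]})
restored = decode_session(cookie)
tampered = decode_session("x" + cookie[1:])
```

Because the server keeps nothing between requests, any node holding the key can serve any user, which is what makes the approach friendly to horizontal scaling.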

Chapter 13, Rule Review and Prioritization, is an overview of the rules and provides a great summary and revision of what has been discussed. Additionally, each chapter concludes with a summary and endnotes containing a significant number of references for further reading.

If, like me, you are looking for an intelligent, practical and perceptive guide/refresher for designing and building scalable systems, these 50 rules should be your desktop companion.