Following the traditional outsourcing mantra, Google should be focusing on its core competencies while outsourcing as much of its base infrastructure as it possibly can to vendors. It turns out that the company actually does exactly the opposite. A few weeks ago, Luiz André Barroso and Urs Hölzle of Google published a very interesting piece: The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines. Here is a quote [emphasis mine]:
Buy vs Build:
Traditional IT infrastructure makes heavy use of third-party software components such as databases and system management software, and concentrates on creating software that is specific to the particular business where it adds direct value to the product offering, for example, as business logic on top of application servers and database engines. Large-scale Internet services providers such as Google usually take a different approach in which both application-specific logic and much of the cluster-level infrastructure software is written in-house. Platform-level software does make use of third-party components, but these tend to be open-source code that can be modified inhouse as needed. As a result, more of the entire software stack is under the control of the service developer.
This approach adds significant software development and maintenance work but can provide important benefits in flexibility and cost efficiency. Flexibility is important when critical functionality or performance bugs must be addressed, allowing a quick turn-around time for bug fixes at all levels. It is also extremely advantageous when facing complex system problems because it provides several options for addressing them.
This is an interesting approach that works only if you have very talented individuals available to do the design work. Infrastructure as a competitive advantage does not mean throwing money at the problem – if it were that simple, startups competing in fields with well-capitalized competitors would never have gotten off the ground. The entrenched incumbents would have crushed them by virtue of their superior capital resources. To make infrastructure work for you, rather than be a drain on your finances and focus, you need the very best people, who can turn the bits and pieces into a well-tuned engine that enables you to do things that others simply cannot do. Joel Spolsky did some research around this problem, which he summarized on Joel on Software. His central thesis is summed up by the following quote:
The Creative Zen team could spend years refining their ugly iPod knockoffs and never produce as beautiful, satisfying, and elegant a player as the Apple iPod. And they’re not going to make a dent in Apple’s market share because the magical design talent is just not there. They don’t have it.
The mediocre talent just never hits the high notes that the top talent hits all the time.
As an engineer/technical manager with some of the world’s largest networks under my belt, I’ve repeatedly seen what Joel said proven out in practice, and I will be naming names, because it helps to be specific. At UUNET, I worked with people like Juzer, Najam, Bill Barns, Louie Mamakos, Parantap, Andrew Partan, Joe Malcolm, Mike O’Dell, Tim Smith et al. When I was at AOL, I had the privilege of working with folks like Hung Le, Dr. Wu, Rich Colella, Mark Muehl, John Schanz, JR Mitchell, Girija, et al. At my current gig, I can’t even write down the full set of people before this becomes too long – folks like Warren, Jon, Bikash, Eiichi, Paul S, Paul G, Stephen S, Beck, Steve P and W, Sergei, Nicolas G, Johnny J and the rest of the folks on the infrastructure and operations teams. At Level (3), a company I’ve never worked for, but whose engineering and architecture folks I am familiar with, there are engineers like Shane Amante, Nasser, Tozz, Scott Madley, Epperson, Dr. Gibbings etc. These engineers (I am naming a few representative examples) have repeatedly done things that the vast majority of people I am familiar with in the networking world simply could not do. They have done work that, in design and execution, would be beyond normal people.
This is not an exhaustive list of networking talent; it would be the height of hubris to think that this is the entirety of the talent pool. But what are the chances of individuals of that level working on your network? Slim at best. Best-in-the-world infrastructure needs best-in-the-world people. There is simply no way around that. For companies that are in the telecom space, hiring second-rate people will get you third-rate networks. If you are going to compete on the basis of your infrastructure, you should be able to back that up with the appropriate people. To back it up with appropriate people, the executive management needs to read and understand what this is about. If they don’t, someone should send them a brief on Quark, which Joel mentions in the post I quoted earlier.