Innovation and Outsourcing

October 13, 2009

Risk:

The CEO of Air New Zealand had this to say about their supplier:

“We were left high and dry and this is simply unacceptable. My expectations of IBM were far higher than the amateur results that were delivered yesterday, and I have been left with no option but to ask the IT team to review the full range of options available to us to ensure we have an IT supplier whom we have confidence in and one who understands and is fully committed to our business and the needs of our customers.”

Reward:

Fake Steve Jobs had this to say:

See, those outsourcing deals always sounded so good: Why do you want to run a messy old data center anyway? We can do it for less than it costs you to do it yourself, and you can focus on your real core competence, which is running an airline.
Except, um, no. An airline’s core competence is running computers. I mean, think about it. Duh

Thing is, these guys did think about it. They knew the deal, but they did it anyway. You know why? Because they got to take a bunch of assets off their balance sheet and send a few hundred IT employees to IBM. It was an accounting maneuver, a way to dress up their financial reports, and it was especially appealing to weak companies. IBM takes your data center off your hands — and in some cases even pays you some money — and then sells it back to you as a service over the next decade.

If you are outsourcing, not only is your cost advantage going to go away, there are some things that you are never going to be able to do. One can argue that it would make the most sense for someone like Google to focus on their core competency and not waste time building servers. But not only are they building servers, the fact that they treated it as a core competency allowed them to optimize the whole system – including on-board batteries, which enabled datacenters without centralized UPSes.

People define core competencies far too narrowly. It is not simply that Google chose to view building servers as a core competency; it is that they saw controlling their infrastructure destiny as an enabler of everything else they do, and so took it on as a core competency.

Those leaps of innovation are just not going to happen if you are focusing on your “core competencies” while letting others build your infrastructure. It can be argued that servers only become a core competency at Google’s scale – no one is going to argue that if you need a thousand servers you are better off building your own rather than running a reverse auction – but a global service provider is not building a thousand servers; it is, in fact, working on its core competency, a point that is not as obvious as it perhaps should be. How are you going to avoid being a dumb pipe if you can’t even control your own infrastructure at scale?

Edit: Benjamin Black added clarification


Infrastructure is software

July 22, 2009

In an earlier post I mentioned that “cloud is software.” Thinking about it some more, I believe the statement can be generalized to “Infrastructure is software.” This is a bit different from how people have traditionally viewed it – Internet infrastructure is seen as pipes, disks, CPUs, and data centers: the collection of physical units that provide transport, storage, and compute, and the buildings that house them. My thesis is that those are necessary but not sufficient to be considered infrastructure. Those elements, in and of themselves, are just so much sunk capital. To make efficient use of them you need the right provisioning APIs, monitoring, billing, and software primitives that abstract away the underlying systems, allowing a decoupling between the various technological and business imperatives so that each layer can evolve independently within its own scaling domain (within reason – if you are writing ultra-high-performance code, you will know the difference if you get instantiated on an Opteron rather than a Nehalem cluster).
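
To make the thesis a bit more tangible, here is a minimal sketch – the names, classes, and collaborators are all hypothetical, not any particular provider’s API – of the kind of provisioning primitive that turns raw capacity into infrastructure: callers ask for abstract resources, metering feeds billing, and nobody downstream cares which rack or CPU generation actually backs the request.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    cores: int
    memory_gb: int
    billing_account: str


class ComputeProvisioner:
    """Hides physical capacity behind provision/release calls and wires
    every allocation into metering so billing falls out automatically."""

    def __init__(self, inventory, metering):
        self.inventory = inventory   # authoritative database of physical capacity
        self.metering = metering     # usage records that feed the billing system

    def provision(self, account: str, cores: int, memory_gb: int) -> Instance:
        # Reserve capacity somewhere; the caller never learns which host.
        host = self.inventory.reserve(cores=cores, memory_gb=memory_gb)
        instance = Instance(host.boot(cores, memory_gb), cores, memory_gb, account)
        self.metering.start(instance.instance_id, account)
        return instance

    def release(self, instance: Instance) -> None:
        self.metering.stop(instance.instance_id)
        self.inventory.free(instance.instance_id)
```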

Let’s make this concrete and think about how the above can inform the building and operation of a global service provider that has a large network, with datacenters used for a cloud computing business – for example, a large telecommunications company that wants to offer enterprise cloud computing among a suite of services.

Basic Axioms

All things come down to the fundamental problem of mapping demand onto a set of lower level constraints. For a telecom company, constraints at the lowest level consist of:

  1. Fiber topology (or path/Right of Ways)
  2. Forwarding capacity
  3. Power & Space
  4. Follow The Money (FTM)

Everything else is an abstraction of the above constraints. That is the good news. The bad news: everyone has the same constraints. There are no special routers available to you and not to others, and the speed of light is constant (modulo the refractive index of the fiber in your physical plant). So how do you differentiate yourself? Fortunately, the ways to differentiate are also simple:

  • Latency
  • Cost (note I did not use price for a reason)
  • Open Networks
  • Rich connectivity
  • OSS/NMS

Latency

The impact of latency on the business has been well documented. Some excerpts from Velocity 2009:

Eric Schurman (Bing) and Jake Brutlag (Google Search) co-presented results from latency experiments conducted independently on each site. Bing found that a 2 second slowdown changed queries/user by -1.8% and revenue/user by -4.3%. Google Search found that a 400 millisecond delay resulted in a -0.59% change in searches/user. What’s more, even after the delay was removed, these users still had -0.21% fewer searches, indicating that a slower user experience affects long term behavior. (video, slides)

Phil Dixon, from Shopzilla, had the most takeaway statistics about the impact of performance on the bottom line. A year-long performance redesign resulted in a 5 second speed up (from ~7 seconds to ~2 seconds). This resulted in a 25% increase in page views, a 7-12% increase in revenue, and a 50% reduction in hardware. This last point shows the win-win of performance improvements, increasing revenue while driving down operating costs. (video, slides)

If you want to get into the cloud computing business, you will have to build your network and interconnection strategy to minimize latency. Your customers’ bottom line is at stake here, and by extension, so is your datacenter division’s P&L.

Cost

Sean Doran wrote, “People that survive will be able to build a network at the lowest cost commensurate with their SLA.” He forgot to add: in a competitive market. Assuming you are going up against competition, this should be fairly self-evident: efficiency and razor-thin margins. The killer app is bandwidth, and this means emulating Walmart™: learn to survive on 10% margins or lower. At those margins, your OSS/NMS is a competitive advantage. Every manual touch point in the business – every support call for a delayed order, every failure in provisioning, every salesperson who sells a service that can’t be provisioned properly – nibbles at the margin. Software that can provision the network, enable fast turn-up, and handle proper accounting and auditing is the key.
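
To make the margin math concrete, here is a toy calculation – every number in it is invented purely for illustration – showing how quickly manual touch points eat a 10% margin:

```python
monthly_price = 1000.00        # hypothetical monthly price for a port/service
margin = 0.10                  # the thin, Walmart-style margin discussed above
monthly_profit = monthly_price * margin               # $100 of profit per month

cost_per_manual_touch = 40.00  # hypothetical fully loaded cost of one manual fix
touches_to_erase_profit = monthly_profit / cost_per_manual_touch

print(f"profit per month:             ${monthly_profit:.2f}")
print(f"manual touches that erase it: {touches_to_erase_profit:.1f}")
# Roughly 2.5 manual interventions a month and the service is under water,
# which is why the OSS/NMS and automated provisioning are the real moat.
```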

And we react with great caution to suggestions that our poor businesses can be restored to satisfactory profitability by major capital expenditures.  (The projections will be dazzling – the advocates will be sincere – but, in the end, major additional investment in a terrible industry usually is about as rewarding as struggling in quicksand.)
-Warren Buffett

Efficiency also means fewer operational issues. Couple an ever-increasing number of network elements with an ever-growing mass of policy and you start to lose any semblance of troubleshooting and operational simplicity. Does the network pass the 3 AM on-call test? More policy means more forwarding complexity, and that means more cost hitting your bottom line. A more insidious effect of intelligent, complex networks is that they inhibit experimentation. The theory of Real Options points out that experimentation is valuable when market uncertainty is high. Designing an architecture that fosters experimentation at the edge therefore creates the potential for greater value than centralized administration, because distributed structures promote innovation and enable experimentation at low cost. Putting the intelligence in the applications rather than in the network is a better use of capital – otherwise, applications that don’t need that robustness end up paying for it, and that makes experimentation expensive.

Open Networks

Open networks strike fear into the hearts of service providers everywhere. If you are in a commodity business, how do you differentiate yourself? How about providing a service that works well, cheaply? But wait a minute – whatever happened to “climb up the value chain”? The answer is nothing. You have to decide what business you are in. Moving up the value chain and providing ever higher-touch services is in direct conflict with providing low-cost bulk bandwidth. Pick businesses that require either massive horizontal scaling or deep vertical scaling; picking both leaves you vulnerable to more narrowly focused competitors in each segment. If horizontal scaling is central to one business, trying to bolt on an orthogonal model as another core business will end up annoying everyone and serving no one well. However, if the software interface to the horizontal business is exposed to the vertical, high-touch side of the business, the two can be decoupled and allowed to scale independently. This means things like provisioning, SLA reporting, billing, and usage reporting, all exposed via software mechanisms.
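
As a sketch of what that software boundary might look like – the interface and method names here are assumptions, not an existing API – the horizontal bulk-bandwidth business exposes something like this, and the high-touch vertical business consumes it like any other customer:

```python
from abc import ABC, abstractmethod


class WholesaleNetworkAPI(ABC):
    """The machine-readable surface of the horizontal, low-cost network
    business; the vertical, high-touch units build on it rather than
    reaching into the network directly."""

    @abstractmethod
    def provision_circuit(self, a_end: str, z_end: str, mbps: int) -> str:
        """Order a circuit between two on-net locations; returns an order id."""

    @abstractmethod
    def sla_report(self, circuit_id: str, month: str) -> dict:
        """Availability and latency figures used to calculate SLA credits."""

    @abstractmethod
    def usage_report(self, circuit_id: str, month: str) -> dict:
        """Billable usage records for the circuit."""

    @abstractmethod
    def invoice(self, account_id: str, month: str) -> dict:
        """Billing data meant to be consumed by software, not PDF readers."""
```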

Rich Connectivity

Let me start off by saying content is not king.

Gaming companies are making the same mistakes as the content guys. They always over-estimate the importance of the content and vastly underestimate the desire of users/people to communicate with each other and share…
-Joi Ito

The Internet is a network of networks. The real value of a network is realized when it connects to other networks; more detail can be found in Metcalfe’s Law and Reed’s Law. Making interconnections with other networks harder than necessary will eventually result in isolation and a drive to irrelevance (in an open market). If people who are transiting your network to get to another network find that the interconnection between your network and their destination network is chronically congested or adds significant latency, the incentive for them to interconnect directly with the destination network, or to find another upstream, becomes stronger.
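
Since both laws get cited loosely, here is the arithmetic behind them – Metcalfe counts the possible pairwise connections, Reed counts the possible sub-groups – and either way the value of interconnecting grows far faster than the cost of the interconnect:

```python
def metcalfe(n: int) -> int:
    return n * (n - 1) // 2        # possible pairwise connections

def reed(n: int) -> int:
    return 2 ** n - n - 1          # possible groups of two or more members

for n in (10, 20, 40):
    print(n, metcalfe(n), reed(n))
# 10 ->  45           1013
# 20 -> 190        1048555
# 40 -> 780  1099511627735
```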

It ain’t the metal, it ain’t the glass; it’s the wetware.
-Tony Li

OSS/NMS

Make the network database-authoritative. This allows for faster provisioning, consistency, and auditing. You can tell authoritatively whether two buildings across the country or the world are on-net and, more importantly, whether they can be connected together and in what timeframe. This is especially true if you have a few acquisitions with a mixture of assets. Simply mashing together the list of buildings that are now on-net with the merged entity doesn’t actually tell you whether they can be connected easily or only through several different fiber runs, patch panels, and networks. If the provisioning systems were correct, the sales folks could tell prospective customers when services could be delivered, because they’d know whether connecting two buildings involved ordering cross-connects or doing a fiber build. We provision thousands of machines automatically – why treat thousands of routers differently? The systems that automatically provision and scale your network are hard to implement, but they can be built. It only requires the force of will to make it happen.
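
Here is a sketch of what being database-authoritative buys you, assuming a hypothetical inventory schema (the table and column names are made up): the answer to “can these two buildings be connected, and how?” comes out of a query instead of tribal knowledge.

```python
import sqlite3


def connect_feasibility(db: sqlite3.Connection, bldg_a: str, bldg_b: str) -> str:
    """Classify a connection request using the authoritative inventory."""
    on_net = {
        row[0]
        for row in db.execute(
            "SELECT building_id FROM on_net_buildings WHERE building_id IN (?, ?)",
            (bldg_a, bldg_b),
        )
    }
    if on_net != {bldg_a, bldg_b}:
        return "fiber build required"          # at least one end is off-net

    # Is there an existing fiber path (segments, patch panels) between the ends?
    paths = db.execute(
        "SELECT COUNT(*) FROM fiber_paths WHERE a_end = ? AND z_end = ?",
        (bldg_a, bldg_b),
    ).fetchone()[0]
    return "order cross-connects" if paths else "new fiber run between sites"
```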

All of these things give a better quality of service to the end user and are a competitive advantage in reducing OPEX and SLA payouts due to configuration errors. You can further extend your systems to do things like automatic rollback if you make a change and something goes wrong.
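
A minimal sketch of the rollback idea, with the device object standing in for whatever your NMS actually drives (its methods are assumptions): snapshot the known-good state, apply, verify, and revert automatically if the health check fails.

```python
def apply_with_rollback(device, new_config, health_check, timeout_s=300):
    """Apply a config change and roll back automatically on failure."""
    snapshot = device.get_running_config()        # capture known-good state
    device.load_config(new_config)
    try:
        # e.g. BGP sessions established, loss and latency within bounds
        if not health_check(device, timeout_s):
            raise RuntimeError("post-change health check failed")
        device.commit()                           # keep the change
    except Exception:
        device.load_config(snapshot)              # automatic rollback
        device.commit()
        raise
```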

Software is the key, no matter what your business is, if it touches the Internet – and that will be increasingly true going forward.


Structure 09

June 26, 2009

I participated in a panel titled ‘On the Shoulder Of Giants’ at GigaOm’s Structure 09 conference today. Among the things we discussed were:

  • Network
  • Pain Points in infrastructure (software, storage)
  • Privacy and protection of user data
  • Sustainability
  • Software and hardware stacks

It was an interesting panel, but it was clear that just one of the topics we discussed could easily take up an entire day, and then some. The shortage of time made jargon mandatory, and unless you live and breathe infrastructure every day, jargon is going to be a turn-off. All in all, even though over 3000 people watched the broadcast stream and the hall was packed, I felt that we could have done more for our audience by taking the time to put the discussion into context. It was also clear that infrastructure – the guts and bits and wires that make stuff happen – is irrelevant. What people want are solutions and platforms they can build upon.

However, as service providers compete for users, having the lowest-cost platforms capable of providing good-enough service is going to be a competitive advantage. This means ever more sophisticated control software, automation, cheaper infrastructure, greater efficiency, and a lower cost of operations. Infrastructure is going to be a competitive advantage. Again. I will refine that further by saying that the software for infrastructure is going to be the competitive advantage. For good software you need great engineers – and most companies aren’t set up for that, especially the telecom companies – so here is a prediction: the cloud initiatives of the telecom providers are going to come to naught.


Sometimes these things write themselves

May 7, 2009

As I mentioned  yesterday in point 5, Verizon Wireless has made a business out of their network infrastructure and they promote it as a core competitive advantage. Today, Fierce Wireless quotes Verizon Wireless CTO Tony Melone as saying:

“I am not a believer in outsourcing,” said Tony Melone, Verizon Wireless’ senior vice president and chief technology officer, during a question-and-answer session at Ericsson’s Capital Markets Day event here.

Melone said Verizon Wireless has long worked to promote the quality and reliability of its network–he trumpeted that the carrier has spent $50 billion on its wireless network since 2000. Thus, Melone said, outsourcing its network operations wouldn’t jive with the reliable-network image the carrier has spent billions pushing onto consumers.

This jives exactly with my thoughts on the matter as mentioned yesterday.