I sat down last week with Chris Kinsman, Chief Architect at PushSpring and a 20+ year veteran of software development, to talk about the factors that go into deciding when to migrate to the cloud. In Part 1 we’ll talk about what factors into the business and technical cases, and in Part 2 we’ll look at additional arguments against moving to the cloud – and even at what point it makes sense to scale back from the cloud to “in-house” operation. I’ll let Chris do most of the talking here:
“My first question is always ‘what is the business case for in-house?’ Some companies like to say they’re not going to the cloud for privacy, or for security, which is closely related. And the reasons given by the business side and by IT are different.”
Traditionally one of the biggest arguments for keeping operations in-house is the high quality of existing staff and operations – tied to the “unique” requirements of whatever the particular business is doing.
“I think that is a fallacy. Look at the teams that Microsoft or Amazon has in place supporting their data centers – that’s way beyond what anyone else has. So that’s not the real argument – it’s more that they don’t want to give up control. And management plays along with that – I can’t yell at Amazon when there’s an outage, but I can yell at my guys.
But business guys are different. Yes, there are potential concerns about privacy and outages, but their biggest concerns are around cost. And the problem there is that people tend to compare hardware, depreciated, versus payments to Amazon. But if all you’re doing is comparing the cost of your hardware to your monthly fee, then you really aren’t accounting for all costs. And even if you add in your co-lo costs, there’s a lot of additional monitoring, headcount, et cetera that needs to be included to make it a true comparison – and a lot of companies don’t do that.
Another thing that companies do is assume a 3 to 5 year capitalization of the hardware. They then say the same machine from Amazon costs this amount. But Amazon keeps upgrading their machines and charging less. So a year in, you can get a better price than you did on day one, and there’s no opportunity cost to switch because there’s no sunk cost. You could have relatively equal compute power on day zero, but get 10% more power after one year OR keep that power and pay less. So you have to look at historical pricing over the past few years. Azure and AWS have seen continued downward pressure.”
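To see why this comparison is so easy to get wrong, here is a minimal sketch of the two cost curves Chris describes. Every figure in it – purchase price, co-lo overhead, the starting cloud rate, and the annual price decline – is an assumption chosen for illustration, not a quote from any vendor:

```python
# Illustrative only: all figures are assumptions for the sake of the comparison.

HARDWARE_PURCHASE = 12_000       # server bought up front, depreciated over 3 years
HARDWARE_MONTHLY_OVERHEAD = 250  # co-lo space, power, monitoring, remote hands
DEPRECIATION_YEARS = 3

CLOUD_MONTHLY_START = 450        # comparable instance + storage + bandwidth
CLOUD_ANNUAL_PRICE_DROP = 0.08   # assume ~8% per year, roughly the recent trend


def hardware_cost(months: int) -> float:
    """Straight-line depreciation plus the ongoing operational overhead."""
    depreciation_months = min(months, DEPRECIATION_YEARS * 12)
    depreciation = HARDWARE_PURCHASE / (DEPRECIATION_YEARS * 12) * depreciation_months
    return depreciation + HARDWARE_MONTHLY_OVERHEAD * months


def cloud_cost(months: int) -> float:
    """Pay-as-you-go, with the monthly rate dropping once a year."""
    total, rate = 0.0, CLOUD_MONTHLY_START
    for month in range(months):
        if month > 0 and month % 12 == 0:
            rate *= 1 - CLOUD_ANNUAL_PRICE_DROP
        total += rate
    return total


for months in (12, 24, 36):
    print(f"{months:>2} months: hardware ≈ ${hardware_cost(months):,.0f}, "
          f"cloud ≈ ${cloud_cost(months):,.0f}")
```

The point isn’t the specific numbers; it’s that the hardware column only looks cheaper when the overhead line is left at zero, and the cloud column keeps bending downward as the provider cuts prices.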
Other than historical data for Amazon spot instances, it is surprisingly hard to get reliable, continuously updated figures on historical cloud computing costs – and even more difficult to predict how those costs might decline in the future. Hardware prices have historically followed Moore’s Law, with data density doubling approximately every 18 months and costs declining 20 to 30 percent per year; cloud computing costs, by contrast, have generally fallen 6 to 8 percent per year. Google made a strong statement about being more aggressive with price cuts in its 2014 announcements, but the industry has not fallen in with Google’s “ongoing commitment to bringing Moore’s Law to the cloud.”
“The other thing that isn’t well evaluated when trying to make an apples-to-apples comparison is that people think Azure virtual machines versus AWS EC2 – but they don’t adequately consider the other parts of the ecosystem that those guys bring to the story.
For example, Azure Blob Storage and Amazon S3 are both object storage in the cloud that you don’t have to worry about provisioning as you grow. The same goes for bandwidth – you don’t have to worry about provisioning as you grow. And both are fronted by a CDN that makes it super trivial to put your content as close as possible to your users. You can do that in-house, but it’s something you have to build – or buy – and figure out how to scale.
And that’s just one example. Let’s take that out to more complicated application-level logic, like Azure Service Bus or Amazon Simple Queue Service (SQS). Now we’re talking about building a message bus – which is non-trivial to set up and manage, and you don’t want someone who doesn’t know what he’s doing to do it. Dirt cheap, but…”
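To make the storage point concrete, here is a minimal sketch of what “storage you don’t have to provision” looks like from the application side, using boto3 against S3. The bucket name, key, and file are placeholders, and credentials are assumed to come from the environment or an instance role:

```python
import boto3

# Object storage as an API call: no volumes to size, no RAID to plan,
# no capacity alarms to wire up as usage grows.
s3 = boto3.client("s3")
s3.upload_file("report.pdf", "example-bucket", "reports/2014/report.pdf")

# Hand out a time-limited link; putting a CDN such as CloudFront in front of
# the bucket is a configuration exercise rather than something you build.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-bucket", "Key": "reports/2014/report.pdf"},
    ExpiresIn=3600,
)
print(url)
```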
Chris’s “dirt cheap, but…” goes back to the scalability of – in this case – talent. It’s impossible for one individual to be a storage, cloud, and security expert. And it’s difficult to determine whether you’re doing a thorough job in a particular area of expertise without first acquiring that expertise.
And there are a lot of things you don’t want a non-expert doing, and building a message bus has to be pretty high on that list! SQS gives you the first 1 million requests free (and a single request can batch up to 10 messages!), which sounds like a lot of value until you realize that at $0.50 per million requests it’s worth all of fifty cents.
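For comparison, here is a minimal sketch of leaning on SQS instead of running your own broker, again using boto3. The queue name is hypothetical, and note that the pricing above is per request – a single batch request can carry up to 10 messages:

```python
import json

import boto3

sqs = boto3.client("sqs")
queue_url = sqs.create_queue(QueueName="orders-example")["QueueUrl"]

# One request, ten messages: this is how the per-request pricing stretches.
entries = [{"Id": str(i), "MessageBody": json.dumps({"order_id": i})}
           for i in range(10)]
sqs.send_message_batch(QueueUrl=queue_url, Entries=entries)

# Consumers long-poll; anything not deleted reappears after the visibility
# timeout, which is the durability you would otherwise have to build yourself.
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10,
                           WaitTimeSeconds=10)
for msg in resp.get("Messages", []):
    print(msg["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```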
“So a lot of guys just think about building what they have now, and in my mind that’s not taking advantage of the cloud. There are some guys who have to do that – they have “in-house” versions of their software as well as their cloud versions – and they have a hard time deciding if they want to be a SaaS provider or sell on-premise software. That’s a hard decision point for them: they can’t go all-in on cloud services, which means they’re not going to be as competitive against a pure-SaaS vendor. And that’s where the comparison becomes even more difficult. I don’t see them calculating what the costs are for building and maintaining these kinds of services.”
And those higher-level services are built on other services – nobody is building anything from scratch. And every one of those services is going to have potential issues with migration and so on. So if you’re a messaging guy and you understand building and optimizing pipelines, then go for it. But most organizations are going to stumble on this.
“SwiftType announced they were going to physical hardware. They were moving off of AWS EC2 and onto SoftLayer – which is excellent, I’ve used it previously – but SoftLayer was bought by IBM, and their whole claim to fame is that they’re a “Cloud Provider,” although they don’t provide nearly the number of higher-order services that Amazon or Azure do. They’re mainly a competitor to someone like Rackspace. But it’s physical hardware. You don’t own the hardware, it’s in a data center somewhere, and the attraction is that many folks are worried about the ‘noisy neighbor’ problem.”
Meaning you’ve got some guy using up a ton of bandwidth or some other piece of the shared resources. But there are many places where a bottleneck can occur.
“Right, and there are other ways to deal with it. You should be building redundancy anyway, so make sure you have good diversification there, and it’s unlikely that you’re going to hit the ‘noisy neighbor’ problem. But then they’d argue back that it’s going to cost more because you’re going to have to build out more redundancy and more infrastructure. And I would argue back that that means you’re way more resilient. For example, what if one of the data centers goes down?
You have the sense of control, and to a certain point you do, but you’re not being realistic about your level of expertise or all of your investment.
Take security as an example – do you have one guy you hired who says he’s a security guy, or do you have a team of 50 guys like Steve Riley who are thinking about security all the time? AWS and Microsoft have 50 guys – maybe more than that – and unless you’re huge, you simply won’t have that capability.”
And in terms of resilience, if a security guy leaves one of the cloud providers, they can handle that – if that’s your entire security department that just left, you have a significant challenge. It all goes back to who you trust. Do you trust your employee? A high-powered consultant? Or do you trust the providers to have the right people?
“As with nuclear weapons: ‘Trust, but verify.’”
About the Author / Keith Pleas
Keith Pleas is the Organizer and Conference Chair for the annual ALM Forum, a software architect, and one of the founders of Guided Design. He worked with the patterns & practices team for several years, where he was the architect and subject matter expert for “Design for Operations”. Prior to that he worked for more than two years on the team developing the .NET Framework and Visual Studio .NET. Keith is an internationally known writer and speaker and past Editorial Chair for the VSLive conferences. He was also a founding Contributing Editor of “Visual Studio Magazine” and a Contributing Editor for numerous other publications, including “Windows NT Magazine”. Keith has developed several Microsoft Professional Certification exams. He was a founding board member of INETA, where he also created the INETA Speakers Bureau. You can reach Keith on LinkedIn.