Monthly Archives: January 2012

Snarky Comments on Cloud SLAs (Part II)

In my last post I wrote about the problems with Cloud Service Level Agreements (SLAs), and why they tend to be less useful than one would like.  OK, so what can you do about it?

1) Ask Questions — If your service provider offers an SLA, study it and ask questions.  “Is that all you can promise?  What’s your historical performance?  Can you show me some documentation of that from your monitoring tools?  Have you had any downtime incidents in the last five years?  What were the root causes, and what did you do to address those?”  The discussion will be useful in filling in your picture of their capabilities.  In the end, an SLA is just a number (or a few numbers), and it doesn’t give you much insight into the provider’s ability to deliver.  An SLA is a bit like a car warranty—it’s nice to know you have it, but your real hope is that it won’t be needed and the thing won’t have any significant breakdowns.

2) Try to Negotiate — Good luck on this, but give it a try.  I’ve found that, normally, the salesman doesn’t have the discretion to change the penalties in the SLA.  However, if the vendor doesn’t offer you an SLA, just asking might get you somewhere.  In the logistics software space, there is a company called SMC3 that creates a system used for calculating the rate of a Less than Truck Load (LTL) shipment, and most companies in the industry use this product.  They recently converted from a traditional licensed software model to a cloud model.  I was at their conference in Atlanta a few days ago, and asked about their SLA policy.  They don’t promise an out-of-the-box SLA, but they say that they’ll negotiate one on occasion.  So it seems that if you are a big enough customer to them, you’ve got a chance of getting an SLA in writing.  You’re not doing this to negotiate the SLA up to a level with which the vendor isn’t comfortable, because that isn’t going to change their real-world ability to deliver on the SLA.  But you want to understand what they can deliver, and get comfortable (or not) with this level of performance, assuming the vendor’s service is critical to your business.  In the case of SMC3, if you are moving LTL shipments for a living and you need to quote prices to your customers, then you’re really depending on SMC3 to deliver, because if their service goes down you can’t provide price quotes to customers.  So you owe it to yourself to understand their capabilities.

3) Look at Audit Reports – If the vendor has been audited (e.g., using the SAS 70 standard) then you should get a copy of the audit report.  Read it.  DO NOT just treat this as a check mark (“great, they passed an audit”).  Look where that got investors in Enron and WorldCom.  Like most standards of this nature, the audit “control objectives” for SAS 70 are essentially whatever the cloud vendor says they are.  So the cloud vendor tells the audit firm their control objectives, and the auditor checks that there is reasonable assurance they would achieve those objectives (the auditor will state very clearly, though, that this is not a guarantee).  The auditor will make sure the control objectives that the cloud vendor uses are not completely stupid, but I find that a lot of people assume (wrongly) that SAS 70 is some kind of seal of approval that the vendor is doing “all the right things”.  SAS 70, for example, says nothing about providing disaster recovery (DR) capabilities.  A vendor can provide zero DR, and pass an SAS 70 audit with flying colors, simply by not stating any control objectives regarding DR, but showing compliance with whatever control objectives they do have.

4) Do Your Own Audit — While it might make you feel good to host your systems in-house, because you feel you have more control, chances are that you can’t do as good a job as a strong cloud vendor.  You probably know someone who feels comfortable driving a car, since they control it, but who doesn’t like flying in an airplane because they have to trust the pilot, even though flying is statistically safer than driving a car.  So maybe you should find out if your cloud vendor knows how to “run an airline”.  If the service provider is critical to you, arrange an audit where you visit them and understand their capabilities.  Are they doing the basics?  Do they have multiple power supplies on each server, or only on some of them?  Do they have multiple NICs on every server?  Ask them to show you.  Do they have current maintenance contracts with their key vendors (server hardware, network hardware, etc.) or did they “forget” to renew them?  How fast would that vendor promise to rush in a part required to fix their box if it went down?  Next business day? Hmmmm.  One of my customers once did an audit like this on my data center, and while it was painful at the time, ultimately it was beneficial for me and my team, and for the customer, since they had a much deeper knowledge of our real capabilities, and in the end they grew comfortable that we knew what we were doing.


Snarky Comments on Cloud SLAs

When I read articles about contracting with cloud service providers (or other IT service providers such as colocation providers) often I see a lot of emphasis on Service Level Agreements (SLAs).  That’s the wisdom that they offer you: make sure you have a solid SLA with your cloud service provider.  The only problem with SLAs is that while, theoretically, they should be a useful tool for managing your cloud vendors, in practice they are a bit weak.   Here’s why:

1) The standard penalties you are likely to get in your contract with a cloud vendor if they miss their SLA are minuscule compared with the pain and suffering you are going to experience if the service goes down.  One cloud provider with whom I’m familiar (I won’t name them, but their SLA is pretty typical) will provide a 5% credit if their uptime is between 99% and 99.5% (their SLA guarantee is 99.5%).  So let’s say you are spending $5,000 per month with that service provider, and that during one month they are down for 7 hours, and you can’t service your customers for most of a business day.  What do they give you?  Let’s see... a month with 31 days has 744 hours in it.  So if they go down for 7 hours, we’re just above 99% uptime.  So they owe you 5% of $5,000, which is $250.  ARE YOU KIDDING ME???  Your business is out one day’s revenues, the CEO is on the phone asking to speak with “shithead”, and your cloud service provider is going to give you $250?
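
The arithmetic above is easy to rerun against your own contract.  Here is a rough Python sketch; the function names are mine, not any vendor's, and the only credit tier it knows about is the one quoted above (a 5% credit for uptime between 99% and 99.5%, against a 99.5% guarantee).  The provider's terms for uptime below 99% aren't given here, so the sketch refuses to guess at them.

```python
def uptime_percent(hours_in_month: float, hours_down: float) -> float:
    """Uptime for the month, as a percentage of total hours."""
    return 100.0 * (hours_in_month - hours_down) / hours_in_month


def sla_credit(monthly_fee: float, uptime_pct: float) -> float:
    """Credit owed under the single tier described in the post.

    Assumption: this models only the 99%-99.5% tier (5% credit).
    Real contracts usually list more tiers; plug in your own.
    """
    if uptime_pct >= 99.5:
        return 0.0  # the 99.5% guarantee was met, so nothing is owed
    if uptime_pct >= 99.0:
        return monthly_fee * 5 / 100  # 5% credit
    raise ValueError("no credit tier is given for uptime below 99%")


hours = 31 * 24                      # 744 hours in a 31-day month
pct = uptime_percent(hours, 7)       # 7 hours of downtime
print(f"uptime: {pct:.2f}%")
print(f"credit: ${sla_credit(5000, pct):,.2f}")
```

Swap in your own monthly fee, outage hours and tiers, then compare the credit with what a day of lost revenue actually costs you.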

2) Typically, cloud vendors lowball the SLA.  A typical uptime SLA you’ll see in the marketplace now is around 99.5%.  So over the course of a year, they could be down for... 365 * 0.005 = 1.825 days?  There’s a good chance I could hit 99.5% hosting your software on my laptop.   (Well, as long as I promise not to fire up iTunes, which seems to crash my Windows 7 laptop.  Yeah, yeah, I know.  It rocks on a Mac.)  By the way, that 99.5% target doesn’t include scheduled downtime.  So as long as they announce the downtime ahead of time, it doesn’t count.  While you’re at it, ask your vendors how much notice they provide.  Some vendors only give about a day’s notice for maintenance windows, which makes it difficult if you have users that use the system during off-hours, when the vendor is doing the maintenance.  So anyway, the cloud vendors just aren’t setting very aggressive SLAs.  If you ask the vendor, they’ll probably admit that they are lowballing the SLA, or at least they’ll say, “Well, our long run average is certainly better than that.”
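
If you want to see what a given uptime percentage really permits, the conversion is one line.  A small Python sketch follows; the 99.5% figure is the one discussed above, and the other levels in the loop are just illustrative comparison points I picked, not numbers from any particular vendor.  Note that it counts unscheduled downtime only, since scheduled maintenance typically sits outside the SLA.

```python
def allowed_downtime_days(uptime_pct: float, days_in_year: int = 365) -> float:
    """Days per year a vendor can be down and still meet the uptime SLA."""
    return days_in_year * (100 - uptime_pct) / 100


# 99.5% is the typical SLA from the post; the rest are for comparison only.
for level in (99.0, 99.5, 99.9, 99.99):
    days = allowed_downtime_days(level)
    print(f"{level}% uptime allows {days:.3f} days ({days * 24:.1f} hours) down per year")
```

For 99.5% this reproduces the 1.825 days worked out above.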

3) Some cloud vendors won’t even provide an SLA.  Salesforce.com is an example.  At least they won’t give you one if you’re not a mega-customer.  And apparently even for big customers they try to avoid doing so, to the point where according to some sources (http://www.online-crm.com/sla.htm), they won’t even call you back if you call and ask for an SLA!  So they are really just saying “look, we think our service level is good enough, but we’re not going to guarantee it, so just try it out and if you aren’t happy, then just leave.”  For certain kinds of services that’s probably good enough, but for “mission critical” services, it’s not a very reasonable answer.  One thing Salesforce does that I like a lot, though, is give you transparency to their current system status, and recent history.  Check this out:  http://trust.salesforce.com/trust/status/.  So you can have some feeling that they’ll be honest with you if there is an issue.  However, they don’t provide data on their long-run historical performance and they aren’t making you any promises.

So what do you need to do?  Forget about SLAs?  Not entirely.  I’ll talk about some ideas for managing SLAs in my next post (people have been telling me I put too much information in my posts and I need to chop it up!)


Is there a US Developer Shortage? Why Going Offshore is Just a Matter of Time

If you have a small development team of a dozen people, or fewer, you can probably do fine developing all of your software onshore.  It might even be the right call, provided you can afford it, since at that point you might just be figuring out what your product really is, and it’s better to work with developers who sit with “the business” and are able to react quickly to what the business is seeing in the marketplace.  Or you might need some specialized skills that just aren’t available offshore.  Sometimes if you’re working with the latest of the latest technologies, or something a bit arcane, then you might have trouble finding that skill set offshore.  Or, if you need some awareness of the business or social environment in your home country, then people at home would have a better intuitive sense of what you’re trying to accomplish.  For example, if you’re building a social media product for bar-hoppers in US cities, you’re probably best to develop version 1.0 as a “Lean Startup” with 2-3 developers in a major US city, since you’re going to have a tough time explaining to offshore developers exactly what your product vision is.  And if you try to explain it, they’ll most likely give you a product that doesn’t fit the bill, and tell you that “you kept changing the requirements” (and they’d probably be correct about that).

But at a certain point in your company’s life, assuming you are above a certain scale and size, you’ve most likely got to do some work offshore.  Of course there’s a cost argument.  Loaded costs for developers at an outsourcing shop in India are around 50% of the costs for comparable people in the US.  And if you have the time and knowledge to form a “captive” offshore development shop, you might save more than that.  But apart from cost, it seems there just aren’t enough software developers in the US to meet the demand anymore.  There’s a contrarian point of view regarding this, and you can hear it from commentators like Ron Hira, who claims that actually there is no shortage of developers in the US, and when companies go offshore they are just looking for low costs and there’s no supply issue when it comes to developer talent.  In some of his other articles, Hira makes some good points about the H-1B visa program in the US which, while generally positive for the US, has been abused by some of the IT outsourcing firms to bring in undistinguished talents, and pay them below the US norm.  But in this case, I think Hira’s work suffers because he lumps all IT workers into one big bucket.  I’m not talking about a shortage of generic IT workers.  But if you need to hire highly skilled software developers, who are current in the latest web and mobile technologies, and who are the best in the world at what they do, there is data that shows that there is a shortage in the US.

The Wall Street Journal recently reported that there are over 150,000 openings for software developers in the US right now.  Intuitively that sounds a bit high to me, but there’s no doubt that there are major shortages in some key areas, such as mobile development—who isn’t thinking about their mobile strategy these days?  And in certain areas of the US, there just aren’t enough people in the region engaged in software development, and if you need to hire a few experienced Java developers, it turns out to be quite difficult.  If you’re looking for these people in quantity in the US, and your company is not called Facebook or Google, you might be fighting a losing battle.  So you might be better off offshore, where you can raise your wages a bit above market and attract some strong talent.

Is that the future of work in the US?  Is all of the software going to be built offshore, while those of us living in the US will all be “idea people” (and burger flippers and retail clerks, I suppose)?  Do we need to encourage more American kids to study “hard” subjects like science, engineering and math as their college majors, in order to make us more competitive as a nation?  One would think that the market would signal the need for more developers through rising salaries, and more people would respond to those signals and would study disciplines like computer engineering and computer science in college.  Good luck with that.  The problem is that the dogs (students?) just aren’t eating that dog food.  A recent study reported in the NY Times found that students drop out of these “hard” subjects in droves.  It’s not exactly clear why this is the case, but it seems to have something to do with these courses of study being something like the Bataan death march.  It seems for most American kids, the prospect of $60K+ starting salaries for entry level developer jobs just doesn’t justify having a shitty undergraduate experience, just when they are spreading their wings and getting out of the house.   Apparently President Obama set a goal of graduating an additional 10,000 engineers per year.  But if you believe the experts (at least the one guy quoted in the Times article), that’s just not gonna happen.  Even if we got to 10,000 additional engineers per year, and we peel away the civil and chemical engineers, and others who can’t help us close our 150,000 person software developer gap, then it would appear that our gap wouldn’t close for years anyway.

I don’t really have a realistic “three-step plan” to solve this software developer gap in the US, and I’m not aware that anyone else does either.  But I think it’s a real gap, so for the foreseeable future the only way US companies are going to get the developers they need is through a mixture of offshore and onshore software development.  So we might as well just admit this and figure out how to be great at managing mixed teams of onshore and offshore developers, probably using Agile-type processes.  I’m not of the belief that there’s no future for US developers.  It’s just the opposite—there’s plenty of demand for them.  But there just aren’t enough developers with the skills we need onshore, so companies above a certain scale and size need to think offshore.

Agile and Lean: Good Partners in Software Development

I’m a fan of Agile software development.  But there’s one big thing that bugs me about it.  The underlying assumption seems to be that the business knows what it wants, and can articulate the required “user stories”, and prioritize these correctly.  And that even before we talk about user stories, Agile assumes that the business people are pointing the business in the right direction, strategically, and that your product roadmap is more or less correct (at least at a high level), and so the adjustments you make at the start of each sprint are pretty incremental to the company’s overall product strategy.  Well, that ain’t always the case, and I’ve seen a lot more money wasted by the business being incapable of settling upon workable and consistent strategies, than by software developers who write bad code.  So using your product development efforts to validate the direction in which the business is driving is vital, particularly in cases where the business is unable to prioritize user stories correctly because there’s so much uncertainty about the business direction and strategy.

I like Eric Ries’ “Lean Startup” approach in cases like this.  Ries says that when you are in startup mode (and in his definition of startups, he includes any team that is trying to fire up a new business concept around a software product or service, so you could include teams working in larger companies who’ve been given this challenge) the business typically has little idea what’s going to work and what’s going to fail.  They may think they do, but the reality is they may be mucking around in the right area, but they don’t know what will work, and their chief assignment is to learn what works and what doesn’t, and what is important to customers and what isn’t.   In Ries’ view (and I agree with him), you don’t go out and do some market research to find out the answers (at least not right away), since if you’re inventing something very new the people you interview will have no idea what you are talking about and might mislead you.  And you don’t get senior management in a room with a white board to map it out either, since they may have known enough to get the startup funded, but don’t know all the answers either.  Instead, Ries tells you to perform a series of rapid experiments with customers (and prospects) and figure it out scientifically.  You test how users respond to a feature, and whether they use it.  If they don’t use it, then go talk to them and find out why.  Now you can have a useful discussion with them, not a theoretical discussion.  If nothing’s working, you might need to consider pivoting towards a new strategy that (hopefully) will work better.  But ideally, make any adjustments or decide to pivot as soon as possible, so you don’t spend two years “perfecting” your product based on your own ignorant idea of what the product will be, only to have customers later tell you that you were way off target.  Even if your idea was mostly right, it will benefit from some contact with the customer as soon as you can get this.  If your idea was mostly wrong, it’s best to learn that before you waste a lot of time and resources.

OK so if you buy the idea that being a startup means you need to do the best you can to have a rapid feedback loop from learning something in the market, to implementing some changes in your product based on those market learnings, you’re going to need to invest to develop your product in a way that supports that learning process:

1) Build an infrastructure for running A/B tests on users (Ries’ approach sounds a bit consumer-oriented, and not enterprise software-oriented, but we can take some inspiration from the idea and find ways to apply it to an enterprise world)

2) Ensure you gather useful metrics on user behaviors, so you can analyze what they really did with the product, and have objective discussions about what’s working and what’s not

3) Have a capability for frequently deploying incremental new releases (e.g., test automation is going to be critical here)

4) Have a process that supports quick reaction to market feedback (in terms of feature prioritization)
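
To make items 1 and 2 a bit more concrete, here is a minimal Python sketch of what the core of such an infrastructure might look like.  It is only an illustration under my own assumptions, not Ries’ method or any product’s API: the names assign_variant and UsageLog are hypothetical, users are split by hashing their ID so the same user always lands in the same variant, and the only metric tracked is whether a user who saw a feature actually used it.

```python
import hashlib
from collections import Counter


def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically place a user in one variant of an experiment.

    Hashing experiment name + user ID means no assignment table is needed
    and a returning user always sees the same variant.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


class UsageLog:
    """Counts, per variant, how many users saw a feature and how many used it."""

    def __init__(self) -> None:
        self.exposed = Counter()
        self.used = Counter()

    def record_exposure(self, variant: str) -> None:
        self.exposed[variant] += 1

    def record_use(self, variant: str) -> None:
        self.used[variant] += 1

    def usage_rate(self, variant: str) -> float:
        """Fraction of exposed users who used the feature (0.0 if none exposed)."""
        shown = self.exposed[variant]
        return self.used[variant] / shown if shown else 0.0


# Hypothetical usage: the experiment name and user IDs are made up.
log = UsageLog()
for uid in ("u1", "u2", "u3", "u4"):
    log.record_exposure(assign_variant(uid, "new-quote-screen"))
```

In an enterprise product you might assign by customer account rather than by individual user, so that everyone at one company sees the same behavior.  Either way, the point is to have the usage numbers in hand before you go talk to the customer.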

When you consider these factors, 3 & 4 are provided by an Agile process, but 1 & 2 really aren’t.  Agile doesn’t provide an approach to figuring out your strategy, and quickly testing what’s working and what’s not when you are in startup mode.  And Agile doesn’t tell you that you need to do experiments on customers to figure out what they want, and what they are willing to pay for.  When you’re in startup mode, learning those things is the most important thing you can do in order to apply your resources to what’s working.  Unfortunately, Agile doesn’t really tell you that.

So if you have a relatively mature business, and you’re not in startup mode, you don’t really need to be Lean.  You understand, more or less, what you’re doing, and you can run an Agile development process with a backlog of requirements, and prioritize user stories at the start of each release cycle, and be relatively confident that you’re going to add value with each release.  In fact, you’ll be much more confident than if you were using a waterfall-type process.  Even if you mis-prioritize the requirements in one release, there’s another release following soon after, so you’ll get to those requirements then, and it’s not a disaster.  But if you are starting up something entirely new, whether it’s an entirely new company, or a new product line for an existing company, and you can admit that you don’t know what’s really going to work, and whether customers will even buy the product you envision, then Agile isn’t sufficient.  A Lean approach and an Agile development process are complementary and support each other, and you’d benefit from using both.