Here are the 10 Laws and a few bonus insights:
The First Law of Cloudonomics: “Utility services cost less even though they cost more”
In other words, “More is Less.” Cloud computing can be viewed as a means of reducing cost, increasing revenue, improving business agility, and enhancing the total customer experience. Let’s focus on cost for a minute. Some believe that the unit cost of cloud computing is lower than that of “well run” enterprise data centers, some believe the opposite, and there is inconsistent data supporting both viewpoints. In any event, technology convergence in hardware, e.g., through the increasing diffusion of containerized data centers, and in software, e.g., via off-the-shelf integrated management tools, appears to be shifting the cost differentials. The correct way to look at the problem, however, is not unit cost, but total cost. Here’s where utility pricing, i.e., pay-per-use, makes a critical difference. As I show in the analysis “Mathematical Proof of the Inevitability of Cloud Computing” (PDF) and the supporting “Value of Utility in the Cloud” (simulation), all other things being equal:
1) If cloud services have a lower unit cost than a do-it-yourself dedicated solution does, you should use them; after all, why not save money?
2) If the unit cost of cloud services is higher than do-it-yourself, but demand is spiky enough, even a pure cloud solution will have a lower total cost than a do-it-yourself solution. The trick is that under the pay-per-use model of the cloud, the unit cost may be higher when the services are used, but it is zero when they are not used, unlike owned, dedicated resources.
3) In most cases, it’s economically optimal to utilize a hybrid model: dedicated resources that can be highly utilized, complemented by pay-per-use resources to handle the spikes. This is similar to owning a car for daily use but renting one occasionally; it is cheaper than either always renting or owning a car everywhere you might need one. The fraction of time spent near the peak demand turns out to be the key factor in determining the appropriate balance.
The phrase “all other things being equal” is a key one, because valid differences between approaches, behavioral economics factors, or network costs can shift the breakeven points and thus tip the balance.
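The three cases above can be sketched numerically. The helper functions, unit costs, and demand profile below are invented for illustration; they are not taken from the referenced analysis or simulation:

```python
# Illustrative sketch: total cost of pure dedicated, pure pay-per-use, and
# hybrid strategies for a spiky demand curve. All numbers are made up.

def dedicated_cost(demand, unit_cost):
    """Dedicated capacity must be built to peak and is paid for every period."""
    return max(demand) * unit_cost * len(demand)

def utility_cost(demand, unit_cost, premium):
    """Pay-per-use capacity costs premium * unit_cost, but only when used."""
    return sum(demand) * unit_cost * premium

def hybrid_cost(demand, unit_cost, premium, base):
    """Own `base` units of dedicated capacity; rent anything above it."""
    fixed = base * unit_cost * len(demand)
    variable = sum(max(d - base, 0) for d in demand) * unit_cost * premium
    return fixed + variable

# Spiky demand: mostly 10 units, occasionally 100.
demand = [10] * 20 + [100] * 2 + [10] * 20
C, U = 1.0, 2.0  # utility costs twice as much per unit when used

print(dedicated_cost(demand, C))          # built to peak of 100
print(utility_cost(demand, C, U))         # pure pay-per-use
print(hybrid_cost(demand, C, U, base=10)) # own the base, rent the spike
```

With this profile the hybrid strategy is cheapest and the dedicated solution built to peak is most expensive, mirroring cases 2) and 3) above.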
The Second Law of Cloudonomics: “On-demand trumps forecasting”
I quantify this in “Time is Money: The Value of On-Demand” (PDF) by defining an asymmetric penalty function associated with excess resources and with unserved demand, and examining how various factors influence the penalty costs.
It is important to distinguish between the terms “on-demand” and “pay-per-use.” Although they are often related, they drive different benefits and are logically separable: a hotel room reserved a month in advance may be pay-per-use but is not on-demand.
If demand is linearly increasing, even manual provisioning of dedicated resources can be effective. If demand is variable but can be effectively forecasted, and resources are sufficiently flexible, on-demand resourcing is of limited net value.
In most cases, though, when demand varies or is unpredictable, the benefit of on-demand can be sublinear, linear, or exponential, depending on the demand curve:
1) If demand growth is exponential, the gap associated with any fixed delay is also exponential. Colloquially, if you are responding to needs within a standard fixed interval, you will fall further behind at an accelerating rate.
2) If growth is linear, the penalty is proportional to the provisioning delay. Consequently, halving the provisioning interval will halve the penalty.
3) Some demand processes can to some extent be characterized as random walks. After a time t, the distance that they’ve “wandered” is proportional to √t, as demonstrated via this Random Walk (simulation). Consequently, it takes a fourfold reduction in provisioning time to halve the penalty cost.
The “Value of Utility in the Cloud” (simulation) enables assessment of different provisioning intervals and their resulting costs in light of the demand function.
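The √t claim in (3) can be spot-checked with a small Monte Carlo experiment. This is an illustrative sketch, not the referenced simulation; the step counts, trial counts, and seed are arbitrary:

```python
# Quick Monte Carlo check that a symmetric random walk wanders a distance
# proportional to sqrt(t): quadrupling t roughly doubles the mean absolute
# displacement.
import random

def mean_abs_displacement(steps, trials, rng):
    total = 0.0
    for _ in range(trials):
        # Each step is +1 or -1 with equal probability.
        pos = sum(rng.choice((-1, 1)) for _ in range(steps))
        total += abs(pos)
    return total / trials

rng = random.Random(42)
d1 = mean_abs_displacement(100, 2000, rng)
d4 = mean_abs_displacement(400, 2000, rng)
print(d4 / d1)  # close to 2, i.e. sqrt(4)
```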
The Third Law of Cloudonomics: “The peak of the sum is never greater than the sum of the peaks”
and
The Fourth Law of Cloudonomics: “Aggregate demand is smoother than individual”
These two laws are related, and I address them in depth in “Smooth Operator: The Value of Demand Aggregation” (PDF) as well as in simulations. Adding together independent demands leads to a smoothing effect, as peaks and valleys have a tendency to cancel each other out. This leads to beneficial economics due to what I’ve called the “statistics of scale.” As an example, suppose that each individual demand is normally distributed with identical mean and variance. Then the coefficient of variation of the sum of n such demands is only 1/√n as great as that of an individual demand. In the limit, a service provider with an infinite number of customers can theoretically achieve 100% utilization, leading to compelling economic advantages versus a “do-it-yourself” approach. In the real world, the 1/√n effect means that being larger is better, but even mid-sized service providers can generate meaningful and real economic value.
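The 1/√n effect follows directly from the fact that variances of independent demands add. A small sketch (with an invented mean and standard deviation) makes the arithmetic concrete:

```python
# For n independent, identically distributed demands with mean mu and standard
# deviation sigma, the aggregate has mean n*mu and standard deviation
# sigma*sqrt(n), so its coefficient of variation shrinks by 1/sqrt(n).
import math

def aggregate_cv(mu, sigma, n):
    agg_mean = n * mu
    agg_sigma = sigma * math.sqrt(n)  # variances add for independent demands
    return agg_sigma / agg_mean

mu, sigma = 100.0, 30.0               # invented single-customer demand stats
for n in (1, 4, 16, 100):
    print(n, aggregate_cv(mu, sigma, n))  # falls as 1/sqrt(n)
```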
While it is always beneficial for a customer with relatively spiky demand to use a service provider to average things out with customers that have relatively flat demand, the reverse is not necessarily true. The implication is that customers with similar degrees of variability (i.e., similar standard deviations and coefficients of variation) are likely to band together (i.e., reach a “separating equilibrium”), either via different service providers or via different tranches of pricing, with incentive compatibility and self-selection assured by permitted-scalability SLAs.
The Value of Resource Pooling (simulation) illustrates this effect: as the number of sites increases, the total capacity required for the pool decreases relative to the sum of partitioned capacities. Even more simply, the Value of Aggregation (simulation) illustrates the statistical effect: the normalized dispersion of the sum closes in on the mean as the number of customers is increased.
The Fifth Law of Cloudonomics: “Average unit costs are reduced by distributing fixed costs over more units of output”
This law points out that economies of scale can apply to cloud providers just as they do to, say, paper mills or power plants. However, the concept of “economies of scale” tends to be applied primarily to production costs, without consideration of demand patterns. For example, a large paper mill may produce a sheet of paper more cheaply than small-batch production can, regardless of how much, how often, or when people choose to write or print. For services such as those in the cloud, however, it is useful to separate supply-side factors, such as volume discounts or the distribution of automation-software development costs across a broader base, from the demand-side statistics-of-scale benefits addressed above.
The Sixth Law of Cloudonomics: “Superiority in numbers is the most important factor in the result of a combat (Clausewitz)”
Regardless of whether there are any economies or diseconomies of scale, the degree of scale can be important in a variety of areas. One example is global coverage. Another is cyberattacks such as Distributed Denial of Service attacks, where large numbers of attacking bots can overwhelm limited web server resources. As some botnets are millions strong, a large cloud (including large network-based defenses) can be required to survive such an attack.
The Seventh Law of Cloudonomics: “Spacetime is a continuum (Einstein/Minkowski)”
and
The Eighth Law of Cloudonomics: “Dispersion is the inverse square of latency”
For parallelizable tasks, the number of resources can be traded off against time. Large search providers, for example, may use a thousand processors in parallel to respond to search queries in sub-second times. However, in today’s distributed environments, including the Internet, latency due to transmission delays over a network, e.g., from physical signal propagation, must also be considered. If we want to reduce that latency, there are diminishing returns to increased investment, since it takes four times as many nodes to halve latency on a plane. On the surface of the earth (or any sphere), the mechanics of covering the circular areas that define worst-case or expected latency match this rule more and more closely as the size of the area covered decreases, but even at the scale of continents the Eighth Law is within a few percentage points of being accurate.
In “As Time Goes By: The Law of Cloud Response Time” (PDF), I explore both laws in depth, and propose that Amdahl’s Law should be superseded by a Law of Cloud Response Time, where response time T follows T = F + N/√n + P/p. Here F is a fixed time, N is the network latency in the case of one service node, n is the number of service nodes, P is the time to run a parallelizable portion of code on one processor, and p is the number of processors. If one has Q processors to deploy, the lowest time T occurs when the number of nodes n equals (QN / 2P)^(2/3).
While the exact value is thus dependent on the ratio between N and P, as the quantity of resources grows, the relative contribution of dispersion declines. Another key insight, which I prove in the paper, is that the cost structure of elasticity and the cost structure of dispersion in the cloud imply that in many cases, response time can be dramatically reduced without a cost penalty.
The Value of Dispersion in Latency Reduction (simulation) shows this √n effect on a plane, both for service nodes deployed in regular lattices and for nodes positioned randomly.
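As a sketch of the tradeoff, the response-time law and its optimum can be evaluated directly. The values of F, N, P, and Q below are invented for illustration:

```python
# The Law of Cloud Response Time described above: T = F + N/sqrt(n) + P/p,
# with Q total processors split evenly across n nodes (so p = Q/n).
import math

def response_time(F, N, P, Q, n):
    p = Q / n                        # processors per node
    return F + N / math.sqrt(n) + P / p

def optimal_nodes(N, P, Q):
    # Minimizing T over n gives n = (Q*N / (2*P)) ** (2/3)
    return (Q * N / (2 * P)) ** (2 / 3)

F, N, P, Q = 0.05, 0.2, 4.0, 1000.0  # arbitrary example values
n_opt = optimal_nodes(N, P, Q)
print(n_opt, response_time(F, N, P, Q, n_opt))
```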
The Ninth Law of Cloudonomics: “Don’t put all your eggs in one basket”
The reliability of a system with n redundant components, each with reliability r, is 1 − (1 − r)^n. The good news is that higher reliability can be achieved through a judicious mix of replicated components, monitoring, management, and recovery processes, and overall architecture. The bad news is that no finite amount of replication will achieve perfect reliability.
The Tenth Law of Cloudonomics: “An object at rest tends to stay at rest (Newton)”
A preferred option may be infeasible due to switching or migration costs, as well as psychological or behavioral-economic factors. For example, power costs may be lower at a new site, but the costs of early contract or lease termination, temporary operations replication, testing, and physical equipment moves may make it challenging to migrate. Clouds may offer advantaged economics, but the cost and time to rewrite thousands or millions of lines of code alter those economics, suggesting that traditional / legacy environments will need to coexist with clouds for a long time. In addition, there are numerous behavioral economics factors that support use of the cloud (for example, hyperbolic discounting) or, conversely, support traditional operations (for example, a need for control).
Pay-per-use is a Dominant Strategy
In “The Market for Melons: Quantity Uncertainty and the Market Mechanism” (PDF), I prove the dominance of pay-per-use pricing over flat-rate pricing in competitive environments meeting certain basic conditions, such as nontrivial price dispersion and rational, active, maximizing consumers. This simulation demonstrates the effect. As the simulation runs, light users defect to pay-per-use plans and heavy users to flat-rate ones. The flat-rate price rises, and defections tilt increasingly toward pay-per-use. In the terminal state, virtually all customers are on pay-per-use plans. Perhaps surprisingly, even as consumers switch plans to reduce their costs, the total spend in such environments doesn’t change; in other words, aggregate shifts in plan selection are revenue neutral. This theoretical result matches the experience of major providers.
Cloud Computing is Computationally Intractable
In “Cloud Computing is NP-Complete” (PDF), I prove that a basic problem in Cloud Computing, that of matching resources to demand over a network, is computationally intractable, even if there are sufficient total resources in the system to meet the aggregate demand. This result is something like the Heisenberg Uncertainty Principle, where position and momentum cannot simultaneously be precisely known, or Gödel’s proof that any nontrivial axiomatic system will be either incomplete or inconsistent. It implies that a “perfect solution” that optimizes cost may not be computable in a short enough time. Thus there must be either excess resources in the system, to ensure that all demand can be satisfied; excess latency, as imperfect solutions utilize unnecessarily distant resources to fulfill demand; and/or excess overhead, to divide up demand to be served via multiple nodes.
Cloud Computing can represent different things to different stakeholders: a technology, a business model, or a development and operations strategy. In addition, it may be viewed abstractly as distributed resources over a network with on-demand, pay-per-use attributes. As such, certain laws and behaviors apply, as outlined above.
To jump right to the punchline(s), a pay-per-use solution obviously makes sense if the unit cost of cloud services is lower than that of dedicated, owned capacity. And, in many cases, clouds provide this cost advantage.
Counterintuitively, though, a pure cloud solution also makes sense even if its unit cost is higher, as long as the peak-to-average ratio of the demand curve is higher than the cost differential between on-demand and dedicated capacity. In other words, even if cloud services cost, say, twice as much, a pure cloud solution makes sense for those demand curves where the peak-to-average ratio is two-to-one or higher. This is very often the case across a variety of industries. The reason is that the fixed-capacity dedicated solution must be built to peak, whereas the cost of the on-demand pay-per-use solution is proportional to the average.
Also important and not obvious: leveraging pay-per-use pricing, either in a wholly on-demand solution or in a hybrid with dedicated capacity, turns out to make sense any time there is a peak of “short enough” duration. Specifically, if the percentage of time spent at peak is less than the inverse of the utility premium, using a cloud or other pay-per-use utility for at least part of the solution makes sense. For example, even if the cost of cloud services were, say, four times as much as owned capacity, they would still make sense as part of the solution if peak demand occurred only one-quarter of the time or less.
In practice, this means that cloud services should be widely adopted, since absolute peaks rarely last that long. For example, today, Cyber Monday, represents peak demand for many e-tailers; it is a peak whose duration is only one three-hundred-sixty-fifth of the year. Online flower services, which reach peaks around Valentine’s Day and Mother’s Day, have a peak duration of only one one-hundred-eightieth of the time. While retailers experience most of their business during one month of the year, there are busy days and slow days even during those peaks. “Peak” is actually a fractal concept: if cloud resources can be provisioned, deprovisioned, and billed on an hourly basis or by the minute, then instead of peak month or peak day we need to look at peak hours or peak minutes, in which case the conclusions are even more compelling.
I look at the optimal cost solutions between dedicated capacity, which is paid for whether it is used or not, and pay-per-use utilities. My assumptions for this analysis are that: 1) pay-per-use capacity is paid for when used and not paid for when not used; 2) the cost for such capacity does not depend on the time of request or use; 3) the unit cost for on-demand or dedicated capacity does not depend on the quantity of resources requested; 4) there are no additional costs relevant to the analysis; 5) all demand must be served without delay.
These assumptions may or may not correspond to reality. For example, with respect to assumption (1), most pay-per-use pricing mechanisms offered today are pure; however, in many domains there are membership fees, nonrefundable deposits, option fees, or reservation fees where one may end up paying even if the capacity is not used. Assumption (2) may not hold due to the time value of money, or to the extent that dynamic pricing exists in the industry under consideration: a (pay-per-use) hotel room may cost $79 on a Tuesday but $799 the subsequent Saturday night. Assumption (3) may not hold due to quantity discounts or, conversely, due to the service provider using yield-management techniques to charge less when provider capacity is underutilized or more as provider capacity nears 100% utilization. Assumption (4) may or may not apply based on the nature of the application and the marginal costs to link the dedicated resources to on-demand resources versus an all-dedicated or all-on-demand architecture; as an example, there may be wide-area network bandwidth costs to link an enterprise data center to a cloud service provider’s location. Finally, assumption (5) actually says two things: one, that we must serve all demand, not just a limited portion, and two, that we don’t have the ability to defer demand until sufficient capacity is available. Serving all demand makes sense, because presumably the cost to serve the demand is greatly exceeded by the revenue or value of serving it; otherwise, the lowest-cost solution is zero dedicated and zero utility resources, in other words, just shut down the business. In some cases we can defer demand, e.g., scheduling elective surgery or waiting for a restaurant table to open up. However, most tasks today seem to require nearly real-time response, whether it’s web search, streaming a video, buying or selling stocks, communicating, collaborating, or microblogging.
It is tempting to view this analysis as relating to “private enterprise data centers” vs. “cloud service providers,” but strictly speaking this is not true. For example, the dedicated capacity may be owned resources in a colocation facility, managed servers or storage with fixed capacity under a long-term lease or managed-services contract, or even “reserved instances.” By “dedicated” we really mean “fixed for the time period under consideration.” For this reason, I will use the terms “pay-per-use” or “utility” rather than “cloud,” except when providing colloquial interpretations.
Let the demand D for resources during the interval 0 to T be a function of time D(t), 0 <= t <= T.
This demand can be characterized by its mean μ(D), which we shall simply call A, and a peak or maximum max(D), which we shall simply call P. Needless to say, based on the definitions of mean and maximum, A <= P. For example, the average demand A might be 9 CPU cores, with a peak P of 31 CPU cores.
Let the unit cost per unit time of fixed capacity be C, and let U be the utility premium. By utility premium, I mean the multiplier for utility (pay-per-use) capacity vs. fixed. The unit cost of on-demand capacity is then U * C. For example, C might be $2.00 per core-hour for fixed capacity. If on-demand capacity costs $3.00 per core-hour, then U would be 1.5, i.e., there is a 50% premium for on-demand capacity.
To be slightly more precise, because on-demand capacity is assumed to be pure pay-per-use, in contrast to fixed capacity, which is paid for whether or not it is used, there is a premium when the capacity is used and a 100% discount when the capacity is not used. As stated above, this assumption may not be valid in all cases.
If U = 1, then fixed capacity and on-demand capacity cost the same. If U < 1, then pay-per-use resources (e.g., the cloud) are cheaper on a unit-cost basis; it has been argued that economies of scale and statistics of scale can make cloud providers’ unit costs lower. If U > 1, then pay-per-use resources (e.g., the cloud) are assessed to be more expensive on a unit-cost basis, as at least one study claims. Even under these unit-cost assumptions, a pure utility or hybrid solution may be less expensive in terms of total cost, as we shall see.
Thanks to assumption (2), we can rearrange the demand curve to be monotonically non-decreasing, i.e., in ascending order, to help illustrate the points. In practical terms, this means that, for a site supporting special events such as concert or movie ticket sales, if it has a peak during 3 days of each month, we can treat it as if this peak occurred for 36 days at the end of the year. This reordering doesn’t affect the mean, the max, or any of the calculations below, but it makes the proofs easier to follow. In the real world, such reordering may not apply; continuously growing, or at least non-decreasing, demand may be suitable for resourcing via fixed capacity.
Finally, it should be noted that thanks to assumptions (2) and (3), the cost of providing utility capacity to meet the demand D is just the utility premium U times the base cost C times the arithmetic mean A times the duration of time T. In other words, if the price of a hotel room doesn’t vary based on day or quantity, and we ignore the time value of money, then renting 8 rooms on one night and 2 rooms the next night costs the same as renting 5 rooms for two nights. This is because Σ_{i=0..T} U * C * D(t_i) = U * C * Σ_{i=0..T} D(t_i) = U * C * A * T.
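This invariance is easy to check numerically; the demand samples and prices below are arbitrary:

```python
# Under assumptions (2) and (3), pay-per-use cost depends only on the mean of
# the demand curve, not on its ordering: it always equals U * C * A * T.
U, C = 1.5, 2.0                 # arbitrary premium and base unit cost
demand = [8, 2, 5, 5, 31, 9]    # arbitrary demand samples
T = len(demand)
A = sum(demand) / T             # arithmetic mean of demand

cost_in_order = sum(U * C * d for d in demand)
cost_sorted = sum(U * C * d for d in sorted(demand))
print(cost_in_order, cost_sorted, U * C * A * T)  # all three are equal
```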
Proposition 1: If U < 1, that is, the utility premium is less than unity, then a pure pay-per-use solution costs less than a pure dedicated solution.
Proof: The cost of the pay-per-use solution is A * U * C * T. The cost of a dedicated solution built to peak is P * C * T. Since A <= P and U < 1,
A * U * C * T <= P * U * C * T < P * 1 * C * T = P * C * T
Therefore, the pay-per-use solution costs less than the dedicated solution.
Colloquially, the cloud is advantaged on total cost because you pay for resources only when they are needed, and you pay less for those resources when they are used.
Proposition 2: If U = 1, that is, the utility premium is unity, and A = P, that is, demand is flat, then a pure pay-per-use solution costs the same as a pure dedicated solution built to peak.
Proof: The cost of the pay-per-use solution is A * U * C * T. The cost of a dedicated solution built to peak is P * C * T. Since A = P and U = 1,
A * U * C * T = P * U * C * T = P * 1 * C * T = P * C * T
Therefore, the pay-per-use solution costs the same as the dedicated solution.
In other words, if there is no difference between unit costs and there is no variability in demand, it doesn’t matter which strategy you use. Of course, this assumes that your demand is predictable and that there is no financial risk, neither of which is typically the case. Even if you believed both to be true, all other things being equal, you might still prefer the cloud solution due to demand-forecasting risk and financial risk, e.g., residual values being lower than projected or changes in tax laws.
Proposition 3: If U = 1 and demand is not flat, that is, A < P, then a pure pay-per-use solution costs less than a pure dedicated solution.
Proof: The cost of the pay-per-use solution is A * U * C * T. The cost of a dedicated solution built to peak is P * C * T. Since A < P and U = 1,
A * U * C * T < P * U * C * T = P * 1 * C * T = P * C * T
Therefore, the pay-per-use solution costs less than the dedicated solution.
Interestingly, even if the unit cost of the pay-per-use utility is higher than that of the dedicated capacity, the total cost may be lower if the demand curve is “spiky” enough.
Proposition 4: Even if the utility premium U is greater than 1, if it is less than the peak-to-average ratio P / A, that is, 1 < U < (P / A), then a pure pay-per-use solution costs less than a pure dedicated solution.
Proof: Again, the cost of the pay-per-use solution is A * U * C * T. The cost of a dedicated solution built to peak is P * C * T. Since U < (P / A),
A * U * C * T < A * (P / A) * C * T = P * C * T
Therefore, the pay-per-use solution costs less than the dedicated solution.
In other words, as I point out in my First Law of Cloudonomics, even if a utility costs more on a unit-cost basis, the total cost can be lower than that of a dedicated solution, because of the savings when resources are not needed due to variations in demand. The more “spiky” the demand, the higher the rate one might be willing to pay for the utility. For example, if one needs a car every day for years, it makes sense to own, finance, or lease it, for a rate of, say, ten dollars a day. If one needs a car for only a few days, it makes sense to rent one, even though the rate might be, say, fifty dollars a day. And if one needs a car for only a few minutes, it makes sense to grab a taxi, even though paying a dollar a minute works out to an equivalent rate of over a thousand dollars per day.
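Propositions 1, 3, and 4 can be illustrated with the worked numbers used earlier (a mean of 9 cores and a peak of 31); the helper functions are just the formulas A * U * C * T and P * C * T:

```python
# Comparing the pure pay-per-use cost A*U*C*T against the pure dedicated
# cost P*C*T across utility premiums. Numbers are from the running example.
def pure_utility_cost(A, U, C, T):
    return A * U * C * T

def pure_dedicated_cost(P, C, T):
    return P * C * T

A, P, C, T = 9.0, 31.0, 2.0, 100.0   # mean 9 cores, peak 31 cores

# Proposition 1 (U < 1): utility wins outright.
print(pure_utility_cost(A, 0.8, C, T), pure_dedicated_cost(P, C, T))
# Proposition 3 (U = 1, A < P): utility still wins.
print(pure_utility_cost(A, 1.0, C, T), pure_dedicated_cost(P, C, T))
# Proposition 4 (1 < U < P/A, here P/A is about 3.44): utility wins despite the premium.
print(pure_utility_cost(A, 3.0, C, T), pure_dedicated_cost(P, C, T))
```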
Let us define the total duration of the peak of the demand D to be T_{p}. That is, even if there are multiple periods when D is at peak, we sum them up to get T_{p}. This turns out to be an important criterion for determining the value of hybrid clouds.
Proposition 5: If the utility premium U is greater than 1, and (T_{p} / T) < (1 / U), that is, the percentage duration of the peak is less than the inverse of the utility premium, then a hybrid solution costs less than a dedicated solution.
Proof: Consider the cost of a hybrid solution consisting of P – ε dedicated resources, with any overflow handled on demand by pay-per-use capacity. Because utility resources are only required to handle the ε worth of demand, and this demand only occurs for a total duration of T_{p}, the total cost to serve the demand is:
[(P – ε) * T * C] + [ε * T_{p} * C * U]
However, since T_{p} / T < 1 / U, multiplying both sides by T * U we see that T_{p} * U < T. But then,
[ε * T_{p} * C * U] < [ε * T * C]
Which provides the inequality we need, namely that the total cost of the hybrid solution is
[(P – ε) * T * C] + [ε * T_{p} * C * U] < [(P – ε) * T * C] + [ε * T * C]
Since [(P – ε) * T * C] + [ε * T * C] = P * T * C is the cost of dedicated capacity, the total cost of the hybrid solution is less than the cost of dedicated capacity.
Note that P – ε is not necessarily an optimal solution; it just demonstrates that there is a cheaper way to do things than using only dedicated resources when the peak is sufficiently short-lived. To find an optimal solution we would need to know more about the characteristics of the underlying demand curve, as we shall see below.
Conversely, let us define the total duration of nonzero demand to be T_{NZ}. That is, even if there are multiple periods when D is greater than zero, we sum up their durations to get T_{NZ}. This turns out to be an important criterion for determining when a hybrid architecture beats a pure cloud.
Proposition 6: If the utility premium is greater than unity and the percentage duration of nonzero demand is greater than the inverse of the utility premium, i.e., (T_{NZ} / T) > (1 / U), noting that 1 / U < 1 since U > 1, then a hybrid solution costs less than a pure pay-per-use solution.
Proof: This proof is the mirror image of the prior one. Consider the cost of a hybrid solution consisting of ε dedicated resources, with the remainder addressed by on-demand resources. The cost of serving the remaining demand doesn’t change between the pure pay-per-use and the proposed hybrid solution, so we need only consider the differential between using a dedicated solution and a utility solution for this first ε of demand. The cost of serving this demand with utility resources is ε * T_{NZ} * U * C. The cost of serving it with dedicated resources is ε * T * C.
Since T_{NZ} / T > 1 / U, multiplying both sides by T * U gives us the inequality T < T_{NZ} * U. Then
ε * T * C < ε * T_{NZ} * U * C
Therefore, a hybrid solution costs less than the pure utility.
In other words, if there is usually some baseline demand and utilities are somewhat costly, you may as well serve the typical baseline demand with the cheaper dedicated resources and save the on-demand resources for the variable portion of the demand. As Jens Lapinski put it when commenting on one of my articles, a good rule of thumb is to “own the base, and rent the spike.”
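Propositions 5 and 6 can be checked together on a toy demand curve (all values invented): when the peak is brief and the baseline persistent, the hybrid undercuts both pure strategies.

```python
# "Own the base, and rent the spike" on a concrete demand curve.
def total_cost(demand, base, C, U):
    """Hybrid cost: `base` dedicated units always paid for; overflow is pay-per-use."""
    T = len(demand)
    fixed = base * C * T
    variable = sum(max(d - base, 0) for d in demand) * C * U
    return fixed + variable

demand = [5, 5, 5, 5, 5, 5, 5, 5, 5, 20]  # baseline of 5, one short peak of 20
C, U = 1.0, 2.0
P = max(demand)

pure_dedicated = total_cost(demand, P, C, U)  # base = peak: nothing rented
pure_utility = total_cost(demand, 0, C, U)    # base = 0: everything rented
hybrid = total_cost(demand, 5, C, U)          # own the base, rent the spike

# Prop 5: the peak lasts 1/10 of the time, which is < 1/U = 1/2.
# Prop 6: demand is nonzero 10/10 of the time, which is > 1/U.
print(pure_dedicated, pure_utility, hybrid)
```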
Knowing that, under the right conditions, a cost-optimal solution may be a hybrid cloud does not tell us what mix of dedicated and on-demand resources achieves the optimum. For that, we will solve a specific example first, then argue for the general condition.
Proposition 7: Let D be uniformly distributed with peak P, and let the utility premium U > 1. Then the optimal hybrid solution consists of P / U on-demand capacity and P – (P / U) dedicated resources.
Proof: Let the fixed capacity be F, and let any demand over this amount be served by variable, on-demand pay-per-use capacity V, where F + V = P.
The total cost of the solution is then the sum of the fixed cost plus the on-demand cost. The fixed cost is just F * T * C. The variable cost is based on the area of the triangle of demand above F, which has height V. The base of the triangle is proportional to V relative to P, namely (V / P) * T. The cost, which is based on the area of the triangle, is ½ * V * (V / P) * T * U * C, so the total cost is:
[F * T * C] + [½ * V * (V / P) * T * U * C]
T and C are common factors, so this is just:
[T * C] * [F + ½ * V * (V / P) * U]
Substituting (P – V) for F and simplifying terms, the total cost is
[T * C] * [(P – V) + ½ * V^2 * U / P]
The minimum occurs when the slope, i.e., the derivative, is zero. To solve this, it helps to remember that the derivative of a constant is zero, the derivative of a sum is the sum of the derivatives (as long as they exist), the derivative of x^n is n * x^(n–1), and the derivative of a constant times a function is the constant times the derivative of the function. T and C are nonzero constants, so we take the derivative with respect to V and set it to zero, getting a minimum at:
0 = [T * C] * [0 – 1 + V * U / P]
So,
0 / [T * C] = [0 – 1 + V * U / P]
Then
0 = 0 – 1 + V * U / P
Or, simplifying
1 = V * U / P
So the minimum occurs when V / P = 1 / U.
In other words, for uniformly distributed demand, the percentage of resources that should be on-demand is the inverse of the utility premium. If there is no premium, all resources should be on-demand; if the utility premium is 2, half the resources should be on-demand; if the utility premium is 4, a quarter of the resources should be on-demand; and so forth.
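Proposition 7 can also be verified by brute force on a discrete approximation of uniform demand (the helper `hybrid_total_cost` and the grid search are illustrative, not from the paper):

```python
# Grid search for the cost-minimizing dedicated base F: for roughly uniform
# demand with peak P and premium U, about P/U of the capacity should be
# on-demand, i.e. the best base is about P - P/U.
def hybrid_total_cost(demand, base, C, U):
    fixed = base * C * len(demand)
    variable = sum(max(d - base, 0) for d in demand) * C * U
    return fixed + variable

P, U, C = 100, 2.0, 1.0
demand = list(range(1, P + 1))   # discrete approximation of uniform demand

best_base = min(range(P + 1), key=lambda b: hybrid_total_cost(demand, b, C, U))
print(best_base, P - best_base)  # dedicated base, on-demand share
```

With U = 2, the search lands at a base of about P/2, matching V / P = 1 / U. (Discretization can produce a small tie around the optimum, the "breakeven zone" discussed below.)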
It turns out that this points the way to finding an optimal hybrid solution for any demand curve. Utility simulation models can be used to determine where the optimal solution lies, but the key insight is that if a resource is used often, one may as well dedicate it, whereas if it is used infrequently, one should use a pay-per-use strategy. The breakeven point occurs where the cost of a dedicated solution equals the cost of a pay-per-use solution, which is when the percentage of use is 1 / U. For a fixed solution, the cost to service a sliver of demand for ε resources enduring for a period T / U would be ε * C * T, whereas for a pay-per-use solution, the cost would be ε * (T / U) * U * C, which is of course the same. This also means that there may not be a single optimum, but rather a range of equal-cost optimal solutions: for some curves there is not just a breakeven point but a “breakeven zone” of resources with the same duration of use, and any of those resources can be assigned to dedicated or pay-per-use fulfillment without affecting the total cost.
These last few propositions show the value of hybrid resourcing strategies. If there is a short enough period of peak demand, rather than use only dedicated resources, it makes sense to slice at least that peak out of the total solution and serve it with on-demand pay-per-use resources. On the other hand, if there is a long enough duration of nonzero demand, you may as well serve that baseline with dedicated resources.
So, these are the criteria for determining when pure clouds, pure dedicated solutions, or hybrid dedicated and pay-per-use solutions may be cost-optimal. The analysis above is oversimplified, since it assumes that there are no additional (marginal) costs for hybrid solutions. Whether there are or not ultimately depends on the nature of the application and the architecture, implementation, and cost structure, as I discuss in “4 ½ Ways to Deal with Data During Cloudbursts.”
While, strictly speaking, this isn’t a proof of the inevitability of cloud computing, I’ve used reasonably rigorous math to determine the conditions under which cloud computing is relevant. And because these conditions are so easily met, given the demand fluctuations and price differentials seen in the real world, cloud computing should be at least a part of virtually every enterprise’s IT strategy.
Joe Weinman is employed by a large telecommunications and cloud services company. The views expressed herein are his own, and not necessarily the views of his employer. (C) 2009 Joe Weinman

Connections are a natural function for the cloud. Until 1878, phone connections were made on a point-to-point basis. Starting in January of that year, the first cloud bridging function, a.k.a. the telephone exchange, was deployed in New Haven. As a result, instead of needing multiple phones at your residence—one physical phone for every other party you might want to call—you just needed one phone to connect to the cloud. Communications take place over connections, and conversations are the long-running transactions of communications. Also, communications need not be unicast; they can be multicast, anycast, or broadcast. The key reason such functions need to be processed in the cloud is that n connections to a hub are less costly than on the order of n² point-to-point connections.
Examples: telephony, microblogs 
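As a rough sketch of the hub economics just described (the function names are illustrative): fully meshing n parties requires n(n−1)/2 point-to-point links, which grows on the order of n², while a hub or exchange requires only n.

```python
def mesh_links(n: int) -> int:
    """Point-to-point links needed to fully connect n parties (n choose 2)."""
    return n * (n - 1) // 2


def hub_links(n: int) -> int:
    """Links needed when each party connects once to a central hub/exchange."""
    return n
```

For 100 subscribers, a full mesh needs 4,950 links while an exchange needs 100; the gap widens quadratically as n grows.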

– Correspondences  Another type of connection is matching, or correspondence. For example, dating sites make connections between people, and search sites make connections between advertisers and people searching for information. Determining the best correspondence requires the greatest number of options for each user, inherently driving a cloud-based solution.
Examples: search and dating sites 


Collaboration occurs when multiple parties communicate, or interact, for the sake of a common goal that all can simultaneously achieve. Competition occurs when they interact for a common goal, but only one or a subset can achieve it. Commerce is a mix of collaboration and competition. People gather to brainstorm. They gather to compete in arenas or engage in commerce. They gather in real-world markets such as the New York Stock Exchange or a neighborhood flea market. Similarly, collaboration, competition, commerce, or other interactions are best done in a common location—real or virtual—because collaborators benefit from more collaborators, it’s not a competition without other competitors, buyers benefit from more sellers, and sellers benefit from more buyers.
Examples: online gaming, vertical market sites, online auctions. 

– Clearing  When two parties engage in commerce, clearing, payments, and settlements are best conducted in the cloud by a neutral and trusted third party. This is why lawyers have escrow accounts, and banks and brokerages utilize third-party financial intermediaries.
Examples: online payments and electronic funds clearinghouses 

– Conversion  Whether natural language translation, file format conversion, currency conversion, media gateways, transcoding, transrating, or the like, conversion functions are often (but not always) best done in the cloud. This is especially true when there may be a multitude of parties interacting in many formats. Exceptions would be compression and encryption, which are best done on the originating end, and of course decompression and decryption, which are best done at the destination.
Examples: audio and video bridging, currency services 

– Community  Communities are the pinnacle of the spectrum from connections to collaboration. Communities are built on connections where there are also shared goals, values, or interests. Since communities are made up of many participants, it doesn’t make sense for the community to reside with a single participant.
Examples: social networking sites 

– Crowdsourcing  Crowdsourcing for prediction markets, recommendation engines, or tagging tends to be a cloud function. Although crowdsourcing or open innovation for a particular initiative might occur at a single participant’s site, maintaining a regular community of crowdsourcing participants belongs to the cloud.
Examples: online movie rental / streaming sites, news or tagging sites with popularity voting or recommendation engines. 


Anytime access to resources is partial or intermittent, a commons generates economic value. For example, it is difficult to single-handedly use all of Central Park for all eternity. Space-division and time-division multiplexing of common (shared) resources is less costly than dedicated resources. A collection, e.g., the Library of Congress, is an information resource; this cloud model dates back thousands of years. Even when the same information object can be used simultaneously, such as an online book or patent database, economics still favor sharing, because the cost to assemble the collection is incurred once and/or is roughly proportional to the size of the collection, whereas the value or revenue is derived in accordance with the number of users and uses.
Examples: online book or patent databases 


Information on an untethered device may not be available even to the owner of that device, unless he or she is co-located with it. Putting the information in the cloud means it can be accessed from anywhere and from any device, given the right user access control.
Examples: web mail or photos accessible from your laptop or your smartphone, or anyone else’s, if you choose. 

– Cost  The cloud may be cheaper than do-it-yourself, depending on your costs and the cloud service provider’s prices. Cost may be lower due to economies of scale—the cloud provider has lower hardware costs, software costs, tooling, cooling, power, etc., which are low enough to offset some additional expenses, such as SG&A. Or the provider may have better statistics of scale—by multiplexing demand from multiple sources, it can achieve better utilization. Or the provider may have end-to-end architecture benefits—for web sites, it may cost less to be hosted by an integrated service provider, because all that traffic does not then need to be further transported to an enterprise data center.
Examples: electricity, cheaper to buy from the grid than run a nuclear power plant in your back yard 


Capital expenditures aren’t bad, unless they are for fixed capacity that is underutilized or insufficient, which is almost always the case. The other issue, for smaller companies, is that large capital expenditures may not leave enough cash to run the business.

I’ve explored the cloudonomics of hybrid clouds both at ComplexModels.com and in some articles. Briefly, the lowest total cost can be achieved through judicious investment in dedicated resources and on-demand use of pay-per-use resources, subject to additional concerns such as the cost to store and transport data. These can be used for cloudbursting, where the dedicated resources are temporarily insufficient to handle a demand spike and the cloud is used for the overflow. They can also be used for business continuity, in terms of both failover and data availability via mirroring or replication, when the dedicated resources are impaired, unavailable, or destroyed and the cloud becomes the home for applications, services, or data. Or the cloud can be used for data center migration, where applications can be housed temporarily.

David Ricardo showed that even when one entity enjoys cost advantages over another, it still makes sense to trade. Implications for IT shops: even if you can do everything yourself, it’s better to let someone else focus on context (noncore) activities, so you can devote your finite resources to things that develop or enhance competitive advantage.
Examples: outsourcing HR, CRM, tax prep, or stock picking via a mutual fund. 

– Capabilities and Competencies  Sometimes, you can’t do it all yourself, so that’s not an option. Cloud providers, especially as the SaaS market continues to evolve, develop unique expertise based on learning curve effects, “economies of skill,” and specialization.
Examples: proprietary search algorithms and heuristics 

– Celerity  “Celerity” is synonymous with “speed.” Cloud services can accelerate software development, testing, and deployment through ready access to platforms and predeployed resources. Of course, using existing software in the cloud is by definition faster than trying to develop that same software from scratch.
Examples: software, platform, or infrastructure as a service 


Customers demand optimal experiences, or will go elsewhere. In today’s highly interactive world with globally distributed users, customer experience is enhanced through a geographically dispersed architecture. This is true on the web, with CDNs, but in the real world as well. It’s why there’s a coffee shop on every corner, rather than forcing you to fly all the way to a single consolidated location just to grab a cup of coffee.
Examples: content and application delivery networks. 


The cloud is great at providing a single reference copy or golden master, whether it is the official time, the length of a meter, or the most up-to-date version of the presentation you and your colleagues have been working on. A single update applied to this master defines the current version.
Examples: time.gov or any SaaS service. 

– Checkpoints  Checkpoints for controlling and validating transactions or entities entering a system from outside the system are optimally deployed at the perimeter of the system, because perimeters are smaller than areas, and surfaces are smaller than volumes. This is why customs, immigration, and border patrol are performed at points of entry.
Examples: networkbased firewalls, antiDDoS, email filtering for antivirus or antispam. 

– Chokepoints  Even if all transactions are valid, chokepoints help regulate the volume of inflow into a system, preventing it from becoming too congested internally.
Examples: network traffic management, in digital networks, air traffic networks (gate delays), or highways. 

– Choicepoints  Providing a multitude of contextually relevant choices is a natural cloud function, partly because these choices form a collection, and partly because the choicepoint acts as a point of entry.
Examples: online portals, including search results pages 

– Combinations  Any or all of the above can be used together. 
Of course, in the real world, additional factors come into play in an apples-to-apples comparison. A large service provider may get deeper equipment discounts than a small or medium enterprise. Tooling costs can be distributed over a larger base. Staffing costs can be proportional to the size of the environment, rather than requiring a minimum: for a small business with one server, 24/7 monitoring requires five people on payroll — four people working 40-hour weeks don’t provide coverage for all 168 hours in the week, nor does that allow for vacations and sick days. Due to factors such as these, utility service providers can generate economic value that ameliorates the premium.
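The staffing arithmetic in the preceding paragraph can be checked directly; this is just a worked restatement of the 168-hour week, with illustrative variable names:

```python
import math

HOURS_PER_WEEK = 7 * 24       # 168 hours to cover for 24/7 monitoring
SHIFT_HOURS_PER_WEEK = 40     # one full-time employee

# 168 / 40 = 4.2, so four people fall short; five are the bare minimum,
# before even accounting for vacations and sick days.
minimum_staff = math.ceil(HOURS_PER_WEEK / SHIFT_HOURS_PER_WEEK)
```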
Why would any enterprise, instead of spending $100 a month, intentionally choose to spend $140 or $180 or $220 or $280 or more?
The answer is that the flip side of utility services is that they cost less, or nothing, when not in use. While a hotel room may cost more per square foot per night when it is rented, it costs zero when it is not rented. A rental car costs more per car class per day than financing, leasing, or depreciating the car, but costs zero when it is not rented. A utility with a fully variable price ends up costing less even though it costs more, as long as there is enough usage variability, i.e., demand variability, to make the periods when nothing is being paid make up for the periods when the premium is being paid. The specific metric in question is the peak-to-average ratio. If this ratio is greater than the utility premium (expressed as a multiple of the dedicated unit cost), then the utility will cost less than the fixed environment.
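A minimal sketch of this comparison, with illustrative function names of my own: dedicated capacity must be sized (and paid for) at the peak for the whole period, while utility capacity is paid at a premium U only for demand actually served, so the utility wins exactly when the peak-to-average ratio exceeds U.

```python
def dedicated_cost(peak_demand, unit_cost, period):
    """Fixed capacity is sized to peak demand and paid for the whole period."""
    return peak_demand * unit_cost * period


def utility_cost(average_demand, premium, unit_cost, period):
    """Pay-per-use capacity is paid at U times the unit cost, but only for
    the demand actually served (the average over the period)."""
    return average_demand * premium * unit_cost * period


def utility_is_cheaper(peak_demand, average_demand, premium):
    """The utility costs less exactly when peak/average exceeds the premium U."""
    return peak_demand / average_demand > premium
```

For example, with a peak-to-average ratio of 5 and a utility premium of 4, the utility is cheaper despite its higher unit price.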
My paper “The Evolution of Networked Computing Utilities” addresses this in more detail, and you can run the Monte Carlo simulation “The Value of Utility Resources in the Cloud” using your own assumptions about variable demand, utility premiums, and other costs associated with fixed and utility environments. There are some other key assumptions that bear mentioning: the objective of the environment must be to serve demand, and demand must not be deferrable.
If a large spike in demand can simply be ignored, then there is no need to ensure resource availability to serve it. In real-world situations, there is an opportunity cost to unserved demand, e.g., revenue lost from customers who are turned away rather than served. The utility premium must be less than this opportunity cost; otherwise it is cheaper to do without the customer than to invest to capture their spend. If demand is deferrable, then a fixed environment sized to the average demand is cheaper than a variable environment incurring the utility premium. An example of nondeferrable demand is an emergency room: if you’ve just been hit by a truck, you’d prefer not to wait until next month to see the ER doctor. An example of deferrable demand is having your house painted: if the painter is busy for the next few weeks, no problem, you can wait.
The economics of serving nondeferrable demand then, somewhat counterintuitively, demonstrate that more is less.