This is my first Cloudonomics.com post in a while, as I’ve been engaged in developing a number of Monte Carlo simulations and detailed proofs of The 10 Laws of Cloudonomics. Here’s an overview of the results, with summaries of conclusions for those who don’t want to read through over 200 pages of proofs and analyses spanning economics, behavioral economics, calculus, probability, statistics, trigonometry, system dynamics, and computational complexity theory. Links to the papers and supporting simulations at ComplexModels.com are included.
Here are the 10 Laws and a few bonus insights:
The First Law of Cloudonomics: “Utility services cost less even though they cost more”
In other words, “More is Less.” Cloud computing can be viewed as a means of reducing cost, increasing revenue, improving business agility, and enhancing the total customer experience. Let’s focus on cost for a minute. There are some who believe that the unit cost of Cloud Computing is lower than that of “well run” enterprise data centers, some who believe the opposite, and inconsistent data supporting both viewpoints. Technology convergence in hardware, e.g., through increasing diffusion of containerized data centers, and in software, e.g., via off-the-shelf integrated management tools, would appear to be shifting cost differentials, anyway. The correct way to look at the problem, however, is not unit cost, but total cost. Here’s where utility pricing, i.e., pay-per-use, makes a critical difference. As I show in the analysis “Mathematical Proof of the Inevitability of Cloud Computing” (PDF) and supporting “Value of Utility in the Cloud” (simulation), all other things being equal:
1) If cloud services have a lower unit cost than a do-it-yourself dedicated solution does, you should use them; after all, why not save money?
2) If the unit cost of cloud services is higher than do-it-yourself, but demand is spiky enough, even a pure cloud solution will have a lower total cost than a do-it-yourself solution. The trick is that the pay-per-use model of the cloud means that the unit cost may be higher when the services are used, but it is zero when not used, unlike owned, dedicated resources.
3) In most cases, it’s economically optimal to utilize a hybrid model: dedicated resources that can be highly utilized, complemented by pay-per-use resources to handle the spikes. This is similar to owning a car for daily use but renting one occasionally. It is cheaper than either always renting, or, owning a car everywhere you might need one. The fraction of time spent near the peak demand turns out to be the key factor in determining the appropriate balance.
The phrase “all other things being equal” is a key one, because valid differences between approaches, behavioral economics factors, or network costs can shift the breakeven points and thus tip the balance.
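The three cases above can be sketched with a toy cost model. This is only an illustration, not the model from the paper: the function names, the demand series, and the unit costs are all made-up assumptions, and it deliberately ignores the "other things" (network costs, behavioral factors) mentioned above.

```python
def total_cost(demand, owned_unit_cost, cloud_unit_cost, owned_capacity):
    """Total cost of owning `owned_capacity` units and renting any overflow.

    demand: one demand level per time period (hypothetical data).
    Owned capacity is paid for in every period, used or not;
    cloud capacity is paid for only in the periods where it is used.
    """
    owned = owned_capacity * owned_unit_cost * len(demand)
    overflow = sum(max(d - owned_capacity, 0) for d in demand)
    return owned + overflow * cloud_unit_cost


def best_capacity(demand, owned_unit_cost, cloud_unit_cost):
    """Owned capacity (in whole units) that minimizes total cost."""
    candidates = range(0, max(demand) + 1)
    return min(
        candidates,
        key=lambda c: total_cost(demand, owned_unit_cost, cloud_unit_cost, c),
    )


# Assumed example: flat demand of 1 with a single spike to 10,
# and a cloud unit cost twice the owned unit cost.
spiky_demand = [1] * 9 + [10]
pure_owned = total_cost(spiky_demand, 1.0, 2.0, owned_capacity=10)
pure_cloud = total_cost(spiky_demand, 1.0, 2.0, owned_capacity=0)
hybrid = total_cost(spiky_demand, 1.0, 2.0, owned_capacity=1)
print(pure_owned, pure_cloud, hybrid)
```

With these made-up numbers, owning enough for the peak costs 100, a pure cloud plan costs 38 despite the doubled unit cost, and the hybrid (own 1 unit, rent the spike) costs 28.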
The Second Law of Cloudonomics: “On-demand trumps forecasting”
I quantify this in “Time is Money: The Value of On-Demand” (PDF) by defining an asymmetric penalty function associated with excess resources and with unserved demand, and examining how various factors influence the penalty costs.
It is important to distinguish between the terms “on-demand” and “pay-per-use.” Although they are often related, they drive different benefits and are logically separable: a hotel room reserved a month in advance may be pay-per-use but is not on-demand.
If demand is linearly increasing, even manual provisioning of dedicated resources can be effective. If demand is variable, but can be effectively forecasted and resources are sufficiently flexible, on-demand resourcing is of limited net value.
In most cases, though, when demand varies or is unpredictable, the benefit of on-demand can be sub-linear, linear, or exponential, depending on the demand curve:
1) If demand growth is exponential, the gap associated with any fixed delay is also exponential. Colloquially, if you are responding to needs within a standard fixed interval, you will fall further behind at an accelerating rate.
2) If growth is linear, the penalty is proportional to the provisioning delay. Consequently, halving the provisioning interval will halve the penalty.
3) Some demand processes can to some extent be characterized as random walks. After a time t, the distance that they’ve “wandered” is proportional to √t, as demonstrated via this Random Walk (simulation). Consequently, it takes a four-fold reduction in time to halve the penalty cost.
The “Value of Utility in the Cloud” (simulation) enables assessment of different provisioning intervals and their resulting costs in light of the demand function.
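As a rough sketch of the three growth cases above, the code below measures the gap between current demand and the demand that was provisioned for a fixed delay earlier. It is not the penalty function from the paper: the function names are hypothetical, the gap is used as a stand-in for penalty cost, and the random walk is assumed to be a simple ±1 walk.

```python
import math
import random


def linear_gap(rate, delay):
    """Demand grows as rate * t, so a fixed delay leaves a gap of rate * delay."""
    return rate * delay


def exponential_gap(growth, t, delay):
    """Demand grows as exp(growth * t); the gap itself grows exponentially in t."""
    return math.exp(growth * t) - math.exp(growth * (t - delay))


def random_walk_rms_gap(delay, trials=20000, seed=0):
    """Monte Carlo estimate of the root-mean-square drift of a +/-1 random walk
    over `delay` steps (theory: sqrt(delay))."""
    rng = random.Random(seed)
    total_squared = 0.0
    for _ in range(trials):
        position = sum(rng.choice((-1, 1)) for _ in range(delay))
        total_squared += position * position
    return math.sqrt(total_squared / trials)


# Halving the delay halves the linear gap...
print(linear_gap(2.0, 4), linear_gap(2.0, 2))
# ...but the delay must be cut four-fold to roughly halve the random-walk gap.
print(random_walk_rms_gap(16), random_walk_rms_gap(4))
# For exponential growth, the gap for the same delay keeps growing over time.
print(exponential_gap(0.1, 10, 5), exponential_gap(0.1, 20, 5))
```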
The Third Law of Cloudonomics: “The peak of the sum is never greater than the sum of the peaks”
The Fourth Law of Cloudonomics: “Aggregate demand is smoother than individual”
These two laws are related, and I address them in depth in “Smooth Operator: The Value of Demand Aggregation” (PDF) as well as in simulations. Adding together independent demands leads to a smoothing effect, as peaks and valleys have a tendency to cancel each other out. This leads to beneficial economics due to what I’ve called the “statistics of scale.” As an example, suppose we can assume that each individual demand is independent and normally distributed with identical mean and variance. Then, the coefficient of variation of the sum of n demands is only 1/√n as great as that of an individual demand. In the limit, a service provider with an infinite number of customers can theoretically achieve 100% utilization, leading to compelling economic advantages versus a “do-it-yourself” approach. In the real world, the 1/√n effect means that being larger is better, but even mid-sized service providers can generate meaningful and real economic value.
While it is always beneficial for a customer with relatively spiky demand to share a service provider with a customer whose demand is relatively flat, so that the two average out, the reverse is not necessarily true. The implication is that customers with similar degrees of variability (i.e., standard deviations and coefficients of variation) are likely to band together (i.e., reach a “separating equilibrium”) either via different service providers or via different tranches of pricing with incentive compatibility and self-selection assured by permitted scalability SLAs.
The Value of Resource Pooling (simulation) illustrates this effect: as the number of sites increases, the total capacity required for the pool decreases relative to the sum of partitioned capacities. Even more simply, the Value of Aggregation (simulation) illustrates the statistical effect: the normalized dispersion of the sum closes in on the mean as the number of customers is increased.
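The Third and Fourth Laws can be checked with a small Monte Carlo sketch. This is not one of the linked simulations: the function names, the mean of 100, the standard deviation of 30, and the seed are arbitrary assumptions, and the demands are assumed independent and identically normally distributed.

```python
import random
import statistics


def simulate_demands(customers, periods, mean=100.0, stdev=30.0, seed=1):
    """One independent normally distributed demand series per customer."""
    rng = random.Random(seed)
    return [
        [rng.gauss(mean, stdev) for _ in range(periods)]
        for _ in range(customers)
    ]


def coefficient_of_variation(series):
    """Standard deviation divided by mean."""
    return statistics.pstdev(series) / statistics.fmean(series)


def aggregate(demands):
    """Period-by-period sum of all customers' demand."""
    return [sum(period) for period in zip(*demands)]


demands = simulate_demands(customers=16, periods=5000)
total = aggregate(demands)

# Fourth Law: with 16 customers the ratio should be near 1/sqrt(16) = 0.25.
ratio = coefficient_of_variation(total) / coefficient_of_variation(demands[0])
print(ratio)

# Third Law: the peak of the sum never exceeds the sum of the peaks.
print(max(total), sum(max(series) for series in demands))
```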
The Fifth Law of Cloudonomics: “Average unit costs are reduced by distributing fixed costs over more units of output”
This law points out that economies of scale can apply to cloud providers just as well as they do to, say, paper mills or power plants. However, the concept of “economies of scale” tends to be primarily applied to production costs without consideration of demand patterns. For example, a large paper mill may produce a sheet of paper more cheaply than small batch production can, regardless of how much, how often, or when people choose to write or print. For services such as those in the cloud, however, it is useful to separate out factors such as volume discounts or distribution of costs to develop automation software across a broader base from the demand-side statistics of scale benefits addressed above.
The Sixth Law of Cloudonomics: “Superiority in numbers is the most important factor in the result of a combat (Clausewitz)”
Regardless of whether there are any economies or diseconomies of scale, the degree of scale can be important in a variety of areas. One example is global coverage. Another example is in cyberattacks such as Distributed Denial of Service attacks, where large numbers of attacking bots can overwhelm limited web server resources. As some botnets are millions strong, it suggests that a large cloud (including large network-based defenses) can be required to survive such an attack.
The Seventh Law of Cloudonomics: “Space-time is a continuum (Einstein/Minkowski)”
The Eighth Law of Cloudonomics: “Dispersion is the inverse square of latency”
For parallelizable tasks, the number of resources can be traded off against time. Large search providers, for example, may use a thousand processors in parallel to respond to search queries in sub-second times. However, in today’s distributed environments, including the Internet, latency due to transmission delays over a network, e.g., due to physical signal propagation, must also be considered. If we want to reduce that latency, there are diminishing returns to increased investment, since it takes four times as many nodes to reduce latency by half on a plane. On the surface of the earth (or any sphere), the mechanics of covering circular areas defining worst-case or expected latency match this rule more and more closely as the size of the area covered decreases, but even at the scale of continents the Eighth Law is within a few percentage points of being accurate.
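The "four times as many nodes to halve latency" rule on a plane can be illustrated with a small geometric sketch. The assumptions here are mine: service nodes sit at the centers of a k-by-k lattice on a unit square, latency is taken as proportional to straight-line distance to the nearest node, and the function names are hypothetical.

```python
import math


def lattice_nodes(k):
    """k*k nodes at the centers of a k-by-k grid of cells on the unit square."""
    return [((i + 0.5) / k, (j + 0.5) / k) for i in range(k) for j in range(k)]


def worst_case_distance(nodes, samples=41):
    """Largest distance from any sampled point to its nearest node.

    Points are sampled on a regular grid that includes the square's corners,
    which are worst-case points for this lattice layout.
    """
    worst = 0.0
    for a in range(samples):
        for b in range(samples):
            x, y = a / (samples - 1), b / (samples - 1)
            nearest = min(math.hypot(x - nx, y - ny) for nx, ny in nodes)
            worst = max(worst, nearest)
    return worst


# Going from 4 nodes (k=2) to 16 nodes (k=4) quadruples the node count
# and halves the worst-case distance.
print(worst_case_distance(lattice_nodes(2)), worst_case_distance(lattice_nodes(4)))
```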
In “As Time Goes By: The Law of Cloud Response Time” (PDF), I explore both laws in depth, and propose that Amdahl’s Law should be superseded by a Law of Cloud Response Time, where response time T follows T = F + N/√n + P/p, where F is a fixed time, N is the network latency in the case of one service node, n is the number of service nodes, P is the time to run a parallelizable portion of code on one processor, and p is the number of processors. If one has Q processors to deploy, the lowest time T then occurs when the number of nodes n equals (QN/(2P))^(2/3).
While the exact value is thus dependent on the ratio between N and P, as the quantity of resources grows, the relative contribution of dispersion declines. Another key insight, which I prove in the paper, is that the cost structure of elasticity and the cost structure of dispersion in the cloud imply that in many cases, response time can be dramatically reduced without a cost penalty.
The Value of Dispersion in Latency Reduction (simulation) shows this √n effect on a plane, both for service nodes deployed in regular lattices as well as randomly positioned.
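The Law of Cloud Response Time and its optimum can be checked numerically. The sketch below assumes the Q processors are split evenly across the n nodes, so that p = Q/n; that is my reading, under which setting dT/dn = 0 gives n = (QN/(2P))^(2/3). The function names and the sample values of N, P, and Q are illustrative assumptions.

```python
import math


def response_time(n, fixed, network, parallel, total_processors):
    """T = F + N/sqrt(n) + P/p, assuming p = Q/n processors at each of n nodes."""
    processors_per_node = total_processors / n
    return fixed + network / math.sqrt(n) + parallel / processors_per_node


def optimal_nodes(network, parallel, total_processors):
    """Closed-form optimum n = (Q*N / (2*P)) ** (2/3) under the same assumption."""
    return (total_processors * network / (2 * parallel)) ** (2 / 3)


# Assumed sample values: N = 100, P = 1000, Q = 1000, F = 0.
N, P, Q = 100.0, 1000.0, 1000
closed_form = optimal_nodes(N, P, Q)

# Brute-force search over whole numbers of nodes for comparison.
brute_force = min(range(1, Q + 1), key=lambda n: response_time(n, 0.0, N, P, Q))
print(closed_form, brute_force)
```

For these sample values the closed form gives roughly 13.6 nodes, and the brute-force search over whole numbers lands on 14.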
The Ninth Law of Cloudonomics: “Don’t put all your eggs in one basket”
The reliability of a system with n redundant components, each with reliability r, is 1-[(1-r)^n]. The good news is that higher reliability can be achieved through a judicious mix of replicated components, monitoring, management, and recovery processes, and overall architecture. The bad news is that no finite amount of replication will achieve perfect reliability.
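The formula above translates directly into code. This sketch assumes, as the formula does, that component failures are independent; the function names are hypothetical.

```python
def system_reliability(r, n):
    """Reliability of n redundant components, each with reliability r: 1 - (1 - r)**n."""
    return 1 - (1 - r) ** n


def replicas_needed(r, target):
    """Smallest n whose combined reliability reaches `target`.

    A target of 1.0 is rejected: mathematically, no finite n achieves it
    when r < 1.
    """
    if not 0 < r < 1:
        raise ValueError("r must be strictly between 0 and 1")
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    n = 1
    while system_reliability(r, n) < target:
        n += 1
    return n


# Two 90%-reliable components in parallel give about 99%.
print(system_reliability(0.9, 2))
# Replicas needed to reach at least 99.95% from 90%-reliable components.
print(replicas_needed(0.9, 0.9995))
```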
The Tenth Law of Cloudonomics: “An object at rest tends to stay at rest (Newton)”
A preferred option may be infeasible due to switching or migration costs as well as psychological or behavioral economic factors. For example, power costs may be lower at a new site, but the costs of early contract or lease termination, temporary operations replication, testing, and physical equipment moves may make it challenging to migrate. Clouds may offer advantaged economics, but the cost and time to rewrite thousands or millions of lines of code alter those economics, suggesting that traditional / legacy environments will need to coexist with clouds for a long time. In addition, there are numerous behavioral economics factors that support use of the cloud—for example, hyperbolic discounting—or conversely, support traditional operations—for example, a need for control.
Pay-per-use is a Dominant Strategy
In “The Market for Melons: Quantity Uncertainty and the Market Mechanism” (PDF) I prove the dominance of pay-per-use pricing over flat-rate pricing in competitive environments meeting certain basic conditions, such as non-trivial price dispersion and rational, active, maximizing consumers. This simulation demonstrates the effect. As the simulation runs, light users defect to pay-per-use plans, heavy ones to flat-rate ones. The flat-rate price rises, and defections tilt increasingly to pay-per-use. In the terminal state, virtually all customers are on pay-per-use plans. Perhaps surprisingly, even as consumers switch plans to reduce their costs, the total spend in such environments doesn’t change; in other words, aggregate shifts in plan selection are revenue neutral. This theoretical result matches the experience of major providers.
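The defection dynamic can be sketched with a toy model. This is not the model from the paper or the linked simulation; it rests on assumptions of my own: the pay-per-use price is 1 per unit, the flat-rate price is reset each round to the average usage of the current flat-rate subscribers (so that plan breaks even), and every customer then picks whichever plan is cheaper. All names are hypothetical.

```python
def simulate_defection(usages, max_rounds=1000):
    """Return one (pay_per_use_count, flat_rate_count, flat_price, revenue) per round."""
    flat_price = sum(usages) / len(usages)  # start from the overall average
    history = []
    for _ in range(max_rounds):
        # Customers using less than the flat price are better off paying per use.
        flat_users = [u for u in usages if u >= flat_price]
        ppu_users = [u for u in usages if u < flat_price]
        # The provider reprices the flat plan to break even on its subscribers.
        new_price = sum(flat_users) / len(flat_users)
        revenue = sum(ppu_users) + new_price * len(flat_users)
        history.append((len(ppu_users), len(flat_users), new_price, revenue))
        if new_price == flat_price:  # nobody has a reason to move: terminal state
            break
        flat_price = new_price
    return history


# Assumed population: 100 customers with usage 1, 2, ..., 100 units.
for ppu_count, flat_count, price, revenue in simulate_defection(list(range(1, 101))):
    print(ppu_count, flat_count, price, revenue)
```

In this toy population the flat-rate price ratchets upward each round until 99 of the 100 customers are on pay-per-use, while total revenue stays at 5050 throughout.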
Cloud Computing is Computationally Intractable
In “Cloud Computing is NP-Complete” (PDF), I prove that a basic problem in Cloud Computing, that of matching resources to demand over a network, is computationally intractable, even if there are sufficient total resources in the system to meet the aggregate demand. This result is something like the Heisenberg Uncertainty Principle, where position and momentum cannot simultaneously be precisely known, or Gödel’s proof that any non-trivial axiomatic system will either be incomplete or inconsistent. It implies that a “perfect solution” that optimizes cost may not be computable in a short enough time. Thus there must be excess resources in the system, to ensure that all demand can be satisfied; excess latency, as imperfect solutions utilize unnecessarily distant resources to fulfill demand; and/or excess overhead, to divide up demand to be served via multiple nodes.
Cloud Computing can represent different things to different stakeholders: a technology, a business model, or a development and operations strategy. In addition, it may be viewed abstractly as distributed resources over a network with on-demand, pay-per-use attributes. As such, certain laws and behaviors apply, as outlined above.