What’s a Cloud Management Platform? (Part 2: Cloud Optimization Edition)
Two weeks ago, we talked about some of the ways that a Cloud Management Platform (CMP) helps users relieve the headaches associated with DIY cloud resource management. This week, we’ll look at a few more compelling reasons to use a Cloud Management Platform like Sunshower.io for your cloud optimization and cloud resource management.
BizWest: Sunshower.io Now Tracks Cloud Usage on Amazon
By BizWest Staff — June 19, 2019
FORT COLLINS — Sunshower.io, a Fort Collins startup that helps companies optimize cloud computing, has just launched its Amazon Web Services EC2 cloud management platform. Soon, it will bring similar cloud management tools out of beta in order to service customers using Microsoft Azure and the Google Cloud Platform.
What’s a Cloud Management Platform? (And Why Do You Need One?) Part 1 of 2
Our official tagline at Sunshower.io is “beautifully simple cloud management and optimization.” But why do you need a Cloud Management Platform like Sunshower.io? When you work with a Cloud Service Provider (CSP) like AWS or Azure, doesn’t the CSP do the cloud optimization for you? Isn’t it the CSP’s job to make sure what you’re running in the cloud is rightsized, your applications are easy to view and manage, and that you’re getting the best possible value for your money? That’s what you’re paying them for, right?
We pushed out another update to the Sunshower platform yesterday.
On the Back End
We upgraded from Java 9 to Java 12 and from WildFly 14 to 16. As part of this, we also moved our L2 cache from Ignite to Infinispan. (We love Ignite, and are still using it for other things, but there was a memory leak in the version of Hibernate we were using, and Ignite was preventing us from upgrading.)
On the Front End
When you come into your system and we don’t have your optimizations ready to go, you’ll now see a big refresh button. You can also rerun the optimization at any time by hitting the refresh button in the upper right.
The optimization summary now lists instances by current machine price, descending. It is now paginated, so users with more than 12 instances can expect to see some new navigation.
You’ll also see a little green trophy by any instances that are fully optimized. Good luck, and get optimizing!
The next release will contain enhancements related to regions: support for limiting optimization recommendations by region, as well as the ability to group and view infrastructure by region. As always, please let us know any feature requests.
Corey makes the argument that upgrading an m3.2xlarge to an m5.2xlarge for a savings of 28% is the correct course of action. We have a user with > 30 m3.2xlarge instances whose CPU utilization is typically in the low single digits, but which spikes to 60+% periodically. Regardless, workloads rarely crash because of insufficient CPU; they do, however, frequently crash because of insufficient memory. In this case, their memory utilization has never exceeded 50%.
Our optimizations, which account for this and other utilization requirements, indicate that the “best fit” for their workload is in fact an r5.large, which saves them ~75%. In this case, for their region, the calculation is:
The approximate monthly difference is $8,891.40.
Now, these assume on-demand instances, and reserved instances can save you a substantial amount (29% in this case at $0.380 per instance/hour), but you’re locked in for at least a year and you’re still overpaying by 320%.
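The arithmetic behind these figures can be sketched roughly as follows. Only the $0.380 reserved rate comes from the text. The on-demand hourly prices ($0.532 for an m3.2xlarge, $0.126 for an r5.large), the 730-hour month, and the count of exactly 30 instances are assumptions, picked because they are consistent with the numbers quoted above. Real prices vary by region and change over time.

```python
HOURS_PER_MONTH = 730   # assumption: average number of hours in a month
INSTANCE_COUNT = 30     # assumption: the text only says "> 30"

# Hourly prices in USD. The on-demand values are assumptions; check your region.
M3_2XLARGE_ON_DEMAND = 0.532
R5_LARGE_ON_DEMAND = 0.126
M3_2XLARGE_RESERVED = 0.380  # reserved rate quoted in the text


def monthly_cost(hourly_price, count=INSTANCE_COUNT, hours=HOURS_PER_MONTH):
    """Monthly cost of running `count` instances around the clock."""
    return hourly_price * count * hours


def percent_saved(old_price, new_price):
    """Savings from moving from old_price to new_price, as a percentage."""
    return (old_price - new_price) / old_price * 100


monthly_difference = (
    monthly_cost(M3_2XLARGE_ON_DEMAND) - monthly_cost(R5_LARGE_ON_DEMAND)
)
rightsizing_savings = percent_saved(M3_2XLARGE_ON_DEMAND, R5_LARGE_ON_DEMAND)
reserved_savings = percent_saved(M3_2XLARGE_ON_DEMAND, M3_2XLARGE_RESERVED)

print(f"Monthly difference from rightsizing: ${monthly_difference:,.2f}")
print(f"Rightsizing saves about {rightsizing_savings:.0f}%")
print(f"Reserving the m3.2xlarge saves about {reserved_savings:.0f}%")
```

Under these assumptions the rightsizing savings come out a little above the ~75% quoted, and the reserved-instance discount lands at roughly 29%.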
“An ‘awful lot of workloads are legacy’ -> Legacy workloads can’t be migrated”
So, this one’s a little harder to tackle just because “an awful lot” doesn’t correspond to a proportion, but let’s assume it means “100%” just to show how wrong this is according to the points he adduces:
If you’ve heard of cloud computing at all, you’ve heard of Amazon Web Services (AWS), Microsoft Azure and Google Cloud. Between the three of them, they’ll be raking in over $50 billion in 2019. If you’re on the cloud, chances are good you’re using at least one of them.
The latest RightScale State of the Cloud Report pegs AWS adoption at 61%, Azure at 52% and Google Cloud at 19% (see the purple above). What’s more, almost all respondents (as denoted in blue) were experimenting with or planned to use one of the top three clouds. Which, if you math that up, means that 84% of respondents are going to be using AWS at some point, 77% will be using Azure and 55% will be using Google Cloud.
Multi-cloud strategies are definitively A Thing, contrary to some folks’ opinions and the overwhelming one-cloud-to-rule-them-all desire of AWS. So it’s worth comparing them. On a broad level, AWS rocks and rolls with capabilities set to lock you into their cloud, while Azure’s great for enterprises and Google Cloud’s your go-to if you want to do AI. But, as with all things, there’s more to it than that, and it’s not just where you can get the best cloud credit deals.
You wouldn’t think that the primary issue with optimizing cloud computing workloads would be getting good data. Figuring out math problems (hello, integer-constrained programming) worthy of a dissertation, sure. Writing a distributed virtual machine, maybe. Getting good data about a workload to run against good data about what the viable machines to put it on are? Not so much.
Well, you would be wrong. While the majority of the IP is in said math problems, the majority of the WORK is in the data — getting it and cleaning it up. And the data problem alone is enough to make you realize why everyone just picks an instance size and rolls with it until it doesn’t work anymore.
Last week we started the work to expand our platform from AWS-only to Azure. One of the first steps to that is what we call a “catalog”: a listing of all the possible virtual machine sizes across all possible regions with all of their pricing information (because, of course, pricing and availability vary). You would hope that this sort of catalog would be readily accessible from a cloud service provider (CSP). At the moment, the state-of-the-art is the work of many open-source contributors working together to scrape different CSP sets of documentation.
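To make the idea of a catalog concrete, here is a minimal sketch of what one entry might look like and how it could be queried. The field names, the `cheapest_fit` helper, and the sample prices are all hypothetical and for illustration only. This is not the Sunshower.io schema or any CSP's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One machine size in one region (hypothetical shape)."""
    provider: str           # e.g. "aws" or "azure"
    region: str
    instance_type: str
    vcpus: int
    memory_gib: float
    operating_system: str   # prices can differ between Linux and Windows
    hourly_price_usd: float


def cheapest_fit(catalog, region, min_vcpus, min_memory_gib,
                 operating_system="linux"):
    """Return the cheapest entry meeting the requirements, or None."""
    candidates = [
        entry for entry in catalog
        if entry.region == region
        and entry.operating_system == operating_system
        and entry.vcpus >= min_vcpus
        and entry.memory_gib >= min_memory_gib
    ]
    return min(candidates, key=lambda e: e.hourly_price_usd, default=None)


# Sample rows with illustrative prices; a real catalog has thousands of entries.
catalog = [
    CatalogEntry("aws", "us-east-1", "m3.2xlarge", 8, 30.0, "linux", 0.532),
    CatalogEntry("aws", "us-east-1", "m5.2xlarge", 8, 32.0, "linux", 0.384),
    CatalogEntry("aws", "us-east-1", "r5.large", 2, 16.0, "linux", 0.126),
]

best = cheapest_fit(catalog, "us-east-1", min_vcpus=2, min_memory_gib=16)
print(best.instance_type if best else "no match")
```

A flat list works for a sketch. At real catalog sizes you would likely index by region and operating system first.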
For AWS, we love ec2instances.info for this information, though we still had to get all of the region information in less savory ways. Different folks have attempted to do similar things for Azure, but Azure doesn’t make it easy. Pricing is different across Linux and Windows, because of course it is, but the information they give you when trying to look at pricing is missing some bits.
At first glance, that statement might seem counter-intuitive. Reserved Instances (RIs) are widely advertised as the best way to save big on your Amazon Web Services (AWS) cloud compute bill. And in many cases, they are. With Reserved Instances, companies commit to long-term usage by agreeing to rent virtual machines for a set amount of time (typically 1 to 3 years) in exchange for a significantly lower rate than on-demand pricing. When viewed through this lens, they appear to be a vital part of an AWS cost management strategy.
Take Amazon EC2 as an example. When compared to on-demand pricing, Amazon EC2 RIs offer customers potentially deep discounts — sometimes as much as 75%, per their marketing. While reserving cloud capacity in advance seems like the smart thing to do because it has the potential to deliver a significant amount of savings, the savings promised by RIs often have a dangerous downside — and any missteps can have substantial costs for your company.
The calculations involved in deciding which RI to purchase can be frustratingly complicated. One-year or three-year contract? What about tenancy? Instance size? Region and zone? New or from the marketplace? And don’t forget about the nuance of offering class: do you want your RI standard, convertible or scheduled?
These calculations are difficult, but absolutely vital when committing to an RI. Rather than signing a contract for exactly what you have now (in terms of size, region, and tenancy) and guessing at the length of time that will fit your instance, it’s essential to understand the exact shape of your usage needs. Without that kind of granular insight into your workload, it’s impossible to choose an RI that will be the right fit six months from now, let alone three years in the future.
In the end, many companies buy RI capacity that ends up exceeding their actual needs, because they’re already using capacity that exceeds their needs. Unfortunately, committing to more capacity than you actually need can be very costly over the length of an RI contract. When that happens, the long-term return on investment (ROI) ultimately evaporates.
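A rough sketch of how that can play out, comparing a reserved oversized instance with a rightsized on-demand one. The hourly rates below are hypothetical placeholders, and the model ignores upfront payment options and convertible RIs.

```python
HOURS_PER_YEAR = 8760


def reserved_cost(hourly_rate, years):
    """A reservation is billed for every hour of the term, used or not."""
    return hourly_rate * HOURS_PER_YEAR * years


def on_demand_cost(hourly_rate, years, utilization=1.0):
    """On-demand is billed only for the hours the instance actually runs."""
    return hourly_rate * HOURS_PER_YEAR * years * utilization


# Hypothetical rates in USD: a discounted reservation on an oversized
# instance versus full on-demand pricing on a smaller instance that fits.
OVERSIZED_RESERVED_RATE = 0.380
RIGHTSIZED_ON_DEMAND_RATE = 0.126

oversized_reserved = reserved_cost(OVERSIZED_RESERVED_RATE, years=1)
rightsized_on_demand = on_demand_cost(RIGHTSIZED_ON_DEMAND_RATE, years=1)

print(f"Reserved, oversized:   ${oversized_reserved:,.2f} per year")
print(f"On-demand, rightsized: ${rightsized_on_demand:,.2f} per year")
```

With these placeholder rates, the discounted reservation still costs more per year than paying full on-demand price for the instance that fits, which is why it pays to understand the workload before committing.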