Cloud-Agnostic Infrastructure-As-Code Trials and Tribulations

Opinions around cloud-agnosticism vary, but it is generally considered a truism that infrastructure-as-code (IaC) is the best, perhaps the only, way to achieve cloud-agnostic deployments. Cloud-agnostic deployments are obviously not a goal for every organization, but we’ve found that as organizations become more sophisticated, they frequently become a need. This is most often due to customers’ on-premises needs or the organization’s own hybrid-cloud requirements. Yet despite their sophistication, these organizations almost universally struggle to implement cloud-agnostic deployments, even with infrastructure-as-code tools.

Tools like Terraform and Kubernetes are the ones most often used for cloud-agnostic infrastructure-as-code, and while we love and use them as much as the next DevOps nerd, we’ve found that they can run into a variety of problems. Broadly, in our experience, these cloud deployment and management problems fall into the following categories:

  1. Visibility (what infrastructure am I renting, and where?)
  2. Infrastructure design and architecture
  3. Infrastructure deployment
  4. Cost analysis and optimization, especially as it pertains to infrastructure utilization
  5. Parameterization (secrets, credentials, etc.)

Few cloud management tools cover all of these categories. Those that do run into “common denominator” problems: concerns about what functionality remains consistent across clouds. We find that folks overestimate both how often common denominator problems occur and how problematic they actually are. While the names and APIs for services vary wildly across cloud platforms, once you get past the syntax, most services are functionally the same. Indeed, we find that building sensible defaults into our abstraction layer solves most of those problems. Usually, the problems arise when mapping individual capability concepts, e.g. porting function-as-a-service-based code from AWS Lambda to another cloud. Even more typically, they come from providing a common view of infrastructure elements between clouds, tenants, regions, etc.
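To make the "sensible defaults" idea concrete, here is a minimal sketch of a provider-neutral request being translated into provider-specific terms. It is not how Troposphere works internally; the `VmRequest` type, the `translate` function, the neutral size and region labels, and the specific instance type names in the mapping tables are all assumptions chosen for illustration and should be checked against each provider's current catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmRequest:
    """Provider-neutral description of a virtual machine (hypothetical type)."""
    name: str
    size: str = "small"      # sensible default: most requests never override it
    region: str = "us-east"  # neutral region label, not a real provider region


# Illustrative mappings from neutral labels to provider vocabulary.
SIZE_MAP = {
    "aws": {"small": "t3.small", "medium": "t3.medium"},
    "azure": {"small": "Standard_B1ms", "medium": "Standard_B2s"},
    "gcp": {"small": "e2-small", "medium": "e2-medium"},
}
REGION_MAP = {
    "aws": {"us-east": "us-east-1"},
    "azure": {"us-east": "eastus"},
    "gcp": {"us-east": "us-east1"},
}


def translate(request, provider):
    """Turn a neutral request into the names one provider expects."""
    try:
        return {
            "provider": provider,
            "name": request.name,
            "instance_type": SIZE_MAP[provider][request.size],
            "region": REGION_MAP[provider][request.region],
        }
    except KeyError as missing:
        # A missing key is exactly the "common denominator" gap: the neutral
        # concept has no counterpart on this provider (or none we mapped).
        raise ValueError(f"no mapping for {missing} on {provider!r}") from None


if __name__ == "__main__":
    for cloud in ("aws", "azure", "gcp"):
        print(translate(VmRequest("web-frontend"), cloud))
```

The same request produces three different but functionally equivalent descriptions, and the only failure mode is a missing mapping, which is where the genuinely hard cases (such as function-as-a-service code) show up.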

We’ve sought to alleviate these problems with Troposphere, our drag-and-drop orchestration engine that can take whatever your current infrastructure-as-code solution is and make it cloud-agnostic. Over the coming posts, we’ll address the issues we’ve seen in implementing cloud-agnostic infrastructure-as-code solutions, and how to address them.

One Configurable, Cross-Platform Self-Extracting Executable Generator to Rule Them All

Generating configurable, cross-platform self-extracting executables has been the purview of commercial software for, well, forever. We wanted folks of all operating systems to be able to download and run our software locally. We also didn’t want to break the bank. After much deliberation, we decided to build the solution ourselves, making cross-platform installer generation easy and accessible.

Introducing the Zephyr Installer Plugin Ecosystem

With Zephyr Build Plugins, you can generate any self-extracting executable targeting any supported platform from any supported platform, as well as executables for every supported platform on any supported platform from the same build. Even better? Generated executables do not require any JVM to be installed on the end user’s system to function.

You can create self-extracting executables with code-signing, as well as modifiable icons, metadata and run permissions for Windows, Mac and Linux. Additionally, Zephyr Build Plugins can generate ICO and ICNS icons from PNG, SVG and other formats. We currently support Maven for all platforms, but plan to support Gradle as well. Bazel and Ant support will be available with a commercial license.

Use Cases

Create a self-extracting executable targeting (Windows, Mac, or Linux)

Executables can be built on your current operating system (Mac, Windows, or Linux) with no modifications or third-party requirements.

Sign generated executables

Executables can be signed for any supported platform with the same configuration. Sign Windows executables with Authenticode on Mac, Linux, or Windows, or sign Mac app packages with CodeSign on Mac, Linux, or Windows.

Generate ICO/ICNS files

Traditionally, having ICO or ICNS icon files on hand has been a prerequisite, requiring third-party commercial tools, online icon generators, etc. The Zephyr build ecosystem allows you to generate ICO/ICNS files in a variety of sizes from standard image formats like PNG and SVG.

Attach ICO/ICNS files to your executable

Inserting branding icons into executables has been a platform-dependent chore, but Zephyr allows you to brand your generated executable in a platform-independent way.

Create installers for JVM-based programs

Previously, installers for JVM-based programs required the installer to download the JVM, or forced the end-user to install it. Zephyr allows you to launch IzPack installers using a JVM bundled with your application.

Automate everything!

Since these tools are included as build plugins for the most popular build systems, you can completely eliminate any manual steps in your installer generation process!

You can find the Maven plugin documentation at https://zephyr.sunshower.io/site/

Reflecting on 2019

Like most folks coming into this new year and new decade, we’ve been reflecting on the past. Sunshower.io is far less than a decade old (we turn two in March!), but even a year is a long time in the life of a small business. Despite our small size, it was a banner year.

First Six Months

The first six months of the year were largely occupied by releasing the Sunshower platform to the public. Our web application plays host to Stratosphere, allowing you to visualize your AWS EC2 infrastructure around the globe, and Anvil, allowing you to save 66% on your AWS EC2 bill with right-sizing.

We also spent a lot of the first half of the year doing the “startup circuit” — taking second in a startup pitch competition, doing lots of networking events with investors, and interviewing at Y Combinator.

Last Six Months

From there, we abruptly got pulled into the land of government contracting. A large defense contractor contacted us about doing a white paper together for the Air Force, which we submitted in July. We spent August waiting to hear if we would receive an RFP, and September writing the proposal. Our contacts at the Air Force think our offering is incredibly valuable, and last we heard, the proposal was in the technology evaluation phase. Interested in what we pitched them? Download our marketing white paper.

This fall, we’ve renewed our commitment to open-source software. Most notably, we refactored our plugin framework into Zephyr, so other organizations can reap the benefit of a non-OSGi system for lifecycle and dependency management. We also renewed our work on Aire, a UI framework built on Aurelia and UIKit.

Last but not least, we finished out the year with a bit of a rebrand. We have a new logo and doubled down on our bright colors. We also changed our tagline to reflect our movement away from optimization and towards application and infrastructure management and deployment.

We’re looking forward to what 2020 has in store!

Introducing Zephyr: A Java Plugin System for the 21st Century

At Sunshower.io, we write software for people who write software. We’re pleased to announce something new to help folks scale their software: Zephyr, a next-generation plugin framework written in Java. Zephyr is an OSGi alternative — inspired by the best parts of it while dramatically reducing complexity and improving interoperability with existing frameworks and ecosystems.

Zephyr was born from our frustration with existing module systems. We started off using WildFly and embedding OSGi, but this proved inadequate for the complex dependency graphs we encountered while developing the Sunshower platform. In particular, continually copy/pasting around manifests to import the dozens of packages from various frameworks was tedious and error-prone (and auto-generating them wasn’t much better, in fact). It greatly increased the complexity of our builds and deployments as we’d continually need to rev released versions of modules. This is to say nothing of the complexities of testing module interactions, or the joys of a ClassNotFoundException, caused by a forgotten Import-Package declaration, appearing suddenly after weeks of smooth operation.

After over 18 months of working around framework limitations, we looked at the “Kernel” that arose from coping with these problems and decided “Hey, this is pretty useful. Let’s get rid of underlying systems and just use that.” And now we’re open-sourcing it.

Small but mighty, Zephyr aggressively and automatically parallelizes management operations while running in less than 512KB of memory. It intelligently manages all aspects of plugin lifecycle, including dependency resolution. Deploying new plugins is quick and painless. And, of course, setting up plugin dependencies for tests is, well, a breeze.
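To illustrate what dependency-aware, parallel lifecycle management means in practice, here is a minimal sketch using a topological sort from Python's standard library. It is not Zephyr's implementation (Zephyr is written in Java); the plugin names and the `start_batches` helper are hypothetical. The idea it shows is general: plugins whose dependencies are already satisfied form a batch, and everything inside a batch can be started concurrently.

```python
from graphlib import TopologicalSorter


def start_batches(dependencies):
    """Group plugins into start-up batches.

    `dependencies` maps each plugin to the set of plugins it depends on.
    Plugins in the same batch do not depend on one another, so a runtime
    could start them in parallel.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)  # mark the batch as started, unlocking dependents
    return batches


# Hypothetical plugin graph: plugin -> plugins it depends on.
plugins = {
    "core": set(),
    "logging": {"core"},
    "persistence": {"core"},
    "web": {"logging", "persistence"},
}

if __name__ == "__main__":
    for number, batch in enumerate(start_batches(plugins), start=1):
        print(f"batch {number}: {batch}")
```

Here `logging` and `persistence` land in the same batch because both need only `core`, while `web` has to wait for both of them.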

While we wrote it in Java, Zephyr works with whatever languages you normally use by installing language runtimes as plugins. You can have multiple frameworks running side by side, eliminating a lot of overhead associated with rewrites, scaling and transitioning architectures.

Zephyr is available on GitHub under an MIT license. Enterprise support contracts are available. Go check out the website, the docs or the repository. We’d love to have you involved!

Sunshower Platform 6/4 Release Notes

We pushed out another update to the Sunshower platform yesterday.

On the Back End

We upgraded from Java 9 to Java 12 and from WildFly 14 to 16. As part of this, we also moved our L2 cache from Ignite to Infinispan. (We love Ignite, and are still using it for other things, but there was a memory leak in the version of Hibernate we were using, and Ignite was preventing us from upgrading.)

On the Front End

When you come into your system and we don’t have your optimizations ready to go, you’ll now see a big refresh button. You can also rerun the optimization at any time by hitting the refresh button in the upper right:

Refreshing the optimization is more obvious
No optimization results ready to go? Push the button!

The optimization summary now lists instances by current machine price, descending. It is currently paged, so users with more than 12 instances can expect to see some new navigation:

Paging at the bottom of the summary
Pages and pages of lovely optimization goodness

You’ll also see a little green trophy by any instances that are fully optimized. Good luck, and get optimizing!

Up Next

The next release will contain enhancements related to regions: support for limiting optimization recommendations by region, as well as the ability to group and view infrastructure by region. As always, please let us know any feature requests.

The Big Three: Comparing AWS, Azure and Google Cloud for Computing

If you’ve heard of cloud computing at all, you’ve heard of Amazon Web Services (AWS), Microsoft Azure and Google Cloud. Between the three of them, they’ll be raking in over $50 billion in 2019. If you’re on the cloud, chances are good you’re using at least one of them.

The latest RightScale State of the Cloud Report pegs AWS adoption at 61%, Azure at 52% and Google Cloud at 19% (see the purple above). What’s more, almost all respondents (as denoted in blue) were experimenting with or planned to use one of the top three clouds. Which, if you math that up, means that 84% of respondents are going to be using AWS at some point, 77% will be using Azure and 55% will be using Google Cloud.
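The totals above are simple sums of current adoption and the "experimenting or planning" share. As a quick check, the short sketch below works backwards from the quoted figures to the implied prospective share for each cloud; the variable and function names are ours, and the only inputs are the percentages quoted in the paragraph above.

```python
# (currently using, using or planning to use), in percent of respondents,
# as quoted from the report in the text above.
survey = {
    "AWS": (61, 84),
    "Azure": (52, 77),
    "Google Cloud": (19, 55),
}


def implied_prospects(figures):
    """Share of respondents experimenting with or planning to use each cloud."""
    return {cloud: total - current for cloud, (current, total) in figures.items()}


if __name__ == "__main__":
    for cloud, share in implied_prospects(survey).items():
        print(f"{cloud}: {share}% experimenting or planning")
```

By this arithmetic, Google Cloud has the smallest current footprint but the largest group of respondents still on their way in.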

AWS, Azure & GCP market share

Multi-cloud strategies are definitely A Thing, contrary to some folks’ opinions and the overwhelming one-cloud-to-rule-them-all desire of AWS. So it’s worth comparing them. On a broad level, AWS rocks and rolls with capabilities set to lock you into their cloud, while Azure’s great for enterprises and Google Cloud’s your go-to if you want to do AI. But, as with all things, there’s more to it than that, and it’s not just where you can get the best cloud credit deals.

Everything is a Data Problem

You wouldn’t think that the primary issue with optimizing cloud computing workloads would be getting good data. Figuring out math problems (hello, integer-constrained programming) worthy of a dissertation, sure. Writing a distributed virtual machine, maybe. Getting good data about a workload to run against good data about what the viable machines to put it on are? Not so much.

Well, you would be wrong. While the majority of the IP is in said math problems, the majority of the WORK is in the data — getting it and cleaning it up. And the data problem alone is enough to make you realize why everyone just picks an instance size and rolls with it until it doesn’t work anymore.

Last week we started the work to expand our platform from AWS-only to Azure. One of the first steps to that is what we call a “catalog”: a listing of all the possible virtual machine sizes across all possible regions with all of their pricing information (because, of course, pricing and availability vary). You would hope that this sort of catalog would be readily accessible from a cloud service provider (CSP). At the moment, the state-of-the-art is the work of many open-source contributors working together to scrape different CSP sets of documentation.
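To give a sense of what such a catalog holds, here is a minimal sketch of one possible shape for it. This is not our production schema: the `CatalogEntry` fields, the `Catalog` class, and every instance name and price in the sample rows are placeholders invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One purchasable VM size in one region (field names are assumptions)."""
    provider: str
    region: str
    instance_type: str
    operating_system: str  # pricing can differ between operating systems
    vcpus: int
    memory_gib: float
    hourly_usd: float


class Catalog:
    """In-memory listing of VM sizes, queryable by provider and region."""

    def __init__(self, entries):
        self._entries = list(entries)

    def available(self, provider, region, operating_system="linux"):
        """Entries offered in a region for one OS, cheapest first."""
        matches = [
            entry for entry in self._entries
            if entry.provider == provider
            and entry.region == region
            and entry.operating_system == operating_system
        ]
        return sorted(matches, key=lambda entry: entry.hourly_usd)


# Placeholder rows: the names and prices are made up, not real list prices.
catalog = Catalog([
    CatalogEntry("azure", "eastus", "example-small", "linux", 1, 2.0, 0.02),
    CatalogEntry("azure", "eastus", "example-small", "windows", 1, 2.0, 0.03),
    CatalogEntry("azure", "westeurope", "example-small", "linux", 1, 2.0, 0.025),
    CatalogEntry("azure", "eastus", "example-large", "linux", 4, 16.0, 0.17),
])

if __name__ == "__main__":
    for entry in catalog.available("azure", "eastus"):
        print(entry.instance_type, entry.hourly_usd)
```

Even in this toy version, the same size appears once per region and operating system, which is why the real thing gets large and why gaps in the source data hurt.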

For AWS, we love ec2instances.info for this information, though we still had to get all of the region information in less savory ways. Different folks have attempted to do similar things for Azure, but Azure doesn’t make it easy. Pricing is different across Linux and Windows, because of course it is, but the information they give you when trying to look at pricing is missing some bits:

Screenshot comparing B-Series instances on Azure

How We Optimize Based on Resource Utilization Data

We frequently get asked what makes our AWS cost optimization so good. AWS cost management feels like it should be easy, and we talk to a lot of folks who think they’ve done a good job of it. The fact is, we’ve yet to see anyone who’s not wasting at least 40% of their EC2 bill. Let’s walk through it on our platform, and it’ll make sense why.

screenshot of a virtual machine report within the Sunshower platform

Fitting an Instance

It all starts with knowing what you’re actually using, resource-wise. Figuring this out as a human is surprisingly hard. For Sunshower, we look at the past month of a virtual machine’s life (if we have it; that’s our default) and sample every minute (by default, but it’s adjustable). After smoothing the data, we discover that, in this case, only 1 CPU (of the 8 they’re paying for) and 10 GB of RAM (of the 30 GB they’re paying for) are actually being used.
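As a rough illustration of why smoothing matters, here is a minimal sketch that applies a trailing moving average to per-minute samples and reports the peak of the smoothed series. The moving average is our stand-in for this example only; the platform's actual smoothing method is not described here, and the function names, window size, and sample values are assumptions.

```python
from statistics import mean


def moving_average(samples, window):
    """Trailing moving average; early points average over what is available."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for index in range(len(samples)):
        start = max(0, index - window + 1)
        smoothed.append(mean(samples[start:index + 1]))
    return smoothed


def utilized(samples, window=5):
    """Peak of the smoothed series: sustained demand rather than a blip."""
    if not samples:
        raise ValueError("need at least one sample")
    return max(moving_average(samples, window))


if __name__ == "__main__":
    # Hypothetical per-minute CPU cores in use, with one brief spike to 8.
    cpu_samples = [1.0, 1.0, 1.0, 8.0, 1.0, 1.0]
    print("raw peak:", max(cpu_samples))
    print("smoothed peak:", round(utilized(cpu_samples, window=3), 2))
```

A single one-minute spike no longer dictates the whole sizing decision, which is the point of smoothing before fitting. How much to discount spikes is a judgment call that depends on the workload.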

In the screenshots below, you can see the resulting “shape” of the workload on the virtual machine. First, on the left: current vs utilized. The grey is what they’re currently paying for, and the purple is what they’re utilizing. Frankly, it LOOKS like a pretty good fit.

To compare, let’s look at the screenshot on the right: optimized vs utilized. There’s our purple triangle of utilization again. This time, you’ll see the optimized fit we found in blue. Even though the blue section looks a lot bigger, it actually reflects a substantial cost savings over the original, grey fit on the left.

resource utilization compared to purchased virtual machine

How is that possible? The thing you’re really paying for, in most machines, is CPU and memory. So, the closer a fit you can get on those, the better. In the image on the left, you can see that the majority of the overprovisioning is taking place in the most expensive areas of cloud spend: CPU and memory. Tightening that fit up in CPU and memory, like you see represented in blue in the image on the right, might look like an incremental change from the image on the left, but in reality it adds up.
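The fitting step can be sketched as a simple search: among the machines that cover measured CPU and memory use plus some headroom, pick the cheapest. This is a deliberately naive stand-in, not our optimization algorithm; the `cheapest_fit` function, the 20% headroom, and the candidate names and prices are all hypothetical. The utilization figures are the ones from the example above (1 CPU and 10 GB of RAM in use on an 8 CPU, 30 GB machine).

```python
def cheapest_fit(candidates, cpu_needed, memory_needed_gib, headroom=1.2):
    """Cheapest candidate covering CPU and memory needs with headroom.

    Returns None when nothing is large enough.
    """
    fits = [
        machine for machine in candidates
        if machine["vcpus"] >= cpu_needed * headroom
        and machine["memory_gib"] >= memory_needed_gib * headroom
    ]
    if not fits:
        return None
    return min(fits, key=lambda machine: machine["hourly_usd"])


# Hypothetical machine sizes with placeholder prices.
candidates = [
    {"name": "large-8x30", "vcpus": 8, "memory_gib": 30, "hourly_usd": 0.40},
    {"name": "medium-2x16", "vcpus": 2, "memory_gib": 16, "hourly_usd": 0.12},
    {"name": "small-2x8", "vcpus": 2, "memory_gib": 8, "hourly_usd": 0.08},
]

if __name__ == "__main__":
    choice = cheapest_fit(candidates, cpu_needed=1, memory_needed_gib=10)
    print(choice["name"], choice["hourly_usd"])
```

With these placeholder numbers, the smallest machine is rejected because 8 GB cannot cover 10 GB of use, and the mid-sized machine wins over the current one on price. A real optimizer also has to weigh disk, network, and instance family constraints, which is where the harder math comes in.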

3 Reasons Your Cloud Bill is So High

At Sunshower.io, we talk to a lot of people about their cloud infrastructure usage. In our professional lives, we’ve dealt with the confusion caused by different cloud vendors, including confounding billing methods, lack of insight into the infrastructure you’ve built, and just throwing hardware and money at the current problem and hoping it’ll fix it. Understandably, the question we’re most frequently asked is the one that’s most mission-critical: How did my cloud bill get like this and how do I get it down?

1) You Forgot About Some Infrastructure

“Cloud sprawl” is extremely common, and happens when you’re running more cloud instances than necessary. It’s easy to see how this can happen: workloads you’ve forgotten about and workloads that sit unused or idle are all key culprits. In a complex cloud ecosystem, it can be tough to keep watch over everything running in the cloud. Monitoring and controlling those workloads is key to making sure you’re not over-spending on the cloud. If your company isn’t using auto-scaling, you might be running instances 24/7 that aren’t always performing a necessary function. Running instances that you’re not using is essentially throwing money away, like going away for the weekend and leaving all of your lights on.

2) You Bought Too Much “Just In Case”

Overprovisioning refers to buying more cloud compute resources than you typically need. It’s important to tailor what you buy to actual usage, because it really adds up. The first step is figuring out what you’re actually using, which monitoring and cloud cost optimization tools can help with; without good cloud monitoring tools, it’s impossible to see what you’re wasting. Only then should you start looking into what to buy instead. If this process is overwhelming, there are vendors you can work with to help you sift through your options and make the best possible choices.

3) You Drank The Vendor Kool-aid

The custom services provided by cloud service providers are tempting, but the cost can really add up. Even worse, relying on them removes your ability to migrate to other cloud providers, so it’s hard to pivot to more cost-effective solutions over time. As you build your cloud strategy, try to avoid locking yourself into a relationship with a single cloud service provider. Don’t tie yourself to a single vendor because it’s convenient; make sure that you’re allowing yourself the flexibility to change providers and adopt new strategies when costs start to increase.

Save 40% or more in 40 Seconds

Sunshower.io’s optimizing algorithms help you save time and money on the cloud. There’s no upfront cost, and our results are better than our competitors’. It’s cloud computing optimization unlike anything else out there. So how do we do the thing?

Imagine you need to rent a storage unit — you have a bunch of boxes, but nowhere to put them. No problem! There are a ton of companies out there that will rent you a storage space. You do a quick search, and find over 30 self-storage companies scattered across town. You don’t have time to talk to everyone to compare prices (who does?), so you call a company whose name you recognize. You’re not exactly sure what size storage unit you need, so they talk you into renting a 10 x 20 unit, “just in case”, at which point you end up with a storage unit that looks like the photo above.

End result: you’re paying for a lot more than what you need. Sure, you could move to another facility, but who wants to negotiate with another company, then give up their whole day to move a truck and switch facilities? Easier to stay put, and keep that extra space.

Buying too much “just in case” is a very common thing for companies on the cloud, too. Why?

  • There are an overwhelming number of cloud service provider choices
  • There are an overwhelming number of options on each of those cloud service providers
  • The UIs of cloud service providers are confusing
  • It’s hard to know exactly what you need and what you’re using

That’s where Sunshower.io comes in

When you work with us, we securely run metadata about your resource usage through a proprietary algorithm designed to find the exact right fit for your cloud compute needs. We use machine learning to analyze millions of data points, factor in fluctuations in resource usage over time, and come up with a cloud plan that ensures you aren’t overpaying “just in case.” We find you a fit so good, we can save our customers 40% or more on their monthly cloud compute bill.

(Think Cinderella’s glass slippers, with good arch support and just enough wiggle room for your toes.)

Over time, this kind of cloud savings can be game-changing. Just imagine the decisions you could make even with an extra 20% of your monthly cloud spend back in your pocket, like hiring another engineer, or launching a great social media campaign. And that’s what we’re all about at Sunshower.io: helping you focus on what matters — your business.

To that end, we’re excited to announce our just-launched AWS EC2 optimizer

If you’re currently using Amazon Web Services EC2 for your cloud infrastructure, our service (colloquially known as Anvil) has been specifically tailored to analyze your data and come up with a better cloud usage plan to help with AWS cost optimization. The bottom line: Our AWS cost optimization can help you save money on AWS with just a few clicks.

Not using AWS EC2? We promise you won’t be left out. We’re launching AWS RDS optimization next, and we’ll be releasing optimizations for more public clouds as we go along. (Google Cloud and Azure are next up on the list, but let us know your needs and we can re-prioritize!)