Cloud-Agnostic Infrastructure-As-Code Trials and Tribulations

Opinions on cloud-agnosticism vary, but it is generally considered a truism that infrastructure-as-code (IaC) is the best, and perhaps the only, way to achieve cloud-agnostic deployments. Cloud-agnostic deployments are obviously not a goal for every organization, but we’ve found that as organizations become more sophisticated, they frequently become a need. This is most often due to customers’ on-premises needs or the organization’s own hybrid-cloud requirements. Yet despite their sophistication, these organizations almost universally struggle to implement cloud-agnostic deployments, even with infrastructure-as-code tools.

Terraform and Kubernetes are the tools most often used for cloud-agnostic infrastructure-as-code, and while we love and use them as much as the next DevOps nerd, we’ve found that this approach can run into a variety of problems. Broadly, in our experience, these cloud deployment and management problems fall into the following categories:

  1. Visibility (what infrastructure am I renting, and where?)
  2. Infrastructure design and architecture
  3. Infrastructure deployment
  4. Cost analysis and optimization, especially as it pertains to infrastructure utilization
  5. Parameterization (secrets, credentials, etc.)

There aren’t very many cloud management tools that capture all of these categories. Those that exist do have “common denominator” problems: concerns around what functionality remains consistent across clouds. We find that folks overestimate both how often common denominator problems arise and how problematic they actually are. While the names and APIs for services vary wildly across cloud platforms, once you get past the syntax, the services are functionally the same. Indeed, we find that providing sensible defaults in our abstraction layer solves most of those problems. Usually, the problems arise when mapping individual capability concepts, e.g. porting function-as-a-service-based code from AWS Lambda to another cloud. Even more typically, they come from providing a common view of infrastructure elements across clouds, tenants, regions, etc.
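
As a rough illustration of what sensible defaults in an abstraction layer can look like, the sketch below resolves a provider-neutral machine size to a provider-specific instance type, with an optional explicit override. Every name here (CloudDefaults, MachineSize, Provider) is hypothetical, and the instance types are only plausible examples. This is not Troposphere’s actual model.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical provider-neutral vocabulary; not part of any real SDK.
enum MachineSize { SMALL, MEDIUM }

enum Provider { AWS, AZURE, GCP }

final class CloudDefaults {
    // provider -> (neutral size -> example instance type); values are illustrative only
    private static final Map<Provider, Map<MachineSize, String>> DEFAULTS =
            new EnumMap<>(Provider.class);

    static {
        DEFAULTS.put(Provider.AWS, sizes("t3.small", "m5.large"));
        DEFAULTS.put(Provider.AZURE, sizes("Standard_B1ms", "Standard_D2s_v3"));
        DEFAULTS.put(Provider.GCP, sizes("e2-small", "e2-standard-2"));
    }

    private static Map<MachineSize, String> sizes(String small, String medium) {
        Map<MachineSize, String> bySize = new EnumMap<>(MachineSize.class);
        bySize.put(MachineSize.SMALL, small);
        bySize.put(MachineSize.MEDIUM, medium);
        return bySize;
    }

    /** Resolves a neutral size to an instance type, preferring an explicit override. */
    static String instanceType(Provider provider, MachineSize size, Optional<String> override) {
        return override.orElseGet(() -> DEFAULTS.get(provider).get(size));
    }
}
```

Callers that do not care about provider specifics pass Optional.empty() and get the default, while callers with special needs can still name an exact instance type.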

We’ve sought to alleviate these problems with Troposphere, our drag-and-drop orchestration engine that can take whatever your current infrastructure-as-code solution is and make it cloud-agnostic. Over the coming posts, we’ll address the issues we’ve seen in implementing cloud-agnostic infrastructure-as-code solutions, and how to address them.

One Configurable, Cross-Platform Self-Extracting Executable Generator to Rule Them All

Generating configurable, cross-platform self-extracting executables has been the purview of commercial software for, well, forever. We wanted folks of all operating systems to be able to download and run our software locally. We also didn’t want to break the bank. After much deliberation, we decided to build the solution ourselves, making cross-platform installer generation easy and accessible.

Introducing the Zephyr Installer Plugin Ecosystem

With Zephyr Build Plugins, you can generate any self-extracting executable targeting any supported platform from any supported platform, as well as executables for every supported platform on any supported platform from the same build. Even better? Generated executables do not require any JVM to be installed on the end user’s system to function.

You can create self-extracting executables with code-signing, as well as modifiable icons, metadata, and run permissions for Windows, Mac, and Linux. Additionally, Zephyr Build Plugins can generate ICO and ICNS icons from PNG, SVG, and other formats. We currently support Maven for all platforms, but plan to support Gradle as well. Bazel and Ant support will be available with a commercial license.

Use Cases

Create a self-extracting executable targeting Windows, Mac, or Linux

Executables can be built on your current operating system (Mac, Windows, or Linux) with no modifications or third-party requirements.

Sign generated executables

Executables can be signed for any supported platform with the same configuration. Sign Windows executables with Authenticode on Mac, Linux, or Windows, or sign Mac app packages with CodeSign on Mac, Linux, or Windows.

Generate ICO/ICNS files

Traditionally, having ICO or ICNS icon files on hand has been a prerequisite, requiring third-party commercial tools, online icon generators, etc. The Zephyr build ecosystem allows you to generate ICO/ICNS files from standard raster formats like PNG and SVG with a variety of sizes.

Attach ICO/ICNS files to your executable

Inserting branding icons into executables has been a platform-dependent chore, but Zephyr allows you to brand your generated executable in a platform-independent way.

Create installers for JVM-based programs

Previously, installers for JVM-based programs required the installer to download the JVM, or forced the end-user to install it. Zephyr allows you to launch IzPack installers using a JVM bundled with your application.

Automate everything!

Since these tools are included as build plugins for the most popular build systems, you can completely eliminate any manual steps in your installer generation process!

You can find the Maven plugin documentation at https://zephyr.sunshower.io/site/

Reflecting on 2019

Like most folks coming into this new year and new decade, we’ve been reflecting on the past. Sunshower.io is far less than a decade old (we turn two in March!), but even a year is a long time in the life of a small business. Despite our small size, it was a banner year.

First Six Months

The first six months of the year were largely occupied by releasing the Sunshower platform to the public. Our web application plays host to Stratosphere, allowing you to visualize your AWS EC2 infrastructure around the globe, and Anvil, allowing you to save 66% on your AWS EC2 bill with right-sizing.

We also spent a lot of the first half of the year doing the “startup circuit” — taking second in a startup pitch competition, doing lots of networking events with investors, and interviewing at Y Combinator.

Last Six Months

From there, we abruptly got pulled into the land of government contracting. A large defense contractor contacted us about doing a white paper together for the Air Force, which we submitted in July. We spent August waiting to hear if we would receive an RFP, and September writing the proposal. Our contacts at the Air Force think our offering is incredibly valuable, and last we heard, the proposal was in the technology evaluation phase. Interested in what we pitched them? Download our marketing white paper.

This fall, we’ve renewed our commitment to open-source software. Most notably, we refactored our plugin framework into Zephyr, so other organizations can reap the benefit of a non-OSGi system for lifecycle and dependency management. We also renewed our work on Aire, a UI framework built on Aurelia and UIKit.

Last but not least, we finished out the year with a bit of a rebrand. We have a new logo and doubled down on our bright colors. We also changed our tagline to reflect our movement away from optimization and towards application and infrastructure management and deployment.

We’re looking forward to what 2020 has in store!

Why We Wrote Zephyr

Since releasing Zephyr, we’ve been asked by numerous people why we wrote Zephyr instead of sticking to OSGi. Our goal was pretty simple: create an extensible system suitable for SaaS or on-prem.  We looked in our toolbox and knew that we could do this using OSGi, Java, and Spring, and so that’s how it started.

How We Started

First, we wrote our extensible distributed graph reduction machine: Gyre. This allowed us to describe computations as graphs. It generated a maximally parallel schedule, did its best to figure out whether to ship (a) a computation to data, (b) data to a computation, or (c) both to an underutilized node, and then executed the schedule.

Then we wrote Anvil, our general-purpose optimization engine that efficiently solved linear and non-linear optimization problems. These were described as Gyre graphs (including how the Gyre could better execute tasks based off of its internal metrics). We deployed Anvil and Gyre together as bundles into an OSGi runtime.  Obviously, Anvil couldn’t operate without Gyre, and so we referenced Gyre services in Anvil.  But Anvil and Gyre themselves were extensible.  We wrote additional solvers and dynamically installed them into Anvil, or wrote different concurrency/distribution/serialization strategies and deployed them into Gyre, and gradually added more and more references.

Then we wrote Troposphere, our deployment engine. Troposphere would execute its tasks on Gyre, and Anvil would optimize them. Troposphere would define types of tasks, and we exported them as requirements to be satisfied by capabilities. (For example, Troposphere would define a “discovery” task, and an AWS EC2 plugin would fulfill that capability.)
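
The requirement/capability relationship described above can be sketched roughly as follows. This is a minimal illustration with hypothetical names (CapabilityRegistry, provide, run). It is not Troposphere’s or Zephyr’s actual API, and the handler simply maps a string to a string to keep the example small.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical names throughout; an illustration only.
final class CapabilityRegistry {
    // task type (the requirement) -> handlers contributed by plugins (the capabilities)
    private final Map<String, List<Function<String, String>>> capabilities = new HashMap<>();

    /** Called by a plugin to offer a capability for a task type. */
    void provide(String taskType, Function<String, String> handler) {
        capabilities.computeIfAbsent(taskType, key -> new ArrayList<>()).add(handler);
    }

    /** True when at least one plugin satisfies the requirement. */
    boolean isSatisfied(String taskType) {
        return capabilities.containsKey(taskType);
    }

    /** Runs every capability registered for the task type and collects the results. */
    List<String> run(String taskType, String input) {
        List<String> results = new ArrayList<>();
        for (Function<String, String> handler : capabilities.getOrDefault(taskType, List.of())) {
            results.add(handler.apply(input));
        }
        return results;
    }
}
```

In this sketch the host only knows the task type ("discovery"), and an EC2-flavored plugin would call provide("discovery", ...) when it is installed.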

Handling OSGi with Spring

Being a small team, we pretty much only used one actual framework (Spring), so we deployed yet another bundle containing only the Spring classpath, to be depended on by any bundle that required it.  We initially used bnd to generate our package import/export statements in our manifest, and pulled in the bnd Gradle plugin as part of the build, but the reality was that if a plugin depended on Troposphere, then it pretty much always depended on Gyre, Anvil, and Spring.

If Anvil contains a service-reference to Gyre, and Troposphere contains one to Anvil, you get the correct start-order.  But if you stop Gyre while Troposphere is running?  Well, that’s a stale reference, and Troposphere needs to handle it, which means refactoring Troposphere and Gyre to use service factories, prototype service factories, or whatever else.

But we just wanted to write Spring and Java. To really use Spring in an OSGi-friendly way, you have to use Blueprints, and now you’re back to writing XML in addition to all of the OSGi-y things you’re doing in your code. The point isn’t that OSGi’s way doesn’t work; it does. These are solid technologies written by smart people. The point is that it introduces a lot of additional complexity, and you’re forced to really understand both Spring and OSGi to be productive, even though Spring is the only framework that’s actually providing value (in the form of features) to your users, because the extensibility component (OSGi) is a management concern.

What Zephyr gets us that OSGi didn’t

Testability

We’re big fans of unit tests, and we write a lot of them.  Ideally, if you’re sure components A and B both work, then the combination of A and B should work.  The reality is that sometimes they don’t for a huge variety of reasons. For example, for us, using any sort of concurrency mechanism outside of Gyre could severely bork Gyre, which could and did bamboozle dozens of plugins. We’re small enough that we could just set a pattern and decree that hey, that is the pattern, and catch violations in reviews or PMD rules. But once again, we just wanted to write integration tests and we wanted to use Spring Test to do it.

With OSGi, you can create projects whose test classpath matches the deployment classpath (although statically), and we did.  We also wrote harnesses and simulations that would set up OSGi and deploy plugins from Maven, etc., and it all worked. But it was still complex, and it wasn’t just Spring Test. This was, and continues to be, a big source of pain for us.  The fact of the matter is that, once again, Spring was providing the developer benefit and OSGi was introducing complexity.

Quick Startup/Shutdown Times

We use a lot of Spring’s features and perform DB migrations in a variety of plugins — not an unusual use case.  A plugin might only take a few seconds to start, but amortized over dozens of plugins, startup time became pretty noticeable.  There are some ways to configure parallel bundle lifecycle, but they’re pretty esoteric, sometimes implementation-dependent, and always require additional metadata or code. With Zephyr, we get parallel deployments out-of-the-box and as the default, reducing startup times from 30+ seconds to 5 or so.
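
To make the idea of parallel plugin startup concrete, here is a minimal sketch using plain CompletableFuture: each plugin starts as soon as all of its dependencies have started, and independent plugins run concurrently. The class and method names are hypothetical, and this is not how Zephyr itself is implemented. It assumes the dependency map is acyclic.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical helper; illustrates dependency-aware parallel startup only.
final class ParallelStarter {
    /**
     * Starts every plugin in the map, running a plugin only after all of its
     * dependencies have finished starting. Blocks until everything has started.
     */
    static void startAll(Map<String, List<String>> dependencies, Consumer<String> start) {
        Map<String, CompletableFuture<Void>> started = new HashMap<>();
        for (String plugin : dependencies.keySet()) {
            schedule(plugin, dependencies, start, started);
        }
        CompletableFuture.allOf(started.values().toArray(new CompletableFuture[0])).join();
    }

    private static CompletableFuture<Void> schedule(
            String plugin,
            Map<String, List<String>> dependencies,
            Consumer<String> start,
            Map<String, CompletableFuture<Void>> started) {
        CompletableFuture<Void> existing = started.get(plugin);
        if (existing != null) {
            return existing;
        }
        // Schedule dependencies first so this plugin can wait on their futures.
        CompletableFuture<?>[] prerequisites = dependencies.getOrDefault(plugin, List.of()).stream()
                .map(dependency -> schedule(dependency, dependencies, start, started))
                .toArray(CompletableFuture[]::new);
        CompletableFuture<Void> future =
                CompletableFuture.allOf(prerequisites).thenRunAsync(() -> start.accept(plugin));
        started.put(plugin, future);
        return future;
    }
}
```

With this shape, total startup time tends toward the longest dependency chain rather than the sum of every plugin’s startup time.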

Remote Plugins

One of our requirements is the ability to run plugins whose processes and lifecycles reside outside of Zephyr’s JVM. OSGi (understandably) wasn’t designed to support this, but Zephyr was.

Getting it right with Zephyr

We spent about two years wrangling OSGi and Spring, by turns coping with these and other problems either in code or operations. It was generally successful, but there was always an understanding that we were paying a high price in terms of time and complexity. After the first dozen or so plugins, we’d really come to understand what we wanted from a plugin framework.

To boot, we are pretty good at graph processing, and it had been clear to us for a while that the plugin management issues we were continually encountering were graph problems. Classpath dependency issues could be easily understood through the transitive closure of a plugin, and most of our plugins had the same transitive closure. Even if they didn’t, that was the disjoint-subgraph problem and we could easily cope with that. Correct parallel start schedules were easily found and correctly executed by Coffman-Graham scheduling, and we could tweak all of these subgraphs through subgraph-induction under a property.  Transitive reduction allowed us to easily and transparently avoid problems caused by non-idempotent plugin management operations.
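
Two of the graph ideas above can be sketched in a few lines. The example below computes a plugin’s transitive closure and groups plugins into start levels, where everything in one level can start in parallel. Note that the layering here is a simple longest-path layering, which is a simplification of Coffman-Graham scheduling (it places no bound on how many plugins share a level), and transitive reduction is not shown. All names are hypothetical and the graph is assumed to be acyclic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper; not Zephyr's implementation.
final class PluginGraph {
    /** All plugins reachable from the given plugin through dependency edges. */
    static Set<String> closure(String plugin, Map<String, List<String>> dependencies) {
        Set<String> seen = new TreeSet<>();
        collect(plugin, dependencies, seen);
        return seen;
    }

    private static void collect(String plugin, Map<String, List<String>> dependencies, Set<String> seen) {
        for (String dependency : dependencies.getOrDefault(plugin, List.of())) {
            if (seen.add(dependency)) {
                collect(dependency, dependencies, seen);
            }
        }
    }

    /**
     * Groups plugins into levels: level 0 has no dependencies, and each plugin sits
     * one level above its deepest dependency.
     */
    static List<Set<String>> levels(Map<String, List<String>> dependencies) {
        Map<String, Integer> depth = new HashMap<>();
        for (String plugin : dependencies.keySet()) {
            depthOf(plugin, dependencies, depth);
        }
        TreeMap<Integer, Set<String>> byDepth = new TreeMap<>();
        depth.forEach((plugin, d) -> byDepth.computeIfAbsent(d, key -> new TreeSet<>()).add(plugin));
        return new ArrayList<>(byDepth.values());
    }

    private static int depthOf(String plugin, Map<String, List<String>> dependencies, Map<String, Integer> depth) {
        Integer known = depth.get(plugin);
        if (known != null) {
            return known;
        }
        int d = 0;
        for (String dependency : dependencies.getOrDefault(plugin, List.of())) {
            d = Math.max(d, depthOf(dependency, dependencies, depth) + 1);
        }
        depth.put(plugin, d);
        return d;
    }
}
```

Two plugins with the same closure need the same classpath, which is one way to read the observation that most plugins shared a transitive closure.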

Once we’d implemented those, we discovered that a lot of the problems we struggled with just went away. Required services could never become stale, and optional services just came and went.  A lot of the OSGi-Spring integration code we’d written became dramatically simpler, and we could provide simple but powerful Spring Test extensions that felt very natural.

What’s Next

But we’re not stopping with Spring: Zephyr can support any platform and any JVM language, and we’re planning on creating support for Clojure, Kotlin, and Scala initially as installable runtimes. We’re investigating NodeJS support via Graal and should have some announcements about that in the new year. Spring is already supported, and we hope to add Quarkus and Dropwizard soon. And keep in mind that these integrations should require little or no knowledge of Zephyr at all.

We’re also in the process of open-sourcing a beautiful management UI, a powerful repository, and a host of other goodies — stay tuned!

Introducing Zephyr: A Java Plugin System for the 21st Century

At Sunshower.io, we write software for people who write software. We’re pleased to announce something new to help folks scale their software: Zephyr, a next-generation plugin framework written in Java. Zephyr is an OSGi alternative — inspired by the best parts of it while dramatically reducing complexity and improving interoperability with existing frameworks and ecosystems.

Zephyr was born from our frustration with existing module systems. We started off using WildFly and embedding OSGi, but this proved inadequate for the complex dependency graphs we encountered while developing the Sunshower platform. In particular, continually copy/pasting around manifests to import the dozens of packages from various frameworks was tedious and error-prone (and auto-generating them wasn’t much better, in fact). It greatly increased the complexity of our builds and deployments as we’d continually need to rev released versions of modules. This is to say nothing of the complexities of testing module interactions, or the joys of a ClassNotFoundException appearing suddenly after weeks of smooth operation, caused by a forgotten Import-Package declaration.

After over 18 months of working around framework limitations, we looked at the “Kernel” that arose from coping with these problems and decided “Hey, this is pretty useful. Let’s get rid of underlying systems and just use that.” And now we’re open-sourcing it.

Small but mighty, Zephyr aggressively and automatically parallelizes management operations while running in less than 512KB of memory. It intelligently manages all aspects of plugin lifecycle, including dependency resolution. Deploying new plugins is quick and painless. And, of course, setting up plugin dependencies for tests is, well, a breeze.

While we wrote it in Java, Zephyr works with whatever languages you normally use by installing language runtimes as plugins. You can have multiple frameworks running side by side, eliminating a lot of overhead associated with rewrites, scaling and transitioning architectures.

Zephyr is available on GitHub under an MIT license. Enterprise support contracts are available. Go check out the website, the docs, or the repository. We’d love to have you involved!

Startup Culture is an American Revolution

Happy Fourth of July!

In case you were wondering, this isn’t just another Independence Day blog post talking about the Sunshower platform and how it will bring you freedom, blah blah blah. Rather, this is a blog post emphasizing that the ideals that led to the American Revolution, both within Great Britain and the colonies themselves, are alive and well in the American startup culture.

What’s a Cloud Management Platform? (Part 2: Cloud Optimization)

Beautifully simple cloud optimization.

What’s a Cloud Management Platform? (Part 2: Cloud Optimization Edition)

Two weeks ago, we talked about some of the ways that a Cloud Management Platform (CMP) helps users relieve the headaches associated with DIY cloud resource management. This week, we’ll look at a few more compelling reasons to use a Cloud Management Platform like Sunshower.io for your cloud optimization and cloud resource management.

Sunshower.io Amazon Launch is Featured in BizWest

BizWest: Sunshower.io Now Tracks Cloud Usage on Amazon

By BizWest Staff — June 19, 2019

FORT COLLINS — Sunshower.io, a Fort Collins startup that helps companies optimize cloud computing, has just launched its Amazon Web Services EC2 cloud management platform. Soon, it will bring similar cloud management tools out of beta in order to service customers using Microsoft Azure and the Google Cloud Platform.

Read more.

Official Platform Launch: Sunshower.io for AWS EC2

We’ve been testing software.

We’ve been onboarding customers.

We’ve been providing unparalleled cloud visibility.

We’ve been rightsizing and optimizing.

We’ve been cutting cloud bills by an average of 66%.

And we’ve been giving it all away for free.

That ends today, with the much anticipated OFFICIAL LAUNCH of Sunshower.io for AWS EC2!

What’s a Cloud Management Platform? (Part 1)

Beautifully simple cloud management.

What’s a Cloud Management Platform? (And Why Do You Need One?) Part 1 of 2

Our official tagline at Sunshower.io is “beautifully simple cloud management and optimization.” But why do you need a Cloud Management Platform like Sunshower.io? When you work with a Cloud Service Provider (CSP) like AWS or Azure, doesn’t the CSP do the cloud optimization for you? Isn’t it the CSP’s job to make sure what you’re running in the cloud is rightsized, your applications are easy to view and manage, and that you’re getting the best possible value for your money? That’s what you’re paying them for, right?

In a word? Nope. That’s all on you, bub.