JUnit 5 and Spring 5

I’ve been extending JUnit to provide project-specific testing infrastructure for quite some time, and have always been disappointed with the extension mechanisms prior to JUnit 5.

To begin with, JUnit 4’s primary extension mechanism was the test runner itself. If you wanted to mix in multiple extensions, you were in for a hassle. For one test-infrastructure project, I had the runner dynamically subclass the test case and add behaviors based on the infrastructure-specific annotations that were present. This worked really well, except that most IDEs I tried would not run the tests correctly at the method level, so you’d be left re-running the entire class.

On another project, I jumped through all manner of hoops to avoid this. Spring-test alleviated many of the problems with its TestExecutionListeners, but not all of them, and it was less pleasant to use than subclassing. The result worked quite well, but it was excruciating to write. Now that we’re rewriting everything from scratch, I’m implementing our testing infrastructure on JUnit 5 and Spring 5, and JUnit 5 makes this awesome!

How’s it work?

JUnit 5 provides a new extension mechanism called, well, an Extension. Extension is a marker interface with specializations for each portion of the test lifecycle. For instance, implementing the BeforeAllCallback interface equips you with a beforeAll method, which is called with the container’s ExtensionContext, providing, among other things, the test class.
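
For instance, a trivial extension that just logs the test class before any tests run might look something like this (a quick sketch for illustration, not part of our infrastructure):

import org.junit.jupiter.api.extension.BeforeAllCallback;
import org.junit.jupiter.api.extension.ExtensionContext;

public class LoggingExtension implements BeforeAllCallback {

    @Override
    public void beforeAll(ExtensionContext context) {
        // the container's ExtensionContext exposes the test class, among other things
        context.getTestClass().ifPresent(type ->
                System.out.println("Running tests in " + type.getName()));
    }
}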

For testing purposes, we need to inject and override properties pretty routinely. Spring 5’s YAML support isn’t available through @TestPropertySource, so I thought I’d whip up my own extension (full source).
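
To give you a feel for it, here’s a condensed sketch of how such an extension can be built. This is illustrative, not the actual implementation (see the full source): it handles only the class-level annotation, while the real extension also supports the method-level overrides shown below.

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

import org.junit.jupiter.api.extension.BeforeAllCallback;
import org.junit.jupiter.api.extension.ExtensionContext;
import org.springframework.beans.factory.config.YamlPropertiesFactoryBean;
import org.springframework.context.ApplicationContext;
import org.springframework.core.env.ConfigurableEnvironment;
import org.springframework.core.env.PropertiesPropertySource;
import org.springframework.test.context.junit.jupiter.SpringExtension;

// the annotation carrying the YAML resource locations
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface Properties {
    String[] value();
}

public class PropertiesExtension implements BeforeAllCallback {

    @Override
    public void beforeAll(ExtensionContext context) {
        Properties properties = context.getRequiredTestClass()
                .getAnnotation(Properties.class);
        if (properties == null) {
            return;
        }
        // SpringExtension is registered first, so the context already exists here
        ApplicationContext applicationContext =
                SpringExtension.getApplicationContext(context);
        ConfigurableEnvironment environment =
                (ConfigurableEnvironment) applicationContext.getEnvironment();
        for (String location : properties.value()) {
            YamlPropertiesFactoryBean yaml = new YamlPropertiesFactoryBean();
            yaml.setResources(applicationContext.getResource(location));
            // prepend so that test-supplied values win over existing sources
            environment.getPropertySources()
                    .addFirst(new PropertiesPropertySource(location, yaml.getObject()));
        }
    }
}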

Then, to use it (given a suitable resource structure):

Properties Extension Usage

import javax.inject.Inject;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.junit.platform.runner.JUnitPlatform;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.env.Environment;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit.jupiter.SpringExtension;

import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.is;

// plus the project-local imports for @Properties, PropertiesExtension,
// and PropertiesExtensionConfiguration

@RunWith(JUnitPlatform.class)
@ExtendWith({
        SpringExtension.class,
        PropertiesExtension.class
})
@Properties({
        "classpath:classes/frap.yml"
})
@ContextConfiguration(classes = PropertiesExtensionConfiguration.class)
public class PropertiesExtensionTest {

    @Inject
    private Environment environment;

    @Value("${classes.properties.frap}")
    String frap;


    @Test
    public void ensureInjectingPropertiesWorks() {
        assertThat(frap, is("lolwat"));
    }

    @Test
    public void ensureLoadingYamlAtClassLevelWorks() {
        String prop = environment.getProperty("classes.properties.frap");
        assertThat(prop, is("lolwat"));
    }

    @Test
    @Properties("classpath:methods/frap.yml")
    public void ensureOverridingOnMethodWorksWithInjection() {
        String prop = environment.getProperty("classes.properties.frap");
        assertThat(prop, is("overridden"));
    }

    @Test
    @Properties("classpath:methods/frap.yml")
    public void ensureOverridingOnMethodWorks() {
        assertThat(frap, is("overridden"));
    }

}
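
The PropertiesExtensionConfiguration referenced above doesn’t need much. One detail worth calling out (this sketch is my assumption, not necessarily the project’s actual configuration class): resolving @Value placeholders requires a static PropertySourcesPlaceholderConfigurer bean.

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.support.PropertySourcesPlaceholderConfigurer;

@Configuration
public class PropertiesExtensionConfiguration {

    // static so it is registered early, before ordinary beans are created;
    // this is what lets @Value("${...}") placeholders resolve against the Environment
    @Bean
    static PropertySourcesPlaceholderConfigurer placeholderConfigurer() {
        return new PropertySourcesPlaceholderConfigurer();
    }
}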

Build you a Distributed App: Part 1

‘Morning folks,

Today’s blog will be a bit out-of-order (there’s a subsequent blog that I haven’t finished that structurally precedes this one), but I hope it won’t throw too much wire into the hay-baler.

Persistence IDs

The problem we are trying to solve is how to identify objects within a system. Most of the time, we think of object identity as having two components:

  1. Object type and
  2. Some unique name or tag for a given object instance

Selecting the correct persistence ID scheme for your application can be quite a task, and people generally don’t give it much thought at the outset. Later on, when the initial ID scheme isn’t ideal for the application, developers have to perform an ID type migration, which for live systems is frequently one of the nastier modifications to make.

About 80% of the applications I’ve seen and worked with just use sequential, database-generated integral IDs, and that mostly works for them. There are well-documented problems with them, though, so we’ll avoid them altogether (except under very specific circumstances, which will be discussed case by case). They do have one desirable property that we’ll discuss below, viz. that they’re sorted.

UUIDs

Now that database-generated sequential IDs are out, what about UUIDs? It’s a good question, and I’ve used UUIDs quite successfully, and seen them used, in quite a few applications. But we’re not using them in Sunshower for the following reasons:

  1. They’re pretty ugly. I mean, URLs like sunshower.io/orchestrations/123e4567-e89b-12d3-a456-426655440000/deployments/123e4567-e89b-12d3-a456-426655440000 aren’t great. We previously base-58 encoded our UUIDs to produce prettier URLs along the lines of sunshower.io/orchestrations/11W7CuKyzdu7FGXEVQvK/deployments/11W7CuKz27Y9ePpV2ju9. One of the problems we encountered was that having different string representations of IDs inside and outside our application made debugging less straightforward than it needed to be.
  2. They’re pretty inconsistent across different databases and workloads.
    — For write-intensive workloads, UUIDs as primary keys are a poor choice if you don’t have an auxiliary clustering index (which requires that you maintain 2 indexes per table, at least). Insertions into database pages will happen at random locations, and you’ll incur jillions of unnecessary page-splits in high-volume scenarios. On the other hand, adding the additional clustering index will incur additional overhead to writes.
    — Index scans that don’t include the clustering index can perform poorly because the data are spread out all over the disk.

So, is there a better way?

How about Flake?

Twitter encountered similar issues with ID selection back in the day, so they designed the Snowflake ID scheme. There is a 128-bit extension, Flake, that minimizes the need for node coordination, which is desirable in our case (especially since we were already willing to tolerate 128-bit IDs when we considered UUIDs). The layout of the ID is as follows:

  1. The first 64 bits are a timestamp (monotonically increasing on a single node whose system clock is never adjusted).
  2. 48 bits identifying the node (usually the MAC address of a network interface, though other schemes could be used; see the sketch after this list).
  3. A 16-bit monotonically-increasing sequence that is reset every time the system clock ticks forward. This is important because it places an upper limit on the number of IDs that can safely be generated in a given time period (65,535/second). My implementation provides back-pressure, but this can cause undesirable behavior (contention) in very high-volume scenarios. To put this in perspective: Twitter accommodates an average of 6,000 Tweets/second, yet even that would consume only about 10% of a single node’s ID-generation bandwidth.
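
For illustration, here’s one way the 48-bit node component could be derived (a hedged sketch; the names are mine, and the full source may do this differently):

import java.net.NetworkInterface;
import java.net.SocketException;
import java.security.SecureRandom;
import java.util.Enumeration;

// hypothetical helper: use the first available MAC address as the 48-bit
// node component, falling back to random bytes when none is present
static byte[] nodeSeed() {
    try {
        Enumeration<NetworkInterface> interfaces =
                NetworkInterface.getNetworkInterfaces();
        while (interfaces.hasMoreElements()) {
            byte[] mac = interfaces.nextElement().getHardwareAddress();
            if (mac != null && mac.length == 6) {
                return mac;
            }
        }
    } catch (SocketException e) {
        // fall through to the random seed
    }
    byte[] seed = new byte[6];
    new SecureRandom().nextBytes(seed);
    return seed;
}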

Our implementation

full source

I’m sorry. I’m pretty old-school. I like Spring and JPA (and even EJB!). Things like JPQL and transparent mapping between objects and their storage representations (e.g. tables) are important to me. I also super-like transactions, and I really, really like declarative transaction management. Why? Because not having these things places a very high burden on development teams and, in my experience, reduces testability, frequently dramatically. Another requirement is that we be able to easily serialize IDs to a variety of formats, so we’ll make our ID JAXB-enabled. Here are the important parts of the Identifier class:

//not using @Embeddable because we will create a custom Hibernate type for this--that way we can use the same annotations for everything
@XmlRootElement(name = "id")
@XmlAccessorType(XmlAccessType.NONE)
public class Identifier implements
        Comparable<Identifier>,
        Serializable {

    static final transient Encoding base58 = Base58.getInstance(
            Default
    );


    @XmlAttribute
    @XmlJavaTypeAdapter(Base58ByteArrayConverter.class)
    private byte[] value;

    protected Identifier() {

    }

    Identifier(byte[] value) {
        if(value == null || value.length != 16) {
            throw new IllegalArgumentException(
                    \"Argument cannot possibly be a valid identifier\"
            );
        }
        this.value = value;
    }

// other stuff
}

Now, ideally, we would be able to make value final. If an ID is created from thread A and somehow immediately accessed from thread B, final would guarantee that thread A and thread B would always agree on the value of value. Since neither JAXB nor JPA really works with final fields, we can’t do that. We could partially fix value‘s publication by marking it volatile, but there are downsides to that as well. The solution I’m opting for is to protect the creation of Identifiers by forcing it to occur within a sequence (note the protected and package-private constructors of Identifier):

public interface Sequence<ID extends Serializable> {
    ID next();
}

with a Flake ID sequence (full source):

    @Override
    public Identifier next() {
        synchronized (sequenceLock) {
            increment();
            ByteBuffer sequenceBytes =
                    ByteBuffer.allocate(ID_SIZE);
            return new Identifier(
                    sequenceBytes
                            .putLong(currentTime)
                            .put(seed)
                            .putShort((short) sequence).array()
            );
        }
    }
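
The increment() call is where the back-pressure lives. Sketched out, assuming a millisecond clock tick and the field names from the snippet above (this is illustrative, not the verbatim implementation):

    // advance the sequence, spinning until the next tick when the 16-bit
    // sequence space for the current tick has been exhausted
    private void increment() {
        long now = System.currentTimeMillis();
        if (now > currentTime) {
            currentTime = now;
            sequence = 0;
        } else if (++sequence > 0xFFFF) {
            // back-pressure: block ID generation until the clock moves forward
            while ((now = System.currentTimeMillis()) <= currentTime) {
                Thread.yield();
            }
            currentTime = now;
            sequence = 0;
        }
    }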

Now, we’re guaranteed that sequences can be shared across threads, and we have several options:

  1. Each entity-type could be assigned its own sequence
  2. Entities can share sequences

We don’t really care too much about ID collisions across tables, and we can generate a ton of IDs quickly for a given sequence, so we’ll just default to sharing a sequence for entities:

@MappedSuperclass
@XmlDiscriminatorNode(\"@type\")
public class AbstractEntity extends
        SequenceIdentityAssignedEntity<Identifier> {

    static final transient Sequence<Identifier> DEFAULT_SEQUENCE;

    static {
        DEFAULT_SEQUENCE = Identifiers.newSequence(true);
    }

    @Id
    @XmlID
    @XmlJavaTypeAdapter(IdentifierAdapter.class)
    private Identifier id;


    protected AbstractEntity() {
        super(DEFAULT_SEQUENCE);
    }


    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof AbstractEntity)) return false;
        if (!super.equals(o)) return false;

        AbstractEntity that = (AbstractEntity) o;

        return id != null ? id.equals(that.id) : that.id == null;
    }

    @Override
    public int hashCode() {
        int result = super.hashCode();
        result = 31 * result + (id != null ? id.hashCode() : 0);
        return result;
    }


    @Override
    public String toString() {
        return String.format(\"%s{\" +
                \"id=\" + id +
                '}', getClass());
    }
}

MOXy only allows you to use String @XmlID values, so we need to transform our IDs to strings (hence the @XmlJavaTypeAdapter).
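
For completeness, here’s roughly what such an adapter looks like (a hypothetical sketch; Identifier.decode is an assumed factory method, not necessarily the project’s API):

import javax.xml.bind.annotation.adapters.XmlAdapter;

public class IdentifierAdapter extends XmlAdapter<String, Identifier> {

    @Override
    public String marshal(Identifier id) {
        // render the identifier in its base-58 form
        return id == null ? null : id.toString();
    }

    @Override
    public Identifier unmarshal(String value) {
        // Identifier.decode is a hypothetical factory that parses the
        // base-58 representation back into an Identifier
        return value == null ? null : Identifier.decode(value);
    }
}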

In the next blog post, we’ll demonstrate how to make Identifiers JPA native types!

Announcing Sunshower.io

Hello and welcome to sunshower.io’s official blog! Josiah here, excited to tell you about who we are and where we’re going. Over the past 16 months, I’ve been working as the CTO of a startup. Unfortunately, my co-founders and I were not able to align our visions for the product and the project, so we decided to part ways. I had hoped that we would be able to reach an agreement to open-source the project, but I doubt that will ever fully materialize. In any case, I think it’s best to start a new project with a clean slate. This will ensure that the project is and always remains free of any legal or conceptual constraints, and I believe that with a more cohesive vision, we will be able to deliver a better product. With that in mind, let’s talk about what it is!

The goals of sunshower.io are to provide a robust, streamlined platform for Provisioning-as-a-Service (PaaS, not to be confused with Platform-as-a-Service) and Configuration-as-a-Service, so let’s talk about what we mean.

Configuration-as-a-Service

I had initially envisioned our project as peer-to-peer, secure Git. The goal was to be able to commit arbitrary files (including very large binaries, helpful for installers), indicate that some files should be treated as templates (i.e. filtered through various data sources available to the system, such as our key-value store, system properties, etc.), and then “push” the repository out to collections of systems. Failures could be handled according to several strategies (up to and including reverting the entire push across the cluster), and you could seamlessly revert to a previous commit. Sunshower.io won’t attempt to handle as many enterprise cases initially; where we do, we’ll delegate down to the capabilities of the orchestration provider (e.g. Docker Swarm or Kubernetes). One of the primary goals of sunshower is to abstract the process of deployment away from the technologies used for the deployment, so we’ll have to think carefully about how we go about it: not every orchestration provider is going to tackle the same problems the same ways.

Provisioning-as-a-Service

The other domain we thought a lot about tackling was provisioning. Provisioning infrastructure elements such as virtual machines and security groups is conceptually the same across all the cloud providers, so we would like to be able to take the same orchestration and deploy it to different clouds without any changes. Of course, things like our default sizing tiers won’t be adequate for many users’ needs, but there are sufficient similarities between most CSPs’ instance types that translating between them is frequently possible. In the cases where it’s not, overriding at deployment time is always a possibility.

Thanks again for taking some time to learn more about us! Expect frequent updates here, and we’ll get this out the door ASAP. You can always check out our progress on GitHub.