Building a Home Cloud with Proxmox: DNS + Terraform

Overview

In our last post, we configured a Ceph storage cluster, which we’ll be using as the storage for our virtual machines that we’ll be using to host Kubernetes.

Before we get to that, however, we need to configure our DNS environment. Recall that, in part 2, we configured a Windows server to act as our DNS and DHCP server. For this, you don’t need a Windows Server installation, you just need an RFC 2136-compliant DNS server like PowerDNS. Keep in mind that RFC 2136 has some security vulnerabilities, so don’t use dynamic DNS in a high-security environment or an environment wherein untrusted devices may be on the network.

To provision our DNS names, we’ll be interacting with our DNS server via the Terraform DNS provider. Terraform’s easy to set up, so we won’t be covering that.

Note that this solution requires a fair amount of work in Bash. Other solutions like Chef/Puppet/Ansible/Salt may strictly be better-suited for this sort of work, but since our environment is quite homogeneous (99% Debian), we’ll just script this using basic tools.

A note on Terraform Typing

Terraform’s variable/parameter management system, specifically its inability to share variables between modules can result in some duplication between modules. This isn’t the cleanest, but I don’t feel like it’s a big enough problem to warrant bringing in other tools like Terragrunt. Terraform’s type-system supports a flavor of structural typing wherein you can declare a variable with a subset of another variable’s fields and supply the first variable as the second variable. For instance:

variable "A" {
type = object({
name = string
ip = string
})
}
variable "B" {
type = object({
name = string
})
}
view raw variables.tf hosted with ❤ by GitHub

Usage:

   // module B declares B, and can use a value of type A as the value for b:B
    b = var.a

We’ll be using this feature to share variable values across modules.

Setup

Since we’re reserving the upper range of our /24 subnet for MetalLB dynamically-provisioned IP addresses, we’ll configure the worker nodes with static IP addresses assigned from the lower range of our subnet. Since we have a small number of physical machines, we’ll opt for fewer, larger worker-nodes instead of more, smaller worker-nodes. The subnet-ranges we’ll select for these roles are as follows:

  1. 2-10: infrastructure (domain-controller, physical node IPs, etcd VMs, Kubernetes leader nodes)
  2. 10-30: static addresses (worker nodes)
  3. 30-200: DHCP leases for dynamic devices on the network + MetalLB IPs
  4. 200-255: available

To set that up in Windows Server:

  1. Navigate to the Server Manager
  2. From the top-right Tools menu, select DHCP
  3. Right-click on the IPv4 node under your domain-controller host-name and select New Scope
  4. Provide a Start address of <subnet prefix>.30 and an end-address of 200
  5. Continue through the wizard providing your gateway addresses and your exclusions. For the exclusions, select <subnet prefix>-1 through 30.
  6. Finish the wizard and click Create to create your scope.
  7. Right-click on the created scope node in the tree. Select the DNS tab, then enable DNS dynamic updates. Select the Dynamically update DNS records only if requested by the DHCP clients.
  8. Click Ok to save your changes.

Configure your DD-WRT Router

Navigate back to your gateway router. Most capable routers will have an option to either act as a DHCP server (the default), or to act as a DHCP forwarder. For DD-WRT, select the `DHCP Forwarder option and provide the IP address of your DHCP server as the target.

Configure your DNS Server

Go back to your domain controller.

  1. From the Server Manager, select DNS
  2. Locate the Forward Lookup Zone node in the DNS Manager tree that you created (ours was sunshower.io. Right-click on the Forward Lookup Zone node and click Properties
  3. In the General Tab, select the Nonsecure and Secure option from the Dynamic updates configuration. Note: This is not suitable for an open-network or one admitting untrusted/unknown devices
  4. Click Ok to apply.

Congrats! We’re now ready to provision our DNS in Terraform.

Terraform Configuration

Note: We’ve provided our Terraform configurations here, so feel free to point them at your infrastructure if you’ve been following along instead of creating your own.

In the directory of your Terraform project ($TFDIR), create a directory called dns.
1. In your dns directory, create 3 files: main.tf, variables.tf, and outputs.tf

Main.tf

Your main.tf file should contain:

terraform {
required_providers {
dns = {
source = "hashicorp/dns"
}
}
}
/**
provision DNS entries for each host in the `hosts` list
*/
resource "dns_a_record_set" "virtual_machine_dns" {
for_each = {for vm in var.hosts: vm.name => vm}
zone = var.dns_server.zone
name = each.value.name
addresses = [
each.value.ip
]
}
/**
provision the DNS A-record for the API server
*/
resource "dns_a_record_set" "api_server_dns" {
addresses = [
var.api_ip
]
zone = var.dns_server.zone
name = var.api_dns
}
view raw main.tf hosted with ❤ by GitHub

Now, in your Terraform project $TFDIR (the parent of the dns directory you’re currently in, create main.tf, outputs.tf, and variables.tf. Symlink the dns/variables.tf to this directory as dns-variables.tf via ln -s $(pwd)/vm-variables.tf $(pwd)/dns/variables.tf to make your DNS variables available to your root configuration.

Parent module

In your $TFDIR/main.tf, add the following block:

terraform {
required_version = ">=0.14"
required_providers {
dns = {
source = "hashicorp/dns"
version = "3.0.1"
}
}
}
view raw main.tf hosted with ❤ by GitHub

Configure the DNS provider to point to your RFC-2136-compliant DNS server:

provider "dns" {
update {
server = var.dns_server.server
}
}
view raw dns-config.tf hosted with ❤ by GitHub

Call the dns submodule:

/**
Create DNS entries for virtual machines
*/
module "dns_configuration" {
for_each = var.cluster_nodes
source = "./dns"
dns_server = var.dns_server
hosts = each.value
api_dns = var.api_server
api_domain = var.domain
api_ip = var.load_balancer
}
view raw main.tf hosted with ❤ by GitHub

At this point, we should be ready to actually provide our Terraform variables.

  1. Create a file in your $TFDIR directory called configuration.auto.tfvars–Terraform will automatically pick up any in the root directory according to the naming convention *.auto.tfvars. Be sure to never check these into source-control. I add the .gitignore rule **/*.tfvars

In your $TFDIR/configuration.auto.tfvars file, add the following configurations (substitute with your values):

dns_server = {
// replace with your zone-name (configured above)
zone = "sunshower.cloud." // note the trailing period here--it's mandatory
// replace with your DNS server's IP
server = "192.168.1.2"
}
// this generates the DNS configuration for `kubernetes.sunshower.cloud`
api_dns = "kubernetes"
api_domain = "sunshower.cloud"
api_ip = "192.168.1.10" // provide the API you

Finally, we’ll add our Kubernetes cluster DNS configurations:

cluster_nodes = {
etcd_nodes = [
{
name = "etcd-1"
ip = "192.168.1.5"
},
{
name = "etcd-2"
ip = "192.168.1.6"
},
{
name = "etcd-3"
ip = "192.168.1.7"
}
],
k8s_leaders = [
{
name = "k8s-leader-1"
ip = "192.168.1.8"
},
{
name = "k8s-leader-2"
ip = "192.168.1.9"
}
],
k8s_workers = [
{
name = "k8s-worker-1"
ip = "192.168.1.11"
},
{
name = "k8s-worker-2"
ip = "192.168.1.12"
},
{
name = "k8s-worker-3"
ip = "192.168.1.13"
}
]
}

At this point, running terraform apply --auto-approve should quickly generate the DNS entries. You should navigate to your DNS server and refresh the forward lookup zone to confirm that your entries exist.

Conclusion

In this part, we configured DHCP, DNS, and provided a Terrform module that provisions DNS entries for each of the VMs that we’ll provision in subsequent posts. Next time, we’ll create a Terraform module that will provision virtual machines of the correct roles for our Kubernetes cluster. Our final post of the series will show how to actually deploy the Kubernetes cluster–stay tuned!

Building a Home Cloud with Proxmox Part 3: Configuring Ceph and Cloud-Init

Overview

Last time, we configured a Windows Server domain controller to handle DNS, DHCP, and ActiveDirectory (our LDAP implementation). In this post, we’ll be configuring the Ceph distributed filesystem which is supported out-of-the-box by Proxmox.

Ceph Storage Types

Ceph supports several storage types, but the types we’re interested in for our purposes are

  1. Filesystem
  2. Block

A Ceph filesystem behaves like a standard filesystem–it can contain files and directories. The advantage to a Ceph filesystem for our purposes is that we can access any files stored within it from any location in the cluster, whereas with local-storage (e.g. local-lvm or whatever), we can only access them on the node they reside upon.

Similarly, a Ceph block device is suitable for container and virtual machine hard-drives, and these drives can be accessed from anywhere in the cluster. This node-agnosticism is important for several reasons:

  1. We’re able to create base images for all our virtual machines and clone them to any node, from any node
  2. We’re able to use Ceph as a persistent volume provider in Kubernetes

In this scheme, your Ceph IO performance is likely to be bottlenecked by your intra-node network performance. Our 10Gbps network connections between nodes seems to perform reasonably well, but is noticeably slower than local storage (we may characterize this at some point, but this series isn’t it).

Drive Configuration

In our setup, we have 4, 4 TB drives on each node. Ceph seems to prefer identical hardware configurations on each node it’s deployed on, so if your nodes differ here, you may want to restrict your Ceph deployment to identical nodes.

Create your Ceph partitions on each node

Careful Ensure that the drives you’re installing Ceph onto aren’t being used by anything–this will erase everything on them. You can colocate Ceph partitions with other partitions on a drive, but that’s not the configuration we’re using here (and how well it performs depends on quite a few factors)

We’re going to use /dev/sdb for our Ceph filesystem on each node. To prepare each drive, open a console into its host and run the following:

  1. fdisk /dev/sdb
  2. d to delete all the partitions. If you have more than one, repeat this process until there are no partitions left.
  3. n to create a new partition
    1 g to create a new GPT partition table
  4. w to write the changes

Your disks are now ready for Ceph!

Install Ceph

On each node, navigate to the left-hand configuration panel, then click on the Ceph node. Initially, you’ll see a message indicating that Ceph is not installed. Select the advanced option and click Install to continue through the wizard.

When you’re presented with the options, ensure that the osd_pool_default_size and osd_pool_default_min_size configurations are set to the number of nodes that you intend to install Ceph on. While the min_size can be less than the pool_size, I have not had good luck with this configuration in my tests.

Once you’ve installed Ceph on each node, navigate to the Monitor node under the Ceph configuration node and create at least 1 monitor and at least 1 manager, depending on your resiliency requirements.

Create an object-storage daemon (OSD)

Our next step is to allocate Ceph the drives we previously provisioned (/dev/sdb). Navigate to the OSD node under the Ceph node, and click create OSD. In our configuration, we use the same disk for both the storage area and the Ceph database (DB) disk. Repeat this process for each node you want to participate in this storage cluster.

Create the Ceph pool

We’re almost there! Navigate to the Pools node under the Ceph node and click create. You’ll be presented with some options, but the options important to us are:

  1. size: set to the number of OSDs you created earlier
  2. min-size: set to the number of OSDs you created earlier

Ceph recommends about 100 placement groups per OSD, but the number must be a power of 2, and the minimum is 8. More placement groups means better reliability in the presence of failures, but it also means replicating more data which may slow things down. Since we’re not managing dozens or hundreds of drives, I opted for fewer placement groups (16).

Click Create–you should now have a Ceph storage pool!

Create your Ceph Block Storage (RBD)

You should now be able to navigate up to the cluster level and click on the storage configuration node.

  1. Click Add and select RBD.
  2. Give it a memorable ID that’s also volume-friendly (lower case, no spaces, only alphanumeric + dashes). We chose ceph-block-storage
  3. Select the cephfs_data pool (or whatever you called it in the previous step)
  4. Select the monitor node you want to use
  5. Ensure that the Nodes value is All (no restrictions) unless you want to configure that.
  6. Click Add!

Congrats! You should now have a Ceph block storage pool!

Create your Ceph directory storage

In the same view (Datacenter > Storage):

  1. Click Add and select CephFS
  2. Give it a memorable ID (same rules as in the previous step), we called ours ceph-fs
  3. Ensure that the content is selected to all the available options (VZDump backup file, ISO image, Container Template, Snippets)
  4. Ensure the Use Proxmox VE managed hyper-converged cephFS option is selected

Click Add–you now have a location you can store VM templates/ISOs in that is accessible to every node participating in the CephFS pool! This is important because it radically simplifies configuration via Terraform, which we’ll be writing about in subsequent posts.

Create a Cloud-Init ready VM Template

We use Debian for virtually everything we do, but these steps should work for Ubuntu as well.

  1. Upload a Debian ISO (we use small installation image) to your CephFS storage
  2. Perform a minimal installation. The things you’ll want to pay attention to here are:
    • Ensure your domain-name is set to whatever domain-name you’ve configured on your domain-controller
    • Only install the following packages: SSH Server and Standard System Utilities
  3. Finish the Debian installation wizard and power on your VM. Make a note of the ID (probably 100 or something) I refer to as VM_ID in subsequent steps
  4. In the Proxmox console for your VM, perform the following steps:
    • apt-get update
    • apt-get install cloud-init net-tools sudo
  5. Once APT is finished installing those packages, edit your sshd_config file at /etc/ssh/sshd_config using your favorite editor (e.g. vi /etc/ssh/sshd_config) and ensure that PermitRootLogin is set to yes from its default of prohibit-password. These are all air-gapped in our environment, so I’m not too worried about the security ramifications of this. If you are, be sure to adjust subsequent configurations to use certificate-based auth.
  6. Save the file and shut down the virtual machine

Now, let’s enable CloudInit!

On any of your nodes joined to the CephFS filesystem:

  1. Open a console
  2. Configure a CloudInit drive:
    • Using the id for the Ceph RBD/block store (referred to as <BLOCK_STORE_ID> we configured above (ours is ceph-block-storage), and the VM_ID from the previous section, create a drive by entering qm set <VM_ID> --ide2 <BLOCK_STORE_ID>:cloudinit. For instance, in our configuration, with VM_ID = 100, this is qm set 100 --ide2 ceph-block-storage:cloudinit. This should complete without errors.
  3. Verify that your VM has a cloud-init drive by navigating to the VM, then selecting the Cloud-Init node. You should see some values there instead of a message indicating Cloud-Init isn’t installed.

You should also verify that this machine is installed on the correct block-storage by attempting to clone it to another node. If everything’s configured properly, cloning a VM to a different node should work seamlessly. At this point, you can convert your VM to a template for subsequent use.

Conclusion

In this post we installed and configured Ceph block and filestores, and also created a Cloud-Init capable VM template. At this point, we’re ready to begin configuring our HA Kubernetes cluster using Terraform from Hashicorp. Our Terraform files will be stored in our public devops Github repository

Building a Home Cloud with Proxmox Part 2: Domain Configuration

Overview

Last time I went over how to configure a router equipped with DD-WRT for managing a home cloud, as well as installing the awesome Proxmox virtualization environment. This time, we’ll go over how to configure a Windows domain controller to manage ActiveDirectory profiles, as well as DNS.

General configuration

The end goal is to provision a cluster that’s running a Windows domain controller (for DNS and ActiveDirectory), as well as a Kubernetes cluster. Once we have that we’ll deploy most of our services into the Kubernetes cluster, and we’ll be able to grant uniform access to every object via the domain controller.

Note that multitenancy isn’t a goal for this home cloud, although the domain controller is where we would add that. We could also partition our subnet into more subnets, providing functionality analogous to public cloud providers’ notions of Virtual Private Clouds.

Step 1: Create a Windows Server VM

  1. Ensure you have a suitable storage location:
    1. Open a shell to one of your nodes
    2. Type lsblk–it should show you a tree of your available hard drives and partitions
    3. If you don’t see a partition under your drives, type fdisk. In fdisk <drive>, where <drive> is the drive you want (probably not /dev/sda–that’s likely where Proxmox is installed), create a GPT partition by typing g. Write the changes by entering w
    4. Now you should be able to provision a VM on that drive/partition in the subsequent steps.
  2. Download the Windows Server ISO
  3. Upload the Windows Server ISO to Proxmox:
    1. The Windows server ISO is too large to upload via the GUI, so pick a node (e.g. athena.sunshower.io), and scp it up: scp <iso file> root@<node>:/var/lib/vz/template/iso, for example:scp windows_server_2019.iso root@athena.sunshower.io/var/lib/vz/template/iso/
    2. Create a new VM
      • Select Windows as the type, the default version should be correct, but verify it’s something like 10/2019/
      • Give it at least 2 cores, 1 socket is fine. The disk should have at least 50GB available.
      • I provided mine with 16 GB RAM, but we have pretty big machines. 4 works fine, and you can change this later.
    3. One of the prompts will be to assign your server to a domain. You should enter the domain that you want to control (ours being sunshower.io).

You should be able to access your Windows Server VM now, but it’s not quite ready.

Configure your Domain Controller’s DNS

The first thing we’ll want to do is ensure that our domain controller is at a deterministic location. For this, we’ll want to
1. Assign it a static DHCP lease in our router
1. Assign it a static IP locally
1. Assign it a DNS entry in our router

Assign the DHCP lease

  1. In the Windows VM, open an elevated console and run ipconfig /all. This will show all of your IP configurations, as well as the MAC addresses they’re associated with. Locate the entry for your subnet (ours is 192.168.1.0/24), and locate the associated MAC address (format AA:AA:AA:AA:AA:AA, where AA is a 2-letter alphanumeric string).
  2. Once you have your MAC address, log into your router, navigate to the Services/Services subtab, find the Static Leases section, and create an entry whose MAC address is the one you just obtained, and whose IP address is towards the low-end of your subnet (we selected 192.168.1.5). Choose a memorable hostname (we went with control.sunshower.io) and enter that, too.

Save and apply your changes.

Assign the static IP locally

  1. In your Windows VM, navigate to Control Panel > Network and Internet > Network Connections. You’ll probably only have one.
  2. Right-click on it and select Properties. Navigate to Internet Protocol Version 4 (TCP/IPv4)
  3. Select Use the following IP address
    • For IP Address, enter the IP you set in the previous section (Assign the DHCP lease)
    • For Subnet mask, (assuming a /24 network), enter 255.255.255.0. You can look these up here
    • For Default gateway, select your router IP (ours is 192.168.1.1)

For the DNS Server Entries, you can set anything you like (that is a valid DNS server). We like Google’s DNS servers (8.8.8.8, 8.8.4.4). Notice that we’re not actually setting our router to be a DNS server, even though it is configured to be one. That’s because we generally want the domain controller to be our DNS server.

Click Ok and restart just to be sure.

Configure your Domain Controller

Your Windows Server VM will need to function as a domain controller (and DNS server) for this to work. After you’ve created the Windows Server VM, log into it as an administrator. You should be presented with the Server Manager Dialog.

  1. From the top-right, click the manage button, select Add Roles and Features. This will bring you to the Roles & Features wizard.
  2. Select Role-based or feature-based installation. Click Next
  3. Select your server. Click Next
  4. Select the following roles
    • Active Directory Domain Services
    • Active Directory Lightweight Directory Services
    • Active Directory Rights Management Services -> Active Directory Rights Management Server
    • DHCP Server
    • DNS Server

Click Install–grab some coffee while Windows does its stuff. It’ll restart at least once.

Configure DNS

Almost done! We’re going to add all the entries that we had previously added in the router to the Domain Controller so that we only need to reference one DNS server: our domain controller. Finally, we’ll join a home computer to the domain.

Add the DNS records

  1. From the Server Manager window, select Tools and click DNS from the dropdown.
  2. Expand the Forward Lookup Zones node in the tree. You should see your domain under there, possibly along with other entries such as _msdcs.<your domain.
  3. Select <your domain>. Ours is sunshower.io
  4. Right-click <your domain>, add new host (A or AAAA)
  5. For each of your Proxmox nodes, add the name (e.g. athena)–the fully-qualified domain name should automatically populate.
    • Provide the IP address of the associated node. If you followed along closely to the previous entry, they’ll be (192.168.1.2 up to 1 + number of nodes)
  6. Create an entry for your router such as gateway.my.domain (e.g. gateway.sunshower.io) and point it at your gateway IP (e.g. 192.168.1.1)
  7. Create an entry for your controller such as control.my.domain (e.g. control.sunshower.io) and point it at your controller’s static IP.

Save and restart your controller.

Create a ActiveDirectory User

  1. Log back into your domain controller and, from the server manager select the Tools menu, then Active Directory Users and Computers.
  2. From the left-hand navigator tree, expand your domain node and right-click on the Users Sub-node.
  3. Select New, and then User. Fill in your information, and make a note of the username and password, we’ll use it in the next step.

Join a computer to your domain!

On your local computer, navigate to:
1. Control Panel > System and Security > System
1. Under the section Computer name, domain, and workgroup settings, select Change settings. The System Properties dialog should appear.
1. Select the Change button to the right of To rename this computer or change its domain or workgroup, click Change
1. Set your computer name (kitchen, workstation, etc.)
1. Set member of to domain with your domain value (e.g. sunshower.io).

Click OK–this can take a bit.

Once that’s done, restart your computer. You should now be able to log in as <DOMAIN>/ (configured in the ActiveDirectory section above)!

Your profile should automatically use your domain controller as its DNS server, so you should be able to ping all of your entries.

Conclusion

Windows Server makes it really easy to install and manage domains, almost enough to justify its steep price-tag. A domain server allows you to control users, groups, permissions, accounts, applications, DNS, etc. in a centralized fashion. Later on, when we configure Kubernetes, we’ll take advantage of Windows Server’s powerful DNS capabilities to automatically provision DNS entries associated with IPs allocated from our cloud subnet and use them to front services deployed into our own high-availability Kubernetes cluster.

If you don’t want to/can’t use Windows Server, you can replicate this functionality using Apache Directory, and PowerDNS. I may do a post on how to setup and configure these, but my primary goal is to move on to the Kubernetes cluster configuration, and managing DNS and LDAP are two relatively complex topics that are greatly simplified by Windows Server.

Building a Home Cloud with Proxmox: Part 1

We’re back!

Whew! As you’re probably already aware, 2020 was an incredibly weird year–but we’re getting back on the wagon and going leaner in 2021! Last year, we’d installed solar power to take advantage of Colorado’s 300+ days of sunshine, and we had some relatively beefy workstations left over from our defense work. At the same time, we noticed that our R&D AWS spend is a little high for our budget right now, so we decided to build a local cloud on top of the wonderful Proxmox virtualization management software. Let’s get started!

Structure

To begin with, we reviewed which services from AWS we needed, and came up with the following list:

  1. RDS
  2. Jenkins CI/CD
  3. Kubernetes (We hosted our own with KOPS on AWS–highly recommend it)
  4. Route53
  5. AWS Directory Service
  6. Nexus
  7. Minio

(I haven’t really decided if I want to use OpenShift or not for this cluster yet).

We’re also probably going to end up using MetalLB to allocate load-balancers for our bare-metal(ish) cluster.

Let’s get to it!

Step 0: Hardware configuration

Our setup is 3 HP Z840s with the following configuration:
1. 2TB RAM
1. 24 logical CPUs on 2 sockets
1. 4 4TB Samsung PRO 860 SSDs

For ease-of-use, we also recommend the BYTECC 4 HDMI port KVM switch (for these machines, anyway, YMMV)

Step 1: Configure your Network

  1. Check your network configuration. We have each box hooked up directly to a Linksys AC32000 running DD-WRT.
    1. (Note that these instructions are for DD-WRT–if you don’t use that, you’ll have to find equivalent configurations for your router).
    2. IMPORTANT: Ensure that your network is at least a /24 (256 total IPs)–you’ll end up using more than you expect.
    3. Useful tip: Install your nodes towards the low-end of your subnet. Ours are at 192.168.1.[2, 3, 4] with the gateway 192.168.1.1.
  2. Next, configure your DD-WRT installation to supply DNS over your LAN by:
    1. Navigating to 192.168.1.1 and logging in
    2. Navigate to services from the top level, then ensure the services sub-tab is selected
    3. Select Use NVRAM for client lease DB (this will allow your configuration to survive power-outages and restarts)
    4. Set Used Domain to LAN & WAN
    5. Set your LAN Domain to whatever you want it to be. This is purely local, so it can be anything. We own sunshower.io so we just used that. home.local or whatever should work just fine as well.
    6. Ensure DNSMasq is enabled
    7. Add bash
      local=/home/
      expand-hosts
      domain-needed
      bogus-priv
      addn-hosts=/jffs/etc/hosts.home
      strict-order

      to your Additional DNSMasq Options Text box
    8. Save & Apply your configuration

Step 2: Install Proxmox

Proxmox is an open-source alternative to (approximately?) vSphere from VMWare, and we’ve used it for a variety of internal workloads. It’s quite robust and stable, so I encourage you to check it out!

The first thing is to burn a USB drive with the Proxmox ISO. UEFI hasn’t worked well for me on these machines (they’re a little old), so the Rufus configuration I used was:

Partition scheme: MBR
File System: FAT32
Cluster Size: 4096 bytes (default)

I went ahead and checked the Add fixes for old BIOSes option since these are pretty old bioses. Hitting start will ask you if you want to burn the image in ISO image mode or DD image mode. With these 840s, ISO mode didn’t work for me, but DD did. Hit START and grab a coffee!

Once your USB drive is gtg, install Proxmox VE on each of the machines:

  1. Plug your USB drive into the first node in your cluster
  2. Restart/power on your node
  3. Select your USB drive from the boot options. If you followed our instructions for Rufus, it’ll probably be under Legacy instead of UEFI or whatever.
  4. IMPORTANT: Click through the wizard until it asks you which drive/partition you want to install Proxmox into. Select Options and reduce the installation size to about 30GB. I don’t know what the minimum is, but 30GB works great for this setup. This even gives you some space to upload ISOs to the default storage location. (Note that if you don’t select this the installation will consume the entire drive).
  5. Continue on the installation until you see your network configuration.
    1. Select your network (ours is 192.168.1.1/24–yours may not be, depending on how you configured your router (above))
    2. Input your node IP (we went with 192.168.1.2 for the first, .3 for the second, etc.)
    3. Add a cool hostname. If you configured your router as we did above, you should be able to input <name>.. For instance, for our 3 nodes we went with athena.sunshower.io, demeter.sunshower.io and calypso.sunshower.io.
  6. Make a note of the username and password you chose for the root. Slap that into Lastpass–you’ll thank yourself later.

Repeat this process for each node you have. Once that’s complete, navigate to any of the nodes at https://<ip&gt;:8006 where <ip> is the IP that you configured in the Proxmox network installation step. Your browser will yell at you that this isn’t a trusted site since it uses a self-signed certificate by default, but that’s OK. Accept the certificate in your browser, then login with the username and password you provided. In the left-hand pane of your Proxmox server, select the top-level Datacenter node, then click Create Cluster. This will take a second, at which point you’ll be able to close the dialog and select Join Information. Copy the contents of the Join Information text-area.

Once you have your Join Information, navigate to each of the other nodes in your cluster. Log in, then select the top level Datacenter node once again. This time, click on the Join Cluster button. Paste the Join Information into the text area, then enter the root password of the first node in the root password text field. In a second or two, you should see 2 cluster nodes under the Datacenter configuration. Repeat this process with all of the nodes you set Proxmox up on!

Configure DNS

You may need to configure your DNS within your router at this point. Click on the Shell button for each node, and run:
1. apt-get update
1. apt-get install net-tools

Then, run ifconfig. You should see something like:

enp1s0: flags=4163&lt;UP,BROADCAST,RUNNING,MULTICAST&gt;  mtu 1500
        ether d8:9d:67:f4:5e:a0  txqueuelen 1000  (Ethernet)
        RX packets 838880  bytes 173824493 (165.7 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 907011  bytes 170727177 (162.8 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 17  memory 0xef500000-ef520000  

lo: flags=73&lt;UP,LOOPBACK,RUNNING&gt;  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10&lt;host&gt;
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 669  bytes 161151 (157.3 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 669  bytes 161151 (157.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vmbr0: flags=4163&lt;UP,BROADCAST,RUNNING,MULTICAST&gt;  mtu 1500
        inet 192.168.1.2  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::da9d:67ff:fef4:5ea0  prefixlen 64  scopeid 0x20&lt;link&gt;
        ether d8:9d:67:f4:5e:a1  txqueuelen 1000  (Ethernet)
        RX packets 834450  bytes 158495280 (151.1 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 902499  bytes 166814715 (159.0 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

This is on athena.sunshower.io. Your list might look different. Note that under the vmbr0 entry there’s a set of letters/digits to the right of ether (in my case, d8:9d:67:f4:5e:a1). Make a note of these for each node, then go back to your gateway, log in, and navigate to Services once again. The default subtab Services should also be selected–if it’s not, select that.

You’ll see a field called Static Leases–click add. You’ll see a set of 4 fields:
1. MAC Address
1. Hostname
1. IP Address
1. Client Lease Time

For each of the host/MAC addresses you found previously (e.g. athena.sunshower.io -> d8:9d:67:f4:5e:a1), fill the MAC address field with the MAC address, the hostname with the corresponding hostname, and the IP address of the node. Save & Apply Settings. You should be able to visit any of the nodes at their corresponding DNS name. For instance, to access the Proxmox cluster from athena, I can visit https://athena.sunshower.io:8006

Conclusion

At the end of this exercise, you should have a Proxmox cluster with at least 1 node that’s accessible via DNS on your local cluster!

Cloud-Agnostic Infrastructure-As-Code Trials and Tribulations

Opinions around cloud-agnosticism vary, but generally it is considered a truism that for cloud-agnostic deployments, infrastructure-as-code (IaC) is the best, perhaps the only way to achieve them. Cloud-agnostic deployments are obviously not a goal for every organization, but we’ve found that as organizations become more sophisticated it frequently becomes a need. This is most often due to customers’ on-premise needs or their own hybrid-cloud requirements. Yet despite their sophistication, these organizations almost universally struggle to implement cloud-agnostic deployments, even with infrastructure-as-code tools.

Tools like Terraform and Kubernetes are the most often used for cloud-agnostic infrastructure-as-code, and while we love and use them as much as the next DevOps nerd, we’ve found that this solution can run into a variety of problems. Broadly, in our experience, these cloud deployment and management problems fall into the following categories:

  1. Visibility (what infrastructure am I renting, and where?)
  2. Infrastructure design and architecture
  3. Infrastructure deployment
  4. Cost analysis and optimization, especially as it pertains to infrastructure utilization
  5. Parameterization (secrets, credentials, etc.)

There aren’t very many cloud management tools that capture all of the categories. Those that exist do have issues with “common denominator” problems — that is, concerns around what functionality remains consistent across clouds. We find that folks overestimate the degree to which common denominator problems exist and how problematic they actually are. While the names and APIs for services vary wildly across cloud platforms, once you get past the syntax, they’re functionally the same. Indeed, we find that abstracting out sensible defaults in our abstraction layer solves most of those problems. Usually, the problems arise when mapping individual capability concepts — e.g. porting function-as-a-service-based code from AWS Lambda to another cloud. Even more typically, it comes from providing a common view of infrastructure elements between clouds, tenants, regions, etc.

We’ve sought to alleviate these problems with Troposphere, our drag-and-drop orchestration engine that can take whatever your current infrastructure-as-code solution is and make it cloud-agnostic. Over the coming posts, we’ll address the issues we’ve seen in implementing cloud-agnostic infrastructure-as-code solutions, and how to address them.

Reflecting on 2019

Like most folks coming into this new year and new decade, we’ve been reflecting on the past. Sunshower.io is far less than a decade old (we turn two in March!), but even a year is a long time in the life of a small business. Despite our small size, it was a banner year.

First Six Months

The first six months of the year were largely occupied by releasing the Sunshower platform to the public. Our web application plays host to Stratosphere, allowing you to visualize your AWS EC2 infrastructure around the globe, and Anvil, allowing you to save 66% on your AWS EC2 bill with right-sizing.

We also spent a lot of the first half of the year doing the “startup circuit” — taking second in a startup pitch competition, doing lots of networking events with investors, and interviewing at Y Combinator.

Last Six Months

From there, we abruptly got pulled into the land of government contracting. A large defense contractor contacted us about doing a white paper together for the Air Force, which we submitted in July. We spent August waiting to hear if we we would receive an RFP, and September writing the proposal. Our contacts at the Air Force think our offering is incredibly valuable, and last we heard, the proposal was in the technology evaluation phase. Interested in what we pitched them? Download our marketing white paper.

This fall, we’ve renewed our commitment to open-source software. Most notably, we refactored our plugin framework into Zephyr, so other organizations can reap the benefit of a non-OSGi system for lifecycle and dependency management. We also renewed our work on Aire, a UI framework built on Aurelia and UIKit.

Last but not least, we finished out the year with a bit of a rebrand. We have a new logo and doubled down on our bright colors. We also changed our tagline to reflect our movement away from optimization and towards application and infrastructure management and deployment.

We’re looking forward to what 2020 has in store!

Startup Culture is an American Revolution

Happy Fourth of July!

In case you were wondering, this isn’t just another Independence Day blog post talking about the Sunshower platform and how it will bring you freedom, blah blah blah. Rather, this is a blog post emphasizing that the ideals that led to the American Revolution, both within Great Britain and the colonies itself, are alive and well in the American startup culture.

What’s a Cloud Management Platform? (Part 2: Cloud Optimization)

What’s a Cloud Management Platform? (Part 2: Cloud Optimization Edition)

Two weeks ago, we talked about some of the ways that a Cloud Management Platforms (CMP) helps users relieve the headaches associated with DIY cloud resource management. This week, we’ll look at a few more compelling reasons to use a Cloud Management Platform like Sunshower.io for your cloud optimization and cloud resource management.

Official Platform Launch: Sunshower.io for AWS EC2

We’ve been testing software.

We’ve been onboarding customers.

We’ve been providing unparalleled cloud visibility.

We’ve been rightsizing and optimizing.

We’ve been cutting cloud bills by an average of 66%.

And we’ve been giving it all away for free.

That ends today, with the much anticipated OFFICIAL LAUNCH of Sunshower.io for AWS EC2!

What’s a Cloud Management Platform? (Part 1)

What’s a Cloud Management Platform? (And Why Do You Need One?) Part 1 of 2

Our official tagline at Sunshower.io is “beautifully simple cloud management and optimization.” But why do you need a Cloud Management Platform like Sunshower.io? When you work with a Cloud Service Provider (CSP) like AWS or Azure, doesn’t the CSP do the cloud optimization for you? Isn’t it the CSP’s job to make sure what you’re running in the cloud is rightsized, your applications are easy to view and manage, and that you’re getting the best possible value for your money? That’s what you’re paying them for, right?

In a word? Nope. That’s all on you, bub.