We’re back!
Whew! As you’re probably already aware, 2020 was an incredibly weird year–but we’re getting back on the wagon and going leaner in 2021! Last year, we’d installed solar power to take advantage of Colorado’s 300+ days of sunshine, and we had some relatively beefy workstations left over from our defense work. At the same time, we noticed that our R&D AWS spend is a little high for our budget right now, so we decided to build a local cloud on top of the wonderful Proxmox virtualization management software. Let’s get started!
Structure
To begin with, we reviewed which services from AWS we needed, and came up with the following list:
- RDS
- Jenkins CI/CD
- Kubernetes (We hosted our own with KOPS on AWS–highly recommend it)
- Route53
- AWS Directory Service
- Nexus
- Minio
(I haven’t really decided if I want to use OpenShift or not for this cluster yet).
We’re also probably going to end up using MetalLB to allocate load-balancers for our bare-metal(ish) cluster.
Let’s get to it!
Step 0: Hardware configuration
Our setup is 3 HP Z840s with the following configuration:
1. 2TB RAM
2. 24 logical CPUs on 2 sockets
3. 4x 4TB Samsung 860 PRO SSDs
For ease-of-use, we also recommend the BYTECC 4 HDMI port KVM switch (for these machines, anyway, YMMV)
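Once you can get a shell on a node (any recent Linux will do), it's worth a quick sanity check that the OS actually sees the hardware above. A minimal sketch:

```shell
# Logical CPU count -- these Z840s should report 24
nproc

# Total memory in kB, straight from the kernel
awk '/MemTotal/ {print $2, $3}' /proc/meminfo
```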
Step 1: Configure your Network
- Check your network configuration. We have each box hooked up directly to a Linksys AC32000 running DD-WRT.
- (Note that these instructions are for DD-WRT–if you don’t use that, you’ll have to find equivalent configurations for your router).
- IMPORTANT: Ensure that your network is at least a `/24` (254 usable host IPs)–you'll end up using more than you expect.
- Useful tip: Install your nodes towards the low end of your subnet. Ours are at `192.168.1.[2, 3, 4]` with the gateway at `192.168.1.1`.
- Next, configure your DD-WRT installation to supply DNS over your LAN:
  - Navigate to `192.168.1.1` and log in
  - Navigate to `Services` from the top level, then ensure the `Services` sub-tab is selected
  - Select `Use NVRAM for client lease DB` (this will allow your configuration to survive power outages and restarts)
  - Set `Used Domain` to `LAN & WAN`
  - Set your `LAN Domain` to whatever you want it to be. This is purely local, so it can be anything. We own `sunshower.io`, so we just used that; `home.local` or whatever should work just fine as well.
  - Ensure `DNSMasq` is enabled
  - Add the following to your `Additional DNSMasq Options` text box (replace `home` in the `local=` line with the LAN domain you chose):

```bash
local=/home/
expand-hosts
domain-needed
bogus-priv
addn-hosts=/jffs/etc/hosts.home
strict-order
```

  - Save & Apply your configuration
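The `addn-hosts=/jffs/etc/hosts.home` line above tells DNSMasq to load extra host records from that file. For reference, a hypothetical `/jffs/etc/hosts.home` for a three-node setup like ours might look like this (standard hosts-file format; the names and IPs here are just the examples from this article):

```
192.168.1.2 athena
192.168.1.3 demeter
192.168.1.4 calypso
```

Because `expand-hosts` is enabled, DNSMasq will append your LAN domain to these bare names automatically, so `athena` also resolves as `athena.<your LAN domain>`.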
Step 2: Install Proxmox
Proxmox is an open-source alternative to (approximately?) vSphere from VMware, and we’ve used it for a variety of internal workloads. It’s quite robust and stable, so I encourage you to check it out!
The first thing is to burn a USB drive with the Proxmox ISO. UEFI hasn’t worked well for me on these machines (they’re a little old), so the Rufus configuration I used was:
Partition scheme: MBR
File System: FAT32
Cluster Size: 4096 bytes (default)
I went ahead and checked the Add fixes for old BIOSes option since these machines have pretty old BIOSes. Hitting Start will ask you whether you want to burn the image in ISO image mode or DD image mode. With these Z840s, ISO mode didn’t work for me, but DD mode did. Hit START and grab a coffee!
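If you’re burning the stick from a Linux box instead of Windows, you can skip Rufus entirely–plain `dd` is the moral equivalent of Rufus’s DD image mode. A sketch, where the ISO path and `/dev/sdX` are placeholders you must fill in (double-check the device with `lsblk` first; `dd` will cheerfully overwrite the wrong disk):

```shell
ISO=proxmox-ve.iso   # placeholder: path to the ISO you downloaded
DEV=/dev/sdX         # placeholder: your USB stick, per lsblk

# Bail out until both placeholders point at real things
[ -f "$ISO" ] && [ -b "$DEV" ] || { echo "edit ISO/DEV first"; exit 0; }

# Byte-for-byte copy, then flush buffers before removing the drive
dd if="$ISO" of="$DEV" bs=1M status=progress
sync
```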
Once your USB drive is good to go, install Proxmox VE on each of the machines:
- Plug your USB drive into the first node in your cluster
- Restart/power on your node
- Select your USB drive from the boot options. If you followed our instructions for Rufus, it’ll probably be listed under `Legacy` instead of `UEFI`.
- IMPORTANT: Click through the wizard until it asks you which drive/partition you want to install Proxmox onto. Select `Options` and reduce the installation size to about `30GB`. I don’t know what the minimum is, but 30GB works great for this setup, and it even gives you some space to upload ISOs to the default storage location. (Note that if you don’t do this, the installation will consume the entire drive.)
- Continue the installation until you see your network configuration:
  - Select your network (ours is `192.168.1.1/24`–yours may not be, depending on how you configured your router above)
  - Input your node IP (we went with `192.168.1.2` for the first, `.3` for the second, etc.)
  - Add a cool hostname. If you configured your router as we did above, you should be able to input `<name>.<your LAN domain>`. For instance, for our 3 nodes we went with `athena.sunshower.io`, `demeter.sunshower.io`, and `calypso.sunshower.io`.
- Make a note of the username and password you chose for root. Slap that into LastPass–you’ll thank yourself later.
Repeat this process for each node you have. Once that’s complete, navigate to any of the nodes at `https://<ip>:8006`, where `<ip>` is the IP that you configured in the Proxmox network installation step. Your browser will yell at you that this isn’t a trusted site since it uses a self-signed certificate by default, but that’s OK. Accept the certificate in your browser, then log in with the username and password you provided. In the left-hand pane of your Proxmox server, select the top-level `Datacenter` node, then click `Create Cluster`. This will take a second, at which point you’ll be able to close the dialog and select `Join Information`. Copy the contents of the `Join Information` text area.

Once you have your `Join Information`, navigate to each of the other nodes in your cluster. Log in, then select the top-level `Datacenter` node once again. This time, click on the `Join Cluster` button. Paste the `Join Information` into the text area, then enter the root password of the first node in the `root password` text field. In a second or two, you should see 2 cluster nodes under the `Datacenter` configuration. Repeat this process with all of the nodes you set Proxmox up on!
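If you prefer a terminal, the same cluster creation can, as far as I know, be done from each node’s shell with Proxmox’s `pvecm` tool instead of the web UI. For reference, a sketch using our example IPs and a made-up cluster name:

```
# On the first node (athena): create the cluster
pvecm create sunshower        # cluster name is up to you

# On each remaining node: join, pointing at the first node's IP
pvecm add 192.168.1.2         # prompts for the first node's root password

# From any node: verify quorum and membership
pvecm status
```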
Configure DNS
You may need to configure your DNS within your router at this point. Click on the `Shell` button for each node, and run:

```bash
apt-get update
apt-get install net-tools
```

Then, run `ifconfig`. You should see something like:

```
enp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether d8:9d:67:f4:5e:a0  txqueuelen 1000  (Ethernet)
        RX packets 838880  bytes 173824493 (165.7 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 907011  bytes 170727177 (162.8 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device interrupt 17  memory 0xef500000-ef520000

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 669  bytes 161151 (157.3 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 669  bytes 161151 (157.3 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

vmbr0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.2  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::da9d:67ff:fef4:5ea0  prefixlen 64  scopeid 0x20<link>
        ether d8:9d:67:f4:5e:a1  txqueuelen 1000  (Ethernet)
        RX packets 834450  bytes 158495280 (151.1 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 902499  bytes 166814715 (159.0 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
```
This is on `athena.sunshower.io`; your list might look different. Note that under the `vmbr0` entry there’s a set of letters/digits to the right of `ether` (in my case, `d8:9d:67:f4:5e:a1`). Make a note of these for each node, then go back to your gateway, log in, and navigate to `Services` once again. The default sub-tab `Services` should also be selected–if it’s not, select it.
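Rather than eyeballing the `ifconfig` output on each node, you can pull the bridge’s MAC out with a little `awk`. A sketch, assuming the bridge is named `vmbr0` (the Proxmox default):

```shell
# Print just the MAC address of the vmbr0 bridge
ifconfig vmbr0 2>/dev/null | awk '/ether/ {print $2}'
```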
You’ll see a field called `Static Leases`–click `Add`. You’ll see a set of 4 fields:
1. MAC Address
2. Hostname
3. IP Address
4. Client Lease Time

For each of the host/MAC pairs you found previously (e.g. `athena.sunshower.io` -> `d8:9d:67:f4:5e:a1`), fill in the MAC address, the corresponding hostname, and the node’s IP address. Save & Apply Settings. You should now be able to visit any of the nodes at their corresponding DNS names. For instance, to access the Proxmox cluster from athena, I can visit `https://athena.sunshower.io:8006`.
Conclusion
At the end of this exercise, you should have a Proxmox cluster with at least 1 node that’s accessible via DNS on your local network!