Overview
Last time, we configured a Windows Server domain controller to handle DNS, DHCP, and Active Directory (our LDAP implementation). In this post, we'll be configuring Ceph, a distributed storage system that is supported out-of-the-box by Proxmox.
Ceph Storage Types
Ceph supports several storage types, but the ones we're interested in for our purposes are:
- Filesystem
- Block
A Ceph filesystem behaves like a standard filesystem: it can contain files and directories. The advantage of a Ceph filesystem for our purposes is that we can access any files stored within it from any location in the cluster, whereas with local storage (e.g. `local-lvm` or whatever), we can only access them on the node they reside on.
Similarly, a Ceph block device is suitable for container and virtual machine hard drives, and these drives can be accessed from anywhere in the cluster. This node-agnosticism is important for several reasons:
- We’re able to create base images for all our virtual machines and clone them to any node, from any node
- We’re able to use Ceph as a persistent volume provider in Kubernetes
In this scheme, your Ceph IO performance is likely to be bottlenecked by your inter-node network performance. Our 10 Gbps network connections between nodes perform reasonably well, but are noticeably slower than local storage (we may characterize this at some point, but this series isn't it).
Drive Configuration
In our setup, we have four 4 TB drives in each node. Ceph seems to prefer identical hardware configurations on each node it's deployed on, so if your nodes differ here, you may want to restrict your Ceph deployment to identical nodes.
Create your Ceph partitions on each node
Careful: ensure that the drives you're installing Ceph onto aren't being used by anything else, because this process will erase everything on them. You can colocate Ceph partitions with other partitions on a drive, but that's not the configuration we're using here (and how well it performs depends on quite a few factors).
We're going to use `/dev/sdb` for our Ceph filesystem on each node. To prepare each drive, open a console into its host and run the following:
```
fdisk /dev/sdb
```

Then, at the fdisk prompt:

- `d` to delete a partition. If you have more than one, repeat this until there are no partitions left.
- `g` to create a new GPT partition table.
- `n` to create a new partition (partition 1, accepting the defaults).
- `w` to write the changes and exit.
Your disks are now ready for Ceph!
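If you want to sanity-check the result before moving on, listing the block device should show a single empty partition on each node (a minimal check; the columns below are just a readable subset of lsblk's output):

```
# Confirm the new GPT layout on /dev/sdb (sizes will differ in your setup)
lsblk -o NAME,SIZE,TYPE,FSTYPE /dev/sdb
```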
Install Ceph
On each node, navigate to the left-hand configuration panel, then click on the `Ceph` node. Initially, you'll see a message indicating that Ceph is not installed. Select the `advanced` option and click `Install` to continue through the wizard.
When you're presented with the options, ensure that the `osd_pool_default_size` and `osd_pool_default_min_size` configurations are set to the number of nodes that you intend to install Ceph on. While the `min_size` can be less than the `pool_size`, I have not had good luck with this configuration in my tests.
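If you prefer a shell to the wizard, the same installation can be sketched with Proxmox's `pveceph` tooling; the subnet below is a placeholder for whatever network your Ceph traffic should use:

```
# Install the Ceph packages on this node (repeat on every participating node)
pveceph install

# Initialize the cluster-wide Ceph configuration once, pointing it at the
# network Ceph should use (10.10.10.0/24 is a placeholder)
pveceph init --network 10.10.10.0/24
```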
Once you've installed Ceph on each node, navigate to the `Monitor` node under the `Ceph` configuration node and create at least one monitor and at least one manager, depending on your resiliency requirements.
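If you'd rather script this step, the CLI equivalents are roughly the following, run on each node that should host a monitor or manager:

```
# Create a monitor and a manager on the current node; repeat on other nodes
# if your resiliency requirements call for more than one of each
pveceph mon create
pveceph mgr create
```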
Create an object-storage daemon (OSD)
Our next step is to allocate the drives we previously provisioned (`/dev/sdb`) to Ceph. Navigate to the `OSD` node under the `Ceph` node and click `Create OSD`. In our configuration, we use the same disk for both the storage area and the Ceph database (DB) disk. Repeat this process for each node you want to participate in this storage cluster.
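For reference, the same OSD creation can be done from a node's shell; with no separate DB device specified, the database lands on the same disk, matching the choice above (a sketch, not the only possible layout):

```
# Consume /dev/sdb as an OSD on this node; DB and data share the disk
pveceph osd create /dev/sdb
```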
Create the Ceph pool
We're almost there! Navigate to the `Pools` node under the `Ceph` node and click `Create`. You'll be presented with some options, but the ones important to us are:
- size: set to the number of OSDs you created earlier
- min-size: set to the number of OSDs you created earlier
Ceph recommends about 100 placement groups per OSD, but the number must be a power of 2, and the minimum is 8. More placement groups means better reliability in the presence of failures, but it also means replicating more data which may slow things down. Since we’re not managing dozens or hundreds of drives, I opted for fewer placement groups (16).
Click `Create`, and you should now have a Ceph storage pool!
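The equivalent pool creation from the shell looks roughly like this; the pool name is a placeholder, the size/min_size values stand in for your node/OSD count, and pg_num matches the 16 placement groups chosen above:

```
# Create a replicated pool; replace 3 with your node/OSD count and keep
# pg_num at the power of two you settled on (16 here)
pveceph pool create vm-pool --size 3 --min_size 3 --pg_num 16
```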
Create your Ceph Block Storage (RBD)
You should now be able to navigate up to the cluster level and click on the `Storage` configuration node.
- Click `Add` and select `RBD`.
- Give it a memorable ID that's also volume-friendly (lower case, no spaces, only alphanumeric + dashes). We chose `ceph-block-storage`.
- Select the `cephfs_data` pool (or whatever you called it in the previous step).
- Select the monitor node you want to use.
- Ensure that the `Nodes` value is `All (no restrictions)` unless you want to configure that.
- Click `Add`!
Congrats! You should now have a Ceph block storage pool!
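If you'd rather register the block storage from a shell, `pvesm` can do the same thing; the storage ID and pool name below match the names used in this walkthrough, so adjust them to yours:

```
# Add the Ceph pool as RBD storage, available to all nodes by default
pvesm add rbd ceph-block-storage --pool cephfs_data --content images,rootdir
```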
Create your Ceph directory storage
In the same view (Datacenter > Storage):
- Click `Add` and select `CephFS`.
- Give it a memorable ID (same rules as in the previous step); we called ours `ceph-fs`.
- Ensure that the content is set to all the available options (VZDump backup file, ISO image, Container Template, Snippets).
- Ensure the `Use Proxmox VE managed hyper-converged cephFS` option is selected.

Click `Add`, and you now have a location you can store VM templates/ISOs in that is accessible to every node participating in the CephFS pool! This is important because it radically simplifies configuration via Terraform, which we'll be writing about in subsequent posts.
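The shell equivalent, assuming the Proxmox-managed CephFS already exists, is a `pvesm` one-liner as well:

```
# Register the hyper-converged CephFS for templates, ISOs, backups, and snippets
pvesm add cephfs ceph-fs --content vztmpl,iso,backup,snippets
```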
Create a Cloud-Init ready VM Template
We use Debian for virtually everything we do, but these steps should work for Ubuntu as well.
- Upload a Debian ISO (we use the small installation image) to your `CephFS` storage.
- Perform a minimal installation. The things you'll want to pay attention to here are:
  - Ensure your domain name is set to whatever domain name you've configured on your domain controller.
  - Only install the following packages: `SSH Server` and `Standard System Utilities`.
- Finish the Debian installation wizard and power on your VM. Make a note of the ID (probably 100 or something), which I refer to as `VM_ID` in subsequent steps.
- In the Proxmox console for your VM, perform the following steps:
```
apt-get update
apt-get install cloud-init net-tools sudo
```
- Once APT is finished installing those packages, edit your `sshd_config` file at `/etc/ssh/sshd_config` using your favorite editor (e.g. `vi /etc/ssh/sshd_config`) and ensure that `PermitRootLogin` is set to `yes` from its default of `prohibit-password` (a scripted alternative to hand-editing is sketched after this list). These machines are all air-gapped in our environment, so I'm not too worried about the security ramifications of this. If you are, be sure to adjust subsequent configurations to use certificate-based auth.
- Save the file and shut down the virtual machine.
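If you'd rather script the `sshd_config` change than open an editor, a single GNU sed substitution does the same job (a sketch that assumes the stock Debian config, where the directive ships commented out):

```
# Flip PermitRootLogin to yes, whether or not the stock line is commented out
sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin yes/' /etc/ssh/sshd_config
```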
Now, let’s enable CloudInit!
On any of your nodes joined to the CephFS filesystem:
- Open a console.
- Configure a CloudInit drive:
  - Using the ID for the Ceph RBD/block store we configured above (referred to as `<BLOCK_STORE_ID>`; ours is `ceph-block-storage`) and the `VM_ID` from the previous section, create a drive by entering `qm set <VM_ID> --ide2 <BLOCK_STORE_ID>:cloudinit`. For instance, in our configuration, with `VM_ID` = 100, this is `qm set 100 --ide2 ceph-block-storage:cloudinit`. This should complete without errors.
- Verify that your VM has a Cloud-Init drive by navigating to the VM, then selecting the `Cloud-Init` node. You should see some values there instead of a message indicating Cloud-Init isn't installed.
You should also verify that this machine is installed on the correct block-storage by attempting to clone it to another node. If everything’s configured properly, cloning a VM to a different node should work seamlessly. At this point, you can convert your VM to a template for subsequent use.
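A quick way to exercise that check from a node's shell (the new VM ID and target node name below are placeholders):

```
# Full-clone VM 100 to another node; this only succeeds if its disks live on
# storage that node can reach, i.e. the Ceph pool
qm clone 100 9001 --name clone-test --target <other-node> --full

# Once the clone works, convert the original into a template
qm template 100
```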
Conclusion
In this post we installed and configured Ceph block and file stores, and also created a Cloud-Init capable VM template. At this point, we're ready to begin configuring our HA Kubernetes cluster using Terraform from HashiCorp. Our Terraform files will be stored in our public devops GitHub repository.