dms-on-lxd

Last updated: 2024-09-17 21:09:08.591233

Introduction

This project aims to leverage terraform to provision LXD instances where multiple DMS executions will reside.

The ultimate goal is to have a generic and flexible provisioning standard to set up DMS clusters wherever there is access to the LXD API.

Prerequisites

Remote

LXD API enabled on the target hosts. See HOW-TO: expose LXD to the network.
- For legacy versions you can refer to Directly interacting with the LXD API. This authentication method has been removed in recent versions of LXD.

Locally

Dependencies:

LXD CLI which must provide the lxc command line interface.
OpenTofu
DMS Deb file
- To download the latest dms release, refer to dms installation guide
jq
yq, a jq wrapper

This project can alternatively be run using the provided Dockerfile.

If using docker:

build the image

docker build -t dms-on-lxd .

run the image

docker run -it --rm -v $PWD:/app dms-on-lxd bash

You can use the Dockerfile as a complete reference for all the dependencies that must be present in order to execute this project.

Usage

Configuring

First copy the configuration dist file:

cp config.yml.dist config.yml

Then modify the values accordingly:

lxd_hosts:
- host: localhost
  token: randomtoken
  port: 8443
- host: 10.250.251.1
  token: randomtoken
  port: 8443

If desired, you can customize the number of DMS deployments by adding dms_instances_count to the config.yml file:

dms_instances_count: 2
lxd_hosts:
- #...

If omitted, one DMS instance per LXD host is deployed by default.
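The default described above can be illustrated with a short Python sketch. It assumes config.yml has already been parsed into a dictionary (for example with a YAML library) and that "per LXD host" means per configured entry in lxd_hosts. The function name effective_instance_count is hypothetical and not part of this project.

```python
def effective_instance_count(config: dict) -> int:
    """Return how many DMS instances a parsed config.yml would request.

    Assumption: `config` is the mapping obtained by parsing config.yml.
    """
    hosts = config.get("lxd_hosts") or []
    count = config.get("dms_instances_count")
    if count is None:
        # Documented default: one DMS instance per LXD host.
        return len(hosts)
    return int(count)


if __name__ == "__main__":
    example = {
        "lxd_hosts": [
            {"host": "localhost", "token": "randomtoken", "port": 8443},
            {"host": "10.250.251.1", "token": "randomtoken", "port": 8443},
        ]
    }
    print(effective_instance_count(example))
```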

SSH Key

An ssh key called lxd-key (and lxd-key.pub) is created and used for the deployment of the instances. If you want to override the key, just add to terraform.tfvars:

ssh_pub_key = "ecdsa-sha2-nistp256 AAAABBBBBBBBBXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFmJ+rmL9YSVfXTEX+7P5VD6rciVYpig8BzmWJlJwdEmnuFhMyhsmtO31M2TwcW9TFNyfEsABCDEFGHI= EXAMPLE KEY"

Other terraform variables

The default terraform variables can be seen in variables.tf. Customizing their default values is optional.

To customize a variable, add it to terraform.tfvars, creating the file if it doesn't exist. For instance, to set the DMS file through dms_deb_filepath:

dms_deb_filepath = "/full/path/to/nunet-dms-0.xxx.deb"

For a complete list of variables, check the file variables.tf.

Nebula

This project also supports using nunet/nebula which is a project that is based off slackhq/nebula.

Note that the nunet project is private as of the time of writing this document.

To enable the use of nebula, add to the terraform.tfvars:

enable_nebula = true

And provide the necessary nebula users with their respective associated IPs, adding them to the config.yml file:

nebula_users:
- username: nunet-test998
  password: boguspass
  ipv4: 10.251.252.253
- username: nunet-test999
  password: boguspass
  ipv4: 10.251.252.254
- # ...

Note that you must provide at least as many users as the number of DMS instances to be deployed, otherwise the execution will fail.
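A pre-flight check for this constraint could look like the following Python sketch. It is not how the project enforces the rule; it only assumes that config.yml has been parsed into a dictionary and that the expected instance count is already known. The name check_nebula_users is hypothetical.

```python
def check_nebula_users(config: dict, instance_count: int) -> None:
    """Raise ValueError when there are fewer nebula users than DMS instances.

    Assumption: `config` is the parsed content of config.yml.
    """
    users = config.get("nebula_users") or []
    if len(users) < instance_count:
        raise ValueError(
            f"{instance_count} DMS instances requested but only "
            f"{len(users)} nebula users configured"
        )


if __name__ == "__main__":
    example = {
        "nebula_users": [
            {"username": "nunet-test998", "password": "boguspass", "ipv4": "10.251.252.253"},
            {"username": "nunet-test999", "password": "boguspass", "ipv4": "10.251.252.254"},
        ]
    }
    check_nebula_users(example, instance_count=2)  # passes silently
```

Running such a check before make.sh would surface the mismatch earlier than waiting for the deployment to fail.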

Running

NOTE: If using docker, run these inside the container.

Spin up the cluster using bash make.sh. NOTE: make isn't actually used for the deployment.
Use instance_addresses.txt to connect and execute code in the remote instances:

for instance in $(cat instance_addresses.txt); do
  lxc exec --cwd=/opt/test-suite $instance -- {COMMAND_TO_EXECUTE}
done
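The same loop can be written in Python when the results need further processing. This is a minimal sketch, assuming instance_addresses.txt contains whitespace-separated instance identifiers exactly as the shell loop consumes them. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def read_instances(path: str = "instance_addresses.txt") -> list:
    # Split on any whitespace, mirroring `$(cat file)` in the shell loop.
    return Path(path).read_text().split()


def build_exec_command(instance: str, command: list, cwd: str = "/opt/test-suite") -> list:
    # Equivalent to: lxc exec --cwd=<cwd> <instance> -- <command...>
    return ["lxc", "exec", f"--cwd={cwd}", instance, "--", *command]


def run_on_all(command: list, path: str = "instance_addresses.txt") -> None:
    # Run the command on every instance, stopping at the first failure.
    for instance in read_instances(path):
        subprocess.run(build_exec_command(instance, command), check=True)


if __name__ == "__main__":
    print(build_exec_command("localhost:dms-0-on-localhost", ["uname", "-a"]))
```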

When done, destroy the infrastructure using bash destroy.sh.

Outputs

The following files are produced after running this project with make.sh.

add-lxd-remotes.sh

This script is a helper to add the lxd remote servers to your local lxd client in order to help with managing remote instances.

It looks something like this:

#!/bin/bash

lxc remote add --accept-certificate --password securepass localhost https://localhost:8443
lxc remote add --accept-certificate --token randomtoken localhost https://10.251.252.1:8443
# ...
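To show how such lines relate to the entries in config.yml, here is a hedged Python sketch that builds one token-based command per host. It is not the project's actual generator: using the host value as the remote name is an assumption, and remote_add_command is a hypothetical helper.

```python
import shlex
from typing import Optional


def remote_add_command(host: dict, name: Optional[str] = None) -> str:
    """Build an `lxc remote add` line for one lxd_hosts entry.

    Assumption: the remote is named after the host unless `name` is given.
    """
    remote_name = name or host["host"]
    url = f"https://{host['host']}:{host['port']}"
    # shlex.join quotes any argument that is not shell-safe.
    return shlex.join(
        ["lxc", "remote", "add", "--accept-certificate",
         "--token", str(host["token"]), remote_name, url]
    )


if __name__ == "__main__":
    entry = {"host": "localhost", "token": "randomtoken", "port": 8443}
    print(remote_add_command(entry))
```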

Upon execution, the remotes are added to your local machine. You can then list the virtual machines in each remote:

$ lxc ls localhost:       
+--------------------+---------+-------------------------+-------------------------------------------------+-----------------+-----------+
|        NAME        |  STATE  |          IPV4           |                      IPV6                       |      TYPE       | SNAPSHOTS |
+--------------------+---------+-------------------------+-------------------------------------------------+-----------------+-----------+
| dms-0-on-localhost | RUNNING | 172.17.0.1 (docker0)    | fd42:bd31:9a92:4fb2:216:3eff:feb4:abcd (enp5s0) | VIRTUAL-MACHINE | 0         |
|                    |         | 10.167.120.14 (enp5s0)  |                                                 |                 |           |
+--------------------+---------+-------------------------+-------------------------------------------------+-----------------+-----------+
| dms-1-on-localhost | RUNNING | 172.17.0.1 (docker0)    | fd42:bd31:9a92:4fb2:216:3eff:fed2:efgh (enp5s0) | VIRTUAL-MACHINE | 0         |
|                    |         | 10.167.120.171 (enp5s0) |                                                 |                 |           |
+--------------------+---------+-------------------------+-------------------------------------------------+-----------------+-----------+

You can then terminate instances at will, for instance when the opentofu state becomes inconsistent while using this project:

$ lxc delete --force localhost:dms-0-on-localhost

reachable_hosts.yml

This is a list of hosts that have been tested and are reachable from the machine where this project is being executed:

lxd_hosts:
- {"host":"localhost","token":"randomtoken","port":8443}

unreachable_hosts.yml

This is a list of hosts that failed to respond during the reachability test:

lxd_hosts:
- {"host":"10.250.251.1","token":"randomtoken","port":8443}
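How the project tests reachability is not described here. As an illustration only, the split into reachable and unreachable hosts could be approximated with a plain TCP connection attempt, as in this Python sketch. The function names and the choice of a TCP probe are assumptions.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    # Assumption: a successful TCP connection counts as "reachable".
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def partition_hosts(hosts: list, probe=is_reachable) -> tuple:
    """Split lxd_hosts entries into (reachable, unreachable) lists."""
    reachable, unreachable = [], []
    for entry in hosts:
        if probe(entry["host"], entry["port"]):
            reachable.append(entry)
        else:
            unreachable.append(entry)
    return reachable, unreachable


if __name__ == "__main__":
    hosts = [
        {"host": "localhost", "token": "randomtoken", "port": 8443},
        {"host": "10.250.251.1", "token": "randomtoken", "port": 8443},
    ]
    ok, failed = partition_hosts(hosts)
    print("reachable:", [h["host"] for h in ok])
    print("unreachable:", [h["host"] for h in failed])
```

The probe is passed in as a parameter so that a stricter check, such as an authenticated request to the LXD API, can replace the TCP test without changing the partitioning logic.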

vms-ipv4-list.txt

This is a list of the IPv4 addresses available for connection after provisioning the infrastructure. It is a simple file with one IP per line, which can be easily iterated over using bash or any other language such as Python.

10.167.120.14
10.167.120.171

If nebula is enabled, these IPs are replaced with the internal IPs of nebula, assigned to each VM:

10.251.252.253
10.251.252.254

To iterate over the list for connecting over ssh using bash:

for ip in $(cat ./vms-ipv4-list.txt); do
  ssh -i ./lxd-key \
    -o IdentitiesOnly=yes \
    -o StrictHostKeyChecking=no \
    -o PubKeyAuthentication=yes \
    root@$ip -- echo ok
done

To process the file in a script, for example in Python:

with open("vms-ipv4-list.txt") as file_fp:
    ip_list = [line.strip() for line in file_fp.readlines()]

Known issues

Docker and LXD

There are cases where docker prevents lxd instances from communicating with the internet consistently. The issue manifests as follows: the user can upgrade and install packages with APT, but anything else hangs indefinitely.

To overcome this, add the following rules to iptables (using sudo whenever necessary):

iptables -I DOCKER-USER -i lxdbr0 -j ACCEPT
iptables -I DOCKER-USER -o lxdbr0 -j ACCEPT

Virtual Machines (SOLVED)

NOTE: In the current state, both virtual-machine and container instance types work with the same terraform file, at the expense of DMS being installed asynchronously. Terraform therefore returns successfully while code is still running inside the lxd instances. Check /var/log/cloud-init-output.log or /var/log/init.log inside each lxd instance to see whether the installation finished successfully.
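Because terraform returns before provisioning ends, a script may want to block until cloud-init has finished in each instance. The Python sketch below relies on the standard `cloud-init status --wait` command run through `lxc exec`. It assumes cloud-init is available inside the instances and that instance identifiers are given in a form `lxc exec` accepts. The helper names are hypothetical, and this does not replace checking the log files mentioned above.

```python
import subprocess


def build_wait_command(instance: str) -> list:
    # `cloud-init status --wait` blocks until cloud-init has finished.
    return ["lxc", "exec", instance, "--", "cloud-init", "status", "--wait"]


def wait_for_provisioning(instances: list) -> dict:
    """Wait on each instance; map instance name to True on a zero exit code."""
    results = {}
    for instance in instances:
        completed = subprocess.run(build_wait_command(instance))
        results[instance] = completed.returncode == 0
    return results


if __name__ == "__main__":
    print(build_wait_command("localhost:dms-0-on-localhost"))
```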

long explanation of the issue and solution

virtual-machine did not work before because the file block in lxd_instance resources expects to provision files while the lxd instance is still stopped. That works for containers because of the nature of their filesystem (overlayfs or similar), but not for virtual machines, whose filesystem isn't accessible as a plain directory. The lxd_instance_file resource, which uploads a file once the instance is up and running, solves this. However, exec blocks in the lxd_instance resource, which run synchronously with terraform, can't be used for commands that depend on a file uploaded through lxd_instance_file, because their execution can't be deferred until the file is provisioned. We therefore rely on cloud-init's runcmd, which runs in the background after terraform returns.

Terraform docs

Generating docs

This part of the documentation is generated automatically from terraform using terraform-docs.

To update it, run:

terraform-docs markdown table --output-file README.md --output-mode inject .

Requirements

Name	Version
lxd	2

Providers

Name	Version
local	2.5.1
lxd	2.0.0

Modules

No modules.

Resources

Name	Type
local_file.ipv4_list	resource
local_sensitive_file.foo	resource
lxd_instance.dms_on_lxd	resource
lxd_instance_file.dms_deb	resource
lxd_instance_file.init	resource
lxd_instance_file.wait_for_dms_script	resource

Inputs

Name	Description	Type	Default	Required
distro_name	The lxd instance distro name. Useful for overriding when testing multiple distributions	`string`	`"ubuntu"`	no
distro_version	The lxd instance distro version. Useful for overriding when testing multiple releases	`string`	`"20.04"`	no
dms_deb_filepath	Path to the debian file which to use for installing DMS when provisioning the lxd instances	`string`	`"nunet-dms-latest.deb"`	no
dms_instances_count	The desired amount of DMS on LXD instances	`number`	`3`	no
enable_nebula	Enables nebula configuration in the lxd virtual machines. If set, add the necessary nebula users in config.yml. If there aren't enough users for the amount of lxd virtual machines, terraform plan will fail	`bool`	`false`	no
instance_name_prefix	Optional prefix for LXD instance names. Useful if sharing same host for multiple deployments without colliding.	`string`	`""`	no
instance_type	The type of the lxd instance	`string`	`"virtual-machine"`	no
limits	The lxd instance resource limits. See https://documentation.ubuntu.com/lxd/en/latest/reference/instance_options/#resource-limits	`object({ memory = optional(string, "4GB") cpu = optional(number, 2) })`	`{}`	no
ssh_pub_key	An SSH public key used in order to configure remote access to the lxd instances over SSH	`string`	`null`	no
timezone	Timezone for the lxd instances	`string`	`"America/Sao_Paulo"`	no

Outputs

Name	Description
instance_addresses	n/a
