1 of 26

main

Last updated: 2024-11-07 21:05:29.356500 File source: link on GitLab

test-suite

This repository combines all tests and defines full testing pipeline of the platform. The test pipeline is rather complex and different stages are implemented in different repositories. This repository aims at holding all relevant information and definitions inrrespectively as well as provide the interface to test management framework, which displays results of each test run (see https://nunet.testmo.net/).

CLI

Please refer to CLI_GUIDE.md.

Repository structure

├── .gitlab         # issue templates and other technical gitlab related artifacts
├── stages          # definitions of use cases and bahaviors of each test stage
├── environments    # definitions of environments

Testing matrix

Stage

Feature

Develop

Staging

Production

How to read the table: columns are environments (defined and explained in ./environments folder of this repo), rows are test stages; the build stage is a special stage which both is used as a test (since if a package does not build, it is a test that something went wrong) as well as step for building environments.

y means that a corresponding environment is needed to run a stage (equivalent -- a stage runs in that environment);
n means the opposite -- a stage does not need that environment to run an is not executed in that environment;

The meaning of the testing matrix sells is explained below. Note, they may differ depending on the stage (e.g. manual execution of deploy stage means that at least part of the environment will need manual actions from community compute providers; likewise, manual acceptance tests consider involvement of beta testers). Furthermore, some test stages contain sub-stages and the whole matrix will be evolving together with the code base.

Branching strategy

The branching strategy is defined here.

Testing schedule

TBD: will define when and how we run advanced tests which need manual testing as well as heavy environments (we will not be able to run them very frequently, and that will need to be preperly scheduled).

.assets

Last updated: 2024-11-07 21:05:28.783967 File source: link on GitLab

.assets

Contains different pictures, assets, etc that may be relevant to the contents of the repostitory (like diagrams that cannot be done properly in mermaid, which is our diagramming standard)

.testmo

Last updated: 2024-11-07 21:05:29.103524 File source:

.testmo

Any documents or artifacts that describe set up of testmo environment that we are using for displaying test results will be archived here; https://nunet.testmo.net

cicd

Last updated: 2024-11-07 21:05:29.926640 File source:

CICD

This folder contains job definitions for the cicd pipeline that implements the test matrix described in .

Usage

You need to import this project's cicd/Auto-Devops.gitlab-ci.yml and define a base image that will server as the default image for building the project:

There are multiple variables that are used to disable specific parts of the pipeline:

For later stages to work, like functional tests, you need to define a build job called Build.

Stages

Code quality

This stage contains jobs that control code quality using linters and analysers. Currently we implement Code Climate for general languages and golangci-lint for golang projects.

Unit Tests

Security Tests 1

Build

This stage builds artifacts necessary for further testing and distribution.

Functional Tests

Testing

Tests are described in the folder tests/. It contains high level functionality tests for the cicd pipeline that can either be triggered manually or with dummy merge requests, as needed.

scripts

Last updated: 2024-11-07 21:05:29.657302 File source: link on GitLab

feature_environment

Last updated: 2024-11-07 21:05:30.196367 File source: link on GitLab

Introduction

This document aims to lay out the architecture supporting the current implementation of the feature environment.

The feature environment is described at https://gitlab.com/nunet/test-suite/-/tree/develop/environments/feature?ref_type=heads.

The ADRs for this architecture can be found at https://gitlab.com/nunet/test-suite/-/tree/develop/doc/architecture/decisions?ref_type=heads.

Architecture

In a nutshell, the feature environment launches virtual machines in remote hosts pre-configured with DMS already installed. These virtual machines are designed to be accessed via SSH for remote code execution.

The main challenge in this project is to guarantee that the compute resource from which the feature environment is launched can access, authenticate, deploy and run remote code in the target hosts and virtual machines. In order to do so, an overlay network must be preconfigured in order to access resources that are not directly exposed to the web. Even if all the virtual machine hosts were exposed to the web, this overlay network would come in handy, first to limit access to the LXD API, reducing attack surface of the solution, but also to accomodate for future domestic compute resources, once the the test suite is mature enough for compute resource providers from the community.

The solutions stack, described in the ADRs, are LXD for virtual machine management, Terraform/OpenTOFU for interaction with the LXD API, Slack Nebula for the overlay network and Gitlab CI for the code pipeline execution.

The LXD VM management scripts and terraform declaration can be found at https://gitlab.com/nunet/test-suite/-/tree/develop/infrastructure/dms-on-lxd?ref_type=heads

The project for Slack Nebula deployment can be found at https://gitlab.com/nunet/nebula/-/tree/main?ref_type=heads. At the time of writing the project is private, but it should be opened to public view once some sensitive aspects of the implementation are resolved.

The Gitlab CI pipeline implementation can be found at https://gitlab.com/nunet/test-suite/-/blob/develop/cicd/Feature-Environment.gitlab-ci.yml?ref_type=heads.

Diagram

The following is a diagram showing the relation between compute elements in the feature environment, namely:

the Gitlab Runner, which effectively execute Gitlab CI jobs
the LXC Remotes, which host the virtual machines containing the DMS installations for testing
the virtual machines that are spun up by the Gitlab CI
the nebula overlay network, which provides connectivity between all the aforementioned moving parts

This topology diagram represents a Gitlab CI job, that is triggered via commit to develop branch. All compute elements reside inside the nebula overlay network and communicate internally using their respective private IPs. The virtual machines that are spun up also join the network.

Pre-requisites

This are the set of conditions needed before the feature environment can run.

First of all, there is the nebula network, which is used for connectivity. It has a lighthouse and users with pre-signed certificates that are used to connect and authenticate with the network. We have, therefore the need for one user for each Gitlab CI runner, LXD Host and LXD Virtual Machine. These users must be configured on a need-by-need basis, as there is no self-service, automated way to achieve this. For more information, see the nebula project in nunet gitlab group, as linked in the Architecture section.

Once nebula is configured for each of the compute elements, we need the LXD Hosts to expose the LXD API to the Gitlab CI Runners. This can be done either by setting the LXD API bind to the internal nebula IP, or to bind it globally (0.0.0.0). For instructions as to how to configure the LXD Api, see https://ubuntu.com/blog/directly-interacting-with-the-lxd-api.

Feature Environment pipeline

This section describes the flow of the pipeline from the point of view of the Gitlab CI.

The overall state flow of the CI Pipeline for the feature environment is as follows:

In this graph it is represented the flow of the upstream and downstream pipelines that compose the flow of the feature environment.

Once code is pushed to DMS develop branch, the project's pipeline is triggered. This is the upstream pipeline. Among all the jobs, the two jobs of interest for the scope of this document are the Build DMS and Trigger feature environment (job names in the diagram might differ from the actual names).

The downstream pipeline is triggered in the nunet/test-suite project. The job that creates the virtual machines pull artifacts from the build job in order to pre-configure the Virtual Machines. Then the functional tests are run over SSH. The results of those jobs are sent to testmo and uploaded to the ci-reports webserver.

Once the tests are run, the lxd virtual machines are torn down.

Spinning up the virtual machines

The following graph represents the communication flow from the Gitlab CI Runner to the LXD Hosts in order to spin up the virtual machines:

In this diagram, the first thing that happens when the job that creates the virtual machine is triggered is Gitlab CI pulling secrets from the vault. Those secrets are in the form of a base64 encoded config file. The specification of that file can be found at nunet/test-suite/infrastructure/dms-on-lxd/config.yml.dist.

Then it uses the information in that config file to interact with the LXD API using terraform and the internal nebula IP addresses for the LXD Hosts. It is important to note that, if any host is unavailable they are filtered out. The pipeline won't halt, but it will complain visually, with a warning sign that one or more hosts couldn't be reached. The pipeline halts and attempts to destroy the infrastructure if there is an error or if no LXD host is available.

The entire process can be understood if looking at the file nunet/test-suite/infrastructure/dms-on-lxd/make.sh.

Running the tests

The following graph allows for a broad illustration of the communication process employed in running the tests:

From the Gitlab Job, feature tests are run using gherkin and beahve.

The scripts are setup in a way that they take in an inventory file containing a list of IP addresses and a private ssh key and run remote CLI commands over ssh. The command used to run behave in the pipeline is:

behave \
    -D inventory_file=$CI_PROJECT_DIR/infrastructure/dms-on-lxd/vms-ipv4-list.txt \
    -D ssh_key_file=$CI_PROJECT_DIR/infrastructure/dms-on-lxd/lxd-key \
    features/device-management-service/cli-tests/nunet_cli_lxd.feature

Both the files vms-ipv4-list.txt and lxd-key are generated by the make.sh script.

Notifications

We use slack-notification for gitlab to implement a notification system that allows us to see in slack when a provisioning job fails when it isn't supposed to. These jobs can be seen in the Feature Environment CICD pipeline.

It uses a webhook to the server-alerts channel, which is configured using Slack API's incoming webhooks. There is a Slack APP called Gitlab CI Notifications in which these webhooks are configured.

The webhook endpoint is stored in a Gitlab CI variable called SLACK_ALERTS_WEBHOOK which is configured at Nunet group's CICD Variables.

Maintaining the webhook is a matter of recreating the webhook endpoint for the target slack channel, in case it ever expires, and update the variable SLACK_ALERTS_WEBHOOK in the cicd varibles for the nunet group with the new webhook endpoint.

environments

Last updated: 2024-11-07 21:05:30.491277 File source: link on GitLab

environments

Contain all files and scripts that are needed in order to spawn each environment that is used in testing stages (also to start the production environment -- which is the mainnet!)

Summary

Similarly to other decentralized computing projects (as blockchains), the network is running on the hardware provisioned via independent devices. In NuNet case, there is an additional complexity due to the fact that test networks have to resemble heterogeneity of the population of devices, operating systems and setups. Therefore, large portion of the NuNet CI/CD pipeline have to run not on centralized servers (e.g. in our case, via gitlab-ci runners), but on the geographically dispersed network. In order to manage the full life-cycle of the platform, including testing of separate features and iterations of the network components, NuNet is using isolated channels categorized into four environments:

feature environment is used to run ci/cd pipeline on each merge request from individual developers' feature branches;
development environment runs the ci/cd pipeline on the develop branch of the NuNet repositories;
staging environment runs extensive pre-release testing on the frozen features in the staging branch;
production environment runs the final releases of NuNet network, exposed to end users;

development

Last updated: 2024-11-07 21:05:30.792707 File source:

Purpose and audience

The development environment is composed of a somewhat larger (than feature) network of heterogeneous devices sourced from the community. Since NuNet, as a decentralized network, will not have control of the devices sourced from community, the development environment will encompass communication channels with the community members who will participate in the .

CI/CD stages

Branch: develop branch

Develop environment is used to run the following CI/CD pipeline stages according to the pre-defined schedule, to be communicated to community testers:

;
;
;
(if needed by the feature);
(if needed by the feature);

Architecture

The development environment contains:

virtual machines and containers hosted in NuNet cloud servers;
machines owned by NuNet team members;
machines provided by the community members on constant basis via program;

Triggering schedule

The CI/CD pipeline in development environment is triggered in two cases:

according to the pre-determined schedule for running stages that are more heavy on compute requirements -- which ideally may include the more advanced stages ; depending on the speed of development, NuNet may be schedule weekly or nightly builds and runs of the platform with the full pipeline (possibly including the latest stages of the CI/CD pipeline normally reserved for Staging environment only). In principle, Development environment should be able to run all automatic tests.

feature

Last updated: 2024-11-07 21:05:31.050885 File source:

Feature environment

Purpose and audience

Branch: As per NuNet GIT Workflow, all individual development will happen on the feature branches created from develop branch and related to each issue assigned via the team process and development board. When the developed feature is requested to merge to the develop branch, selected stages of the CI/CD pipeline are triggered on merge request (so that reviewers and approvers if it complies to relevant acceptance criteria). The Feature environment is used to run these stages.

Architecture

Most of the stages require only static environment (i.e. can be executed via gitlab-runners on NuNet servers), but some of them may require distributed deployment of the platform components. Note also, that the feature network may branch in a number of channels (see compute provider onboarding for details) for testing specific features or use-cases which need sourcing specific hardware.

CI/CD stages

Feature environment is used to run the following CI/CD pipeline stages on merge request and on merge:

static analysis;
unit tests;
static security tests;
build (if needed by the feature);
functional tests / API tests (if needed by the feature);

Triggering schedule

Since merge requests are be frequent, Feature environment mostly runs on GitLab servers triggered by CI/CD pipeline stages. However, it is also augmented by a small network of NuNet team owned hardware (still geographically distributed in private offices), which can be automatically updated and triggered in a centralized manner via CI/CD pipeline stages.

When necessary, a developer can request and run tests in NuNet's virtual machines and containers hosted in NuNet cloud servers or can run preliminary tests on their local machines before pushing into GitLab (always prefered). This environment is dotted in the above diagram since it is not exactly defined in the NuNet infrastructure.

production

Last updated: 2024-11-07 21:05:31.318115 File source:

Production environment

Note: Transferred from old Wiki ; to be updated as per new developments.

Purpose and public

This is the live environment used by the community to onboard machines/devices or to use the computational resources available on the NuNet platform.

CI/CD pipeline

No CI/CD pipeline stages are running on production environment. However, all users are provided with tools and are encouraged to report any bugs or file feature requests following the [https://gitlab.com/nunet/documentation/-/wikis/Contribution-Guidelines].

Architecture

The Production environment contains all community machines/devices connected to production network.

Production Release

When the tests in the Testnet (staging environment) are finished with success and approved by the testers, the module(s)/API(s) should be released to production. The following processes are being defined:

versioning process: versioning of modules and APIs;
compatibility/deprecation process: releasing modules/APIs that do not have compatibility with others modules/APIs currently running on the platform should be avoided since NuNet is a highly decentralized network; however old versions should be deprecated so maintaining the compatibility will not create other problems related to security, performance, code readability, etc.
communication process: how the community is notified of modules updates, bugs, security issues
updating process: how the modules/APIs are updated.

staging

Last updated: 2024-11-07 21:05:31.569793 File source:

Staging environment

Note: Transferred from old Wiki ; to be updated as per new developments.

Purpose and public

Testnet is this network and is used by developers, QA/security engineers and community testers. Manged by the Product Owner.

CI/CD stages

Branch: staging branch, created from developby freezing features scheduled for release;

CI/CD pipeline runs the following stages automatically as well as manually where required:

static analysis;
unit tests;
static security tests;
build (if needed by the feature);
functional tests / API tests (if needed by the feature);
security tests
regression tests
performance and load tests
live security tests

Architecture

The Staging environment contains:

virtual machines and containers hosted in NuNet cloud servers
machines owned by NuNet and NuNet team members
extensive network of community testers' machines/devices provided via NuNet Network private testers, covering all possible configurations of the network and most closely resembling the actual NuNet network in production:
- with different hardware devices
- owned by separate individuals and entities
- connected to internet by different means:
  - having IP addresses
  - behind different NAT types
  - having different internet speeds
  - having different stability of connection
  - etc

Triggering schedule

Testing on staging environment is triggered manually as per platform life-cycle and release schedule. When the staging branch is created from develop branch with the frozen features ready for release, the following actions are performed:

The staging environment / testnet is constructed by inviting community testers to join their machines in order to cover architecture described above;
All applications are deployed on the network (as needed) in preparation for automatic and manual regression testing and load testing;
Manual testing schedule is released and communicated to community testers;
CI/CD pipelsine is triggered with all automatic tests and manual tests;
Bug reports are collected and resolved;
Manual tests resulting in bugs are automated and included into CI/CD pipeline;

The above cycle is repeated until no bugs are observed. When this happens, the staging branch is marked for release into production environment.

infrastructure

Last updated: 2024-11-07 21:05:32.053878 File source: link on GitLab

This folder holds files that are deployed in our infrastructure.

Unless stated otherwise these files are manually deployed.

nginx/ci.nunet.io.conf

NGINX config file describing the ci reports web server. Deployed at dev.nunet.io.

cronjob/cron.daily/rotate_reports

Cronjob file to remove reports older than 90 days. Deployed at dev.nunet.io.

dms-on-lxd

Last updated: 2024-11-07 21:05:32.363152 File source: link on GitLab

Introduction

This project aims to leverage terraform to provision LXD instances where multiple DMS executions will reside.

The ultimate goal of this is to have a generic and flexible provisioning standard to setup the DMS clusters wherever there is access to the LXD api.

Prerequisites

Remote

LXD API enabled on the target hosts. See HOW-TO: expose LXD to the network.
- For legacy versions you can refer to Directly interacting with the LXD API. This authentication method has been removed in recent versions of LXD.

Locally

Dependencies:

LXD CLI which must provide the lxc command line interface.
OpenTofu
DMS Deb file
- To download the latest dms release, refer to dms installation guide
jq
yq, a jq wrapper

This project can alternatively be run using the provided Dockerfile.

If using docker:

build the image

docker build -t dms-on-lxd .

run the image

docker run -it --rm -v $PWD:/app dms-on-lxd bash

You can use the dockerfile as a complete reference for all the dependencies that are expected to be present in order to execute this project

Usage

Using Docker

You can run all commands through Docker after building the image:

# Build the image
docker build -t dms-on-lxd .

# Run make.sh
docker run -it --rm \
  -v $PWD:/app \
  -v ~/.ssh:/root/.ssh:ro \
  -e DMS_ON_LXD_ENV \
  -e DMS_DEB_FILE \
  dms-on-lxd bash make.sh

# Run destroy.sh
docker run -it --rm \
  -v $PWD:/app \
  -v ~/.ssh:/root/.ssh:ro \
  -e DMS_ON_LXD_ENV \
  dms-on-lxd bash destroy.sh

The -v ~/.ssh:/root/.ssh:ro mount is optional but useful if you need SSH access to the host machine.

Configuring

First copy the configuration dist file:

cp config.yml.dist config.yml

Then modify the values accordingly:

lxd_hosts:
- host: localhost
  token: randomtoken
  port: 8443
- host: 10.250.251.1
  token: randomtoken
  port: 8443

If desired, you can customize the amount of DMS deployments by adding dms_instances_count to the config.yml file:

dms_instances_count: 2
lxd_hosts:
- #...

If ommited, one DMS instance per LXD host is deployed by default.

SSH Key

An ssh key called lxd-key (and lxd-key.pub) is created and used for the deployment of the instances. If you want to override the key, just add to terraform.tfvars:

ssh_pub_key = "ecdsa-sha2-nistp256 AAAABBBBBBBBBXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFmJ+rmL9YSVfXTEX+7P5VD6rciVYpig8BzmWJlJwdEmnuFhMyhsmtO31M2TwcW9TFNyfEsABCDEFGHI= EXAMPLE KEY"

Other terraform variables

The default terraform variables can be seen in variables.tf. Customizing their default values is optional.

To customize variables, for instance the dms file, which is "dms_deb_filepath", add this line to terraform.tfvars. Create the file if it doesn't exist:

dms_deb_filepath = "/full/path/to/nunet-dms-0.xxx.deb"

For a complete list of variables, check the file variables.tf.

Nebula

This project also supports using nunet/nebula which is a project that is based off slackhq/nebula.

Note that the nunet project is private as of the time of writing this document.

To enable the use of nebula, add to the terraform.tfvars:

enable_nebula = true

And provide the necessary nebula users with their respective associated IPs, adding them to the config.yml file:

nebula_users:
- username: nunet-test998
  password: boguspass
  ipv4: 10.251.252.253
- username: nunet-test999
  password: boguspass
  ipv4: 10.251.252.254
- # ...

Notice that you must provide at least the same amount of users as the expected dms instances to be deployed, otherwise the execution will fail.

Running

NOTE: If using docker, run these inside the container.

Spin up the cluster using bash make.sh. NOTE: make isn't actually used for the deployment.
Use lxd_vm_addresses.txt to connect and execute code in the remote instances:

for instance in $(cat lxd_vm_addresses.txt); do
  lxc exec --cwd=/opt/test-suite $instance -- {COMMAND_TO_EXECUTE}
done

When done, destroy the infrastructure using bash destroy.sh.

Outputs

The following files are produced after running this project with make.sh.

add-lxd-remotes.sh

This script is a helper to add the lxd remote servers to your local lxd client in order to help with managing remote instances.

It looks something like this:

#!/usr/bin/env bash

lxc remote add --accept-certificate --password securepass localhost https://localhost:8443
lxc remote add --accept-certificate --token randomtoken localhost https://10.251.252.1:8443
# ...

Upon execution, the remotes are added to your local machine. You can then list the virtual machines in each remote:

$ lxc ls localhost:       
+--------------------+---------+-------------------------+-------------------------------------------------+-----------------+-----------+
|        NAME        |  STATE  |          IPV4           |                      IPV6                       |      TYPE       | SNAPSHOTS |
+--------------------+---------+-------------------------+-------------------------------------------------+-----------------+-----------+
| dms-0-on-localhost | RUNNING | 172.17.0.1 (docker0)    | fd42:bd31:9a92:4fb2:216:3eff:feb4:abcd (enp5s0) | VIRTUAL-MACHINE | 0         |
|                    |         | 10.167.120.14 (enp5s0)  |                                                 |                 |           |
+--------------------+---------+-------------------------+-------------------------------------------------+-----------------+-----------+
| dms-1-on-localhost | RUNNING | 172.17.0.1 (docker0)    | fd42:bd31:9a92:4fb2:216:3eff:fed2:efgh (enp5s0) | VIRTUAL-MACHINE | 0         |
|                    |         | 10.167.120.171 (enp5s0) |                                                 |                 |           |
+--------------------+---------+-------------------------+-------------------------------------------------+-----------------+-----------+

You can then terminate instances at will, for instance if while using this project the opentofu component enters an inconsistent state:

$ lxc delete --force localhost:dms-0-on-localhost

reachable_hosts.yml

This is a list of hosts that have been tested and are reachable from the machine where this project is being executed:

lxd_hosts:
- {"host":"localhost","token":"randomtoken","port":8443}

unreachable_hosts.yml

This is a list of unreachable hosts, that during test failed to respond:

lxd_hosts:
- {"host":"10.250.251.1","token":"randomtoken","port":8443}

vms-ipv4-list.txt

This is a list of the IPv4s available for connection after provisioning the infrastructure. It is a simple file with one IP per line which can be easily iterated over using bash or any other language like python.

10.167.120.14
10.167.120.171

If nebula is enabled, these IPs are replaced with the internal IPs of nebula, assigned to each VM:

10.251.252.253
10.251.252.254

To iterate over the list for connecting over ssh using bash:

for ip in $(cat ./vms-ipv4-list.txt); do
  ssh -i ./lxd-key \
    -o IdentitiesOnly=yes \
    -o StrictHostKeyChecking=no \
    -o PubKeyAuthentication=yes \
    root@$ip -- echo ok
done

For processing the file in a script like python:

with open("vms-ipv4-list.txt") as file_fp:
    ip_list = [line.strip() for line in file_fp.readlines()]

Known issues

Docker and LXD

There are instances where docker prevents lxd instances to communicate with the internet consistently. This issue manifests itself in a scenario where the user can upgrade and install packages with APT but anything else will halt indefinitely.

To overcome this, add the following rules to iptables (using sudo whenever necessary):

iptables -I DOCKER-USER -i lxdbr0 -j ACCEPT
iptables -I DOCKER-USER -o lxdbr0 -j ACCEPT

Virtual Machines (SOLVED)

NOTE: in the current state, virtual-machine and container will work with the same terraform file at the expense of having async installation of DMS. Therefore beware that the terraform will return successfully while there will still be code running inside the lxd instances. Check either /var/log/cloud-init-output.log or /var/log/init.log inside each lxd instance for information whether installation finished successfully.

long explanation of the issue and solution

virtual-machine wasn't working before because the file block in lxd_instance resources expect to be able to provision files while the lxd instance is still in a stopped state, which work for containers because of the nature of their filesystem (overlayfs or similar) but not for virtual machines which have file system that isn't accessible as a direct folder. Using lxd_instance_file resource, which will upload a file once the instance is up and running solves the issue. However exec blocks in lxd_instance resource, which work synchornously with terraform won't work with lxd_instance_file if it depends on the file because execution can't be staggered until the file is provisioned. Therefore we have to leverage cloud-init's runcmd for that, which runs in the background after terraform returns.

Terraform docs

Generating docs

This part of the documentation is generated automatically from terraform using terraform-docs.

To update it, run:

terraform-docs markdown table --output-file README.md --output-mode inject .

Requirements

Providers

Modules

No modules.

Resources

Inputs

Outputs

nginx

Last updated: 2024-11-07 21:05:32.625327 File source: link on GitLab

ci.nunet.io

This file is deployed at the same server whose IP is pointed by the URL. In the server the file will be different because certbot manages the certificate there.

stages

Last updated: 2024-11-07 21:05:32.887796 File source: link on GitLab

Test stages

Directory structure is organized following structure of test stages in CI/CD pipeline as defined in main pipeline file. In case more stages are added or renamed, both directory structure and pipeline structure have to be refactored simulataneously. In case of questions, the structure of the test-suite is the main reference and all other repositories and definitions have to be aligned to definitions provided in this repository.

Each stage folder is organized as follows:

 ├── {component_name} # (repository name)
    ├── features      # use case definitions in gherking (both automatic and manual!)
    ├── steps         # step definitions for automatic steps to be executed via pipelne

dependency_scanning

Last updated: 2024-11-07 21:05:33.176489 File source: link on GitLab

Dependency scanning

Dependency Scanning is a feature that analyzes an application's dependencies for known vulnerabilities, including transitive dependencies. It is part of Software Composition Analysis (SCA) and helps identify potential risks in your code before committing changes. For more information on these features, please visit the GitLab page about Dependency scanning.

functional_tests

Last updated: 2024-11-07 21:05:33.467870 File source: link on GitLab

functional_tests

Tests each API call as defined by the NuNet Open API of the respective version that is being tested. The goal of this stage is to make sure that the released versions of the platform fully correspond to the released Open APIs, which will be used by core team, community developers and app integrators to build further.

Implemented: https://gitlab.com/nunet/nunet-infra/-/blob/develop/ci/templates/Jobs/Functional-Tests.gitlab-ci.yml

Pre-requisites

python 3.11+
- older python versions might work though
python-venv
DMS:
- a native installation locally or using docker
- using the project dms-on-lxd

Usage

For detailed instructions on setting up and running the functional tests, please refer to the Quick Setup Guide which provides step-by-step instructions for:

Setting up the LXD environment
Running standalone and distributed tests
Common test scenarios and examples
Environment cleanup

Feature environment

This section documents the development guidelines of functional tests targeting the feature environment.

Introduction

The feature environment is a particular instance of an isolated network environment that has multiple DMS instances deployed. It uses the project dms-on-lxd to manage the virtual machines and network hosting DMS nodes. A full explanation of the feature environment architecture can be seen at the feature environment architecture documnetation.

There are conceptually two types of tests that will use the feature environment, standalone and distributed.

Standalone tests are subset of functional tests that don't explicitly test network integration, while distributed tests aims to produce particular outcomes when interacting with multiple DMS nodes in coordination.

Standalone tests will test things like hardware support, OS support, system resources footprint to name a few. It tries to answer questions like "can this particular ARM CPU run all the functionalities provided by DMS interface?", "can DMS be deployed on Ubuntu (24.04, 22.04, 20.04), Debian (Bookworm, Bullseye), Arch, etc...?", "are the minimum requirements for running DMS valid in practice?"...

Distributed tests will test things like peer to peer functionality, graph traversal and so forth. It tries to answer things like "can each DMS node in the graph see each other node?", "how long does it take for a node to be visibile to other nodes when joining the network?", "given multiple DMS nodes, can I successfully send files and messages from each node to another?", "given three DMS nodes, where A can only communicate with B through C, can I successfully interact with C from A?"...

Having this distinction in mind we can explore the interfaces of the feature environment and explore how they relate to the implementation of the functional tests.

Standalone API tests

The standalone API tests are structured in a way that they try to communicate with port 9999 using localhost and the http protocol. They can be used as is leveraging ssh tunneling.

Lets use the feature set described in device_api.feature as an example. Given we have a DMS installed locally, we can just run them:

behave features/device-management-service/api-tests/device_api.feature

However, in the context of the feature environment, the machine that run the tests and those that effectively execute the required commands and queries are different. Therefore we need to tunnel port 9999 to where we are running behave.

First we have to make sure that nothing is running bound to port 9999. For this we can use lsof to verify programs listenting to that port:

sudo lsof -i :9999

This command should not produce any output if there isn't anything listening to port 9999. If, however, there indeed is, that program should be interrupted before attempting to create the tunnel.

Once we made sure port 9999 is free to use, we can open the tunnel:

ssh-keygen -R $VM_IPV4
nohup ssh \
    -4 -N -L 9999:localhost:9999 \
    -o IdentitiesOnly=yes \
    -o StrictHostKeyChecking=no \
    -i $PROJECT_DIR/infrastructure/dms-on-lxd/lxd-key \
    root@$VM_IPV4 &
tunnel_pid=$!

Where PROJECT_DIR is the root of this project and VM_IPV4 is the IP of the target virtual machine we want to run the API test.

The first command uses ssh-keygen to update the known_hosts file with the updated signature of the virtual machine that has $VM_IPV4 attached to it. This is to make sure that ssh won't complain about signature changes and prevent us from opening the tunnel. Since the target virtual machines are ephemeral, this is a problem that can happen often. ssh-keygen in this context is safe to use because we are the ones provisioning the virtual machines, therefore the man-in-the-middle warning are known to be false alarms.

The second command uses ssh to create an IPv4 tunnel using port 9999. nohup combined with & is a bash idiom that will run ssh in the background without halting it, freeing your terminal to be used to run further commands. For more information see this stackoverflow answer.

The last command saves the process id of the tunnel in the variable tunnel_pid so it can be later used to destroy the tunnel.

Now we can just run behave again and it will use the local port 9999 but the connection will be redirected to the target host.

Once we are done, we can close the tunnel using the process ID we saved before:

kill $tunnel_pid

Standalone CLI tests

CLI tests don't have the same flexibility as API testing using http. They must be piped to the remote host using ssh directly, or at least there isn't a known way to pipe these commands transparently while running a python runtime locally.

Therefore there needs to be an effort employed to refactor the way the tests are structured so that if we pass a list of IPV4 addresses, a username (defaulting to root) and a key, it will run the necessary CLI commands over ssh, otherwise running the CLI commands locally.

The proposed way to do this is to run behave passing this information using -D:

behave \
  -D ssh_host=$target_ip \
  -D ssh_key_file=$PROJECT_DIR/infrastructure/dms-on-lxd/lxd-key \
  -D ssh_username=root \
  features/device-management-service/cli-tests/nunet_cli.feature

Note that this command uses files that are produced by the dms-on-lxd project. This assumes that the target IP which will run the commands is stored in the variable $target_ip. For more information about lxd-key and other files produces by dms-on-lxd, see dms-on-lxd documentation.

How exactly we implement this is up for debate, but there is a proof of concept that can be used as an example. For more information refer to the feature POC.

Distributed tests

To compose tests that require a certain level of coordination, the proposed way of doing it is through the implementation of the gherkin features using python, delegating to the behave framework and the python implementation the responsibility of coodinating these interactings and hiding them under high level functionality descriptions.

For this, take the Feature POC and its implementation as an example.

In it there are general features described, but each scenario is run on all nodes before moving to the next. This way, we can test that all nodes can onboard, that all nodes can see each other in the peers list, that all nodes can send messages to all other nodes, and that all of them can offboard, either using the CLI over SSH or the API using sshtunnel.

The code won't be repeated here to avoid risking them becoming obsolete in the future when we either change the POC code or remove them altogether.

SP and CP

It's not hard to imagine an extension of the POC for the scenario of a service provider and a compute provder.

Let's imagine the service provider has a workload that require GPU offloading but no GPU, while the compute provider has a GPU available for such workloads. In this scenario we can have a preparation step in behave that queries the remote hosts for resources, using lspci over ssh for instance, to identify machines that can serve as the service provider and the compute provider.

Doing this, we can have a test that describes exactly that, and we can implement the feature test in a way that will use the elected service provider (with or without GPU) to post a job specifically for the node that has GPU capabilities and will serve as a compute provider.

feature_environment_scripts

Last updated: 2024-11-07 21:05:33.739315 File source: link on GitLab

This a collection of helper scripts to facilitate the execution of the functional tests in the feature environment, managed by dms-on-lxd.

install.sh uses requirements.txt in functional_tests folder to create a virtual environment with the correct dependencies
run-standalone-tests.sh runs all the standalone tests in each virtual machine
run-distributed-tests.sh runs tests that require all virtual machines to run

Requirements

python 3
lsof
allure

Environment Variables

The test scripts support the following optional environment variables:

FUNCTIONAL_TESTS_DOCKER: Set to "false" to run tests directly without Docker container (default: "true")
BEHAVE_ENABLE_ALLURE: Set to "true" to enable Allure test reporting (default: "false")
BEHAVE_ENABLE_JUNIT: Set to "true" to enable JUnit test reporting (default: "false")
DMS_ON_LXD_ENV: Specify custom environment name to use alternate inventory files
CI_PROJECT_DIR: Override project directory path (defaults to detected test-suite path)
BEHAVE_SKIP_ONBOARD_CHECK: Skip device onboarding verification (default: "true")

Usage

cd stages/functional_tests
source install.sh
source run-standalone-tests.sh
source run-distributed-tests.sh

integration_tests

Last updated: 2024-11-07 21:05:34.005115 File source:

integration_tests

Tests that need to use more than one component. The purpose of the integration tests is to verify components in the integrated way, when there is a workflow that needs communication between them. These tests generally need some sort of environment (which could be mock environment or real environment).

load_tests

Last updated: 2024-11-07 21:05:34.578476 File source:

load_tests

Performance and load testing of the whol platfrom. Define performance scenarios that would exhaust the system and automatically run them for checking confidentiality, availability and integrity of the platform.

Implementation: https://gitlab.com/nunet/nunet-infra/-/blob/develop/ci/templates/Jobs/Load-Tests.gitlab-ci.yml

regression_tests

Last updated: 2024-11-07 21:05:35.066323 File source:

regression_tests

Regression tests will deploy and run all applications that are running on the platform. These tests may overlap or include user acceptance testing and testing behaviors on these applications as well as deployment behaviors. As well as user acceptance testing stage, regression tests may include manual beta testing phase.

Implemented: https://gitlab.com/nunet/nunet-infra/-/blob/develop/ci/templates/Jobs/Regression-Tests.gitlab-ci.yml

security_tests_1

Last updated: 2024-11-07 21:05:35.670324 File source:

security_tests_1

Automated security testing (using third party tools) does not need live environments or a testnet, i.e. can be run on the static repository code.

Implemented: https://gitlab.com/nunet/nunet-infra/-/blob/develop/ci/templates/Jobs/Security-Tests-1.gitlab-ci.yml

security_tests_2

Last updated: 2024-11-07 21:05:36.009165 File source: link on GitLab

security_tests_2

Tests security of API calls that do not need deployed testnetwork.

Implemented: https://gitlab.com/nunet/nunet-infra/-/blob/develop/ci/templates/Jobs/Security-Tests-2.gitlab-ci.yml

security_tests_3

Last updated: 2024-11-07 21:05:36.537909 File source:

security_tests_3

Live Security tests. Security Tests that need full platform and all applications running to test security aspects from user perspective. Will be done mostly manually and include 'red team' style penetration testing. All detected vulnerabilities will be included into security_tests_1 and security_tests_2 stages for automated testing of the further platform versions.

unit_tests

Last updated: 2024-11-07 21:05:36.778926 File source:

unit_tests

Runs units tests on the codebase for each language which exists in the codebase (since NuNet is a language agnostic platform, it may contain multiple language code). Coverage report is displayed via gitlab interface (this part is still being developed).

Implementation: https://gitlab.com/nunet/nunet-infra/-/blob/develop/ci/templates/Jobs/Unit-Tests.gitlab-ci.yml

user_acceptance_tests

Last updated: 2024-11-07 21:05:37.041124 File source:

user_acceptance_tests

Testing user behaviors from user perspective. Include all identify possible user behaviors and is constantly updated as soon as these behaviors are identified. The goal is to run most of the user acceptance tests automatically (describing scenarios BDD style), however, some of the tests will need to be run manually by the network of beta testers.

Implemented: https://gitlab.com/nunet/nunet-infra/-/blob/develop/ci/templates/Jobs/User-Acceptance-Tests.gitlab-ci.yml

dms-on-lxd

Last updated: 2024-11-07 21:05:32.363152 File source: link on GitLab

Introduction

This project aims to leverage terraform to provision LXD instances where multiple DMS executions will reside.

The ultimate goal of this is to have a generic and flexible provisioning standard to setup the DMS clusters wherever there is access to the LXD api.

Prerequisites

Remote

LXD API enabled on the target hosts. See HOW-TO: expose LXD to the network.
- For legacy versions you can refer to Directly interacting with the LXD API. This authentication method has been removed in recent versions of LXD.

Locally

Dependencies:

LXD CLI which must provide the lxc command line interface.
OpenTofu
DMS Deb file
- To download the latest dms release, refer to dms installation guide
jq
yq, a jq wrapper

This project can alternatively be run using the provided Dockerfile.

If using docker:

build the image

docker build -t dms-on-lxd .

run the image

docker run -it --rm -v $PWD:/app dms-on-lxd bash

You can use the dockerfile as a complete reference for all the dependencies that are expected to be present in order to execute this project

Usage

Using Docker

You can run all commands through Docker after building the image:

# Build the image
docker build -t dms-on-lxd .

# Run make.sh
docker run -it --rm \
  -v $PWD:/app \
  -v ~/.ssh:/root/.ssh:ro \
  -e DMS_ON_LXD_ENV \
  -e DMS_DEB_FILE \
  dms-on-lxd bash make.sh

# Run destroy.sh
docker run -it --rm \
  -v $PWD:/app \
  -v ~/.ssh:/root/.ssh:ro \
  -e DMS_ON_LXD_ENV \
  dms-on-lxd bash destroy.sh

The -v ~/.ssh:/root/.ssh:ro mount is optional but useful if you need SSH access to the host machine.

Configuring

First copy the configuration dist file:

cp config.yml.dist config.yml

Then modify the values accordingly:

lxd_hosts:
- host: localhost
  token: randomtoken
  port: 8443
- host: 10.250.251.1
  token: randomtoken
  port: 8443

If desired, you can customize the amount of DMS deployments by adding dms_instances_count to the config.yml file:

dms_instances_count: 2
lxd_hosts:
- #...

If ommited, one DMS instance per LXD host is deployed by default.

SSH Key

An ssh key called lxd-key (and lxd-key.pub) is created and used for the deployment of the instances. If you want to override the key, just add to terraform.tfvars:

ssh_pub_key = "ecdsa-sha2-nistp256 AAAABBBBBBBBBXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFmJ+rmL9YSVfXTEX+7P5VD6rciVYpig8BzmWJlJwdEmnuFhMyhsmtO31M2TwcW9TFNyfEsABCDEFGHI= EXAMPLE KEY"

Other terraform variables

The default terraform variables can be seen in variables.tf. Customizing their default values is optional.

To customize variables, for instance the dms file, which is "dms_deb_filepath", add this line to terraform.tfvars. Create the file if it doesn't exist:

dms_deb_filepath = "/full/path/to/nunet-dms-0.xxx.deb"

For a complete list of variables, check the file variables.tf.

Nebula

This project also supports using nunet/nebula which is a project that is based off slackhq/nebula.

Note that the nunet project is private as of the time of writing this document.

To enable the use of nebula, add to the terraform.tfvars:

enable_nebula = true

And provide the necessary nebula users with their respective associated IPs, adding them to the config.yml file:

nebula_users:
- username: nunet-test998
  password: boguspass
  ipv4: 10.251.252.253
- username: nunet-test999
  password: boguspass
  ipv4: 10.251.252.254
- # ...

Notice that you must provide at least the same amount of users as the expected dms instances to be deployed, otherwise the execution will fail.

Running

NOTE: If using docker, run these inside the container.

Spin up the cluster using bash make.sh. NOTE: make isn't actually used for the deployment.
Use lxd_vm_addresses.txt to connect and execute code in the remote instances:

for instance in $(cat lxd_vm_addresses.txt); do
  lxc exec --cwd=/opt/test-suite $instance -- {COMMAND_TO_EXECUTE}
done

When done, destroy the infrastructure using bash destroy.sh.

Outputs

The following files are produced after running this project with make.sh.

add-lxd-remotes.sh

This script is a helper to add the lxd remote servers to your local lxd client in order to help with managing remote instances.

It looks something like this:

#!/usr/bin/env bash

lxc remote add --accept-certificate --password securepass localhost https://localhost:8443
lxc remote add --accept-certificate --token randomtoken localhost https://10.251.252.1:8443
# ...

Upon execution, the remotes are added to your local machine. You can then list the virtual machines in each remote:

$ lxc ls localhost:       
+--------------------+---------+-------------------------+-------------------------------------------------+-----------------+-----------+
|        NAME        |  STATE  |          IPV4           |                      IPV6                       |      TYPE       | SNAPSHOTS |
+--------------------+---------+-------------------------+-------------------------------------------------+-----------------+-----------+
| dms-0-on-localhost | RUNNING | 172.17.0.1 (docker0)    | fd42:bd31:9a92:4fb2:216:3eff:feb4:abcd (enp5s0) | VIRTUAL-MACHINE | 0         |
|                    |         | 10.167.120.14 (enp5s0)  |                                                 |                 |           |
+--------------------+---------+-------------------------+-------------------------------------------------+-----------------+-----------+
| dms-1-on-localhost | RUNNING | 172.17.0.1 (docker0)    | fd42:bd31:9a92:4fb2:216:3eff:fed2:efgh (enp5s0) | VIRTUAL-MACHINE | 0         |
|                    |         | 10.167.120.171 (enp5s0) |                                                 |                 |           |
+--------------------+---------+-------------------------+-------------------------------------------------+-----------------+-----------+

You can then terminate instances at will, for instance if while using this project the opentofu component enters an inconsistent state:

$ lxc delete --force localhost:dms-0-on-localhost

reachable_hosts.yml

This is a list of hosts that have been tested and are reachable from the machine where this project is being executed:

lxd_hosts:
- {"host":"localhost","token":"randomtoken","port":8443}

unreachable_hosts.yml

This is a list of unreachable hosts, that during test failed to respond:

lxd_hosts:
- {"host":"10.250.251.1","token":"randomtoken","port":8443}

vms-ipv4-list.txt

10.167.120.14
10.167.120.171

If nebula is enabled, these IPs are replaced with the internal IPs of nebula, assigned to each VM:

10.251.252.253
10.251.252.254

To iterate over the list for connecting over ssh using bash:

for ip in $(cat ./vms-ipv4-list.txt); do
  ssh -i ./lxd-key \
    -o IdentitiesOnly=yes \
    -o StrictHostKeyChecking=no \
    -o PubKeyAuthentication=yes \
    root@$ip -- echo ok
done

For processing the file in a script like python:

with open("vms-ipv4-list.txt") as file_fp:
    ip_list = [line.strip() for line in file_fp.readlines()]

Known issues

Docker and LXD

To overcome this, add the following rules to iptables (using sudo whenever necessary):

iptables -I DOCKER-USER -i lxdbr0 -j ACCEPT
iptables -I DOCKER-USER -o lxdbr0 -j ACCEPT

Virtual Machines (SOLVED)

long explanation of the issue and solution

Terraform docs

Generating docs

This part of the documentation is generated automatically from terraform using terraform-docs.

To update it, run:

terraform-docs markdown table --output-file README.md --output-mode inject .

Requirements

Name

Version

Providers

Name

Version

Modules

No modules.

Resources

Name

Type

Inputs

Name

Description

Type

Default

Required

Outputs

Name

Description