Last updated: 2025-01-23 01:10:42.955122 File source: link on GitLab
The Nunet Command-Line Interface (CLI) serves as a powerful tool for interacting with the Nunet ecosystem, enabling you to manage network configurations, control capabilities, and handle cryptographic keys. It provides a comprehensive set of commands to streamline various tasks and operations within the Nunet network.
The actor command provides a suite of operations for interacting with the Nunet Actor System. It lets you communicate with actors in the network: sending messages, invoking specific behaviors, and broadcasting to multiple actors at once.
Detailed documentation can be found here.
The cap command manages capabilities within the Nunet ecosystem. It lets you define, delegate, and control the permissions and authorizations granted to different entities, so that interactions within the network stay secure and controlled.
Detailed documentation can be found here.
The config command lets you view and manage your configuration file directly from the command line. You can inspect existing settings, modify them as needed, and tailor Nunet DMS to your preferences.
edit
: Opens the configuration file in your default text editor for manual adjustments.
get
: Retrieve and display the current value associated with a specific configuration key.
set
: Modify the configuration file by assigning a new value to a specified key.
-h, --help
: Display help information for the config
command and its subcommands.
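As a rough illustration of what get and set do conceptually, the sketch below reads and writes a value in nested JSON-style data by dotted key, such as rest.port. It is not the DMS implementation: the helper names are hypothetical, and it assumes that dotted keys map to nested JSON objects.

```python
import json

def config_get(config: dict, key: str):
    """Return the value stored under a dotted key such as 'rest.port'."""
    node = config
    for part in key.split("."):
        node = node[part]  # raises KeyError if the key is absent
    return node

def config_set(config: dict, key: str, value) -> None:
    """Assign value to a dotted key, creating intermediate objects as needed."""
    *parents, leaf = key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value

# Assumed layout: dotted keys correspond to nested JSON objects.
config = {"rest": {"addr": "127.0.0.1", "port": 9999}}
config_set(config, "rest.port", 9998)
print(config_get(config, "rest.port"))
print(json.dumps(config, indent=2))
```

Only the key that was set changes; every other value in the structure is left untouched.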
The Nunet Key Management CLI lets you generate new keypairs and retrieve the Decentralized Identifier (DID) associated with a specific key.
nunet key
Description: The primary command to manage keys within the Nunet DMS
Usage: nunet key COMMAND
Available Commands:
did
: Retrieve the DID for a specified key
new
: Generate a new keypair
Flags:
-h, --help
: Display help information for the main key
command.
nunet key did
Description: Retrieves and displays the DID associated with a specified key. This DID uniquely identifies the key within the Nunet network.
Usage: nunet key did <key-name> [flags]
Arguments:
<key-name>
: The name of the key for which the DID is to be retrieved
Flags:
-h, --help
: Display help information for the did
command
nunet key new
Description: Generates a new keypair and securely stores the private key in the user's local keystore. The corresponding public key can be used for various cryptographic operations within the Nunet DMS
Usage: nunet key new <name> [flags]
Arguments:
<name>
: A name to identify the newly generated keypair
Flags:
-h, --help
: Display help information for the new
command.
Important Considerations:
Keep your private keys secure, as they provide access to your identity and associated capabilities within the Nunet DMS
Choose descriptive names for your keypairs to easily identify their purpose or associated devices.
nunet run
Starts the Nunet Device Management Service (DMS) process, which handles network operations and device management.
Usage: nunet run [flags]
Flags:
-c, --context string
: Specifies the key and capability context to use (default: "dms").
-h, --help
: Displays help information for the run
command.
Example: nunet run
This starts the Nunet DMS with the default "dms" context.
nunet tap
Purpose: Creates a TAP (network tap) interface to bridge the host network with a virtual machine (VM). It also configures network settings like IP forwarding and iptables rules.
Key Points:
Root Privileges Required: This command necessitates root or administrator privileges for execution due to its manipulation of network interfaces and system-level settings.
Usage: nunet tap <main_interface> <vm_interface> <CIDR>
Arguments:
main_interface
: (e.g., eth0) The name of the existing network interface on your host machine that you want to bridge with the TAP interface.
vm_interface
: (e.g., tap0) The name you want to assign to the newly created TAP interface.
CIDR
: (e.g., 172.16.0.1/24) The Classless Inter-Domain Routing (CIDR) notation specifying the IP address range and subnet mask for the TAP interface. This ensures that the VM or container connected to the TAP has its own IP address within the specified network.
Flags:
-h, --help
: Displays help information for the tap
command.
Example: sudo nunet tap eth0 tap0 172.16.0.1/24
This command creates a TAP interface named 'tap0' bridged to your host's 'eth0' interface. The 'tap0' interface is assigned the IP address '172.16.0.1' with a '/24' prefix (subnet mask 255.255.255.0). This configuration allows a VM connected to 'tap0' to access the network through your host's 'eth0' interface.
Important Notes:
Ensure you have the necessary permissions to execute this command.
Be cautious when configuring network settings, as incorrect configurations can disrupt your network connectivity.
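To make the CIDR argument concrete, the following sketch uses Python's standard ipaddress module to unpack the example value 172.16.0.1/24 into its address, network, netmask, and usable host range. It only illustrates the notation and does not touch any network interface.

```python
import ipaddress

# The example value from the text: host address plus prefix length.
iface = ipaddress.ip_interface("172.16.0.1/24")

print(iface.ip)        # address assigned to the TAP interface
print(iface.network)   # the subnet the address belongs to
print(iface.netmask)   # the /24 prefix written as a dotted mask

# Addresses available to hosts on the subnet (network and broadcast excluded).
usable = list(iface.network.hosts())
print(len(usable), usable[0], usable[-1])
```

A VM attached to the TAP interface would typically be given another address from the same usable range.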
nunet gpu
Purpose: The nunet gpu
command provides GPU-related operations.
Usage: nunet gpu COMMAND
Available Operations:
list
: List all the available GPUs on the system.
test
: Test the GPU deployment on the system using docker.
Flags:
-h, --help
: Display help information for the gpu
command and its subcommands.
Example: nunet gpu list
This command lists all the available GPUs on the system.
Example: nunet gpu test
This command tests the GPU deployment on the system using Docker.
Last updated: 2025-01-23 01:10:42.686247 File source: link on GitLab
The api package contains all API functionality of Device Management Service (DMS). DMS exposes various endpoints through which its different functionalities can be accessed.
Here is a quick overview of the contents of this directory:
README: Current file which is aimed towards developers who wish to use and modify the api functionality.
api.go: This file contains the router setup using the Gin framework. It also applies Cross-Origin Resource Sharing (CORS) middleware and OpenTelemetry middleware for tracing, and lists the endpoint URLs and their associated handler functions.
actor.go: Contains endpoints for actor interaction.
Configuration
The REST server by default binds to 127.0.0.1
on port 9999
. The configuration file dms_config.json
can be used to change to a different address and port.
The parameters rest.port
and rest.addr
define the port and the address respectively.
You can use the following format to construct the URL for accessing API endpoints
Currently, all endpoints are under the /actor
path
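As an illustration of how the address, port, and endpoint path combine, here is a minimal sketch of a URL builder. The function name, the plain http scheme, and the prefix parameter are assumptions made for this example; prefix is a hypothetical placeholder for any base path the server may mount its routes under.

```python
def endpoint_url(addr: str = "127.0.0.1", port: int = 9999,
                 path: str = "/actor/handle", prefix: str = "") -> str:
    """Build an endpoint URL from the rest.addr and rest.port settings.

    `prefix` is a hypothetical placeholder for a base path; it defaults
    to empty because the text does not specify one.
    """
    return f"http://{addr}:{port}{prefix.rstrip('/')}/{path.lstrip('/')}"

print(endpoint_url())
print(endpoint_url(port=9998, path="/actor/send"))
```

If rest.addr or rest.port is changed in dms_config.json, pass the new values instead of the defaults.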
/actor/handle
Retrieve actor handle with ID, DID, and inbox address
endpoint:
/actor/handle
method:
HTTP GET
output:
Actor Handle
Response:
/actor/send
Send a message to actor
endpoint:
/actor/send
method:
HTTP POST
output:
{"message": "message sent"}
The request should be an enveloped message
/actor/invoke
Invoke actor with message
endpoint:
/actor/invoke
method:
HTTP POST
output:
Enveloped Response or if error {"error": "<error message>"}
The request should be an enveloped message
/actor/broadcast
Broadcast message to actors
endpoint:
/actor/broadcast
method:
HTTP POST
output:
Enveloped Response or if error {"error": "<error message>"}
The request should be an enveloped message
For more details on these Actor API endpoints, refer to the cmd/actor package to see how they are used.
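The three POST endpoints share the same shape, so a client can prepare requests for them with one helper. The sketch below builds the request with Python's standard library without sending it. The helper name, the JSON content type, and the placeholder envelope are assumptions; the real envelope fields are defined by the actor package and are not reproduced here.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:9999"  # default address and port from the text

def actor_request(action: str, envelope: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /actor/send, /actor/invoke or /actor/broadcast."""
    if action not in {"send", "invoke", "broadcast"}:
        raise ValueError(f"unknown actor action: {action}")
    return urllib.request.Request(
        url=f"{BASE_URL}/actor/{action}",
        data=json.dumps(envelope).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )

# Placeholder body only; a real request needs a properly enveloped message.
req = actor_request("invoke", {"placeholder": "enveloped message"})
print(req.get_method(), req.full_url)
# To send it against a running DMS: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to inspect or log the request before it reaches a running DMS.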
Last updated: 2025-01-23 01:10:44.051677 File source: link on GitLab
This package defines the local database functionality for the Device Management Service (DMS). Currently two repository structures have been implemented:
gorm
: which is a SQLite database implementation.
clover
: which is a NoSQL
(document-oriented) database implementation.
Here is a quick overview of the contents of this package:
README: Current file which is aimed towards developers who wish to use and modify the package functionality.
db.go: This file defines the method which opens a SQLite database at a path set by the config parameter work_dir
, applies migration and returns the db
instance.
repositories: This folder contains the sub-packages of the db
package.
specs: This folder contains the class diagram of the package.
The class diagram for the db
package is shown below.
Source file
Rendered from source file
Refer to the README file defined in the repositories folder for specification of the package.
Last updated: 2025-01-23 01:10:41.894662 File source: link on GitLab
Device Management Service (DMS) enables a machine to join the decentralized NuNet network either as a compute provider, offering its resources to the network, or as a consumer that leverages the compute power of other machines for processing tasks. Eventually, users with available hardware resources will be compensated whenever their machine is used for a computational job by other users in the network. The ultimate goal of the platform is to create a decentralized compute economy that is able to sustain itself.
All transactions on the Nunet network are expected to be conducted using the platform's utility token NTX. However, DMS is currently in development, and payment is not part of the v0.5.0
release. NTX payments are expected to be implemented in the Public Alpha Mainnet milestone within later release cycles.
Note: If you are a developer, please check out the DMS specifications and Building from Source sections of this document.
You can install Device Management Service (DMS) via binary releases or building it from source.
You can find all binary releases here and other builds in-between releases in the package registry. We currently support ARM and AMD64 architectures. You can check your architecture with the appropriate command (uname -p
for Linux) and refer to an architecture name mapping (e.g. here) to work out the correct package to download.
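The name reported by the operating system often differs from the label used in package file names. The sketch below shows one way to translate between the two. The alias table is an assumption based on common naming conventions, covers only 64-bit names, and is not taken from the project's release tooling.

```python
import platform

# Common machine names and the package architecture label they map to.
# Illustrative only; extend it if your platform reports another name.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}

def package_arch(machine: str) -> str:
    """Translate a machine name (as printed by uname) to a package label."""
    try:
        return ARCH_ALIASES[machine.lower()]
    except KeyError:
        raise ValueError(f"unrecognised architecture: {machine}") from None

print(package_arch("x86_64"))
print(package_arch("aarch64"))
# platform.machine() returns the name for the machine running this script.
print(platform.machine())
```

Raising an error for unknown names is deliberate: guessing a package for an unrecognised architecture is worse than stopping.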
Note: If you installed the binary from a release and would like to act as a compute provider, you may need to check permissions and features to enable some required and optional features.
Ubuntu/Debian
Download the latest .deb package from the package registry
Install the Debian package with apt
or dpkg
:
Some dependencies such as docker
and libsystemd-dev
might be missing, so it's recommended to fix the installation by running:
We currently support Linux and macOS (Darwin).
Dependencies
iproute2 (linux only)
build-essential (linux only)
libsystemd-dev (linux only)
go (v1.21.7 or later)
git-lfs (for downloading large files)
Clone the repository:
Configure git-lfs:
Build the CLI:
This will result in a binary file in builds/ folder named as dms_linux_amd64
or dms_darwin_arm64
depending on the platform.
Note: If you built from source and would like to act as a compute provider, you may need to check permissions and features to enable some required and optional features.
To cross-compile for ARM, cross compilers need to be installed, in particular arm-linux-gnueabihf and aarch64-linux-gnu. On Debian systems, install them with:
You can add the compiled binary to a directory in your $PATH
. See the Usage section for more information.
The following applies only for compute providers using Linux. If you're running a client/orchestrator, you do not need to set any additional permissions.
Darwin users: unfortunately, the DMS can't work with granular permissions on Mac. So, for now, if running a compute provider, you will have to run the nunet daemon (
nunet run
) as root.
For Linux users, granular permissions have to be set on the binary (alternatively, but NOT recommended, you can run the binary as root).
Required: Net-admin permission and IP over libp2p
Note: step not needed if you're using our debian package.
It's needed for those building from source or downloading the binary releases.
Note: cap_net_admin
is a required capability for compute providers.
Setting the permission enables IP over libp2p, a feature that extends what compute providers can do and lets them participate in a wider range of jobs. One capability this feature enables is port forwarding, which is not possible without the right Unix permissions.
One reason this permission is required is that the feature depends on creating and managing tun interfaces.
To enable this feature, the nunet
binary requires network-admin
capabilities. These capabilities allow the application to perform network configuration tasks without needing to run the entire application as root, which is a more secure approach.
To set the necessary capabilities, run the following command:
The above command depends on: libcap2-bin
(Debian/Ubuntu) or libcap
(CentOS/RHEL/Arch...)
May be required: iptables upgrade
Some legacy versions of Linux iptables
do not work with our IP over libp2p feature.
Check the version of yours by running:
If it's using the nf_tables
version, you're fine. You can skip this step.
If it's using a legacy version, upgrade with:
Then, select the number which corresponds to the iptables-nft
option and press enter.
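The backend check can also be scripted. The sketch below assumes the version string has the form printed by iptables --version on iptables 1.8 and later, where the backend appears in parentheses. The function name is hypothetical, and releases that print no backend are treated as legacy.

```python
import re

def iptables_backend(version_output: str) -> str:
    """Return the backend named in parentheses in an iptables version string.

    Releases that print no backend at all are reported as "legacy".
    """
    match = re.search(r"\((nf_tables|legacy)\)", version_output)
    return match.group(1) if match else "legacy"

print(iptables_backend("iptables v1.8.7 (nf_tables)"))
print(iptables_backend("iptables v1.8.7 (legacy)"))
```

A result of nf_tables means no upgrade step is needed; legacy means the alternative selection described above applies.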
Skip doing an unattended installation for the new Ubuntu VM as it might not add the user with administrative privileges.
Enable Guest Additions when installing the VM (VirtualBox only).
Always change the default NAT network setting to Bridged before booting the VM.
Install Extension Pack if using VirtualBox (recommended).
Install VMware Tools if using VMware (recommended).
ML on GPU jobs on VMs are not supported.
Install WSL through the Windows Store.
Install the Update KB5020030 (Windows 10 only).
Install Ubuntu 20.04 through WSL.
Enable systemd on Ubuntu WSL.
ML Jobs deployed on Linux cannot be resumed on WSL.
Though it is possible to run ML jobs on Windows machines with WSL, using Ubuntu 20.04 natively is highly recommended to avoid unpredictability and performance losses.
If you are using a dual-boot machine, make sure to use the wsl --shutdown
command before shutting down Windows and running Linux for ML jobs. Also, ensure your Windows machine is not in a hibernated state when you reboot into Linux.
CPU-only machines
Minimum System Requirements
We require you to specify CPU (MHz x no. of cores) and RAM, but your system must meet at least the following requirements before you decide to onboard it:
CPU: 2 GHz
RAM: 4 GB
Free Disk Space: 10 GB
Internet Download/Upload Speed: 4 Mbps / 0.5 MBps
If the above CPU has 4 cores, your available CPU would be around 8000 MHz. So if you want to onboard half your CPU and RAM on NuNet, you can specify 4000 MHz CPU and 2000 MB RAM.
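The arithmetic above can be written as a small helper. This is a sketch with a hypothetical function name; it uses 1 GHz = 1000 MHz and 1 GB = 1000 MB, which matches the rounding used in the text.

```python
def onboardable(cpu_ghz: float, cores: int, ram_gb: float, share: float):
    """Return (cpu_mhz, ram_mb) for the given share of a machine's resources.

    Uses 1 GHz = 1000 MHz and 1 GB = 1000 MB, matching the text.
    """
    total_mhz = cpu_ghz * 1000 * cores
    total_mb = ram_gb * 1000
    return round(total_mhz * share), round(total_mb * share)

# Minimum-spec machine from the text: 2 GHz, 4 cores, 4 GB RAM, half onboarded.
print(onboardable(2, 4, 4, 0.5))  # (4000, 2000)
```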
Recommended System Requirements
CPU: 3.5 GHz
RAM: 8-16 GB
Free Disk Space: 20 GB
Internet Download/Upload Speed: 10 Mbps / 1.25 MBps
GPU Machines
Minimum System Requirements
CPU: 3 GHz
RAM: 8 GB
GPU: 4 GB VRAM (NVIDIA, AMD, or Intel discrete GPU with manually installed drivers)
Free Disk Space: 50 GB
Internet Download/Upload Speed: 50 Mbps
Note: For AMD64 platforms, we recommend using HiveOS as it comes with all necessary drivers pre-installed. For other setups, proper GPU drivers must be manually installed. See the GPU Driver Installation section for instructions.
Recommended System Requirements
CPU: 4 GHz
RAM: 16-32 GB
GPU: 8-12 GB VRAM (NVIDIA, AMD, or Intel discrete GPU with manually installed drivers)
Free Disk Space: 100 GB
Internet Download/Upload Speed: 100 Mbps
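The minimum requirements listed above lend themselves to a quick pre-onboarding check. The sketch below encodes the minimum values for both machine kinds and reports which ones a machine misses. The dictionary keys, the function name, and the sample machine are made up for this example.

```python
# Minimum requirements copied from the lists above.
MINIMUMS = {
    "cpu": {"cpu_ghz": 2.0, "ram_gb": 4, "disk_gb": 10, "net_mbps": 4},
    "gpu": {"cpu_ghz": 3.0, "ram_gb": 8, "disk_gb": 50, "net_mbps": 50, "vram_gb": 4},
}

def shortfalls(machine: dict, kind: str = "cpu") -> list:
    """List the requirement names a machine fails for the given machine kind."""
    return [name for name, need in MINIMUMS[kind].items()
            if machine.get(name, 0) < need]

# Hypothetical machine description.
laptop = {"cpu_ghz": 2.4, "ram_gb": 8, "disk_gb": 40, "net_mbps": 20}
print(shortfalls(laptop, "cpu"))
print(shortfalls(laptop, "gpu"))
```

An empty list means the machine meets every minimum for that kind; a missing value such as VRAM is treated as zero.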
NuNet DMS requires properly installed GPU drivers to function correctly. We do not automatically install drivers to ensure compatibility and flexibility across different user setups.
For AMD64 Platforms:
We recommend using the Ubuntu-based HiveOS for the easiest setup.
If you prefer to use a different operating system or need to install drivers manually, please follow these steps:
NVIDIA GPUs:
Visit the NVIDIA Official Driver Downloads page.
Select your GPU model and operating system.
Download and install the recommended driver.
Install the NVIDIA Container Toolkit.
Reboot your system after installation.
AMD GPUs:
Visit the AMD Drivers and Support for Processors and Graphics page.
Select your GPU model and operating system.
Download and install the recommended driver.
Reboot your system after installation.
Along with the drivers, you will need to install amdgpu using ROCm for AMD GPUs. You can find the installation instructions here.
Make sure you select the rocm usecase when installing the amdgpu.
Intel Discrete GPUs:
Select your GPU model and operating system.
Download and install the recommended driver.
Reboot your system after installation.
Along with the drivers, you will need to install XPU SMI for Intel GPUs. You can find the installation instructions here.
For detailed instructions specific to your operating system, please refer to the documentation provided by NVIDIA, AMD, or Intel.
Note: Ensure that you have the correct permissions to install drivers on your system. On Linux systems, you may need to use sudo
or log in as root to install drivers.
Before starting, ensure that you have properly installed GPU drivers if you're using a GPU-enabled machine. For AMD64 platforms, we recommend using HiveOS for the easiest setup. For other configurations, refer to the GPU Driver Installation section for instructions.
This quick start guide will walk you through the process of setting up a Device Management Service (DMS) instance for the first time and getting it running. We'll cover creating identities, setting up capabilities, and running the DMS.
The NuNet CLI
The NuNet CLI is the command-line interface for interacting with the Nunet Device Management Service (DMS). It provides commands for managing keys, capabilities, configuration, running the DMS, and more. It's essential for setting up and administering your DMS instance.
Key Concepts
Actor: An independent entity in the Nunet system capable of performing actions and communicating with other actors.
Capability: Defines the permissions and restrictions granted to actors within the system.
Key: A cryptographic key pair used for authentication and authorization within the DMS.
You can find detailed documentation here.
Creating identities
The first step is to generate identities/keys and capability contexts. It is recommended to set up two keys: one for the user (default name user
) and another for the DMS (default name dms
).
A capability context is created with the nunet cap new <context>
command and it is anchored on a key with the same context name. The command automatically generates a key for the given context if not present. Keys can also be created manually with nunet key new <key>
if you prefer.
Note: If creating keys manually, make sure to use the same context name, otherwise it won't work.
In this example, we are going to set up two capability contexts:
First the user
then the dms instance.
You can create as many identities as you want, especially if you want to manage multiple DMS instances.
Each time a new identity is generated it will prompt the user for a passphrase. The passphrase is associated with the created identity, thus a different passphrase can be set up for each identity. If you prefer, it's possible to set a DMS_PASSPHRASE
environment variable to avoid the command prompt.
The key did
command returns a DID key for the specified identity.
Remember to secure your keys and capability contexts, as they control access to your NuNet resources. They are encrypted and stored under $HOME/.nunet
by default.
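For scripted setups, the identity steps can be assembled programmatically. The sketch below only builds the command lines for the nunet cap new and nunet key did commands described above, plus an environment carrying DMS_PASSPHRASE; it does not execute anything. The helper names are hypothetical, and it assumes a key created through nunet cap new carries the same name as its context, as the text describes.

```python
import os

def identity_commands(contexts=("user", "dms")):
    """Return the argv lists that create each capability context and print its DID."""
    commands = []
    for name in contexts:
        commands.append(["nunet", "cap", "new", name])  # also creates the key if absent
        commands.append(["nunet", "key", "did", name])  # prints the DID for that key
    return commands

def noninteractive_env(passphrase: str) -> dict:
    """Copy the current environment and add DMS_PASSPHRASE to skip the prompt."""
    env = dict(os.environ)
    env["DMS_PASSPHRASE"] = passphrase
    return env

for argv in identity_commands():
    print(" ".join(argv))
# To execute one: subprocess.run(argv, env=noninteractive_env(passphrase), check=True)
```

Passing the passphrase through a copied environment keeps it out of the parent process environment, though it is still visible to the child process.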
Using a Ledger Wallet
It is also possible to use a Ledger Wallet instead of creating a new key; this is recommended for user contexts, but you should not use it for the dms context as it needs the key to sign capability tokens.
To set up a user context with a Ledger Wallet, you need the ledger-cli
script from NuNet's ledger wallet tool. The tool uses the Eth application and specifically the first Eth account with signing of personal messages. Everything that needs to be signed (namely capability tokens) will be presented on your Nano's screen in plaintext so that you can inspect it.
You can get your Ledger wallet's DID with:
To create the capability context for the user
Setting up Capabilities
NuNet's network communication is powered by the NuActor System, a zero-trust system that utilizes fine-grained capabilities, anchored on DIDs, following the UCAN model.
Once both identities are created, you'll need to set up capabilities. Specifically:
Create capability contexts for both the user and each of your DMS instances.
Add the user's DID as a root anchor for the DMS capability context. This ensures that the DMS instance fully trusts the user, granting complete control over the DMS (the root capability).
If you want your DMS to participate in the public NuNet testnet (and eventually the mainnet), you'll need to set up capability anchors for your DMS:
Create a capability anchor to allow your DMS to accept public and deployment behavior invocations from authorized users and DMSs in the NuNet ecosystem.
Add this token to your DMS as a require anchor.
Request a capability token from NuNet to invoke public behaviors on the network.
Add the token as a provide anchor in your personal capability context.
Delegate to your DMS the ability to make public invocations using your token.
Add the delegation token as a provide anchor in your DMS.
Add a root anchor for your DMS context
You can do this by invoking the dms cap anchor
command:
Where <user-did>
is the user did created above in Creating identities and can be obtained by:
or if you are using a Ledger Wallet
Setup your DMS for the public testnet
The NuNet DID
Create a capability anchor for public and deployment behaviors
Create the grant
or if you are using a Ledger Wallet
Add the granted token as a require anchor
The first command grants NuNet-authorized users the capability to invoke public and deployment behaviors until December 31, 2025, and outputs a token.
The second command consumes the token and adds the require anchor for your DMS.
Ask NuNet for a public network capability token
To request tokens for participating in the testnet, please go to did.nunet.io and submit the DID you generated along with your GitLab username and an email address to receive the token. It's highly recommended that you use a Ledger hardware wallet for your keys.
Use the NuNet granted token to authorize public and deployment behavior invocations in the public network
3.1 Add the provide anchor to your personal context
or if you are using a Ledger Wallet
3.2 Delegate to your DMS
or if you are using a Ledger Wallet
3.3 Add the delegation token as a provide anchor in your DMS
The first command ingests the NuNet provided token and the last two commands use this token to delegate the public and deployment behavior capabilities to your DMS.
Running DMS
If everything was set up properly, you should be able to run:
Darwin users: If you plan to onboard your compute power to the network, you may need to run with
sudo
. See the optional features and permissions section for more information.
By default, DMS runs on port 9999.
If you want to contribute your computer's resources (CPU, RAM, GPU, storage) to the network, you have to onboard your machine.
Follow our Compute Provider Guide to get started.
Every node on the network can deploy workloads across available compute resources, given the necessary capabilities. Learn how deployments work by following our Deployments Guide.
Refer to the api
package README for the list of all endpoints. If you have a question, head over to the project's issue section and create an issue.
Whenever it starts, the DMS searches for a configuration file dms_config.json
in the following locations, in order of priority:
The current directory (.
)
$HOME/.nunet
/etc/nunet
The configuration file must be in JSON format and it does not support comments. It's recommended that only the parameters that need to be changed are included in the config file so that other parameters can retain their default values.
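The search order can be pictured as a first-match lookup. The following sketch mirrors the three locations listed above; it illustrates the described behavior and is not the DMS's own lookup code, and the function names are hypothetical.

```python
from pathlib import Path

def search_dirs():
    """Directories checked for dms_config.json, highest priority first."""
    return [Path.cwd(), Path.home() / ".nunet", Path("/etc/nunet")]

def find_config(dirs=None, name="dms_config.json"):
    """Return the first existing config file in priority order, or None."""
    for directory in (dirs if dirs is not None else search_dirs()):
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None

print(find_config())
```

Because the first match wins, a dms_config.json in the current directory shadows one in $HOME/.nunet or /etc/nunet.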
It's possible to manage configuration using the config
subcommand as well. nunet config set
sets each parameter individually, and nunet config edit
opens the config file in the default editor set by $EDITOR.
Run Two DMS Instances Side by Side
As a developer, you might find yourself needing to run two DMS instances, one acting as an SP (Service Provider) and the other as a CP (Compute Provider).
Step 1:
Clone the repository to two different directories. You might want to use descriptive directory names to avoid confusion.
Step 2:
You need to modify some configuration values so that the two DMS instances do not try to listen on the same port or use the same storage path. For example, p2p.listen_address
, rest.port
, general.user_dir
, etc. need to be different for two instances on the same host.
The dms_config.json
file can be used to modify these settings. Here is a sample config file that can be modified to your preference:
Prefer to use absolute paths and have a look at the config structure for more info.
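Before starting both instances, it can help to check that the two configurations really differ where they must. The sketch below compares the three settings named above. The nested layout of the dictionaries and every value in them are hypothetical illustrations, not the project's sample file; in particular, the format of p2p.listen_address is assumed.

```python
# Settings that the text says must differ between instances on one host.
MUST_DIFFER = ("p2p.listen_address", "rest.port", "general.user_dir")

def lookup(config: dict, dotted_key: str):
    """Read a dotted key from nested config data; None if it is not set."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node

def collisions(first: dict, second: dict) -> list:
    """Return the keys from MUST_DIFFER on which both configs hold the same value."""
    return [key for key in MUST_DIFFER
            if lookup(first, key) == lookup(second, key)]

# Hypothetical values for illustration only.
sp = {"rest": {"port": 9999}, "general": {"user_dir": "/home/dev/sp/.nunet"},
      "p2p": {"listen_address": ["/ip4/0.0.0.0/tcp/9000"]}}
cp = {"rest": {"port": 9998}, "general": {"user_dir": "/home/dev/cp/.nunet"},
      "p2p": {"listen_address": ["/ip4/0.0.0.0/tcp/9000"]}}
print(collisions(sp, cp))
```

A key left unset in both files also counts as a collision, since both instances would then fall back to the same default.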
Some packages contain tests, and it is always best to run them to ensure there are no broken tests before submitting any changes. Before running the tests, the Firecracker executor requires some test data, such as a kernel file, which can be downloaded with:
After the download is complete, all unit tests can be run with the following command. It's necessary to include the unit
tag due to the existence of files that contain functional and integration tests.
TODO: talk about e2e tests too and put a bash command on how to execute it: make e2e_test
Help in contributing tests is always appreciated :)
NuNet is a computing platform that provides globally distributed and optimized computing power and storage for decentralized networks, by connecting data owners and computing resources with computational processes in demand of these resources. NuNet provides a layer of intelligent interoperability between computational processes and physical computing infrastructures in an ecosystem which intelligently harnesses latent computing resources of the community into the global network of computations.
Detailed information about the NuNet platform, concepts, architecture, models, stakeholders can be found in these two papers:
DMS (Device Management Service) acts as the foundation of the NuNet platform, orchestrating the complex interactions between various computing resources and users. DMS implementation is structured into packages, creating a more maintainable, scalable, and robust codebase that is easier to understand, test, and collaborate on. Here are the existing packages in DMS and their purposes:
actor
: Contains the NuActor framework for secure actor oriented programming in decentralized systems.
dms
: Responsible for starting the whole application and core DMS functionality such as onboarding, job orchestration, job and resource management, etc.
internal
: Code that will not be imported by any other packages and is used only on the running instance of DMS. This includes all configuration-related code, background tasks, etc.
db
: Database used by the DMS.
storage
: Disk storage management on each DMS for data related to DMS and jobs deployed by DMS. It also acts as an adapter to external storage services.
api
: All API functionality to interact with the DMS.
cmd
: Command line functionality and tools.
network
: All network-related code such as p2p communication, IP over Libp2p, and other networks that might be needed in the future.
executor
: Responsible for executing the jobs received by the DMS. Interface to various executors such as Docker, Firecracker, etc.
observability
: Logs, traces, and everything related to observability.
plugins
: Defined entry points and specs for third-party plugins, registration, and execution of plugin code.
types
: Defines data structures and interfaces that are used across the whole DMS component by different packages.
utils
: Utility tools and functionalities used by other packages.
lib
: External libs being used in DMS.
tokenomics
: Interaction with blockchain for the crypto-micropayments layer of the platform (not yet implemented).
test
: Contains some automated tests, not including unit tests.
maint-scripts
: Utility scripts for building / development assistance and runtime.
examples
: Examples of ensembles to be used to deploy jobs on NuNet platform.
docs
: Documentation about the main functionalities of DMS, such as onboarding, deployments, and how to create a restricted network.
specs
: Platform components specifications.
Main concepts of the architecture of DMS, the main component of the NuNet platform, can be found in the Yellow Paper.
Current key functional areas of DMS:
Actor-based system: NuNet's network communication is powered by the NuActor System, a zero-trust system that utilizes fine-grained capabilities, anchored on DIDs, following the UCAN model.
Node management: Supports onboarding/offboarding of nodes and manages peer connections.
Compute ensembles: Defines ensembles as collections of logical nodes and allocations that represent compute workloads (as explained here). Each allocation is a compute job assigned to a node.
Orchestration: Deploys an ensemble across nodes by fulfilling the specified constraints. This is done using a constraint satisfaction process where bids are requested from nodes and evaluated based on the required resources and locations.
Supervision: Once deployed, ensembles are continuously monitored.
VM/container lifecycle management: Allows creation, customization, and management of containers and virtual machines on the network.
Resource management: Controls different types of compute resources (VMs, CPUs, GPUs).
Observability: Collects information about events happening in the network, enabling real-time or post-mortem analysis and visualization.
The global class diagram for the DMS is shown below. Global Class Diagram
Find additional data models within specific packages.
In addition to the relevant links added in the sections above, you can also find useful links here: NuNet Links.
Last updated: 2025-01-23 01:10:44.583855 File source: link on GitLab
This sub package contains CloverDB implementation of the database interfaces.
Here is a quick overview of the contents of this package:
README: Current file which is aimed towards developers who wish to use and modify the database functionality.
generic_repository: This file implements the methods of GenericRepository
interface.
generic_entity_repository: This file implements the methods of GenericEntityRepository
interface.
deployment: This file contains implementation of DeploymentRequestFlat
interface.
elk_stats: This file contains implementation of RequestTracker
interface.
firecracker: This file contains implementation of VirtualMachine
interface.
machine: This file contains implementation of interfaces defined in machine.go.
utils: This file contains utility functions with respect to clover implementation.
All files with *_test.go
naming convention contain unit tests with respect to the specific implementation.
The class diagram for the clover
package is shown below.
Source file
Rendered from source file
GenericRepository
NewGenericRepository
signature: NewGenericRepository[T repositories.ModelType](db *clover.DB) -> repositories.GenericRepository[T]
input: clover Database object
output: Repository of type db.clover.GenericRepositoryclover
NewGenericRepository
function creates a new instance of GenericRepositoryclover
struct. It initializes and returns a repository with the provided clover database.
Interface Methods
See db
package readme for methods of GenericRepository
interface
query
signature: query(includeDeleted bool) -> *clover_q.Query
input: boolean value to choose whether to include deleted records
output: CloverDB query object
query
function creates and returns a new CloverDB Query object. An input value of false
adds a condition that excludes deleted records.
queryWithID
signature: queryWithID(id interface{}, includeDeleted bool) -> *clover_q.Query
input #1: identifier
input #2: boolean value to choose whether to include deleted records
output: CloverDB query object
queryWithID
function creates and returns a new CloverDB Query object. The provided inputs are added to query conditions. The identifier will be compared to primary key field value of the repository.
Providing includeDeleted
as false
will add a condition to exclude the deleted records.
GenericEntityRepository
NewGenericEntityRepository
signature: NewGenericEntityRepository[T repositories.ModelType](db *clover.DB) -> repositories.GenericEntityRepository[T]
input: clover Database object
output: Repository of type db.clover.GenericEntityRepositoryclover
NewGenericEntityRepository
creates a new instance of GenericEntityRepositoryclover
struct. It initializes and returns a repository with the provided clover database instance and name of the collection in the database.
Interface Methods
See db
package readme for methods of GenericEntityRepository
interface.
query
signature: query() -> *clover_q.Query
input: None
output: CloverDB query object
query
function creates and returns a new CloverDB Query object.
db.clover.GenericRepositoryClover
: This is a generic repository implementation using clover as an ORM
db.clover.GenericEntityRepositoryClover
: This is a generic single entity repository implementation using clover as an ORM
For other data types refer to db
package readme.
Refer to *_test.go
files for unit tests of different functionalities.
List of issues
All issues that are related to the implementation of db
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:45.704966 File source: link on GitLab
The hardware package is responsible for handling the hardware related functionalities of the DMS.
Here is a quick overview of the contents of this package:
cpu: This package contains the functionality related to the CPU of the device.
ram.go: This file contains the functionality related to the RAM.
disk.go: This file contains the functionality related to the Disk.
gpu: This package contains the functionality related to the GPU of the device.
GetMachineResources()
signature: GetMachineResources() (types.MachineResources, error)
input: None
output: types.MachineResources
output(error): error
GetCPU()
signature: GetCPU() (types.CPU, error)
input: None
output: types.CPU
output(error): error
GetRAM()
signature: GetRAM() (types.RAM, error)
input: None
output: types.RAM
output(error): error
GetDisk()
signature: GetDisk() (types.Disk, error)
input: None
output: types.Disk
output(error): error
The hardware types can be found in the types package.
The tests can be found in the *_test.go
files in the respective packages.
Last updated: 2025-01-23 01:10:42.414871 File source: link on GitLab
NuActor
is a framework designed for secure actor oriented programming in decentralized systems. The framework utilizes zero trust interactions, whereby every message is authenticated individually at the point of interaction. The system supports fine-grained capabilities, anchored in decentralized identifiers (see DID) and effected with user controlled authorization networks (see UCAN).
Decentralized systems are distributed systems where there are different stakeholders and controlling entities who are mutually distrustful. Actors are ideally suited for modeling and programming such systems, as they are able to express concurrency, distribution, and agency on behalf of their controllers.
However, given the open ended computing nature of decentralized systems, there is a fundamental problem in securing interactions. Because the system is open, there is effectively no perimeter; the messages are coming from the Internet, and can potentially originate in malicious or hostile actors.
NuActor takes the following approach:
The only entity an actor can fully trust is itself and its controller.
All messages invoking a behavior carry with them capability tokens that authorize them to perform the invocation.
Invocations are checked at dispatch so that it is always verified whether an invocation is allowed, anchored on the entities the actor trusts for the required capabilities.
There is no central authority; every entity (identified by a DID) can issue their own capability tokens and anchor trust wherever they want.
There are certain entities in the open public networks that may be marginally trusted to vet users (KYC) for invoking public behaviors. The set of such entities is open, and everyone is free to trust whoever they want. The creators of the network at bootstrap are good candidates for such entities.
Trust is ephemeral and can be revoked at all times.
In effect, users are in control of authorization in the network (UCAN!)
Capabilities are defined in a hierarchical namespace, akin to the UNIX file system structure. The root capability, which implicitly has all other capabilities, is /
. Every other capability extends this path, separating the namespace with additional /
s. A capability is narrower than another if it is a subpath in the UNIX sense. So /A
implies /A/B
and so on, but /A
does not imply /B
.
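The path rule above can be sketched as a small Go function. This is an illustrative sketch only: the function name implies is hypothetical and not taken from the DMS code base, and it assumes capabilities are plain slash-separated strings.

```go
package main

import (
	"fmt"
	"strings"
)

// implies reports whether holding capability `held` also grants `wanted`,
// that is, whether `wanted` equals `held` or lies beneath it in the path
// hierarchy. Hypothetical helper; not part of the actor package.
func implies(held, wanted string) bool {
	if held == "" {
		return false // an empty capability grants nothing
	}
	if held == "/" {
		// The root capability implicitly contains every other capability.
		return strings.HasPrefix(wanted, "/")
	}
	held = strings.TrimSuffix(held, "/")
	// Match on whole path components so that /A does not imply /AB.
	return wanted == held || strings.HasPrefix(wanted, held+"/")
}

func main() {
	fmt.Println(implies("/A", "/A/B"))     // true
	fmt.Println(implies("/A", "/B"))       // false
	fmt.Println(implies("/A", "/AB"))      // false
	fmt.Println(implies("/", "/dms/node")) // true
}
```

Comparing whole path components rather than raw string prefixes is what keeps /A from accidentally covering /AB.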
Behaviors have names that directly map to capabilities. So the behavior namespace is also hierarchical, allowing for easy automated matching of behaviors to capabilities.
Capabilities are expressed with a token, which is a structured object signed by the private key of the issuer. The issuer is in the token as a DID, which allows any entity inspecting the token to verify by retrieving the public key associated with the DID. Typically these are key DIDs, which embed the public key directly.
The structure of the token is as follows:
The Subject
is the DID of the entity to which the Issuer
grants (if the chain is empty) or delegates the capabilities listed in the Capability
field and the broadcast topics listed in the Topic
field. The audience may be empty, but when present it restricts the receiver of invocations to a specified entity.
The Action
can be any of Delegate
, Invoke
or Broadcast
, with revocations to be added in the very near future.
If the Action
is Delegate
then the Issuer
confers to the Subject
the ability to further create new tokens, chained on this one.
If the Action
is Invoke
or Broadcast
, then the token confers to the Subject
the capability to make an invocation or broadcast to a behavior. Such tokens are terminal and cannot be chained further.
The Chain
field of the token inlines the chain of tokens (could be a single one) on which the capability transfer is anchored.
Note that the delegation spread can be restricted by the issuer of a token using the Depth
field. If set, it is the maximum chain depth at which a token can appear. If it appears deeper in the chain, the token chain fails verification.
Finally, all capabilities have an expiration time (in UNIX nanoseconds). An expired token cannot be used any more and fails verification.
In order to sign and verify token chains, the receiver needs to install some trust anchors. Specifically, we distinguish 3 types of anchors:
root anchors which are DIDs that are fully trusted for input with implicit root capability. Any valid chain anchored on one of our roots will be admissible.
require anchors which are tokens that act as side chains for marginal input trust. These tokens admit a chain anchored in their subject, as long as the capability and depth constraints are satisfied.
provide anchors which are tokens that anchor the actor's output invocation and broadcast tokens. These are delegations which the actor can use to prove that it has the required capabilities, besides self-signing.
The token chain is verified with strict rules:
The entire chain must not have expired.
Each token in the chain cannot expire before its chain.
Each token must match the Issuer with the Subject of its chain.
Each token in the chain can only narrow (attenuate) the capabilities of its chain.
Each token in the chain can only narrow the audience; an empty audience ("to whom it may concern") can only be narrowed once to an audience DID and all chains build on top must concern the same audience.
The chain of a token can only delegate.
The signature must verify.
The whole chain must recursively verify.
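A subset of these rules can be sketched in Go as follows. This is a simplified illustration, not the DMS implementation: the Token type, its field layout, and the Verify function are hypothetical, DIDs are treated as opaque strings, and only the expiry, Issuer/Subject matching, attenuation, delegate-only chain, and recursion rules are covered. Signature checks, audience narrowing, depth limits, and trust anchors are deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Action mirrors the three token actions named in the text.
type Action string

const (
	Delegate  Action = "delegate"
	Invoke    Action = "invoke"
	Broadcast Action = "broadcast"
)

// Token is a simplified, hypothetical token; the real DMS type differs.
type Token struct {
	Issuer     string   // DID of the issuer, an opaque string here
	Subject    string   // DID of the entity receiving the capabilities
	Action     Action   // Delegate, Invoke or Broadcast
	Capability []string // capability paths such as "/dms/deployment"
	Expire     uint64   // expiration time in UNIX nanoseconds
	Chain      *Token   // token this one is anchored on; nil if none
}

// covers reports whether path `wanted` is equal to or beneath `held`.
func covers(held, wanted string) bool {
	if held == "/" {
		return strings.HasPrefix(wanted, "/")
	}
	return wanted == held || strings.HasPrefix(wanted, held+"/")
}

// Verify checks a subset of the chain rules at time now (UNIX nanoseconds).
func Verify(t *Token, now uint64) error {
	if now >= t.Expire {
		return errors.New("token expired")
	}
	parent := t.Chain
	if parent == nil {
		// End of the chain; anchoring in a trusted root is checked elsewhere.
		return nil
	}
	if parent.Action != Delegate {
		return errors.New("chain token is not a delegation")
	}
	if t.Issuer != parent.Subject {
		return errors.New("issuer does not match subject of chain")
	}
	for _, c := range t.Capability {
		narrowed := false
		for _, p := range parent.Capability {
			if covers(p, c) {
				narrowed = true
				break
			}
		}
		if !narrowed {
			return fmt.Errorf("capability %q is not covered by the chain", c)
		}
	}
	return Verify(parent, now) // the whole chain must verify recursively
}

func main() {
	root := &Token{Issuer: "did:example:root", Subject: "did:example:alice",
		Action: Delegate, Capability: []string{"/dms"}, Expire: 200}
	leaf := &Token{Issuer: "did:example:alice", Subject: "did:example:bob",
		Action: Invoke, Capability: []string{"/dms/deployment"}, Expire: 100, Chain: root}
	fmt.Println("valid chain:", Verify(leaf, 50))
	fmt.Println("after expiry:", Verify(leaf, 150))
}
```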
actor
package
The Go implementation of NuActor lives in the actor
package of DMS.
To use it:
The network substrate for NuActor is currently implemented with libp2p, with broadcast using gossipsub.
Each actor has a key pair for signing its messages; the actor's id is the public key itself and is embedded in every message it sends. The private key for the actor lives inside the actor's SecurityContext
.
In general:
each actor has its own SecurityContext
; however, if the actor wants to create multiple subactors and act as an ensemble, it can share it.
the key pair is ephemeral; however, the root actor in the process has a persistent key pair, which matches the libp2p key and Peer ID. This makes the actor reachable by default given its Peer ID or DID.
every actor in the process shares a DID, which is the ID of the root actor.
Each Security Context
is anchored in a process wide CapabilityContext
, which stores anchors of trust and ephemeral tokens consumed during actor interactions.
The CapabilityContext
itself is anchored on a TrustContext, which contains the private key for the root actor and the process itself.
The following code shows how to send a message at the call site:
At the receiver this is how we can react to the message:
Notice the _
for errors; please don't do this in production.
Interactive invocations are a combination of a synchronous send and a wait for a reply.
At the call site:
At the receiver this is how we can create an interactive behavior:
Again, notice the _
for errors; please don't do this in production.
We can easily broadcast messages to all interested parties in a topic.
At the broadcast site:
At the receiver:
Notice all these defer msg.Discard()
in the examples above; this is necessary to ensure deterministic cleanup of tokens exchanged during the interaction. Please do not forget that.
Last updated: 2025-01-23 01:10:44.316556 File source:
The db
package contains the configuration and functionality of the database used by the DMS.
Here is a quick overview of the contents of this package:
Files
Subpackages
The class diagram for the db
package is shown below.
Source file
Rendered from source file
There are two types of interfaces defined to cover database operations:
GenericRepository
GenericEntityRepository
These interfaces are described below.
GenericRepository Interface
GenericRepository
interface defines basic CRUD operations and standard querying methods. It is defined with generic data types. This allows it to be used for any data type.
interface definition: type GenericRepository[T ModelType] interface
The methods of GenericRepository
are as follows:
Create
signature: Create(ctx context.Context, data T) -> (T, error)
input #1: Go context
input #2: Data to be added to the database. It should be of type used to initialize the repository
output (success): Data type used to initialize the repository
output (error): error message
Create
function adds a new record to the database.
Get
signature: Get(ctx context.Context, id interface{}) -> (T, error)
input #1: Go context
input #2: Identifier of the record. Can be any data type
output (success): Data with the identifier provided. It is of type used to initialize the repository
output (error): error message
Get
function retrieves a record from the database by its identifier.
Update
signature: Update(ctx context.Context, id interface{}, data T) -> (T, error)
input #1: Go context
input #2: Identifier of the record. Can be any data type
input #3: New data of type used to initialize the repository
output (success): Updated record of type used to initialize the repository
output (error): error message
Update
function modifies an existing record in the database using its identifier.
Delete
signature: Delete(ctx context.Context, id interface{}) -> error
input #1: Go context
input #2: Identifier of the record. Can be any data type
output (success): None
output (error): error message
Delete
function deletes an existing record in the database using its identifier.
Find
signature: Find(ctx context.Context, query Query[T]) -> (T, error)
input #1: Go context
input #2: Query of type db.query
output (success): Result of query having the data type used to initialize the repository
output (error): error message
Find
function retrieves a single record from the database based on a query.
FindAll
signature: FindAll(ctx context.Context, query Query[T]) -> ([]T, error)
input #1: Go context
input #2: Query of type db.query
output (success): Lists of records based on query result. The data type of each record will be what was used to initialize the repository
output (error): error message
FindAll
function retrieves multiple records from the database based on a query.
GetQuery
signature: GetQuery() -> Query[T]
input: None
output: Query of type db.query
GetQuery
function returns an empty query instance for the repository's type.
GenericEntityRepository Interface
GenericEntityRepository
defines basic CRUD operations for repositories handling a single record. It is defined with generic data types. This allows it to be used for any data type.
interface definition: type GenericEntityRepository[T ModelType] interface
The methods of GenericEntityRepository
are as follows:
Save
signature: Save(ctx context.Context, data T) -> (T, error)
input #1: Go context
input #2: Data to be saved of type used to initialize the database
output (success): Updated record of type used to initialize the repository
output (error): error message
Save
function adds or updates a single record in the repository.
Get
signature: Get(ctx context.Context) -> (T, error)
input: Go context
output (success): Record of type used to initialize the repository
output (error): error message
Get
function retrieves the single record from the database.
Clear
signature: Clear(ctx context.Context) -> error
input: Go context
output (success): None
output (error): error
Clear
function removes the record and its history from the repository.
History
signature: History(ctx context.Context, query Query[T]) -> ([]T, error)
input #1: Go context
input #2: query of type db.query
output (success): List of records of the repository's type
output (error): error
History
function retrieves previous records from the repository which satisfy the query conditions.
GetQuery
signature: GetQuery() -> Query[T]
input: None
output: New query of type db.query
GetQuery
function returns an empty query instance for the repository's type.
db.Query
: This contains parameters related to a query that is passed to the database.
db.QueryCondition
: This contains parameters defining a query condition.
GenericRepository
has been initialised for the following data types:
types.DeploymentRequestFlat
types.VirtualMachine
types.PeerInfo
types.Machine
types.Services
types.ServiceResourceRequirements
types.Connection
types.ElasticToken
GenericEntityRepository
has been initialised for the following data types:
types.FreeResources
types.AvailableResources
types.Libp2pInfo
types.MachineUUID
The unit tests for utility functions are defined in utils_test.go
. Refer to *_test.go
files for unit tests of various implementations covered in subpackages.
List of issues
All issues that are related to the implementation of db
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:46.065248 File source:
Table of Contents
In NuNet, compute workloads are structured as compute ensembles. Here, we discuss how an ensemble can be created, deployed, and supervised in the NuNet network.
An ensemble is a collection of logical nodes and allocations. Nodes represent the hardware where the compute workloads run. Allocations are the individual compute jobs that comprise the workload. Each allocation is assigned to a node, and a node can have multiple allocations assigned to it.
All allocations in the ensemble are assigned a private IP address in the 10/8 range and are connected with a virtual private network, implemented using IP over libp2p. All allocations can reach each other through the VPN. Allocation IP addresses can be discovered internally in the ensemble using DNS: each allocation has a name and a DNS name, which by default is just the allocation name in the .internal
domain.
Allocation and Node names within an ensemble must be unique. The ensemble as a whole has a globally unique ID (a random UUID).
Fundamentally the ensemble configuration has the following structure:
A map of allocations, mapping allocation names to configuration for individual allocations.
A map of nodes, mapping node names to configuration for individual nodes.
A list of edges between nodes, encoding specific logical edge constraints.
There are additional fields in the data structure which allow us to include ssh keys and scripts in the configuration, as well as supervision strategies and policies.
An allocation's configuration has the following structure:
The name of the allocation executor; this is the environment in which the actual compute job is executed. We currently support docker and firecracker VMs, but we plan to also support WASM and generally any sandbox/VM that makes sense for users.
The resources required to run the allocation, such as memory, cpu cores, gpus, and so on.
The execution details, which encodes the executor specific configuration of the allocation.
The DNS name for internal name resolution of the allocation. This can be omitted, in which case the allocation's name becomes the DNS name.
The list of ssh keys to drop in the allocation, so that administrators can ssh into the allocation.
The list of scripts to execute during provisioning, in execution order.
Finally, the user can also specify the application specific health check to be performed by the supervisor, so that the health of the application can be ascertained and failures detected.
A node's configuration has the following structure:
The list of allocations that are assigned to the node
The configuration of mapping public ports to ports in allocations
The Location constraints for the node
An optional field for explicitly specifying the peer on which the node should be assigned, allowing users and organizations to bring their own nodes into the mix, for instance for hosting sensitive data.
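The relationship between allocations and nodes described above can be sketched with a few Go types and a consistency check. All type and field names here are hypothetical and do not reflect the actual DMS configuration schema; most fields (resources, execution details, ports, location, edges) are left out to keep the sketch short.

```go
package main

import "fmt"

// Allocation is a hypothetical, heavily reduced allocation configuration.
type Allocation struct {
	Executor string // for example "docker" or "firecracker"
	DNSName  string // optional internal DNS name
}

// Node is a hypothetical, heavily reduced node configuration.
type Node struct {
	Allocations []string // names of the allocations assigned to this node
	Peer        string   // optional: pin the node to a specific peer
}

// Ensemble maps names to configurations; using maps keeps names unique.
type Ensemble struct {
	Allocations map[string]Allocation
	Nodes       map[string]Node
}

// Validate checks that every allocation referenced by a node exists and that
// every allocation is assigned to exactly one node.
func (e Ensemble) Validate() error {
	assigned := make(map[string]string) // allocation name -> node name
	for nodeName, node := range e.Nodes {
		for _, a := range node.Allocations {
			if _, ok := e.Allocations[a]; !ok {
				return fmt.Errorf("node %q references unknown allocation %q", nodeName, a)
			}
			if other, dup := assigned[a]; dup {
				return fmt.Errorf("allocation %q assigned to both %q and %q", a, other, nodeName)
			}
			assigned[a] = nodeName
		}
	}
	for a := range e.Allocations {
		if _, ok := assigned[a]; !ok {
			return fmt.Errorf("allocation %q is not assigned to any node", a)
		}
	}
	return nil
}

func main() {
	e := Ensemble{
		Allocations: map[string]Allocation{
			"web": {Executor: "docker"},
			"db":  {Executor: "firecracker"},
		},
		Nodes: map[string]Node{
			"node1": {Allocations: []string{"web", "db"}},
		},
	}
	fmt.Println(e.Validate())
}
```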
In the near future, we also plan to support directly parsing kubernetes job description files. We also plan to provide a declarative format for specifying large ensembles so that it is possible to succinctly describe a 10k GPU ensemble for training an LLM and so on.
It is worth reiterating that ensembles carry with them the constraints specified by the user. This allows the user to have fine-grained control of their ensemble deployment and ensure that certain requirements are met.
In DMS v0.5 we support the following constraints:
Resources for an allocation, such as memory, core count, gpu details, and so on.
Location for nodes; the user can specify the region, city, etc., all the way to choosing a particular ISP. Location constraints can also be negative, so that a node will not be deployed in certain locations, e.g. because of regulatory considerations such as GDPR.
Edge Constraints, which specify the relationship between nodes in the allocation in terms of available bandwidth and round trip time.
In subsequent releases we plan to add additional constraints (e.g. existence of a contract, price range, explicit datacenter placement, energy sources and so on) and generalize the constraint expression language as graphs.
Given an ensemble specification, the core functionality of the NuNet network is to find and assign peers to nodes that satisfy the constraints of the ensemble. The system treats the deployment as a constraint satisfaction problem over permutations of available peers (compute nodes) on which the user is authorized to deploy. The process of deploying an ensemble is called orchestration. In the following we summarize how deployment orchestration is performed.
Ensemble deployment is initiated with a user invoking the /dms/node/deployment/new
behavior on the node which is willing to run an orchestrator for them; this can be just the user's private DMS running on their laptop. The node accepting the invocation creates the orchestrator actor inside its process space, initiates the deployment orchestration, and returns the ensemble identifier to the user. The user can use this identifier to poll the status of the deployment and control the ensemble through the orchestrator actor. The user also specifies a timeout on how long the deployment process should take before declaring failure. This is simply the expiration on the message that invokes /dms/node/deployment/new
.
The orchestrator then proceeds to request bids for each node in the ensemble. This is accomplished by broadcasting a message to the /dms/deployment/request
behavior in the /nunet/deployment
broadcast topic. The deployment request contains a mapping of node names in the ensemble, together with their aggregate (for all allocations to be assigned in the node) resource constraints, together with location and other constraints that can restrict the search space.
In order for this to proceed, the orchestrator must have the appropriate capabilities; only provider nodes that accept the user's capabilities will respond to the broadcast message. The response to the bid request is a bid for a node in the ensemble, by sending a message to the /dms/deployment/bid
behavior in the orchestrator. This also implies that the nodes that submit such bids must have appropriate capabilities accepted by the orchestrator.
Given the appropriate capabilities, the orchestrator collects bids until it has a sufficient number of bids or a timeout that ensures prompt progress in the deployment. If the orchestrator doesn't have bids for all nodes, then it rebroadcasts its bid request, excluding peers that have already submitted a bid. This continues until there are bids for all nodes or the deployment times out, at which point a deployment failure is declared.
Note that in the case of node pinning, where a specific peer is assigned to an ensemble node in advance (i.e. when a user brings their own nodes into the ensemble), bid requests are not broadcast but rather directly invoked on the peer.
Next, the orchestrator generates permutations of assignments of peers to nodes and evaluates the constraints. Some constraints can be directly rejected without measurement, for instance round trip latency constraints can be rejected by using speed of light calculations that provide a lower bound on physically realizable latency. We plan to do the same with bandwidth constraints, given the node measured link capacity and the throughput bound equation that governs TCP's behavior given bottleneck bandwidth and RTT.
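The speed-of-light rejection mentioned above can be sketched in Go as follows. This is an illustration under stated assumptions, not the orchestrator's code: the function names are hypothetical, peer locations are assumed to be known as latitude and longitude in degrees, distance is the great-circle (haversine) distance on a spherical Earth, and the bound uses the speed of light in vacuum. Real links are slower, so the computed value is a safe lower bound and a constraint below it can never be met.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

const (
	earthRadiusKm    = 6371.0     // mean Earth radius
	lightSpeedKmPerS = 299792.458 // speed of light in vacuum
)

// greatCircleKm returns the haversine distance between two points given in
// degrees of latitude and longitude.
func greatCircleKm(lat1, lon1, lat2, lon2 float64) float64 {
	rad := func(deg float64) float64 { return deg * math.Pi / 180 }
	dLat, dLon := rad(lat2-lat1), rad(lon2-lon1)
	a := math.Pow(math.Sin(dLat/2), 2) +
		math.Cos(rad(lat1))*math.Cos(rad(lat2))*math.Pow(math.Sin(dLon/2), 2)
	// Clamp to 1 to guard against floating point rounding before Asin.
	return 2 * earthRadiusKm * math.Asin(math.Min(1, math.Sqrt(a)))
}

// minRTT is the physical lower bound on the round trip time between two
// locations: twice the distance divided by the speed of light.
func minRTT(lat1, lon1, lat2, lon2 float64) time.Duration {
	seconds := 2 * greatCircleKm(lat1, lon1, lat2, lon2) / lightSpeedKmPerS
	return time.Duration(seconds * float64(time.Second))
}

// feasible reports whether an RTT constraint could possibly hold; a false
// result lets a candidate assignment be rejected without any measurement.
func feasible(maxRTT time.Duration, lat1, lon1, lat2, lon2 float64) bool {
	return minRTT(lat1, lon1, lat2, lon2) <= maxRTT
}

func main() {
	// Approximate coordinates for Amsterdam and Sydney.
	bound := minRTT(52.37, 4.90, -33.87, 151.21)
	fmt.Println("lower bound on RTT:", bound)
	fmt.Println("10ms constraint feasible:",
		feasible(10*time.Millisecond, 52.37, 4.90, -33.87, 151.21))
}
```

A true result from feasible does not mean the constraint holds; it only means the pair still has to be measured.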
Once a candidate assignment is deemed viable, the orchestrator proceeds to measure specific constraints for satisfiability. This involves measuring round trip time and bandwidth between node pairs, and is accomplished by invoking the /dms/deployment/constraint/edge
behavior.
If a candidate assignment satisfies the constraints, the orchestrator proceeds with committing and provisioning the deployment. This is done with a two phase commit process: first the orchestrator sends a commit message to all peers to ensure that the resources are still available (nodes don't lock resources when submitting a bid), by invoking the /dms/deployment/commit
behavior. If any node fails to commit, the candidate deployment is reverted and the orchestrator starts anew; revert happens with the /dms/deployment/revert
behavior.
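The commit phase described above can be sketched in Go. This is a simplified illustration: the Provider interface and fakeProvider type are hypothetical stand-ins for peers reached through the /dms/deployment/commit and /dms/deployment/revert behaviors, commits are issued sequentially, and only the providers that already committed are reverted, which is an assumption about how the revert is scoped.

```go
package main

import (
	"errors"
	"fmt"
)

// Provider is a hypothetical stand-in for a peer that can be asked to commit
// its resources to a deployment or to revert that commitment.
type Provider interface {
	Commit() error
	Revert()
}

// commitAll asks every provider to commit. On the first failure it reverts
// the providers that had already committed and returns the error, so the
// caller can start over with a new candidate assignment.
func commitAll(providers []Provider) error {
	for i, p := range providers {
		if err := p.Commit(); err != nil {
			for _, done := range providers[:i] {
				done.Revert()
			}
			return fmt.Errorf("provider %d failed to commit: %w", i, err)
		}
	}
	return nil
}

// fakeProvider simulates a peer whose resources may have been taken since it
// submitted its bid (bids do not lock resources).
type fakeProvider struct {
	available bool
	committed bool
}

func (f *fakeProvider) Commit() error {
	if !f.available {
		return errors.New("resources no longer available")
	}
	f.committed = true
	return nil
}

func (f *fakeProvider) Revert() { f.committed = false }

func main() {
	a := &fakeProvider{available: true}
	b := &fakeProvider{available: false}
	c := &fakeProvider{available: true}
	err := commitAll([]Provider{a, b, c})
	fmt.Println(err, a.committed, b.committed, c.committed)
}
```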
If all nodes successfully commit, the orchestrator proceeds to provision the deployment by sending allocation details to the relevant nodes and creating the VPN. This is initiated by invoking the /dms/deployment/allocate
behavior on the provider nodes, which creates a new allocation actor. Subsequently, the orchestrator assigns IP addresses to allocations and creates the VPN (what we call the subnet) by invoking the appropriate behaviors on the allocation actors, and then starts the allocations. Once all nodes provision, the deployment is now considered running and enters supervision.
The deployment will keep running until the user shuts it down, as long as the user's agreement with the provider is active; in the near future we will also support explicitly specifying durations for running ensembles, and the ability to modify running ensembles in order to support mechanisms like auto scaling.
TODO
In order to discuss authorization flow for deployment in the NuNet network, we need to distinguish certain actors in the system in the course of an ensemble's lifetime.
Specifically, we introduce the following notation:
Let's call U
, the user as an actor.
Let's call O
the orchestrator, which is an actor living inside a DMS instance (node) for which the user is authorized to initiate a deployment. We call the node where the orchestrator runs N_o
. Note that the DID of the orchestrator actor will be the same as the DID of the node on which it runs, but it will have an ephemeral actor ID.
Let's call P_i
the set of compute providers that are willing to accept deployment requests from U
.
Let's call N_{P_i,j}
the DMS nodes controlled by the providers that are willing to accept deployments from users.
And finally let's call A_i
the allocation actor for each running allocation. The DID of each allocation actor will be the same as the DID of the node on which the allocation is running, but it will have an ephemeral actor ID.
Also note that we have certain identifiers pertaining to these actors; let's define the following notation:
DID(x)
is the DID of actor x
; in general this is the DID that identifies the node on which the actor is running.
ID(x)
is the ID of actor x
; this is generally ephemeral, except for node root actors which have persistent identities matching their DID.
Peer(x)
is the peer ID of a node/actor x
.
Root(x)
is the DID of the root anchor of trust for the node/actor x
.
Using the notation above we can enumerate the behavior namespaces and requisite capabilities for deployment of an ensemble:
Invocations from U
to N_o
are in the /dms/node/deployment
namespace
Invocations from O
to N_{P_i,j}
for deployment bids:
broadcast /dms/deployment/request
via the /nunet/deployment
topic
unicast /dms/deployment/request
for pinned ensemble nodes
Messages from N_{P_i,j}
to O
:
/dms/deployment/bid
as the reply to a bid request
Invocations from O
to N_{P_i,j}
for deployment control are in the /dms/deployment
namespace.
Invocations from O
to A_i
are in the /dms/allocation
namespace and are dynamically granted programmatically.
Invocations from O
to N_{P_i,j}
for allocation control are in the dynamic /dms/ensemble/<ensemble-id>
namespace and are dynamically granted programmatically.
This creates the following structure:
U
must be authorized with /dms/node/deployment
capability in N_o
N_o
must be authorized with /dms/deployment
capability in N_{P_i,j}
so that the orchestrator can make the appropriate invocations.
N_{P_i,j}
must be authorized with /dms/deployment/bid
capability on N_o
so that it can submit bids to the orchestrator.
Note that the decentralized structure and fine grained capability model of the NuActor system allows for very tight access control. This ensures that:
Orchestrators can only run on DMS instances where the user is authorized to initiate deployment.
Bid requests will only be accepted by provider DMS instances where the user is authorized to deploy.
Bids will only be accepted from provider DMS instances that the user has authorized.
In the following we examine common functional scenarios on how to set up the system so that deployments are properly authorized.
TODO
TODO
TODO
TODO
TODO
Last updated: 2025-01-23 01:10:46.903981 File source:
The orchestrator is responsible for job scheduling and management (manages jobs on other DMSs).
A key distinction to note is the option of two types of orchestration mechanisms: push
and pull
. Broadly speaking pull
orchestration works on the premise that resource providers bid for jobs available in the network, while push
orchestration works when a job is push
ed directly to a known resource provider -- constituting a more centralized orchestration. push
orchestration develops on the idea that users choose from the available providers and their resources. However, given the decentralized and open nature of the platform, it may be required to engage the providers to get their current (latest) state and preferences. This leads to an overlap with the pull
orchestration approach.
The default setting is to use pull
based orchestration, which is developed in the present proposed specification.
proposed
Job Orchestration
The proposed lifecycle of a job on the Nunet platform consists of various operations from job posting to settlement of the contract. Below is a brief explanation of the steps involved in the job orchestration:
Job Posting: The user posts a job request to the DMS. The job request is validated and a Nunet job is created in the DMS.
Search and Match:
a. The Service provider DMS requests for bids from other nodes in the network.
b. DMS on compute provider compares the capability of the available resources against job requirements. If all the requirements are met, it then decides whether to submit a bid.
c. The received bids are assessed and the best bid is selected.
Job Request: In case the shortlisted compute provider has not locked the resources while submitting the bid, the job request workflow is executed. This requires the compute provider DMS to lock the necessary resources required for the job and re-submit the bid. Note that at this stage compute provider can still decline the job request.
Contract Closure: The service provider and the shortlisted compute provider verify that the counterparty is a verified entity and approved by Nunet Solutions to participate in the network. This is an important step to establish trust before any work is performed.
If the job does not require any payment (Volunteer Compute), a contract is generated by both the Service Provider and Compute Provider DMS. This is then verified by Contract-Database
. Otherwise, proof of contract needs to be received from the Contract-Database
before start of work.
Invocation and Allocation: When the contract closure workflow is completed, both the service provider and compute provider DMS have an agreement and proof of contract with them. Then the service provider DMS will send an invocation to the compute provider DMS which results in job allocation being created. Allocation can be understood as an execution space / environment on actual hardware that enables a Job to be executed.
Job Execution: Once allocation is created, the job execution starts on the compute provider machine.
Contract Settlement: After the job is completed, the service provider DMS verifies the work done. If the work is correct, the Contract-Database
makes the necessary transactions to settle the contract.
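The capability comparison in the Search and Match step can be illustrated with a small Go sketch. The Resources type, its fields, and the Satisfies method are hypothetical; the proposed specification does not define them, and a real comparison would also take the provider's preferences into account before deciding whether to bid.

```go
package main

import "fmt"

// Resources is a hypothetical description of either the resources a job
// requires or the resources a compute provider has available.
type Resources struct {
	CPUCores int
	RAMMiB   int
	DiskMiB  int
	GPUs     int
}

// Satisfies reports whether the available resources meet or exceed every
// requirement of the job.
func (avail Resources) Satisfies(req Resources) bool {
	return avail.CPUCores >= req.CPUCores &&
		avail.RAMMiB >= req.RAMMiB &&
		avail.DiskMiB >= req.DiskMiB &&
		avail.GPUs >= req.GPUs
}

func main() {
	available := Resources{CPUCores: 8, RAMMiB: 16384, DiskMiB: 100000, GPUs: 1}
	job := Resources{CPUCores: 4, RAMMiB: 8192, DiskMiB: 20000, GPUs: 1}
	if available.Satisfies(job) {
		fmt.Println("requirements met; the provider may decide to submit a bid")
	} else {
		fmt.Println("requirements not met; no bid is submitted")
	}
}
```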
Here is a quick overview of the contents of this directory:
Subpackages
Source
Rendered from source file
TBD
TBD
TBD
List of issues
All issues that are related to the implementation of dms
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Interfaces & Methods
proposed
Orchestrator interface
publishBidRequest
: sends a request for bid to the network for a particular job. This will depend on the network
package for propagation of the request to other nodes in the network.
compareCapability
: compares two capabilities and returns a CapabilityComparison
object. Expected usage is to compare capability required in a job with the available capability of a node.
acceptJob
: looks at the comparison between capabilities and preferences of a node in the form of CapabilityComparator
object and decides whether to accept a job or not.
sendBid
: sends a bid to the node that propagated the BidRequest
.
selectBestBid
: looks at all the bids received and selects the best one.
sendJobRequest
: sends a job request to the shortlisted node whose bid was selected. The compute provider node needs to accept the job request and lock its resources for the job. In case resources are already locked while submitting the bid, this step may be skipped.
sendInvocation
: sends an invocation request (as a message) to the node that accepted the job. This message should have all the necessary information to start an Allocation
for the job.
orchestrateJob
: this will be called when a job is received via postJob endpoint. It will start the orchestration process. It is also possible that this method could be called via a timer for jobs scheduled in the future.
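A minimal Go sketch of two of the proposed methods, `selectBestBid` and the decision of whether `sendJobRequest` can be skipped. The selection criterion is not specified in this document, so the sketch assumes that the lowest price wins. The `Bid` fields and the `needsJobRequest` helper are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Bid is a simplified, hypothetical stand-in for dms.orchestrator.Bid.
type Bid struct {
	NodeID          string
	Price           float64 // assumed: total price quoted for the job
	ResourcesLocked bool    // assumed: true if resources were locked when bidding
}

// selectBestBid returns the cheapest bid. On equal prices the bid that
// appears first in the slice is kept. The "lowest price" rule is an
// assumption made for this sketch.
func selectBestBid(bids []Bid) (Bid, error) {
	if len(bids) == 0 {
		return Bid{}, errors.New("no bids received")
	}
	best := bids[0]
	for _, b := range bids[1:] {
		if b.Price < best.Price {
			best = b
		}
	}
	return best, nil
}

// needsJobRequest reports whether the job request step is still required,
// i.e. the provider has not yet locked resources for the job.
func needsJobRequest(b Bid) bool {
	return !b.ResourcesLocked
}

func main() {
	bids := []Bid{
		{NodeID: "node-a", Price: 5},
		{NodeID: "node-b", Price: 3, ResourcesLocked: true},
	}
	best, err := selectBestBid(bids)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(best.NodeID, needsJobRequest(best))
}
```

A real implementation would likely weigh price, time and capability together; the single-criterion rule here only keeps the example short.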
proposed
Actor interface
sendMessage
: sends a message to another actor (Node / Allocation).
processMessage
: processes the message received and decides on what action to take.
proposed
Mailbox interface
receiveMessage
: receives a message from another Node and converts it into a telemetry.Message
object.
handleMessage
: processes the message received.
triggerBehavior
: this is where actions taken by the actor based on the message received will be defined.
getKnownTopics
: retrieves the gossip sub topics known to the node.
getSubscribedTopics
: retrieves the gossip sub topics subscribed by the node.
subscribeToTopic
: subscribes to a gossip sub topic.
unsubscribeFromTopic
: un-subscribes from a gossip sub topic.
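The topic bookkeeping behind the four topic methods can be sketched with two sets. This is a hypothetical in-memory model that does not talk to the network; it also assumes that a topic stays known after unsubscribing, which this document does not state.

```go
package main

import (
	"fmt"
	"sort"
)

// Mailbox is a hypothetical model that tracks only gossip sub topics.
type Mailbox struct {
	known      map[string]bool
	subscribed map[string]bool
}

func NewMailbox() *Mailbox {
	return &Mailbox{known: map[string]bool{}, subscribed: map[string]bool{}}
}

// subscribeToTopic marks the topic as both known and subscribed.
func (m *Mailbox) subscribeToTopic(topic string) {
	m.known[topic] = true
	m.subscribed[topic] = true
}

// unsubscribeFromTopic removes the subscription but keeps the topic known
// (an assumption of this sketch).
func (m *Mailbox) unsubscribeFromTopic(topic string) {
	delete(m.subscribed, topic)
}

func (m *Mailbox) getKnownTopics() []string      { return sortedKeys(m.known) }
func (m *Mailbox) getSubscribedTopics() []string { return sortedKeys(m.subscribed) }

// sortedKeys returns the set members in alphabetical order.
func sortedKeys(set map[string]bool) []string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	m := NewMailbox()
	m.subscribeToTopic("bids")
	m.subscribeToTopic("jobs")
	m.unsubscribeFromTopic("jobs")
	fmt.Println(m.getKnownTopics(), m.getSubscribedTopics())
}
```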
proposed
Other methods
Methods for job request functionality: a. check whether resources are locked, b. lock resources, c. accept job request
Methods for contract closure: a. validate other node as a registered entity, b. generate contract, c. KYC validation
Methods for job execution: a. handle job updates
Methods for contract settlement: a. job verification
Note that the above methods are not an exhaustive list. They are to be considered suggestions; the developer implementing the orchestrator functionality is free to make modifications as necessary.
Data types
proposed
dms.orchestrator.Actor
: Actor has an identifier and a mailbox to send/receive messages.
proposed
dms.orchestrator.Bid
: Consists of information sent by the compute provider node to the requestor node as a bid for the job broadcast to the network.
proposed
dms.orchestrator.BidRequest
: A bid request is a message sent by a node to the network to request for bids.
proposed
dms.orchestrator.PriceBid
: Contains price related information of the bid.
proposed
dms.orchestrator.TimeBid
: Contains time related information of the bid.
proposed
dms.orchestrator.CapabilityComparator
: Preferences of the node which has an influence on the comparison operation.
TBD
proposed
dms.orchestrator.CapabilityComparison
: Result of the comparison operation.
TBD
proposed
dms.orchestrator.Invocation
: An invocation is a message sent by the orchestrator to the node that accepted the job. It contains the job details and the contract.
proposed
dms.orchestrator.Mailbox: A mailbox is a communication channel between two actors. It uses network
package functionality to send and receive messages.
proposed
Other data types
Data types related to allocation, contract settlement, job updates, etc. are currently omitted. These should be added as applicable during implementation.
Orchestration steps research blogs
The orchestrator functionality of DMS is being developed based on the research done in the following blogs:
Last updated: 2025-01-23 01:10:46.630643 File source:
This file explains the onboarding functionality of Device Management Service (DMS). This functionality is aimed at compute providers who wish to provide their hardware resources to Nunet for running computational tasks, as well as developers who are contributing to platform development.
Here is a quick overview of the contents of this directory:
The class diagram for the onboarding
package is shown below.
Source file
Rendered from source file
Onboard
signature: Onboard(ctx context.Context, config types.OnboardingConfig) error
input #1: Context object
input #2: types.OnboardingConfig
output (error): Error message
Onboard
function executes the onboarding process for a compute provider based on the configuration provided.
signature: Offboard(ctx context.Context) error
input #1: Context object
output: None
output (error): Error message
Offboard
removes the resources onboarded to Nunet.
signature: IsOnboarded(ctx context.Context) (bool, error)
input #1: Context object
output #1: bool
output #2: error
IsOnboarded
checks if the compute provider is onboarded.
signature: Info(ctx context.Context) (types.OnboardingConfig, error)
input #1: Context object
output #1: types.OnboardingConfig
output #2: error
Info
returns the configuration of the onboarding process.
types.OnboardingConfig
: Holds the configuration for onboarding a compute provider.
List of issues
All issues that are related to the implementation of dms
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:43.217589 File source:
The Nunet Actor System CLI provides a set of commands for interacting with the Nunet actor system, enabling you to send messages to actors, invoke behaviors, and broadcast messages across the network.
nunet actor msg
: Constructs a message for an actor.
nunet actor send
: Sends a constructed message to an actor.
nunet actor invoke
: Invokes a behavior in an actor and returns the result.
nunet actor broadcast
: Broadcasts a message to a topic, potentially reaching multiple actors.
nunet actor cmd
: Invokes a predefined public behavior on an actor.
nunet actor msg
This command is used to create a message that can be sent to an actor. It encapsulates the behavior to be invoked and the associated payload data.
Usage
Arguments
<behavior>
: The specific behavior you want the actor to perform upon receiving the message
<payload>
: The data accompanying the message, providing context or input for the behavior
Flags
-b, --broadcast string
: Designates the topic for broadcasting the message.
-c, --context string
: Specifies the capability context name
-d, --dest string
: Identifies the destination handle for the message.
-e, --expiry time
: Sets an expiration time for the message.
-h, --help
: Displays help information for the msg
command.
-i, --invoke
: Marks the message as an invocation, requesting a response from the actor.
-t, --timeout duration
: Sets a timeout for awaiting a response after invoking a behavior.
nunet actor send
This command delivers a previously constructed message to an actor.
Usage
Arguments
<msg>
: The message, created using the nunet actor msg
command, to be sent.
Flags
-h, --help
: Displays help information for the send
command.
nunet actor invoke
This command directly invokes a specific behavior on an actor and expects a response.
Usage
Arguments
<msg>
: The message, crafted with nunet actor msg
, containing the behavior and payload
Flags
-h, --help
: Displays help information for the invoke
command
nunet actor broadcast
This command disseminates a message to a designated topic, potentially reaching multiple actors who have subscribed to that topic.
Usage
Arguments
<msg>
: The message to be broadcasted
Flags
-h, --help
: Displays help information for the broadcast
command.
nunet actor cmd
This command invokes a behavior on an actor.
Usage
Available Commands
/broadcast/hello
: Invoke /broadcast/hello behavior on an actor.
/dms/node/onboarding/offboard
: Invoke /dms/node/onboarding/offboard behavior on an actor.
/dms/node/onboarding/onboard
: Invoke /dms/node/onboarding/onboard behavior on an actor.
/dms/node/onboarding/resource
: Invoke /dms/node/onboarding/resource behavior on an actor.
/dms/node/onboarding/status
: Invoke /dms/node/onboarding/status behavior on an actor.
/dms/node/peers/connect
: Invoke /dms/node/peers/connect behavior on an actor.
/dms/node/peers/dht
: Invoke /dms/node/peers/dht behavior on an actor.
/dms/node/peers/list
: Invoke /dms/node/peers/list behavior on an actor.
/dms/node/peers/ping
: Invoke /dms/node/peers/ping behavior on an actor.
/dms/node/peers/score
: Invoke /dms/node/peers/score behavior on an actor.
/dms/node/peers/self
: Invoke /dms/node/peers/self behavior on an actor.
/dms/node/vm/list
: Invoke /dms/node/vm/list behavior on an actor.
/dms/node/vm/start/custom
: Invoke /dms/node/vm/start/custom behavior on an actor.
/dms/node/vm/stop
: Invoke /dms/node/vm/stop behavior on an actor.
/public/hello
: Invoke /public/hello behavior on an actor
/public/status
: Invoke /public/status behavior on an actor
Flags
-c, --context string
: Capability context name.
-d, --dest string
: Destination DMS DID, peer ID or handle.
-e, --expiry time
: Expiration time.
-h, --help
: Help for the cmd
command.
-t, --timeout duration
: Timeout duration.
/broadcast/hello
Description: Invokes the /broadcast/hello
behavior on an actor. This sends a "hello" message to a broadcast topic for polite introduction.
Usage: nunet actor cmd /broadcast/hello [<param> ...] [flags]
Flags:
-h, --help
: Display help information for the /broadcast/hello
command
/dms/node/onboarding/offboard
Description: Invokes the /dms/node/onboarding/offboard
behavior on an actor. This is used to offboard a node from the DMS (Device Management Service).
Usage: nunet actor cmd /dms/node/onboarding/offboard [<param> ...]
Flags:
-h, --help
: Display help information for the /dms/node/onboarding/offboard
command.
/dms/node/onboarding/onboard
Description: Invokes the /dms/node/onboarding/onboard
behavior on an actor. This is used to onboard a node to the DMS, making its resources available for use.
Usage: nunet actor cmd /dms/node/onboarding/onboard [<param> ...] [flags]
Flags:
-C, --cpu float32
: CPU cores to allocate
-R, --ram uint
: Memory to allocate
-D, --disk uint
: Disk space to allocate
-G, --gpus string
: Comma-separated list of GPU Index and VRAM in GB to allocate e.g. "0:4,1:8". The gpu index can be obtained from 'nunet gpu list' command
--no-gpu
: Do not allocate any GPU
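The `--gpus` value is a comma-separated list of `<index>:<VRAM in GB>` pairs, for example "0:4,1:8". The Go sketch below shows one way such a value could be parsed and validated. It is not the parser used by the CLI: `GPUSelection` and `parseGPUs` are hypothetical names, and whole-number VRAM values are an assumption based on the example above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// GPUSelection is a hypothetical holder for one "<index>:<vram>" pair.
type GPUSelection struct {
	Index  int
	VRAMGB uint64
}

// parseGPUs splits a value such as "0:4,1:8" into GPU selections.
// An empty value yields no selections and no error.
func parseGPUs(value string) ([]GPUSelection, error) {
	if strings.TrimSpace(value) == "" {
		return nil, nil
	}
	var out []GPUSelection
	for _, pair := range strings.Split(value, ",") {
		parts := strings.Split(strings.TrimSpace(pair), ":")
		if len(parts) != 2 {
			return nil, fmt.Errorf("invalid GPU pair %q, want <index>:<vram>", pair)
		}
		idx, err := strconv.Atoi(parts[0])
		if err != nil || idx < 0 {
			return nil, fmt.Errorf("invalid GPU index %q", parts[0])
		}
		vram, err := strconv.ParseUint(parts[1], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("invalid VRAM value %q", parts[1])
		}
		out = append(out, GPUSelection{Index: idx, VRAMGB: vram})
	}
	return out, nil
}

func main() {
	gpus, err := parseGPUs("0:4,1:8")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, g := range gpus {
		fmt.Printf("GPU %d: %d GB\n", g.Index, g.VRAMGB)
	}
}
```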
/dms/node/onboarding/status
Description: Invokes the /dms/node/onboarding/status
behavior on an actor. This is used to check the onboarding status of a node.
Usage: nunet actor cmd /dms/node/onboarding/status [<param> ...] [flags]
Flags:
-h, --help
: Display help information for the /dms/node/onboarding/status
command
/dms/node/peers/connect
Description: Invokes the /dms/node/peers/connect
behavior on an actor. This initiates a connection to a specified peer.
Usage: nunet actor cmd /dms/node/peers/connect [<param> ...] [flags]
Flags:
-a, --address string
: The peer address to connect to
-h, --help
: Display help information for the /dms/node/peers/connect
command.
/dms/node/peers/dht
Description: Invokes the /dms/node/peers/dht
behavior on an actor. This interacts with the Distributed Hash Table (DHT) used for peer discovery and content routing
Usage: nunet actor cmd /dms/node/peers/dht [<param> ...] [flags]
Flags:
-h, --help
: Display help information for the /dms/node/peers/dht
command.
/dms/node/peers/list
Description: Invokes the /dms/node/peers/list
behavior on an actor. This retrieves a list of connected peers
Usage: nunet actor cmd /dms/node/peers/list [<param> ...] [flags]
Flags:
-h, --help
: Display help information for the /dms/node/peers/list
command.
/dms/node/peers/ping
Description: Invokes the /dms/node/peers/ping
behavior on an actor. This sends a ping message to a specified host to check its reachability
Usage: nunet actor cmd /dms/node/peers/ping [<param> ...] [flags]
Flags:
-h, --help
: Display help information for the /dms/node/peers/ping
command
-H, --host string
: The host address to ping
/dms/node/peers/score
Description: Invokes the /dms/node/peers/score
behavior on an actor. This retrieves a snapshot of the peer's gossipsub broadcast score.
Usage: nunet actor cmd /dms/node/peers/score [<param> ...] [flags]
Flags:
-h, --help
: Display help information for the /dms/node/peers/score
command
/dms/node/peers/self
Description: Invokes the /dms/node/peers/self
behavior on an actor. This retrieves information about the node itself, such as its ID or addresses
Usage: nunet actor cmd /dms/node/peers/self [<param> ...] [flags]
Flags:
-h, --help
: Display help information for the /dms/node/peers/self
command.
/dms/node/vm/list
Description: Invokes the /dms/node/vm/list
behavior on an actor. This retrieves a list of virtual machines (VMs) running on the node
Usage: nunet actor cmd /dms/node/vm/list [<param> ...] [flags]
Flags:
-h, --help
: Display help information for the /dms/node/vm/list
command.
/dms/node/vm/start/custom
Description: Invokes the /dms/node/vm/start/custom
behavior on an actor. This starts a new VM with custom configurations.
Usage: nunet actor cmd /dms/node/vm/start/custom [<param> ...] [flags]
Flags:
-a, --args string
: Arguments to pass to the kernel
-z, --cpu float32
: CPU cores to allocate (default 1)
-h, --help
: Display help information for the /dms/node/vm/start/custom
command.
-i, --initrd string
: Path to initial ram disk
-k, --kernel string
: Path to kernel image file.
-m, --memory uint
: Memory to allocate (default 1024)
-r, --rootfs string
: Path to root fs image file
/dms/node/vm/stop
Description: Invokes the /dms/node/vm/stop
behavior on an actor. This stops a running VM
Usage: nunet actor cmd /dms/node/vm/stop [<param> ...] [flags]
Flags:
-h, --help
: Display help information for the /dms/node/vm/stop
command
-i, --id string
: Execution id of the VM
/public/hello
Description: Invokes the /public/hello
behavior on an actor. This broadcasts a "hello" for a polite introduction.
Usage: nunet actor cmd /public/hello [<param> ...] [flags]
Flags:
-h, --help
: Display help information for the /public/hello
command
/public/status
Description: Invokes the /public/status
behavior on an actor. This retrieves the status or health information of the actor or system
Flags:
-h, --help
: Display help information for the /public/status
command
These flags can be used with any of the above commands:
-c, --context string
: Specifies the capability context name. This is used for authorization or access control.
-d, --dest string
: Specifies the destination for the command. This can be a DMS DID (Decentralized Identifier), a peer ID, or a handle.
-e, --expiry time
: Sets an expiration time for the message or command.
-t, --timeout duration
: Sets a timeout duration for the command. If the command does not complete within the specified duration, it will time out.
-h, --help
: Display help information for the commands
Last updated: 2025-01-23 01:10:46.365557 File source:
proposed
Description
This package is responsible for creation of a Node
object which is the main actor residing on the machine as long as DMS is running. The Node
gets created when the DMS is onboarded.
The Node
is responsible for:
Communicating with other actors (nodes and allocations) via messages. This will include sending bid requests, bids, invocations, job status etc
Checking used and free resources before creating allocations
Continuous monitoring of the machine
Here is a quick overview of the contents of this package:
The class diagram for the node
package is shown below.
Source file
Rendered from source file
TBD
TBD
proposed
Refer to *_test.go
files for unit tests of different functionalities.
List of issues
All issues that are related to the implementation of dms
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Interfaces & Methods
proposed
Node_interface
getAllocation
method retrieves an Allocation
on the machine based on the provided AllocationID
.
checkAllocationStatus
method will retrieve status of an Allocation
.
routeToAllocation
method will route a message to the Allocation
of the job that is running on the machine.
benchmarkCapability
method will perform machine benchmarking
setRegisteredCapability
method will record the benchmarked Capability of the machine into a persistent data store for retrieval and usage (mostly in job orchestration functionality)
getRegisteredCapability
method will retrieve the benchmarked Capability of the machine from the persistent data store.
setAvailableCapability
method changes the available capability of the machine when resources are locked
getAvailableCapability
method will return currently available capability of the node
lockCapability
method will lock a certain amount of resources for a job. This can happen during bid submission, but it must happen once the job is accepted and before invocation.
getLockedCapabilities
method retrieves the locked capabilities of the machine.
setPreferences
method sets the preferences of a node as dms.orchestrator.CapabilityComparator
getPreferences
method retrieves the node preferences as dms.orchestrator.CapabilityComparator
getRegisteredBids
method retrieves the list of bids received for a job.
startAllocation
method will create an allocation based on the invocation received.
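A minimal sketch of how `lockCapability`, `getAvailableCapability` and `getLockedCapabilities` could relate to each other. It assumes that available capability equals registered capability minus everything currently locked, and that locks are keyed by a job ID. The `Capability` fields and the `newNode` constructor are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Capability is a hypothetical, heavily reduced capability model.
type Capability struct {
	CPUCores float64
	RAMMB    uint64
}

// Node tracks the registered capability and per-job locks in memory.
type Node struct {
	registered Capability
	locked     map[string]Capability // keyed by job ID (assumption)
}

func newNode(registered Capability) *Node {
	return &Node{registered: registered, locked: map[string]Capability{}}
}

// getAvailableCapability returns registered capability minus all locks.
func (n *Node) getAvailableCapability() Capability {
	avail := n.registered
	for _, c := range n.locked {
		avail.CPUCores -= c.CPUCores
		avail.RAMMB -= c.RAMMB
	}
	return avail
}

// lockCapability reserves resources for a job if enough are available.
func (n *Node) lockCapability(jobID string, want Capability) error {
	if _, exists := n.locked[jobID]; exists {
		return errors.New("capability already locked for this job")
	}
	avail := n.getAvailableCapability()
	if want.CPUCores > avail.CPUCores || want.RAMMB > avail.RAMMB {
		return errors.New("insufficient available capability")
	}
	n.locked[jobID] = want
	return nil
}

// getLockedCapabilities returns the current locks.
func (n *Node) getLockedCapabilities() map[string]Capability {
	return n.locked
}

func main() {
	n := newNode(Capability{CPUCores: 8, RAMMB: 16384})
	if err := n.lockCapability("job-1", Capability{CPUCores: 2, RAMMB: 4096}); err != nil {
		fmt.Println("lock failed:", err)
		return
	}
	fmt.Printf("%+v\n", n.getAvailableCapability())
}
```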
Data types
proposed
dms.node.Node
An initial data model for Node
is defined below.
proposed
dms.node.NodeID
Last updated: 2025-01-23 01:10:48.261559 File source:
Every node on the NuNet network has the ability to deploy jobs across the network's available compute resources, given the necessary capabilities. You can leverage the distributed computing power offered by compute providers to run your workloads efficiently.
Ensure your DMS is running and properly connected to the network
Create an ensemble configuration file that defines your deployment requirements
Use the DMS CLI to deploy your jobs
An ensemble configuration defines the resources and requirements for your deployment. Here's a basic example:
To deploy a job:
Save your ensemble configuration to a file (e.g., ensemble.yaml
)
Use the DMS CLI to create a new deployment:
You can monitor your deployments using these commands:
Monitor resource usage and costs
Use appropriate resource constraints in your ensemble configurations
Implement proper error handling in your deployments
Common issues and their solutions:
Deployment fails to start
Verify your ensemble configuration
Ensure proper network connectivity (e.g.: list peers using nunet actor cmd --context user /dms/node/peers/list
)
Capabilities
Ensure you have the required capabilities to invoke deployment behaviors upon peers on the network
Last updated: 2025-01-23 01:10:49.053617 File source:
By default, in NuNet's p2p network, all nodes share the same underlying network infrastructure for communication.
However, the capability system acts as an access control layer that determines which peers can invoke specific behaviors (e.g.: deploying an allocation) on other peers.
When following the main README usage guide, you're connecting to NuNet's official network where the capability pool is built upon KYC-verified peers. NuNet acts as the root of trust, issuing capability tokens to verified participants.
This guide demonstrates how any organization or entity can create their own restricted network which is independent from NuNet Foundation's trust pool.
If you want to participate in both NuNet and any other restricted network at once as a user, you just have to set up the capabilities for each root entity controlling a capability pool.
In practice, follow the user side of this guide as well as the guide in the DMS README.
Creating a restricted network involves:
Setting up an organization key/context that acts as the root of trust
Granting capabilities with the organization key/context and sending the generated token to each user
This guide assumes that each user has keys/contexts for one user and n
DMSes.
Also, let's suppose an organization named myorg
.
First, create a key and capability context for your organization:
Important: Securely store your organization's private key
Grant each user capabilities to invoke certain behaviors:
Send the generated <token-1>
to user of did <did-user>
.
Considering a user with keys/contexts for both user
and dms
:
Grant and set up the necessary require and provide anchors for user's DMS:
Following the previous guide, you successfully built your own capability pool:
Nodes will only be able to deploy allocations in nodes that have been granted the /dms/deployment
capability by your organization.
However, as explained in the introduction, your nodes still share the same underlying network. When broadcasting bid requests for a deployment, all peers in the NuNet network receive the request, even peers outside your capability pool.
To enhance the privacy, security and scalability of your restricted network, you can use your own bootstrap nodes instead of NuNet's.
For that, you have to customize your dms_config.json
. This is how your bootstrap_peers
section looks now:
To edit your dms_config.json
, run:
Modify the bootstrap peers to use your own nodes.
Multiaddresses: If using a domain name:
/dnsaddr/bootstrap.p2p.nunet.io/p2p/QmTkWP72uECwCsiiYDpCFeTrVeUM9huGTPsg3m6bHxYQFZ
If using an IP address:
/ip4/<IP_ADDRESS>/p2p/QmTkWP72uECwCsiiYDpCFeTrVeUM9huGTPsg3m6bHxYQFZ
Qm...
is the bootstrap's peer ID.
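The two address forms above can be produced by a small helper. This Go sketch only mirrors the `/dnsaddr/.../p2p/...` and `/ip4/.../p2p/...` shapes shown in this guide; `bootstrapMultiaddr` is a hypothetical helper, not part of DMS, and it does not verify that the peer ID is valid.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// bootstrapMultiaddr formats a bootstrap peer entry for a host that is
// either an IPv4 address or a domain name.
func bootstrapMultiaddr(host, peerID string) (string, error) {
	if host == "" || peerID == "" {
		return "", errors.New("host and peer ID are required")
	}
	if ip := net.ParseIP(host); ip != nil {
		v4 := ip.To4()
		if v4 == nil {
			// Only the IPv4 form appears in this guide.
			return "", errors.New("only IPv4 addresses are handled in this sketch")
		}
		return fmt.Sprintf("/ip4/%s/p2p/%s", v4.String(), peerID), nil
	}
	// Anything that is not an IP literal is treated as a domain name.
	return fmt.Sprintf("/dnsaddr/%s/p2p/%s", host, peerID), nil
}

func main() {
	// "QmExamplePeerID" is a placeholder, not a real peer ID.
	for _, host := range []string{"203.0.113.7", "bootstrap.example.org"} {
		addr, err := bootstrapMultiaddr(host, "QmExamplePeerID")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(addr)
	}
}
```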
To retrieve a peer ID from a given node, run:
Root of Trust: Your organization key becomes the root of trust instead of NuNet
Token Distribution: You control token distribution and can add/revoke access as needed
Last updated: 2025-01-23 01:10:47.465800 File source:
resources
deals with resource management for the machine. This includes calculation of available resources for new jobs or bid requests.
Here is a quick overview of the contents of this package:
All files with *_test.go
contain unit tests for the corresponding functionality.
The class diagram for the resources
package is shown below.
Source file
Rendered from source file
Manager Interface
The interface methods are explained below.
AllocateResources
signature: AllocateResources(context.Context, ResourceAllocation) error
input: Context
input: ResourceAllocation
output (error): Error message
AllocateResources
allocates the resources to the job.
DeallocateResources
signature: DeallocateResources(context.Context, string) error
input: Context
input: string
output (error): Error message
DeallocateResources
deallocates the resources from the job.
GetTotalAllocation
signature: GetTotalAllocation() (Resources, error)
input: None
output: Resources
output (error): Error message
GetTotalAllocation
returns the total resources allocated to the jobs.
GetFreeResources
signature: GetFreeResources() (FreeResources, error)
input: None
output: FreeResources
output (error): Error message
GetFreeResources
returns the available resources in the allocation pool.
GetOnboardedResources
signature: GetOnboardedResources(context.Context) (OnboardedResources, error)
input: Context
output: OnboardedResources
output (error): Error message
GetOnboardedResources
returns the resources onboarded to dms.
UpdateOnboardedResources
signature: UpdateOnboardedResources(context.Context, OnboardedResources) error
input: Context
input: OnboardedResources
output (error): Error message
UpdateOnboardedResources
updates the resources onboarded to dms.
UsageMonitor
signature: UsageMonitor() types.UsageMonitor
input: None
output: types.UsageMonitor
instance
output (error): None
UsageMonitor
returns the types.UsageMonitor
instance.
This interface defines methods to monitor the system usage. The methods are explained below.
GetUsage
signature: GetUsage(context.Context) (types.Resource, error)
input: Context
output: types.Resource
output (error): Error message
GetUsage
returns the resources currently used by the machine.
types.Resources
: resources defined for the machine.
types.AvailableResources
: resources onboarded to Nunet.
types.FreeResources
: resources currently available for new jobs.
types.ResourceAllocation
: resources allocated to a job.
types.MachineResources
: resources available on the machine.
types.GPUVendor
: GPU vendors available on the machine.
types.GPU
: GPU details.
types.GPUs
: A slice of GPU
.
types.CPU
: CPU details.
types.RAM
: RAM details.
types.Disk
: Disk details.
types.NetworkInfo
: Network details.
Last updated: 2025-01-23 01:10:47.199459 File source:
This whole package is in proposed
status and therefore documentation is missing, save for the proposed functionality part.
TBD
TBD
Source
Rendered from source file
TBD
TBD
TBD
List of issues
All issues that are filed in GitLab related to the implementation of dms/orchestrator
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Proposed functionalities
TBD
Data types
proposed
LocalNetworkTopology
: More complex deployments may need a data structure that considers the local network topology of a node / DMS, i.e. for reasoning about the speed of connection (as well as capabilities) between neighbors.
Related research blogs
TBD
Last updated: 2025-01-23 01:10:48.554083 File source:
As a compute provider in the NuNet network, you can offer your computer's resources (CPU, RAM, GPU, and storage) to other network participants.
This guide will walk you through the process of onboarding your resources to the network.
Before onboarding, ensure:
Your DMS is properly installed and running
You have met the
You have the necessary configured
(Optional) For GPU providers: GPU drivers are correctly installed
Use the following command to onboard your resources:
Example:
If a GPU is detected on the machine, an interactive prompt will be displayed that allows choosing the GPU and the amount of VRAM to onboard from it.
To onboard a GPU without the interactive prompt, --gpus "<GPU_INDEX>:<VRAM_IN_GB>"
can be used where index can be obtained from the gpu list
command. In case of multiple GPUs, the pair can be expanded for each GPU separated by a comma.
Example:
If GPU is detected on the machine but it shouldn't be onboarded, use the --no-gpu
flag.
Check your onboarding status:
Monitor your resource allocation:
To remove the availability of your machine's resources from the network, execute:
Note: this process will be simplified in the future so that users do not have to onboard again.
To modify your resource allocation:
First, offboard your current resources
Then onboard again with new resource values
Resource Allocation
Don't onboard all available resources
Leave enough resources for system operations
Consider your system's stability and cooling capabilities
System Maintenance
Regularly update your DMS instance
Monitor system health and performance
Maintain stable internet connectivity
Security
Monitor system logs for unusual activity
Keep your system updated with security patches
Common issues and solutions:
Onboarding fails
Verify system requirements are met
Ensure proper permissions are set
Resource allocation issues
Verify if other allocations are still running (check allocated resources)
Ensure proper GPU drivers if using GPUs
Verify available resources
Check for competing processes
Last updated: 2025-01-23 01:10:43.490134 File source:
The Nunet Capability Management CLI provides commands to manage capabilities within the Nunet ecosystem. Capabilities define the permissions and authorizations granted to different entities within the system. These commands allow you to create, modify, delegate, and list capabilities.
nunet cap anchor
: Add or modify capability anchors in a capability context.
nunet cap delegate
: Delegate capabilities for a subject.
nunet cap grant
: Grant (delegate) capabilities as anchors and side chains from a capability context.
nunet cap list
: List all capability anchors in a capability context
nunet cap new
: Create a new persistent capability context for DMS or personal usage
nunet cap remove
: Remove capability anchors in a capability context.
nunet cap anchor
This command is used to add new or modify existing capability anchors within a specific capability context. Anchors serve as the foundation for defining capabilities, establishing the core permissions and restrictions
Usage
Flags
-c, --context string
: Specifies the operation context, defining the key and capability context to use (defaults to "user").
-h, --help
: Displays help information for the anchor
command
--provide string
: Adds tokens as a "provide" anchor in JSON format, defining capabilities the context can offer
--require string
: Adds tokens as a "require" anchor in JSON format, defining capabilities the context demands
--root string
: Adds a DID as a "root" anchor, establishing the root authority for the context
nunet cap delegate
This command delegates specific capabilities to a subject, granting them permissions within the system
Usage
Arguments
<subjectDID>
: The Decentralized Identifier (DID) of the entity receiving the delegated capabilities
Flags
-a, --audience string
: (Optional) Specifies the audience DID, restricting the delegation to a specific recipient
--cap strings
: Defines the capabilities to be granted or delegated (can be specified multiple times)
-c, --context string
: Specifies the operation context (defaults to "user")
-d, --depth uint
: (Optional) Sets the delegation depth, controlling how many times the capabilities can be further delegated (default 0)
--duration duration
: Sets the duration for which the delegation is valid
-e, --expiry time
: Sets an expiration time for the delegation
-h, --help
: Displays help information for the delegate
command
--self-sign string
: Specifies self-signing options: 'no' (default), 'also', or 'only'
-t, --topic strings
: Defines the topics for which capabilities are granted or delegated (can be specified multiple times)
nunet cap grant
This command grants (delegates) capabilities as anchors and side chains from a specified capability context
Usage
Arguments
<subjectDID>
: The Decentralized Identifier (DID) of the entity receiving the granted capabilities
Flags
-a, --audience string
: (Optional) Specifies the audience DID
--cap strings
: Defines the capabilities to be granted or delegated
-c, --context string
: Specifies the operation context (defaults to "user")
-d, --depth uint
: (Optional) Sets the delegation depth
--duration duration
: Sets the duration for which the grant is valid
-e, --expiry time
: Sets an expiration time for the grant
-h, --help
: Displays help information for the grant
command
-t, --topic strings
: Defines the topics for which capabilities are granted
nunet cap list
This command lists all capability anchors within a specified capability context
Usage
Flags
-c, --context string
: Specifies the operation context (defaults to "user")
-h, --help
: Displays help information for the list
command
nunet cap new
This command creates a new persistent capability context, which can be used for DMS or personal purposes
Usage
Arguments
<name>
: The name for the new capability context
Flags
-h, --help
: Displays help information for the new
command
nunet cap remove
This command removes capability anchors from a specified capability context
Usage
Flags
-c, --context string
: Specifies the operation context (defaults to "user")
-h, --help
: Displays help information for the remove
command
--provide string
: Removes tokens from the "provide" anchor in JSON format
--require string
: Removes tokens from the "require" anchor in JSON format
--root string
: Removes a DID from the "root" anchor
Last updated: 2025-01-23 01:10:45.141435 File source:
This package is responsible for starting the whole application. It also contains various core functionality of DMS:
Onboarding compute provider devices
Job orchestration and management
Resource management
Actor implementation for each node
Here is a quick overview of the contents of this package:
Subpackages
proposed
: All files with *_test.go
naming convention contain unit tests with respect to the specific implementation.
The class diagram for the dms
package is shown below.
Source file
Rendered from source file
TBD
Supervision
TBD as per proposed implementation
Supervisor
SupervisorStrategy
Statistics
TBD
proposed
Refer to *_test.go
files for unit tests of different functionalities.
List of issues
All issues that are related to the implementation of dms
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Interfaces & Methods
proposed
Capability_interface
add
method will combine capabilities of two nodes. Example usage - When two jobs have to be run on a single machine, the capability requirements of each will need to be combined.
subtract
method will subtract two capabilities. Example usage - When resources are locked for a job, the available capability of a machine will need to be reduced.
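The two proposed methods can be illustrated with a minimal Go sketch. The `Capability` struct below is a hypothetical, reduced stand-in for `dms.Capability`: the field names `CPUCores` and `RAMMiB`, and the choice to return an error when subtracting more than is available, are assumptions made for this example and are not taken from the DMS source.

```go
package main

import (
	"errors"
	"fmt"
)

// Capability is a hypothetical, simplified stand-in for dms.Capability.
// The field names are assumptions made for this sketch only.
type Capability struct {
	CPUCores int
	RAMMiB   int
}

// Add returns the combined capability of c and other.
func (c Capability) Add(other Capability) Capability {
	return Capability{
		CPUCores: c.CPUCores + other.CPUCores,
		RAMMiB:   c.RAMMiB + other.RAMMiB,
	}
}

// Subtract returns c reduced by other, or an error if other exceeds c.
func (c Capability) Subtract(other Capability) (Capability, error) {
	if other.CPUCores > c.CPUCores || other.RAMMiB > c.RAMMiB {
		return Capability{}, errors.New("insufficient capability")
	}
	return Capability{
		CPUCores: c.CPUCores - other.CPUCores,
		RAMMiB:   c.RAMMiB - other.RAMMiB,
	}, nil
}

func main() {
	// Two jobs that must share one machine: combine their requirements.
	jobA := Capability{CPUCores: 2, RAMMiB: 2048}
	jobB := Capability{CPUCores: 1, RAMMiB: 1024}
	required := jobA.Add(jobB)

	// Lock the resources: reduce what the machine still has available.
	machine := Capability{CPUCores: 8, RAMMiB: 16384}
	available, err := machine.Subtract(required)
	if err != nil {
		fmt.Println("cannot lock resources:", err)
		return
	}
	fmt.Printf("required=%+v available=%+v\n", required, available)
}
```

Returning a new value instead of mutating the receiver keeps the machine's original capability intact, which may be convenient when a job is later released.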
Data types
proposed
dms.Capability
The Capability
struct will capture all the relevant data that defines the capability of a node to perform the job. At the same time this will be used to define capability requirements that a job requires from a node.
An initial data model for Capability
is defined below.
proposed
dms.Connectivity
type Connectivity struct {
}
proposed
dms.PriceInformation
proposed
dms.TimeInformation
type TimeInformation struct {
	// Units holds the units of time, e.g. hours, days, weeks
	Units string
}
Last updated: 2025-01-23 01:10:44.877920 File source:
This sub package contains Gorm implementation of the database interfaces.
Here is a quick overview of the contents of this package:
All files with *_test.go
naming convention contain unit tests with respect to the specific implementation.
The class diagram for the gorm
package is shown below.
Source file
Rendered from source file
GenericRepository
NewGenericRepository
signature: NewGenericRepository[T repositories.ModelType](db *gorm.DB) -> repositories.GenericRepository[T]
input: Gorm Database object
output: Repository of type db.gorm.GenericRepositoryGORM
NewGenericRepository
function creates a new instance of GenericRepositoryGORM
struct. It initializes and returns a repository with the provided GORM database.
Interface Methods
GenericEntityRepository
NewGenericEntityRepository
signature: NewGenericEntityRepository[T repositories.ModelType](db *gorm.DB) -> repositories.GenericEntityRepository[T]
input #1: Gorm Database object
output: Repository of type db.gorm.GenericEntityRepositoryGORM
NewGenericEntityRepository
creates a new instance of GenericEntityRepositoryGORM
struct. It initializes and returns a repository with the provided GORM database.
Interface Methods
db.gorm.GenericRepositoryGORM
: This is a generic repository implementation using GORM as an ORM.
db.gorm.GenericEntityRepositoryGORM
: This is a generic single entity repository implementation using GORM as an ORM
For other data types refer to db
package readme.
Refer to *_test.go
files for unit tests of different functionalities.
List of issues
All issues that are related to the implementation of db
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
The following diagram depicts this relationship:
: Current file which is aimed towards developers who wish to use and modify the database functionality.
: This file defines the interface that declares the main methods of the db package. It is designed using generic types and can be adapted to a specific data type as needed.
: This file contains the interface for those databases which will hold only a single record.
: This file specifies a database interface having types.DeploymentRequestFlat
data type.
: This file specifies a database interface having types.RequestTracker
data type.
: This file specifies the different types of errors.
: This file specifies a database interface having types.VirtualMachine
data type.
: This file defines database interfaces of various data types.
: This file contains some utility functions with respect to database operations.
: This file contains unit tests for functions defined in file.
: This folder contains the SQLite database implementation using gorm.
: This folder contains CloverDB database implementation.
In order to deploy an ensemble, the user must specify its structure and constraints; this is done with a YAML file encoding the ; the fields of the configuration structure are described in detail in this .
See section for research blogs with more details on this topic.
: Current file which is aimed towards developers who wish to use and modify the orchestrator
functionality.
: Directory containing package specifications, including package class diagram.
: Defines and implements interfaces of Graph logic for network topology awareness (proposed).
Note: the functionality of DMS is being currently developed. See the section for the suggested design of interfaces and methods.
Note: the functionality of DMS is being currently developed. See the section for the suggested data types.
: Current file which is aimed towards developers who wish to modify the onboarding functionality and build on top of it.
: This is the main file where the code for onboarding functionality exists.
: This file houses functions to generate Cardano wallet addresses along with their private keys.
: This file houses functions to test the address generation functions defined in .
: This file houses functions to get the total capacity of the machine being onboarded.
: This file initializes the loggers associated with the onboarding package.
All the tests for the onboarding package can be found in the file.
: Current file which is aimed towards developers who wish to use and modify the DMS functionality.
Note: the functionality of DMS is being currently developed. See the section for the suggested design of interfaces and methods.
Note: the functionality of DMS is being currently developed. See the section for the suggested data types.
For more details about the ensemble configuration format and all possible fields, see the .
: Current file which is aimed towards developers who wish to use and modify the DMS functionality.
: Contains the initialization of the package.
: Contains the resource manager which is responsible for managing the resources of dms.
: Contains the implementation of the UsageMonitor
interface.
: Contains the implementation of the store
for the resource manager.
: Current file which is aimed towards developers who wish to use and modify the dms functionality.
: This file contains code to initialize the DMS by loading the configuration, starting the REST API server, etc.
: This file creates a new logger instance.
: This file defines a method for performing consistency check before starting the DMS. proposed
Note that the functionality of this method needs to be developed as per refactored DMS design.
: Deals with the management of local jobs on the machine.
: Contains implementation of Node
as an actor.
: Code related to onboarding of compute provider machines to the network.
: Contains job orchestration logic.
: Deals with the management of resources on the machine.
Note: the functionality of DMS is being currently developed. See the section for the suggested design of interfaces and methods.
Note: the functionality of DMS is being currently developed. See the section for the suggested data types.
: Current file which is aimed towards developers who wish to use and modify the database functionality.
: This file implements the methods of GenericRepository
interface.
: This file implements the methods of GenericEntityRepository
interface.
: This file contains implementation of DeploymentRequestFlat
interface.
: This file contains implementation of RequestTracker
interface.
: This file contains implementation of VirtualMachine
interface.
: This file contains implementation of interfaces defined in .
: This file contains utility functions with respect to Gorm implementation.
See db
package for methods of GenericRepository
interface.
See db
package for methods of GenericEntityRepository
interface.
Last updated: 2025-01-23 01:10:50.959468 File source: link on GitLab
This package contains all configuration related code such as reading config file and functions to configure at runtime.
proposed
There are two sides to configuration:
default configuration, which has to be loaded for a fresh installation of a new DMS;
dynamic configuration, which can be changed by a user that has access to the DMS; this dynamic configuration may need to be persistent (or not).
proposed
Default configuration
Default configuration should be included in the DMS distribution as a config.yaml
in the root directory. The following is loosely based on general practice of passing yaml configuration to Go programs (see e.g. A clean way to pass configs in a Go application). DMS would parse this file during onboarding and populate the internal.config.Config
variable that will be imported to other packages and used accordingly.
proposed
Dynamic configuration
Dynamic configuration would use the same internal.config.Config
variable, but would allow for adding new values or changing configuration by an authorized DMS user -- via DMS CLI or REST API calls.
The dynamic configuration mechanism makes it possible to override or change default values. To enable this functionality, the internal.config.Config
variable will have a synchronized copy in the local DMS database, defined with db
package.
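The split between default and dynamic configuration can be sketched as follows. This is an illustration under stated assumptions, not the DMS implementation: the reduced `Config` struct, the port value `9999`, and the `Store` type with its `Update` and `Reset` methods are all hypothetical, and the synchronized copy in the local database is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// Config is a hypothetical, reduced stand-in for internal.config.Config.
type Config struct {
	RestPort int
	Debug    bool
}

// defaultConfig plays the role of the values shipped in config.yaml.
// The concrete values are assumptions for this sketch.
func defaultConfig() Config {
	return Config{RestPort: 9999, Debug: false}
}

// Store holds the current configuration and guards it for concurrent use.
type Store struct {
	mu  sync.RWMutex
	cfg Config
}

// NewStore starts from the default configuration.
func NewStore() *Store {
	return &Store{cfg: defaultConfig()}
}

// Get returns a copy of the current configuration.
func (s *Store) Get() Config {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.cfg
}

// Update applies a dynamic change on top of the current values.
func (s *Store) Update(change func(*Config)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	change(&s.cfg)
}

// Reset discards dynamic changes and restores the defaults.
func (s *Store) Reset() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.cfg = defaultConfig()
}

func main() {
	store := NewStore()
	store.Update(func(c *Config) { c.Debug = true }) // dynamic override
	fmt.Printf("%+v\n", store.Get())
	store.Reset() // back to the defaults
	fmt.Printf("%+v\n", store.Get())
}
```

In a persistent variant, `Update` and `Reset` would presumably also write the new state to the local database so that it survives a restart.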
Here is a quick overview of the contents of this package:
README: Current file which is aimed towards developers who wish to use and modify the package functionality.
config: This file contains data structures for this package.
load: This file establishes a configuration loader using Viper
, supports loading JSON files from various locations, applies defaults, and exposes functions to manage and obtain the loaded configuration.
Source
Rendered from source file
The methods of this package are explained below:
getViper
signature: getViper() *viper.Viper
input: None
output: A pointer to a viper.Viper
struct
getViper
function creates a new viper.Viper
instance used for configuration management with search paths.
setDefaultConfig
signature: setDefaultConfig() *viper.Viper
input: None
output: A pointer to a viper.Viper
struct
Sets default values for various configuration options in a viper.Viper
instance.
LoadConfig
signature: LoadConfig()
input: None
output: None
LoadConfig
loads the configuration from the search paths.
SetConfig
signature: SetConfig(key string, value interface{})
input #1 : key for the configuration value
input #2 : value corresponding to the key provided
output: None
SetConfig
sets a specific configuration value using the key-value pair provided. In case of any error, the default configuration values are applied.
GetConfig
signature: GetConfig() *Config
input: None
output: internal.config.Config
GetConfig
returns the loaded configuration data.
findConfig
signature: findConfig(paths []string, filename string) ([]byte, error)
input #1: list of search paths
input #2: name of the configuration file
output: contents of the configuration file
output(error): error message
findConfig
Searches for the configuration file in specified paths and returns content or error.
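A search over several directories with the documented signature could look roughly like the sketch below. Only the signature comes from the text above; the body, the example search paths and the file name `dms_config.json` are assumptions for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findConfig returns the contents of the first file called filename that can
// be read from one of the given directories. Any read error (missing file,
// missing directory, permissions) simply moves the search to the next path.
func findConfig(paths []string, filename string) ([]byte, error) {
	for _, dir := range paths {
		data, err := os.ReadFile(filepath.Join(dir, filename))
		if err == nil {
			return data, nil
		}
	}
	return nil, fmt.Errorf("config file %q not found in %v", filename, paths)
}

func main() {
	// Hypothetical search paths and file name, checked in order.
	data, err := findConfig([]string{".", "/etc/nunet"}, "dms_config.json")
	if err != nil {
		fmt.Println("falling back to defaults:", err)
		return
	}
	fmt.Printf("loaded %d bytes of configuration\n", len(data))
}
```

The order of `paths` defines precedence: the first directory that contains the file wins.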
removeComments
signature: removeComments(configBytes []byte) []byte
input: The byte array containing the configuration file content
output: byte array with comments removed
removeComments
Removes comments from the configuration file content using a regular expression.
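Stripping comments with a regular expression before parsing can be sketched as below. The pattern is an assumption: it removes only lines that consist entirely of a `//` comment, so that `//` inside string values such as URLs is left untouched. The expression used in the package itself may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// lineComment matches lines that contain nothing but a // comment
// (optionally preceded by spaces or tabs).
var lineComment = regexp.MustCompile(`(?m)^[ \t]*//.*$`)

// removeComments deletes full-line // comments from the configuration bytes.
func removeComments(configBytes []byte) []byte {
	return lineComment.ReplaceAll(configBytes, nil)
}

func main() {
	raw := []byte("{\n  // REST API port\n  \"port\": 9999\n}")

	// Plain JSON does not allow comments, so they are stripped first.
	var cfg map[string]int
	if err := json.Unmarshal(removeComments(raw), &cfg); err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Println("port:", cfg["port"])
}
```

Trailing comments on the same line as a value are deliberately not handled here, because a naive pattern for them would also cut strings that contain `//`.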
internal.config.Config
: holds the overall configuration with nested structs for specific sections
internal.config.General
: Configuration related to general application behavior (data paths, debug mode)
internal.config.Rest
: Configuration for the REST API (port number)
internal.config.P2P
: Configuration for the P2P network
internal.config.Job
: Configuration for background tasks (log update interval, target peer for deployments, container cleanup interval)
proposed
Unit tests for each functionality are defined in files with *_test.go
naming convention.
List of issues
All issues that are related to the implementation of internal
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
proposed
Functionalities
Following Gherkin feature files describe the proposed functionality for config
package.
Load default DMS configuration: see scenario definition
Restore default DMS configuration: see scenario definition
Load existing DMS configuration: see scenario definition
Last updated: 2025-01-23 01:10:49.339454 File source: link on GitLab
The executor package is responsible for executing the jobs received by the device management service (DMS). It provides a unified interface to run various executors such as Docker, Firecracker, etc.
Here is a quick overview of the contents of this package:
README: Current file which is aimed towards developers who wish to use and modify the executor functionality.
init: This file initializes a logger instance for the executor package.
types: This file contains the interfaces that other packages in the DMS call to utilise functionality offered by the executor package.
docker: This folder contains the implementation of docker executor.
firecracker: This folder contains the implementation of firecracker executor.
Source
Rendered from source file
The main functionality offered by the executor
package is defined via the Executor
interface.
Its methods are explained below:
Start
signature: Start(ctx context.Context, request executor.ExecutionRequest) -> error
input #1: Go context
input #2: executor.ExecutionRequest
output: error
Start
function takes a Go context
object and an executor.ExecutionRequest
type as input. It returns an error if the execution already exists and is in a started or terminal state. Implementations may also return other errors based on resource limitations or internal faults.
Run
signature: Run(ctx context.Context, request executor.ExecutionRequest) -> (executor.ExecutionResult, error)
input #1: Go context
input #2: executor.ExecutionRequest
output (success): executor.ExecutionResult
output (error): error
Run
initiates and waits for the completion of an execution for the given Execution Request. It returns an executor.ExecutionResult
and an error if any part of the operation fails. Specifically, it will return an error if the execution already exists and is in a started or terminal state.
Wait
signature: Wait(ctx context.Context, executionID string) -> (<-chan executor.ExecutionResult, <-chan error)
input #1: Go context
input #2: executor.ExecutionRequest.ExecutionID
output #1: Channel that returns executor.ExecutionResult
output #2: Channel that returns error
Wait
monitors the completion of an execution identified by its executionID
. It returns two channels:
A channel that emits the execution result once the task is complete;
An error channel that relays any issues encountered, such as when the execution is non-existent or has already concluded.
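A caller typically consumes the two channels returned by Wait with a `select`. The sketch below shows one way to do that, assuming a reduced `ExecutionResult` type and a hypothetical `fakeExecutor` that stands in for a real executor; only the shape of `Wait` follows the signature documented above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ExecutionResult is a reduced, hypothetical stand-in for executor.ExecutionResult.
type ExecutionResult struct{ ExitCode int }

// Waiter captures only the Wait method of the Executor interface.
type Waiter interface {
	Wait(ctx context.Context, executionID string) (<-chan ExecutionResult, <-chan error)
}

// fakeExecutor knows the exit codes of already finished executions.
type fakeExecutor struct{ exitCodes map[string]int }

func (f fakeExecutor) Wait(_ context.Context, id string) (<-chan ExecutionResult, <-chan error) {
	resCh := make(chan ExecutionResult, 1)
	errCh := make(chan error, 1)
	if code, ok := f.exitCodes[id]; ok {
		resCh <- ExecutionResult{ExitCode: code}
	} else {
		errCh <- errors.New("execution not found: " + id)
	}
	close(resCh)
	close(errCh)
	return resCh, errCh
}

// awaitResult blocks until a result, an error, or context cancellation.
// A channel that is closed without a value is ignored from then on.
func awaitResult(ctx context.Context, w Waiter, id string) (ExecutionResult, error) {
	resCh, errCh := w.Wait(ctx, id)
	for resCh != nil || errCh != nil {
		select {
		case res, ok := <-resCh:
			if ok {
				return res, nil
			}
			resCh = nil
		case err, ok := <-errCh:
			if ok {
				return ExecutionResult{}, err
			}
			errCh = nil
		case <-ctx.Done():
			return ExecutionResult{}, ctx.Err()
		}
	}
	return ExecutionResult{}, errors.New("channels closed without a result")
}

// demo waits for one execution with a one-second timeout.
func demo(id string) (ExecutionResult, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return awaitResult(ctx, fakeExecutor{exitCodes: map[string]int{"job-1": 0}}, id)
}

func main() {
	fmt.Println(demo("job-1"))
	fmt.Println(demo("missing"))
}
```

Setting a drained channel variable to `nil` removes it from the `select`, which avoids mistaking a closed channel for a real result or a nil error.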
Cancel
signature: Cancel(ctx context.Context, executionID string) -> error
input #1: Go context
input #2: executor.ExecutionRequest.ExecutionID
output: error
Cancel
attempts to terminate an ongoing execution identified by its executionID
. It returns an error if the execution does not exist or is already in a terminal state.
GetLogStream
signature: GetLogStream(ctx context.Context, request executor.LogStreamRequest, executionID string) -> (io.ReadCloser, error)
input #1: Go context
input #2: executor.LogStreamRequest
input #3: executor.ExecutionRequest.ExecutionID
output #1: io.ReadCloser
output #2: error
GetLogStream
provides a stream of output for an ongoing or completed execution identified by its executionID
. There are two flags that can be used to modify the functionality:
The Tail
flag indicates whether to exclude historical data or not.
The follow
flag indicates whether the stream should continue to send data as it is produced.
It returns an io.ReadCloser
object to read the output stream and an error if the operation fails. Specifically, it will return an error if the execution does not exist.
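Reading the returned stream is ordinary `io.ReadCloser` handling. The sketch below assumes line-oriented log output and uses a hypothetical `fakeLogStream` helper in place of a real GetLogStream call; the important part is that the caller closes the stream when done.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// fakeLogStream stands in for the io.ReadCloser returned by GetLogStream.
func fakeLogStream(text string) io.ReadCloser {
	return io.NopCloser(strings.NewReader(text))
}

// readLines consumes the stream line by line and closes it afterwards.
func readLines(stream io.ReadCloser) ([]string, error) {
	defer stream.Close()

	var lines []string
	scanner := bufio.NewScanner(stream)
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	return lines, scanner.Err()
}

func main() {
	lines, err := readLines(fakeLogStream("starting job\njob finished\n"))
	if err != nil {
		fmt.Println("reading logs failed:", err)
		return
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

With a following stream the loop would keep running until the producer closes the stream, so a real caller would usually process each line as it arrives instead of collecting them all.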
types.ExecutionRequest
: This is the input that executor
receives to initiate a job execution.
types.ExecutionResult
: This contains the result of the job execution.
executor.LogStreamRequest
: This contains input parameters sent to the executor
to get job execution logs.
types.SpecConfig
: This allows arbitrary configuration/parameters as needed during implementation of specific executor.
types.Resources
: This contains resources to be used for execution.
storage.StorageVolume
: This contains parameters of storage volume used during execution.
Unit tests are defined in subpackages which implement the interface defined in this package.
List of issues
All issues that are related to the implementation of executor
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:49.889499 File source: link on GitLab
This sub-package contains functionality including drivers and api for the Firecracker executor.
Here is a quick overview of the contents of this package:
README: Current file which is aimed towards developers who wish to use and modify the Firecracker functionality.
client: This file provides a high level wrapper around the Firecracker library.
executor: This is the main implementation of the executor interface for Firecracker. It is the entry point of the sub-package. It is intended to be used as a singleton.
handler: This file contains a handler implementation to manage the lifecycle of a single job.
init: This file is responsible for initialization of the package. Currently it only initializes a logger to be used throughout the sub-package.
types: This file contains Models that are specifically related to the Firecracker executor. Mainly it contains the engine spec model that describes a Firecracker job.
Files with *_test.go
suffix contain unit tests for the functionality in corresponding file.
Source
Rendered from source file
Below methods have been implemented in this package:
NewExecutor
signature: NewExecutor(_ context.Context, id string) -> (executor.firecracker.Executor, error)
input #1: Go context
input #2: identifier of the executor
output (success): Executor instance of type executor.firecracker.Executor
output (error): error
NewExecutor
function initializes a new Executor instance for Firecracker VMs.
It is expected that NewExecutor
would be called prior to calling any other executor functions. The Executor instance returned would then be used to call other functions like Start
, Stop
etc.
Start
For function signature refer to the package readme
Start
function begins the execution of a request by starting a Firecracker VM. It creates the VM based on the configuration parameters provided in the execution request. It returns an error message if
execution is already started
execution is already finished
there is a failure in the creation of a new VM
Wait
For function signature refer to the package readme
Wait
initiates a wait for the completion of a specific execution using its executionID
. The function returns two channels: one for the result and another for any potential error.
If the executionID
is not found, an error is immediately sent to the error channel.
Otherwise, an internal goroutine is spawned to handle the asynchronous waiting. The entity calling should use the two returned channels to wait for the result of the execution or an error. If there is a cancellation request (context is done) before completion, an error is relayed to the error channel. When the execution is finished, both the channels are closed.
Cancel
For function signature refer to the package readme
Cancel
tries to terminate an ongoing execution identified by its executionID
. It returns an error if the execution does not exist.
Run
For function signature refer to the package readme
Run
initiates and waits for the completion of an execution in one call. This method serves as a higher-level convenience function that internally calls Start
and Wait
methods. It returns the result of the execution as executor.ExecutionResult
type.
It returns an error in case of:
failure in starting the VM
failure in waiting
context is cancelled
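The way Run composes Start and Wait can be sketched with an in-memory executor. Everything below is hypothetical: `memExecutor`, the reduced request and result types, and the fact that an execution "finishes" immediately are assumptions made to keep the example self-contained. No Firecracker VM is involved, and unlike the behaviour described above the channels are not closed.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Reduced, hypothetical stand-ins for the executor request and result types.
type ExecutionRequest struct{ ExecutionID string }
type ExecutionResult struct{ ExitCode int }

// memExecutor tracks executions in memory instead of starting real VMs.
type memExecutor struct {
	mu      sync.Mutex
	results map[string]chan ExecutionResult
}

func newMemExecutor() *memExecutor {
	return &memExecutor{results: make(map[string]chan ExecutionResult)}
}

// Start registers the execution and rejects an ID that was already used.
func (e *memExecutor) Start(_ context.Context, req ExecutionRequest) error {
	e.mu.Lock()
	defer e.mu.Unlock()
	if _, exists := e.results[req.ExecutionID]; exists {
		return errors.New("execution already started or finished: " + req.ExecutionID)
	}
	done := make(chan ExecutionResult, 1)
	e.results[req.ExecutionID] = done
	go func() { done <- ExecutionResult{ExitCode: 0} }() // pretend the job ran
	return nil
}

// Wait returns a result channel and an error channel for the execution.
func (e *memExecutor) Wait(ctx context.Context, id string) (<-chan ExecutionResult, <-chan error) {
	resCh := make(chan ExecutionResult, 1)
	errCh := make(chan error, 1)
	e.mu.Lock()
	done, ok := e.results[id]
	e.mu.Unlock()
	if !ok {
		errCh <- errors.New("execution not found: " + id)
		return resCh, errCh
	}
	go func() { // asynchronous waiting, as described for Wait
		select {
		case res := <-done:
			resCh <- res
		case <-ctx.Done():
			errCh <- ctx.Err()
		}
	}()
	return resCh, errCh
}

// Run is the convenience wrapper: Start, then block on Wait.
func (e *memExecutor) Run(ctx context.Context, req ExecutionRequest) (ExecutionResult, error) {
	if err := e.Start(ctx, req); err != nil {
		return ExecutionResult{}, err
	}
	resCh, errCh := e.Wait(ctx, req.ExecutionID)
	select {
	case res := <-resCh:
		return res, nil
	case err := <-errCh:
		return ExecutionResult{}, err
	case <-ctx.Done():
		return ExecutionResult{}, ctx.Err()
	}
}

// runOnce runs a single execution with a background context.
func runOnce(e *memExecutor, id string) (ExecutionResult, error) {
	return e.Run(context.Background(), ExecutionRequest{ExecutionID: id})
}

func main() {
	e := newMemExecutor()
	fmt.Println(runOnce(e, "job-1"))
	fmt.Println(runOnce(e, "job-1")) // same ID again: Start refuses
}
```

The three error sources listed above map onto the three exits of `Run`: a failed `Start`, a value on the error channel, and a cancelled context.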
Cleanup
signature: Cleanup(ctx context.Context) -> error
input: Go context
output (success): None
output (error): error
Cleanup
removes all Firecracker resources associated with the executor. This includes stopping and removing all running VMs and deleting their socket paths. It returns an error if it is unable to remove the containers.
executor.firecracker.Executor
: This is the instance of the executor created by NewExecutor
function. It contains the firecracker client and other resources required to execute requests.
executor.firecracker.executionHandler
: This contains necessary information to manage the execution of a firecracker VM.
Refer to package readme for other data types.
Unit tests for each functionality are defined in files with *_test.go
naming convention.
List of issues
All issues that are related to the implementation of executor
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:51.703672 File source: link on GitLab
This directory contains utility scripts for build and development assistance and for runtime. It is included in the final build.
There is a systemd unit file in the Debian package, but it is not enabled or started by the post-installation script because the DMS daemon requires a passphrase on run
. It is possible to assign the passphrase to the environment variable $DMS_PASSPHRASE and add it to the unit file, but that is not recommended.
Note: let's see if this functionality can be split into the relevant packages, leaving only the functionality that cannot be moved elsewhere.
There are three files/subdirectories in this directory. Here is what they are for:
nunet-dms/
: This is a template directory used to build the .deb file. This directory is used by build.sh
to write the binary file to nunet-dms_$version_$arch/usr/bin
, update the architecture and version in the control file, and then build a .deb file out of the directory.
build.sh
: This script is intended to be used by the CI/CD server. It creates .deb packages for amd64
and arm64
.
clean.sh
: This script is intended to be used by developers. For a normal uninstall, use apt remove nunet-dms
instead. Use this clean script only if an installation is left broken.
golang
is required to build the nunet binary
dpkg-deb
is required to build the debian package
The build is meant to be invoked from the root of the project. Before building locally, comment out the publish command in the build script: it is intended to be called from a GitLab CI environment and will fail locally.
A build can be invoked by:
Last updated: 2025-01-23 01:10:50.401479 File source: link on GitLab
This package contains all code that is very specific to the whole of the dms, which will not be imported by any other packages and used only on the running instance of dms (like config and background task).
Here is a quick overview of the contents of this package:
README: Current file which is aimed towards developers who wish to use and modify the package functionality.
init: This file handles controlled shutdown and initializes OpenTelemetry-based Zap logger.
websocket: This file contains communication protocols for a websocket server including message handling and command execution.
subpackages
config: This sub-package contains the configuration related data for the whole dms.
background_tasks: This sub-package contains functionality that runs in the background.
Source
Rendered from source file
TBD
internal.WebSocketConnection
internal.Command
Note: The data types are expected to change during refactoring of DMS
TBD
List of issues
All issues that are related to the implementation of internal
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:50.668670 File source: link on GitLab
The background_tasks
package is an internal package responsible for managing background jobs within DMS. It contains a scheduler that registers tasks and runs them according to the schedule defined by the task definition.
proposed
Other packages that have their own background tasks register through this package:
Registration
The task itself, the arguments it needs
priority
event (time period or other event to trigger task)
Start , Stop, Resume
Algorithm that accounts for the event and priority of the task (not yet clear)
Monitor resource usage of tasks (not yet clear)
Here is a quick overview of the contents of this package:
README: Current file which is aimed towards developers who wish to use and modify the package functionality.
init: This file initializes OpenTelemetry-based Zap logger.
scheduler: This file defines a background task scheduler that manages task execution based on triggers, priority, and retry policies.
task: This file contains background task structs and their properties.
trigger: This file defines various trigger types (PeriodicTrigger, EventTrigger, OneTimeTrigger) for background tasks, allowing execution based on time intervals, cron expressions, or external events
Files with *_test.go
naming convention contain unit tests of the functionality in corresponding file.
Source
background_tasks class diagram
Rendered from source file
NewScheduler
signature: NewScheduler(maxRunningTasks int) *Scheduler
input: maximum number of running tasks
output: internal.background_tasks.Scheduler
NewScheduler
function creates a new scheduler which takes maxRunningTasks
argument to limit the maximum number of tasks to run at a time.
Scheduler methods
Scheduler
struct is the orchestrator that manages and runs the tasks. If the Scheduler
task queue is full, remaining tasks that are triggered will wait until there is a slot available in the scheduler.
It has the following methods:
AddTask
signature: AddTask(task *Task) *Task
input: internal.background_tasks.Task
output: internal.background_tasks.Task
AddTask
registers a task to be run when triggered.
RemoveTask
signature: RemoveTask(taskID int)
input: identifier of the Task
output: None
RemoveTask
removes a task from the scheduler. Tasks with only OneTimeTrigger will be removed automatically once run.
Start
signature: Start()
input: None
output: None
Start
starts the scheduler to monitor tasks.
Stop
signature: Stop()
input: None
output: None
Stop
stops the scheduler.
runTask
signature: runTask(taskID int)
input: identifier of the Task
output: None
runTask
executes a task and manages its lifecycle and retry policy.
runTasks
signature: runTasks()
input: None
output: None
runTasks
checks and runs tasks based on their triggers and priority.
runningTasksCount
signature: runningTasksCount() int
input: None
output: number of running tasks
runningTasksCount
returns the count of running tasks.
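The cap on concurrently running tasks described for the Scheduler, where triggered tasks wait until a slot is free, can be illustrated with a buffered channel used as a semaphore. This is a sketch of the general technique, not the scheduler's actual code: the `runLimited` function and its peak counter are hypothetical and exist only to make the limit observable.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runLimited runs every task but never more than maxRunning at once.
// It returns the highest number of tasks observed running concurrently.
func runLimited(maxRunning int, tasks []func()) int32 {
	slots := make(chan struct{}, maxRunning) // one token per running task
	var wg sync.WaitGroup
	var running, peak int32

	for _, task := range tasks {
		slots <- struct{}{} // blocks while all slots are taken
		wg.Add(1)
		go func(run func()) {
			defer wg.Done()
			defer func() { <-slots }() // free the slot when the task ends

			n := atomic.AddInt32(&running, 1)
			for { // record a new peak if this is the busiest moment so far
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			run()
			atomic.AddInt32(&running, -1)
		}(task)
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	tasks := make([]func(), 5)
	for i := range tasks {
		id := i
		tasks[i] = func() { fmt.Println("running task", id) }
	}
	peak := runLimited(2, tasks)
	fmt.Println("peak concurrency:", peak)
}
```

Because a slot is acquired before the goroutine is started and released only after the running counter is decremented, the observed peak can never exceed `maxRunning`.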
Trigger Interface
Its methods are explained below:
IsReady
signature: IsReady() bool
input: None
output: bool
IsReady
should return true if the task should be run.
Reset
signature: Reset()
input: None
output: None
Reset
resets the trigger until the next event happens.
There are different implementations for the Trigger
interface.
PeriodicTrigger
: Defines a trigger based on a duration interval or a cron expression.
EventTrigger
: Defines a trigger that is set by a trigger channel.
OneTimeTrigger
: A trigger that is only triggered once after a set delay.
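The Trigger contract (IsReady reports whether the task should run, Reset re-arms the trigger) can be sketched with two simplified implementations. The types below are assumptions written for this example and only mirror the names used above: cron expressions and channel-based event triggers are left out, and the injectable clock is a device to make the example deterministic.

```go
package main

import (
	"fmt"
	"time"
)

// Trigger mirrors the two methods described above.
type Trigger interface {
	IsReady() bool
	Reset()
}

// fakeClock is a manually advanced clock, used instead of time.Now.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time          { return c.t }
func (c *fakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

// PeriodicTrigger is ready once Interval has elapsed since the last reset.
type PeriodicTrigger struct {
	Interval time.Duration
	now      func() time.Time
	last     time.Time
}

func NewPeriodicTrigger(interval time.Duration, now func() time.Time) *PeriodicTrigger {
	return &PeriodicTrigger{Interval: interval, now: now, last: now()}
}

func (p *PeriodicTrigger) IsReady() bool { return p.now().Sub(p.last) >= p.Interval }
func (p *PeriodicTrigger) Reset()        { p.last = p.now() }

// OneTimeTrigger is ready once after Delay and never again after Reset.
type OneTimeTrigger struct {
	Delay time.Duration
	now   func() time.Time
	start time.Time
	fired bool
}

func NewOneTimeTrigger(delay time.Duration, now func() time.Time) *OneTimeTrigger {
	return &OneTimeTrigger{Delay: delay, now: now, start: now()}
}

func (o *OneTimeTrigger) IsReady() bool { return !o.fired && o.now().Sub(o.start) >= o.Delay }
func (o *OneTimeTrigger) Reset()        { o.fired = true }

// demo records IsReady for both triggers at four points in time.
func demo() []bool {
	clock := &fakeClock{t: time.Unix(0, 0)}
	var periodic Trigger = NewPeriodicTrigger(10*time.Second, clock.Now)
	var once Trigger = NewOneTimeTrigger(5*time.Second, clock.Now)

	var seen []bool
	seen = append(seen, periodic.IsReady(), once.IsReady()) // nothing elapsed yet
	clock.Advance(10 * time.Second)
	seen = append(seen, periodic.IsReady(), once.IsReady()) // both due
	periodic.Reset()
	once.Reset()
	seen = append(seen, periodic.IsReady(), once.IsReady()) // just reset
	clock.Advance(10 * time.Second)
	seen = append(seen, periodic.IsReady(), once.IsReady()) // only periodic fires again
	return seen
}

func main() {
	fmt.Println(demo())
}
```

A scheduler loop would call `IsReady` on each task's triggers, run the task when one is ready, and then call `Reset` on that trigger.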
internal.background_tasks.Scheduler
internal.background_tasks.RetryPolicy
internal.background_tasks.Execution
internal.background_tasks.Task
Task is a struct that defines a job. It includes the task's ID, Name, the function that is going to be run, the arguments for the function, the triggers that trigger the task to run, retry policy, etc.
internal.background_tasks.PeriodicTrigger
internal.background_tasks.EventTrigger
internal.background_tasks.OneTimeTrigger
Unit tests for each functionality are defined in files with *_test.go
naming convention.
List of issues
All issues that are related to the implementation of internal
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:49.605802 File source: link on GitLab
This sub-package contains functionality including drivers and api for the Docker executor.
Here is a quick overview of the contents of this package:
README: Current file which is aimed towards developers who wish to use and modify the docker functionality.
client: This file provides a high level wrapper around the docker library.
executor: This is the main implementation of the executor interface for docker. It is the entry point of the sub-package. It is intended to be used as a singleton.
handler: This file contains a handler implementation to manage the lifecycle of a single job.
init: This file is responsible for initialization of the package. Currently it only initializes a logger to be used throughout the sub-package.
types: This file contains Models that are specifically related to the docker executor. Mainly it contains the engine spec model that describes a docker job.
Files with *_test.go
suffix contain unit tests for the functionality in corresponding file.
Source
Rendered from source file
Below methods have been implemented in this package:
NewExecutor
signature: NewExecutor(ctx context.Context, id string) -> (executor.docker.Executor, error)
input #1: Go context
input #2: identifier of the executor
output (success): Executor instance of type executor.docker.Executor
output (error): error
NewExecutor
function initializes a new Executor instance with a Docker client. It returns an error if Docker client initialization fails.
It is expected that NewExecutor
would be called prior to calling any other executor functions. The Executor instance returned would then be used to call other functions like Start
, Stop
etc.
Start
For function signature refer to the package readme
Start
function begins the execution of a request by starting a Docker container. It creates the container based on the configuration parameters provided in the execution request. It returns an error message if
container is already started
container execution is finished
there is a failure in the creation of a new container
Wait
For function signature refer to the package readme
Wait
initiates a wait for the completion of a specific execution using its executionID
. The function returns two channels: one for the result and another for any potential error.
If the executionID
is not found, an error is immediately sent to the error channel.
Otherwise, an internal goroutine is spawned to handle the asynchronous waiting. The entity calling should use the two returned channels to wait for the result of the execution or an error. If there is a cancellation request (context is done) before completion, an error is relayed to the error channel. When the execution is finished, both the channels are closed.
Cancel
For function signature refer to the package readme
Cancel
tries to terminate an ongoing execution identified by its executionID
. It returns an error if the execution does not exist.
GetLogStream
For function signature refer to the package readme
GetLogStream
provides a stream of output logs for a specific execution. Parameters tail
and follow
specified in executor.LogStreamRequest
provided as input control whether to include past logs and whether to keep the stream open for new logs, respectively.
It returns an error if the execution is not found.
Run
For function signature refer to the package readme
Run
initiates and waits for the completion of an execution in one call. This method serves as a higher-level convenience function that internally calls Start
and Wait
methods. It returns the result of the execution as executor.ExecutionResult
type.
It returns an error in case of:
failure in starting the container
failure in waiting
context is cancelled
ConfigureHostConfig
signature: configureHostConfig(vendor types.GPUVendor, params *types.ExecutionRequest, mounts []mount.Mount) container.HostConfig
input #1: GPU vendor (types.GPUVendor)
input #2: Execution request parameters (types.ExecutionRequest)
input #3: List of container mounts ([]mount.Mount)
output: Host configuration for the Docker container (container.HostConfig)
The configureHostConfig
function sets up the host configuration for the container based on the GPU vendor and resources requested by the execution. It supports configurations for different types of GPUs and CPUs.
The function performs the following steps:
NVIDIA GPUs:
Configures the DeviceRequests
to include all GPUs specified in the execution request.
Sets the memory and CPU resources according to the request parameters.
AMD GPUs:
Binds the necessary device paths (/dev/kfd
and /dev/dri
) to the container.
Adds the video
group to the container.
Sets the memory and CPU resources according to the request parameters.
Intel GPUs:
Binds the /dev/dri
directory to the container, exposing all Intel GPUs.
Sets the memory and CPU resources according to the request parameters.
Default (CPU-only):
Configures the container with memory and CPU resources only, without any GPU-specific settings.
The function ensures that the appropriate resources and device paths are allocated to the container based on the available and requested GPU resources.
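The vendor-specific branching can be sketched as a single switch. The types below are simplified, hypothetical stand-ins for types.GPUVendor and Docker's container.HostConfig, and the constant names and the host:container bind format are assumptions made for illustration only.

```go
package main

import "fmt"

// GPUVendor is a hypothetical stand-in for types.GPUVendor.
type GPUVendor string

const (
	VendorNVIDIA GPUVendor = "nvidia"
	VendorAMD    GPUVendor = "amd"
	VendorIntel  GPUVendor = "intel"
	VendorNone   GPUVendor = "" // CPU-only
)

// hostConfig is a reduced, hypothetical stand-in for container.HostConfig.
type hostConfig struct {
	MemoryBytes int64
	NanoCPUs    int64
	GPUIDs      []string // device requests (NVIDIA)
	Binds       []string // device paths bound into the container
	GroupAdd    []string // extra groups for the container process
}

// configureHost applies memory and CPU limits for every vendor and adds the
// GPU-specific settings described above.
func configureHost(vendor GPUVendor, memBytes, nanoCPUs int64, gpuIDs []string) hostConfig {
	cfg := hostConfig{MemoryBytes: memBytes, NanoCPUs: nanoCPUs}
	switch vendor {
	case VendorNVIDIA:
		cfg.GPUIDs = append(cfg.GPUIDs, gpuIDs...)
	case VendorAMD:
		cfg.Binds = []string{"/dev/kfd:/dev/kfd", "/dev/dri:/dev/dri"}
		cfg.GroupAdd = []string{"video"}
	case VendorIntel:
		cfg.Binds = []string{"/dev/dri:/dev/dri"}
	}
	// Any other vendor falls through to the CPU-only configuration.
	return cfg
}

func main() {
	fmt.Printf("%+v\n", configureHost(VendorAMD, 1<<30, 2_000_000_000, nil))
}
```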
signature: Cleanup(ctx context.Context) -> error
input: Go context
output (success): None
output (error): error
Cleanup
removes all Docker resources associated with the executor. This includes containers, networks, and volumes carrying the executor's label. It returns an error if it is unable to remove the containers.
executor.docker.Executor
: This is the instance of the executor created by NewExecutor
function. It contains the Docker client and other resources required to execute requests.
executor.docker.executionHandler
: This contains necessary information to manage the execution of a docker container.
Refer to package readme for other data types.
Unit tests for each functionality are defined in files with *_test.go
naming convention.
List of issues
All issues that are related to the implementation of executor
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:53.077670 File source: link on GitLab
The tracing.go
file is part of an observability framework for the DMS, integrating Elastic APM (Application Performance Monitoring) to track and monitor system performance and trace the flow of operations. Here’s an explanation of the key components and functionalities:
Elastic APM Setup: Initialize, manage, and shut down the Elastic APM tracer for monitoring and tracing operations in the system.
System Metrics: The code collects key system metrics (CPU, memory, disk, etc.) and sends them as custom metrics to the APM server.
Tracing Operations: The StartTrace
function enables the tracing of operations with a name, logging the start and end of operations along with detailed performance metrics (e.g., execution time).
Transaction and Span Management: Transactions and spans are created to trace and measure specific operations in the service, such as HTTP requests or background tasks.
No-Op Mode: If tracing is disabled (in "no-op" mode), no data is sent to the APM server.
The observability.go
file contains various functions and configurations for setting up observability in the application, including logging, tracing, event handling, and Elasticsearch logging integration. Below is a breakdown of the key elements and explanations of how they work together:
Logger: The logging system is configured with multiple outputs (console, file, and Elasticsearch). It uses zap with different zapcore components to handle logging.
Elasticsearch Syncer: Buffered writing to Elasticsearch is implemented using a custom syncer, which also includes a short “preflight” check to gracefully disable ES logging if credentials or connectivity fail.
APM Tracing: Integration with Elastic APM allows for tracing of application events.
Event Bus: Facilitates the emission and listening of custom events.
DID: Provides a unique identifier for logs and events, helping correlate them across services.
When the system starts, Initialize()
sets up the logger, event bus, and APM tracing based on the configuration.
The logger is initialized with a set of cores, including console, file, and potentially Elasticsearch logging.
A short preflight check is performed for Elasticsearch to ensure connectivity and valid credentials, helping avoid indefinite blocking.
If Elasticsearch fails (e.g., invalid key, unreachable), logging gracefully falls back to console and file logs.
Tracing is activated if an APM server URL is provided, and traces are tied to the application’s operations.
Event bus integration allows for custom events to be emitted and handled asynchronously.
The system metrics (CPU, memory, disk, etc.) are collected and, if tracing is enabled, sent to the APM server. Otherwise, they’re skipped in no-op mode.
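The preflight-and-fallback behaviour for Elasticsearch logging could look roughly like the sketch below. It assumes a hypothetical check function that probes connectivity and credentials. The function names, the output names and the timeout handling are illustrative and are not taken from observability.go.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnreachable is a hypothetical error used by the demo checks.
var errUnreachable = errors.New("elasticsearch unreachable")

// preflightOK runs check in its own goroutine and gives up after timeout, so a
// hanging Elasticsearch endpoint cannot block start-up indefinitely.
func preflightOK(check func() error, timeout time.Duration) bool {
	done := make(chan error, 1)
	go func() { done <- check() }()
	select {
	case err := <-done:
		return err == nil
	case <-time.After(timeout):
		return false
	}
}

// logOutputs returns the names of the log outputs to enable. Console and file
// are always present; Elasticsearch is added only if the preflight succeeds.
func logOutputs(esConfigured bool, check func() error, timeoutMs int) []string {
	outputs := []string{"console", "file"}
	if esConfigured && preflightOK(check, time.Duration(timeoutMs)*time.Millisecond) {
		outputs = append(outputs, "elasticsearch")
	}
	return outputs
}

func main() {
	ok := func() error { return nil }
	bad := func() error { return errUnreachable }
	fmt.Println(logOutputs(true, ok, 100))
	fmt.Println(logOutputs(true, bad, 100))
}
```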
Last updated: 2025-01-23 01:10:53.597661 File source: link on GitLab
The whole package is TBD
This package will contain defined entry points and specs for third party plugins, registration and execution of plugin code
TBD
The class diagram for the plugins sub-package is shown below.
Source file
Rendered from source file
TBD
TBD
TBD
TBD
TBD
Last updated: 2025-01-23 01:10:53.874048 File source:
Context / links:
The goal of the process is to expose all developed functionalities with frozen scope, as described above, to rigorous internal and community testing via a succession of release candidate builds.
We will operate on two branches: main
(for trunk development) and release
(for minor / final releases and patches).
Interim builds between scheduled release candidates will be tagged with suffix -p{patch_number}
, e.g. v0.5.0-rc1-p1
, etc.
Feature scope freeze
Specification and documentation system launch
Project management portal launch
Test management system launch (feature environment, CI/CD pipeline and QA visibility)
Release candidates testing and delivering full feature scope
The currently scheduled release candidates are:
v0.5.0-boot
-- bootstrap release;
v0.5.0-rc1
-- first release candidate;
v0.5.0-rc2
-- second release candidate;
v0.5.0-rc3
-- third release candidate;
v0.5.0-rc4
-- fourth release candidate;
v0.5.0
-- final release
Feature scope freeze
Specification and documentation portal launch
NuNet test management system consists of three main components, which have been developed for months now and are close to completion, but need to be finally coordinated and aligned in the preparation for the start of the DMS v0.5 release process and release candidate testing internally as well as via the community contributions. The components of the test management system are:
CI/CD pipeline, which publishes all testing artifacts and reports and makes them available for further publishing; these are used by developers and QA engineers and shall be exposed publicly to community developers;
Feature environment features a small network of geographically distributed virtual machines connected via public internet, and allows for the execution of selected CI/CD pipeline stages automatically on heterogenous hardware environments -- testing functionality of the fully built DMS; Most of the acceptance tests of the frozen feature scope will be running via the
QA management portal is a web portal (Testmo) which exposes all test artifacts via a single interface and provides all the information the NuNet QA team needs to see which tests are passing / failing for every build of the DMS.
All three components are tightly coupled and will be internally released in quick succession, the last week of August - first week of September, targeting finalizing the test management system launch at the end of first week of September, so that the quality of release candidates can be fully validated by QA team.
Both feature environment and CI/CD process are instrumental to reach release level quality of the developed features. Both have been in development for months prior to current moment and are close to completion. Both are moving targets as they will develop together with the platform. Nevertheless, we aim to launch them as described above in the first half of August.
Feature environment
CI/CD pipeline
Test management
Specification and documentation system
Project management portal
Actor model with Object/Security Capabilities
Object/Security Capabilities (UCAN)
Dynamic method dispatch/invocation
Local access (API and CMD)
Local database interface and implementation
Executor interface and implementation
Machine benchmarking
p2p network and routing
Storage interface
IP over Libp2p
Observability and Telemetry
Definition of compute workflows/recursive jobs
Job deployment and orchestration model
Hardware capability model
Supervision model
Tokenomics interface
Last updated: 2025-01-23 01:10:54.919170 File source:
This sub package is an implementation of StorageProvider
interface for S3 storage.
Here is a quick overview of the contents of this package:
The class diagram for the s3
sub-package is shown below.
Source file
Rendered from source file
NewClient
signature: NewClient(config aws.Config, volController storage.VolumeController) (*S3Storage, error)
input #1: AWS configuration object
from AWS SDK
input #2: Volume Controller object
output (success): new instance of type storage.s3.S3Storage
output (error): error
NewClient
is a constructor method. It creates a new instance of storage.s3.S3Storage
struct.
Upload
Upload
uploads all files (recursively) from a local volume to an S3 bucket. It returns an error when
there is an error in decoding input specs
there is an error in file system operations (e.g. opening files on the local file system)
there are errors from the AWS SDK during the upload process
Download
Download
fetches files from a given S3 bucket. It can handle folders, with the exception of x-directory. It depends on the file system provided by storage.VolumeController
. It will return an error when
there is an error in decoding input specs
there is an error in creating a storage volume
there is an issue in determining the target S3 objects based on the provided key
there are errors in downloading individual S3 objects
there are issues in creating directories or writing files to the local file system
the storage.VolumeController
fails to lock the newly created storage volume
Size
Size
determines the size of an object stored in an S3 bucket.
It will return an error when
there is an error in decoding input specs
there is an issue in AWS API call to retrieve the object size
storage.s3.S3Storage
: TBD
storage.s3.s3Object
: TBD
storage.s3.S3InputSource
: TBD
Refer to package readme for other data types.
The various unit tests for the package functionality are defined in s3_test.go
file.
List of issues
All issues that are related to the implementation of storage
package can be found below. These include any proposals for modifications to the package or new data structures needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:55.446379 File source:
This package contains any tests that are not unit tests (at least this is how it was defined when Obsidian built these tests)
Note: position these tests as per our test-matrix and triggers in to the correct place attn: @gabriel
Note by Dagim: suggest if security.go is deleted entirely with utils/cardano because those tests do not apply to the new dms.
./test
directory of the package contains full test suite of DMS.
Here is a quick overview of the contents of this package:
Run CLI Test Suite
This command will run the Command Line Interface Test suite inside the ./test
directory: go test -run
Run Security Test Suite
This command will run the Security Test suite inside the ./test
directory: go test -ldflags="-extldflags=-Wl,-z,lazy" -run=TestSecurity
Run all tests
This command will run all tests from root directory: sh run_all.sh
After developing a new test suite or test, make sure it is properly included, with appropriate flags and parameters, in the run_all.sh
file.
Last updated: 2025-01-23 01:10:54.561115 File source:
This sub package offers a default implementation of the volume controller.
Here is a quick overview of the contents of this package:
The class diagram for the basic_controller
sub-package is shown below.
Source file
Rendered from source file
NewDefaultVolumeController
signature: NewDefaultVolumeController(db *gorm.DB, volBasePath string, fs afero.Fs) -> (storage.basic_controller.BasicVolumeController, error)
input #1: local database instance of type *gorm.DB
input #2: base path of the volumes
input #3: file system instance of type afero.Fs
output (success): new instance of type storage.basic_controller.BasicVolumeController
output (error): error
NewDefaultVolumeController
returns a new instance of storage.basic_controller.BasicVolumeController
struct.
BasicVolumeController
is the default implementation of the VolumeController
interface. It persists storage volumes information in the local database.
CreateVolume
CreateVolume
creates a new storage volume given a storage source (S3, IPFS, job, etc). The creation of a storage volume effectively creates an empty directory in the local filesystem and writes a record in the database.
The directory name follows the format: <volSource> + "-" + <name>
where name
is random.
CreateVolume
will return an error if there is a failure in
creation of new directory
creating a database entry
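The directory naming scheme can be sketched as follows. This is only an illustration: it uses the standard os package instead of afero, leaves out the database record, and the length and encoding of the random name are assumptions (the text only says the name is random).

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// randomName returns a random hex string. Length and encoding are assumptions.
func randomName() (string, error) {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// volumeDirName applies the <volSource> + "-" + <name> format.
func volumeDirName(volSource, name string) string {
	return volSource + "-" + name
}

// createVolumeDir creates an empty directory for a new volume under basePath
// and returns its path.
func createVolumeDir(basePath, volSource string) (string, error) {
	name, err := randomName()
	if err != nil {
		return "", fmt.Errorf("generating volume name: %w", err)
	}
	dir := filepath.Join(basePath, volumeDirName(volSource, name))
	if err := os.Mkdir(dir, 0o700); err != nil {
		return "", fmt.Errorf("creating volume directory: %w", err)
	}
	return dir, nil
}

func main() {
	base, err := os.MkdirTemp("", "volumes")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(base)

	dir, err := createVolumeDir(base, "s3")
	fmt.Println(dir, err)
}
```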
LockVolume
LockVolume
makes the volume read-only, changing not only the field value but also the file permissions. It should be used after all necessary data has been written to the volume. Optionally, it can also set the CID and mark the volume as private.
LockVolume
will return an error when
No storage volume is found at the specified path
There is an error in saving the updated volume in the database
There is an error in updating file permissions
DeleteVolume
DeleteVolume
deletes a given storage volume record from the database. The identifier can be a path of a volume or a Content ID (CID). Therefore, records for both will be deleted.
It will return an error when
Input has incorrect identifier
There is failure in deleting the volume
No volume is found
ListVolumes
ListVolumes
function returns a list of all storage volumes stored on the database.
It will return an error when no storage volumes exist.
GetSize
GetSize
returns the size of a volume. The input can be a path or a Content ID (CID).
It will return an error if the operation fails due to:
error while querying database
volume not found for given identifier
unsupported identifier provided as input
error while calculating size of directory
Custom configuration Parameters
Both CreateVolume
and LockVolume
allow for custom configuration of storage volumes via optional parameters. Below is the list of available parameters that can be used:
WithPrivate()
- Passing this as an input parameter designates a given volume as private. It can be used both when creating or locking a volume.
WithCID(cid string)
- This can be used as an input parameter to set the CID of a given volume during the lock volume operation.
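These parameters follow Go's functional options pattern. The sketch below shows the idea with a reduced, hypothetical volume struct. For brevity it uses one option type for both operations, whereas the package distinguishes between CreateVolOpt and LockVolOpt.

```go
package main

import "fmt"

// volume is a reduced, hypothetical stand-in for storage.StorageVolume.
type volume struct {
	Path     string
	CID      string
	Private  bool
	ReadOnly bool
}

// volOpt is a function that modifies a volume (simplified: a single type is
// used here for both creation and locking).
type volOpt func(*volume)

// WithPrivate marks the volume as private.
func WithPrivate() volOpt {
	return func(v *volume) { v.Private = true }
}

// WithCID sets the CID of the volume.
func WithCID(cid string) volOpt {
	return func(v *volume) { v.CID = cid }
}

// lockVolume applies the options and then marks the volume read-only.
// File permission changes and database updates are left out of this sketch.
func lockVolume(v *volume, opts ...volOpt) {
	for _, opt := range opts {
		opt(v)
	}
	v.ReadOnly = true
}

func main() {
	v := &volume{Path: "/volumes/s3-example"}
	lockVolume(v, WithPrivate(), WithCID("example-cid"))
	fmt.Printf("%+v\n", *v)
}
```

Because options are plain functions, new ones can be added later without changing the signature of the method that accepts them.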
storage.basic_controller.BasicVolumeController
: This struct manages implementation of VolumeController
interface methods.
Refer to package readme for other data types.
The unit tests for the package functionality are defined in *_test.go
file.
List of issues
All issues that are related to the implementation of storage
package can be found below. These include any proposals for modifications to the package or new data structures needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:52.789080 File source:
This package implements Network
interface defined in root level network dir.
proposed
Requirements
proposed: @sam
Peer discovery and handshake
Instead of keeping track of all peers, a peer should only stay in touch with peers of its own type in terms of network latency, resources, or uptime.
The reason is that if a low-performing peer is grouped with high-performing peers and a job is distributed among them, it can slow down the other peers and the job overall.
Max number of handshake peers
Different nodes will have different requirements regarding the number of peers that they should remain handshaking with. For example, a small node on a mobile network will not need to maintain a large list of peers, but a node acting as a network load balancer in a data center might.
Filter list
We can have filters that ensure that handshakes only happen with peers that meet certain criteria. The following list is not exhaustive:
Network latency. Have a set of fastest peers.
Resource. Relates to job types.
Uptime. Connect to peers that have been online for a certain period of time.
Network latency
For the network latency part, DMS should also be able to keep a latency table between ongoing jobs on different CPs. The network package should be able to report it to the master node (SP). The orchestrator can then decide whether or not to replace workers.
Thoughts:
Filter peers at the time of discovery. Based on above parameters.
SP/orchestrator specifies what pool of CP it is looking for.
CP connects to same kind of CP.
Can use gossipsub.
Here is a quick overview of the contents of this package:
The class diagram for the libp2p sub-package is shown below.
Source file
Rendered from source file
As soon as DMS starts, and if it is onboarded to the network, libp2p.RunNode
is executed. This sets up everything related to libp2p. Let's walk through it to see what it does.
RunNode calls NewHost
. NewHost in itself does a lot of things. Let's dive into the NewHost:
It then defines a multiaddr filter which is used to deny discovery on the local network. This was added to stop scanning the local network in a data center.
NewHost then sets various options for the host and passes them to libp2p.New to create a new host. Options such as NAT traversal are configured at this point.
Getting back to RunNode, it calls p2p.BootstrapNode(ctx)
. Bootstrapping is essentially connecting to the initial peers.
Then the function continues setting up streams. Streams are bidirectional connections between peers. More on this in the next section. Here is an example of setting up a stream handler on the host for a particular protocol:
After that, we have discoverPeers
.
After that, we have DHT update and get functions to store information about peers in the peerstore.
Streams
Communication between libp2p peers, or more generally between DMS instances, happens using libp2p streams. A DMS can have one or many streams with one or more peers. We have currently adopted the following streams for our use cases.
Ping
Chat
A utility functionality to enable chat between peers.
VPN
Most recent addition to DMS, where we send IP packets through libp2p stream.
File Transfer
File transfer is generally used to carry files from one DMS to another. Most notably used to carry checkpoint files from a job from CP to SP.
Deployment Request (DepReq)
Used for deployment of a job and for getting their progress.
Current DepReq Stream Handler
Each stream needs to have a handler attached to it. Let's get to know more about the deployment request stream handler. The deployment request handler handles incoming deployment requests from the service provider side. Similarly, some function has to listen for updates on the service provider side as well. More on that in a minute.
The following is the sequence of events happening on the compute provider side:
Checks if InboundDepReqStream
variable is set. And if it is, reply to service provider: DepReq open stream length exceeded
. Currently we have only 1 job allowed per dep req stream.
If above is not the case, we go and read from the stream. We are expecting a
Now is the point to set the InboundDepReqStream
to the incoming stream value.
It unmarshals the incoming message into the types.DeploymentRequest
struct. If it can't, it informs the other party about it.
Otherwise, if everything has gone well so far, we check the txHash
value from the depReq and make sure it exists on the blockchain before proceeding. If the txHash is not valid, or validation timed out, we let the other side know.
The final thing we do is put the depReq inside the DepReqQueue
.
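The handler sequence above can be condensed into a small sketch. Everything here is hypothetical scaffolding: the reduced deploymentRequest struct, its tx_hash JSON field, the boolean that stands in for the InboundDepReqStream variable, and the txValid callback that stands in for the blockchain lookup. How the stream variable is reset after a failure is not described in the text and is left out.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// deploymentRequest is a reduced, hypothetical stand-in for types.DeploymentRequest.
type deploymentRequest struct {
	TxHash string `json:"tx_hash"` // field name is an assumption
}

// handler holds the state the steps above rely on.
type handler struct {
	inboundSet bool                   // stands in for the InboundDepReqStream variable
	queue      chan deploymentRequest // stands in for DepReqQueue
	txValid    func(hash string) bool // hypothetical blockchain lookup
}

var errStreamBusy = errors.New("DepReq open stream length exceeded")

// handle processes the bytes already read from the stream. The returned error
// is what would be reported back to the service provider.
func (h *handler) handle(raw []byte) error {
	if h.inboundSet {
		return errStreamBusy // only one job per DepReq stream
	}
	h.inboundSet = true

	var req deploymentRequest
	if err := json.Unmarshal(raw, &req); err != nil {
		return fmt.Errorf("unable to decode deployment request: %w", err)
	}
	if !h.txValid(req.TxHash) {
		return errors.New("invalid or unconfirmed transaction hash")
	}
	h.queue <- req
	return nil
}

func main() {
	h := &handler{
		queue:   make(chan deploymentRequest, 1),
		txValid: func(hash string) bool { return hash != "" },
	}
	fmt.Println(h.handle([]byte(`{"tx_hash":"abc"}`)))
	fmt.Println(h.handle([]byte(`{"tx_hash":"def"}`)))
}
```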
Deployment Request stream handler can further be segmented into different message types:
The above message types are used by various functions inside the stream. The last 4 of the above are handled on the SP side, and further by the websocket server which started the deployment request. This does not mean the CP does not deal with them.
Relation between libp2p and docker modules
When the DepReq stream receives a deployment request, it does some JSON validation and pushes it to DepReqQueue
. This extra step, instead of directly passing the command to the docker package, was added for decoupling and scalability.
There is a messaging.DeploymentWorker()
goroutine which is launched at DMS startup in dms.Run()
.
This messaging.DeploymentWorker()
is the crux of job deployment as it is done in the current version of DMS. Based on the executor type (currently firecracker and docker), the request is passed to specific functions in different modules.
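A queue-draining worker of this kind might look like the sketch below. The job struct, the handler map and the results channel are hypothetical and do not reflect the actual signature of messaging.DeploymentWorker().

```go
package main

import "fmt"

// job is a hypothetical, reduced deployment request.
type job struct {
	ID       string
	Executor string // e.g. "docker" or "firecracker"
}

// deploymentWorker drains the queue and dispatches each job to the handler
// registered for its executor type. It closes results once the queue is closed.
func deploymentWorker(queue <-chan job, handlers map[string]func(job) error, results chan<- error) {
	for j := range queue {
		handle, ok := handlers[j.Executor]
		if !ok {
			results <- fmt.Errorf("job %s: unsupported executor %q", j.ID, j.Executor)
			continue
		}
		results <- handle(j)
	}
	close(results)
}

func main() {
	queue := make(chan job, 2)
	results := make(chan error, 2)
	handlers := map[string]func(job) error{
		"docker":      func(j job) error { fmt.Println("docker runs", j.ID); return nil },
		"firecracker": func(j job) error { fmt.Println("firecracker runs", j.ID); return nil },
	}
	queue <- job{ID: "job-1", Executor: "docker"}
	queue <- job{ID: "job-2", Executor: "wasm"}
	close(queue)

	go deploymentWorker(queue, handlers, results)
	for err := range results {
		fmt.Println("result:", err)
	}
}
```

Putting a queue between the stream handler and the executors means the handler can return quickly, while the worker decides when and where each job runs.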
PeerFilter Interface
PeerFilter
is an interface for filtering peers based on specified criteria.
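As an illustration of the idea, the sketch below defines a hypothetical filter interface with latency and uptime filters and caps the number of selected peers. The method name Allow, the peerInfo fields and the selection function are assumptions. The actual method set of PeerFilter is not shown in this document.

```go
package main

import "fmt"

// peerInfo is a hypothetical record of what is known about a peer.
type peerInfo struct {
	ID         string
	LatencyMs  int
	UptimeMins int
}

// PeerFilter is sketched here with a single hypothetical method.
type PeerFilter interface {
	Allow(p peerInfo) bool
}

// maxLatencyMs accepts peers at or below the given latency.
type maxLatencyMs int

func (m maxLatencyMs) Allow(p peerInfo) bool { return p.LatencyMs <= int(m) }

// minUptimeMins accepts peers that have been online at least this long.
type minUptimeMins int

func (m minUptimeMins) Allow(p peerInfo) bool { return p.UptimeMins >= int(m) }

// selectPeers keeps peers that pass every filter, in input order, and stops
// once maxPeers have been collected.
func selectPeers(peers []peerInfo, maxPeers int, filters ...PeerFilter) []peerInfo {
	var out []peerInfo
	for _, p := range peers {
		if len(out) >= maxPeers {
			break
		}
		ok := true
		for _, f := range filters {
			if !f.Allow(p) {
				ok = false
				break
			}
		}
		if ok {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	peers := []peerInfo{
		{ID: "a", LatencyMs: 20, UptimeMins: 600},
		{ID: "b", LatencyMs: 300, UptimeMins: 900},
		{ID: "c", LatencyMs: 15, UptimeMins: 5},
	}
	fmt.Println(selectPeers(peers, 2, maxLatencyMs(50), minUptimeMins(60)))
}
```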
types.DeploymentResponse
: DeploymentResponse is the initial response from the Compute Provider (CP) to the Service Provider (SP). It tells the SP whether the deployment was successful or was declined for operational or validation reasons. Most of the validation is just error checking at the stream handling or executor level.
types.DeploymentUpdate
: DeploymentUpdate is used to inform the SP about the state of the job. Most updates are handled using a libp2p stream on the network level and a websocket on the user level. There is no REST API defined. This should change in the next iteration. See the proposed section for this.
On the service provider side, we have DeploymentUpdateListener
listening to the stream for any activity from the compute provider for updates on the job.
Based on the message type, it performs specific actions, which more or less amount to forwarding the message to the websocket client. These message types are MsgJobStatus
, MsgDepResp
, MsgLogStdout
and MsgLogStderr
network.libp2p.DHTValidator
: TBD
network.libp2p.SelfPeer
: TBD
network.libp2p.NoAddrIDFilter
: filters out peers with no listening addresses and a peer with a specific ID
network.libp2p.Libp2p
: contains the configuration for a Libp2p instance
network.libp2p.Advertisement
: TBD
type Advertisement struct {
    PeerID    string `json:"peer_id"`
    Timestamp int64  `json:"timestamp,omitempty"`
    Data      []byte `json:"data"`
}
network.libp2p.OpenStream
: TBD
Note: Data types are expected to change due to DMS refactoring
TBD
List of issues
All issues that are related to the implementation of network
package can be found below. These include any proposals for modifications to the package or new data structures needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:54.156984 File source:
Here is a quick overview of the contents of this package:
subpackages
The class diagram for the storage package is shown below.
Source file
Rendered from source file
The functionality with respect to storage
package is offered by two main interfaces:
StorageProvider
VolumeController
These interfaces are described below.
StorageProvider Interface
The StorageProvider
interface handles the input and output operations of files with remote storage providers such as AWS S3 and IPFS. Basically it provides methods to upload or download data and also to check the size of a data source.
Its functionality is coupled with local mounted volumes, meaning that implementations will rely on mounted files to upload data and downloading data will result in a local mounted volume.
Notes:
If needed, the availability-checking of a storage provider should be handled during instantiation of the implementation.
Any necessary authentication data should be provided within the types.SpecConfig
parameters
The interface has been designed for file-based transfer of data. It is not built with the idea of supporting streaming of data and non-file storage operations (e.g. some databases). Assessing the feasibility of such a requirement, if needed, should be done during implementation.
The methods of StorageProvider
are as follows:
Upload
signature: Upload(ctx context.Context, vol StorageVolume, target *types.SpecConfig) (*types.SpecConfig, error)
input #1: Context object
input #2: storage volume from which data will be uploaded of type storage.StorageVolume
input #3: configuration parameters of specified storage provider of type types.SpecConfig
output (success): parameters related to storage provider like upload details/metadata etc of type types.SpecConfig
output (error): error message
Upload
function uploads data from the storage volume provided as input to a given remote storage provider. The configuration of the storage provider is also provided as input to the function.
Download
signature: Download(ctx context.Context, source *types.SpecConfig) (StorageVolume, error)
input #1: Context object
input #2: configuration parameters of specified storage provider of type types.SpecConfig
output (success): storage volume which has downloaded data of type storage.StorageVolume
output (error): error message
Download
function downloads data from a given source, mounting it to a certain local path. The input configuration received will vary from provider to provider and hence it is left to be detailed during implementation.
It will return an error if the operation fails. Note that this can also happen if the user running DMS does not have access permission to the given path.
Size
signature: Size(ctx context.Context, source *types.SpecConfig) (uint64, error)
input #1: Context object
input #2: configuration parameters of specified storage provider of type types.SpecConfig
output (success): size of the storage in Megabytes of type uint64
output (error): error message
Size
function returns the size of a given storage provider provided as input. It will return an error if the operation fails.
Note that this method may also be useful to check if a given source is available.
VolumeController Interface
The VolumeController
interface manages operations related to storage volumes which are data mounted to files/directories.
The methods of VolumeController
are as follows:
CreateVolume
signature: CreateVolume(volSource storage.VolumeSource, opts ...storage.CreateVolOpt) -> (storage.StorageVolume, error)
input #1: predefined values of type string
which specify the source of data (ex. IPFS etc)
input #2: optional parameter which can be passed to set attributes or perform an operation on the storage volume
output (success): storage volume of type storage.StorageVolume
output (error): error message
CreateVolume
creates a directory where data can be stored, and returns a StorageVolume
which contains the path to the directory. Note that CreateVolume
does not insert any data within the directory. It's up to the caller to do that.
VolumeSource
contains predefined constants to specify common sources like S3 but it's extensible if new sources need to be supported.
CreateVolOpt
is a function type that modifies storageVolume
object. It allows for arbitrary operations to be performed while creating volume like setting permissions, encryption etc.
CreateVolume
will return an error if the operation fails. Note that this can also happen if the user running DMS does not have access permission to create volume at the given path.
LockVolume
signature: LockVolume(pathToVol string, opts ...storage.LockVolOpt) -> error
input #1: path to the volume
input #2: optional parameter which can be passed to set attributes or perform an operation on the storage volume
output (success): None
output (error): error message
LockVolume
makes the volume read-only. It should be used after all necessary data has been written to the volume. It also makes clear whether a volume will change state or not. This is very useful when we need to retrieve a volume's CID, which is immutable for given data.
LockVolOpt
is a function type that modifies storageVolume
object. It allows for arbitrary operations to be performed while locking the volume like setting permissions, encryption etc.
LockVolume
will return an error if the operation fails.
DeleteVolume
signature: DeleteVolume(identifier string, idType storage.IDType) -> error
input #1: path to the volume or CID
input #2: integer value associated with the type of identifier
output (error): error message
DeleteVolume
function deletes everything within the root directory of a storage volume. It will return an error if the operation fails. Note that this can also happen if the user running DMS does not have the requisite access permissions.
The input can be a path or a Content ID (CID) depending on the identifier type passed.
IDType
contains predefined integer values for different types of identifiers.
ListVolumes
signature: ListVolumes() -> ([]storage.StorageVolume, error)
input: None
output (success): List of existing storage volumes of type storage.StorageVolume
output (error): error message
ListVolumes
function fetches the list of existing storage volumes. It will return an error if the operation fails or if the user running DMS does not have the requisite access permissions.
GetSize
signature: GetSize(identifier string, idType storage.IDType) -> (int64, error)
input #1: path to the volume or CID
input #2: integer value associated with the type of identifier
output (success): size of the volume
output (error): error message
GetSize
returns the size of a volume. The input can be a path or a Content ID (CID) depending on the identifier type passed. It will return an error if the operation fails.
IDType
contains predefined integer values for different types of identifiers.
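Resolving a volume from either kind of identifier can be sketched as below. The constant names and values, the reduced storageVolume struct and the in-memory slice are assumptions for illustration. A real implementation would query the database instead.

```go
package main

import (
	"errors"
	"fmt"
)

// IDType mirrors the idea of storage.IDType. Names and values are assumptions.
type IDType int

const (
	IDTypePath IDType = iota
	IDTypeCID
)

// storageVolume is a reduced, hypothetical stand-in for storage.StorageVolume.
type storageVolume struct {
	Path string
	CID  string
	Size int64
}

var (
	errUnsupportedID = errors.New("unsupported identifier type")
	errNotFound      = errors.New("volume not found")
)

// findVolume looks a volume up by path or by CID, depending on idType.
func findVolume(vols []storageVolume, identifier string, idType IDType) (storageVolume, error) {
	var match func(storageVolume) bool
	switch idType {
	case IDTypePath:
		match = func(v storageVolume) bool { return v.Path == identifier }
	case IDTypeCID:
		match = func(v storageVolume) bool { return v.CID == identifier }
	default:
		return storageVolume{}, errUnsupportedID
	}
	for _, v := range vols {
		if match(v) {
			return v, nil
		}
	}
	return storageVolume{}, errNotFound
}

func main() {
	vols := []storageVolume{{Path: "/vols/s3-a1", CID: "cid-1", Size: 42}}
	v, err := findVolume(vols, "cid-1", IDTypeCID)
	fmt.Println(v.Size, err)
	_, err = findVolume(vols, "cid-1", IDType(99))
	fmt.Println(err)
}
```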
storage.StorageVolume
: This struct contains parameters related to a storage volume such as path, CID etc.
TBD
Note: EncryptionType is not yet defined in types package
types.SpecConfig
: This allows arbitrary configuration/parameters as needed during implementation of a specific storage provider. The parameters include authentication related data (if applicable).
storage.VolumeSource
: This represents the source of data for a storage volume, for example IPFS, S3 etc.
storage.CreateVolOpt
: This allows arbitrary operations on storage.StorageVolume
to be passed as input during volume creation.
storage.LockVolOpt
: This allows arbitrary operations on StorageVolume
to be passed as input during locking of volume.
storage.IDType
: This defines integer values for different types of identifiers of a storage volume.
types.EncryptionType
: TBD
Note: The definition below should be moved to types package
TBD
List of issues
All issues that are related to the implementation of storage
package can be found below. These include any proposals for modifications to the package or new data structures needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:55.744891 File source:
Here is a quick overview of the contents of this directory:
Subpackages
Source File
Rendered from source file
Unit Tests
TBD
Functional Tests
To be determined (TBD
).
List of issues related to the design of the tokenomics package can be found below. These include proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Proposed Contract Interface
NewContract(): Creates new contract
InitiateContractClosure: function initializes and closes a contract between two nodes within the system. It follows the sequence:
Creates a new contract instance.
Populates the contract with job ID and payment details extracted from the provided bid.
Signs and notarizes the contract.
Persists the contract in the contract lists of both nodes (n1 and n2) and the central database.
InitiateContractSettlement: function initiates the settlement process for a specified contract between two nodes (n1 and n2). It executes the following steps:
Updates the contract with the provided verification result.
Handles settlement based on the job status and processes payments.
Notifies both nodes (n1 and n2) about the settlement.
Updates the contract details in the central database.
ProcessContractSettlement: processes the contract settlement based on the pricing method and verification result
Calculates payment based on the pricing method and processes it.
Handles job failure by issuing refunds if required.
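The settlement logic described above could be sketched as follows. All type and field names (`PricingMethod`, `Contract`, `Settlement`, `Deposited`, and so on) are hypothetical stand-ins for the proposed tokenomics types, and the refund rule (return the full deposit on failed verification, otherwise return any unused remainder) is an assumption of this example.

```go
package main

import "fmt"

// PricingMethod, Contract and Settlement are hypothetical stand-ins for the
// proposed tokenomics types; field names are assumptions for this sketch.
type PricingMethod int

const (
	FixedJob PricingMethod = iota
	Periodic
)

type Contract struct {
	JobID         string
	Method        PricingMethod
	FixedCost     float64 // total price of a fixed-price job
	CostPerPeriod float64 // price of one billing period
	PeriodsUsed   int     // billing periods consumed by the job
	Deposited     float64 // amount the requestor has already deposited
}

type Settlement struct {
	Payment float64 // released to the provider
	Refund  float64 // returned to the requestor
}

// ProcessContractSettlement computes payment and refund from the pricing
// method and the verification result.
func ProcessContractSettlement(c Contract, verified bool) (Settlement, error) {
	if !verified {
		// Failed verification: nothing is paid and the deposit is refunded.
		return Settlement{Refund: c.Deposited}, nil
	}
	var due float64
	switch c.Method {
	case FixedJob:
		due = c.FixedCost
	case Periodic:
		due = c.CostPerPeriod * float64(c.PeriodsUsed)
	default:
		return Settlement{}, fmt.Errorf("unknown pricing method %d", c.Method)
	}
	s := Settlement{Payment: due}
	if c.Deposited > due {
		s.Refund = c.Deposited - due
	}
	return s, nil
}

func main() {
	c := Contract{JobID: "job-1", Method: Periodic, CostPerPeriod: 2.5, PeriodsUsed: 4, Deposited: 12}
	s, err := ProcessContractSettlement(c, true)
	fmt.Println(s.Payment, s.Refund, err)
}
```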
Proposed Proof Interface
InitiateContractApproval(): initiates the contract approval process, starting the necessary workflows.
CreateContractProof(): generates a cryptographic proof for a contract, ensuring transaction integrity.
SaveProof(contractID, proof string) error: stores the contract proof in a simulated database, maintaining audit trails and historical records.
VerifyProof(contractID, proof string) (bool, error): verifies the authenticity of a contract proof, ensuring its validity before further processing.
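A minimal sketch of the save-and-verify flow is shown below. The `ProofStore` type is a hypothetical in-memory stand-in for the simulated database, and deriving the proof as a SHA-256 digest of the contract bytes is an assumption of this example; the proposal does not specify the proof scheme.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

// ProofStore is a hypothetical in-memory stand-in for the simulated database.
type ProofStore struct {
	mu     sync.RWMutex
	proofs map[string]string
}

func NewProofStore() *ProofStore {
	return &ProofStore{proofs: make(map[string]string)}
}

// CreateContractProof derives a proof string from the contract contents.
// Using a SHA-256 digest is an assumption of this sketch.
func CreateContractProof(contract []byte) string {
	sum := sha256.Sum256(contract)
	return hex.EncodeToString(sum[:])
}

// SaveProof stores the proof under the contract ID.
func (s *ProofStore) SaveProof(contractID, proof string) error {
	if contractID == "" || proof == "" {
		return errors.New("contract ID and proof must not be empty")
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.proofs[contractID] = proof
	return nil
}

// VerifyProof reports whether proof matches the one stored for contractID.
func (s *ProofStore) VerifyProof(contractID, proof string) (bool, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	stored, ok := s.proofs[contractID]
	if !ok {
		return false, fmt.Errorf("no proof stored for contract %q", contractID)
	}
	return stored == proof, nil
}

func main() {
	store := NewProofStore()
	proof := CreateContractProof([]byte(`{"job":"job-1","price":10}`))
	if err := store.SaveProof("contract-1", proof); err != nil {
		panic(err)
	}
	ok, err := store.VerifyProof("contract-1", proof)
	fmt.Println(ok, err)
}
```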
Proposed Payment Interface
Deposit: manages the deposit logic for payments, distinguishing between direct and escrow methods. It ensures that only valid payment types (fiat or crypto) are accepted for escrow payments. This function is crucial for initiating the payment process based on the specified method and type.
Parameters:
contractID (int): Identifier of the contract associated with the payment.
payment (Payment): Struct containing details of the payment, including its method (direct or escrow) and payment type (fiat or crypto).
SettleContract: manages the settlement process for contracts based on job verification results. It calculates the payment amount based on the job's completion percentage and processes payments either directly or via escrow, depending on the contract's payment method (direct or escrow). It also handles scenarios where job verification fails and ensures appropriate actions such as refunds for escrow payments.
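The deposit validation and the percentage-based settlement amount might look like the sketch below. The type names, string constants, and the helper `SettlementAmount` are illustrative assumptions rather than the package's actual definitions.

```go
package main

import "fmt"

// PaymentMethod and PaymentType are hypothetical enumerations for this sketch.
type PaymentMethod string
type PaymentType string

const (
	Direct PaymentMethod = "direct"
	Escrow PaymentMethod = "escrow"

	Fiat   PaymentType = "fiat"
	Crypto PaymentType = "crypto"
)

// Payment carries the method and type of a payment.
type Payment struct {
	Method PaymentMethod
	Type   PaymentType
}

// Deposit accepts direct payments as they are and accepts escrow payments
// only when the payment type is fiat or crypto.
func Deposit(contractID int, p Payment) error {
	switch p.Method {
	case Direct:
		return nil
	case Escrow:
		if p.Type != Fiat && p.Type != Crypto {
			return fmt.Errorf("contract %d: escrow does not accept payment type %q", contractID, p.Type)
		}
		return nil
	default:
		return fmt.Errorf("contract %d: unknown payment method %q", contractID, p.Method)
	}
}

// SettlementAmount scales the agreed price by the verified completion
// percentage, which must lie between 0 and 100.
func SettlementAmount(price, completionPct float64) (float64, error) {
	if completionPct < 0 || completionPct > 100 {
		return 0, fmt.Errorf("completion percentage %v is out of range", completionPct)
	}
	return price * completionPct / 100, nil
}

func main() {
	fmt.Println(Deposit(7, Payment{Method: Escrow, Type: Crypto}))
	fmt.Println(Deposit(7, Payment{Method: Escrow, Type: "barter"}))
	fmt.Println(SettlementAmount(200, 50))
}
```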
FixedJobPricing: It holds pricing details for jobs with fixed payment terms.
PeriodicPricing: It is used for jobs with periodic payment structures. It also includes usage limits to define quotas.
proposed
proposed tokenomics.Contract: Consists of detailed information regarding an agreement between a requestor and a provider within the network. This data type includes the following fields:
tokenomics.Payment: Consists of details related to a payment transaction between a requestor and a provider, specifying the type, channel, currency, pricing method, and timestamp of the transaction.
tokenomics.FixedJobPricing: Consists of information related to the fixed pricing for a job, detailing the cost and platform fee involved.
tokenomics.PeriodicPricing: Consists of information related to the periodic pricing model, including the cost, period, usage limits, and platform fee.
tokenomics.UsageLimits: Consists of information regarding the resource usage limits or quotas associated with periodic pricing, specifying the maximum allowable usage for various resources.
tokenomics.Authentication: type is designed to handle the authentication details necessary for secure transaction processing within the payment gateway system. This type includes:
Encryption: Specifies the encryption method or protocol used to protect the data involved in the authentication process, ensuring that data is transmitted securely and is kept confidential from unauthorized parties.
ZKProof: Contains the zero-knowledge proof (ZKProof) which allows the verification of the transaction's authenticity without exposing sensitive information. This proof ensures that the transaction is valid while preserving privacy.
OffChain: Represents off-chain data that supports the authentication process. This data includes information not stored directly on the blockchain but is essential for validating and processing transactions securely.
The private_ledger
sub package provides a DatabaseManager
to manage the PostgreSQL database that stores contracts. It allows users to initialize database connections, insert contract data, retrieve contracts, and close connections safely.
Database Initialization: Create and manage a connection to a PostgreSQL database.
Contract Retrieval: Fetch all records from a specified table.
Storing Contract: Insert contract records into table.
Connection Management: Close database connections safely.
Last updated: 2025-01-23 01:10:56.018891 File source:
types
package defines and keeps data structures and interfaces that are used across the whole DMS component by different packages.
Here is a quick overview of the contents of this package:
Source
Rendered from source file
types
package holds interfaces and methods that are used by multiple packages. The functionality of these interfaces/methods is typically implemented in other packages.
Here are some methods defined in types
package:
NewExecutionResult
signature: NewExecutionResult(code int) *ExecutionResult
input: exit code
output: types.ExecutionResult
NewExecutionResult
creates a new ExecutionResult
object.
NewFailedExecutionResult
signature: NewFailedExecutionResult(err error) *ExecutionResult
input: error
output: types.ExecutionResult
NewFailedExecutionResult
creates a new ExecutionResult
object for a failed execution. It sets the error message from the provided error and sets the exit code to -1.
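The two constructors can be sketched as below. The field names `ExitCode` and `ErrorMsg` are assumptions for illustration, and the real `types.ExecutionResult` may carry additional fields; `ErrDemo` exists only for the usage example.

```go
package main

import (
	"errors"
	"fmt"
)

// ExecutionResult is a reduced, hypothetical version of types.ExecutionResult.
type ExecutionResult struct {
	ExitCode int
	ErrorMsg string
}

// ErrDemo is a sample error used only by the usage example below.
var ErrDemo = errors.New("image not found")

// NewExecutionResult creates a result carrying the given exit code.
func NewExecutionResult(code int) *ExecutionResult {
	return &ExecutionResult{ExitCode: code}
}

// NewFailedExecutionResult creates a result for a failed execution: the exit
// code is -1 and the error message is taken from err.
func NewFailedExecutionResult(err error) *ExecutionResult {
	msg := ""
	if err != nil {
		msg = err.Error()
	}
	return &ExecutionResult{ExitCode: -1, ErrorMsg: msg}
}

func main() {
	ok := NewExecutionResult(0)
	failed := NewFailedExecutionResult(ErrDemo)
	fmt.Println(ok.ExitCode, failed.ExitCode, failed.ErrorMsg)
}
```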
Config interface TBD
GetNetworkConfig
will return the network configuration parameters.
NewSpecConfig
signature: NewSpecConfig(t string) *SpecConfig
input: type for the configuration object
output: types.SpecConfig
NewSpecConfig
creates new SpecConfig
with the given type and an empty params map.
WithParam
signature: (s *SpecConfig) WithParam(key string, value interface{}) *SpecConfig
input #1 : key
input #2 : value associated with the key
output: types.SpecConfig
WithParam
adds a new key-value pair to the spec parameters and returns the updated SpecConfig
object.
Normalize
signature: (s *SpecConfig) Normalize()
input: None
output: None
Normalize
ensures that the spec config is in a valid state by trimming whitespace from the Type field and initializing the Params map if empty.
Validate
signature: (s *SpecConfig) Validate() error
input: None
output: error
Validate
checks if the spec config is valid. It returns an error if the SpecConfig
is nil or if the Type field is missing (blank). Otherwise, it returns no error indicating a valid configuration.
IsType
signature: (s *SpecConfig) IsType(t string) bool
input: type
output: bool
IsType
checks if the SpecConfig
matches the provided type, ignoring case sensitivity. It returns true
if there's a match, otherwise false.
IsEmpty
signature: (s *SpecConfig) IsEmpty() bool
input: None
output: bool
IsEmpty
checks if the SpecConfig
is empty, meaning it's either nil or has an empty Type and no parameters. It returns true if empty, otherwise false.
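Taken together, the `SpecConfig` methods described above behave roughly like the following sketch. It follows the documented signatures, but the struct layout (`Type` and `Params` fields) and the error messages are assumptions, and the real implementation may differ in detail.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SpecConfig holds a type name and arbitrary parameters.
type SpecConfig struct {
	Type   string
	Params map[string]interface{}
}

// NewSpecConfig creates a SpecConfig with the given type and an empty params map.
func NewSpecConfig(t string) *SpecConfig {
	return &SpecConfig{Type: t, Params: make(map[string]interface{})}
}

// WithParam adds a key-value pair and returns the same SpecConfig for chaining.
func (s *SpecConfig) WithParam(key string, value interface{}) *SpecConfig {
	if s.Params == nil {
		s.Params = make(map[string]interface{})
	}
	s.Params[key] = value
	return s
}

// Normalize trims whitespace from Type and initializes Params if it is nil.
func (s *SpecConfig) Normalize() {
	if s == nil {
		return
	}
	s.Type = strings.TrimSpace(s.Type)
	if s.Params == nil {
		s.Params = make(map[string]interface{})
	}
}

// Validate returns an error if the config is nil or its Type is blank.
func (s *SpecConfig) Validate() error {
	if s == nil {
		return errors.New("spec config is nil")
	}
	if strings.TrimSpace(s.Type) == "" {
		return errors.New("spec config type is missing")
	}
	return nil
}

// IsType compares the config type with t, ignoring case.
func (s *SpecConfig) IsType(t string) bool {
	if s == nil {
		return false
	}
	return strings.EqualFold(strings.TrimSpace(s.Type), strings.TrimSpace(t))
}

// IsEmpty reports whether the config is nil or has no type and no parameters.
func (s *SpecConfig) IsEmpty() bool {
	return s == nil || (strings.TrimSpace(s.Type) == "" && len(s.Params) == 0)
}

func main() {
	cfg := NewSpecConfig(" Docker ").WithParam("image", "alpine:3.20")
	cfg.Normalize()
	fmt.Println(cfg.Type, cfg.IsType("docker"), cfg.IsEmpty(), cfg.Validate())
}
```

Returning the receiver from `WithParam` lets callers build a configuration in a single chained expression.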
GetID
signature: (m types.BaseDBModel) GetID() string
input: None
output: identifier of the entity
GetID
returns the identifier of the entity.
BeforeCreate
signature: (m *Model) BeforeCreate(tx *gorm.DB) error
input: gorm database object to be created
output: error
BeforeCreate
sets the ID
and CreatedAt
fields before creating a new entity.
BeforeUpdate
signature: (m *Model) BeforeUpdate(tx *gorm.DB) error
input: gorm database object to be updated
output: error
BeforeUpdate
runs as a GORM hook before an existing entity is updated and returns an error if the update should not proceed.
LoadConfigFromEnv
signature: LoadConfigFromEnv() (*TelemetryConfig, error)
input: none
output: types.TelemetryConfig
output (error): error message
LoadConfigFromEnv
loads the telemetry configuration from environment variables. This includes the observability level and collector configuration. It returns the final configuration as a types.TelemetryConfig
object.
parseObservabilityLevel
signature: parseObservabilityLevel(levelStr string) int
input: observability level
output: int value corresponding to the input observability level
parseObservabilityLevel
returns the integer representation of the provided observability level string. When the input string does not match the defined observability levels, default observability level INFO
is considered and its integer value is returned.
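The fallback behaviour can be sketched as a simple lookup. The set of level names, their integer values, and the case-insensitive matching are assumptions for this example; only the default to INFO is taken from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// ObservabilityLevel is an integer enum; the level names and their values
// here are hypothetical.
type ObservabilityLevel int

const (
	LevelTrace ObservabilityLevel = iota + 1
	LevelDebug
	LevelInfo
	LevelWarn
	LevelError
	LevelFatal
)

var levelByName = map[string]ObservabilityLevel{
	"TRACE": LevelTrace,
	"DEBUG": LevelDebug,
	"INFO":  LevelInfo,
	"WARN":  LevelWarn,
	"ERROR": LevelError,
	"FATAL": LevelFatal,
}

// parseObservabilityLevel returns the integer value of the named level and
// falls back to INFO when the name is not recognised.
func parseObservabilityLevel(levelStr string) int {
	if lvl, ok := levelByName[strings.ToUpper(strings.TrimSpace(levelStr))]; ok {
		return int(lvl)
	}
	return int(LevelInfo)
}

func main() {
	fmt.Println(parseObservabilityLevel("debug"), parseObservabilityLevel("verbose"))
}
```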
Deployment
types.DeploymentRequest
types.DeploymentResponse
types.DeploymentUpdate
types.DeploymentRequestFlat
types.BlockchainTxStatus
ELK
types.NewDeviceOnboarded
types.DeviceStatusChange
types.DeviceResourceChange
types.DeviceResourceConfig
types.NewService
// NewService defines the schema of the data to be sent to stats db when a new service gets registered in the platform
type NewService struct {
	ServiceID          string
	ServiceName        string
	ServiceDescription string
	Timestamp          float32
}
types.ServiceCall
types.ServiceStatus
types.ServiceRemove
types.NtxPayment
Executor
types.Executor
: This defines the type of executor that is used to execute the job.
types.ExecutionRequest
: This is the input that executor
receives to initiate a job execution.
types.ExecutionResult
: This contains the result of the job execution.
types.SpecConfig
TBD
: This allows arbitrary configuration/parameters as needed during implementation of specific executor.
types.LogStreamRequest
: This is the input provided when a request to stream logs of an execution is made.
Firecracker
types.BootSource
: This contains configuration parameters for booting a Firecracker VM.
types.Drives
: This contains properties of a virtual drive for Firecracker VM.
types.MachineConfig
: This defines the configuration parameters of the machine to be used while creating a new Firecracker VM.
types.NetworkInterfaces
: This defines the network configuration parameters.
types.MMDSConfig
: This contains a list of the network configuration parameters defined by NetworkInterfaces
struct.
types.MMDSMsg
TBD
: This contains the latest metadata of the machine.
types.MMDSMetadata
TBD
: This contains the metadata of the machine.
types.Actions
TBD
: This contains the type of action to be performed on the Firecracker VM.
types.VirtualMachine
: This contains the configuration parameters of Firecracker virtual machine.
Machine
types.IP
types.PeerInfo
TBD
: This contains parameters of the peer node.
types.Machine
: This contains the configuration parameters of the machine.
types.FreeResources
: This contains the resources currently available for a job.
types.AvailableResources
: This contains the resources onboarded to Nunet by the user.
types.Services
TBD
: This contains the details of the services running on the machine.
types.ServiceResourceRequirements
: This contains the resource requirements for a service.
types.ContainerImages
: This contains parameters of a container image.
types.Libp2pInfo
: This contains parameters of Libp2p node.
types.MachineUUID
: This defines the unique identifier for the machine.
types.Gpu
: This contains the GPU parameters of the machine.
types.resources
: This defines the resource parameters of the machine.
types.PeerData
: This contains the details of the peer node.
types.Connection
: TBD
types.PingResult
: This contains the details of the ping result.
types.Machines
: TBD
types.KadDHTMachineUpdate
: This contains machine info for KAD-DHT.
types.ElasticToken
: TBD
Onboarding
types.BlockchainAddressPrivKey
types.CapacityForNunet
types.Provisioned
types.Metadata
types.OnboardingStatus
types.LogBinAuth
: This stores the authorisation token for LogBin.
Resource
types.Resources
: resources defined for the machine.
types.AvailableResources
: resources onboarded to Nunet.
types.FreeResources
: resources currently available for new jobs.
types.RequiredResources
: resources required by the jobs running on the machine.
types.GPUVendor
: GPU vendors available on the machine.
types.GPU
: GPU details.
types.GPUList
: A slice of GPU
.
types.CPUInfo
: CPU information of the machine.
types.SpecInfo
: detailed specifications of the machine.
types.CPU
: CPU details.
types.RAM
: RAM details.
types.Disk
: Disk details.
types.NetworkInfo
: Network details.
types.Resource
: resources required to execute a task.
Spec_config
types.SpecConfig
: This allows arbitrary configuration to be defined as needed.
Storage
types.StorageVolume
: This contains the parameters related to the storage volume that is created by the DMS on the local machine.
Telemetry Config
types.CollectorConfig
: This contains the parameters for a collector.
types.TelemetryConfig
: This defines the telemetry parameters such as observability level, collector configurations etc.
ObservabilityLevel is an enum that defines the level of observability. Currently logging is done at these observability levels.
Types
types.BaseDBModel
Tests are defined in the other packages where the functionality is implemented.
List of issues
All issues that are related to the implementation of types
package can be found below. These include any proposals for modifications to the package or new data structures needed to cover the requirements of other packages.
proposed
Encryption interfaces
These are placeholder interface definitions which will be developed in the future.
Encryptor
must support encryption of files/directories
Decryptor
must support decryption of files/directories
proposed
Network types and methods
This section contains the proposed data types and methods related to network functionality.
types.NetworkSpec
types.NetConfig
types.NetConfig
struct will implement a GetNetworkConfig
method which returns network configuration parameters.
types.NetworkStats
types.MessageInfo
proposed
Network configuration data type
types.MessageEnvelope
types.NetworkConfig
types.Libp2pConfig
// Libp2pConfig holds the libp2p configuration
type Libp2pConfig struct {
	DHTPrefix string
	// crypto is Go-libp2p package that implements various cryptographic utilities
	PrivateKey              crypto.PrivKey
	BootstrapPeers          []multiaddr.Multiaddr
	Rendezvous              string
	Server                  bool
	Scheduler               *bt.Scheduler
	CustomNamespace         string
	ListenAddress           []string
	PeerCountDiscoveryLimit int
	PrivateNetwork          types.PrivateNetworkConfig
	GracePeriodMs           int
	GossipMaxMessageSize    int
}
types.PrivateNetworkConfig
Last updated: 2025-01-23 01:10:56.572418 File source:
TBD
Utils specifically used for the validation of different types.
Here is a quick overview of the contents of this directory:
Source File
Rendered from source file
This package contains helper methods that perform validation checks for different data types. Refer to strings.go
file for more details.
This package does not define any new data types.
Unit tests for the functionality are defined in files with *_test.go
in their names.
List of issues related to the implementation of the utils
package can be found below. These include proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Last updated: 2025-01-23 01:10:56.293529 File source:
This package contains utility tools and functionalities used by other packages
Here is a quick overview of the contents of this directory:
Files with *_test.go
in their names contain unit tests of the specified functionality.
Source File
Rendered from source file
Blockchain data models
utils.UTXOs
: TBD
utils.TxHashResp
: TBD
utils.ClaimCardanoTokenBody
: TBD
utils.rewardRespToCPD
: TBD
utils.UpdateTxStatusBody
: TBD
progress_io data models
utils.IOProgress
: TBD
utils.Reader
: TBD
utils.Writer
: TBD
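Since the progress_io types are still marked TBD, the following is only a sketch of what a progress-tracking reader wrapper can look like. `ProgressReader`, `NewProgressReader`, and `readWithProgress` are hypothetical names and do not describe the actual `utils.Reader` or `utils.IOProgress` definitions.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// ProgressReader wraps an io.Reader and reports the running byte count.
// The type and field names are hypothetical, not those of utils.Reader.
type ProgressReader struct {
	r          io.Reader
	read       int64
	onProgress func(total int64)
}

func NewProgressReader(r io.Reader, onProgress func(total int64)) *ProgressReader {
	return &ProgressReader{r: r, onProgress: onProgress}
}

// Read forwards to the wrapped reader and reports progress after every
// read that returned at least one byte.
func (p *ProgressReader) Read(b []byte) (int, error) {
	n, err := p.r.Read(b)
	if n > 0 {
		p.read += int64(n)
		if p.onProgress != nil {
			p.onProgress(p.read)
		}
	}
	return n, err
}

// readWithProgress drains data through a ProgressReader in chunk-sized reads
// and returns the total plus every progress update that was reported.
func readWithProgress(data string, chunk int) (int64, []int64, error) {
	if chunk <= 0 {
		chunk = 1
	}
	var updates []int64
	pr := NewProgressReader(strings.NewReader(data), func(total int64) {
		updates = append(updates, total)
	})
	buf := make([]byte, chunk)
	for {
		_, err := pr.Read(buf)
		if err == io.EOF {
			return pr.read, updates, nil
		}
		if err != nil {
			return pr.read, updates, err
		}
	}
}

func main() {
	total, updates, err := readWithProgress("hello world", 4)
	fmt.Println(total, updates, err)
}
```

Because `ProgressReader` satisfies `io.Reader`, it can be passed to any consumer such as `io.Copy` without the consumer knowing about progress tracking.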
syncmap data models
utils.SyncMap
: a concurrency-safe sync.Map that uses strongly-typed method signatures to ensure the types of its stored data are known.
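One way to obtain strongly-typed method signatures on top of `sync.Map` is to wrap it in a generic type, as sketched below. The method names (`Put`, `Get`, `Delete`, `Iter`) are assumptions for illustration and may not match those of `utils.SyncMap`.

```go
package main

import (
	"fmt"
	"sync"
)

// SyncMap is a typed wrapper around sync.Map. Method names are hypothetical.
type SyncMap[K comparable, V any] struct {
	m sync.Map
}

// Put stores value under key.
func (s *SyncMap[K, V]) Put(key K, value V) {
	s.m.Store(key, value)
}

// Get returns the value stored under key and whether it was present.
func (s *SyncMap[K, V]) Get(key K) (V, bool) {
	raw, ok := s.m.Load(key)
	if !ok {
		var zero V
		return zero, false
	}
	// The comma-ok assertion avoids a panic if a nil interface value was stored.
	value, _ := raw.(V)
	return value, true
}

// Delete removes key from the map.
func (s *SyncMap[K, V]) Delete(key K) {
	s.m.Delete(key)
}

// Iter calls fn for each entry until fn returns false.
func (s *SyncMap[K, V]) Iter(fn func(key K, value V) bool) {
	s.m.Range(func(k, v any) bool {
		key, _ := k.(K)
		value, _ := v.(V)
		return fn(key, value)
	})
}

func main() {
	var peers SyncMap[string, int]
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			peers.Put(fmt.Sprintf("peer-%d", i), i)
		}(i)
	}
	wg.Wait()
	count := 0
	peers.Iter(func(string, int) bool { count++; return true })
	fmt.Println(count)
}
```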
The unit tests for the functionality are defined in network_test.go
and utils_test.go
files.
List of issues related to the implementation of the utils
package can be found below. These include proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
In preparation for this release process we will be changing the branching strategy as discussed and agreed in this and tracked by this . In summary:
The public dms binaries will be built from release
branch tags in the form of v{version_number}-{suffix}, e.g. v0.5.0-rc1
, etc. and released via as usual.
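For tooling that needs to recognise such tags, a small parser along the following lines could be used. The function name `ParseReleaseTag` is hypothetical, and the sketch assumes a three-part numeric version and an alphanumeric suffix, which is inferred only from the `v0.5.0-rc1` example.

```go
package main

import (
	"fmt"
	"regexp"
)

// releaseTagRe matches tags of the form v{version_number}-{suffix}, assuming
// a three-part numeric version such as 0.5.0 and a suffix such as rc1.
var releaseTagRe = regexp.MustCompile(`^v(\d+\.\d+\.\d+)-([0-9A-Za-z.]+)$`)

// ParseReleaseTag splits a release tag into its version number and suffix.
func ParseReleaseTag(tag string) (version, suffix string, err error) {
	m := releaseTagRe.FindStringSubmatch(tag)
	if m == nil {
		return "", "", fmt.Errorf("tag %q does not match v{version_number}-{suffix}", tag)
	}
	return m[1], m[2], nil
}

func main() {
	version, suffix, err := ParseReleaseTag("v0.5.0-rc1")
	fmt.Println(version, suffix, err)
}
```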
The release process of is scheduled to start September 15, 2024 and finish December 15, 2024. It involves the following steps in chronological order:
The internal latest updated document with release process dates and countdown is available to view . Feature scope management is done via .
We first define the final scope of the dms features that we are releasing within this milestone, which is done within of the document. The scope of each feature should be defined by functional test cases / scenarios associated with each feature and linked in the Acceptance tests row of each table. All required test cases and scenarios pertaining to the feature scope freeze shall be written by the time of the last release candidate release. The work is tracked by the work package integrated into the test management system. Until then we will continue to implement all the features as they are explained in this document.
The documentation portal has been operating internally since the end of 2023, but was not fully aligned with the specification and documentation process, which involves the whole team and includes documentation updates in the acceptance criteria of each pull request. The documentation portal shall be launched in August. The work involved in finishing and launching the documentation portal is tracked by .
Project management portal has been operating internally since Q4 2023, but was not exposed publicly as of yet. It will be launched publicly by the end of August 2024. The work involved in finishing and launching the project management portal for the dms v.0.5 milestone is tracked by
Given that all prerequisites are launched and prepared, we aim to start the release process on September 15, 2024 by releasing the first release candidate and exposing it to testing. We will release at least three or four release candidates and then the final release, but the process will be fluid, as explained, and will most probably feature more minor release candidates per the adopted branching strategy.
After testing the frozen feature scope of the release, we aim at releasing the 0.5.0 version of device-management-service during the second half of December 2024. For current / updated schedule and details, see and .
: Current file which is aimed towards developers who wish to use and modify the functionality.
: This file defines functionality to allow download of files and directories from S3 buckets.
: This file defines helper functions for working with AWS S3 storage.
: This file initializes an Open Telemetry logger for the package.
: This file contains unit tests for the package functionality.
: This file defines methods to interact with S3 buckets using the AWS SDK.
: This file defines S3 input source configuration with validation and decoding logic.
: This file defines S3 storage implementation for uploading files (including directories) from a local volume to an S3 bucket.
For function signature refer to the package
TBD
: Current file which is aimed towards developers who wish to use and modify the package functionality.
: This file contains script to run CLI test and security test.
: The file is a Plutus script encoded in CBOR hexadecimal format.
: TBD
: TBD
: TBD
: TBD
: Current file which is aimed towards developers who wish to use and modify the functionality.
: This file implements the methods for VolumeController
interface.
: This file contains the unit tests for the methods of VolumeController
interface.
For function signature refer to the package
TBD
: Current file which is aimed towards developers who wish to use and modify the package functionality.
: This file defines the method to ping a peer.
: This file contains functionalities of a libp2p node. It includes functionalities for setting up libp2p hosts, performing pings between peers, fetching DHT content, checking the status of a peer and validating data against their signatures.
: This file contains methods for peer discovery in a libp2p node.
: This file defines functionalities for peer filtering and connection management in a libp2p node.
: This file defines configurations and initialization logic for a libp2p node.
: This file defines stubs for Libp2p peer management functionalities, including configuration, initialization, events, status, stopping, cleanup, ping, and DHT dump.
: This file defines a Libp2p node with core functionalities including discovery, peer management, DHT interaction, and communication channels, with several stub implementations.
It creates a . This defines the upper and lower limits on the number of peers the current peer will connect to.
We can count this as internal to libp2p and is used for operational purposes. Unlike ICMP pings, libp2p , and is closed after the ping.
After this step, the command is handed over to executor
module. Please refer to .
The storage package is responsible for disk storage management on each DMS (Device Management Service) for data related to DMS and jobs deployed by DMS. It primarily handles storage access to remote storage providers such as , etc. It also handles the control of storage volumes.
: Current file which is aimed towards developers who wish to use and modify the package functionality.
: This file defines the interface responsible for handling input/output operations of files with remote storage providers.
: This file contains the interfaces and structs related to storage volumes.
: This folder contains the basic implementation of VolumeController
interface.
: This contains implementation of storage functionality for S3 storage bucket.
TBD
This repository contains implementations for managing contracts, proofs, and payments in tokenomics. Initiated within milestone , it offers a comprehensive set of interfaces and methods. To implement these functions, we first define key datatypes and interfaces.
: Current file which is aimed towards developers who wish to use and modify the package functionality.
Defines the main interface for managing and executing contracts within the tokenomics system.
Implements the interface and logic for proof handling within the tokenomics framework.
Contains the main interface and functions for processing payments in the tokenomics system.
: Defines the core functionalities and main interface for the tokenomics package, integrating contracts, proofs, and payments.
: Directory containing package specifications, including package class diagram.
: Directory containing implementation of private ledger.
Contains the sequence diagram for the tokenomics package
Note: the functionality of Tokenomics is currently being developed. See the section for the suggested design of interfaces and methods.
Note: the functionality of Tokenomics is currently being developed. See the section for the suggested data types.
: Current file which is aimed towards developers who wish to use and modify the package functionality.
: This file contains data structures to describe machine capability.
: This file contains constants and types used for capability comparison.
: This file contains constants that are used across different packages.
: This file contains data structure related to job deployment.
: This file contains data structure to be sent to elasticsearch collector.
: This file contains data structure related to encryption in DMS.
: This file contains data structure related to executor functionality.
: This file contains data structure related to firecracker.
: This file contains data structure related to the machine - resources, peer details, services etc.
: This file contains data structure related to networking functionality of DMS
: This file defines message types, network types (libp2p, NATS) with configurations, and libp2p specific configurations (DHT, keys, peers, scheduling etc).
: This file contains data structure related to compute provider onboarding.
: This file contains data structures of GPU and execution resources.
: This file defines a SpecConfig
struct for configuration data with type, parameters, normalization, validation, and type checking functionalities.
: This file contains data structures related to storage.
: This file defines structs related to telemetry configuration and methods to load configuration from environment variables.
: This file defines a base model for entities in the application with auto-generated UUIDs, timestamps, and soft delete functionality using GORM hooks.
TBD
: Current file which is aimed towards developers who wish to use and modify the package functionality.
: This file contains method for conversion of numerical data to float64
type.
: This file contains method for validation check of data types.
: This folder contains the class diagram of the package.
TBD
: Current file which is aimed towards developers who wish to use and modify the package functionality.
: This file contains methods and data types related to interaction with blockchain.
: This file contains a method to retrieve the size of the volume.
: This file initializes an Open Telemetry logger for this package. It also defines constants to reflect the status of transaction.
: This file contains helper methods for DMS API calls and responses.
: This file defines wrapper functions for readers and writers with progress tracking capabilities.
: This file defines a SyncMap
type which is a thread-safe version of the standard Go map
with strongly-typed methods and functions for managing key-value pairs concurrently.
: This file contains various utility functions for the DMS functionality.
: This contains helper functions that perform different kinds of validation checks and numeric conversions.
: This folder contains the class diagram for the package.
utils
package defines various helper methods for functionality defined in the different packages of DMS. Refer to for details.
TBD
Feature name
Bootstrap feature environment composed of geographically distributed machines connected via public internet
Work package
Code reference
https://gitlab.com/nunet/test-suite/-/tree/develop/environments/feature?ref_type=heads
Description / definition of done
1) Selected stages of the CI/CD pipeline are automatically executed on the fully functional feature environment consisting of geographically dispersed virtual machines; 2) Developers are able to test functionalities requiring more than one machine running code; 3) Testing artifacts are collected and made accessible from a centralized place (GitLab, Testmo, etc) for all tested scenarios / configurations and machines; 4) An interface / process exists for configuring and managing feature environment configurations and reasoning about complex scenarios involving multiple machines
Timing
1) Capability exposed to the development team as soon as possible during the preparation of the release process (could comply with the definition of done partially); 2) Full definition of done met and all impacted functionalities considered by September 1, 2024 at the latest
Status
On track
Team
Lead: @gabriel.chamon; Supporting: @sam.lake, @abhishek.shukla3143438
Strategic alignment
1) Enables advanced testing and quality assurance on limited hardware; 2) Enables scaling the testing environment from internally owned machines to community developers' machines, and eventually into testnet; 3) Builds grounds for release management now and in the future (including release candidates, testnet configurations involving third party hardware, canary releases, etc)
Who it benefits
1) NuNet development team ensuring next level quality of CI/CD process and released software 2) Community testers and early compute providers 3) Solution providers who are already considering using NuNet for their business processes and will start testing in release candidate phase 4) Eventually all platform users will benefit from the quality of software
User challenge
1) Developers want to test software as early as possible and access testing data / different hardware / software configurations in real network setting; 2) Community compute providers want to contribute to the development by providing machines for testnet
Value score
n/a
Design
n/a
Impacted functionality
This does not affect any feature (except possibly ability to launch testnet in the future) but rather deals with quality assurance of the whole platform, therefore indirectly but fundamentally affects quality of all features now and in the future.
Acceptance tests
Developers are able to spawn and tear down testing networks on demand for testing custom platform builds and complex functionalities involving more than one machine; CI/CD pipeline incorporates complex functional and integration tests which run on automatically spawned and torn down small (but geographically distributed) real networks of virtual machines
Feature name
Set up the CI/CD pipeline to run the full test suite on all environments as defined in the test matrix
Work package
Code reference
https://gitlab.com/nunet/test-suite/-/tree/develop/cicd?ref_type=heads
Description / definition of done
1) CI/CD pipeline is structured as per general test management and test matrix based-approaches and executes the minimal configuration for this milestone; 2) The CI/CD process explained and documented so that it could be evolved further via developers contributions 3) The minimal configuration is: 3.1) unit tests run per each defined test matrix trigger / VCS event; 3.2) functional tests (the ones that require code build, but no network) are executed per each defined test matrix trigger / VCS event 4) Each stage of CI/CD pipeline collects and publishes full test artifacts for 4.1) developers for easy access via GitLab flow; 4.2) code reviewers for easy access via GitLab pull requests; 4.3) QA team via Testmo interface; 5) QA requirements and availabilities provided by CI/CD pipeline are integrated into issue Acceptance Criteria
Timing
1, 2, 3 implemented within first half of August 2024; 4 implemented within August; 5 implemented as soon as all is finished, but not later than first week of September;
Status
On track
Team
Lead: @gabriel.chamon; Supporting: @abhishek.shukla3143438, @ssarioglunn
Strategic alignment
1) CI/CD pipeline integrates to the development process by enabling automatic testing of code against formalized functional requirements / scope of each release / milestone; 2) CI/CD pipeline process integrates into the test management process (of integrating new tests into the pipeline and ensuring that all testing domain is covered with each build) 3) Proper CI/CD pipeline and its visibility is instrumental for reaching our business goals of delivering quality software that is evolvable
Who it benefits
1) NuNet development team ensuring next level quality of CI/CD process and released software 2) Internal QA team having the interface to easily access all build errors / issues and update requirements as needed 3) Eventually all platform users will benefit from the quality of software
User challenge
device-management-service code with the same build will run in diverse scenarios and diverse machine / OS configurations while providing robust functionality; CI/CD pipeline process (together with test management process) will need to ensure that a) all QA requirements for implemented functionality scope are properly formalized and placed in correct test stages for exposing each build to them; b) updated and/or newly defined functionality scope (during release process), including tests included as a result of community bug reports, are properly integrated into QA scope and the CI/CD pipeline
Value score
n/a
Design
n/a
Impacted functionality
This does not affect any feature (except possibly ability to launch testnet in the future) but rather deals with quality assurance of the whole platform, therefore indirectly but fundamentally affects quality of all features now and in the future.
Acceptance tests
n/a
Feature name
Set up test management system, visibility tools and integration with CI/CD and Security pipelines
Work package
Code reference
https://gitlab.com/nunet/test-suite
Description / definition of done
As detailed in work package deliverables: https://gitlab.com/nunet/architecture/-/issues/207
Status
With challenges
Team
Lead: @abhishek.shukla3143438; Supporting: @gabriel.chamon, @ssarioglunn
Strategic alignment
1) Test management system will enable us to 1.1) define quality requirements and functionalities formally; 1.2) release software that precisely corresponds to the defined quality / functional requirements; 2) Test management system will enable the functional and quality requirements to evolve together with the code; 3) Test management system implementation will enable community developers and testers to be part of the development process and greatly enhance it; 4) Participation of community in test management process will ideally be related to contributors program
Who it benefits
1) NuNet development team ensuring next level quality of the released software; 2) Internal QA team having the interface to easily access all build errors / issues and update requirements as needed; 3) Eventually all platform users will benefit from the quality of software
User challenge
1) Test management process should be flexible to evolve together with the code and constantly get updated with test vectors / scenarios; These can come both from core team and community developers / testers; Our test management system is fairly complex, involving different stages and frameworks (Go for unit tests; Gherkin for other stages) as well as multiple machine environments, simulating real network behavior. In order to reach our goals, careful coordination (fitting together) of all aspects is needed. 2) A special challenge is the inclusion of manual tests into the same framework of visibility, while enabling QA team to coordinate manual testing schedules and the inclusion of their results into the test artifacts; 3) Community developers and testers should be included into the testing process both by public access to the test suite definitions as well as manual testing process, which is related to contributors program;
Value score
n/a
Design
as described https://gitlab.com/nunet/test-suite
Impacted functionality
This does not affect any feature (except possibly ability to launch testnet in the future) but rather deals with quality assurance of the whole platform, therefore indirectly but fundamentally affects quality of all features now and in the future.
Acceptance tests
n/a
Feature name
Set up specification and documentation system for all nunet repos
Work package
Code reference
Description / definition of done
1) The documentation process is set up and described, including 1.1) requirement to have README.md files in each package and subpackage, 1.2) general structure and required contents of README.md files; 2) All packages and subpackages contain a class diagram written in plantuml and their hierarchy is correctly constructed; 3) Documentation system allows including package / subpackage specific descriptions or extensions via links in README.md and possibly using additional files (also in ./specs directory) as needed; 4) Updating specification and documentation of respective packages is included into acceptance criteria of each issue and checked during merge request reviews; 5) Documentation system includes process of requesting and proposing new functionalities and architecture; 6) Regular builds of documentation portal which is based on README.md files
Status
Close to completion
Team
Lead: @0xPravar; Supporting: @kabir.kbr, @janaina.senna
Strategic alignment
As explained in tech discussion 2024/07/25 the goals of specification and documentation system are: 1. Clarity and understanding of: 1.1 architecture; 1.2 functionality; 1.3 tests [i.e. functionality with some emphasis on quality]; 2. [by that enabling/helping] coordination [of development activities] by: 2.1 the core team; 2.2 community; 2.3 integration partners; 3. Making documentation part of the code, i.e. living documentation, which changes with the code and reflects both current status and planned / proposed directions;
Who it benefits
1) NuNet technical board by having space where conceptual architecture and priorities are translated into architectural and functional descriptions; 2) Internal platform development team as well as community developers have clearly defined structure, architecture and proposed functionality guiding the work and collaboration; 3) Internal integrations team (developing use-cases) and third party solution providers have clear documentation of the platform to build solutions on top; 4) Integration partners using the system can understand the current and future integration potential on a deep technical level and propose required functionality directly into the development process
User challenge
In order for all stakeholder groups to benefit, the documentation and specification system should be 1) structured enough for all stakeholders to quickly find aspects that are specifically needed for them; 2) flexible enough to express package architecture and implementation specifics for maximal clarity and understanding; 3) integrated into development process for easy access by and contribution from internal development team; 4) conveniently accessible for communication to community developers, integration partners (for this milestone) and other stakeholders from broader ecosystem (in the future);
Value score
n/a
Design
as described during tech discussion 2024/07/25
Impacted functionality
This does not affect any feature directly but fundamentally enables the quality and integrity of the whole platform functionality, alignment with the business goals, use-case based platform development model and evolvability of the software.
Acceptance tests
Feature name
Actor model and interface; Node and Allocation actors implementations; feature::actor-model; feature::general
Work packages
Code reference
https://gitlab.com/nunet/test-suite/-/tree/develop/environments/feature?ref_type=heads
Description / definition of done
1) Machines onboarded on NuNet via dms act as separate Actors (via Node interface); 2) Jobs deployed and orchestrated on the platform act as separate Actors (via Allocation interface); 3) Nodes and Allocations (both implementing the Actor interface) communicate only via immutable messages (via Message interface and network package's transport layer) and have no direct access to each other's private state;
Timing
Actor, Node and Allocation interfaces are correctly implemented at the start of the release process; correct interface implementation means that they can send messages to each other, read and process received messages and initiate arbitrary behaviors via RPC model;
Status
With challenges
Team
Lead: @mobin.hosseini; Supporting: @gugachkr
Strategic alignment
1) Enables the unlimited horizontal scalability of the platform with potentially minimal manual intervention; 2) Enables any eligible entity to join the platform without central control point; 3) Enables concurrency and non-locking by design which is independent of the scale of the platform at any point in time
Who it benefits
1) Solution integrators (providers) who need to use Decentralized Physical Infrastructure; 2) Hardware owners who wish to utilize their infrastructure in a more flexible manner than possible with established orchestration and deployment technologies and without direct involvement with the compute users or building compute marketplaces
User challenge
1) All compute users want to maximize efficient and fast access to optimal hardware for doing a computing job at hand and not to overpay for that; 2) Hardware resource owners want the maximal utilization of their hardware resources without idle time;
Value score
n/a
Design
Impacted functionality
All functionality of the platform is fundamentally affected by the implementation of the actor model; This is especially true for the future projected functionalities involving edge computing, IoT deployments and decentralized physical infrastructure in general.
Acceptance tests
Functional and integration tests defined in node package, dms package related to Actor interface and jobs package related to Allocation interface; tracking issue
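The message-only communication described in this feature can be illustrated with a minimal Go sketch. It is not the dms implementation: the `Message` and `Actor` types, their fields and the behavior names are hypothetical, and an in-process channel stands in for the network transport layer. The point it shows is that each actor's state is touched only by its own goroutine, and other actors can influence it only by sending message values.

```go
package main

import "fmt"

// Message is a value passed between actors (hypothetical shape).
// It is sent by value, so the receiver gets its own copy.
type Message struct {
	From     string
	Behavior string
	Payload  string
}

// Actor owns private state that only its own Run loop modifies.
type Actor struct {
	name    string
	inbox   chan Message
	handled int // private state, never accessed by other actors
}

// NewActor creates an actor with a small buffered inbox.
func NewActor(name string) *Actor {
	return &Actor{name: name, inbox: make(chan Message, 8)}
}

// Send delivers a message from actor a to the inbox of actor to.
func (a *Actor) Send(to *Actor, behavior, payload string) {
	to.inbox <- Message{From: a.name, Behavior: behavior, Payload: payload}
}

// Run processes messages until the inbox is closed,
// then reports how many messages were handled.
func (a *Actor) Run(done chan<- int) {
	for msg := range a.inbox {
		a.handled++
		fmt.Printf("%s got %q from %s\n", a.name, msg.Behavior, msg.From)
	}
	done <- a.handled
}

func main() {
	node := NewActor("node")
	alloc := NewActor("allocation")
	done := make(chan int)

	go alloc.Run(done)
	node.Send(alloc, "/hello", "ping")
	node.Send(alloc, "/status", "")
	close(alloc.inbox)

	fmt.Println("handled:", <-done)
}
```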
Feature name
Implementation of User Controlled Authorization Network (UCAN); DIDs and key management; feature::ucan
Work packages
Code reference
Description / definition of done
Timing
Closely integrated with the Actor system implementation; Every message requires UCAN tokens to be included and verified;
Status
Close to completion
Team
Lead: @vyzo
Strategic alignment
Who it benefits
User challenge
Value score
n/a
Impacted functionality
Implementation of the fundamental zero trust security model.
Acceptance tests
Feature name
Dynamic method dispatch logic for initiating behaviors in actors; feature::remote-invocation
Work packages
Implemented within the scope of Node package
Code reference
issue with minimal description;
Description / definition of done
Methods / functions can be run remotely by sending a message from one Actor to another
Timing
Status
In progress
Team
Strategic alignment
Who it benefits
User challenge
Value score
n/a
Design
Impacted functionality
Fundamental functionality that enables the full realization of the Actor model potential
Acceptance tests
Unit tests of around 90%; Functional / integration tests: sending an rpc call from one Actor (node or allocation) on a different network configuration to another Actor (node or allocation) and initiating the chosen method; tracking issue
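Dynamic dispatch of behaviors by name, as described in this feature, might be sketched as a registry that maps the behavior name carried in a message to a handler function. This is only an illustration under assumptions: `Dispatcher`, `Handler`, `ErrUnknownBehavior` and the `/echo` behavior name are hypothetical and do not come from the Node package.

```go
package main

import (
	"errors"
	"fmt"
)

// Handler is a behavior that can be invoked by name (hypothetical signature).
type Handler func(payload string) (string, error)

// ErrUnknownBehavior is returned when no handler is registered for a name.
var ErrUnknownBehavior = errors.New("unknown behavior")

// Dispatcher maps behavior names to handlers.
type Dispatcher struct {
	handlers map[string]Handler
}

// NewDispatcher returns an empty dispatcher.
func NewDispatcher() *Dispatcher {
	return &Dispatcher{handlers: make(map[string]Handler)}
}

// Register binds a behavior name to a handler, replacing any previous one.
func (d *Dispatcher) Register(name string, h Handler) {
	d.handlers[name] = h
}

// Invoke looks up the handler for name and runs it with the payload.
func (d *Dispatcher) Invoke(name, payload string) (string, error) {
	h, ok := d.handlers[name]
	if !ok {
		return "", fmt.Errorf("%w: %s", ErrUnknownBehavior, name)
	}
	return h(payload)
}

func main() {
	d := NewDispatcher()
	d.Register("/echo", func(p string) (string, error) { return p, nil })

	out, err := d.Invoke("/echo", "hello")
	fmt.Println(out, err)

	_, err = d.Invoke("/missing", "")
	fmt.Println(errors.Is(err, ErrUnknownBehavior))
}
```

In a real actor, `Invoke` would be called from the message-processing loop, with the name and payload taken from the received message.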
Feature name
Local access to running dms from the machine on which it is running; feature::cli-access
Work packages
Code reference
Description / definition of done
Timing
Status
Almost complete
Team
Strategic alignment
Who it benefits
User challenge
Value score
n/a
Design
Impacted functionality
Configuration of dms; Access to NuNet network from external applications via REST-API;
Acceptance tests
Unit tests of around 90%; Functional / integration tests: api responds to locally issued commands; api does not respond to remotely issued commands; tracking issue
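The acceptance criterion above (the API answers local commands and ignores remote ones) can be illustrated with a small check on the caller's address. This is a sketch under assumptions, not the dms mechanism: `IsLocalRequest` is a hypothetical helper and the port number is arbitrary. It only uses standard library calls (`net.SplitHostPort`, `net.ParseIP`, `IP.IsLoopback`).

```go
package main

import (
	"fmt"
	"net"
)

// IsLocalRequest reports whether remoteAddr ("host:port") refers to a
// loopback address, i.e. the request originates from the same machine.
func IsLocalRequest(remoteAddr string) bool {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return false // malformed addresses are treated as remote
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

func main() {
	addrs := []string{"127.0.0.1:9999", "[::1]:9999", "203.0.113.7:9999"}
	for _, addr := range addrs {
		fmt.Println(addr, IsLocalRequest(addr))
	}
}
```

An alternative that achieves the same effect is to bind the listener to a loopback address only, so that remote connections never reach the API.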
Feature name
Local NoSQL database interface and implementation; feature::local-db
Work packages
Code reference
Description / definition of done
Timing
Status
Almost completed
Team
Strategic alignment
Who it benefits
User challenge
Value score
n/a
Design
Impacted functionality
Configuration management; Local telemetry and logging management (possibly);
Acceptance tests
Unit tests of around 90%; Functional / integration tests: Arbitrary information can be stored, retrieved and searched via the implemented interface; tracking issue
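To make the "stored, retrieved and searched" criterion concrete, here is a minimal sketch of a key-value interface with an in-memory implementation. The `Store` interface, its method names and the prefix-based search are assumptions for illustration; the actual database interface in dms may look different.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Store is a hypothetical minimal key-value interface.
type Store interface {
	Put(key string, value string)
	Get(key string) (string, bool)
	Search(prefix string) []string
}

// MemStore is an in-memory Store used only for illustration.
type MemStore struct {
	data map[string]string
}

// Compile-time check that MemStore implements Store.
var _ Store = (*MemStore)(nil)

// NewMemStore returns an empty in-memory store.
func NewMemStore() *MemStore {
	return &MemStore{data: make(map[string]string)}
}

// Put stores value under key, overwriting any existing value.
func (m *MemStore) Put(key, value string) { m.data[key] = value }

// Get returns the value stored under key and whether it exists.
func (m *MemStore) Get(key string) (string, bool) {
	v, ok := m.data[key]
	return v, ok
}

// Search returns all keys starting with prefix, in sorted order.
func (m *MemStore) Search(prefix string) []string {
	var keys []string
	for k := range m.data {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	s := NewMemStore()
	s.Put("config/port", "9999")
	s.Put("config/debug", "true")
	s.Put("log/level", "info")

	v, ok := s.Get("config/port")
	fmt.Println(v, ok)
	fmt.Println(s.Search("config/"))
}
```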
Feature name
Executor interface and implementation of docker and firecracker executors; feature::execution-and-resources
Work packages
Code reference
Description / definition of done
Timing
Status
Finished
Team
Strategic alignment
Who it benefits
User challenge
Value score
n/a
Design
Impacted functionality
Definition of a generic interface for easily plugging third-party developed executables into dms; Full implementation of docker and firecracker executables;
Acceptance tests
Unit tests of around 90%; Functional / integration tests: starting a compute job with docker / firecracker executables; observing the runtime; finishing and receiving results; tracking issue
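The lifecycle in the acceptance tests above (start a job, observe it, finish and receive results) suggests what a generic executor interface could look like. The sketch below is hypothetical: the `Executor` interface, its method names and the `EchoExecutor` stand-in are assumptions for illustration and do not reproduce the docker or firecracker implementations.

```go
package main

import (
	"errors"
	"fmt"
)

// JobStatus describes the lifecycle state of a job.
type JobStatus string

const (
	StatusRunning  JobStatus = "running"
	StatusFinished JobStatus = "finished"
)

// ErrNoJob is returned when a job ID is not known to the executor.
var ErrNoJob = errors.New("no such job")

// Executor is a hypothetical generic interface that a pluggable
// execution backend would implement.
type Executor interface {
	Start(jobID string, command string) error
	Status(jobID string) (JobStatus, error)
	Wait(jobID string) (string, error)
}

// EchoExecutor is a stand-in backend that "runs" a job by echoing its command.
type EchoExecutor struct {
	jobs map[string]string
	done map[string]bool
}

// NewEchoExecutor returns an executor with no jobs.
func NewEchoExecutor() *EchoExecutor {
	return &EchoExecutor{jobs: map[string]string{}, done: map[string]bool{}}
}

// Start registers a new job; starting the same ID twice is an error.
func (e *EchoExecutor) Start(jobID, command string) error {
	if _, exists := e.jobs[jobID]; exists {
		return fmt.Errorf("job %s already started", jobID)
	}
	e.jobs[jobID] = command
	return nil
}

// Status reports whether a known job is still running or finished.
func (e *EchoExecutor) Status(jobID string) (JobStatus, error) {
	if _, ok := e.jobs[jobID]; !ok {
		return "", ErrNoJob
	}
	if e.done[jobID] {
		return StatusFinished, nil
	}
	return StatusRunning, nil
}

// Wait marks the job as finished and returns its result.
func (e *EchoExecutor) Wait(jobID string) (string, error) {
	cmd, ok := e.jobs[jobID]
	if !ok {
		return "", ErrNoJob
	}
	e.done[jobID] = true
	return "ran: " + cmd, nil
}

func main() {
	var ex Executor = NewEchoExecutor()
	if err := ex.Start("job-1", "echo hello"); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	st, _ := ex.Status("job-1")
	fmt.Println("status:", st)
	res, _ := ex.Wait("job-1")
	fmt.Println("result:", res)
}
```

Because callers depend only on the interface, a different backend can be swapped in without changing the orchestration code that starts and observes jobs.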
Feature name
Machine [Capability] benchmarking ; feature::machine-benchmarking
Work packages
Code reference
DMS package code and subpackages (mostly node, onboarding and resources subpackages)
Description / definition of done
Timing
Status
In progress
Team
Strategic alignment
Who it benefits
User challenge
Value score
n/a
Design
Impacted functionality
Machine benchmarking is needed for the Capability / Comparison interface implemented in dms.orchestrator.matching subpackage
Acceptance tests
Unit tests; Functional tests: Machines are benchmarked while onboarding, the benchmarking data is stored / accessed via database interface; tracking issue
Feature name
p2p network and routing; feature::p2p-network
Work packages
Code reference
Description / definition of done
implemented network package design
Timing
Status
Close to completion
Team
Strategic alignment
Who it benefits
User challenge
Sending and receiving messages directly or via broadcasts;
Value score
n/a
Design
Messages and routing partially explained in research blog on gossipsub, DHT and pull/push mechanisms
Impacted functionality
Fundamental functionality of NuNet -- connecting dms instances into p2p networks and subnetworks;
Acceptance tests
Unit tests; Functional tests: Actors (nodes and allocations) are able to see peers / neighbours; It is possible to send and receive messages from other Actors (nodes and allocations) either directly (addressed) or via gossip routing indirectly; tracking issue
Feature name
Storage interface definition and s3 storage implementation; feature::storage
Work packages
Code reference
Description / definition of done
Timing
Status
Finished
Team
Strategic alignment
Who it benefits
User challenge
Ability for running executors to access data via storage interface and its implementations
Value score
n/a
Design
Impacted functionality
Fundamental functionality of NuNet -- providing input and output data storage for computation processes
Acceptance tests
Unit tests; Functional tests: all executors are able to read and write data to the provided storage, as allowed and via the interface; tracking issue
Feature name
Observability and Telemetry design and implementation; feature::observability
Work packages
Code reference
Description / definition of done
Timing
Status
In progress
Team
Strategic alignment
Who it benefits
Development team by having full visibility of code logs and traces across the testing networks; Potential integrators and use-case developers for debugging their applications running on decentralized hardware; QA team accessing test logs;
User challenge
Value score
n/a
Design
Impacted functionality
Logging, tracing and monitoring of decentralized computing framework on any level of granularity; Constitutes a part of developer tooling of NuNet, which will be used by both internal team as well as community contributors
Acceptance tests
Unit tests; Functional tests / integration tests: after logging is implemented via telemetry interface and default logging is elasticsearch collector; all telemetry events are stored in elasticsearch database and can be analyzed via API / Kibana dashboard; tracking issue
Feature name
Structure, types and definitions of compute workflows / recursive jobs; feature::workflow-definition
Work packages
Code reference
Description / definition of done
Timing
Status
In progress
Team
Strategic alignment
Who it benefits
User challenge
Set NuNet job definition format and types in order to be able to orchestrate them via orchestration mechanism
Value score
n/a
Design
Impacted functionality
Used in job orchestration in order to be able to search and match fitting machines that are connected to the network; Related to Capability / Comparison model
Acceptance tests
Unit tests; Functional tests / integration tests: Ability to represent any job that can be represented via kubernetes / nomad in nunet job format / convert to inner type and orchestrate its execution; tracking issue
Feature name
Job deployment and orchestration model; feature::job-orchestration
Work packages
Code reference
Description / definition of done
Timing
Status
In progress
Team
Strategic alignment
Who it benefits
User challenge
Submit a job description and rely on NuNet platform to execute it optimally (including finding fitting machines, connecting them to subnetworks, etc.)
Value score
n/a
Design
Impacted functionality
Related to all sub-packages in the dms package and defines Capability / Comparison model
Acceptance tests
Unit tests; Functional tests / integration tests: Submit a job described in NuNet job description format, observe its deployment and execution and returning results; tracking issue
Feature name
Capability / Comparator model; feature::hardware-capability
Work packages
Part of the orchestrator work package
Code reference
Description / definition of done
Timing
Status
Close to completion
Team
Strategic alignment
Who it benefits
User challenge
Define requested compute Capabilities (included in job definition) and machine Capabilities for searching and matching fitting machines in the network
Value score
n/a
Design
Impacted functionality
(1) Ability to match compute requirements with available Capabilities of machines, considering not only hard (hardware, etc), but also soft requirements (price, time, etc preferences from both sides of matching process); (2) Comparing different machine Capabilities (selecting best machine for a job); (3) Adding / subtracting Capabilities in order to be able to calculate machine usage in real time; See also mentions of Capability model in other functionality descriptions in this document
Acceptance tests
Unit tests; Functional tests / integration tests: via job orchestration integration tests; tracking issue
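The three operations listed under impacted functionality (matching with hard and soft requirements, comparing machines, adding / subtracting Capabilities) can be sketched as follows. This is a heavily simplified illustration: the `Capability` and `Machine` types, the two resource dimensions and the "cheapest machine wins" rule are assumptions, not the model implemented in dms.orchestrator.matching.

```go
package main

import "fmt"

// Capability is a hypothetical, heavily simplified resource vector.
type Capability struct {
	CPUCores int
	RAMMB    int
}

// Add returns the component-wise sum of two capabilities.
func (c Capability) Add(o Capability) Capability {
	return Capability{CPUCores: c.CPUCores + o.CPUCores, RAMMB: c.RAMMB + o.RAMMB}
}

// Sub returns the component-wise difference, e.g. free resources after a job is placed.
func (c Capability) Sub(o Capability) Capability {
	return Capability{CPUCores: c.CPUCores - o.CPUCores, RAMMB: c.RAMMB - o.RAMMB}
}

// Covers reports whether c satisfies every dimension of req (hard requirement).
func (c Capability) Covers(req Capability) bool {
	return c.CPUCores >= req.CPUCores && c.RAMMB >= req.RAMMB
}

// Machine pairs free capability with a price, used here as a soft criterion.
type Machine struct {
	Name       string
	Free       Capability
	PriceCents int // price per hour in cents (illustrative unit)
}

// Match returns the cheapest machine that covers req and does not exceed
// maxPriceCents. The boolean is false when no machine qualifies.
func Match(machines []Machine, req Capability, maxPriceCents int) (Machine, bool) {
	var best Machine
	found := false
	for _, m := range machines {
		if !m.Free.Covers(req) || m.PriceCents > maxPriceCents {
			continue
		}
		if !found || m.PriceCents < best.PriceCents {
			best, found = m, true
		}
	}
	return best, found
}

func main() {
	machines := []Machine{
		{"a", Capability{4, 8192}, 10},
		{"b", Capability{8, 16384}, 8},
		{"c", Capability{2, 4096}, 1},
	}
	req := Capability{CPUCores: 4, RAMMB: 8000}

	if m, ok := Match(machines, req, 20); ok {
		fmt.Println("selected:", m.Name, "remaining:", m.Free.Sub(req))
	}
}
```

Subtracting the request from the selected machine's free capability gives the remaining capacity, which is the kind of bookkeeping needed to track machine usage in real time.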
Feature name
Supervision model; feature::supervision-model
Work packages
Part of the orchestrator work package
Code reference
Description / definition of done
Timing
Status
Not started
Team
Strategic alignment
Who it benefits
User challenge
Value score
n/a
Design
Impacted functionality
Ability to build a 'decentralized' control plane on NuNet; error propagation between Actors participating in the same compute workflow; heartbeat and health-check functionalities; conceptually, supervisor model enables failure recovery and fault tolerance features in the network; related to 'remote procedure calls' functionality;
Acceptance tests
Unit tests; Functional tests / integration tests: build hierarchies of actors (nodes and allocations) that can observe each other; tracking issue
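Since this feature has not been started, the following is only a conceptual sketch of the heartbeat and health-check idea mentioned under impacted functionality. The `Supervisor` type, its methods and the use of plain Unix-second timestamps are assumptions made for illustration: a supervisor records when each supervised actor last reported in and flags those that have been silent for longer than a timeout.

```go
package main

import (
	"fmt"
	"sort"
)

// Supervisor tracks the last heartbeat of each supervised actor.
// Timestamps are Unix seconds to keep the example deterministic.
type Supervisor struct {
	timeoutSec int64
	lastSeen   map[string]int64
}

// NewSupervisor creates a supervisor with the given heartbeat timeout.
func NewSupervisor(timeoutSec int64) *Supervisor {
	return &Supervisor{timeoutSec: timeoutSec, lastSeen: make(map[string]int64)}
}

// Heartbeat records that child reported in at time atSec.
func (s *Supervisor) Heartbeat(child string, atSec int64) {
	s.lastSeen[child] = atSec
}

// Unhealthy returns, in sorted order, the children whose last heartbeat
// is older than the timeout relative to nowSec.
func (s *Supervisor) Unhealthy(nowSec int64) []string {
	var out []string
	for child, seen := range s.lastSeen {
		if nowSec-seen > s.timeoutSec {
			out = append(out, child)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	sup := NewSupervisor(30)
	sup.Heartbeat("allocation-1", 100)
	sup.Heartbeat("allocation-2", 125)

	// At t=140 only allocation-1 has been silent for more than 30 seconds.
	fmt.Println(sup.Unhealthy(140))
}
```

A supervisor could react to an unhealthy child by propagating an error to its own supervisor or by rescheduling the work, which is how hierarchies of actors would provide failure recovery.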
Feature name
Tokenomics interface; feature::tokenomics
Work packages
Code reference
Description / definition of done
Minimal implementation of the interface in order to be able to implement micro-payments layer in the next milestone
Timing
Status
In progress
Team
Strategic alignment
Who it benefits
User challenge
Value score
n/a
Design
Impacted functionality
Ability to conclude peer to peer contracts between machines requesting a job and machines accepting job execution (eventually); Ability to include explicit contract information into each job invocation request, independently of the type of contract and micro-payment channels implementation
Acceptance tests
Unit tests; Functional tests / integration tests as part of job orchestration; tracking issue
Feature name
Prepare and launch project management portal
Work package
There is no specific work package in the current milestone, as the project management portal was developed outside its scope;
Code reference
Description / definition of done
1) Project management pages always reflect the latest status of each milestone; 2) Introductory page explains project management portal and process for internal and external users
Status
Close to completion
Team
Lead: @janaina.senna; Supporting: @0xPravar, @kabir.kbr
Strategic alignment
Project management portal is a part of the Critical Chain Project Management (CCPM) process that is the basis of our development flow and is integrated into the team process; The aim of the CCPM process is to reach the level where we are able to plan, schedule and implement milestones and releases without delay; The portal is designed to automatically generate and share all important updates about the process with the team as well as external contributors in order to make the process as efficient and transparent as possible; The process has been used internally since mid-2023 and shall be included into the contribution program and community developers process;
Who it benefits
1) Development team follows the CCPM process via the project management portal; 2) Marketing team uses project management portal (as well as other internal resources) to shape communications transparently; 3) Operations and business development team can always see the recent status in development process
User challenge
Value score
n/a
Design
As described in project management portal repository readme
Impacted functionality
This does not affect any feature directly but fundamentally enables alignment with the business goals, use-case based platform development model and evolvability of the software.
Acceptance tests
Feature name
IP over Libp2p; feature::ip-over-libp2p
Work packages
Within the scope of network implementation work package; dependent work package in another milestone
Code reference
Description / definition of done
Timing
Status
Close to completion
Team
Strategic alignment
Who it benefits
User challenge
Run programs and frameworks on nunet-enabled machines that need ipv4 connectivity layer
Value score
n/a
Design
Impacted functionality
Acceptance tests
Unit tests; Functional tests / integration tests: (1) spawn an IPv4 network for containers running on different machines to directly interact with each other; (2) Access compute providers via Kubernetes cluster / orchestrate jobs via Kubernetes cluster (advanced); tracking issue
Drive Folder with related working documents and discussions; project management portal repository; gitbook public pages
Ability to integrate with third party frameworks for orchestration (e.g. Kubernetes, others) as well as run distributed software (database clusters, etc.); Will be mostly used for the