Last updated: 2024-11-07 21:04:49.492175 File source: link on GitLab
This file explains the onboarding functionality of the Device Management Service (DMS). This functionality caters to compute providers who wish to provide their hardware resources to Nunet for running computational tasks, as well as developers who are contributing to platform development.
Here is a quick overview of the contents of this directory:
README: Current file which is aimed towards developers who wish to modify the onboarding functionality and build on top of it.
handler: This is the main file where the code for the onboarding functionality exists.
addresses: This file houses functions to generate Cardano wallet addresses along with their private keys.
addresses_test: This file houses functions to test the address generation functions defined in addresses.
available_resources: This file houses functions to get the total capacity of the machine being onboarded.
init: This file initializes the loggers associated with the onboarding package.
The class diagram for the onboarding package is shown below.
Source file
Rendered from source file
Onboard
signature: Onboard(ctx context.Context, config types.OnboardingConfig) error
input #1: Context object
input #2: types.OnboardingConfig
output (error): Error message
The Onboard function executes the onboarding process for a compute provider based on the configuration provided.
signature: Offboard(ctx context.Context, force bool) error
input #1: Context object
input #2: force parameter
output: None
output (error): Error message
Offboard removes the resources onboarded to Nunet. If the force parameter is true, the offboarding process continues even in the presence of errors.
signature: IsOnboarded(ctx context.Context) (bool, error)
input #1: Context object
output #1: bool
output #2: error
IsOnboarded checks if the compute provider is onboarded.
signature: Info(ctx context.Context) (types.OnboardingConfig, error)
input #1: Context object
output #1: types.OnboardingConfig
output #2: error
Info returns the configuration of the onboarding process.
types.OnboardingConfig
: Holds the configuration for onboarding a compute provider.
All the tests for the onboarding package can be found in the onboarding_test.go file.
List of issues
All issues that are related to the implementation of dms
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Last updated: 2024-11-07 21:04:48.114546 File source: link on GitLab
This package is responsible for starting the whole application. It also contains various core functionality of DMS:
Onboarding compute provider devices
Job orchestration and management
Resource management
Actor implementation for each node
Here is a quick overview of the contents of this package:
README: Current file which is aimed towards developers who wish to use and modify the dms functionality.
dms: This file contains code to initialize the DMS by loading configuration, starting the REST API server, etc.
init: This file creates a new logger instance.
sanity_check: This file defines a method for performing a consistency check before starting the DMS (proposed).
Note that the functionality of this method needs to be developed as per the refactored DMS design.
Subpackages
jobs: Deals with the management of local jobs on the machine.
node: Contains the implementation of Node as an actor.
onboarding: Code related to onboarding of compute provider machines to the network.
orchestrator: Contains job orchestration logic.
resources: Deals with the management of resources on the machine.
proposed: All files with the *_test.go naming convention contain unit tests for the corresponding implementation.
The class diagram for the dms package is shown below.
Source file
Rendered from source file
TBD
Note: the functionality of DMS is currently being developed. See the proposed section for the suggested design of interfaces and methods.
TBD
Note: the functionality of DMS is currently being developed. See the proposed section for the suggested data types.
proposed
Refer to *_test.go
files for unit tests of different functionalities.
List of issues
All issues that are related to the implementation of dms
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Interfaces & Methods
proposed
Capability_interface
add
method will combine capabilities of two nodes. Example usage - When two jobs have to be run on a single machine, the capability requirements of each will need to be combined.
subtract
method will subtract two capabilities. Example usage - When resources are locked for a job, the available capability of a machine will need to be reduced.
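The two proposed methods can be illustrated with a small sketch. The `Capability` struct below is a simplified, hypothetical stand-in for dms.Capability with only three numeric fields; the real data model is expected to be richer, and returning an error from `Subtract` when the result would go negative is an assumption of this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Capability is a simplified, hypothetical stand-in for dms.Capability.
type Capability struct {
	CPUCores int
	RAMMB    int
	DiskMB   int
}

// Add returns the combined capability of c and other,
// e.g. the total requirement of two jobs placed on one machine.
func (c Capability) Add(other Capability) Capability {
	return Capability{
		CPUCores: c.CPUCores + other.CPUCores,
		RAMMB:    c.RAMMB + other.RAMMB,
		DiskMB:   c.DiskMB + other.DiskMB,
	}
}

// Subtract returns c reduced by other. It fails if other exceeds c
// in any dimension, so available capability never goes negative.
func (c Capability) Subtract(other Capability) (Capability, error) {
	if other.CPUCores > c.CPUCores || other.RAMMB > c.RAMMB || other.DiskMB > c.DiskMB {
		return Capability{}, errors.New("insufficient capability")
	}
	return Capability{
		CPUCores: c.CPUCores - other.CPUCores,
		RAMMB:    c.RAMMB - other.RAMMB,
		DiskMB:   c.DiskMB - other.DiskMB,
	}, nil
}

func main() {
	jobA := Capability{CPUCores: 2, RAMMB: 2048, DiskMB: 10000}
	jobB := Capability{CPUCores: 1, RAMMB: 1024, DiskMB: 5000}
	required := jobA.Add(jobB) // combined requirement of both jobs

	machine := Capability{CPUCores: 8, RAMMB: 16384, DiskMB: 100000}
	remaining, err := machine.Subtract(required) // lock resources for the jobs
	if err != nil {
		fmt.Println("cannot lock resources:", err)
		return
	}
	fmt.Printf("required:  %+v\nremaining: %+v\n", required, remaining)
}
```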
Data types
proposed
dms.Capability
The Capability
struct will capture all the relevant data that defines the capability of a node to perform the job. At the same time this will be used to define capability requirements that a job requires from a node.
An initial data model for Capability
is defined below.
proposed
dms.Connectivity
type Connectivity struct {
}
proposed
dms.PriceInformation
proposed
dms.TimeInformation
type TimeInformation struct {
    // Units holds the units of time, e.g. hours, days, weeks
    Units string
}
Last updated: 2024-11-07 21:04:50.078197 File source: link on GitLab
This whole package is in proposed status, and therefore documentation is missing, save for the proposed functionality part.
TBD
TBD
Source
Rendered from source file
TBD
TBD
TBD
List of issues
All issues that are filed in GitLab related to the implementation of dms/orchestrator
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Proposed functionalities
TBD
Data types
proposed
LocalNetworkTopology
More complex deployments may need a data structure that considers the local network topology of a node / DMS, i.e. for reasoning about the speed of connection (as well as capabilities) between neighbors.
Related research blogs
TBD
Last updated: 2024-11-07 21:04:50.341943 File source:
The resources package deals with resource management for the machine. This includes calculation of available resources for new jobs or bid requests.
Here is a quick overview of the contents of this package:
All files with the *_test.go naming convention contain unit tests for the corresponding functionality.
The class diagram for the resources package is shown below.
Source file
Rendered from source file
Manager Interface
The interface methods are explained below.
AllocateResources
signature: AllocateResources(context.Context, ResourceAllocation) error
input: Context
input: ResourceAllocation
output (error): Error message
AllocateResources
allocates the resources to the job.
DeallocateResources
signature: DeallocateResources(context.Context, string) error
input: Context
output (error): Error message
DeallocateResources
deallocates the resources from the job.
GetTotalAllocation
signature: GetTotalAllocation() (Resources, error)
input: None
output: Resources
output (error): Error message
GetTotalAllocation
returns the total resources allocated to the jobs.
GetFreeResources
signature: GetFreeResources() (FreeResources, error)
input: None
output: FreeResources
output (error): Error message
GetFreeResources
returns the available resources in the allocation pool.
GetOnboardedResources
signature: GetOnboardedResources(context.Context) (OnboardedResources, error)
input: Context
output: OnboardedResources
output (error): Error message
GetOnboardedResources
returns the resources onboarded to dms.
UpdateOnboardedResources
signature: UpdateOnboardedResources(context.Context, OnboardedResources) error
input: Context
input: OnboardedResources
output (error): Error message
UpdateOnboardedResources
updates the resources onboarded to dms.
UsageMonitor
signature: UsageMonitor() types.UsageMonitor
input: None
output: types.UsageMonitor instance
output (error): None
UsageMonitor returns the types.UsageMonitor instance.
This interface defines methods to monitor the system usage. The methods are explained below.
GetUsage
signature: GetUsage(context.Context) (types.Resource, error)
input: Context
output: types.Resource
output (error): Error message
GetUsage
returns the resources currently used by the machine.
types.Resources
: resources defined for the machine.
types.AvailableResources
: resources onboarded to Nunet.
types.FreeResources
: resources currently available for new jobs.
types.ResourceAllocation
: resources allocated to a job.
types.MachineResources
: resources available on the machine.
types.GPUVendor
: GPU vendors available on the machine.
types.GPU
: GPU details.
types.GPUs
: A slice of GPU
.
types.CPU
: CPU details.
types.RAM
: RAM details.
types.Disk
: Disk details.
types.NetworkInfo
: Network details.
Last updated: 2024-11-07 21:04:48.695786 File source:
This package manages local jobs and their allocation, including their relation to execution environments, etc. It will manage jobs through whatever executor it is running (Container, VM, Direct_exe, Java, etc.).
Here is a quick overview of the contents of this directory:
TBD
The class diagram for the jobs package is shown below.
Source file
Rendered from source file
TBD
TBD
TBD
List of issues
All issues that are related to the implementation of dms
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Interfaces & Methods
proposed
Job interface
getPods
: will fetch list of pods currently running on the machine
proposed
Pod interface
combineCapabilities
: will combine the capability requirements of different jobs to calculate the total capability needed for a Pod.
proposed
JobLink interface
validateRelations
: It validates the JobLink properties provided.
proposed
Allocation interface
start
: starts the allocation execution
sendMessage
: sends a message to another actor (Node/Allocation)
register
: registers the allocation with the node that it is running on
Data types
proposed
dms.jobs.Job
: Nunet job which will be sent to the network wrapped as a BidRequest
. If needed it will have child jobs to be executed. The relation between parent and child job needs to be specified.
proposed
dms.jobs.JobLink
: specifies the properties that relate parent and child job.
proposed
dms.jobs.Pod
: collection of jobs that need to be executed on the same machine.
proposed
dms.jobs.Allocation
: maps the job to the process on an executor. Each Allocation is an Actor.
proposed
dms.jobs.AllocationID
: identifier for Allocation objects.
Last updated: 2024-11-07 21:04:49.219066 File source:
proposed
This package is responsible for creating a Node object, which is the main actor residing on the machine as long as DMS is running. The Node gets created when the DMS is onboarded.
The Node is responsible for:
Communicating with other actors (nodes and allocations) via messages. This will include sending bid requests, bids, invocations, job status, etc.
Checking used and free resources before creating allocations
Continuous monitoring of the machine
Here is a quick overview of the contents of this package:
The class diagram for the node package is shown below.
Source file
Rendered from source file
TBD
TBD
proposed
Refer to *_test.go
files for unit tests of different functionalities.
List of issues
All issues that are related to the implementation of dms
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Interfaces & Methods
proposed
Node_interface
getAllocation
method retrieves an Allocation
on the machine based on the provided AllocationID
.
checkAllocationStatus
method will retrieve status of an Allocation
.
routeToAllocation
method will route a message to the Allocation
of the job that is running on the machine.
benchmarkCapability
method will perform machine benchmarking
setRegisteredCapability
method will record the benchmarked Capability of the machine into a persistent data store for retrieval and usage (mostly in job orchestration functionality)
getRegisteredCapability
method will retrieve the benchmarked Capability of the machine from the persistent data store.
setAvailableCapability
method changes the available capability of the machine when resources are locked
getAvailableCapability
method will return currently available capability of the node
lockCapability
method will lock a certain amount of resources for a job. This can happen during bid submission, but it must happen once the job is accepted and before invocation.
getLockedCapabilities
method retrieves the locked capabilities of the machine.
setPreferences
method sets the preferences of a node as dms.orchestrator.CapabilityComparator
getPreferences
method retrieves the node preferences as dms.orchestrator.CapabilityComparator
getRegisteredBids
method retrieves the list of bids received for a job.
startAllocation
method will create an allocation based on the invocation received.
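The allocation-related methods of the proposed Node interface (startAllocation, getAllocation, checkAllocationStatus, routeToAllocation) can be sketched as a registry keyed by AllocationID. Everything below is a hypothetical illustration: the `Status` values, the fields of `Allocation` and `Invocation`, the ID format, and the use of plain strings as messages are assumptions, and the capability, preference and bid methods are not covered.

```go
package main

import "fmt"

// AllocationID identifies an Allocation (the format used here is an assumption).
type AllocationID string

// Status is a hypothetical allocation status.
type Status string

const StatusRunning Status = "running"

// Allocation is a simplified stand-in for dms.jobs.Allocation.
type Allocation struct {
	ID     AllocationID
	JobID  string
	Status Status
	Inbox  []string // messages routed to this allocation
}

// Invocation carries what is needed to start an allocation (hypothetical field).
type Invocation struct {
	JobID string
}

// Node keeps track of the allocations running on the machine.
type Node struct {
	allocations map[AllocationID]*Allocation
	nextID      int
}

func NewNode() *Node {
	return &Node{allocations: make(map[AllocationID]*Allocation)}
}

// startAllocation creates an allocation based on the invocation received.
func (n *Node) startAllocation(inv Invocation) AllocationID {
	n.nextID++
	id := AllocationID(fmt.Sprintf("alloc-%d", n.nextID))
	n.allocations[id] = &Allocation{ID: id, JobID: inv.JobID, Status: StatusRunning}
	return id
}

// getAllocation retrieves an Allocation by its AllocationID.
func (n *Node) getAllocation(id AllocationID) (*Allocation, error) {
	a, ok := n.allocations[id]
	if !ok {
		return nil, fmt.Errorf("allocation %q not found", id)
	}
	return a, nil
}

// checkAllocationStatus retrieves the status of an Allocation.
func (n *Node) checkAllocationStatus(id AllocationID) (Status, error) {
	a, err := n.getAllocation(id)
	if err != nil {
		return "", err
	}
	return a.Status, nil
}

// routeToAllocation delivers a message to the Allocation of a running job.
func (n *Node) routeToAllocation(id AllocationID, msg string) error {
	a, err := n.getAllocation(id)
	if err != nil {
		return err
	}
	a.Inbox = append(a.Inbox, msg)
	return nil
}

func main() {
	node := NewNode()
	id := node.startAllocation(Invocation{JobID: "job-42"})

	if err := node.routeToAllocation(id, "status?"); err != nil {
		fmt.Println("routing failed:", err)
		return
	}
	status, _ := node.checkAllocationStatus(id)
	fmt.Printf("%s is %s\n", id, status)
}
```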
Data types
proposed
dms.node.Node
An initial data model for Node
is defined below.
proposed
dms.node.NodeID
: Current file which is aimed towards developers who wish to use and modify the DMS functionality.
: Contains the initialization of the package.
: Contains the resource manager which is responsible for managing the resources of dms.
: Contains the implementation of the UsageMonitor
interface.
: Contains the implementation of the store
for the resource manager.
Allocation as an Actor: As per initial , Allocation is considered an Actor. This makes a running job a first-class citizen of NuNet's Actor model, able to send and receive messages and maintain state.
Last updated: 2024-11-07 21:04:49.810298 File source: link on GitLab
The orchestrator is responsible for job scheduling and management (manages jobs on other DMSs).
A key distinction to note is the option of two types of orchestration mechanisms: push and pull. Broadly speaking, pull orchestration works on the premise that resource providers bid for jobs available in the network, while push orchestration works when a job is pushed directly to a known resource provider, constituting a more centralized orchestration. push orchestration develops on the idea that users choose from the available providers and their resources. However, given the decentralized and open nature of the platform, it may be required to engage the providers to get their current (latest) state and preferences. This leads to an overlap with the pull orchestration approach.
The default setting is to use pull based orchestration, which is developed in the present proposed specification.
proposed
Job Orchestration
The proposed lifecycle of a job on the Nunet platform consists of various operations from job posting to settlement of the contract. Below is a brief explanation of the steps involved in the job orchestration:
Job Posting: The user posts a job request to the DMS. The job request is validated and a Nunet job is created in the DMS.
Search and Match:
a. The service provider DMS requests bids from other nodes in the network.
b. DMS on compute provider compares the capability of the available resources against job requirements. If all the requirements are met, it then decides whether to submit a bid.
c. The received bids are assessed and the best bid is selected.
Job Request: In case the shortlisted compute provider has not locked the resources while submitting the bid, the job request workflow is executed. This requires the compute provider DMS to lock the necessary resources required for the job and re-submit the bid. Note that at this stage compute provider can still decline the job request.
Contract Closure: The service provider and the shortlisted compute provider verify that the counterparty is a verified entity and approved by Nunet Solutions to participate in the network. This is an important step to establish trust before any work is performed.
If the job does not require any payment (Volunteer Compute), a contract is generated by both the Service Provider and Compute Provider DMS. This is then verified by the Contract-Database. Otherwise, proof of contract needs to be received from the Contract-Database before the start of work.
Invocation and Allocation: When the contract closure workflow is completed, both the service provider and compute provider DMS have an agreement and proof of contract with them. Then the service provider DMS will send an invocation to the compute provider DMS which results in job allocation being created. Allocation can be understood as an execution space / environment on actual hardware that enables a Job to be executed.
Job Execution: Once allocation is created, the job execution starts on the compute provider machine.
Contract Settlement: After the job is completed, the service provider DMS verifies the work done. If the work is correct, the Contract-Database makes the necessary transactions to settle the contract.
See References section for research blogs with more details on this topic.
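The order of the lifecycle steps above can be sketched as a small state machine. This is only an illustration of the sequencing described in the text: the `Stage` type, the `Next` function and the terminal `Done` stage are hypothetical names, and the only branch modeled is that the Job Request step is skipped when the compute provider already locked resources while submitting its bid.

```go
package main

import "fmt"

// Stage is one step of the proposed job lifecycle.
type Stage int

const (
	JobPosting Stage = iota
	SearchAndMatch
	JobRequest
	ContractClosure
	InvocationAndAllocation
	JobExecution
	ContractSettlement
	Done // hypothetical terminal stage, not named in the text
)

var stageNames = [...]string{
	"Job Posting", "Search and Match", "Job Request", "Contract Closure",
	"Invocation and Allocation", "Job Execution", "Contract Settlement", "Done",
}

func (s Stage) String() string {
	if s < 0 || int(s) >= len(stageNames) {
		return "Unknown"
	}
	return stageNames[s]
}

// Next returns the stage that follows s. When resources were already locked
// at bid time, the Job Request stage is skipped.
func Next(s Stage, resourcesLockedAtBid bool) Stage {
	switch {
	case s >= Done:
		return Done
	case s == SearchAndMatch && resourcesLockedAtBid:
		return ContractClosure
	default:
		return s + 1
	}
}

func main() {
	// Walk the lifecycle for a provider that locked resources with its bid.
	for s := JobPosting; s != Done; s = Next(s, true) {
		fmt.Println(s)
	}
}
```

Failure paths (a declined job request, failed verification, and so on) are deliberately left out of this sketch.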
Here is a quick overview of the contents of this directory:
README: Current file which is aimed towards developers who wish to use and modify the orchestrator functionality.
specs: Directory containing package specifications, including package class diagram.
Subpackages
graph: Defines and implements interfaces of Graph logic for network topology awareness (proposed).
Source
Rendered from source file
TBD
Note: the functionality of DMS is currently being developed. See the proposed section for the suggested design of interfaces and methods.
TBD
Note: the functionality of DMS is currently being developed. See the proposed section for the suggested data types.
TBD
List of issues
All issues that are related to the implementation of dms
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Interfaces & Methods
proposed
Orchestrator interface
publishBidRequest
: sends a request for bid to the network for a particular job. This will depend on the network
package for propagation of the request to other nodes in the network.
compareCapability
: compares two capabilities and returns a CapabilityComparison
object. Expected usage is to compare capability required in a job with the available capability of a node.
acceptJob
: looks at the comparison between capabilities and preferences of a node in the form of CapabilityComparator
object and decides whether to accept a job or not.
sendBid
: sends a bid to the node that propagated the BidRequest
.
selectBestBid
: looks at all the bids received and selects the best one.
sendJobRequest
: sends a job request to the shortlisted node whose bid was selected. The compute provider node needs to accept the job request and lock its resources for the job. In case resources are already locked while submitting the bid, this step may be skipped.
sendInvocation
: sends an invocation request (as a message) to the node that accepted the job. This message should have all the necessary information to start an Allocation
for the job.
orchestrateJob
: this will be called when a job is received via postJob endpoint. It will start the orchestration process. It is also possible that this method could be called via a timer for jobs scheduled in the future.
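As an illustration of selectBestBid, the sketch below picks a bid from a list. The selection criterion is an assumption made for this example (lowest price, ties broken by shortest estimated time); the specification does not define what "best" means. The `Bid` fields are hypothetical simplifications of dms.orchestrator.Bid, PriceBid and TimeBid.

```go
package main

import (
	"errors"
	"fmt"
)

// Bid is a simplified, hypothetical stand-in for dms.orchestrator.Bid.
type Bid struct {
	NodeID string
	Price  float64 // hypothetical: total price offered for the job
	Hours  float64 // hypothetical: estimated time to complete the job
}

// ErrNoBids is returned when there is nothing to choose from.
var ErrNoBids = errors.New("no bids received")

// selectBestBid returns the cheapest bid; among equally priced bids
// it prefers the one with the shortest estimated time.
func selectBestBid(bids []Bid) (Bid, error) {
	if len(bids) == 0 {
		return Bid{}, ErrNoBids
	}
	best := bids[0]
	for _, b := range bids[1:] {
		if b.Price < best.Price || (b.Price == best.Price && b.Hours < best.Hours) {
			best = b
		}
	}
	return best, nil
}

func main() {
	bids := []Bid{
		{NodeID: "node-a", Price: 12.0, Hours: 3},
		{NodeID: "node-b", Price: 10.0, Hours: 5},
		{NodeID: "node-c", Price: 10.0, Hours: 4},
	}
	best, err := selectBestBid(bids)
	if err != nil {
		fmt.Println("selection failed:", err)
		return
	}
	fmt.Println("selected bid from", best.NodeID)
}
```

A fuller implementation would presumably also take the CapabilityComparison result and the requester's preferences into account.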
proposed
Actor interface
sendMessage
: sends a message to another actor (Node / Allocation).
processMessage
: processes the message received and decides on what action to take.
proposed
Mailbox interface
receiveMessage
: receives a message from another Node and converts it into a telemetry.Message
object.
handleMessage
: processes the message received.
triggerBehavior
: this is where actions taken by the actor based on the message received will be defined.
getKnownTopics
: retrieves the gossip sub topics known to the node.
getSubscribedTopics
: retrieves the gossip sub topics subscribed by the node.
subscribeToTopic
: subscribes to a gossip sub topic.
unsubscribeFromTopic
: un-subscribes from a gossip sub topic.
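The interplay between the proposed Actor and Mailbox interfaces can be sketched with a buffered Go channel acting as the mailbox. This is a local, in-process illustration only: the real mailbox is expected to rely on the network package and produce telemetry.Message objects, whereas the `Message` struct, the `deliver` helper, `ErrMailboxFull` and the non-blocking delivery policy here are assumptions. Topic subscription methods are not covered.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a hypothetical, simplified message exchanged between actors.
type Message struct {
	From string
	Body string
}

// ErrMailboxFull is returned when a message cannot be queued.
var ErrMailboxFull = errors.New("mailbox is full")

// Mailbox queues incoming messages in a buffered channel.
type Mailbox struct {
	inbox chan Message
}

func NewMailbox(capacity int) *Mailbox {
	return &Mailbox{inbox: make(chan Message, capacity)}
}

// deliver enqueues a message without blocking the sender.
func (m *Mailbox) deliver(msg Message) error {
	select {
	case m.inbox <- msg:
		return nil
	default:
		return ErrMailboxFull
	}
}

// receiveMessage returns the next queued message, or false if none is waiting.
func (m *Mailbox) receiveMessage() (Message, bool) {
	select {
	case msg := <-m.inbox:
		return msg, true
	default:
		return Message{}, false
	}
}

// Actor has an identifier and a mailbox, as in dms.orchestrator.Actor.
type Actor struct {
	ID      string
	Mailbox *Mailbox
	Handled []string // record of processed messages, for illustration
}

func NewActor(id string, capacity int) *Actor {
	return &Actor{ID: id, Mailbox: NewMailbox(capacity)}
}

// sendMessage sends a message to another actor (Node / Allocation).
func (a *Actor) sendMessage(to *Actor, body string) error {
	return to.Mailbox.deliver(Message{From: a.ID, Body: body})
}

// processMessages drains the mailbox and returns how many messages were handled.
func (a *Actor) processMessages() int {
	n := 0
	for {
		msg, ok := a.Mailbox.receiveMessage()
		if !ok {
			return n
		}
		a.Handled = append(a.Handled, msg.From+": "+msg.Body)
		n++
	}
}

func main() {
	requester := NewActor("node-a", 8)
	provider := NewActor("node-b", 8)

	if err := requester.sendMessage(provider, "bid request for job-42"); err != nil {
		fmt.Println("send failed:", err)
		return
	}
	handled := provider.processMessages()
	fmt.Println("messages handled by provider:", handled)
	fmt.Println(provider.Handled)
}
```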
proposed
Other methods
Methods for job request functionality
a. check whether resources are locked
b. lock resources
c. accept job request
Methods for contract closure
a. validate other node as a registered entity
b. generate contract
c. KYC validation
Methods for job execution
a. handle job updates
Methods for contract settlement
a. job verification
Note that the above methods are not an exhaustive list. These are to be considered as suggestions. The developer implementing the orchestrator functionality is free to make modifications as necessary.
Data types
proposed
dms.orchestrator.Actor
: Actor has an identifier and a mailbox to send/receive messages.
proposed
dms.orchestrator.Bid
: Consists of information sent by the compute provider node to the requestor node as a bid for the job broadcast to the network.
proposed
dms.orchestrator.BidRequest
: A bid request is a message sent by a node to the network to request for bids.
proposed
dms.orchestrator.PriceBid
: Contains price related information of the bid.
proposed
dms.orchestrator.TimeBid
: Contains time related information of the bid.
proposed
dms.orchestrator.CapabilityComparator
: Preferences of the node which has an influence on the comparison operation.
TBD
proposed
dms.orchestrator.CapabilityComparison
: Result of the comparison operation.
TBD
proposed
dms.orchestrator.Invocation
: An invocation is a message sent by the orchestrator to the node that accepted the job. It contains the job details and the contract.
proposed
dms.orchestrator.Mailbox
: A mailbox is a communication channel between two actors. It uses network package functionality to send and receive messages.
proposed
Other data types
Data types related to allocation, contract settlement, job updates, etc. are currently omitted. These should be added as applicable during implementation.
Orchestration steps research blogs
The orchestrator functionality of DMS is being developed based on the research done in the following blogs:
Last updated: 2024-11-07 21:04:48.407065 File source: link on GitLab
The hardware package is responsible for handling the hardware related functionalities of the DMS.
Here is a quick overview of the contents of this package:
cpu: This package contains the functionality related to the CPU of the device.
ram.go: This file contains the functionality related to the RAM.
disk.go: This file contains the functionality related to the Disk.
gpu: This package contains the functionality related to the GPU of the device.
GetMachineResources()
signature: GetMachineResources() (types.MachineResources, error)
input: None
output: types.MachineResources
output(error): error
GetCPU()
signature: GetCPU() (types.CPU, error)
input: None
output: types.CPU
output(error): error
GetRAM()
signature: GetRAM() (types.RAM, error)
input: None
output: types.RAM
output(error): error
GetDisk()
signature: GetDisk() (types.Disk, error)
input: None
output: types.Disk
output(error): error
The hardware types can be found in the types package.
The tests can be found in the *_test.go
files in the respective packages.