orchestrator

Last updated: 2024-09-17 21:08:44.327009 File source: link on GitLab

Specification

Description

The orchestrator is responsible for job scheduling and management (manages jobs on other DMSs).

A key distinction to note is that there are two types of orchestration mechanisms: push and pull. Broadly speaking, pull orchestration works on the premise that resource providers bid for jobs available in the network, while push orchestration works by pushing a job directly to a known resource provider, which amounts to a more centralized form of orchestration. Push orchestration builds on the idea that users choose from the available providers and their resources. However, given the decentralized and open nature of the platform, it may be necessary to engage the providers to get their current (latest) state and preferences. This leads to an overlap with the pull orchestration approach.

The default setting is to use pull-based orchestration, which is developed in the present proposed specification.

proposed Job Orchestration

The proposed lifecycle of a job on the Nunet platform consists of various operations, from job posting to settlement of the contract. Below is a brief explanation of the steps involved in job orchestration:

  1. Job Posting: The user posts a job request to the DMS. The job request is validated and a Nunet job is created in the DMS.

  2. Search and Match:

    a. The service provider DMS requests bids from other nodes in the network.

    b. The DMS on the compute provider compares the capability of the available resources against the job requirements. If all the requirements are met, it then decides whether to submit a bid.

    c. The received bids are assessed and the best bid is selected.

  3. Job Request: If the shortlisted compute provider did not lock the resources when submitting the bid, the job request workflow is executed. This requires the compute provider DMS to lock the resources required for the job and re-submit the bid. Note that at this stage the compute provider can still decline the job request.

  4. Contract Closure: The service provider and the shortlisted compute provider verify that the counterparty is a verified entity and approved by Nunet Solutions to participate in the network. This is an important step to establish trust before any work is performed.

    If the job does not require any payment (Volunteer Compute), the contract is generated by both the Service Provider and Compute Provider DMS. It is then verified by the Contract-Database. Otherwise, proof of contract needs to be received from the Contract-Database before work starts.

  5. Invocation and Allocation: When the contract closure workflow is completed, both the service provider and compute provider DMS have an agreement and proof of contract with them. Then the service provider DMS will send an invocation to the compute provider DMS which results in job allocation being created. Allocation can be understood as an execution space / environment on actual hardware that enables a Job to be executed.

  6. Job Execution: Once allocation is created, the job execution starts on the compute provider machine.

  7. Contract Settlement: After the job is completed, the service provider DMS verifies the work done. If the work is correct, the Contract-Database makes the necessary transactions to settle the contract.
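The steps above can be read as a simple linear state machine. The following is a minimal sketch, not part of the DMS codebase: the `Stage` type, its constants, and the `Next` function are hypothetical names introduced only for illustration. It assumes the only branch is the one described in step 3, where the Job Request step is skipped if the compute provider already locked its resources when bidding.

```go
package main

import "fmt"

// Stage is a hypothetical label for one step of the proposed job lifecycle.
type Stage int

const (
	JobPosting Stage = iota
	SearchAndMatch
	JobRequest
	ContractClosure
	InvocationAndAllocation
	JobExecution
	ContractSettlement
	Done
)

var stageNames = [...]string{
	"Job Posting", "Search and Match", "Job Request", "Contract Closure",
	"Invocation and Allocation", "Job Execution", "Contract Settlement", "Done",
}

func (s Stage) String() string { return stageNames[s] }

// Next returns the stage that follows s. resourcesLocked models the rule that
// the Job Request step is skipped when the compute provider already locked
// its resources while submitting the bid.
func Next(s Stage, resourcesLocked bool) Stage {
	switch {
	case s == Done:
		return Done
	case s == SearchAndMatch && resourcesLocked:
		return ContractClosure
	default:
		return s + 1
	}
}

func main() {
	// Walk the lifecycle for a bid that arrived with resources already locked.
	for s := JobPosting; s != Done; s = Next(s, true) {
		fmt.Println(s)
	}
}
```

A real implementation would also need failure paths (declined job request, failed verification), which this sketch leaves out.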

See References section for research blogs with more details on this topic.

Structure and Organisation

Here is a quick overview of the contents of this directory:

  • README: Current file which is aimed towards developers who wish to use and modify the orchestrator functionality.

  • specs: Directory containing package specifications, including package class diagram.

Subpackages

  • graph: Defines and implements interfaces of Graph logic for network topology awareness (proposed).

Class Diagram

Source

orchestrator class diagram

Rendered from source file

!$rootUrlGitlab = "https://gitlab.com/nunet/device-management-service/-/raw/main"
!$packageRelativePath = "/dms/orchestrator"
!$packageUrlGitlab = $rootUrlGitlab + $packageRelativePath
 
!include $packageUrlGitlab/specs/class_diagram.puml

Functionality

TBD

Note: the functionality of DMS is currently being developed. See the proposed section for the suggested design of interfaces and methods.

Data Types

TBD

Note: the functionality of DMS is currently being developed. See the proposed section for the suggested data types.

Testing

TBD

Proposed Functionality / Requirements

List of issues

All issues that are related to the implementation of dms package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.

Interfaces & Methods

proposed Orchestrator interface

type Orchestrator_interface interface {

	publishBidRequest(dms.node.Node, dms.orchestrator.BidRequest, dms.node.NodeID, ...string)

	compareCapability(dms.Capability, dms.Capability) dms.orchestrator.CapabilityComparison

	acceptJob(dms.orchestrator.CapabilityComparison, dms.orchestrator.CapabilityComparator) bool

	sendBid(dms.node.Node, dms.orchestrator.BidRequest)

	selectBestBid(dms.node.Node, dms.orchestrator.BidRequest) dms.orchestrator.Bid

	sendJobRequest(dms.node.Node, dms.jobs.Pod, dms.node.NodeID)

	sendInvocation(dms.node.Node, dms.jobs.Invocation, dms.node.NodeID)

	orchestrateJob(dms.node.Node, dms.jobs.Job)

	// Workflows may involve more than one job, connected via different levels
	// of connections between nodes. For these, an orchestrator needs to be able
	// to calculate network configurations that involve estimation of connections
	// between candidate nodes in the whole workflow. The next two methods are
	// proposed for that purpose.

	// takes a BidRequest and all bids received as a result of it,
	// calculates all possible network configurations for it,
	// and outputs graph structures holding all relevant information
	// MOST PROBABLY WILL NOT IMPLEMENT IN dms 0.5.x
	constructConfigurations(dms.orchestrator.BidRequest, []dms.orchestrator.Bid) []NetworkConfiguration

	// takes a set of Network Configurations and a Comparator variable and selects the best configuration
	// based on the comparator supplied
	// MOST PROBABLY WILL NOT IMPLEMENT IN dms 0.5.x
	selectBestNetworkConfiguration([]NetworkConfiguration, ConfigurationComparator) NetworkConfigurationComparison
}

publishBidRequest: sends a request for bid to the network for a particular job. This will depend on the network package for propagation of the request to other nodes in the network.

compareCapability: compares two capabilities and returns a CapabilityComparison object. Expected usage is to compare capability required in a job with the available capability of a node.

acceptJob: looks at the comparison between capabilities and preferences of a node in the form of CapabilityComparator object and decides whether to accept a job or not.
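As an illustration of how compareCapability and acceptJob could fit together, here is a minimal sketch. Since CapabilityComparison and CapabilityComparator are still TBD in this specification, every type and field below (`Capability`, `CPUCores`, `RAMMiB`, `MinFreeCPUCores`, `MinFreeRAMMiB`) is a hypothetical simplification, and the "surplus per dimension" rule is an assumption rather than the agreed design.

```go
package main

import "fmt"

// Capability is a hypothetical, simplified stand-in for dms.Capability.
type Capability struct {
	CPUCores int
	RAMMiB   int
}

// CapabilityComparison is a hypothetical result type: the surplus
// (available minus required) per dimension. A negative value is a shortfall.
type CapabilityComparison struct {
	CPUCores int
	RAMMiB   int
}

// CapabilityComparator is a hypothetical preference set: the minimum surplus
// the node wants to keep free after accepting a job.
type CapabilityComparator struct {
	MinFreeCPUCores int
	MinFreeRAMMiB   int
}

// compareCapability compares what a job requires with what a node has available.
func compareCapability(required, available Capability) CapabilityComparison {
	return CapabilityComparison{
		CPUCores: available.CPUCores - required.CPUCores,
		RAMMiB:   available.RAMMiB - required.RAMMiB,
	}
}

// acceptJob accepts only if every dimension leaves at least the preferred surplus.
func acceptJob(c CapabilityComparison, p CapabilityComparator) bool {
	return c.CPUCores >= p.MinFreeCPUCores && c.RAMMiB >= p.MinFreeRAMMiB
}

func main() {
	required := Capability{CPUCores: 2, RAMMiB: 4096}
	available := Capability{CPUCores: 8, RAMMiB: 16384}
	prefs := CapabilityComparator{MinFreeCPUCores: 1, MinFreeRAMMiB: 1024}

	cmp := compareCapability(required, available)
	fmt.Println("surplus:", cmp, "accept:", acceptJob(cmp, prefs))
}
```

Keeping the comparison result separate from the accept decision mirrors the split in the proposed interface: the comparison is objective, while the decision depends on node preferences.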

sendBid: sends a bid to the node that propagated the BidRequest.

selectBestBid: looks at all the bids received and selects the best one.
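The specification does not yet define what makes a bid "best". The sketch below assumes one possible rule, purely for illustration: pick the cheapest bid that is marked valid and has not timed out. The trimmed `Bid` type, its `Amount` field, and the two-value return are hypothetical and differ from the proposed interface signature.

```go
package main

import "fmt"

// Bid is a trimmed, hypothetical version of dms.orchestrator.Bid.
type Bid struct {
	Bidder     string
	Amount     float64 // stands in for PriceBid.amount
	Timeout    int64   // Unix seconds until which the provider waits
	ValidOffer bool
}

// selectBestBid returns the cheapest valid bid that has not timed out at now.
// The boolean is false when no bid qualifies.
func selectBestBid(bids []Bid, now int64) (Bid, bool) {
	var best Bid
	found := false
	for _, b := range bids {
		if !b.ValidOffer || b.Timeout <= now {
			continue // skip invalid or expired bids
		}
		if !found || b.Amount < best.Amount {
			best, found = b, true
		}
	}
	return best, found
}

func main() {
	bids := []Bid{
		{Bidder: "node-a", Amount: 5.0, Timeout: 200, ValidOffer: true},
		{Bidder: "node-b", Amount: 3.5, Timeout: 200, ValidOffer: true},
		{Bidder: "node-c", Amount: 1.0, Timeout: 50, ValidOffer: true}, // already expired at now=100
	}
	best, ok := selectBestBid(bids, 100)
	fmt.Println(best.Bidder, ok)
}
```

Other selection criteria (time bids, reputation, network topology) could replace the price comparison without changing the overall shape.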

sendJobRequest: sends a job request to the shortlisted node whose bid was selected. The compute provider node needs to accept the job request and lock its resources for the job. In case resources are already locked while submitting the bid, this step may be skipped.
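The resource-locking step on the compute provider side could look roughly like the sketch below. It is an assumption-laden illustration: `ResourceLocker` and its methods are hypothetical names, and resources are reduced to a single count of CPU cores. Locking is idempotent per job, which matches the note that the step may be skipped when resources were already locked at bid time.

```go
package main

import (
	"fmt"
	"sync"
)

// ResourceLocker is a hypothetical helper that tracks CPU cores locked per job.
type ResourceLocker struct {
	mu     sync.Mutex
	free   int
	locked map[string]int // job ID -> cores locked for that job
}

func NewResourceLocker(totalCores int) *ResourceLocker {
	return &ResourceLocker{free: totalCores, locked: make(map[string]int)}
}

// Lock reserves cores for jobID. It reports true if the job holds a lock
// afterwards, including the case where it was already locked at bid time.
func (r *ResourceLocker) Lock(jobID string, cores int) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.locked[jobID]; ok {
		return true // already locked, nothing to do
	}
	if cores > r.free {
		return false // not enough free resources: decline the job request
	}
	r.free -= cores
	r.locked[jobID] = cores
	return true
}

// Release frees the cores held by jobID, for example when a bid times out.
func (r *ResourceLocker) Release(jobID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.free += r.locked[jobID]
	delete(r.locked, jobID)
}

// Free returns the number of cores not locked by any job.
func (r *ResourceLocker) Free() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.free
}

func main() {
	locker := NewResourceLocker(4)
	fmt.Println(locker.Lock("job-1", 3), locker.Free())
	fmt.Println(locker.Lock("job-2", 2), locker.Free())
	locker.Release("job-1")
	fmt.Println(locker.Free())
}
```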

sendInvocation: sends an invocation request (as a message) to the node that accepted the job. This message should have all the necessary information to start an Allocation for the job.

orchestrateJob: this will be called when a job is received via postJob endpoint. It will start the orchestration process. It is also possible that this method could be called via a timer for jobs scheduled in the future.

proposed Actor interface

type Actor_interface interface {
	// commented out as it is not shown in the class diagram
	// getMailbox() dms.orchestrator.Mailbox

	sendMessage(message telemetry.Message, target ...any)

	processMessage() telemetry.Message

	// as per the actor model, each actor can create another actor, so it makes sense to identify this method here.
	// However, this method does not need to be implemented separately.
	// commented out as it is not shown in the class diagram
	// createActor()
}

sendMessage: sends a message to another actor (Node / Allocation).

processMessage: processes the message received and decides on what action to take.
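To make the actor idea concrete, here is a minimal in-process sketch in which the mailbox is a buffered channel. This is an assumption for illustration only: in the proposed design, messages travel over the network package, and `Message`, `NewActor`, the message kinds, and the return values used below are hypothetical.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for telemetry.Message.
type Message struct {
	Kind    string
	Payload string
}

// Actor is a simplified actor whose mailbox is a buffered channel.
type Actor struct {
	ID      string
	mailbox chan Message
}

func NewActor(id string, mailboxSize int) *Actor {
	return &Actor{ID: id, mailbox: make(chan Message, mailboxSize)}
}

// sendMessage puts msg into each target's mailbox without blocking and
// returns how many targets received it; a full mailbox drops the message.
func (a *Actor) sendMessage(msg Message, targets ...*Actor) int {
	delivered := 0
	for _, t := range targets {
		select {
		case t.mailbox <- msg:
			delivered++
		default: // mailbox full: drop
		}
	}
	return delivered
}

// processMessage takes the next message, if any, and decides on a reply.
// The boolean is false when the mailbox is empty.
func (a *Actor) processMessage() (Message, bool) {
	select {
	case msg := <-a.mailbox:
		switch msg.Kind {
		case "BidRequest":
			return Message{Kind: "Bid", Payload: a.ID}, true
		default:
			return Message{Kind: "Ignored", Payload: msg.Kind}, true
		}
	default:
		return Message{}, false
	}
}

func main() {
	requestor := NewActor("service-provider", 4)
	provider := NewActor("compute-provider", 4)

	requestor.sendMessage(Message{Kind: "BidRequest", Payload: "job-1"}, provider)
	reply, ok := provider.processMessage()
	fmt.Println(reply.Kind, reply.Payload, ok)
}
```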

proposed Mailbox interface

type Mailbox_interface interface {
	// for this to work, messages coming from the network need to be in the telemetry.Message format,
	// which could be achieved by implementing 'processMessage()' methods in network.P2P and network.Gossipsub;
	// alternatively, we can implement a `receiveMessage(payload any)` method here and handle both in the mailbox interface.
	// The second option is the one proposed here.
	
	receiveMessage(payload any) telemetry.Message
	
    handleMessage(telemetry.Message)

	triggerBehavior()

    getKnownTopics()

    getSubscribedTopics()

    subscribeToTopic()

    unsubscribeFromTopic()
}

receiveMessage: receives a message from another Node and converts it into a telemetry.Message object.

handleMessage: processes the message received.

triggerBehavior: this is where actions taken by the actor based on the message received will be defined.

getKnownTopics: retrieves the gossipsub topics known to the node.

getSubscribedTopics: retrieves the gossipsub topics the node is subscribed to.

subscribeToTopic: subscribes to a gossipsub topic.

unsubscribeFromTopic: unsubscribes from a gossipsub topic.
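The four topic methods imply that a mailbox keeps some record of known and subscribed topics. A minimal in-memory sketch of that bookkeeping follows. It does not touch the network package or any gossipsub library; `TopicRegistry`, the string parameters, and the topic names are hypothetical, and the proposed interface does not yet specify parameters or return types for these methods.

```go
package main

import (
	"fmt"
	"sort"
)

// TopicRegistry is a hypothetical in-memory record of gossipsub topics.
// The map value is true when the node is subscribed to the topic.
type TopicRegistry struct {
	topics map[string]bool
}

func NewTopicRegistry(known ...string) *TopicRegistry {
	r := &TopicRegistry{topics: make(map[string]bool)}
	for _, t := range known {
		r.topics[t] = false
	}
	return r
}

// getKnownTopics returns all known topics in sorted order.
func (r *TopicRegistry) getKnownTopics() []string {
	out := make([]string, 0, len(r.topics))
	for t := range r.topics {
		out = append(out, t)
	}
	sort.Strings(out)
	return out
}

// getSubscribedTopics returns the subscribed topics in sorted order.
func (r *TopicRegistry) getSubscribedTopics() []string {
	out := []string{}
	for t, subscribed := range r.topics {
		if subscribed {
			out = append(out, t)
		}
	}
	sort.Strings(out)
	return out
}

// subscribeToTopic marks a topic as subscribed, adding it if it was unknown.
func (r *TopicRegistry) subscribeToTopic(topic string) { r.topics[topic] = true }

// unsubscribeFromTopic clears the subscription but keeps the topic known.
func (r *TopicRegistry) unsubscribeFromTopic(topic string) {
	if _, ok := r.topics[topic]; ok {
		r.topics[topic] = false
	}
}

func main() {
	r := NewTopicRegistry("bid-requests", "job-updates")
	r.subscribeToTopic("bid-requests")
	fmt.Println(r.getKnownTopics(), r.getSubscribedTopics())
	r.unsubscribeFromTopic("bid-requests")
	fmt.Println(r.getSubscribedTopics())
}
```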

proposed Other methods

  • Methods for job request functionality

    a. check whether resources are locked

    b. lock resources

    c. accept job request

  • Methods for contract closure

    a. validate other node as a registered entity

    b. generate contract

    c. KYC validation

  • Methods for job execution

    a. handle job updates

  • Methods for contract settlement

    a. job verification

Note that the above methods are not an exhaustive list. They are to be considered suggestions. The developer implementing the orchestrator functionality is free to make modifications as necessary.

Data types

  • proposed dms.orchestrator.Actor: Actor has an identifier and a mailbox to send/receive messages.

type Actor struct {
	id types.ID
	mailbox dms.orchestrator.Mailbox
}
  • proposed dms.orchestrator.Bid: Consists of information sent by the compute provider node to the requestor node as a bid for the job broadcast to the network.

// Bid represents a bid made by the compute provider DMS
type Bid struct {
	// BidRequest is the request for bid against which this bid is made
	BidRequest dms.orchestrator.BidRequest

	// JobID is the ID of the job
	JobID int

	// Bidder is the identifier of the node sending the bid
	Bidder dms.node.NodeID

	// PriceBid is the price information of the bid
	PriceBid dms.orchestrator.PriceBid

	// TimeBid is the time information of the bid
	TimeBid dms.orchestrator.TimeBid

	// Timeout is the timestamp until which the compute provider will wait for the job request
	Timeout int64

	// ValidOffer is a flag to indicate whether the bid offer is valid
	ValidOffer bool

	// ResourcesLockedUntilTimeout indicates whether the resources are locked until the timeout
	ResourcesLockedUntilTimeout bool
}
  • proposed dms.orchestrator.BidRequest: A bid request is a message sent by a node to the network to request bids.

type BidRequest struct {

	// ID is the unique identifier for this bid request
	ID types.ID

	// requests for bids are made with Pods, as a pod combines the capacities required from a single machine
	Pod dms.jobs.Pod

	// a requestor could be a DMS or an allocation; both types are accepted
	Requestor types.ID

	ResourceRequirements dms.resource.ResourceRequirements

	// Comparator specifies the type of constraints and preferences to be used while making a bid
	Comparator dms.orchestrator.CapabilityComparator
}
  • proposed dms.orchestrator.PriceBid: Contains price related information of the bid.

// PriceBid represents the pricing parameters of the bid
type PriceBid struct {
	priceBidType string // perResult or perTimeUnit
	currency     string
	perTimeUnit  string  // if the bid is per time unit, what is unit of time?
	perResult    bool    // if the bid is for the result of computation this is true, default is false
	amount       float64 // the amount in selected currency
}
  • proposed dms.orchestrator.TimeBid: Contains time related information of the bid.

// TimeBid represents time parameters of the bid
type TimeBid struct {
	timeBidType string // duration or fixed time (or other?)
	units       string // units of time
	duration    int
	timeStart   int // or timestamp or whatever
	timeFinish  int // or timestamp or unix time
}
  • proposed dms.orchestrator.CapabilityComparator: Preferences of the node which has an influence on the comparison operation.

TBD

  • proposed dms.orchestrator.CapabilityComparison: Result of the comparison operation.

TBD

  • proposed dms.orchestrator.Invocation: An invocation is a message sent by the orchestrator to the node that accepted the job. It contains the job details and the contract.

type Invocation struct {
	// Pod is a cluster of jobs which are to be run on the same machine
	pod dms.jobs.Pod

	// Contract contains jobID, IDs of both DMS and proof of contract
	contract      tokenomics.Contract
	source        types.ID // since an invocation can in principle be done by any Actor (node or an allocation)
}
  • proposed dms.orchestrator.Mailbox: A mailbox is a communication channel between two actors. It uses network package functionality to send and receive messages.

type Mailbox struct {
	// Access to NuNet p2p network implemented via the network package
	// Whether we need these or not depends on how we implement the send/receive Message methods here;
	// it would be better not to implement them here
	// for proper architectural separation;
	p2p network.P2P	
	gossipsub network.Gossipsub // Access to NuNet gossip network -- same comment as above
}
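As one illustration of how the PriceBid and TimeBid types above could be combined, the following sketch estimates the total cost of a bid. The semantics are assumptions, since the specification does not define them: a "perResult" bid is taken to cost its amount once, and a "perTimeUnit" bid is taken to cost the amount multiplied by the duration, provided both bids use the same unit. `estimateCost` is a hypothetical helper, and the structs are trimmed copies of the proposed types.

```go
package main

import (
	"errors"
	"fmt"
)

// PriceBid is a trimmed copy of the proposed dms.orchestrator.PriceBid.
type PriceBid struct {
	priceBidType string // "perResult" or "perTimeUnit"
	currency     string
	perTimeUnit  string  // unit of time for per-time-unit bids
	amount       float64 // amount in the selected currency
}

// TimeBid is a trimmed copy of the proposed dms.orchestrator.TimeBid.
type TimeBid struct {
	units    string // units of time
	duration int
}

// estimateCost is a hypothetical helper returning the total price of a bid.
func estimateCost(p PriceBid, t TimeBid) (float64, error) {
	switch p.priceBidType {
	case "perResult":
		return p.amount, nil
	case "perTimeUnit":
		if p.perTimeUnit != t.units {
			return 0, errors.New("price and time bids use different time units")
		}
		return p.amount * float64(t.duration), nil
	default:
		return 0, fmt.Errorf("unknown priceBidType %q", p.priceBidType)
	}
}

func main() {
	price := PriceBid{priceBidType: "perTimeUnit", currency: "NTX", perTimeUnit: "hour", amount: 0.5}
	duration := TimeBid{units: "hour", duration: 4}

	cost, err := estimateCost(price, duration)
	fmt.Println(cost, price.currency, err)
}
```

A fuller version would convert between time units instead of rejecting mismatches.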

proposed Other data types

Data types related to allocation, contract settlement, job updates, etc. are currently omitted. These should be added as applicable during implementation.

References

Orchestration steps research blogs

The orchestrator functionality of DMS is being developed based on the research done in the following blogs:
