Last updated: 2024-11-21 22:05:51.116314 File source: link on GitLab
The orchestrator is responsible for job scheduling and management (manages jobs on other DMSs).
A key distinction to note is the option of two types of orchestration mechanisms: push
and pull
. Broadly speaking pull
orchestration works on the premise that resource providers bid for jobs available in the network, while push
orchestration works when a job is push
ed directly to a known resource provider -- constituting to a more centralized orchestration. push
orchestration develops on the idea that users choose from the available providers and their resources. However, given the decentralized and open nature of the platform, it may be required to engage the providers to get their current (latest) state and preferences. This leads to an overlap with the pull
orchestration approach.
The default setting is to use pull
based orchestration, which is developed in the present proposed specification.
proposed
Job Orchestration
The proposed lifecyle of a job on Nunet platform consists of various operations from job posting to settlement of the contract. Below is a brief explanation of the steps involved in the job orchestration:
Job Posting: The user posts a job request to the DMS. The job request is validated and a Nunet job is created in the DMS.
Search and Match:
a. The Service provider DMS requests for bids from other nodes in the network.
b. DMS on compute provider compares the capability of the available resources against job requirements. If all the requirements are met, it then decides whether to submit a bid.
c. The received bids are assessed and the best bid is selected.
Job Request: In case the shortlisted compute provider has not locked the resources while submitting the bid, the job request workflow is executed. This requires the compute provider DMS to lock the necessary resources required for the job and re-submit the bid. Note that at this stage compute provider can still decline the job request.
Contract Closure: The service provider and the shortlisted compute provider verify that the counterparty is a verified entity and approved by Nunet Solutions to participate in the network. This in an important step to establish trust before any work is performed.
If job does not require any payment (Volunteer Compute), contract is generated by both Service Provider and Compute Provider DMS. This is then verified by Contract-Database
. Otherwise, proof of contract needs to be received from the Contract-Database
before start of work.
Invocation and Allocation: When the contract closure workflow is completed, both the service provider and compute provider DMS have an agreement and proof of contract with them. Then the service provider DMS will send an invocation to the compute provider DMS which results in job allocation being created. Allocation can be understood as an execution space / environment on actual hardware that enables a Job to be executed.
Job Execution: Once allocation is created, the job execution starts on the compute provider machine.
Contract Settlement: After job is completed, service provider DMS verifies the work done. If the work is correct, the Contract-Database
makes the necessary transactions to settle the the contract.
See References section for research blogs with more details on this topic.
Here is quick overview of the contents of this directory:
README: Current file which is aimed towards developers who wish to use and modify the orchestrator
functionality.
specs: Directory containing package specifications, including package class diagram.
Subpackages
graph: Defines and implements interfaces of Graph logic for network topology awareness (proposed).
Source
Rendered from source file
TBD
Note: the functionality of DMS is being currently developed. See the proposed section for the suggested design of interfaces and methods.
TBD
Note: the functionality of DMS is being currently developed. See the proposed section for the suggested data types.
TBD
List of issues
All issues that are related to the implementation of dms
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Interfaces & Methods
proposed
Orchestrator interface
publishBidRequest
: sends a request for bid to the network for a particular job. This will depend on the network
package for propagation of the request to other nodes in the network.
compareCapability
: compares two capabilities and returns a CapabilityComparison
object. Expected usage is to compare capability required in a job with the available capability of a node.
acceptJob
: looks at the comparison between capabilities and preferences of a node in the form of CapabilityComparator
object and decides whether to accept a job or not.
sendBid
: sends a bid to the node that propagated the BidRequest
.
selectBestBid
: looks at all the bids received and selects the best one.
sendJobRequest
: sends a job request to the shortlisted node whose bid was selected. The compute provider node needs to accept the job request and lock its resources for the job. In case resources are already locked while submitting the bid, this step may be skipped.
sendInvocation
: sends an invocation request (as a message) to the node that accepted the job. This message should have all the necessary information to start an Allocation
for the job.
orchestrateJob
: this will be called when a job is received via postJob endpoint. It will start the orchestration process. It is also possible that this method could be called via a timer for jobs scheduled in the future.
proposed
Actor interface
sendMessage
: sends a message to another actor (Node / Allocation).
processMessage
: processes the message received and decides on what action to take.
proposed
Mailbox interface
receiveMessage
: receives a message from another Node and converts it into a telemetry.Message
object.
handleMessage
: processes the message received.
triggerBehavior
: this is where actions taken by the actor based on the message received will be defined.
getKnownTopics
: retrieves the gossip sub topics known to the node.
getSubscribedTopics
: retrieves the gossip sub topics subscribed by the node.
subscribeToTopic
: subscribes to a gossip sub topic.
unsubscribeFromTopic
: un-subscribes from a gossip sub topic.
proposed
Other methods
Methods for job request functionality a. check whether resources are locked b. lock resources c. accept job request
Methods for contract closure a. validate other node as a registered entity b. generate contract c. kyc validation
Methods for job exeuction a. handle job updates
Methods for contract settlement a. job verification
Note that the above methods not an exhaustive list. These are to be considered as suggestions. The developer implementing the orchestrator functionality is free to make modifications as necessary.
Data types
proposed
dms.orchestrator.Actor
: Actor has a identifier and a mailbox to send/receive messages.
proposed
dms.orchestrator.Bid
: Consists of information sent by the compute provider node to the requestor node as a bid for the job broadcasted to the network.
proposed
dms.orchestrator.BidRequest
: A bid request is a message sent by a node to the network to request for bids.
proposed
dms.orchestrator.PriceBid
: Contains price related information of the bid.
proposed
dms.orchestrator.TimeBid
: Contains time related information of the bid.
proposed
dms.orchestrator.CapabilityComparator
: Preferences of the node which has an influence on the comparison operation.
TBD
proposed
dms.orchestrator.CapabilityComparison
: Result of the comparison operation.
TBD
proposed
dms.orchestrator.Invocation
: An invocation is a message sent by the orchestrator to the node that accepted the job. It contains the job details and the contract.
proposed
dms.orchestrator.Mailbox: A mailbox is a communication channel between two actors. It uses network
package functionality to send and receive messages.
proposed
Other data types
Data types related to allocation, contract settlement, job updates etc are currently omitted. These should be added as applicable while implementation.
Orchestration steps research blogs
The orchestrator functionality of DMS is being developed based on the research done in the following blogs:
Last updated: 2024-11-21 22:05:51.381333 File source: link on GitLab
This whole package is proposed
status and therefore documentation is missing, save for the proposed functionality part.
TBD
TBD
Source
Rendered from source file
TBD
TBD
TBD
List of issues
All issues that are filed in GitLab related to the implementation of dms/orchestrator
package can be found below. These include any proposals for modifications to the package or new functionality needed to cover the requirements of other packages.
Proposed functionalities
TBD
Data types
proposed
LocalNetworkTopology
more complex deployments may need a data structure, which considers local network topology of a node / dms -- i.e. for reasoning about speed of connection (as well as capabilities) between neighbors.
Related research blogs
TBD