Jobs

On Clusterone, a job describes the execution of a project. When creating a job, you can define several parameters, such as the data to be trained on, the machine learning framework and its version, and much more.

post
Create job

https://clusterone.com/api
/jobs
Create a new job.
Request
Response
Body Parameters
repository
required
string
Repository with the code that should be executed
display_name
optional
string
Name of the job
description
optional
string
Job description
launched_at
optional
string
terminated_at
optional
string
datasets_set
optional
string
Datasets that should be used with the job
git_commit_hash
optional
string
Commit hash of the code repository to be executed
git_branch
optional
string
logs
optional
string
tags
optional
json
user_panel
optional
string
status
optional
string
resources
optional
string
parameters
required
json
JSON dictionary of the parameters below
parameters:module
optional
string
Module name
parameters:workers
required
json
Dictionary of worker parameters
parameters:workers:replicas
optional
int32
Number of worker instances; use 1 for single mode
parameters:workers:slug
optional
string
Instance type for worker instances. See here for available options.
parameters:parameter_servers
optional
json
Dictionary of parameter server parameters
parameters:parameter_servers:replicas
optional
int32
Number of parameter servers, only for distributed mode
parameters:parameter_servers:slug
optional
string
Instance type for parameter server(s). See here for available options. Distributed mode only.
parameters:setup_command
optional
string
Command that prepares the environment, e.g. by installing requirements
parameters:docker_image
required
json
Docker image of the machine learning framework used by the worker pods. See here for available options.
parameters:docker_image:slug
required
string
Image slug/ID
parameters:docker_image:name
required
string
Image name. Used as the container name.
parameters:docker_image:version
required
string
parameters:docker_image:description
required
string
parameters:docker_image:docker_image_path
required
string
Path of image in any format acceptable by Docker
parameters:docker_image:imagepull_secrets
required
string
parameters:parameter_servers_docker_image
optional
json
Docker image of the machine learning framework used by the parameter server. See here for available options.
parameters:parameter_servers_docker_image:slug
required
string
Image slug/ID
parameters:parameter_servers_docker_image:name
required
string
Image name. Used as the container name.
parameters:parameter_servers_docker_image:version
required
string
parameters:parameter_servers_docker_image:description
required
string
parameters:parameter_servers_docker_image:docker_image_path
required
string
Path of image in any format acceptable by Docker
parameters:parameter_servers_docker_image:imagepull_secrets
required
string
parameters:mode
optional
string
Can be single or distributed
parameters:time_limit
optional
string
Maximum runtime of the job. Accepts hours and minutes as input, formatted as <hours>h<minutes>m. Hours are required, minutes are optional. Examples of valid inputs: 2h, 200h, 500m, 2h23m. Defaults to 48h.
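The time_limit format can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Clusterone API: the function name time_limit_to_minutes is hypothetical. The description says hours are required but also lists 500m as valid, so the sketch assumes minutes-only input is accepted. Note that 48h converts to 2880, the time_limit value shown in the example responses, which suggests (but does not confirm) that the server stores the limit in minutes.

```python
import re

# Pattern for "<hours>h<minutes>m". Both parts are optional here so that
# minutes-only inputs such as "500m" (listed as valid above) are accepted.
# This is an assumption; the server-side rules may differ.
_TIME_LIMIT = re.compile(r"^(?:(\d+)h)?(?:(\d+)m)?$")


def time_limit_to_minutes(value: str) -> int:
    """Convert a time limit string such as '2h23m' to a number of minutes."""
    match = _TIME_LIMIT.match(value.strip())
    # Reject strings that do not match, and the empty string (no hours, no minutes).
    if not match or not any(match.groups()):
        raise ValueError(f"invalid time limit: {value!r}")
    hours, minutes = (int(part) if part else 0 for part in match.groups())
    return hours * 60 + minutes


if __name__ == "__main__":
    for example in ("2h", "200h", "500m", "2h23m", "48h"):
        print(example, "->", time_limit_to_minutes(example))
```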
200: OK
Job created successfully
{
"job_id": "d9fc12d3-.....-5da787b18251",
"repository":"885eb608-....-ace0141c2e32",
"repository_owner":"some owner",
"repository_name":"some name",
"repository_owner_photo_url":"",
"display_name":"twilight-dream-711",
"description":"",
"created_at":"2018-01-05T13:04:35.448816Z",
"launched_at":null,
"terminated_at":null,
"modified_at":"2018-01-05T13:04:35.456181Z",
"current_time_left_percentage":0,
"datasets_set":[
],
"members":[
{
"id":number,
"access":number,
"access_level":"owner",
"status":"AC",
"user":number,
"username":"username",
"photo_url":"",
"email":"email@domain.com",
"first_name":"",
"last_name":""
}
],
"total_running_time":0,
"unpaid_running_time":0,
"git_commit_hash":"f5235c28723........ca7c75",
"git_commit":null,
"created_by":number,
"current_run":null,
"runs":[
],
"tags":[
],
"user_panel":{
"user":269,
"pinned":false,
"deleted":false,
"archived":false,
"added_to_board":false,
"panel_data":null,
"created_by":269
},
"status":"created",
"parameters":{
"command": "python -m main",
"docker_image": {
"docker_image_path": "quay.io/clusteronecom/tensorflow:1.3.0-cpu-py36",
"imagepull_secrets": "clusteronecom-deploy-ml-pull-secret",
"name": "tensorflow",
"slug": "tensorflow-1.3.0-cpu-py36",
"version": "1.3.0"
},
"mode": "distributed",
"parameter_servers": {
"blessed": true,
"cpu": 16,
"cudann": null,
"gpu": 0,
"memory": 30,
"queue": "job-master-queue",
"replicas": 1,
"slug": "c5.4xlarge",
"type": "c5.4xlarge",
"type_class": "c"
},
"parameter_servers_docker_image": {
"docker_image_path": "quay.io/clusteronecom/tensorflow:1.8.0-cpu-py36",
"imagepull_secrets": "clusteronecom-deploy-ml-pull-secret",
"name": "tensorflow",
"slug": "tensorflow-1.8.0-cpu-py36",
"version": "1.8.0"
},
"setup_command": "pip install -r requirements.txt",
"time_limit": 2880,
"workers": {
"blessed": true,
"cpu": 8,
"cudann": null,
"gpu": 0,
"memory": 30,
"queue": "job-master-queue",
"replicas": 2,
"slug": "c5.2xlarge",
"type": "c5.2xlarge",
"type_class": "c"
}
},
"resources":null
}
400: Bad Request
Bad request
{
"error":"status"
}
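The 400 response does not name the field that was rejected, so it may help to check the payload against the fields marked as required before sending it. The sketch below is one possible client-side check; the helper name missing_required_fields and the constant names are hypothetical, and the field lists are copied from the Body Parameters of the Create job endpoint. It does not claim to reproduce the server's validation.

```python
# Required fields as marked in the Body Parameters list of "Create job".
REQUIRED_TOP_LEVEL = ("repository", "parameters")
REQUIRED_PARAMETERS = ("workers", "docker_image")
REQUIRED_IMAGE_FIELDS = (
    "slug",
    "name",
    "version",
    "description",
    "docker_image_path",
    "imagepull_secrets",
)


def missing_required_fields(payload: dict) -> list:
    """Return the colon-separated names of required fields absent from payload."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in payload]
    parameters = payload.get("parameters")
    if not isinstance(parameters, dict):
        # Nothing nested to inspect if "parameters" is absent or not a dictionary.
        return missing
    missing += [f"parameters:{key}" for key in REQUIRED_PARAMETERS if key not in parameters]
    image = parameters.get("docker_image")
    if isinstance(image, dict):
        missing += [
            f"parameters:docker_image:{key}"
            for key in REQUIRED_IMAGE_FIELDS
            if key not in image
        ]
    return missing


if __name__ == "__main__":
    print(missing_required_fields({"repository": "<repository-id>"}))
```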

Create a job and launch it on a dedicated instance, using the configuration passed in the payload. When a job is created, a new Python virtual environment is set up for it. The following packages are installed by default:

  • If TensorFlow has been chosen as the framework: TensorFlow in the specified version

  • If PyTorch has been chosen as the framework: the latest version of PyTorch

  • The latest versions of numpy, scipy, pandas, matplotlib, theano, scikit-learn, and keras

Additional packages can be specified in a requirements file.
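As an illustration of the Create job endpoint, the following sketch assembles a payload and a POST request using only the Python standard library. The request is built but not sent. Several things here are assumptions: the Authorization header and its Bearer scheme (authentication is not described in this section), the placeholder values in angle brackets, and the helper name build_create_job_request. The image and instance values are copied from the example response above.

```python
import json
import urllib.request

API_BASE = "https://clusterone.com/api"


def build_create_job_request(payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /jobs endpoint."""
    return urllib.request.Request(
        url=f"{API_BASE}/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the authentication scheme is not documented here.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


payload = {
    "repository": "<repository-id>",  # placeholder, not a real ID
    "display_name": "example-job",
    "parameters": {
        "mode": "single",
        "workers": {"slug": "c5.2xlarge", "replicas": 1},
        "setup_command": "pip install -r requirements.txt",
        "time_limit": "2h",
        "docker_image": {
            "slug": "tensorflow-1.3.0-cpu-py36",
            "name": "tensorflow",
            "version": "1.3.0",
            "description": "",
            "docker_image_path": "quay.io/clusteronecom/tensorflow:1.3.0-cpu-py36",
            "imagepull_secrets": "clusteronecom-deploy-ml-pull-secret",
        },
    },
}

request = build_create_job_request(payload, token="<api-token>")
print(request.get_method(), request.full_url)
# To actually send the request: urllib.request.urlopen(request)
```

Packages beyond the defaults listed above would go into the requirements file that setup_command installs.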

delete
Delete job

https://clusterone.com/api
/jobs/:job_id
Delete a job with the specified ID.
Request
Response
Path Parameters
job_id
required
string
ID of the job to delete
200: OK
Job successfully deleted
{
"job_id": "37466b76-........-a85359332960",
"repository": "5d2a1cd2-........-2d15f8ca3b41",
"repository_owner": "username",
"repository_name": "project_name",
"repository_owner_photo_url": "",
"display_name": "fancy-name",
"description": "",
"created_at": "2018-11-14T12:38:23.668468Z",
"launched_at": null,
"terminated_at": null,
"modified_at": "2018-11-14T12:38:23.715617Z",
"current_time_left_percentage": 0,
"datasets_set": [
{
"job":"79ae5def-.........-387494485fd2",
"dataset":"c14be02a-.........-07ff3f7c0278",
"git_commit_hash":"64f9fa3c.........c859f0dac8d",
"mount_point":"username/project_name",
"created_at":"2017-11-21T14:59:04.421861Z"
}
],
"members": [
{
"id": 27,
"access": 50,
"access_level": "owner",
"status": "AC",
"user": 1,
"username": "username",
"photo_url": "",
"email": "username@example.com",
"first_name": "",
"last_name": ""
}
],
"total_running_time": 0,
"unpaid_running_time": 0,
"git_commit_hash": "960213cb.........73fef0af4",
"git_commit": null,
"git_branch": "master",
"created_by": 1,
"current_run": null,
"runs": [],
"tags": [],
"user_panel": {
"user": 1,
"pinned": false,
"deleted": false,
"archived": false,
"added_to_board": false,
"panel_data": null,
"created_by": null
},
"status": "created",
"parameters": {
"docker_image": {
"slug": "name:1",
"name": "name-1",
"version": "1",
"docker_image_path": "",
"imagepull_secrets": "tensorport-kubernetes-pull-secret"
},
"mode": "single",
"setup_command": "pip install -r requirements.txt",
"time_limit": 2880,
"command": "python -m main",
"workers": {
"slug": "c5.2xlarge",
"replicas": 1,
"queue": "job-master-queue",
"type": "c5.2xlarge",
"blessed": true,
"type_class": "c",
"gpu": 0,
"cpu": 8,
"memory": 30,
"cudann": null
}
},
"resources": null,
"owner": "admin"
}

get
Read job

https://clusterone.com/api
/jobs/:job_id/
Get the details of a job with the specified job ID.
Request
Response
Path Parameters
job_id
required
string
ID of the job
200: OK

get
Get file

https://clusterone.com/api
/jobs/:job_id/files/:filename
Get a specific output file from a job.
Request
Response
Path Parameters
job_id
required
string
ID of the job the file belongs to
filename
required
string
Name of the file to get
200: OK

get
Get files

https://clusterone.com/api
/jobs/:job_id/files
Get all output files from a job.
Request
Response
Path Parameters
job_id
required
string
ID of the job the files belong to
404: Not Found
Files not found
{
"detail": "No Files found"
}
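The two file endpoints differ only in the trailing file name. The sketch below builds both URLs and recognizes the documented 404 body. The helper names are hypothetical, and percent-encoding the file name is an assumption about how the server expects names with spaces or slashes to be passed.

```python
import json
from urllib.parse import quote

API_BASE = "https://clusterone.com/api"


def job_files_url(job_id, filename=None) -> str:
    """Return the URL of a job's file listing, or of one file if filename is given."""
    url = f"{API_BASE}/jobs/{quote(str(job_id), safe='')}/files"
    if filename is not None:
        # Percent-encode the name so spaces or slashes do not alter the path.
        url += "/" + quote(filename, safe="")
    return url


def is_no_files_response(status_code: int, body: str) -> bool:
    """Return True for the documented 404 response with detail 'No Files found'."""
    if status_code != 404:
        return False
    try:
        return json.loads(body).get("detail") == "No Files found"
    except (ValueError, AttributeError):
        # The body was not JSON, or not a JSON object.
        return False


if __name__ == "__main__":
    print(job_files_url("<job-id>"))
    print(job_files_url("<job-id>", "model output.txt"))
```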

get
List jobs

https://clusterone.com/api
/jobs
Return a list of all jobs owned by the current user, both running and completed.
Request
Response
Query Parameters
page
optional
string
limit
optional
string
search
optional
string
job_id
optional
string
status
optional
string
repository
optional
string
datasets
optional
string
display_name
optional
string
created_at
optional
string
created_by
optional
string
200: OK
[
{
"job_id": "37466b76-........-a85359332960",
"repository": "5d2a1cd2-........-2d15f8ca3b41",
"repository_owner": "username",
"repository_name": "project_name",
"repository_owner_photo_url": "",
"display_name": "fancy-name",
"description": "",
"created_at": "2018-11-14T12:38:23.668468Z",
"launched_at": null,
"terminated_at": null,
"modified_at": "2018-11-14T12:38:23.715617Z",
"current_time_left_percentage": 0,
"datasets_set": [
{
"job":"79ae5def-.........-387494485fd2",
"dataset":"c14be02a-.........-07ff3f7c0278",
"git_commit_hash":"64f9fa3c.........c859f0dac8d",
"mount_point":"username/project_name",
"created_at":"2017-11-21T14:59:04.421861Z"
}
],
"members": [
{
"id": 27,
"access": 50,
"access_level": "owner",
"status": "AC",
"user": 1,
"username": "username",
"photo_url": "",
"email": "username@example.com",
"first_name": "",
"last_name": ""
}
],
"total_running_time": 0,
"unpaid_running_time": 0,
"git_commit_hash": "960213cb.........73fef0af4",
"git_commit": null,
"git_branch": "master",
"created_by": 1,
"current_run": null,
"runs": [],
"tags": [],
"user_panel": {
"user": 1,
"pinned": false,
"deleted": false,
"archived": false,
"added_to_board": false,
"panel_data": null,
"created_by": null
},
"status": "created",
"parameters": {
"docker_image": {
"slug": "name:1",
"name": "name-1",
"version": "1",
"docker_image_path": "",
"imagepull_secrets": "tensorport-kubernetes-pull-secret"
},
"mode": "single",
"setup_command": "pip install -r requirements.txt",
"time_limit": 2880,
"command": "python -m main",
"workers": {
"slug": "c5.2xlarge",
"replicas": 1,
"queue": "job-master-queue",
"type": "c5.2xlarge",
"blessed": true,
"type_class": "c",
"gpu": 0,
"cpu": 8,
"memory": 30,
"cudann": null
}
},
"resources": null,
"owner": "admin"
},
{
"job_id": "3a8ac947-......9eb17e0349",
"repository": "5d2a1cd2-.......-2d15f8ca3b41",
"repository_owner": "username",
"repository_name": "project_name",
"repository_owner_photo_url": "/static/images/empty_image_400_400.png",
"display_name": "twilight-water-7",
"description": "",
"created_at": "2018-11-14T11:23:39.559173Z",
"launched_at": null,
"terminated_at": null,
"modified_at": "2018-11-14T11:23:39.577653Z",
"current_time_left_percentage": 0,
"datasets_set": [],
"members": [
{
"id": 26,
"access": 50,
"access_level": "owner",
"status": "AC",
"user": 1,
"username": "admin",
"photo_url": "",
"email": "username@example.com",
"first_name": "",
"last_name": ""
}
],
"total_running_time": 0,
"unpaid_running_time": 0,
"git_commit_hash": "960213c......dc73fef0af4",
"git_commit": null,
"git_branch": "master",
"created_by": 1,
"current_run": null,
"runs": [],
"tags": [],
"user_panel": {
"user": 1,
"pinned": false,
"deleted": false,
"archived": false,
"added_to_board": false,
"panel_data": null,
"created_by": null
},
"status": "created",
"parameters": {
"docker_image": {
"slug": "name:1",
"name": "name-1",
"version": "1",
"docker_image_path": "",
"imagepull_secrets": "tensorport-kubernetes-pull-secret"
},
"mode": "single",
"setup_command": "pip install -r requirements.txt",
"time_limit": 2880,
"command": "python -m main",
"workers": {
"slug": "minikube",
"replicas": 1,
"queue": "job-master-queue",
"type": "minikube",
"blessed": false,
"type_class": "c",
"gpu": 0,
"cpu": 1,
"memory": 1,
"cudann": null
}
},
"resources": null,
"owner": "admin"
}
]
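The query parameters can be combined into a URL, and the returned list reduced to the fields of interest, roughly as follows. This is a sketch: the helper names are hypothetical, and the values each filter accepts are not documented in this section, so the filter values in the usage line are only examples.

```python
from urllib.parse import urlencode

API_BASE = "https://clusterone.com/api"


def list_jobs_url(**filters) -> str:
    """Build the /jobs URL, keeping only the filters that were actually set."""
    query = {key: value for key, value in filters.items() if value is not None}
    return f"{API_BASE}/jobs" + (f"?{urlencode(query)}" if query else "")


def summarize(jobs: list) -> list:
    """Reduce a list-jobs response to (display_name, status, mode) tuples."""
    return [
        (job["display_name"], job["status"], job["parameters"]["mode"])
        for job in jobs
    ]


if __name__ == "__main__":
    print(list_jobs_url(status="created", limit=10))
```

Applied to the example response above, summarize would yield one tuple per job, built from the display_name, status, and parameters.mode fields of each entry.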

get
Members confirm: list

https://clusterone.com/api
/jobs/:job_id/members/:username/confirm
Request
Response
Path Parameters
job_id
required
string
ID of the job
username
required
string
The member's username
200: OK

get
Members expire: list

https://clusterone.com/api
/jobs/:job_id/members/:username/expire
Request
Response
Path Parameters
job_id
required
string
ID of the job
username
required
string
The member's username
200: OK

get
Get status

https://clusterone.com/api
/jobs/:job_id/status
Get the job-monitor output for the given job ID.
Request
Response
Path Parameters
job_id
required
string
ID of the job
204: No Content
No content
{
"status": "Status Update Pending"
}
404: Not Found
Job not found
{
"detail": "Not found."
}
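A client that polls this endpoint has to tell the documented responses apart. The following sketch maps them to short labels. The function name is hypothetical, and the handling of a 200 response is an assumption, since no success body is documented for this endpoint.

```python
import json


def describe_status_response(status_code: int, body: str = "") -> str:
    """Map the responses of /jobs/:job_id/status to a short label."""
    if status_code == 204:
        # Documented as "No content": the status update is still pending.
        return "pending"
    if status_code == 404:
        # Documented as "Job not found".
        return "not found"
    if status_code == 200:
        # Assumption: a success body is a JSON object with a "status" key,
        # mirroring the shape of the 204 example.
        return str(json.loads(body).get("status", "unknown"))
    return f"unexpected HTTP {status_code}"


if __name__ == "__main__":
    print(describe_status_response(204))
    print(describe_status_response(404, '{"detail": "Not found."}'))
```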

put
Update job

https://clusterone.com/api
/jobs/:job_id/
Update an existing job.
Request
Response
Path Parameters
job_id
required
string
ID of the job to update
Body Parameters
display_name
optional
string
Name of the job
description
optional
string
Job description
launched_at
optional
string
terminated_at
optional
string
datasets_set
optional
string
Datasets that should be used with the job
git_commit_hash
optional
string
Commit hash of the code repository to be executed
git_branch
optional
string
logs
optional
string
tags
optional
json
user_panel
optional
string
status
optional
string
resources
optional
string
parameters
required
json
JSON dictionary of the parameters below
parameters:module
optional
string
Module name
parameters:workers
required
json
Dictionary of worker parameters
parameters:workers:replicas
optional
int32
Number of worker instances; use 1 for single mode
parameters:workers:slug
optional
string
Instance type for worker instances. See here for available options.
parameters:parameter_servers
optional
json
Dictionary of parameter server parameters
parameters:parameter_servers:replicas
optional
int32
Number of parameter servers, only for distributed mode
parameters:parameter_servers:slug
optional
string
Instance type for parameter server(s). See here for available options. Distributed mode only.
parameters:setup_command
optional
string
Command that prepares the environment, e.g. by installing requirements
parameters:docker_image
required
json
Docker image of the machine learning framework used by the worker pods. See here for available options.
parameters:docker_image:slug
required
string
Image slug/ID
parameters:docker_image:name
required
string
Image name. Used as the container name.
parameters:docker_image:version
required
string
parameters:docker_image:description
required
string
parameters:docker_image:docker_image_path
required
string
Path of image in any format acceptable by Docker
parameters:docker_image:imagepull_secrets
required
string
parameters:parameter_servers_docker_image
optional
json
Docker image of the machine learning framework used by the parameter server. See here for available options.
parameters:parameter_servers_docker_image:slug
required
string
Image slug/ID
parameters:parameter_servers_docker_image:name
required
string
Image name. Used as the container name.
parameters:parameter_servers_docker_image:version
required
string
parameters:parameter_servers_docker_image:description
required
string
parameters:parameter_servers_docker_image:docker_image_path
required
string
Path of image in any format acceptable by Docker
parameters:parameter_servers_docker_image:imagepull_secrets
required
string
parameters:mode
optional
string
Can be single or distributed
parameters:time_limit
optional
string
Maximum runtime of the job. Accepts hours and minutes as input, formatted as <hours>h<minutes>m. Hours are required, minutes are optional. Examples of valid inputs: 2h, 200h, 500m, 2h23m. Defaults to 48h.
200: OK
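Because parameters is marked as required for Update job, changing a single field still means sending a parameters dictionary. One way to do that is to start from the parameters of the existing job (as returned by Read job) and overlay the changes, as sketched below. The helper name, the Bearer authorization header, and the idea that the server accepts the parameters in the same shape it returns them are all assumptions. The request is built but not sent.

```python
import copy
import json
import urllib.request

API_BASE = "https://clusterone.com/api"


def build_update_request(job: dict, changes: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that applies changes to an existing job."""
    # Copy the current parameters so the caller's job dictionary is not modified.
    body = {"parameters": copy.deepcopy(job["parameters"])}
    # Overlay changed parameters key by key; untouched keys are carried over.
    body["parameters"].update(changes.get("parameters", {}))
    # Top-level changes such as display_name or description are added as given.
    body.update({key: value for key, value in changes.items() if key != "parameters"})
    return urllib.request.Request(
        url=f"{API_BASE}/jobs/{job['job_id']}/",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the authentication scheme is not documented here.
            "Authorization": f"Bearer {token}",
        },
        method="PUT",
    )


if __name__ == "__main__":
    existing = {"job_id": "<job-id>", "parameters": {"mode": "single", "time_limit": 2880}}
    req = build_update_request(existing, {"display_name": "renamed-job"}, token="<api-token>")
    print(req.get_method(), req.full_url)
```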