just create job

Create a new job on Clusterone.

Usage

just create job (single|distributed) [options]

Either single or distributed needs to be specified. Note that some options only apply to one of the two modes.

  • single: Create a job running on only one machine.

  • distributed: Create a job that runs on a cluster of machines.

Typical example

While the create job command supports many different options, their default values are sufficient for most scenarios. A typical invocation looks like this:

just create job distributed --project my_project --datasets my_dataset \
--docker-image tensorflow-1.11.0-cpu-py35

Note that --project and --docker-image are required. You can view the list of available docker images by executing just create job distributed -h. By default, the same docker image is used for the parameter servers. --datasets attaches one or more datasets to the job.

Minimum required options to create a job:

just create job (single|distributed) --project my_project --docker-image <value>

Options

--name <job name>

Name of the job to create. Can contain letters (upper- and lowercase), numbers, as well as - and _. Defaults to a random name if not specified. Optional.

--project [username]/<project name>

Name of the Clusterone project containing the code for this job. Has to point to an existing project. The username is optional and defaults to the current username if not specified. Required.

--commit <Git commit hash>

Git commit to be used for the project code. Uses the Git commit hash to identify a commit. Defaults to the latest commit if not specified. Optional.

--datasets [username]/<dataset name>:[Git commit hash], ...

Names of datasets that should be used with the job. Has to point to existing datasets. Username is optional, defaults to the current username if not specified. Git commit hash is optional, defaults to the latest commit if not specified. Multiple datasets can be included in a comma-separated list. No datasets are loaded if not specified. Optional.
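
When several datasets are involved, the comma-separated list can also be assembled programmatically. A minimal shell sketch, with illustrative dataset names and commit hash (a hash may be omitted to use the latest commit):

```shell
# Join several dataset specs into the comma-separated form expected by
# --datasets. Names and the commit hash are illustrative.
specs="alice/mnist:3f2a1bc bob/cifar10"   # hash omitted => latest commit
datasets=$(echo "$specs" | tr ' ' ',')
echo "$datasets"   # alice/mnist:3f2a1bc,bob/cifar10
```

The resulting string can then be passed directly as the value of --datasets.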

--command <"python -m main">

Bash command to run inside the job. Defaults to python -m main if not specified. Optional.

--docker-image <docker_image>

Machine learning docker image and version to use for the workers. See here to learn more about the docker images for frameworks offered by Clusterone. Required.

--setup-command <"pip install -r requirements.txt">

Command to prepare the environment, e.g. by installing requirements. Empty by default. Optional.

--time-limit <duration>

Time limit after which the job stops running. Accepts hours and minutes as input, formatted as <hours>h<minutes>m; either part may be omitted, but not both. Examples of valid inputs: 2h, 200h, 500m, 2h23m. Defaults to 48h if not specified. Optional.
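
The duration format can be checked locally before submitting a job. A minimal bash sketch; to_minutes is a hypothetical helper, not part of the CLI:

```shell
# Hypothetical helper: convert a --time-limit value of the form
# <hours>h<minutes>m (either part may be omitted) to total minutes.
to_minutes() {
  [[ $1 =~ ^(([0-9]+)h)?(([0-9]+)m)?$ && -n $1 ]] || return 1
  echo $(( ${BASH_REMATCH[2]:-0} * 60 + ${BASH_REMATCH[4]:-0} ))
}
to_minutes 2h23m   # 143
to_minutes 500m    # 500
```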

--description <text>

Description of the job. Appears in the GUI below the job. Defaults to empty. Optional.

--gpu-count <number>

Number of instance GPUs used for the job. Defaults to the total number of GPUs available on the instance. Optional.

Options for distributed jobs only

--worker-type <instance_type>

Type of hardware instance for worker nodes. Has to be one of the following options: t2.small, t2.small-spot, p2.xlarge, p3.2xlarge, c4.2xlarge, or c4.2xlarge-spot. See here to learn more about the instance types offered by Clusterone. Defaults to c4.2xlarge if not specified. Optional.

--worker-replicas <number>

Number of workers. Defaults to 2 if not specified. Optional.

--ps-type <instance_type>

Type of hardware instance for parameter servers. Has to be one of the following options: t2.small, t2.small-spot, c4.2xlarge, or c4.2xlarge-spot. See here to learn more about the instance types offered by Clusterone. Defaults to c4.2xlarge if not specified. Optional.

--ps-replicas <number>

Number of parameter servers. Defaults to 1 if not specified. Optional.

--ps-docker-image <docker_image>

Docker image to run on the parameter servers. See here to learn more about the docker images for frameworks offered by Clusterone. Defaults to the image passed to --docker-image. Optional.
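
Putting the distributed-only options together, a full invocation might look like the sketch below. The command is assembled as a string here rather than executed, and the chosen values are illustrative examples, not defaults:

```shell
# Assemble an illustrative distributed invocation as a string; all values
# below are taken from the option descriptions above and are examples only.
cmd="just create job distributed --project my_project \
--docker-image tensorflow-1.11.0-cpu-py35 \
--worker-type c4.2xlarge-spot --worker-replicas 4 \
--ps-type t2.small --ps-replicas 2"
echo "$cmd"
```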

Options for single node jobs only

--instance-type <instance_type>

Type of hardware instance for single node jobs. Has to be one of the following options: t2.small, t2.small-spot, p2.xlarge, p3.2xlarge, c4.2xlarge, or c4.2xlarge-spot. See here to learn more about the instance types offered by Clusterone. Defaults to c4.2xlarge if not specified. Optional.