Jobs on Clusterone


Code on Clusterone is executed in jobs. A job represents the configuration of the environment that the code runs in. This configuration includes which code file to run, what data to operate on, and so forth. See below for a list of available job parameters.

A job can be run many times, each time creating the exact same runtime configuration. To run the code with a different configuration, you have to create a new job.
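One way to picture this rule is to treat a job as an immutable record. The sketch below is only an illustration: `JobConfig`, its fields, and the example values are hypothetical names chosen for this page and are not part of the Clusterone API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class JobConfig:
    """Hypothetical stand-in for the fixed configuration of a job."""
    project: str
    commit: str
    command: str
    dataset: str


# A job keeps the same configuration every time it is run.
base = JobConfig(
    project="mnist-demo",       # assumed project name
    commit="abc1234",           # assumed Git commit
    command="python -m mnist",
    dataset="mnist-data",       # assumed dataset name
)

# A different configuration means a new job: replace() returns a fresh
# object and leaves the original untouched.
variant = replace(base, commit="def5678")
```

Because the dataclass is frozen, assigning to `base.commit` raises an error, which mirrors the idea that changing a setting requires creating a new job rather than editing an existing one.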

After a job has finished its execution, the output data from the run is available. If the code supports it, a job can also be analyzed in TensorBoard.

See here to learn more about data output from a job.

Job Parameters

When creating a job, you can configure a variety of parameters. This section discusses the parameters and their impact on the job execution.

You might also want to take a look at the reference manual of the Clusterone CLI.

Project and Code Settings

  • Project: This is the repository that stores the code the job should run. See here to learn more about projects on Clusterone.

  • Git commit: All repository options are based on Git. You can choose the specific commit you want to run.

  • Command: This is the bash command that will be executed by Clusterone to start the job. It should be the same command that you would use to run the code locally, for example python -m mnist.

  • Setup Command: A bash script that is executed before the command on every pod. If this parameter is not provided, Clusterone assumes that the setup command is

    pip install -r requirements.txt
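The relationship between the setup command and the command can be sketched as follows. This is a minimal illustration of the defaulting rule described above, not Clusterone's actual implementation; the function name `startup_sequence` is hypothetical.

```python
from typing import List, Optional

# Default taken from the description above.
DEFAULT_SETUP_COMMAND = "pip install -r requirements.txt"


def startup_sequence(command: str, setup_command: Optional[str] = None) -> List[str]:
    """Return the shell commands in the order they would run on a pod."""
    # Fall back to the default only when no setup command was given.
    setup = setup_command if setup_command is not None else DEFAULT_SETUP_COMMAND
    return [setup, command]


default_steps = startup_sequence("python -m mnist")
custom_steps = startup_sequence("python -m mnist", setup_command="bash setup.sh")
```

Here `default_steps` starts with the `pip install` default, while `custom_steps` starts with the assumed script `bash setup.sh`. In both cases the command itself runs last.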

Datasets

The data that your job operates on. See here to learn more about datasets on Clusterone.

Python Environment and Machine Learning Framework

  • Docker image: Because Clusterone runs jobs in Docker containers, a base image has to be selected for the job. The base image determines which framework and version the job uses. A list of supported frameworks and their respective Docker image names can be found here.

Hardware Resources

  • Single or Distributed: Clusterone supports running code on single machines, as well as distributed computing based on data parallelism.

  • Number and type of instances: Clusterone offers different types of hardware instances. See here to learn more.
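For readers new to data parallelism, the idea behind the distributed option is that each worker runs the same code on a different slice of the data. The sketch below shows one simple way to split samples across workers. It is a generic illustration, not how Clusterone assigns data, and the `shard` helper is a hypothetical name.

```python
from typing import List, Sequence


def shard(samples: Sequence[int], num_workers: int, worker_index: int) -> List[int]:
    """Return the samples assigned to one worker, taking every num_workers-th item."""
    return list(samples[worker_index::num_workers])


samples = list(range(10))
num_workers = 3

# Every worker gets its own slice; together the slices cover all samples once.
shards = [shard(samples, num_workers, i) for i in range(num_workers)]
```

With ten samples and three workers, the shards differ in size by at most one sample, so the work stays roughly balanced.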

Job Metadata

  • Name: Give your job a unique name to remember it by.

  • Description: Describe what your job does or how it differs from other jobs.

  • Maximum runtime: The time limit for the job. If the job has not finished when this limit is reached, it is terminated.
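To try out how code behaves under a runtime limit before submitting a job, a similar limit can be applied locally. The sketch below uses the `timeout` argument of Python's `subprocess.run`. It is a local analogy only and says nothing about how Clusterone enforces the limit; `run_with_limit` is a hypothetical helper.

```python
import subprocess
import sys
from typing import List, Optional


def run_with_limit(argv: List[str], max_runtime_seconds: float) -> Optional[int]:
    """Run a command and return its exit code, or None if it exceeded the limit."""
    try:
        completed = subprocess.run(argv, timeout=max_runtime_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this error.
        return None
    return completed.returncode


# Finishes well within the limit.
quick = run_with_limit([sys.executable, "-c", "print('done')"], 30)

# Sleeps longer than the limit allows, so it is terminated early.
slow = run_with_limit([sys.executable, "-c", "import time; time.sleep(5)"], 0.5)
```

A return value of `None` signals that the command was cut off, which is the case to plan for when choosing a maximum runtime.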