Runtime Environment

Each job on Clusterone runs in its own separate container. The mount points for code and data are always the same, independent of the type of project and the data source.

At runtime, the following folders are available to each job:

  • /data contains datasets

  • /public contains ready-to-use datasets, like ImageNet and CIFAR-10

  • /code contains the code

  • /logs contains job outputs that will be saved after termination

Data

When creating a job, you can select datasets that should be available for this job. They will be mounted in the /data folder and can be accessed by your project code.

Each dataset is mounted in its own folder. The folder structure is as follows: /data/<owner-username>/<dataset-name>.

  • <owner-username> is the username of the owner of the dataset. If you created the dataset, this is your username. If another user has shared a dataset with you, this is the username of the user who owns the dataset.

  • <dataset-name> is the name of the dataset. This is the name you gave the dataset when you created it.
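For example, if a user jdoe (hypothetical) owns a dataset named mnist-data and you select it for a job, your code can read it from /data/jdoe/mnist-data:

```python
import os

# Hypothetical names: the dataset "mnist-data" owned by user "jdoe" is
# mounted at /data/jdoe/mnist-data when selected for the job.
data_dir = os.path.join('/data', 'jdoe', 'mnist-data')
print(os.listdir(data_dir))  # list the files in the mounted dataset
```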

Mounting characteristics by dataset source

| Data source | Mount type | Available at | Access level |
| --- | --- | --- | --- |
| GitLab | Copied before runtime | /data | Full |
| GitHub | Copied before runtime | /data | Full |
| Amazon S3 | Copied before runtime | /data | Full |
| Clusterone Public Datasets | Mounted directly on pod | /public | Read only |
| Amazon EFS (Clusterone Enterprise only) | Mounted directly on pod | N/A | Full |
| NFS | Mounted directly on pod | N/A | Full |

As the table above shows, Git and S3 datasets are copied before runtime, so changes to their content do not affect the original dataset. EFS and NFS datasets, on the other hand, are mounted directly on the pods, so changes to their content directly affect the source.

The get_data_path() function

The Clusterone Python package offers the get_data_path() convenience function to simplify switching between local execution of a script and running it on Clusterone.

The function automatically detects whether a script is running on a local machine or on Clusterone and returns the correct path to the dataset. It takes the local data path and the Clusterone owner and dataset names as inputs.

See the Python Package Documentation for more details on get_data_path().
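A minimal usage sketch, assuming the dataset_name, local_root, local_repo, and path arguments described in the Python Package Documentation; the owner, dataset, and local directory names are hypothetical:

```python
from clusterone import get_data_path

# Resolves to /data/jdoe/mnist-data when running on Clusterone, and to
# ~/Documents/data/mnist-data when running locally (hypothetical names).
data_dir = get_data_path(
    dataset_name='jdoe/mnist-data',  # <owner-username>/<dataset-name>
    local_root='~/Documents/data',   # parent directory on the local machine
    local_repo='mnist-data',         # dataset folder name on the local machine
    path=''                          # optional subpath inside the dataset
)
```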

Projects

The code of the project associated with the job is copied to /code when a job is created.

Log files

All outputs are stored in /logs. These files will persist after the job terminates and are available from the Matrix as outputs.

Clusterone's integrated TensorBoard also reads summaries from the /logs folder of a job.
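For instance, a minimal TensorFlow 1.x-style sketch that writes a scalar summary to /logs, where the integrated TensorBoard will pick it up (the metric and its value are hypothetical placeholders):

```python
import tensorflow as tf

# Hypothetical metric; in a real script this would be your training loss.
loss = tf.constant(0.5, name='loss')
tf.summary.scalar('loss', loss)
merged = tf.summary.merge_all()

with tf.Session() as sess:
    # Writing to /logs makes the summaries visible to Clusterone's
    # integrated TensorBoard and persists them after the job terminates.
    writer = tf.summary.FileWriter('/logs', sess.graph)
    summary = sess.run(merged)
    writer.add_summary(summary, global_step=0)
    writer.close()
```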

Environment variables

These variables are only set for distributed jobs.

| Variable | Content |
| --- | --- |
| CLUSTERONE_CLOUD | "clusterone_cloud" |
| PS_HOSTS | List of IPs with ports for the PS pods |
| WORKER_HOSTS | List of IPs with ports for the worker pods |
| TYPE | "worker", "ps", or "master" |
| TASK_INDEX | Node number, e.g. 0, 1, 2, ... |
| IS_MASTER | "True" if the given pod is the master |
| JOB_ID | The job's ID on Clusterone |
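As a sketch, a distributed TensorFlow 1.x script could use these variables to build its cluster definition. Treating PS_HOSTS and WORKER_HOSTS as comma-separated lists, and mapping the master pod to the chief worker, are assumptions based on common TensorFlow convention:

```python
import os
import tensorflow as tf

# Assumption: PS_HOSTS and WORKER_HOSTS are comma-separated "ip:port" lists.
ps_hosts = os.environ['PS_HOSTS'].split(',')
worker_hosts = os.environ['WORKER_HOSTS'].split(',')
job_name = os.environ['TYPE']              # "worker", "ps", or "master"
task_index = int(os.environ['TASK_INDEX'])

# Assumption: the master pod acts as the chief worker.
if job_name == 'master':
    job_name = 'worker'

cluster = tf.train.ClusterSpec({'ps': ps_hosts, 'worker': worker_hosts})
server = tf.train.Server(cluster, job_name=job_name, task_index=task_index)

if job_name == 'ps':
    server.join()  # parameter servers block here and serve variables
```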

Environment variables for MPI jobs are coming soon!