Python Package Documentation


The Clusterone Python package makes it easy to prepare Python code to run on Clusterone. It also provides just, the command-line interface for interacting with Clusterone.

Installation

The clusterone package is available for download from PyPI.

The easiest way to install clusterone is using the pip package manager. To install the latest version with pip, open a command line and type:

pip install clusterone

Import

Most users will only require the get_data_path() and get_logs_path() functions in their code. Import the functions into your Python code like this:

from clusterone import get_data_path, get_logs_path

To use the entire clusterone package in your Python code, import it like this:

import clusterone

get_data_path()

get_data_path() enables your program to access data on your local machine as well as Clusterone without making changes to the code.

Returns the local dataset path if the code is run locally, or the Clusterone dataset path if the code is run on Clusterone.

Syntax

clusterone.get_data_path(dataset_name, local_root, local_repo, path)

Arguments

  • dataset_name: string, Clusterone dataset path.

    Structure: <owner-username>/<dataset-name>, where

    • <owner-username> is the name of the user who owns the dataset.

    • <dataset-name> is the name of the dataset.

  • local_root: string, root directory for the local dataset, e.g. /home/username/datasets

  • local_repo: string, repository name inside the local root directory, e.g. mnist

  • path: string, path inside the repository, e.g. train.

Returns

  • <local_root>/<local_repo>/<path> if the code is running on the local machine.

  • /data/<dataset_name>/<path> if the code is running on Clusterone.
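The two return values above can be sketched as a small stand-alone function. This only illustrates the documented behavior and is not the package's implementation: resolve_data_path and its on_clusterone flag are hypothetical names, and the way the real package detects where it is running is not covered here.

```python
import posixpath


def resolve_data_path(dataset_name, local_root, local_repo, path, on_clusterone):
    """Mimic the documented return values of get_data_path() (hypothetical helper)."""
    if on_clusterone:
        # Documented form on Clusterone: /data/<dataset_name>/<path>
        return posixpath.join("/data", dataset_name, path)
    # Documented form on the local machine: <local_root>/<local_repo>/<path>
    return posixpath.join(local_root, local_repo, path)
```

Passing the environment in as an explicit flag keeps the sketch free of any guess about how Clusterone identifies itself to running code.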

Example

from clusterone import get_data_path
#...
data_path = get_data_path(
    dataset_name='my_username/dataset_name',  # dataset on Clusterone
    local_root='~/Documents/datasets/',       # path to the local datasets
    local_repo='my_data',                     # local data folder name
    path='train'                              # folder within the data folder
)

In this example, data_path resolves to:

  • ~/Documents/datasets/my_data/train when running the code locally

  • /data/my_username/dataset_name/train when running on Clusterone
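Once data_path is resolved, the same code can read files in either environment. The following is a minimal sketch of one way to consume it, assuming the local path may still start with ~ (this page does not say whether get_data_path() expands it). list_data_files is a hypothetical helper and not part of the clusterone package.

```python
import os


def list_data_files(data_path):
    """Return the sorted names of the files directly inside data_path."""
    # expanduser() leaves paths that do not start with "~" unchanged,
    # so a Clusterone path such as /data/... passes through as-is.
    resolved = os.path.expanduser(data_path)
    return sorted(
        name
        for name in os.listdir(resolved)
        if os.path.isfile(os.path.join(resolved, name))
    )
```

Calling list_data_files(data_path) then returns the file names in the train folder, whichever of the two paths data_path resolved to.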

get_logs_path()

Returns a local log path if the code is running locally, or the log path on Clusterone if it is running on Clusterone.

Syntax

clusterone.get_logs_path(root)

Arguments

  • root: string, local directory for logs, e.g. /home/username/logs/mnist

Returns

  • <root> if the code is running on the local machine.

  • /logs/ if the code is running on Clusterone.

Example

from clusterone import get_logs_path
# ...
logs_path = get_logs_path('~/Documents/tf-logs/')

In this example, logs_path resolves to:

  • ~/Documents/tf-logs/ when running the code locally

  • /logs/ when running on Clusterone
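A resolved logs_path can be used like any other directory. The sketch below is one possible way to write to it, assuming the local directory may not exist yet and may begin with ~. write_log_line and the default file name train.log are hypothetical and not part of the clusterone package.

```python
import os


def write_log_line(logs_path, message, filename="train.log"):
    """Append one line to a log file under logs_path and return the file's path."""
    # Expand "~" for local paths; /logs/ on Clusterone is left unchanged.
    log_dir = os.path.expanduser(logs_path)
    # Create the directory if it is missing; do nothing if it already exists.
    os.makedirs(log_dir, exist_ok=True)
    log_file = os.path.join(log_dir, filename)
    with open(log_file, "a", encoding="utf-8") as handle:
        handle.write(message + "\n")
    return log_file
```

Opening the file in append mode means repeated calls add lines to the same log rather than overwriting it.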