Instance Types

Clusterone offers a variety of instance types to run your models on. The public platform provides AWS instances. Clusterone Enterprise supports instances from any cloud provider.

Purchasing options

There are two main compute options on Clusterone:

  • blessed instances (default): reliable, with a 99.9% uptime guarantee

  • spot instances: can be interrupted at any time. Clusterone automatically restarts interrupted jobs.

Naming conventions

Instances are named after their type. Spot instances are identified by a trailing "-spot": <instance-type>[-spot]. Naming examples:

  • blessed: c4.2xlarge

  • spot: c4.2xlarge-spot
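The naming rule above can be sketched in a few lines of Python. This is only an illustration of the <instance-type>[-spot] convention. The helper names parse_instance_name and format_instance_name are hypothetical and are not part of any Clusterone API.

```python
from typing import NamedTuple

SPOT_SUFFIX = "-spot"


class InstanceName(NamedTuple):
    instance_type: str
    is_spot: bool


def parse_instance_name(name: str) -> InstanceName:
    """Split a name of the form <instance-type>[-spot] into its parts."""
    if name.endswith(SPOT_SUFFIX):
        return InstanceName(name[: -len(SPOT_SUFFIX)], True)
    return InstanceName(name, False)


def format_instance_name(instance_type: str, spot: bool = False) -> str:
    """Build the instance name for a given type and purchasing option."""
    return instance_type + SPOT_SUFFIX if spot else instance_type
```

For example, parse_instance_name("c4.2xlarge-spot") yields the type c4.2xlarge with is_spot set to True, and format_instance_name("c4.2xlarge") returns the blessed name unchanged.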

Instance characteristics

The table below lists the instance types Clusterone offers on its public platform, along with the framework versions each one supports. For pricing information, see our pricing page.

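Instance    | CPUs | GPUs | Memory | GPU Type          | Supported Frameworks              | Purchase Options
------------+------+------+--------+-------------------+-----------------------------------+-----------------
t2.small    | 1    | None | 2 GiB  | None              | All                               | Blessed, Spot
p2.xlarge   | 4    | 1    | 61 GiB | NVIDIA K80        | PyTorch, TensorFlow up to v1.4    | Blessed
p3.2xlarge  | 8    | 1    | 61 GiB | NVIDIA Tesla V100 | PyTorch, TensorFlow 1.5 and above | Blessed
c4.2xlarge  | 8    | None | 15 GiB | None              | All                               | Blessed, Spot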

For further information about each instance type, visit the AWS website.
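As a rough illustration of how the instance characteristics might be used when choosing an instance, the sketch below encodes the listed instance types as a Python dictionary. The INSTANCES mapping and both helper functions are hypothetical names introduced here. They mirror the values stated on this page and do not query Clusterone.

```python
# Values copied from the instance overview on this page; not fetched from any API.
INSTANCES = {
    "t2.small": {"cpus": 1, "gpus": 0, "memory_gib": 2, "spot": True},
    "p2.xlarge": {"cpus": 4, "gpus": 1, "memory_gib": 61, "spot": False},
    "p3.2xlarge": {"cpus": 8, "gpus": 1, "memory_gib": 61, "spot": False},
    "c4.2xlarge": {"cpus": 8, "gpus": 0, "memory_gib": 15, "spot": True},
}


def spot_capable_instances():
    """Return the instance types that can also be purchased as spot."""
    return sorted(name for name, spec in INSTANCES.items() if spec["spot"])


def gpu_instance_for_tensorflow(version: str) -> str:
    """Pick the GPU instance whose supported TensorFlow range covers `version`."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    # p2.xlarge: TensorFlow up to v1.4; p3.2xlarge: TensorFlow 1.5 and above.
    return "p3.2xlarge" if (major, minor) >= (1, 5) else "p2.xlarge"
```

With these assumptions, gpu_instance_for_tensorflow("1.4.1") selects p2.xlarge and gpu_instance_for_tensorflow("1.8") selects p3.2xlarge.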

Spot instances

Spot instances are spare capacity that is sold at a discount but can be interrupted at any time.

On AWS, a spot instance is lost when it is drained, and you then have to bid for a new one. Clusterone bids for spot instances automatically and continuously, so that capacity remains available to you.

Jobs running on spot instances that are interrupted are automatically restarted, meaning you can run jobs on spot instances without constantly monitoring them. When a spot instance is drained, Clusterone will procure additional instances and resume the job.

Provided you are using checkpoints, even large-scale long-running workloads do not require any setup or monitoring when running on spot instances.
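To make the checkpointing requirement concrete, here is a minimal, framework-agnostic sketch of a job that resumes from its last checkpoint after a restart. It assumes the job can write to a persistent file path. The path, the state layout, and the function names are hypothetical, and the "training step" is a stand-in for real work. In practice you would typically rely on your framework's own checkpoint mechanism instead.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0.0}


def save_checkpoint(path, state):
    """Write to a temp file, then rename, so an interruption cannot corrupt the checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def train(path, num_steps, checkpoint_every=10):
    """Run up to `num_steps`, resuming from the checkpoint at `path` if present."""
    state = load_checkpoint(path)
    while state["step"] < num_steps:
        state["total"] += state["step"]  # stand-in for one training step
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

If the job is interrupted between checkpoints, only the work done since the last saved step is repeated after the restart, which is why a shorter checkpoint interval reduces lost work at the cost of more frequent writes.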