GPU Deployments

Run GPU-accelerated workloads like ML training, inference, rendering, and video encoding on Kova.

Overview

Kova providers can offer GPU resources alongside CPU and memory. You request GPUs in your SDL manifest, and the marketplace matches you with providers that have the hardware you need. Pricing is set by providers and varies based on GPU model and availability.

Specifying GPU Resources

Add a gpu section to your resource profile:

version: "2.0"
services:
  ml:
    image: pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
    expose:
      - port: 8888
        as: 8888
        to:
          - global: true

profiles:
  compute:
    ml:
      resources:
        cpu:
          units: 4
        memory:
          size: 16Gi
        storage:
          - size: 100Gi
        gpu:
          units: 1
          attributes:
            vendor:
              nvidia:

deployment:
  ml:
    anywhere:
      profile: ml
      count: 1

GPU Attributes

You can optionally specify a model to target specific hardware:

gpu:
  units: 1
  attributes:
    vendor:
      nvidia:
        - model: a100
        - model: rtx4090

When multiple models are listed, providers with any of the listed models can bid on your deployment.

Available GPU Models

GPU availability depends on what providers bring online. Common models include:

Category       Models
Data Center    NVIDIA A100, A10, L4, L40S, H100
Consumer       NVIDIA RTX 4090, RTX 3090, RTX 3080

GPU availability varies throughout the day. Submitting your deployment during off-peak hours (evenings and weekends UTC) often means better pricing and faster matching.

Filtering Providers by GPU

When your deployment requests GPU resources, only providers with matching hardware will bid. In the dashboard:

  1. Submit your GPU-enabled SDL
  2. The bids tab will show only providers with matching GPUs
  3. Each bid shows the GPU model and price
  4. Accept the bid with the best price or specs for your workload

Common GPU Workloads

ML Training with PyTorch

services:
  train:
    image: pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
    env:
      - NVIDIA_VISIBLE_DEVICES=all
    params:
      storage:
        data:
          mount: /workspace
          source: uploads
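
Before kicking off a long training run, it is worth confirming that the container actually sees a GPU. A minimal sanity check using standard PyTorch APIs (nothing Kova-specific is assumed):

import torch

if torch.cuda.is_available():
    device = torch.device("cuda")
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"CUDA runtime: {torch.version.cuda}")
    # Run a small matmul on the GPU to confirm compute works end to end
    x = torch.randn(1024, 1024, device=device)
    print((x @ x).sum().item())
else:
    raise SystemExit("No GPU visible - check the gpu section of your SDL")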

Inference with Ollama

services:
  ollama:
    image: ollama/ollama:latest
    expose:
      - port: 11434
        as: 11434
        to:
          - global: true
    env:
      - NVIDIA_VISIBLE_DEVICES=all
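
Once the lease is active, Ollama's HTTP API is reachable on the exposed port. Here is a client sketch using Python's requests library; the hostname is a placeholder for the URI your provider assigns, and the model name assumes you have already pulled it (e.g. by running ollama pull llama3 inside the container):

import requests

# Placeholder host - substitute the URI assigned by your provider
OLLAMA_URL = "http://your-deployment-uri:11434"

resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    # "llama3" is an example; use any model you have pulled
    json={"model": "llama3", "prompt": "Why is the sky blue?", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])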

Jupyter Notebook

services:
  jupyter:
    image: jupyter/tensorflow-notebook:latest
    expose:
      - port: 8888
        as: 8888
        to:
          - global: true
    env:
      - JUPYTER_TOKEN=your-token-here
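
To verify the notebook container has GPU access, run a quick check in a cell (standard TensorFlow API):

import tensorflow as tf

# An empty list means the container cannot see a GPU
print(tf.config.list_physical_devices("GPU"))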

CUDA and Driver Compatibility

GPU containers run under the NVIDIA Container Runtime on the provider. Compatibility depends on the provider's installed driver version:

CUDA Version   Minimum Driver
CUDA 12.x      525.60+
CUDA 11.8      520.61+
CUDA 11.7      515.43+

Use container images with CUDA bundled (e.g. pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime) rather than relying on the host CUDA installation. This ensures compatibility across different providers.
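
To see which driver a given provider is running, query nvidia-smi from inside your container; the NVIDIA Container Runtime exposes the host driver. A quick check in Python (standard nvidia-smi flags):

import subprocess

# Reports the host driver version and GPU model exposed to the container
out = subprocess.run(
    ["nvidia-smi", "--query-gpu=driver_version,name", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
print(out.stdout.strip())

Compare the output against the compatibility table above to confirm the driver supports your image's CUDA version.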

Resource Recommendations

Workload                GPU              CPU units   Memory   Storage
Small model inference   1x RTX 3080      4           8Gi      50Gi
Large model inference   1x A100          8           32Gi     200Gi
Training (small)        1x RTX 4090      8           16Gi     100Gi
Training (large)        1x A100 / H100   16          64Gi     500Gi
Rendering               1x RTX 4090      4           16Gi     50Gi