# GPU Deployments
Run GPU-accelerated workloads like ML training, inference, rendering, and video encoding on Kova.
## Overview
Kova providers can offer GPU resources alongside CPU and memory. You request GPUs in your SDL manifest, and the marketplace matches you with providers that have the hardware you need. Pricing is set by providers and varies based on GPU model and availability.
## Specifying GPU Resources
Add a `gpu` section under `resources` in your compute profile:
```yaml
version: "2.0"

services:
  ml:
    image: pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
    expose:
      - port: 8888
        as: 8888
        to:
          - global: true

profiles:
  compute:
    ml:
      resources:
        cpu:
          units: 4
        memory:
          size: 16Gi
        storage:
          - size: 100Gi
        gpu:
          units: 1
          attributes:
            vendor:
              nvidia:

deployment:
  ml:
    anywhere:
      profile: ml
      count: 1
```
## GPU Attributes
You can optionally list one or more models to target specific hardware:
```yaml
gpu:
  units: 1
  attributes:
    vendor:
      nvidia:
        - model: a100
        - model: rtx4090
```
When multiple models are listed, providers with any of the listed models can bid on your deployment.
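Conversely, omitting the model list entirely (as in the full example above) appears to match any NVIDIA GPU. A minimal sketch:

```yaml
gpu:
  units: 1
  attributes:
    vendor:
      nvidia:   # no model list, so any NVIDIA GPU qualifies
```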
## Available GPU Models
GPU availability depends on what providers bring online. Common models include:
| Category | Models |
|---|---|
| Data Center | NVIDIA A100, A10, L4, L40S, H100 |
| Consumer | NVIDIA RTX 4090, RTX 3090, RTX 3080 |
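Names in the table map to lowercase identifiers in the SDL, as the attributes example above shows for a100 and rtx4090. A sketch assuming the other models follow the same naming pattern:

```yaml
gpu:
  units: 1
  attributes:
    vendor:
      nvidia:
        - model: h100   # assumed to follow the same lowercase pattern as a100/rtx4090
        - model: l40s
```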
GPU availability varies throughout the day. Submitting your deployment during off-peak hours (UTC evenings and weekends) typically gets you better pricing and faster matching.
## Filtering Providers by GPU
When your deployment requests GPU resources, only providers with matching hardware will bid. In the dashboard:
- Submit your GPU-enabled SDL
- The bids tab will show only providers with matching GPUs
- Each bid shows the GPU model and price
- Accept the bid with the best price or specs for your workload
## Common GPU Workloads
### ML Training with PyTorch
```yaml
services:
  train:
    image: pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
    env:
      - NVIDIA_VISIBLE_DEVICES=all
    params:
      storage:
        data:
          mount: /workspace
          source: uploads
```
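This fragment shows only the service; a complete manifest also needs compute and deployment sections, as in the full example above. A sketch sized from the small-training row of the recommendations table below (the rtx4090 pin and the train names are illustrative):

```yaml
profiles:
  compute:
    train:
      resources:
        cpu:
          units: 8
        memory:
          size: 16Gi
        storage:
          - size: 100Gi
        gpu:
          units: 1
          attributes:
            vendor:
              nvidia:
                - model: rtx4090   # illustrative pin; omit to accept any NVIDIA GPU

deployment:
  train:
    anywhere:
      profile: train
      count: 1
```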
### Inference with Ollama
```yaml
services:
  ollama:
    image: ollama/ollama:latest
    expose:
      - port: 11434
        as: 11434
        to:
          - global: true
    env:
      - NVIDIA_VISIBLE_DEVICES=all
```
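Ollama pulls model weights at runtime, so persisting its model directory avoids re-downloading after a restart. A sketch reusing the params.storage pattern from the training example; /root/.ollama is Ollama's default model directory inside the container, and the source name is illustrative:

```yaml
services:
  ollama:
    # ...image, expose, and env as above...
    params:
      storage:
        models:
          mount: /root/.ollama   # Ollama's default model directory (assumes image defaults unchanged)
          source: uploads        # illustrative source name, as in the training example
```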
### Jupyter Notebook
```yaml
services:
  jupyter:
    image: jupyter/tensorflow-notebook:latest
    expose:
      - port: 8888
        as: 8888
        to:
          - global: true
    env:
      - JUPYTER_TOKEN=your-token-here
```
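Since port 8888 is exposed globally, replace your-token-here with a long random value before deploying. Notebooks are also lost on restart unless you mount storage; a sketch using the same storage pattern, assuming the image's default work directory of /home/jovyan/work:

```yaml
services:
  jupyter:
    # ...image, expose, and env as above...
    params:
      storage:
        notebooks:
          mount: /home/jovyan/work   # default notebook directory in jupyter/* images (assumption)
          source: uploads            # illustrative source name
```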
## CUDA and Driver Compatibility
GPU containers run with NVIDIA Container Runtime on the provider. Compatibility depends on the provider's driver version:
| CUDA Version | Minimum Driver |
|---|---|
| CUDA 12.x | 525.60+ |
| CUDA 11.8 | 520.61+ |
| CUDA 11.7 | 515.43+ |
Use container images with CUDA bundled (e.g. `pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime`) rather than relying on the host CUDA installation. This ensures compatibility across different providers.
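If a container starts but can't see the GPU, a mismatch between the image's CUDA version and the provider's driver is a likely cause; pinning an older CUDA via the image tag lowers the driver requirement. A sketch tying image tags to the table above (verify exact tags on Docker Hub):

```yaml
services:
  ml:
    # CUDA 12.1 bundled in the image: provider driver must be 525.60+ per the table above
    image: pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
    # For providers on older drivers, a CUDA 11.x build needs only 520.61+,
    # e.g. a 2.x-cuda11.8-cudnn8-runtime tag (verify the exact tag on Docker Hub)
```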
## Resource Recommendations
| Workload | GPU | CPU (units) | Memory | Storage |
|---|---|---|---|---|
| Small model inference | 1x RTX 3080 | 4 | 8Gi | 50Gi |
| Large model inference | 1x A100 | 8 | 32Gi | 200Gi |
| Training (small) | 1x RTX 4090 | 8 | 16Gi | 100Gi |
| Training (large) | 1x A100 / H100 | 16 | 64Gi | 500Gi |
| Rendering | 1x RTX 4090 | 4 | 16Gi | 50Gi |
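As a worked example, the large-model inference row translates into a compute profile like this (the a100 pin is illustrative, and the llm profile name is a placeholder):

```yaml
profiles:
  compute:
    llm:
      resources:
        cpu:
          units: 8
        memory:
          size: 32Gi
        storage:
          - size: 200Gi
        gpu:
          units: 1
          attributes:
            vendor:
              nvidia:
                - model: a100
```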