GPU Acceleration

Configure Veriafy for maximum performance with GPU support

Supported GPUs

NVIDIA (CUDA)

  • RTX 3060 or higher (recommended)
  • CUDA 11.8 or higher
  • cuDNN 8.6 or higher
  • 8GB+ VRAM recommended

Apple Silicon

  • M1, M2, M3, M4 chips
  • Metal Performance Shaders
  • macOS 12.0 or higher
  • Automatic optimization

Installation

NVIDIA GPU

# Quote the extra so shells such as zsh do not expand the brackets
pip install "veriafy[gpu]"

Apple Silicon

pip install veriafy # GPU enabled by default on Apple Silicon

Enable GPU

CLI

# Enable GPU globally
veriafy config set gpu true

# Or per-command
veriafy classify image.jpg --model veriafy/nsfw --gpu

Python SDK

from veriafy import Veriafy

# Enable GPU on initialization
client = Veriafy(gpu=True)

# Check GPU status
print(f"GPU available: {client.gpu_available}")
print(f"GPU name: {client.gpu_name}")
print(f"VRAM: {client.gpu_memory_mb} MB")

Performance Comparison

Operation                      CPU      GPU (RTX 4090)   Speedup
Single image                   15 ms    2 ms             7.5x
Batch (1000 images)            12 s     0.8 s            15x
Video (1 min)                  8 s      0.5 s            16x
Model training (10k vectors)   45 min   3 min            15x
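
The speedup column is CPU time divided by GPU time. The sketch below recomputes it from the figures in the table, which is a convenient way to compare your own measurements on different hardware. The speedup helper and the BENCHMARKS dictionary are illustrative names and are not part of the Veriafy SDK.

```python
# Recompute the speedup column from the table's CPU and GPU timings.
# `speedup` and `BENCHMARKS` are illustrative names, not Veriafy APIs.

def speedup(cpu_seconds: float, gpu_seconds: float) -> float:
    """Return how many times faster the GPU run is than the CPU run."""
    if gpu_seconds <= 0:
        raise ValueError("gpu_seconds must be positive")
    return cpu_seconds / gpu_seconds

# (CPU seconds, GPU seconds), converted from the units used in the table.
BENCHMARKS = {
    "Single image": (0.015, 0.002),
    "Batch (1000 images)": (12.0, 0.8),
    "Video (1 min)": (8.0, 0.5),
    "Model training (10k vectors)": (45 * 60, 3 * 60),
}

for name, (cpu, gpu) in BENCHMARKS.items():
    print(f"{name}: {speedup(cpu, gpu):.1f}x")
```

Substituting timings from your own hardware gives a like-for-like comparison with the RTX 4090 figures.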

Multi-GPU Support

from veriafy import Veriafy

# Automatic multi-GPU distribution
client = Veriafy(gpu=True, gpu_ids=[0, 1, 2, 3])

# Process with automatic load balancing
results = client.classify_batch(
    files=large_file_list,
    model="veriafy/classifier",
    batch_size=256,  # Larger batches for multi-GPU
)

# Or manually control GPU assignment
with client.gpu_context(gpu_id=0):
    result1 = client.classify(file1, model="veriafy/model1")

with client.gpu_context(gpu_id=1):
    result2 = client.classify(file2, model="veriafy/model2")
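
If you assign work to GPUs manually, you need some rule for deciding which files go to which device. The sketch below shows one simple option, round-robin partitioning, in plain Python. It is not a description of how Veriafy balances load internally, and partition_round_robin is a hypothetical helper. Each resulting shard could then be processed inside a gpu_context block for the matching gpu_id.

```python
from typing import Dict, List, Sequence

def partition_round_robin(
    files: Sequence[str], gpu_ids: Sequence[int]
) -> Dict[int, List[str]]:
    """Assign files to GPU ids in round-robin order (hypothetical helper)."""
    if not gpu_ids:
        raise ValueError("gpu_ids must not be empty")
    shards: Dict[int, List[str]] = {gpu_id: [] for gpu_id in gpu_ids}
    for index, path in enumerate(files):
        # File 0 goes to the first GPU, file 1 to the second, and so on.
        shards[gpu_ids[index % len(gpu_ids)]].append(path)
    return shards

# Example: ten placeholder file names spread over four GPUs.
shards = partition_round_robin([f"img_{i}.jpg" for i in range(10)], [0, 1, 2, 3])
for gpu_id, paths in shards.items():
    print(gpu_id, paths)
```

Round-robin keeps shard sizes within one file of each other, but it ignores file size. For mixed workloads, such as short images alongside long videos, a size-aware split may balance better.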

Docker with GPU

# Run with NVIDIA GPU
docker run --gpus all -d -p 8080:8080 veriafy/veriafy:latest-gpu

# Docker Compose
services:
  veriafy:
    image: veriafy/veriafy:latest-gpu
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    environment:
      - VERIAFY_GPU=1
      - CUDA_VISIBLE_DEVICES=0,1

Troubleshooting

GPU not detected

Run veriafy doctor to diagnose. Ensure CUDA drivers are installed and nvidia-smi works.
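
To run the nvidia-smi check from a script, for example in a deployment health check, something like the following may be enough. It uses only the Python standard library and the nvidia-smi -L flag, which lists the GPUs the driver can see. The nvidia_smi_summary function name is an assumption made for this sketch, and the check does not replace veriafy doctor.

```python
import shutil
import subprocess
from typing import Optional

def nvidia_smi_summary(timeout: float = 10.0) -> Optional[str]:
    """Return the GPU listing from `nvidia-smi -L`, or None if it is unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # The tool is not on PATH: the driver is probably not installed.
        return None
    try:
        proc = subprocess.run(
            [exe, "-L"],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if proc.returncode != 0:
        # The tool exists but cannot talk to the driver.
        return None
    return proc.stdout.strip()

summary = nvidia_smi_summary()
print(summary if summary else "nvidia-smi not found or not working")
```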

Out of memory

Reduce batch_size or use VERIAFY_GPU_MEMORY_FRACTION=0.8 to limit VRAM usage.
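
Reducing batch_size can also be automated. The sketch below halves the batch size each time a batch fails with an out-of-memory error and then retries the same items. It rests on several assumptions: run_with_backoff is a hypothetical helper, MemoryError stands in for whatever exception Veriafy actually raises when VRAM runs out, and fake_classify is a stand-in for a call such as client.classify_batch.

```python
from typing import Callable, List, Sequence, Tuple, Type

def run_with_backoff(
    items: Sequence,
    run_batch: Callable[[Sequence], List],
    batch_size: int = 256,
    min_batch_size: int = 1,
    oom_errors: Tuple[Type[BaseException], ...] = (MemoryError,),
) -> List:
    """Process items in batches, halving batch_size after each out-of-memory error."""
    results: List = []
    start = 0
    while start < len(items):
        chunk = items[start:start + batch_size]
        try:
            results.extend(run_batch(chunk))
        except oom_errors:
            if batch_size <= min_batch_size:
                raise  # Cannot shrink further; surface the error.
            batch_size = max(min_batch_size, batch_size // 2)
            continue  # Retry the same position with a smaller batch.
        start += len(chunk)
    return results

def fake_classify(chunk: Sequence[int]) -> List[int]:
    """Stand-in for a real batch call: fails when given more than 4 items."""
    if len(chunk) > 4:
        raise MemoryError("simulated out-of-memory")
    return [x * 2 for x in chunk]

results = run_with_backoff(list(range(10)), fake_classify)
print(results)
```

Once a smaller batch size succeeds it is kept for the remaining items, so the cost of the failed attempts is paid only once.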

Slow first inference

First run compiles CUDA kernels. Use veriafy warmup to pre-compile for your hardware.
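
To confirm that a slow first request comes from one-time compilation rather than a persistent problem, compare the first call's latency with later calls. The sketch below is a generic timing helper. The time_calls name is illustrative, and the lambda is a placeholder workload that you would replace with your own classification call.

```python
import time
from typing import Callable, List

def time_calls(fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Call fn `repeats` times and return each call's duration in milliseconds."""
    durations: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations

# Placeholder workload; swap in your own inference call for real measurements.
timings = time_calls(lambda: sum(range(10_000)), repeats=5)
print(f"first call: {timings[0]:.3f} ms")
print(f"fastest later call: {min(timings[1:]):.3f} ms")
```

If only the first call is slow, pre-compiling should remove the delay. If every call is slow, check that the GPU is actually in use.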
