AI Templates

GPU environment setup typically consumes significant development time. You install CUDA drivers, resolve version conflicts, and align library dependencies before writing a single line of model code. NeevCloud AI Templates eliminate this overhead by providing production-ready GPU images with pre-configured CUDA, frameworks, and access methods.

Each AI template launches within seconds, pre-configured with GPU drivers, CUDA libraries, notebooks, and web/SSH access, so engineers can immediately start training, fine-tuning, or serving models without touching the underlying infrastructure.

What You Get with AI Templates

AI Templates are pre-built GPU images optimized for specific AI workloads. Each template includes:

  • Base OS and CUDA runtime - Pre-installed and version-matched

  • Framework libraries - PyTorch, TensorFlow, or specialized tools with locked versions

  • System utilities - SSH, JupyterLab, and workflow-specific interfaces

  • Network configuration - Exposed ports for APIs, dashboards, and notebooks

  • Storage optimization - Compressed registry images that expand on deployment

You select a template during GPU deployment. You can switch templates between workloads without rebuilding environments.

Why This Matters for Your Workflow

Problems AI Templates Solve

  • Version drift across environments: Your PyTorch model trains locally but fails in CI or production due to CUDA version mismatches or conflicting dependencies.

  • Unpredictable setup time: What should take minutes extends to hours debugging driver installations, library conflicts, or missing system packages.

  • Poor reproducibility: Your teammate's GPU runs the same code with different results because of environment inconsistencies.

  • Manual inference optimization: You spend time tuning batch sizes, memory limits, and GPU configurations through trial and error instead of starting with optimized defaults.

  • Hidden performance issues: Your GPU utilization stays low because of misconfigured runtimes or suboptimal library versions.

What You Gain

  • Consistent environments: Identical CUDA versions, drivers, and libraries across development, staging, and production eliminate environment-related bugs.

  • Immediate productivity: Your GPU is ready to run training scripts or serve models the moment it deploys.

  • Reduced debugging overhead: You spend less time troubleshooting infrastructure and more time improving model performance.

  • Team standardization: Everyone on your team builds on the same tested base, reducing onboarding friction and knowledge silos.

  • Performance-aware defaults: Templates ship with optimized settings for memory management, batching, and inference paths based on the intended workload.

Your Deployment Workflow

  1. Select your GPU type based on compute and memory requirements

  2. Choose an AI Template matching your workload

  3. Deploy the GPU - provisioning completes in seconds

  4. Connect via SSH, JupyterLab, or web interface

  5. Start working - train models, run inference, or build pipelines

No manual CUDA installation. No framework compilation. No dependency resolution.
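
Once connected, a quick sanity check confirms the GPU and CUDA stack are visible to your framework. A minimal sketch for a PyTorch-based template (run it in JupyterLab or an SSH session; exact versions depend on the template you selected):

```python
# Minimal environment check for a PyTorch-based template.
# Assumes you are already connected to the deployed GPU via SSH or JupyterLab.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("CUDA runtime:", torch.version.cuda)
```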

Quick Selection Guide

Choose your template based on your primary workload:

  • Training and notebook-based work → NVIDIA CUDA or TensorFlow templates

  • LLM fine-tuning → Axolotl or H2O LLM Studio

  • Image generation workflows → ComfyUI

  • Production LLM inference → Triton vLLM or vLLM OpenAI

  • Local LLM experimentation → Ollama + Open WebUI

  • Custom builds → Ubuntu base templates


Template Specifications

PyTorch (with Ubuntu 22.04)

Use this for: General-purpose GPU development for PyTorch training, notebooks, and classical ML workflows.

  • Key Libraries: torch (CUDA 12.8), torchvision, torchaudio, JupyterLab, notebook, NumPy, Pandas, Matplotlib, Seaborn, scikit-learn

  • Exposed Ports: 22 (SSH), 8888 (JupyterLab)

  • Registry Size: 10.81 GB

  • Deployed Size: 17 GB

  • Access Methods: SSH, JupyterLab

This is your starting point for PyTorch-based deep learning. You get a complete scientific computing stack with visualization libraries and notebook support for interactive development.
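
If you want to confirm the full stack end to end, a short smoke test like the following (illustrative model and tensor shapes, not a real workload) exercises the GPU from JupyterLab:

```python
# Illustrative smoke test: run one training step of a tiny model on the GPU.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(128, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(64, 128, device=device)          # dummy batch
y = torch.randint(0, 10, (64,), device=device)   # dummy labels

loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
print(f"Single-step loss on {device}: {loss.item():.4f}")
```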

PyTorch (with Ubuntu 24.04)

Use this for: The same PyTorch stack as the Ubuntu 22.04 template, but on a newer OS for long-term standardization.

  • Key Libraries: torch (CUDA 12.8), torchvision, torchaudio, JupyterLab, notebook, NumPy, Pandas, Matplotlib, Seaborn, scikit-learn

  • Exposed Ports: 22 (SSH), 8888 (JupyterLab)

  • Registry Size: 10.86 GB

  • Deployed Size: 17.2 GB

  • Access Methods: SSH, JupyterLab

Use this if your team is standardizing on Ubuntu 24.04 LTS or if you need newer system package versions.

vLLM

Use this for: Fast OpenAI-compatible LLM inference with chat and completion API endpoints.

  • Key Libraries: vLLM, JupyterLab, notebook, NumPy, Pandas, Matplotlib

  • Exposed Ports: 22 (SSH), 8888 (JupyterLab), 8080 (API)

  • Registry Size: 11.82 GB

  • Deployed Size: 26.6 GB

  • Access Methods: SSH, JupyterLab

vLLM provides high-throughput serving for large language models with continuous batching and PagedAttention optimization. You get OpenAI API compatibility out of the box, making it easy to swap endpoints in existing applications.
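
Because the server speaks the OpenAI API, you can point the standard openai Python client at your GPU. A minimal sketch, assuming the API listens on the exposed port 8080, the openai package is installed where you run it, and the host and model name are placeholders for your deployment:

```python
# Sketch: call the vLLM OpenAI-compatible endpoint with the openai client.
# <YOUR_GPU_IP> and the model name are placeholders for your deployment.
from openai import OpenAI

client = OpenAI(
    base_url="http://<YOUR_GPU_IP>:8080/v1",
    api_key="not-needed",  # vLLM does not require a key unless you configure one
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # replace with the model you serve
    messages=[{"role": "user", "content": "Summarize what vLLM does in one sentence."}],
)
print(response.choices[0].message.content)
```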

TensorFlow

Use this for: TensorFlow-based training and research using NVIDIA-optimized containers.

  • Key Libraries: TensorFlow, Keras ≥3.11.3, JupyterLab ≥4.4.8, notebook, ipykernel, protobuf ≥5.29.5

  • Exposed Ports: 22 (SSH), 8888 (JupyterLab)

  • Registry Size: 8.74 GB

  • Deployed Size: 18.1 GB

  • Access Methods: SSH, JupyterLab

This template includes NVIDIA's optimized TensorFlow build with improved GPU performance for training and inference. Keras 3 provides a unified API for multi-framework workflows.
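
To verify GPU visibility and the Keras 3 API after deployment, you might run something like this in JupyterLab (layer sizes are illustrative only):

```python
# Check that TensorFlow sees the GPU, then build a tiny Keras 3 model.
import tensorflow as tf
import keras

print("TensorFlow:", tf.__version__, "| Keras:", keras.__version__)
print("GPUs visible:", tf.config.list_physical_devices("GPU"))

model = keras.Sequential([
    keras.layers.Input(shape=(32,)),          # illustrative input width
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```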

Triton Inference Server

Use this for: High-performance production LLM inference using NVIDIA Triton with vLLM backend.

  • Key Libraries: Triton Server, vLLM, FastAPI ≥0.115.0, Starlette ≥0.47.2, JupyterLab, notebook

  • Exposed Ports: 22 (SSH), 8000 (HTTP), 8001 (gRPC), 8002 (metrics), 8888 (JupyterLab)

  • Registry Size: 8.21 GB

  • Deployed Size: 18.4 GB

  • Access Methods: SSH, JupyterLab, Triton HTTP API, Triton gRPC API, Triton Metrics API

Triton supports concurrent model execution, dynamic batching, and multi-framework deployment. You can serve multiple models simultaneously with built-in monitoring and metrics endpoints for production observability.
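
A quick way to confirm the server is up and scrape its metrics is to hit Triton's standard HTTP endpoints. A sketch using requests, with a placeholder host and the template's default ports:

```python
# Sketch: probe Triton's standard HTTP health and metrics endpoints.
# Assumes the server is reachable at <YOUR_GPU_IP> on the template's default ports.
import requests

base = "http://<YOUR_GPU_IP>"

# Readiness check (Triton's HTTP/REST API on port 8000).
ready = requests.get(f"{base}:8000/v2/health/ready", timeout=5)
print("Server ready:", ready.status_code == 200)

# Prometheus metrics for observability dashboards (port 8002).
metrics = requests.get(f"{base}:8002/metrics", timeout=5)
print(metrics.text.splitlines()[:5])  # first few metric lines
```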

ComfyUI

Use this for: Node-based Stable Diffusion and image generation workflows.

  • Key Libraries: ComfyUI (master), PyTorch 2.5.1, JupyterLab, notebook, NumPy, Pandas

  • Exposed Ports: 22 (SSH), 3000 (ComfyUI), 8888 (JupyterLab)

  • Registry Size: 6.28 GB

  • Deployed Size: 12.2 GB

  • Access Methods: SSH, JupyterLab, ComfyUI Web Interface

ComfyUI provides a visual workflow editor for diffusion model pipelines. You can chain models, upscalers, and post-processing steps without writing code, making it ideal for rapid prototyping and creative experimentation.
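
ComfyUI also serves an HTTP API alongside the web interface. As a rough sketch (assuming the server is reachable on the exposed port 3000 and the host is a placeholder), you can probe it for device and VRAM information:

```python
# Sketch: probe the ComfyUI server's system_stats endpoint.
# <YOUR_GPU_IP> is a placeholder; port 3000 is the one exposed by this template.
import requests

stats = requests.get("http://<YOUR_GPU_IP>:3000/system_stats", timeout=5).json()
print(stats)  # JSON describing the ComfyUI instance and detected GPU devices/VRAM
```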

H2O LLM Studio

Use this for: Low-code LLM fine-tuning, evaluation, and experimentation.

  • Key Libraries: torch ≥2.8.0, transformers ≥4.53.0, protobuf ≥6.31.1, tornado ≥6.5.0, requests ≥2.32.4, starlette ≥0.47.2

  • Exposed Ports: 22 (SSH), 10101 (Dashboard)

  • Registry Size: 18.96 GB

  • Deployed Size: 31.7 GB

  • Access Methods: SSH, H2O LLM Studio Dashboard

H2O LLM Studio provides a GUI for fine-tuning large language models without writing training loops. You configure hyperparameters, monitor training metrics, and export models through a web interface.

Ollama + Open WebUI

Use this for: Local LLM serving and chat-based experimentation.

  • Key Libraries: Ollama, Open WebUI

  • Exposed Ports: 22 (SSH), 8080 (Web UI), 11434 (Ollama API)

  • Registry Size: 3.52 GB

  • Deployed Size: 10.55 GB

  • Access Methods: SSH, Open WebUI Dashboard

Ollama provides simple model management with a ChatGPT-style interface via Open WebUI. You can test open-source models like Llama, Mistral, or Phi with minimal configuration.
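
You can also call the Ollama API directly from code. A minimal sketch, using a placeholder host and a model you have already pulled through the Web UI or the Ollama CLI:

```python
# Sketch: query the Ollama API (port 11434) for a single non-streaming completion.
# The model name is an example; use any model you have pulled on the instance.
import requests

resp = requests.post(
    "http://<YOUR_GPU_IP>:11434/api/generate",
    json={
        "model": "llama3.1",              # any locally pulled model
        "prompt": "Explain PagedAttention in two sentences.",
        "stream": False,                  # return one JSON object instead of a stream
    },
    timeout=120,
)
print(resp.json()["response"])
```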

Axolotl

Use this for: Advanced LLM fine-tuning using LoRA, QLoRA, and full parameter fine-tuning.

  • Key Libraries: torch ≥2.8.0, torchvision, torchaudio, TensorFlow, JupyterLab, notebook, ipywidgets, scikit-learn ≥1.5.0

  • Exposed Ports: 22 (SSH), 8888 (JupyterLab)

  • Registry Size: 10.62 GB

  • Deployed Size: 22.7 GB

  • Access Methods: SSH, JupyterLab

Axolotl streamlines LLM fine-tuning with support for modern efficiency techniques. You can fine-tune models with parameter-efficient methods or full fine-tuning depending on your GPU memory and dataset size.
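
Axolotl itself is driven by YAML configs and its CLI rather than Python code. As a rough illustration of what a parameter-efficient setup configures, here is a minimal LoRA sketch using Hugging Face peft and transformers (this is not Axolotl's own workflow; the model name and hyperparameters are placeholders):

```python
# Illustrative only: a minimal LoRA setup with Hugging Face peft/transformers.
# Axolotl expresses the same ideas through YAML configs and its CLI.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-3.1-8B"   # placeholder; use the model you fine-tune
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank adapters
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # which projections receive adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # only a small fraction is trainable
```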

Ubuntu 22.04 Base

Use this for: Minimal OS when you need complete control over CUDA and framework installation.

  • Key Libraries: None (minimal OS only)

  • Exposed Ports: 22 (SSH)

  • Registry Size: 78.25 MB

  • Deployed Size: 236 MB

  • Access Methods: SSH

Start from scratch when you have specific version requirements or need to replicate an existing environment exactly.

Ubuntu 24.04 Base

Use this for: Minimal newer OS for custom GPU stacks and long-term projects.

  • Key Libraries: None (minimal OS only)

  • Exposed Ports: 22 (SSH)

  • Registry Size: 218.79 MB

  • Deployed Size: 638 MB

  • Access Methods: SSH

Use this for greenfield projects where you want the latest OS with extended support, or when you need specific system packages only available in Ubuntu 24.04.


Next Steps

You now understand what each template provides and when to use it. The next guide will walk you through deploying a GPU with an AI Template and connecting securely to start your workload.

If you have questions about which template fits your use case, consider your primary workload: are you training models, serving inference, or experimenting with new architectures? Match that to the template descriptions above.
