AI Templates
GPU environment setup typically consumes significant development time. You install CUDA drivers, resolve version conflicts, and align library dependencies before writing a single line of model code. NeevCloud AI Templates eliminate this overhead by providing production-ready GPU images with pre-configured CUDA, frameworks, and access methods.
Each AI Template launches within seconds, pre-configured with GPU drivers, CUDA libraries, notebooks, and web/SSH access - so engineers can immediately start training, fine-tuning, or serving models without touching the underlying infrastructure.
What You Get with AI Templates
AI Templates are pre-built GPU images optimized for specific AI workloads. Each template includes:
Base OS and CUDA runtime - Pre-installed and version-matched
Framework libraries - PyTorch, TensorFlow, or specialized tools with locked versions
System utilities - SSH, JupyterLab, and workflow-specific interfaces
Network configuration - Exposed ports for APIs, dashboards, and notebooks
Storage optimization - Compressed registry images that expand on deployment
You select a template during GPU deployment. You can switch templates between workloads without rebuilding environments.
Why This Matters for Your Workflow
Problems AI Templates Solve
Version drift across environments: Your PyTorch model trains locally but fails in CI or production due to CUDA version mismatches or conflicting dependencies.
Unpredictable setup time: What should take minutes stretches into hours of debugging driver installations, library conflicts, or missing system packages.
Poor reproducibility: Your teammate's GPU runs the same code with different results because of environment inconsistencies.
Manual inference optimization: You spend time tuning batch sizes, memory limits, and GPU configurations through trial and error instead of starting with optimized defaults.
Hidden performance issues: Your GPU utilization stays low because of misconfigured runtimes or suboptimal library versions.
What You Gain
Consistent environments: Identical CUDA versions, drivers, and libraries across development, staging, and production eliminate environment-related bugs.
Immediate productivity: Your GPU is ready to run training scripts or serve models the moment it deploys.
Reduced debugging overhead: You spend less time troubleshooting infrastructure and more time improving model performance.
Team standardization: Everyone on your team builds on the same tested base, reducing onboarding friction and knowledge silos.
Performance-aware defaults: Templates ship with optimized settings for memory management, batching, and inference paths based on the intended workload.
Your Deployment Workflow
Select your GPU type based on compute and memory requirements
Choose an AI Template matching your workload
Deploy the GPU - provisioning completes in seconds
Connect via SSH, JupyterLab, or web interface
Start working - train models, run inference, or build pipelines
No manual CUDA installation. No framework compilation. No dependency resolution.
Quick Selection Guide
Choose your template based on your primary workload:
Training and notebook-based work → PyTorch or TensorFlow templates
LLM fine-tuning → Axolotl or H2O LLM Studio
Image generation workflows → ComfyUI
Production LLM inference → Triton Inference Server or vLLM
Local LLM experimentation → Ollama + Open WebUI
Custom builds → Ubuntu base templates
Template Specifications
PyTorch (with Ubuntu 22.04)
Use this for: General-purpose GPU development for PyTorch training, notebooks, and classical ML workflows.
Key Libraries
torch (CUDA 12.8), torchvision, torchaudio, JupyterLab, notebook, NumPy, Pandas, Matplotlib, Seaborn, scikit-learn
Exposed Ports
22 (SSH), 8888 (JupyterLab)
Registry Size
10.81 GB
Deployed Size
17 GB
Access Methods
SSH, JupyterLab
This is your starting point for PyTorch-based deep learning. You get a complete scientific computing stack with visualization libraries and notebook support for interactive development.
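As a quick sanity check after you connect over SSH or JupyterLab, you can confirm that the template's bundled PyTorch build sees the GPU before kicking off a training run. This is a minimal sketch, assuming a single attached GPU:

```python
import torch

# Versions and device visibility from the pre-installed CUDA 12.8 build.
print(torch.__version__)
print(torch.cuda.is_available())        # expect True on a healthy deployment
print(torch.cuda.get_device_name(0))    # reports the attached GPU model

# Tiny matrix multiply on the GPU to confirm the driver/runtime pairing end to end.
x = torch.randn(1024, 1024, device="cuda")
y = x @ x.T
print(y.shape, y.device)
```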
PyTorch (with Ubuntu 24.04)
Use this for: The same capabilities as the Ubuntu 22.04 template, but with a newer OS for long-term standardization.
Key Libraries
torch (CUDA 12.8), torchvision, torchaudio, JupyterLab, notebook, NumPy, Pandas, Matplotlib, Seaborn, scikit-learn
Exposed Ports
22 (SSH), 8888 (JupyterLab)
Registry Size
10.86 GB
Deployed Size
17.2 GB
Access Methods
SSH, JupyterLab
Use this if your team is standardizing on Ubuntu 24.04 LTS or if you need newer system package versions.
vLLM
Use this for: Fast OpenAI-compatible LLM inference with chat and completion API endpoints.
Key Libraries
vLLM, JupyterLab, notebook, NumPy, Pandas, Matplotlib
Exposed Ports
22 (SSH), 8888 (JupyterLab), 8080 (API)
Registry Size
11.82 GB
Deployed Size
26.6 GB
Access Methods
SSH, JupyterLab
vLLM provides high-throughput serving for large language models with continuous batching and PagedAttention optimization. You get OpenAI API compatibility out of the box, making it easy to swap endpoints in existing applications.
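Because the server speaks the OpenAI API, existing client code usually only needs a new base URL. The sketch below is illustrative, assuming a model is already loaded by the vLLM server and the API is exposed on port 8080 as listed above; HOST and MODEL are placeholders for your instance address and served model name:

```python
from openai import OpenAI

# vLLM's OpenAI-compatible server does not require a real API key by default.
client = OpenAI(base_url="http://HOST:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="MODEL",  # must match the model name the vLLM server was launched with
    messages=[{"role": "user", "content": "Summarize PagedAttention in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```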
TensorFlow
Use this for: TensorFlow-based training and research using NVIDIA-optimized containers.
Key Libraries
TensorFlow, Keras ≥3.11.3, JupyterLab ≥4.4.8, notebook, ipykernel, protobuf ≥5.29.5
Exposed Ports
22 (SSH), 8888 (JupyterLab)
Registry Size
8.74 GB
Deployed Size
18.1 GB
Access Methods
SSH, JupyterLab
This template includes NVIDIA's optimized TensorFlow build with improved GPU performance for training and inference. Keras 3 provides a unified API for multi-framework workflows.
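To verify that the NVIDIA-optimized build is actually using the GPU, a short check like the following (a minimal sketch run from JupyterLab or an SSH session) is usually enough:

```python
import tensorflow as tf

print(tf.__version__)
print(tf.config.list_physical_devices("GPU"))   # expect at least one GPU entry

# Small matmul placed explicitly on the GPU to confirm kernels execute there.
with tf.device("/GPU:0"):
    a = tf.random.normal((1024, 1024))
    b = tf.linalg.matmul(a, a, transpose_b=True)
print(b.shape)
```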
Triton Inference Server
Use this for: High-performance production LLM inference using NVIDIA Triton with vLLM backend.
Key Libraries
Triton Server, vLLM, FastAPI ≥0.115.0, Starlette ≥0.47.2, JupyterLab, notebook
Exposed Ports
22 (SSH), 8000 (HTTP), 8001 (gRPC), 8002 (metrics), 8888 (JupyterLab)
Registry Size
8.21 GB
Deployed Size
18.4 GB
Access Methods
SSH, JupyterLab, Triton HTTP API, Triton gRPC API, Triton Metrics API
Triton supports concurrent model execution, dynamic batching, and multi-framework deployment. You can serve multiple models simultaneously with built-in monitoring and metrics endpoints for production observability.
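Triton's standard HTTP endpoints map onto the ports listed above, so a quick probe from any machine that can reach the instance tells you whether the server is ready and what it exposes. HOST below is a placeholder for your deployment address:

```python
import requests

HOST = "HOST"  # replace with your GPU instance's address

# Readiness check on the HTTP port (8000); 200 means the server can accept requests.
print(requests.get(f"http://{HOST}:8000/v2/health/ready").status_code)

# Server metadata: name, version, and supported extensions.
print(requests.get(f"http://{HOST}:8000/v2").json())

# Prometheus-format metrics on port 8002 for production observability.
print(requests.get(f"http://{HOST}:8002/metrics").text[:500])
```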
ComfyUI
Use this for: Node-based Stable Diffusion and image generation workflows.
Key Libraries
ComfyUI (master), PyTorch 2.5.1, JupyterLab, notebook, NumPy, Pandas
Exposed Ports
22 (SSH), 3000 (ComfyUI), 8888 (JupyterLab)
Registry Size
6.28 GB
Deployed Size
12.2 GB
Access Methods
SSH, JupyterLab, ComfyUI Web Interface
ComfyUI provides a visual workflow editor for diffusion model pipelines. You can chain models, upscalers, and post-processing steps without writing code, making it ideal for rapid prototyping and creative experimentation.
H2O LLM Studio
Use this for: Low-code LLM fine-tuning, evaluation, and experimentation.
Key Libraries
torch ≥2.8.0, transformers ≥4.53.0, protobuf ≥6.31.1, tornado ≥6.5.0, requests ≥2.32.4, starlette ≥0.47.2
Exposed Ports
22 (SSH), 10101 (Dashboard)
Registry Size
18.96 GB
Deployed Size
31.7 GB
Access Methods
SSH, H2O LLM Studio Dashboard
H2O LLM Studio provides a GUI for fine-tuning large language models without writing training loops. You configure hyperparameters, monitor training metrics, and export models through a web interface.
Ollama + Open WebUI
Use this for: Local LLM serving and chat-based experimentation.
Key Libraries
Ollama, Open WebUI
Exposed Ports
22 (SSH), 8080 (Web UI), 11434 (Ollama API)
Registry Size
3.52 GB
Deployed Size
10.55 GB
Access Methods
SSH, Open WebUI Dashboard
Ollama provides simple model management with a ChatGPT-style interface via Open WebUI. You can test open-source models like Llama, Mistral, or Phi with minimal configuration.
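Beyond the chat interface, the Ollama API on port 11434 lets you script requests against any model already pulled on the instance. A minimal sketch, with HOST as a placeholder and llama3 as an illustrative model name:

```python
import requests

HOST = "HOST"  # replace with your GPU instance's address

resp = requests.post(
    f"http://{HOST}:11434/api/generate",
    json={
        "model": "llama3",              # any model already pulled on the instance
        "prompt": "Why is the sky blue?",
        "stream": False,                # return one JSON object instead of a stream
    },
    timeout=120,
)
print(resp.json()["response"])
```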
Axolotl
Use this for: Advanced LLM fine-tuning using LoRA, QLoRA, and full parameter fine-tuning.
Key Libraries
torch ≥2.8.0, torchvision, torchaudio, TensorFlow, JupyterLab, notebook, ipywidgets, scikit-learn ≥1.5.0
Exposed Ports
22 (SSH), 8888 (JupyterLab)
Registry Size
10.62 GB
Deployed Size
22.7 GB
Access Methods
SSH, JupyterLab
Axolotl streamlines LLM fine-tuning with support for modern efficiency techniques. You can fine-tune models with parameter-efficient methods or full fine-tuning depending on your GPU memory and dataset size.
Ubuntu 22.04 Base
Use this for: Minimal OS when you need complete control over CUDA and framework installation.
Key Libraries
None (minimal OS only)
Exposed Ports
22 (SSH)
Registry Size
78.25 MB
Deployed Size
236 MB
Access Methods
SSH
Start from scratch when you have specific version requirements or need to replicate an existing environment exactly.
Ubuntu 24.04 Base
Use this for: Minimal newer OS for custom GPU stacks and long-term projects.
Key Libraries
None (minimal OS only)
Exposed Ports
22 (SSH)
Registry Size
218.79 MB
Deployed Size
638 MB
Access Methods
SSH
Use this for greenfield projects where you want the latest OS with extended support, or when you need specific system packages only available in Ubuntu 24.04.
Next Steps
You now understand what each template provides and when to use it. The next guide will walk you through deploying a GPU with an AI Template and connecting securely to start your workload.
If you have questions about which template fits your use case, consider your primary workload: are you training models, serving inference, or experimenting with new architectures? Match that to the template descriptions above.